© 2010 DYLAN DODD

DEGRADATION OF XYLAN BY RUMEN PREVOTELLA SPP.

BY

DYLAN DODD

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Microbiology in the Graduate College of the University of Illinois at Urbana-Champaign, 2010

Urbana, Illinois

Doctoral Committee:

Associate Professor Isaac K. O. Cann, Chair Professor Abigail A. Salyers Professor Gary J. Olsen Professor Peter A. Orlean

ABSTRACT

Prevotella ruminicola 23 and Prevotella bryantii B14 are obligate anaerobic

bacteria in the Bacteroidetes phylum that contribute to hemicellulose utilization

within the bovine rumen. These bacteria efficiently degrade the hemicellulosic

polymer, xylan, however the biochemical basis for deconstruction of this

polysaccharide remains largely unknown. In the current study, genomic,

transcriptomic, and biochemical analyses were employed to gain insight into the

cellular machinery that these two bacteria elaborate to degrade the

hemicellulosic polymer, xylan. These studies identified two previously

uncharacterized glycoside proteins from P. ruminicola 23: a bi-

functional xylanase/ferulic acid (xyn10D-fae1A) and a bi-functional β-D-

xylosidase/α-L-arabinofuranosidase (xyl3A). These two genes were expressed

as recombinant proteins in Escherichia coli, and biochemical analyses of the

purified proteins revealed that these two function synergistically to

depolymerize xylan. Furthermore, directed mutagenesis studies of Xyn10D-

Fae1A mapped the catalytic sites for the two enzymatic functionalities to distinct

regions within the polypeptide sequence. Subsequent bioinformatics studies with

P. bryantii B14 identified four glycoside hydrolase (GH) family 3 genes in the

genome for this organism. To test whether or not these genes encode proteins

with redundant biochemical functions, the four genes were cloned and over-

expressed as recombinant proteins in E. coli, and the biochemical properties of these proteins were assessed. The results from these studies indicated that each

ii

of the four purified proteins exhibited unique properties with one gene exhibiting

β-D-glucosidase activity, and the remaining three exhibiting β-D-xylosidase

activity. Further probing of the biochemical properties for the three β-D- xylosidase enzymes identified salient differences in substrate specificities for these enzymes. Whole genome transcriptional profiling of P. bryantii cultures grown either on polymeric wheat arabinoxylan (WAX) or the constituent monosaccharides, xylose and arabinose (XA) identified a subset of genes that are highly induced on WAX compared to XA. The most highly induced genes formed an operon which contained putative outer membrane proteins analogous to the starch utilization system (Sus) previously identified in the prominent human gut symbiont, Bacteroides thetaiotaomicron. The arrangement of this gene cluster is highly conserved in other xylanolytic organisms within the

Bacteroidetes phylum which suggests that the mechanism of xylan utilization involving these genes is conserved amongst these bacteria. A large number of genes encoding proteins of unknown function were also induced on WAX relative to XA which suggests that P. bryantii B14 may employ novel mechanisms for

xylan degradation. A hypothetical gene predicted to encode a protein with low

homology to glycoside hydrolase (GH) family 5 was upregulated during growth

on WAX, and was cloned and heterologously expressed as a hexahistidine

fusion protein in E. coli. Biochemical analysis of the purified protein revealed that

it possesses endo-xylanase activity thus the was named PbXyn5A. Two

of the most similar proteins in the GenBank database derive from the human

colonic Bacteroides spp. (B. eggerthii ORF1299 and B. intestinalis ORF4213)

iii

and are also annotated as hypothetical proteins. The corresponding genes were

expressed and the purified recombinant proteins also exhibited robust endo-

xylanase activity. Mutational analysis revealed two carboxylic acid residues that

were essential for activity in all three enzymes. These glutamic acid residues are

conserved amongst the distantly related members of the GH family 5 which

supports the assignment of these proteins as GH family 5 endo-xylanases. This study provides insight into polysaccharide degradation by P. bryantii B14 and

reveals similarities in the mechanism for xylan degradation between human

colonic and ruminant Bacteroidetes.

iv

ACKNOWLEDGEMENTS

I would like to thank all of my friends, colleagues, and family members without whom, the work presented within this thesis would not have been possible. First, I would like to thank Professor Isaac K. O. Cann for his excellent tutelage and selfless dedication to the training of the students within his laboratory and to whom I owe my success. I would also like to thank Professor

Roderick I. Mackie for his wisdom, guidance, and willingness to share his encyclopedic knowledge of rumen physiology with everyone. Also, I would like to thank Professor Steven R. Blanke for providing me with excellent training during the first two years of my thesis research and Dr. M. Ashley Spies for many exciting conversations about mechanistic enzymology. Members of my thesis committee including Professors Abigail A. Salyers, Gary J. Olsen, and Peter A.

Orlean have always been willing to take time out of their busy schedules to sit down and provide me with insight into my research directions, and for that I am very grateful. I also appreciate the many stimulating conversations that I have had with Professors Richard I. Tapping and James A. Slauch throughout my graduate career.

I have been very fortunate to work with a large number of wonderful people who have made coming in to work in the laboratory every day an enjoyable experience. From Steven R. Blanke's laboratory I would like to thank my good friends Vijay Gupta, Prashant Jain, Amandeep Gargi, Joseph Reese, and Ian Gut for their support and Michael Prouty for patiently teaching me many

v

of the techniques in molecular biology that I know. In Isaac K. O. Cann's laboratory I have enjoyed having the opportunity to interact closely with excellent scientists including Shosuke Yoshida, Shinichi Kiyonari, Yejun Han, Michael

Iakiviak, Charles Hespen, and Xiaoyun Su. Young-Hwan Moon was instrumental in helping me to clone and characterize enzymes from P. bryantii B14 and for that

I am very thankful. From the Institute for Genomic Biology I thank Jose Solbiati and Dawn Eriksen for their amiable personalities and uplifting attitudes.

I would like to thank my loving family who have persevered through the years with limited contact from me and who have shown me unparalleled support.

I appreciate Yuanyuan (Aedan) Liu's companionship during my thesis years and her willingness to allow me to work on weekends. Last, I would like to thank the

Energy Biosciences Institute for fantastic research funding and the National

Institutes of Health for funding my individual pre-doctoral fellowship.

vi

TABLE OF CONTENTS

CHAPTER 1: LITERATURE REVIEW ...... 1 1.1 THE RUMEN MICROBIAL ECOSYSTEM ...... 1 1.2 CHARACTERISTICS OF PREVOTELLA BRYANTII B(1)4 AND PREVOTELLA RUMINICOLA 23 ...... 7 1.2.1 CHARACTERISTICS OF PREVOTELLA BRYANTII B(1)4 ...... 8 1.2.2 CHARACTERISTICS OF PREVOTELLA RUMINICOLA 23 ...... 17 1.3 ENZYMATIC DEGRADATION OF XYLAN...... 19 1.3.1 THE ENDO-1,4-β-XYLANASES ...... 23 1.3.2 THE α-L-ARABINOFURANOSIDASES ...... 25 1.3.3 THE XYLAN-1,4-β-XYLOSIDASES ...... 27 1.3.4 THE α-GLUCURONIDASES ...... 31 1.3.5 THE XYLANOLYTIC ...... 31 1.3.6 MODULARITY OF XYLANOLYTIC ENZYMES ...... 32 1.3.7 COORDINATED FUNCTION OF XYLANOLYTIC ENZYMES ...... 33 1.4 MAJOR GOAL FOR THESIS RESEARCH ...... 34 1.5 REFERENCES ...... 35 1.6 TABLES AND FIGURES ...... 49

CHAPTER 2: BIOCHEMICAL ANALYSIS OF A β-D-XYLOSIDASE AND A BIFUNCTIONAL XYLANASE-ESTERASE FROM A XYLANOLYTIC GENE CLUSTER IN PREVOTELLA RUMINICOLA 23 ...... 66 2.1 ABSTRACT ...... 66 2.2 INTRODUCTION ...... 67 2.3 EXPERIMENTAL PROCEDURES ...... 69 2.4 RESULTS...... 77 2.5 DISCUSSION ...... 88 2.6 ACKNOWLEDGEMENTS ...... 94

vii

2.7 REFERENCES ...... 94 2.8 TABLES AND FIGURES ...... 102

CHAPTER 3: FUNCTIONAL DIVERSITY OF FOUR GLYCOSIDE HYDROLASE FAMILY 3 ENZYMES FROM THE RUMEN BACTERIUM, PREVOTELLA BRYANTII B(1)4 ...... 111 3.1 ABSTRACT ...... 111 3.2 INTRODUCTION ...... 112 3.3 EXPERIMENTAL PROCEDURES ...... 114 3.4 RESULTS...... 125 3.5 DISCUSSION ...... 133 3.6 ACKNOWLEDGEMENTS ...... 140 3.7 REFERENCES ...... 141 3.8 TABLES AND FIGURES ...... 146

CHAPTER 4: FUNCTIONAL ASSIGNMENT OF ENDO-XYLANASE ACTIVITY TO HYPOTHETICAL GENES FROM GUT BACTEROIDETES IDENTIFIED THROUGH ANALYSIS OF THE PREVOTELLA BRYANTII B(1)4 TRANSCRIPTOME ...... 160 4.1 ABSTRACT ...... 160 4.2 INTRODUCTION ...... 161 4.3 EXPERIMENTAL PROCEDURES ...... 163 4.4 RESULTS...... 172 4.5 DISCUSSION ...... 188 4.6 ACKNOWLEDGEMENTS ...... 194 4.7 REFERENCES ...... 195 4.8 TABLES AND FIGURES ...... 199

CHAPTER 5: DISCUSSION AND CONCLUSIONS ...... 218 5.1 BIOINFORMATIC ANALYSIS OF XYLAN DEGRADING ENZYMES IN RUMEN PREVOTELLA SPP...... 218

viii

5.2 TRANSCRIPTOMIC ANALYSIS OF XYLAN DEGRADING ENZYMES IN PREVOTELLA BYRANTII B(1)4 ...... 223 5.3 CONSERVED MECHANISMS FOR XYLAN DEGRADATION BY RUMEN PREVOTELLA SPP. AND HUMAN COLONIC BACTEROIDES SPP...... 226 5.4 REFERENCES ...... 229 5.5 FIGURES ...... 231

APPENDIX A: FUNCTIONAL COMPARISON OF THE TWO BACILLUS ANTHRACIS GLUTAMATE RACEMASES ...... 233 A.1 ABSTRACT ...... 233 A.2 INTRODUCTION ...... 234 A.3 EXPERIMENTAL PROCEDURES ...... 236 A.4 RESULTS ...... 245 A.5 DISCUSSION ...... 252 A.6 ACKNOWLEDGEMENTS ...... 258 A.7 REFERENCES ...... 258 A.8 TABLES AND FIGURES ...... 263

APPENDIX B: DIVERGENT KINETIC PROPERTIES OF THE TWO ALANINE RACEMASES FROM BACILLUS ANTHRACIS ...... 273 B.1 ABSTRACT ...... 273 B.2 RESULTS AND DISCUSSION ...... 273 B.3 REFERENCES ...... 279 B.4 TABLES AND FIGURES ...... 282

CURRICULUM VITAE ...... 287

ix

CHAPTER 1

LITERATURE REVIEW

1.1 THE RUMEN MICROBIAL ECOSYSTEM

The gastrointestinal (GI) tract of animals is a highly specialized tube composed of distinct anatomical regions that are adapted to perform functions related to nutrient acquisition from the diet. The GI tract has evolved to elaborate highly specialized fermentation chambers which may be pre-gastric as found in foregut fermentors (deer, sheep, and cattle) or post-gastric as found in hind-gut fermentors (pigs, elephants, and humans). Foregut fermentation is typically associated with herbivory, wherein the animal grazes on forage material that is high in lignocellulose and is incapable of extracting sufficient energy from the recalcitrant plant polysaccharides. Correspondingly, foregut fermentors have evolved a large compartment that houses a microbial community that performs the bulk of degradation of dietary material. Microbial fermentation within foregut fermentors such as cattle occurs within the rumen which is a large (50-100 L) space (61) that harbors an enormous population of microbes that form an ecological unit within the host (73). The rumen microbial ecosystem is an extraordinary example of a mutualistic relationship wherein hundreds of trillions of microbes ferment dietary carbohydrates and the products of microbial fermentation become the primary source of metabolic energy for the host.

For forage fed cattle, the major substrates for microbial fermentation include lignocellulosic plant cell wall polymers. The microbial consortium in the

1

rumen harbors a highly evolved repertoire of enzymes that catalyze the

depolymerization and fermentation of recalcitrant lignocellulosic plant cell wall

polysaccharides including cellulose, xylans, mannan, pectins, and arabinan.

Importantly, the cow’s genome does not harbor genes that encode enzymes

capable of performing these functions (32), thus the collective genome of

microbial rumen inhabitants (microbiome) considerably extends the metabolic

capacity of the ruminant.

Plant material is ingested by the cow during grazing and enters the rumen

where muscular contractions facilitate mixing of the ingesta with saliva and

microbes. After feeding has commenced, the forage material is periodically

regurgitated and re-chewed through a process known as rumination which

functions to decrease the particle size. The major rate-limiting step to energy

acquisition from the plant polysaccharides is the deconstruction of plant cell wall

polysaccharides into monosaccharides for subsequent fermentation by the

microbial inhabitants of the rumen (52). Thus, to permit complete digestion of cell

wall material, rather long transit times of digesta are necessary within ruminants

with the retention time of feedstuff in cows lasting between 50 and 60 h,

depending on the digestibility of the feedstuff (119). During this time microbes

within the rumen secrete extracellular that catalyze the degradation of plant cell wall polysaccharides into monosaccharides or disaccharides which are taken up and fermented. The major fermentation products in the rumen include the volatile fatty acids (VFAs), acetate, propionate, and butyrate and the gasses carbon dioxide and methane. The gasses are expelled from the ruminant by the

2

process of eructation, and the volatile fatty acids are absorbed through the wall of

the rumen to enter the bloodstream for subsequent transfer to the liver. Simple

sugars are typically nonexistent within the rumen since they are rapidly taken up

by resident microbes, thus the glucose required for the cow’s normal metabolic

processes cannot be absorbed from the rumen and rather is reconstituted from

VFAs via gluconeogenesis in the liver.

Microbial inhabitants in the bovine rumen include representatives from all three domains of life (Bacteria, Archaea, and Eukaryota); however, the most abundant members are the rumen bacteria whose numbers approximate 1 x 1010

organisms per milliliter of rumen fluid. Traditional methods of bacterial

enumeration relied upon culture counts, however, in recent years these

techniques have been largely supplanted by culture-independent molecular

phylogenetic methods. Culture-independent methods, based upon the analysis of

rRNA sequences (84), have revolutionized the view of microbial diversity in the

rumen and these approaches have facilitated the transition from the original

belief that twenty two major strains of bacteria are present in the rumen (58) to

the current estimation of nearly 16,000 bacterial taxa (31). Despite this high

species abundance in the rumen, the phylogenetic diversity is quite low with only

9 of the 55 bacterial divisions having been identified (31). The genera Prevotella,

Clostridium and Eubacterium each constitute ~30% of the phylotypes in the

rumen and together they represent more than 98% of the ribosomal RNA

sequences identified in metagenomic studies of the rumen (70).

3

Nevertheless, it is obvious that the phylotypic diversity in the rumen is very

high and this diversity may be explained partly by the trophic abundance in the rumen (67). That is to say that the forage material which enters the rumen is

complex and that a single (or even one hundred) bacterial species may not be

capable of elaborating all of the metabolic machinery necessary for harvesting

energy from the various components of ruminal digesta. In this regard, some

bacteria may have evolved to occupy a particular niche, thus becoming proficient

at metabolizing certain dietary components (58).

This idea is illustrated by the metabolic profile of the cellulolytic rumen

bacterium, Fibrobacter succinogenes S85. F. succinogenes S85 is a Gram

negative obligate anaerobic bacterium which adheres to, degrades, and utilizes

digestion products of crystalline cellulose. Cellulose consists of strands of β-1,4

linked glucose molecules which are tightly connected by interstrand hydrogen

bonds. The resulting structure is a highly crystalline polymer which is critical for

conferring structural rigidity to the plant cell wall. The cellulose is typically tightly

associated with pentosan hemicellulosic polymers which restrict access of the

bacterium and its hydrolytic enzymes to the cellulose. To circumvent this problem

of limited accessibility, in addition to its numerous cellulolytic enzymes, F.

succinogenes S85 secretes a variety of hemicellulolytic enzymes including

endoxylanases which are capable of deconstructing xylan polymers that are

commonly found associated with cellulose. Interestingly, F. succinogenes S85

does not possess xylose transporters or xylose and thus is incapable

of metabolizing the xylose sugars released from xylan (78). These observations

4

suggest that the primary role for hemicellulases secreted by F. succinogenes

S85 may be to clear the hemicellulose away from the underlying cellulose,

thereby permitting the bacterium to gain access to and metabolize cellulose.

Although there are other examples of specialist rumen bacteria including

Ruminobacter amylophilus, which utilizes starch and its derivatives and

Oxalobacter formigenes, which obtains energy from the reduction of oxalate to formate (26), there also exist bacteria which are more general in their substrate specificities (108). These ‘generalists’ include Butyrvibrio fibrosolvens,

Selenomonas ruminantium, and Prevotella ruminicola 23 amongst others. These

bacteria are capable of degrading a wide variety of plant-derived polysaccharides

including starch, xylan, and pectin. In addition, these generalist bacteria can

function as scavengers, competing with other rumen bacteria for hydrolytic

products from plant cell polymers. For example, Selenomonas ruminantium is

incapable of growing with cellulose as the sole carbohydrate source, however

when inoculated in co-culture with Fibrobacter succinogenes S85 (a cellulose

degrading bacterium), the two bacteria grew efficiently with crystalline cellulose

as the sole carbohydrate source. These data suggest that F. succinogenes S85

disrupts the cellulose structure and makes cellulose fragments available for S.

ruminantium to metabolize (96, 101).

In addition to the numerically dominant rumen bacteria, there are

significant populations of protozoa, anaerobic fungi, and methanogenic archaea

in the rumen. Although there is no consensus amongst rumen microbiologists on

the value of the protozoa to the bovine ruminant in terms of overall meat and milk

5

production, these organisms do contribute to a variety of processes including

fiber breakdown, starch digestion, and VFA production (137). The anaerobic fungi of the rumen produce a diverse array of cellulolytic and hemicellulolytic enzymes and are capable of growing in pure culture with cellulose and a variety of other plant polysaccharides as the sole carbohydrate substrates (85). The

precise role of the fungi in ruminant nutrition remains unknown, however it has

been shown that the anaerobic fungi improve the digestibility of low quality forage

foods which are rich in cellulose and xylan, whereas they have no significant

effect on the digestibility of low-fiber diets which are high in starch (44).

When grown in monoculture, the products of bacterial fermentation

typically include hydrogen, ethanol, and succinate; however none of these

compounds are found at significant levels in the rumen. This is due in part to the microbial cross-feeding that takes place within the rumen. There are a number of dominant bacteria in the rumen that produce succinate as the major fermentation product; however, succinate does not accumulate because it is rapidly decarboxylated to propionate by other ruminal bacteria. Methanogenic archaea also play a vital role in bacterial fermentations in the rumen through an interaction known as interspecies hydrogen transfer (79). When grown in monoculture, Ruminococcus albus 8 ferments glucose to ethanol, acetate, H2,

and CO2 (80); however when grown in co-culture with Methanobrevibacterium

ruminantium, a methanogen that metabolizes H2, R. albus 8 no longer produces

ethanol as an electron sink and metabolic flux resulting in acetate production

increases. The immediate effect of inter-species hydrogen transfer in the rumen

6

is that the molar yield of ATP during bacterial fermentation of glucose increases from 3 to 4 in the presence of methanogens. Thus, methanogenic archaea increase the energy yield of bacterial metabolism and thereby drive forward bacterial fermentation in the rumen.

1.2 CHARACTERISTICS OF PREVOTELLA BRYANTII B(1)4 AND PREVOTELLA RUMINICOLA 23

Among the bacterial genera in the rumen, Prevotella are the most numerically abundant species by both culture counts and by analysis of ribosomal RNA abundances (31, 107). These bacteria exhibit extensive variability both in terms of genetic and phenotypic properties (5) and consequently the classification of these organisms has seen considerable emendation since the time of their first isolation by Bryant et al. (17). These

organisms were first classified as two species, Bacteroides ruminicola subsp.

ruminicola and B. ruminicola subsp. brevis, however as more strains were

isolated, it became clear that these organisms differed considerably from the

human colonic Bacteroides and thus they were re-classified into the new genus

Prevotella (102). Prevotella ruminicola was later subdivided into four species,

Prevotella ruminicola, Prevotella bryantii, Prevotella albensis, and Prevotella

brevis on the basis of carbohydrate utilization patterns and genotypic

characteristics (5) (Table 1.1). The type strains of the four ruminal Prevotella spp.

T T are as follows: Prevotella bryantii B14 (DSM 11371), Prevotella brevis GA33

(ATCC 19188), Prevotella albensis M384T (DSM 11370), and Prevotella

ruminicola 23T (ATCC 19189). As can be appreciated from the results

7

summarized in Table 1.1, the metabolic profiles for strains clustered into the two

species P. bryantii and P. ruminicola are broader than those for P. brevis and P.

albensis, with most of these strains exhibiting carboxymethyl cellulase (CMCase)

and xylanase activities, in addition to fermenting a wide variety of mono- and

disaccharides. Prevotella spp. are largely responsible for much of the proteolytic

activity within the rumen and are important for breaking down plant protein into

usable nitrogen, which becomes accessible for other microbes within the rumen

(49, 132).

The genomes of these bacteria may harbor multiple copies of the 16S

rRNA gene (P. ruminicola 23, 4 copies; P. bryantii B14, 11 copies), although

diversity in these genes is low with nucleotide sequence divergence between

these copies ranging from 0.3-1.2%. Phylogenetic analyses of the multiple rRNA

genes from P. bryantii B14 resulted in the grouping of all rRNA copies within a single operational taxonomic unit (OTU), thus it is anticipated that these multiple

copies do not cause significant difficulties in the phylogenetic classification of

distinct species within the genus Prevotella (90).

1.2.1 CHARACTERISTICS OF PREVOTELLA BRYANTII B(1)4

Prevotella bryantii B14 exhibits an incredibly versatile metabolic profile and

is able to ferment a wide variety of monosaccharides in addition to the

polysaccharides, xylan, pectin, dextrin, and starch (15, 17, 54). In addition to this

metabolic diversity, one of the major reasons why P. bryantii B14 can compete so successfully in the rumen is because it has a highly efficient energy metabolism

8

and can multiply rapidly (54). Indeed the molar growth yield after adjusting for intracellular polysaccharide (which P. bryantii B14 rapidly accumulates during logarithmic growth (21)) is quite high with 66 g (dry weight cells) produced per mol of glucose fermented. Interestingly, the growth yields of P. bryantii B14 with the pentose sugars, xylose (62 g cells/mol substrate) and arabinose (68 g cells/mol substrate) were comparable to the growth yields with glucose (118).

The fermentation product distribution between the hexose and pentose grown cells was comparable with the major products being succinate, acetate, and formate in similar proportions (54, 118). Arabinose metabolism (18) and xylose metabolism (78) proceed via hexose re-synthesis combined with the Embden-

Meyerhof-Parnas sequence. Although glucose utilization is constitutive (109) in P. bryantii B14, arabinose and xylose utilization is inducible (18).

P. bryantii B14 is unable to grow with intact crystalline cellulose as the sole carbohydrate source; however, this strain can grow efficiently with soluble cello- oligosaccharides (96). A 40.5 kDa protein with CMCase activity was purified from an E. coli clone harboring a cloned genomic fragment of P. bryantii B14 (77). This protein does not possess a cellulose binding domain and thus is not capable of cleaving crystalline cellulose. Further studies revealed that two polypeptides (82 kDa and 88 kDa) are produced in P. bryantii B14 that have CMCase activity and cross react with antibodies raised against the 40.5 kDa CMCase, however a 40.5 kDa protein was not identified in P. bryantii B14 cell lysates (76). N-terminal sequencing of the two high molecular weight CMCase proteins revealed that they were both encoded by the same DNA fragment and that the 88 kDa protein is

9

encoded in two reading frames. Furthermore, the smaller 40.5 kDa protein

expressed in E. coli also originated from the same DNA fragment, but was

predicted to be a cleavage product of the larger 82 or 88 kDa proteins. It is

interesting to note that the smaller protein has full CMCase activity, and BLAST

analysis reveals that the mature 82 kDa polypeptide has two predicted glycoside

hydrolase (GH) catalytic domains; an N-terminal GH family 26 domain and a C-

terminal GH family 5 domain (Dylan Dodd, unpublished observations, Table 1.2).

The authors did not address the possibility of the mature protein having two

different domains and thus they may potentially have missed an additional

activity that this protein may possess.

Expression of the endoglucanase was induced during growth on mannose,

guar gum, and cellobiose (40). A mutant of P. bryantii B14 was isolated following

chemical mutagenesis and was deficient in both endoglucanase and β-

mannanase activities (34). Antisera raised against the previously cloned

endoglucanase did not cross-react with cell lysates from the mutant, indicating that the endoglucanase gene was not expressed. The mutant was deficient in both CMCase and mannanase activities and lost the capacity to grow with mannan, but after successive passages with cellobiose and mannan, a revertant was obtained that regained the ability to utilize mannan. Antisera to the

endoglucanase were found to cross-react with cell lysates from the revertant

suggesting that this enzyme may have been responsible for the growth defect.

One potential caveat to this study was that an additional putative mannanase

(Orf6) occurs immediately downstream of the endoglucanase gene and it is

10

possible that the mutation that was characterized may have had a polar effect on

this gene. The CMCase enzyme was found to be extracellular, but cell-

associated (39). Based on this observation and the fact that the enzyme does not

have a cellulose binding domain, it was proposed that the physiological role for

the enzyme may be to cleave soluble cellulose chains into smaller cello-

oligosaccharides which may subsequently be taken up and fermented. This

mechanism may permit P. bryantii B14 to scavenge cellulose fragments in the

rumen which are released by cellulolytic microbes.

A β-1,4-D-glucan glucohydrolase (CdxA) enzyme with cellodextrinase

activity was cloned from P. bryantii B14 (141). This enzyme was membrane- associated, although no data were collected to identify whether the enzyme was present in the periplasm or on the outer membrane surface of the bacterium. The

CdxA enzyme had higher activity with cello-oligosaccharides as compared to

cellobiose, which may indicate that its role in the cell is to degrade cellulose

byproducts and not cellobiose. Cellobiose phosphorylase enzyme activity was

detected in cell extracts of P. bryantii B14, indicating that energy conservation in

P. bryantii B14 during growth on cello-oligosaccharides may be improved via

phosphorylytic cleavage of cellobiose.

In summary, considerable work has been done to study the utilization of

soluble cellulose fragments by P. bryantii B14 and the overall model for this is

summarized in Figure 1.1. First, soluble cellulose fragments are converted to

cello-oligosaccharides by the extracellular, membrane attached endoglucanase.

These oligosaccharides or cellobiose molecules that may be liberated by the

11

action of cellulolytic bacteria in the rumen are transported into the periplasm by an unknown process. CdxA then cleaves a fraction of these into glucose and cellobiose which are then transported into the cell where cellobiose may be phosphorylytically cleaved by cellobiose phosphorylase. The resulting glucose-1- phosphate is converted to glucose-6-phosphate by phosphoglucomutase, and may then enter the Embden-Myerhof-Parnas pathway.

Another important property of P. bryantii B14 is its remarkable capacity to grow efficiently at relatively low pH values (99). Cellulolytic rumen bacteria including Fibrobacter succinogenes (23) and Ruminococcus albus (116) are highly sensitive to drops in pH and cellulose hydrolysis in the rumen decreases considerably at low pH values (86). Traditionally this has been explained by the suggestion that volatile fatty acids which are present in the rumen may act as uncouplers of the cell membrane potential at low pH values when they are protonated. This does not explain, however, why certain ruminal bacteria are resistant to low pH values and fairly high VFA concentrations. A better explanation for differences in susceptibility of bacteria to low pH values in the presence of VFAs was proposed by James Russell, when he measured the intracellular pH values of a variety of acid tolerant ruminal bacteria during growth at different pH values (98). These results indicated that the acid tolerant ruminal bacteria decrease the intracellular pH values in response to decreasing extracellular pH, thus maintaining a near constant difference in pH across the membrane. The mechanism by which these bacteria are able to modulate their internal pH without any deleterious effect on metabolism remains undetermined.

12

Ruminal acidosis is a significant problem that has a negative effect on

fiber degradation and consequently meat and milk production in cattle. There is

significant interest in circumventing these problems and considerable effort has

been invested in identifying a cellulolytic organism that is resistant to drops in pH

so that cellulose degradation may continue during acidotic conditions. No

cellulolytic bacteria have been identified that are capable of growing under

acidotic conditions and correspondingly, efforts have focused on engineering

acid tolerant ruminal bacteria to degrade cellulose. Due to the capacity of P.

bryantii B14 to efficiently utilize solubilized cellulose fragments, and the fact that

P. bryantii B14 can grow at pH values as low as 5.1, this bacterium is an

attractive candidate (100). A proof of principal experiment was performed

wherein an acid-stable cellulase from Thermomonospora fusca was added to P.

bryantii B14 cultures with ball milled cellulose as the sole carbohydrate source

permitted growth at a range of pH values from 5.2 to 6.4 (77). The native

endoglucanase from P. bryantii B14 did not possess a cellulose binding domain

and so Wilson and colleagues reconstructed the endoglucanase gene to fuse a

cellulose binding module from the Thermomonospora fusca cellulase (E2) to the

P. bryantii B14 endoglucanase (74). The fusion protein had approximately 10-fold increased activity when measured in vitro with ball milled cellulose and looked to be a promising candidate for genomic integration into P. bryantii B14 to make this

organism cellulolytic. The fusion protein was then cloned into a Bacteroides-

Prevotella shuttle vector (103) using a P. ruminicola 23 endo-xylanase promoter

upstream of the fusion gene (37). The fusion protein was expressed in

13

Bacteroides uniformis, the strain used for conjugal transfer, but the protein was not expressed in P. bryantii B14. This result likely stems from the fact that the genetic and phenotypic diversity amongst ruminal Prevotella spp. is high and that

RNA polymerases may recognize different promoter elements amongst these bacteria. Efforts were made to introduce this construct into P. ruminicola 23 which also represents an acid tolerant bacterium that can utilize cellulose fragments, however, after repeated attempts, no transconjugants were obtained following mating of B. uniformis with P. ruminicola 23. These studies expose the problem that members of the genus Prevotella are highly resistant to genetic manipulation, which may be a result of the large number of extracellular that these organisms possess (1, 2).

Ruminal Prevotella spp. are capable of hydrolyzing protein and converting amino acids into useable nitrogen and branched chain fatty acids within the rumen; indeed, based on the numerical abundance and amount of ammonia produced, it is suggested that Prevotella spp. are the most important ammonia producing bacteria in the rumen (10). Although P. bryantii B14 was unable to grow on Trypticase alone, the maximum growth yields of glucose-limited cultures were significantly enhanced by Trypticase addition (97). This is unique amongst the ruminal bacteria as many of these organisms prefer ammonia to amino nitrogen, and grow rather poorly with amino acids or peptides as the sole nitrogen source (16). Despite the fact that proteinaceous nitrogen sources result in increased growth rates over ammonium grown cultures, the maximum cell density is higher with ammonium (46). These later studies indicate that the

14

important proteolytic activity that P. bryantii B14 possesses may be related to

peptide hydrolysis rather than to protein hydrolysis. Analogous to the cellulolytic

activities in this organism, the proteolytic activities are typically cell associated

(138).

Plant lignocellulose that becomes a primary substrate for bacterial

fermentation in the rumen consists primarily of cellulose, and much attention has

been paid to the mechanisms by which cellulolytic bacteria degrade plant fiber.

Another important and somewhat overlooked lignocellulosic polysaccharide that

supports bacterial fermentation in the rumen is xylan. In addition to contributing

to nitrogen cycles within the rumen, one of the most significant activities of the

ruminal Prevotella spp. is the capacity to ferment hemicellulosic polysaccharides

including xylans (27). P. bryantii B14 can ferment a variety of xylans, although the

degree of utilization of insoluble xylans is considerably lower than that of other

rumen bacteria, especially Butyrivibrio fibrosolvens strains (51).

On the basis of Southern blotting of genomic digests with an endoxylanase gene probe, Avgustin et al. predicted that the P. bryantii

B14 genome harbors at least two endoxylanases (4). This proposal was later

supported when an additional endoxylanase gene (xynA) was identified within a

xylan degrading gene cluster (42). The xylanolytic gene cluster, which contains

the xynA gene, also contains genes predicted to encode a hybrid two component

system (HTSC) regulator (xynR), a glycoside hydrolase family 43 β-D-

xylosidase/α-L-arabinofuranosidase (xynB), a xylose/H+ symporter (xynD), a

carbohydrate esterase family 1 acetylxylan esterase (xynE), and the N-terminal

15

fragment of a glycoside hydrolase family 67 α-glucuronidase (xynF) (Figure 1.2).

This arrangement of xylanolytic genes is analogous to a similar system which

was identified by Salyers and colleagues in Bacteroides ovatus V975 (133). The genes, xynA and xynB were cloned into expression vectors and over-expressed

in E. coli (42) and were found to encode enzymes with endo-xylanase and β-D-

xylosidase/α-L-arabinofuranosidase activities, respectively. The enzymatic activity for XynB was sensitive to oxygen, which may be a reflection of the anaerobic environment under which this enzyme has evolved. The HTCS gene, xynR, was investigated with regards to its potential capacity to regulate xylanase

gene expression. The C-terminal helix-turn-helix region of this gene product was shown to bind to a DNA fragment containing a 141 bp region spanning the translational start site for the putative xylose/H+ symporter (xynD) in the xylan utilization cluster (81). Additional efforts to map the precise binding sites and regulatory role for XynR were performed by Miyazaki et al. (81), however

interpretation of their data was possibly inaccurate and will not be discussed in

this section. An additional endoxylanase (XynC) with an interrupted catalytic domain was identified in a genomic library of P. bryantii B14 (35). This enzyme

has a unique domain architecture that consists of a glycoside hydrolase (GH)

family 10 domain with a domain of unknown function inserted within the catalytic

region. The role of this insertional domain was not identified, however based on

the close proximity to the , it was predicted that this domain may

influence either enzymatic activity or substrate binding.

16

1.2.2 CHARACTERISTICS OF PREVOTELLA RUMINICOLA 23

P. ruminicola 23 is an important member of the rumen microbiota and was

one of the first bacterial strains isolated from the rumen of cattle. This bacterium

has an absolute requirement for heme and detailed studies by Caldwell et al.

showed that the common heme-biosynthesis pathway is present in this strain,

however some or all of the enzymes involved with synthesis of the tetrapyrrole

nucleus from linear molecules are either lacking or inactive (19). When P.

ruminicola 23 was first isolated by Bryant et al., the fermentation products

detected included acetate, succinate, and formate (17). When Dehority (27) later

grew the bacterium in the presence of 40% rumen fluid, he identified an

additional product, propionate, and found that the proportion of succinate

produced was highly diminished as compared to the study conducted by Bryant.

The mechanism by which propionate was produced in response to added rumen

fluid was not discovered until some 30 years later when Strobel showed that

propionate production by P. ruminicola 23 was absolutely dependent upon

exogenous addition of vitamin B12 (110). Strobel suggested that in the presence

of vitamin B12, succinate may be converted to propionate via a sequence of reactions that includes the conversion of succinyl-CoA to methylmalonyl-CoA; a conversion that requires the action of a vitamin B12-dependent methylmalonyl

epimerase. Although initial studies in Strobel’s laboratory were unable to identify

enzymes in this pathway, analysis of the recent complete genome sequence for

P. ruminicola 23 has confirmed that these genes do exist and this is likely to be

the preferred route for propionate production (Dylan Dodd, unpublished

17

observations). As is the case for P. bryantii B14, P. ruminicola 23 also has the capacity to produce branched-chain VFAs and ammonia from peptides (3), a role that is likely to be very important in the rumen.

As indicated in Table 1.1, P. ruminicola 23 has a rather broad substrate range and is able to ferment a variety of different plant cell wall derived monosaccharides and polysaccharides. Dehority and colleagues found that P. ruminicola 23 could grow on pectin from intact forages and was also able to utilize galacturonic acid as the sole carbohydrate source in pure culture (45). P. ruminicola 23 was also able to grow with xylan (27) and cellodextrins (96) as a sole carbohydrate source. Several studies have explored the carbohydrate active enzymes from P. bryantii B14 (see above), however considerably less is known about the metabolic systems at work in Prevotella ruminicola 23. These two species share 88.9% nucleotide identity in their 16S rRNA genes (1473 nucleotides aligned), however, there is evidence that these two species exhibit differences in their xylanolytic systems (4). One endo-xylanase gene has previously been characterized from P. ruminicola 23: a 66 kDa bifunctional endo- xylanase/endo-glucanase gene with significant sequence identity to glycoside hydrolase (GH) family 5 endoglucanases (134, 136). The functional role for this gene's product was not studied in any significant detail and it still remains unclear what the cellular location is or its role in the physiology of P. ruminicola 23.

18

1.3 ENZYMATIC DEGRADATION OF XYLAN

The composition of dry plant biomass harvested from perennial grasses consists primarily of the plant cell wall polymers, cellulose (31%), xylan (20%) and lignin (18%) (Figure 1.3). Cellulose is a highly homogeneous polymer that consists of β-1,4-linked glucose units, which make inter-strand hydrogen bonds to form a highly stable crystalline lattice (69). Xylan, on the other hand, is a heteropolymeric hemicellulosic polymer that consists predominantly of the pentose sugars arabinose and xylose, although it may also contain glucuronyl, feruloyl, and acetyl groups (30). Complete hydrolysis of the two major components, cellulose and xylan, of perennial grasses, therefore, releases glucose, xylose and arabinose, which can then be fermented by ruminal microbes. The third major component, lignin, is a phenolic polymer that associates with plant cell wall polysaccharides, mainly through hydroxycinnamic acids, such as p-coumaric acid and ferulic acid in grasses and other graminaceous plants (111). The two hydroxycinnamic acids are mostly found engaged in ester linkages to arabinose units in xylan and ether linkages to hydroxyl groups in lignin (59). The cross linking of plant cell wall polysaccharides and lignin through ferulic acid and p-coumaric acid is known to negatively impact hydrolysis to monomeric sugars and ultimately biodegradation.

Xylan is a heteropolymeric substrate consisting of a repeating β-1,4-linked xylose backbone decorated with acetyl, arabinofuranosyl, and 4-O-methyl glucuronyl groups and as stated above, xylan may be cross linked to lignin by aromatic esters. In order to efficiently depolymerize xylan to the component

19

monosaccharides, a mixture of different enzymatic functionalities is required,

including endo-1,4-β-xylanases (EC 3.2.1.8), β-D-xylosidases (EC 3.2.1.37), α-L-

arabinofuranosidases (EC 3.2.1.55), α-glucuronidases (EC 3.2.1.139), acetyl xylan esterases (EC 3.1.1.72), and ferulic/coumaric acid esterases (EC 3.1.1.73).

Much is known about the individual enzymatic functionalities required to

depolymerize xylan and hundreds, perhaps thousands, of xylanolytic enzymes

from a variety of microbial sources have been identified (CAZy;

http://www.cazy.org/). A significant limitation to optimizing xylan depolymerization

is the lack of detailed knowledge of both the structural diversity in xylan and the

corresponding enzymatic strategies employed by microbes to hydrolyze the

linkages within this complex heteropolymer. However, current efforts in microbial

and structural genomics are gradually leading to a better understanding of the

structural basis for substrate specificity of a number of xylanolytic enzymes

belonging to different glycoside hydrolase families. The purpose of the current

section is to summarize these recent insights, and to discuss their significance

for further understanding the degradation of xylan by ruminal bacteria,

specifically the ruminal Prevotella spp.

Xylans can be broadly classified as homoxylans, arabinoxylans,

glucuronoxylans, and arabinoglucuronoxylans. Homoxylans, which consist of a

chain of β-1,4 and β-1,3 linked xylose units, are relatively rare in higher plants,

but represent a significant structural component of cell walls in red seaweeds

(87). All of the xylans of higher plants are based on a backbone of β-1,4-linked

xylopyranose sugars, and are typically substituted with acetyl groups and other

20

sugar residues (Figure 1.4). Arabinoxylans are principal components of the plant cell walls, especially in cereal grains including wheat, and consist of a xylose backbone with arabinose residues linked to the O-2 or O-3 of xylose. Xylose residues may be singly or doubly substituted with arabinose. Furthermore, the arabinose residues may have additional linkages to the phenolic compound ferulic acid, which may form covalent crosslinks to lignin or ferulic acid groups in other arabinoxylan chains. Glucuronoxylans are mainly found in hardwoods, herbs and woody plants (30, 117), and typically consist of a xylose backbone with 4-O-methyl-α-D-glucuronic acid (MeGA) residues linked off of the O-2 of xylose. Arabinoglucuronoxylans are typically found in the lignocellulose isolated from grasses and have arabinofuranosyl, MeGA, and acetyl side chains linked to the xylose backbone, and reflect the composition of biomass currently targeted for biofuel production. While the relative proportions of plant cell wall polysaccharides within biofuel feedstocks are generally known, the dearth of knowledge on how individual sugar components are chemically linked within the cell wall represents an important limitation to identifying the molecular mechanism for xylan depolymerization by rumen bacteria.

The structural diversity of plant cell wall xylans necessitates an equal diversity in the repertoire of enzymes for deconstruction of these polysaccharides.

Thus, xylan-degrading bacteria and fungi have evolved diverse enzymatic machineries for the hydrolysis of xylan, and evidence is readily gleaned from the genome sequences of such organisms

(http://www.ncbi.nih.gov/genomes/lproks.cgi). Detailed structural and biochemical

21

studies of the substrate specificities for these enzymes are beginning to provide

important insights into the function and evolution of these proteins.

Plant cell wall polysaccharides are highly stable polymers composed of

repeating sugar units whose glycosidic linkages exhibit half lives at room

temperature on the order of 20 million years (139). Organisms across the three

domains of life (eukarya, archaea and bacteria) have evolved glycoside

hydrolases (GHs), which are extraordinarily efficient catalysts that enhance the

rate of hydrolysis of glycosidic linkages over the non-catalyzed reaction by up to

17 orders of magnitude (139). Xylan degrading GHs hydrolyze glycosidic

linkages by inversion or retention of stereochemical configuration at the anomeric

carbon. Both of these two different mechanisms employ a pair of carboxylic acid residues in the active site. Retaining glycosidases employ a double displacement mechanism involving two consecutive bimolecular substitution (SN2) reactions.

The first substitution is carried out by a catalytic carboxylic acid nucleophile

which results in the formation of a covalent enzyme-substrate intermediate (128),

and the second is performed by an activated water molecule (Figure 1.5A).

Inverting glycosidases, on the other hand, employ a single displacement reaction

wherein a carboxylic acid group acts as a base to activate a water molecule for

attack at the anomeric carbon (Figure 1.5B). In the following sections, the

enzymes required for xylan hydrolysis into its component sugars for subsequent

fermentation by rumen microbes are described.

22

1.3.1 THE ENDO-1,4-β-XYLANASES

One of the fundamentally important enzymatic activities required for the

depolymerization of xylans is endo-1,4-β-xylanase (xylanase) activity. These

enzymes cleave the β-1,4 glycosidic linkage between xylose residues in the

backbone of xylans. Xylanases have been classified into glycoside hydrolase

(GH) families 5, 7, 8, 10, 11, and 43 on the basis of their amino acid sequences,

structural folds, and mechanisms for (20). Previous reviews have

discussed the biochemical and mechanistic properties of these enzymes in great

detail (6, 24, 140). The binding sites for xylose residues in xylanases are termed

subsites with bond cleavage occurring between sugar residues at the -1 subsite

(non-reducing end) and the +1 subsite (reducing end) of the polysaccharide substrate (Figure 1.6) (25). GH 10 and 11 xylanases represent the best studied

xylanase families, and they differ in the number of subsites they possess, with

GH 10 having four or five subsites (7, 9, 28) and GH 11 having at least seven

subsites (12, 130). While commercially available non-decorated xylo-

oligosaccharide substrates are convenient for assessing xylanase activity, limited

information can be drawn from studies with these substrates since they do not

represent the natural xylan substrates that these enzymes would come into

contact with when deconstructing plant cell walls. A recent study assessed the

activity of several GH 10 and GH 11 proteins with purified xylo-oligosaccharides

substituted with 4-O-methyl-glucuronic acid (MeGA) and revealed that GH 10

enzymes cleave xylan chains when MeGA is linked to xylose at the +1 subsite,

whereas GH 11 enzymes cleave xylan when MeGA is appended at the +2

23

subsite (Figure 1.7) (64). Direct support for these results was reported in a recent study on the mass spectra of the products of hydrolysis for GH 10 and GH 11 with arabinoxylan substrates. The results indicated that GH 10 products have arabinose residues substituted on xylose at the +1 subsite, whereas GH 11 products have arabinose residues substituted at the +2 subsite (75). These results suggest that GH 10 enzymes are able to hydrolyze xylose linkages closer to side chain residues, and thus help to explain why GH 10 enzymes release shorter products than GH 11 enzymes when incubated with arabinoglucuronoxylan substrates (9).

Recent structural and other biochemical studies support the data above and indicate that while GH 10 and GH 11 enzymes are both able to bind decorated xylo-oligosaccharides (36, 89, 125, 126), GH 10 enzymes can accommodate linkages on xylose at the +1 subsite (Figure 1.8A) (62, 88), whereas GH 11 enzymes can accommodate linkages on xylose at the +2 subsite

(Figure 1.8B) (125). Furthermore, the structure for a GH 10 enzyme complexed with feruloyl arabinoxylodisaccharide (FAX2) revealed that the arabinose substituent appended at the +1 subsite was a significant determinant of substrate specificity for a GH 10 xylanase from the thermophilic fungus Thermoascus aurantiacus (126). Taken together these results support the idea that xylan side chain decorations are recognized by xylanases, and the degree of substitution in xylan will influence the products of hydrolysis for xylanases. This difference in substrate specificity for xylanases has important implications in the deconstruction of xylan. For example, α-glucuronidases can only release MeGA

24

from a terminal non-reducing end xylose unit, thus the products of a GH 10

enzyme acting on glucuronoxylan will be a direct substrate for α-glucuronidase

(Figure 1.7A), whereas α-glucuronidases will be unable to hydrolyze GH 11 products since the MeGA will be substituted off of the penultimate xylose residue

(Figure 1.7B).

The foregoing discussions, therefore, underscore the importance of understanding the natural substrate specificities for xylanolytic enzymes and indicate that comprehensive knowledge of the structure of the xylan substrate will be indispensable for determining the mode of action of bacterial xylanolytic enzymes.

1.3.2 THE α-L-ARABINOFURANOSIDASES

The function of α-L-arabinofuranosidases in xylan deconstruction is to

remove arabinose side chains from the xylose backbone of arabinoxylan or

arabinoglucuronoxylan. These enzymes are grouped into four different families of

glycoside hydrolases (GH 43, 51, 54, and 62), and can hydrolyze glycosidic

linkages with net inversion (GH 43) or retention (GH 51, 54) of stereochemical

configuration at the anomeric carbon. In the context of arabinoglucuronoxylans,

there is diversity in the types of linkages for arabinosyl units, and this has

significant consequences in terms of the type of enzyme that removes arabinose

side chains. Specifically, arabinose may be linked to the O-2 or O-3 of xylose in

the xylan backbone, and these xylose sugars may be either singly or doubly

substituted with arabinose. Furthermore, some arabinose molecules may have

25

ferulic acid molecules esterified to the 5'-OH which may subsequently be covalently tethered to ferulic acid residues attached to other arabinoxylan chains

(Figure 1.4).

Members of the GH family 43 AF’s display a variety of different substrate specificities, which are beginning to be understood in greater molecular detail as more three-dimensional structures are solved for these enzymes. Enzymes with

α-L-arabinofuranosidase (AF) activity may be grouped into three major classes, a) AF type A enzymes, which are only active on short arabino-oligosaccharides

and pNP-α-L-arabinofuranoside (92), b) AF type B enzymes, which are active on

a wider variety of substrates including short oligosaccharides and longer

polysaccharides such as arabinoxylan and arabinan (92), and c) arabinoxylan

arabinofuranohydrolases (AXHs), which are mainly active on arabinoxylan (11,

65, 66). The AXHs can further be classified based on whether they release α-L-

arabinose from only mono-substituted xylose residues (AXH-m) (11, 63, 66) or

only di-substituted xylose residues (AXH-d) (106, 120, 122, 123).

Recently, crystal structures have been reported for a GH 43 AF type A β-

xylosidase/α-L-arabinofuranosidase from the rumen bacterium, Selenomonas

ruminantium (13) and a GH 43 AXH-m from the soil bacterium Bacillus subtilis

(124). These enzymes have distinct substrate preferences with the S.

ruminantium enzyme (SXA), having high activity on pNP-β-D-xylopyranoside,

pNP-α-L-arabinofuranoside (135) and xylo-oligosaccharides (60), whereas the B.

subtilis enzyme (BsAXH-m2,3) exhibited highest activity on pNP-α-L-

arabinofuranoside and water extractable arabinoxylans (11). The domain

26

architectures for these two enzymes, revealed by the three-dimensional

structures, provide significant insight into the differences in the substrate

specificities for these two enzymes. Although they both possess an N-terminal 5-

bladed β-propeller fold (common to GH 43 enzymes), BsAXH-m2,3 has a C- terminal carbohydrate binding module family 6 (CBM6) -like domain that is clear of the active site region (Figure 1.9A), whereas SXA has a large C-terminal β- sandwich domain that projects a loop which closes off the active site pocket for the enzyme (Figure 1.9B). For BsAXH-m2,3, a groove in the protein is located just above the active site that binds to xylo-oligosaccharides and allows for arabinose side chains substituted from either the O-2 or O-3 of xylose to enter the active site for hydrolysis. The SXA on the other hand does not possess this groove, and the β-sandwich loop restricts the size of the active site such that only

1-2 residues may enter the active site, resulting in an exo-type enzyme activity.

1.3.3 THE XYLAN-1,4-β-XYLOSIDASES

Xylan-1,4-β-xylosidase (β-xylosidase) enzymes release xylose monomers from the non-reducing end of xylo-oligosaccharides. As described earlier, xylanases break down xylan polymers into shorter fragments, thereby increasing the total number of non-reducing ends available for hydrolysis by β-xylosidases.

β-xylosidases are grouped into five different families of glycoside hydrolases (GH

3, 39, 43, 52, and 54) (50), and their reaction mechanisms result either in inversion (GH 43) or retention (GH 3, 39, 52, and 54) of stereochemical

27

configuration at the anomeric carbon. The most abundant and best characterized

β-xylosidases are from GH 3 and 43 (112).

The crystal structures for two biochemically characterized GH 43 β-

xylosidase enzymes have revealed the presence of two domains, an N-terminal

five-bladed β-propeller domain, and a C-terminal α/β sandwich domain (13, 14).

The catalytic residues are located in a shallow active site cleft in the center of the

β-propeller domain, which is closed off by a phenylalanine residue contributed from the C-terminal α/β sandwich domain. These enzymes possess two subsites for sugar binding, and it is anticipated that when hydrolyzing oligo-saccharides

longer than two sugar residues, the remaining residues will extend out into

solution (Figure 1.10). This prediction is supported by biochemical analyses of

GH 43 β-xylosidases that reveal a decrease in the catalytic efficiency (kcat/KM)

when active on xylo-oligosaccharides with a degree of polymerization greater

than two (60, 121, 131), suggesting that these enzymes possess just two xylose

binding sites.

The GH family 3 represents a large group of glycosidic enzymes, and

includes members that possess several distinct enzymatic activities including β-

D-glucosidase, β-D-xylosidase, α-L-arabinofuranosidase and N-acetyl-β-D-

glucosaminidase activities (33, 48). At the current time, the Pfam database

reports a total of 3,384 members derived from 793 species. Despite the large

number of β-xylosidase enzymes that have been characterized from this gene

family, there are currently no crystal structures available for enzymes with β-

xylosidase activity from this family. There are two crystal structures for GH 3

28

enzymes, one for a putative N-acetyl-β-D-glucosaminidase (NagZ) from the bacterium Vibrio cholera (unpublished, protein databank accession no. 1y65), and a variety of structures with different substrate analogs and inhibitors for a β-

D-glucosidase from Hordeum vulgare (Barley) (55-57, 127). In the case of the

NagZ structure from V. cholera, a single N-terminal (α/β)8 triosephosphate

isomerase (TIM) barrel domain exists (Figure 1.11A), whereas analysis of the

crystal structure for the H. vulgare β-D-glucosidase revealed the presence of an

N-terminal (α/β)8 TIM barrel and an additional C-terminal (α/β)6 sandwich domain

(Figure 1.11B). Most of the catalytic residues for these enzymes map to the N-

terminal (α/β)8 TIM barrel domain, however for the Barley enzyme, a putative

catalytic acid/base residue was located in the C-terminal domain (127).

The β-hexosaminidases from GH family 20 employ substrate-assisted

catalysis whereby the N-acetamido group on the C2 of the N-acetyl-β-D-

glucosaminyl substrate acts as the catalytic nucleophile (29, 115). If this were the

case for the GH 3 N-acetyl-β-D-glucosaminidases, then these enzymes might not require the additional acidic residue provided by the C-terminal (α/β)6 sandwich

domain (48). These observations support the possibility that the N-terminal (α/β)8

TIM barrel domain (Figure 1.11A) evolved initially to encode N-acetyl-β-D- glucosaminidase activity, a function that is important for cell wall recycling in bacteria (22). Gene duplication and the acquisition of an additional (α/β)6

sandwich domain may then have led to the evolution of β-glucosidase and β-

xylosidase activities on this protein scaffold (Table 1.3). Further support for this

idea comes from the fact that a number of GH 3 genes have been identified that 29

have both the (α/β)6 and (α/β)8 TIM barrel domains, however the regions of these

domains in the polypeptide sequence are switched, yet these proteins retain their

enzymatic activities (53, 83). An additional PA14-like domain has been identified

for GH 3 proteins, which forms an insertion within the N-terminal (α/β)6 sandwich

domain (Table 1.3). The functional significance of the protective antigen 14 kDa

(PA14) -like domain insertion in GH 3 enzymes is currently unknown, however

several lines of evidence support the possibility that the PA14 domain represents

a novel carbohydrate binding module (95, 142).

The active site for the Barley GH 3 β-glucosidase is a small, coin-shaped

cavity that, similar to GH 43 β-xylosidases, possesses only 2 substrate binding

sites. Subsite mapping studies for this enzyme using gluco-oligosaccharides of

varying lengths confirmed the presence of just two glucoside binding sites in the

enzyme (55). The structure of the Barley β-glucosidase is useful for making

generalizations on the structure-activity relationships for GH 3 β-xylosidases.

However, the amino acid residues that confer substrate specificity (β-glucosidase vs. β-xylosidase) for this family of enzymes remain to be elucidated. Thus, determination of the 3D structure for β-xylosidases in this family will yield insight into the molecular determinants of substrate specificity which will prove invaluable for characterizing the function and substrate specificity for members of this large family of enzymes.

30

1.3.4 THE α-GLUCURONIDASES

Xylose residues within arabinoglucuronoxylan may be substituted at O-2 with 4-O-methyl-glucuronic acid (MeGA) (Figure 1.4). The α-glucuronidases cleave MeGA from xylose residues when the MeGA is attached to the terminal, non-reducing end of xylo-oligosaccharides (93, 105). An additional limitation to the release of MeGA by α-glucuronidases is the extent of of the xylose chain in proximity to the MeGA substituent (94). The crystal structures for two bacterial α-glucuronidases (G. stearothermophilus AguA and C. japonicus

GlcA67A) provided insight into the specific requirements of α-glucuronidases (43,

82). These structures revealed a (β/α)8 barrel enzyme possessing a deep active site pocket which explains the requirement for terminally appended MeGA sidechains on the xylan backbone. The biological significance of this deep active site pocket and absolute requirement for MeGA linked to the O-2 of xylose residues at the non-reducing end is that α-glucuronidases require xylanase activity to generate their cognate substrates. This suggests that α-glucuronidases function downstream of xylanases to deconstruct plant cell wall polysaccharides

(82).

1.3.5 THE XYLANOLYTIC ESTERASES

Arabinoglucuronoxylans are typically acetylated at O-2 and O-3 positions of the xylose chain and frequently have ferulic acid or coumaric acid groups esterified to the 5'-OH of arabinofuranosyl groups. The acetyl groups affect the capacity of other xylanolytic enzymes such as xylanases (8, 47) and α-

31

glucuronidases (94) to bind and hydrolyze backbone or side chain linkages.

Ferulic and coumaric acid groups may be covalently linked either to lignin or to other ferulic or coumaric acid groups in xylans, thus these chemical linkages impart significant limitations to deconstructing xylan substrates. Correspondingly, many xylanolytic microbes harbor genes encoding ferulic/coumaric acid esterases and acetyl xylan esterases. Ferulic/coumaric acid esterases belong to the carbohydrate esterase (CE) family 1, whereas acetyl xylan esterase activity has been described for members of CE 1-7, 12 and the recently discovered family 16 (71). With the exception of acetyl xylan esterases in CE 4, all of the ferulic/coumaric acid esterases and acetyl xylan esterases employ a Ser-His-

Asp(Glu) analogous to the mechanism utilized by serine proteases

(68). This mechanism involves two phases, the initial acylation of the nucleophilic serine residue followed by deacylation with water acting as a nucleophile (Figure

1.12). The alternative mechanism identified for members of CE 4 involves divalent cations coordinated by histidine residues and a nucleophilic aspartic acid residue (114).

1.3.6 MODULARITY OF XYLANOLYTIC ENZYMES

Glycoside hydrolases are modular in general, and likewise many xylanolytic enzymes exhibit a modular domain organization and link various catalytic domains or carbohydrate binding domains within the same polypeptide.

Carbohydrate binding modules (CBMs) increase the effective concentration of catalytic modules on the surface of insoluble substrates and

32

may change the structure of the sugar by disrupting hydrogen bonding

interactions (104). While the structure of cellulose is quite homogeneous, as

mentioned above, the structure of xylan can vary greatly with different residues

substituted off of the xylose backbone. This difference in structure imparts

significant limitations to CBMs targeted towards xylan relative to the CBMs that

bind cellulose, which is a homogenous substrate. Similar to the binding of

decorated xylo-oligosaccharides by xylanases, the decorations on xylan will likely

influence the way that CBMs bind to xylan. Indeed, the crystal structure for the

family 15 CBM from the saprophytic soil bacterium, Cellvibrio japonicus revealed

that when bound to xylopentaose, the oligosaccharide adopts a helical

conformation such that the O-2 and O-3 for all but one of the xylose residues face outward into solution (Figure 1.13) (113). It is anticipated that this conformation will allow the CBM to accommodate decorations on the xylan chain, thus permitting the C. japonicus CBM to bind substituted arabinoglucuronoxylans

(113).

1.3.7 COORDINATED FUNCTION OF XYLANOLYTIC ENZYMES

The biocatalytic conversion of xylan to the constituent monosaccharides, xylose, arabinose and glucuronic acid requires the coordinated function of these xylanolytic enzymes. In the model presented in Fig 1.14, xylanases, acetyl xylan esterases, and ferulic acid esterases may function together to produce short, substituted xylo-oligosaccharides with the concomitant release of ferulic and acetic acid byproducts (Figure 1.14A). The substituted xylo-oligosaccharides

33

then become the substrates for arabinofuranosidases and glucuronidases that

liberate arabinose and glucuronic acid, leaving linear, non-branched xylo-

oligosaccharides (Figure 1.14B). Xylosidases then convert the xylo-

oligosaccharides into their constituent xylose sugars (Figure 1.14C). Finally,

fermenting microorganisms, naturally able to ferment pentose sugars, will take up

the xylose and arabinose and shuttle them into the pentose phosphate pathway

for subsequent fermentation (Figure 1.14D). Indeed, as discussed in the previous section, both xylose and arabinose enter the pentose phosphate pathway in P. bryantii B14 and P. ruminicola 23 and result in hexose re-synthesis followed by

the Embden-Myerhof-Parnas pathway.

1.4 MAJOR GOAL FOR THESIS RESEARCH

The ultimate goal for the current project was to provide insight into the

mechanisms by which rumen Prevotella spp. degrade xylan. The genome

sequences for P. bryantii B14 and P. ruminicola 23 were sequenced by The

Institute for Genomic Research (now the J. Craig Venter Institute) and this has

provided us with a unique view into the putative polysaccharide degrading genes

contained within the genomes of these two organisms. Bioinformatic analysis of

the genomes for these bacteria revealed the presence of 98 putative glycoside

hydrolase genes for P. ruminicola 23 and 93 putative glycoside hydrolase genes

for P. bryantii B14. To identify the subset of these GH genes which may be

involved in xylan degradation, we first employed a bioinformatics approach to

scan the genomes for genes with predicted roles in xylan utilization and then

34

tested these predictions biochemically (Chapter 2-3). Through these studies we found that many GH genes may be incorrectly annotated and that assigning substrate specificity to the gene products for GH genes is not trivial. To circumvent this issue, we employed a transcriptional profiling approach (Chapter

4) to identify genes that are specifically induced by P. bryantii B14 during growth

with natural xylan polymers. Results from these studies provide significant new

insight into the molecular machinery that P. bryantii B14 and P. ruminicola 23

employ to degrade xylan. Furthermore, the results have revealed a conserved

mechanism for xylan degradation by both rumen Prevotella spp. and human

colonic Bacteroides spp.

Additional work that was completed in Steven R. Blanke's laboratory

during the first two years of my graduate school career are also presented in

Appendix A and Appendix B.

1.5 REFERENCES

1. Accetto, T., and G. Avgustin. 2007. Studies on Prevotella using a system for the controlled expression of cloned genes in P. bryantii TC1-1. Microbiology 153:2281-2288.

2. Accetto, T., M. Peterka, and G. Avgustin. 2005. Type II restriction modification systems of Prevotella bryantii TC1-1 and Prevotella ruminicola 23 strains and their effect on the efficiency of DNA introduction via electroporation. FEMS Microbiol Lett 247:177-183.

3. Allison, M. J. 1978. Production of branched-chain volatile fatty acids by certain anaerobic bacteria. Appl Environ Microbiol 35:872-877.

4. Avgustin, G., H. J. Flint, and T. R. Whitehead. 1992. Distribution of xylanase genes and enzymes among strains of Prevotella (Bacteroides) ruminicola from the rumen. FEMS Microbiol Lett 78:137-143.

35

5. Avgustin, G., R. J. Wallace, and H. J. Flint. 1997. Phenotypic diversity among ruminal isolates of Prevotella ruminicola: proposal of Prevotella brevis sp. nov., Prevotella bryantii sp. nov., and Prevotella albensis sp. nov. and redefinition of Prevotella ruminicola. International journal of systematic bacteriology 47:284-288.

6. Berrin, J. G., and N. Juge. 2008. Factors affecting xylanase functionality in the degradation of arabinoxylans. Biotechnol Lett 30:1139-1150.

7. Biely, P., Z. Kratky, and M. Vrsanska. 1981. Substrate- of endo-1,4-beta-xylanase of the yeast Cryptococcus albidus. Eur J Biochem 119:559-564.

8. Biely, P., C. R. MacKenzie, J. Puls, and H. Schneider. 1986. of esterases and xylanases in the enzymatic degradation of acetyl xylan. Bio/Technology 4:731-733.

9. Biely, P., M. Vrsanska, M. Tenkanen, and D. Kluepfel. 1997. Endo- beta-1,4-xylanase families: differences in catalytic properties. Journal of Biotechnology 57:151-166.

10. Bladen, H. A., M. P. Bryant, and R. N. Doetsch. 1961. A Study of bacterial species from the rumen which produce ammonia from protein hydrolyzate. Appl Microbiol 9:175-180.

11. Bourgois, T. M., V. Van Craeyveld, S. Van Campenhout, C. M. Courtin, J. A. Delcour, J. Robben, and G. Volckaert. 2007. Recombinant expression and characterization of XynD from Bacillus subtilis subsp. subtilis ATCC 6051: a GH 43 arabinoxylan arabinofuranohydrolase. Appl Microbiol Biotechnol 75:1309-1317.

12. Bray, M. R., and A. J. Clarke. 1992. Action pattern of xylo- oligosaccharide hydrolysis by Schizophyllum commune xylanase A. Eur J Biochem 204:191-196.

13. Brunzelle, J. S., D. B. Jordan, D. R. McCaslin, A. Olczak, and Z. Wawrzak. 2008. Structure of the two-subsite beta-D-xylosidase from Selenomonas ruminantium in complex with 1,3- bis[tris(hydroxymethyl)methylamino]propane. Arch Biochem Biophys 474:157-166.

14. Brux, C., A. Ben-David, D. Shallom-Shezifi, M. Leon, K. Niefind, G. Shoham, Y. Shoham, and D. Schomburg. 2006. The structure of an inverting GH43 beta-xylosidase from Geobacillus stearothermophilus with its substrate reveals the role of the three catalytic residues. J Mol Biol 359:97-109.

36

15. Bryant, M. P., and I. M. Robinson. 1962. Some nutritional characteristics of predominant culturable ruminal bacteria. J Bacteriol 84:605-614.

16. Bryant, M. P., and I. M. Robinson. 1961. Studies on the nitrogen requirements of some ruminal cellulolytic bacteria. Appl Microbiol 9:96- 103.

17. Bryant, M. P., N. Small, C. Bouma, and H. Chu. 1958. Bacteroides ruminicola n. sp. and Succinimonas amylolytica; the new genus and species; species of succinic acid-producing anaerobic bacteria of the bovine rumen. J Bacteriol 76:15-23.

18. Caldwell, D. R., and K. Newman. 1986. Pentose metabolism by Bacteroides ruminicola supsp. brevis strain B14. Current Microbiology 14:149-155.

19. Caldwell, D. R., D. C. White, M. P. Bryant, and R. N. Doetsch. 1965. Specificity of the heme requirement for growth of Bacteroides ruminicola. J Bacteriol 90:1645-1654.

20. Cantarel, B. L., P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, and B. Henrissat. 2008. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-D238

21. Cheng, K. J., R. Hironaka, D. W. Roberts, and J. W. Costerton. 1973. Cytoplasmic glycogen inclusions in cells of anaerobic gram-negative rumen bacteria. Can J Microbiol 19:1501-1506.

22. Cheng, Q., H. Li, K. Merdek, and J. T. Park. 2000. Molecular characterization of the beta-N-acetylglucosaminidase of Escherichia coli and its role in cell wall recycling. J Bacteriol 182:4836-4840.

23. Chow, J. M., and J. B. Russell. 1992. Effect of pH and monensin on glucose transport by Fibrobacter succinogenes, a cellulolytic ruminal bacterium. Appl Environ Microbiol 58:1115-1120.

24. Collins, T., C. Gerday, and G. Feller. 2005. Xylanases, xylanase families and extremophilic xylanases. FEMS Microbiol Rev 29:3-23.

25. Davies, G. J., K. S. Wilson, and B. Henrissat. 1997. Nomenclature for sugar-binding subsites in glycosyl hydrolases. Biochem J 321 ( Pt 2):557- 559.

26. Dawson, K. A., M. J. Allison, and P. A. Hartman. 1980. Isolation and some characteristics of anaerobic oxalate-degrading bacteria from the rumen. Appl Environ Microbiol 40:833-839.

37

27. Dehority, B. A. 1966. Characterization of several bovine rumen bacteria isolated with a xylan medium. J Bacteriol 91:1724-1729.

28. Derewenda, U., L. Swenson, R. Green, Y. Wei, R. Morosoli, F. Shareck, D. Kluepfel, and Z. S. Derewenda. 1994. Crystal structure, at 2.6-A resolution, of the Streptomyces lividans xylanase A, a member of the F family of beta-1,4-D-glycanases. J Biol Chem 269:20811-20814.

29. Drouillard, S., S. Armand, G. J. Davies, C. E. Vorgias, and B. Henrissat. 1997. Serratia marcescens chitobiase is a retaining glycosidase utilizing substrate acetamido group participation. Biochem J 328 ( Pt 3):945-949.

30. Ebringerova, A., Heinze, T. 2000. Xylan and xylan derivatives - biopolymers with valuable properties, 1. Naturally occurring xylans structures, isolation procedures and properties. Macromolecular Rapid Communications 21:542-556.

31. Edwards, J. E., N. R. McEwan, A. J. Travis, and R. J. Wallace. 2004. 16S rDNA library-based anlysis of ruminal bacterial diversity. Antonie van Leeuwenhoek 86:263-281.

32. Elsik, C. G., R. L. Tellam, K. C. Worley, R. A. Gibbs, D. M. Muzny, G. M. Weinstock, D. L. Adelson, E. E. Eichler, L. Elnitski, R. Guigo, D. L. Hamernik, S. M. Kappes, H. A. Lewin, D. J. Lynn, F. W. Nicholas, A. Reymond, M. Rijnkels, L. C. Skow, E. M. Zdobnov, L. Schook, J. Womack, T. Alioto, S. E. Antonarakis, A. Astashyn, C. E. Chapple, H. C. Chen, J. Chrast, F. Camara, O. Ermolaeva, C. N. Henrichsen, W. Hlavina, Y. Kapustin, B. Kiryutin, P. Kitts, F. Kokocinski, M. Landrum, D. Maglott, K. Pruitt, V. Sapojnikov, S. M. Searle, V. Solovyev, A. Souvorov, C. Ucla, C. Wyss, J. M. Anzola, D. Gerlach, E. Elhaik, D. Graur, J. T. Reese, R. C. Edgar, J. C. McEwan, G. M. Payne, J. M. Raison, T. Junier, E. V. Kriventseva, E. Eyras, M. Plass, R. Donthu, D. M. Larkin, J. Reecy, M. Q. Yang, L. Chen, Z. Cheng, C. G. Chitko- McKown, G. E. Liu, L. K. Matukumalli, J. Song, B. Zhu, D. G. Bradley, F. S. Brinkman, L. P. Lau, M. D. Whiteside, A. Walker, T. T. Wheeler, T. Casey, J. B. German, D. G. Lemay, N. J. Maqbool, A. J. Molenaar, S. Seo, P. Stothard, C. L. Baldwin, R. Baxter, C. L. Brinkmeyer-Langford, W. C. Brown, C. P. Childers, T. Connelley, S. A. Ellis, K. Fritz, E. J. Glass, C. T. Herzig, A. Iivanainen, K. K. Lahmers, A. K. Bennett, C. M. Dickens, J. G. Gilbert, D. E. Hagen, H. Salih, J. Aerts, A. R. Caetano, et al. 2009. The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 324:522-528.

33. Faure, D. 2002. The family-3 glycoside hydrolases: from housekeeping functions to host-microbe interactions. Appl Environ Microbiol 68:1485- 1490.

38

34. Fields, M. W., J. B. Russell, and D. B. Wilson. 1997. A mutant of Prevotella ruminicola B14 deficient in β-1,4-endoglucanase and mannanase activities. FEMS Microbiol Lett 154:9-15.

35. Flint, H. J., T. R. Whitehead, J. C. Martin, and A. Gasparic. 1997. Interrupted catalytic domain structures in xylanases from two distantly related strains of Prevotella ruminicola. Biochim Biophys Acta 1337:161- 165.

36. Fujimoto, Z., S. Kaneko, A. Kuno, H. Kobayashi, I. Kusakabe, and H. Mizuno. 2004. Crystal structures of decorated xylooligosaccharides bound to a family 10 xylanase from Streptomyces olivaceoviridis E-86. J Biol Chem 279:9606-9614.

37. Gardner, R. G., J. B. Russell, D. B. Wilson, G. R. Wang, and N. B. Shoemaker. 1996. Use of a modified Bacteroides-Prevotella shuttle vector to transfer a reconstructed beta-1,4-D-endoglucanase gene into Bacteroides uniformis and Prevotella ruminicola B(1)4. Appl Environ Microbiol 62:196-202.

38. Gardner, R. G., J. E. Wells, M. W. Fields, D. B. Wilson, and J. B. Russell. 1997. A Prevotella ruminicola B(1)4 operon encoding extracellular polysaccharide hydrolases. Curr Microbiol 35:274-277.

39. Gardner, R. G., J. E. Wells, J. B. Russell, and D. B. Wilson. 1995. The cellular location of Prevotella ruminicola beta-1,4-D-endoglucanase and its occurrence in other strains of ruminal bacteria. Appl Environ Microbiol 61:3288-3292.

40. Gardner, R. G., J. E. Wells, J. B. Russell, and D. B. Wilson. 1995. The effect of carbohydrates on the expression of the Prevotella ruminicola 1,4- beta-D-endoglucanase. FEMS Microbiol Lett 125:305-310.

41. Gasparic, A., R. Marinsek-Logar, J. Martin, R. J. Wallace, F. V. Nekrep, and H. J. Flint. 1995. Isolation of genes encoding beta-D- xylanase, beta-D-xylosidase and alpha-L-arabinofuranosidase activities from the rumen bacterium Prevotella ruminicola B(1)4. FEMS Microbiol Lett 125:135-141.

42. Gasparic, A., J. Martin, A. S. Daniel, and H. J. Flint. 1995. A xylan hydrolase gene cluster in Prevotella ruminicola B(1)4: sequence relationships, synergistic interactions, and oxygen sensitivity of a novel enzyme with exoxylanase and beta-(1,4)-xylosidase activities. Appl Environ Microbiol 61:2958-2964.

39

43. Golan, G., D. Shallom, A. Teplitsky, G. Zaide, S. Shulami, T. Baasov, V. Stojanoff, A. Thompson, Y. Shoham, and G. Shoham. 2004. Crystal structures of Geobacillus stearothermophilus alpha-glucuronidase complexed with its substrate and products: mechanistic implications. J Biol Chem 279:3014-3024.

44. Gordon, G. L. R. 1986. The potential for manipulation of rumen fungi. Reviews in Rural Science 6:124-128.

45. Gradel, C. M., and B. A. Dehority. 1972. Fermentation of isolated pectin and pectin from intact forages by pure cultures of rumen bacteria. Appl Microbiol 23:332-340.

46. Griswold, K. E., and R. I. Mackie. 1997. Degradation of protein and utilization of the hydrolytic products by a predominant ruminal bacterium, Prevotella ruminicola B(1)4. J Dairy Sci 80:167-175.

47. Grohmann, K., D. J. Mitchell, M. E. Himmel, B. E. Dale, and H. A. Schroeder. 1989. The role of ester groups in resistance of plant cell wall polysaccharides to enzymatic hydrolysis. Applied Biochemistry and Biotechnology 20-21:45-61.

48. Harvey, A. J., M. Hrmova, R. De Gori, J. N. Varghese, and G. B. Fincher. 2000. Comparative modeling of the three-dimensional structures of family 3 glycoside hydrolases. Proteins 41:257-269.

49. Hazlewood, G. P., and R. Edwards. 1981. Proteolytic activities of a rumen bacterium, Bacteroides ruminicola R8/4. J Gen Microbiol 125:11-15.

50. Henrissat, B. 1991. A classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem J 280:309-316.

51. Hespell, R. B., R. Wolf, and R. J. Bothast. 1987. Fermentation of xylans by Butyrivibrio fibrisolvens and other ruminal bacteria. Appl Environ Microbiol 53:2849-2853.

52. Hobson, P. N. 1997. The functioning rumen, p. 5-8. In P. N. a. S. Hobson, C. S. (ed.), The Rumen Microbial Ecosystem, Second ed. Blackie Academic & Professional.

53. Honda, H., T. Saito, S. Iijima, and T. Kobayashi. 1988. Molecular cloning and expression of a β-glucosidase gene from Ruminococus albus in Esherichia coli. Enzyme and Microbial Technology 10:559-562.

54. Howlett, M. R., D. O. Mountfort, K. W. Turner, and A. M. Roberton. 1976. Metabolism and growth yields in Bacteroides ruminicola strain B14. Appl Environ Microbiol 32:274-283.

40

55. Hrmova, M., R. De Gori, B. J. Smith, J. K. Fairweather, H. Driguez, J. N. Varghese, and G. B. Fincher. 2002. Structural basis for broad substrate specificity in higher plant beta-D-glucan glucohydrolases. Plant Cell 14:1033-1052.

56. Hrmova, M., V. A. Streltsov, B. J. Smith, A. Vasella, J. N. Varghese, and G. B. Fincher. 2005. Structural rationale for low-nanomolar binding of transition state mimics to a family GH3 beta-D-glucan glucohydrolase from barley. Biochemistry 44:16529-16539.

57. Hrmova, M., J. N. Varghese, R. De Gori, B. J. Smith, H. Driguez, and G. B. Fincher. 2001. Catalytic mechanisms and reaction intermediates along the hydrolytic pathway of a plant beta-D-glucan glucohydrolase. Structure 9:1005-1016.

58. Hungate, R. E. 1966. The rumen and its microbes. Academic Press, New York.

59. Jeffries, T. W. 1990. Biodegradation of lignin-carbohydrate complexes. Biodegradation 1:163-176.

60. Jordan, D. B., X. L. Li, C. A. Dunlap, T. R. Whitehead, and M. A. Cotta. 2007. Structure-function relationships of a catalytically efficient beta-D- xylosidase. Appl Biochem Biotechnol 141:51-76.

61. Jurgens, M. H. 2001. Animal Feeding and Nutrition, vol. 9th. Kendall Hunt Publishing Company.

62. Kaneko, S., H. Ichinose, Z. Fujimoto, A. Kuno, K. Yura, M. Go, H. Mizuno, I. Kusakabe, and H. Kobayashi. 2004. Structure and function of a family 10 beta-xylanase chimera of Streptomyces olivaceoviridis E-86 FXYN and Cellulomonas fimi Cex. J Biol Chem 279:26619-26626.

63. Kellett, L. E., D. M. Poole, L. M. Ferreira, A. J. Durrant, G. P. Hazlewood, and H. J. Gilbert. 1990. Xylanase B and an arabinofuranosidase from Pseudomonas fluorescens subsp. cellulosa contain identical cellulose-binding domains and are encoded by adjacent genes. Biochem J 272:369-376.

64. Kolenova, K., M. Vrsanska, and P. Biely. 2006. Mode of action of endo- beta-1,4-xylanases of families 10 and 11 on acidic xylooligosaccharides. J Biotechnol 121:338-345.

65. Kormelink, F. J. M., M. J. F. Searle-Van Leeuwen, T. M. Wood, and A. G. J. Voragen. 1991. (1,4)-β-D-arabinoxylan arabinofuranohydrolase−a novel enzyme in the bioconversion of arabinoxylan. Applied Microbiology and Biotechnology 35:231-232.

41

66. Kormelink, F. J. M., M. J. F. Searle-Van Leeuwen, T. M. Wood, and A. G. J. Voragen. 1991. Purification and characterization of an (1-4)-β-D- arabinoxylan arabinofuranohydrolase from Aspergillus awamori. Applied Microbiology and Biotechnology 35:753-758.

67. Krause, D. O., and J. B. Russell. 1996. How many ruminal bacteria are there? J Dairy Sci 79:1467-1475.

68. Kraut, J. 1977. Serine proteases: structure and mechanism of catalysis. Annu Rev Biochem 46:331-358.

69. Laureano-Perez, L., F. Teymouri, H. Alizadeh, and B. E. Dale. 2005. Understanding factors that limit enzymatic hydrolysis of biomass: characterization of pretreated corn stover. Appl Biochem Biotechnol 121- 124:1081-1099.

70. Ley, R. E., D. A. Peterson, and J. I. Gordon. 2006. Ecological and evolutionary forces shaping microbial diversity in the human intestine. Cell 124:837-848.

71. Li, X. L., C. D. Skory, M. A. Cotta, V. Puchart, and P. Biely. 2008. Novel family of carbohydrate esterases, based on identification of the Hypocrea jecorina acetyl esterase gene. Appl Environ Microbiol 74:7482-7489.

72. Lou, J., K. A. Dawson, and H. J. Strobel. 1996. Role of phosphorolytic cleavage in cellobiose and cellodextrin metabolism by the ruminal bacterium Prevotella ruminicola. Appl Environ Microbiol 62:1770-1773.

73. Mackie, R. I. 2002. Mutualistic fermentative digestion in the gastrointestinal tract: diversity and evolution. Integrative and Comparative Biology 42:319-326.

74. Maglione, G., O. Matsushita, J. B. Russell, and D. B. Wilson. 1992. Properties of a genetically reconstructed Prevotella ruminicola endoglucanase. Appl Environ Microbiol 58:3593-3597.

75. Maslen, S. L., F. Goubet, A. Adam, P. Dupree, and E. Stephens. 2007. Structure elucidation of arabinoxylan isomers by normal phase HPLC- MALDI-TOF/TOF-MS/MS. Carbohydr Res 342:724-735.

76. Matsushita, O., J. B. Russell, and D. B. Wilson. 1991. A Bacteroides ruminicola 1,4-beta-D-endoglucanase is encoded in two reading frames. J Bacteriol 173:6919-6926.

77. Matsushita, O., J. B. Russell, and D. B. Wilson. 1990. Cloning and sequencing of a Bacteroides ruminicola B(1)4 endoglucanase gene. J Bacteriol 172:3620-3630.

42

78. Matte, A., C. W. Forsberg, and A. M. Verrinder Gibbins. 1992. Enzymes associated with metabolism of xylose and other pentoses by Prevotella (Bacteroides) ruminicola strains, Selenomonas ruminantium D, and Fibrobacter succinogenes S85. Can J Microbiol 38:370-376.

79. Miller, T. L. 1995. Ecology of methane production and hydrogen sinks in the rumen. In W. V. Engelhardt, S. Leonhardt-Marek, G. Breves, and D. Gieseke (ed.), Ruminant Physiology: Digestion, Metabolism, Growth and Reproduction. Ferdinad Enke Verlag, Stuttgarat.

80. Miller, T. L., and M. J. Wolin. 1973. Formation of hydrogen and formate by Ruminococcus albus. Journal of bacteriology 116:836-846.

81. Miyazaki, K., H. Miyamoto, D. K. Mercer, T. Hirase, J. C. Martin, Y. Kojima, and H. J. Flint. 2003. Involvement of the multidomain regulatory protein XynR in positive control of xylanase gene expression in the ruminal anaerobe Prevotella bryantii B(1)4. J Bacteriol 185:2219-2226.

82. Nurizzo, D., T. Nagy, H. J. Gilbert, and G. J. Davies. 2002. The structural basis for catalysis and specificity of the Pseudomonas cellulosa alpha-glucuronidase, GlcA67A. Structure 10:547-556.

83. Ohmiya, K., M. Takano, and S. Shimizu. 1990. DNA sequence of a beta- glucosidase from Ruminococcus albus. Nucleic Acids Res 18:671.

84. Olsen, G. J., D. J. Lane, S. J. Giovannoni, N. R. Pace, and D. A. Stahl. 1986. Microbial ecology and evolution: a ribosomal RNA approach. Annu Rev Microbiol 40:337-365.

85. Orpin, C. G., and K. N. Joblin. 1997. The rumen anaerobic fungi, p. 140- 195, The Rumen Microbial Ecosystem. Blackie Academic & Professional, New York.

86. Orskov, E. R., and C. Fraser. 1975. The effects of processing of barley- based supplement son rumen pH , rate of digestion, and voluntary intake of dried grass in sheep. British Journal of Nutrition 34:4993.

87. Painter, T. J. 1983. p. 196-285. In G. O. Aspinall (ed.), The Polysaccharides. Academic, San Diego.

88. Pell, G., L. Szabo, S. J. Charnock, H. Xie, T. M. Gloster, G. J. Davies, and H. J. Gilbert. 2004. Structural and biochemical analysis of Cellvibrio japonicus xylanase 10C: how variation in substrate-binding cleft influences the catalytic profile of family GH-10 xylanases. J Biol Chem 279:11777- 11788.

43

89. Pell, G., E. J. Taylor, T. M. Gloster, J. P. Turkenburg, C. M. Fontes, L. M. Ferreira, T. Nagy, S. J. Clark, G. J. Davies, and H. J. Gilbert. 2004. The mechanisms by which family 10 glycoside hydrolases bind decorated substrates. J Biol Chem 279:9597-9605.

90. Peterka, M., and G. Avgustin. 2001. Ribosomal RNA genes from Prevotella bryantii: organization and heterogeneity. Folia Microbiol (Praha) 46:67-70.

91. Pettersen, E. F., T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, and T. E. Ferrin. 2004. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 25:1605-1612.

92. Pitson, S. M., A. G. Voragen, and G. Beldman. 1996. Stereochemical course of hydrolysis catalyzed by arabinofuranosyl hydrolases. FEBS Lett 398:7-11.

93. Puls, J., O. Schmidt, and C. Granzow. 1987. α-Glucuronidase in two microbial xylanolytic systems. Enzyme and Microbial Technology 9:83-88.

94. Puls, J., and J. Schuseil. 1993. Chemistry of hemicellulose: relationship between hemicellulose structure and enzyme required for hydrolysis., p. 1- 27. In M. P. Coughlan and G. P. Hazlewood (ed.), Hemicelluloses and Hemicellulases. Portland Press, London.

95. Rigden, D. J., L. V. Mello, and M. Y. Galperin. 2004. The PA14 domain, a conserved all-beta domain in bacterial toxins, enzymes, adhesins and signaling molecules. Trends Biochem Sci 29:335-339.

96. Russell, J. B. 1985. Fermentation of cellodextrins by cellulolytic and noncellulolytic rumen bacteria. Appl Environ Microbiol 49:572-576.

97. Russell, J. B. 1983. Fermentation of peptides by Bacteroides ruminicola B(1)4. Appl Environ Microbiol 45:1566-1574.

98. Russell, J. B. 1991. Intracellular pH of acid-tolerant ruminal bacteria. Appl Environ Microbiol 57:3383-3384.

99. Russell, J. B., and D. B. Dombrowski. 1980. Effect of pH on the efficiency of growth by pure cultures of rumen bacteria in continuous culture. Appl Environ Microbiol 39:604-610.

100. Russell, J. B., and D. B. Wilson. 1996. Why are ruminal cellulolytic bacteria unable to digest cellulose at low pH? J Dairy Sci 79:1503-1509.

44

101. Scheifinger, C. C., and M. J. Wolin. 1973. Propionate formation from cellulose and soluble sugars by combined cultures of Bacteroides succinogenes and Selenomonas ruminantium. Appl Microbiol 26:789-795.

102. Shah, H. N., and D. M. Collins. 1990. Prevotella, a new genus to include Bacteroides melaninogenicus and related species formerly classified in the genus Bacteroides. Int J Syst Bacteriol 40:205-208.

103. Shoemaker, N. B., K. L. Anderson, S. L. Smithson, G. R. Wang, and A. A. Salyers. 1991. Conjugal transfer of a shuttle vector from the human colonic anaerobe Bacteroides uniformis to the ruminal anaerobe Prevotella (Bacteroides) ruminicola B(1)4. Appl Environ Microbiol 57:2114-2120.

104. Shoseyov, O., Z. Shani, and I. Levy. 2006. Carbohydrate binding modules: biochemical properties and novel applications. Microbiol Mol Biol Rev 70:283-295.

105. Siika-aho, M., M. Tenkanen, J. Buchert, J. Puls, and L. Viikari. 1994. α-glucuronidase from Trichoderma reesei RUT C-30. Enzyme and Microbial Technology 16:813-819.

106. Sorensen, H. R., C. T. Jorgensen, C. H. Hansen, C. I. Jorgensen, S. Pedersen, and A. S. Meyer. 2006. A novel GH43 alpha-L- arabinofuranosidase from Humicola insolens: mode of action and synergy with GH51 alpha-L-arabinofuranosidases on wheat arabinoxylan. Appl Microbiol Biotechnol 73:850-861.

107. Stevenson, D. M., and P. J. Weimer. 2007. Dominance of Prevotella and low abundance of classical ruminal bacterial species in the bovine rumen revealed by relative quantification real-time PCR. Appl Microbiol Biotechnol 75:165-174.

108. Stewart, C. S., Flint, H. J., and Bryant, M. P. 1997. The Rumen Bacteria, p. 10-72. In P. N. Hobson, and Stewart, C. S. (ed.), The Rumen Microbial Ecosystem, Second ed. Elsevier, New York.

109. Strobel, H. J. 1993. Pentose utilization and transport by the ruminal bacterium Prevotella ruminicola. Arch Microbiol 159:465-471.

110. Strobel, H. J. 1992. Vitamin B12-dependent propionate production by the ruminal bacterium Prevotella ruminicola 23. Appl Environ Microbiol 58:2331-2333.

111. Sun, R. C., X. F. Sun, and S. H. Zhang. 2001. Quantitative determination of hydroxycinnamic acids in wheat, rice, rye, and barley straws, maize stems, oil palm frond fiber, and fast-growing poplar wood. J Agric Food Chem 49:5122-5129.

45

112. Sunna, A., and G. Antranikian. 1997. Xylanolytic enzymes from fungi and bacteria. Crit Rev Biotechnol 17:39-67.

113. Szabo, L., S. Jamal, H. Xie, S. J. Charnock, D. N. Bolam, H. J. Gilbert, and G. J. Davies. 2001. Structure of a family 15 carbohydrate-binding module in complex with xylopentaose. Evidence that xylan binds in an approximate 3-fold helical conformation. J Biol Chem 276:49061-49065.

114. Taylor, E. J., T. M. Gloster, J. P. Turkenburg, F. Vincent, A. M. Brzozowski, C. Dupont, F. Shareck, M. S. Centeno, J. A. Prates, V. Puchart, L. M. Ferreira, C. M. Fontes, P. Biely, and G. J. Davies. 2006. Structure and activity of two metal ion-dependent acetylxylan esterases involved in plant cell wall degradation reveals a close similarity to peptidoglycan deacetylases. J Biol Chem 281:10968-10975.

115. Tews, I., A. Perrakis, A. Oppenheim, Z. Dauter, K. S. Wilson, and C. E. Vorgias. 1996. Bacterial chitobiase structure provides insight into catalytic mechanism and the basis of Tay-Sachs disease. Nat Struct Biol 3:638- 648.

116. Thurston, B., K. A. Dawson, and H. J. Strobel. 1993. Cellobiose versus glucose utilization by the ruminal bacterium Ruminococcus albus. Appl Environ Microbiol 59:2631-2637.

117. Timell, T. E. 1964. Wood Hemicelluloses: Part I. Advances in Carbohydrate Chemistry 19:247-302.

118. Turner, K. W., and A. M. Roberton. 1979. Xylose, arabinose, and rhamnose fermentation by Bacteroides ruminicola. Appl Environ Microbiol 38:7-12.

119. Uden, P., T. R. Rounsaville, G. R. Wiggans, and P. J. Van Soest. 1982. The measurement of liquid and solid digesta retention in ruminants, equines and rabbits given timothy (Phleum pratense) hay. Br J Nutr 48:329-339.

120. van den Broek, L. A., R. M. Lloyd, G. Beldman, J. C. Verdoes, B. V. McCleary, and A. G. Voragen. 2005. Cloning and characterization of arabinoxylan arabinofuranohydrolase-D3 (AXHd3) from Bifidobacterium adolescentis DSM20083. Appl Microbiol Biotechnol 67:641-647.

121. Van Doorslaer, E., H. Kersters-Helderson, and C. K. De Bruyne. 1985. Hydrolysis of β-D-xylo-oligosaccharides by β-D-xylosidase from Bacillus pumilus. Carbohydr Res 140:342-346.

46

122. Van Laere, K. M. J., G. Beldman, and A. G. Voragen. 1997. A new arabinofuranohydrolase-D3 (AXHd3) from Bifidobacterium adolescentis able to remove arabinosyl residues from double-substituted xylose units in arabinoxylan. Applied Microbiology and Biotechnology 47:231-235.

123. Van Laere, K. M. J., C. H. L. Voragen, T. Kroef, L. A. M. Van den Broek, G. Beldman, and A. G. J. Voragen. 1999. Purification and mode of action of two different arabinoxylan arabinofuranohydrolases from Bifidobacterium adolescentis DSM 20083. Applied Microbiology and Biotechnology 51:606-613.

124. Vandermarliere, E., T. M. Bourgois, M. D. Winn, S. van Campenhout, G. Volckaert, J. A. Delcour, S. V. Strelkov, A. Rabijns, and C. M. Courtin. 2009. Structural analysis of a glycoside hydrolase family 43 arabinoxylan arabinofuranohydrolase in complex with xylotetraose reveals a different binding mechanism compared with other members of the same family. Biochem J 418:39-47.

125. Vardakou, M., C. Dumon, J. W. Murray, P. Christakopoulos, D. P. Weiner, N. Juge, R. J. Lewis, H. J. Gilbert, and J. E. Flint. 2008. Understanding the structural basis for substrate and inhibitor recognition in eukaryotic GH11 xylanases. J Mol Biol 375:1293-1305.

126. Vardakou, M., J. Flint, P. Christakopoulos, R. J. Lewis, H. J. Gilbert, and J. W. Murray. 2005. A family 10 Thermoascus aurantiacus xylanase utilizes arabinose decorations of xylan as significant substrate specificity determinants. J Mol Biol 352:1060-1067.

127. Varghese, J. N., M. Hrmova, and G. B. Fincher. 1999. Three- dimensional structure of a barley beta-D-glucan exohydrolase, a family 3 glycosyl hydrolase. Structure 7:179-190.

128. Vocadlo, D. J., G. J. Davies, R. Laine, and S. G. Withers. 2001. Catalysis by hen egg-white lysozyme proceeds via a covalent intermediate. Nature 412:835-838.

129. Vogel, J. 2008. Unique aspects of the grass cell wall. Curr Opin Plant Biol 11:301-307.

130. Vrsanska, M., I. V. Gorbacheva, Z. Kratky, and P. Biely. 1982. Reaction pathways of substrate degradation by an acidic endo-1,4-beta-xylanase of Aspergillus niger. Biochim Biophys Acta 704:114-122.

131. Wagschal, K., C. Heng, C. C. Lee, and D. W. Wong. 2009. Biochemical characterization of a novel dual-function arabinofuranosidase/xylosidase isolated from a compost starter mixture. Appl Microbiol Biotechnol 81:855- 863.

47

132. Wallace, R. J., R. Onodera, and M. A. Cotta. 1997. Metabolism of nitrogen containing compounds. In P. N. Hobson and C. S. Stewart (ed.), The Rumen Microbial Ecosystem, Second ed. Blackie Academic & Professional, New York.

133. Weaver, J., T. R. Whitehead, M. A. Cotta, P. C. Valentine, and A. A. Salyers. 1992. Genetic analysis of a locus on the Bacteroides ovatus chromosome which contains xylan utilization genes. Appl Environ Microbiol 58:2764-2770.

134. Whitehead, T. R. 1993. Analyses of the gene and amino acid sequence of the Prevotella (Bacteroides) ruminicola 23 xylanase reveals unexpected homology with endoglucanases from other genera of bacteria. Curr Microbiol 27:27-33.

135. Whitehead, T. R., and M. A. Cotta. 2001. Identification of a broad- specificity xylosidase/arabinosidase important for xylooligosaccharide fermentation by the ruminal anaerobe Selenomonas ruminantium GA192. Curr Microbiol 43:293-298.

136. Whitehead, T. R., and R. B. Hespell. 1989. Cloning and expression in Escherichia coli of a xylanase gene from Bacteroides ruminicola 23. Appl Environ Microbiol 55:893-896.

137. Williams, A. G., and G. S. Coleman. 1992. The Rumen Protozoa. Springer-Verlag, New York.

138. Williams, A. G., and S. E. Withers. 1985. Formation of polysaccharide depolymerase and glycoside hydrolase enzymes by Bacteroides ruminicola supsp ruminicola in batch and continuous culture. Current Microbiology 12:79-84.

139. Wolfenden, R., X. Lu, and G. Young. 1998. Spontaneous hydrolysis of glycosides. Journal of the American Chemical Society 120:6814-6815.

140. Wong, K. K., L. U. Tan, and J. N. Saddler. 1988. Multiplicity of beta-1,4- xylanase in microorganisms: functions and applications. Microbiol Rev 52:305-317.

141. Wulff-Strobel, C. R., and D. B. Wilson. 1995. Cloning, sequencing, and characterization of a membrane-associated Prevotella ruminicola B(1)4 beta-glucosidase with cellodextrinase and cyanoglycosidase activities. J Bacteriol 177:5884-5890.

142. Zupancic, M. L., M. Frieman, D. Smith, R. A. Alvarez, R. D. Cummings, and B. P. Cormack. 2008. Glycan microarray analysis of Candida glabrata adhesin ligand specificity. Mol Microbiol 68:547-559.

48

1.6 TABLES AND FIGURES

TABLE 1.1 Phenotypic and genomic characteristics of ruminal Prevotella spp.a,b Characteristic P. bryantii P. brevis P. albensis P. ruminicola DNA G+C content 39.2-42.1 45.3-50.6 39.4-43.8 46.5-50.6 (mol %) CMCase + - w + Xylanase + - - + Amylase + + + + DNase + + - + Fermentation of : D-Xylose + - + + D-Mannose + + - + D-Glucose + + + + D-Fructose + + + + D-Raffinose + + - + Galactose + + + + L-Arabinose + + + + Esculin + + + + Cellobiose + + + + Lactose + + + + Melibiose + + - + Inulin + - - + Nitrogen from protein + + + + Heme requirement - n.d. n.d. + a Data are from modified from Avguŝtin et al. (5). b +, positive reaction; -, negative reaction; w, weak reaction; n.d., not determined.

49

TABLE 1.2 Summary of carbohydrate active enzymes cloned from Prevotella bryantii B14. a,b Enzymatic activity Gene CAZy family Notes (reference) xynA GH 10 Endo-xylanase (42) Part of xylan utilization cluster. xynB GH 43 β-Xylosidase/α-L- Part of a xylan utilization cluster. arabinofuranosidase (42) xynC GH 10 Endo-xylanase (42) Possesses an insertional domain of unknown function. xynE CE 1 Not determined (81) Part of a xylan utilization cluster. xynF GH 67 Not determined (81) Part of a xylan utilization cluster. cdxA GH 3 1,4-β-D-glucan glucohydrolase Cleaves cellodextrins and (141) cyanogenic glucosides. orf4/5 GH 26/GH 5 CMCase (76) Extracellular, cell associated. Two proteins are encoded in two reading frames. Predicted to encode a orf6 GH 26 Not determined (38) mannanase. a Family designations were made using the Carbohydrate Active Enzymes (CAZy) database

(http://www.cazy.org/index.html). b GH, glycoside hydrolase; CE, carbohydrate esterase.

50

Table 1.3 Domain architecture for GH family 3 genes. Domain organizationa Pfam Hitsb

1221

1636

76

228 a Functional domains were assigned utilizing the Pfam online server (http://www.sanger.ac.uk/Software/Pfam/) (20). Domain hits were included if the expect value (E value) was smaller than 1 x 10-8. b The number of domains for each domain organization was estimated from values provided on the Pfam online server.

51

Figure 1.1 Model for utilization of cello-oligosaccharides and cellobiose in P. bryantii B14. The question marks indicate that the transporters for cellobiose and cellodextrins is unknown. This model is adapted from a similar diagram published by Lou et al. (72). Abbreviations are as follows: CB, cellobiose; CD, cellodextrins; CBPase, cellobiose phosphorylase; HK, hexokinase; PGM, phosphoglucomutase.

52

Figure 1.2 Xylanolytic gene cluster identified by Flint and colleagues in P. bryantii B14 (41, 81).

53

Figure 1.3 Compositional analysis of the Alamo cultivar of Switchgrass. As indicated, glucans are predominantly composed of cellulose. Hemicellulosic components include galactan, mannan, xylan, arabinan, and uronic acids, although xylan represents the most abundant hemicellulosic polymer (129). These data were obtained from the US DOE Biomass Feedstock Composition and Property Database (http://www.afdc.energy.gov/biomass/progs/search1.cgi).

54

Figure 1.4 General structure showing the various linkages found in a variety of xylans isolate from plant cell walls. As described in the text, xylans isolated from different sources may not possess all of the linkages shown. Ferulic acid may form a di-ferulic acid bridge with ferulic acid residues attached to other arabinoxylan chains. Xylose residues may be di- substituted or mono-substituted with arabinose at the O-2 and O-3 positions.

55

Figure 1.5 General glycosidase mechanisms for (A) retaining glycosidases and (B) inverting glycosidases. In (A), a deprotonated carboxylic acid nucleophile attacks the anomeric carbon, displacing the attached sugar residue (indicated as R) and forming a covalent enzyme-sugar adduct. Subsequently an activated water molecule displaces the enzymic carboxylic acid resulting in net retention of stereochemical configuration at the anomeric carbon. In (B), an activated water molecule displaces the attached sugar residue resulting in net inversion of stereochemical configuration. In both of these mechanisms, the glycoside leaving group is assisted through protonation by a catalytic acid residue.

56

Figure 1.6 Diagrammatic representation of the sugar binding sites in glycosidases based on the scheme proposed by Davies et al. (25).

57

Figure 1.7 Differences in the products of hydrolysis between GH 10 and GH 11 endo- xylanases when incubated with substituted xylans (9, 75). (A) For GH family 10 enzymes, substitutions on the xylan chain are accommodated at the +1 site, thus these enzymes can release xylo-oligosaccharides in which the terminal non-reducing xylose residue is substituted. (B) For GH family 11 enzymes, substitutions on the xylan chain are not accommodated at the +1 site, thus these enzymes produce xylo-oligosaccharides with substitutions at the penultimate xylose residue.

58

Figure 1.8 Structural surface representation of the (A) Neocallimastix patriciarum GH family 11 xylanase in complex with ferulic acid arabinoxylotrisaccharide (PDB accession no. 2VGD) (125) and the (B) Cellvibrio mixtus GH family 10 xylanase in complex with aldotetraouronic acid (PDB accession no. 1UQZ) (89). For (A), the active site cleft of the GH 11 enzyme excludes the possibility of a substitution at the +1 subsite of the xylose chain, whereas substitutions may be accommodated at the +2 subsite. For (B), the open topology of the active site of the GH 10 enzyme permits the accommodation of a substitution at the +1 subsite. All structural representations in this and subsequent figures were generated with the UCSF Chimera software package (91).

59

Figure 1.9 Structural representations of the (A) Bacillus subtilis GH family 43 arabinoxylan arabinofuranohydrolase (AXH-m2,3) in complex with xylotetraose (PDB accession no. 3C7G) (124) and the (B) Selenomonas ruminantium GH family 43 β- xylosidase (SXA) in complex with 1,3-bis[tris(hydroxymethyl)methylamino]propane (PDB accession no. 3C2U) (13). Both enzymes possess two domains, an N-terminal β- propeller domain and a C-terminal mainly β-sheet domain, although the C-terminal domain for SXA is much larger and projects a loop that contacts the active site for the enzyme.

60

Figure 1.10 Schematic representation of the GH family 43 β- xylosidase (SXA) from Selenomonas ruminantium indicating the two xylose binding sites in the active site and the projection of extended xylose chains out into solution.

61

Figure 1.11 Structural representations of the (A) Vibrio cholera putative GH family 3 N- acetyl-β-glucosaminidase in complex with N-acetyl-β-glucosaminidase (PDB accession no. 1Y65) and the (B) Hordeum vulgare GH family 3 β-xylosidase in complex with glucose (PDB accession no. 1EX1) (127). As described in the text, acquisition of the second (α/β)6 sandwich domain may have led to the evolution of β-glucosidase activity within this family of proteins.

62

Figure 1.12 General mechanism for esterases employing the Ser-His-Asp(Glu) catalytic triad. Analogous to serine proteases, the first set of reactions leads to acylation of the enzyme followed by de-acylation of the enzyme involving attack of the ester linkage by an activated water molecule.

63

Figure 1.13 Structural surface rendering of the Cellvibrio japonicus family 15 carbohydrate binding module (CBM) in complex with xylopentaose (PDB accession no. 1GNY) (113). The xylopentaose sugar adopts a helical conformation wherein most of the O-2 and O-3 hydroxyl groups point out into solution suggesting that this CBM could accommodate a highly decorated xylan chain. The only exception is the fourth xylose residue which makes hydrogen bond contacts from both the O-2 and O-3 hydroxyl groups to the protein.

64

Figure 1.14 Schematic outline depicting the functional coordination of xylanolytic enzymes in the deconstruction and utilization of xylan. (A) Xylanases, acetyl xylan esterases, and ferulic acid esterases function together to produce short, substituted xylo-oligosaccharides with the concomitant release of ferulic and acetic acid byproducts. (B) Arabinofuranosidases and glucuronidases then liberate arabinose and glucuronic acid from these substituted xylo-oligosaccharides. (C) Xylosidases convert the xylo-oligosaccharides into their constituent xylose sugars. (D) Fermenting microorganisms take up the xylose and arabinose and shuttle them into the pentose phosphate pathway for subsequent fermentation.

65

CHAPTER 2

BIOCHEMICAL ANALYSIS OF A β-D-XYLOSIDASE AND A BIFUNCTIONAL XYLANASE-ESTERASE FROM A XYLANOLYTIC GENE CLUSTER IN PREVOTELLA RUMINICOLA 23

Published in the Journal of Bacteriology 2009, Vol. 191(10): 3328-3338

2.1 ABSTRACT

Prevotella ruminicola 23 is an obligate anaerobic bacterium in the

Bacteroidetes phylum that contributes to hemicellulose utilization within the bovine rumen. To gain insight into the cellular machinery that this organism elaborates to degrade the hemicellulosic polymer, xylan, we identified and cloned a gene predicted to encode a bi-functional xylanase-ferulic acid esterase

(xyn10D-fae1A) and expressed the recombinant protein in E. coli. Biochemical analysis of purified Xyn10D-Fae1A revealed that this protein possesses both endo-β-1,4-xylanase and ferulic acid esterase activities. A putative glycoside hydrolase (GH) family 3 β-D-glucosidase, with a novel PA14-like insertion sequence, was identified two genes downstream of xyn10D-fae1A. Biochemical analyses of the purified recombinant protein revealed that the putative β-D- glucosidase rather has activity towards pNP-β-D-xylopyranoside, pNP-α-L- arabinofuranoside, and xylo-oligosaccharides, thus the gene was designated, xyl3A. When incubated in combination with Xyn10D-Fae1A, Xyl3A improved the release of xylose monomers from a hemicellulosic xylan substrate, suggesting that these two enzymes function synergistically to depolymerize xylan. Directed mutagenesis studies of Xyn10D-Fae1A mapped the catalytic sites for the two

66

enzymatic functionalities to distinct regions within the polypeptide sequence.

When a mutation was introduced into the putative catalytic site for the xylanase domain (Glu280Ser), the ferulic acid esterase activity increased 3-fold which suggests that the two catalytic domains for Xyn10D-Fae1A are functionally coupled. Directed mutagenesis of conserved residues for Xyl3A resulted in attenuation of activity, which supports the assignment of Xyl3A as a GH family 3

β-D-xylosidase.

2.2 INTRODUCTION

β-1,4-linked xylopyranose is the principal component of plant cell wall hemicellulose, which represents the second largest reservoir of fixed carbon in the biosphere (11, 30, 34, 38, 42, 72). The catabolic breakdown of hemicellulose thus represents a critical step in the recycling of carbon in nature and has been targeted as a subject of intense research with respect to renewable energy resources. Two enzymes of principal importance for recycling hemicellulosic material are endo-1,4-β-xylanases (EC 3.2.1.8), which cleave the xylan backbone and β-D-xylosidases (EC 3.2.1.37), which cleave xylose monomers from the non-reducing end of xylo-oligosaccharides (17).

Prevotella ruminicola 23 is an important member of the anaerobic rumen microbiota (60) that contributes to the utilization of non-cellulosic polysaccharides such as starch and xylan (15, 25, 35, 55). Despite its documented importance in rumen physiology, relatively little is known about the cellular machinery that P. ruminicola 23 employs to harvest energy from hemicellulosic substrates. Several

67

studies have explored the carbohydrate active enzymes from the related taxon,

Prevotella bryantii B14 (previously classified as Prevotella ruminicola B14) (25, 27,

28, 50, 52, 66, 73), however less is known about the xylanolytic system that P. ruminicola 23 elaborates. These two species share 88.9% nucleotide identity in their 16S rRNA genes (1473 nucleotides aligned), however, there is evidence that these two species exhibit differences in their xylanolytic systems (2). Two xylanases have previously been characterized from P. ruminicola 23; a 66 kDa xylanase with significant sequence identity to glycoside hydrolase (GH) family 5 endoglucanases (69, 70), and an 80 kDa xylanase related to GH family 10 endoxylanases which possesses a 260 amino acid insertion within the putative catalytic domain (25). Recently, the draft genome sequence for Prevotella ruminicola 23 has been obtained (http://www.tigr.org/tdb/rumenomics/) and a

large number of glycoside hydrolases have been identified. Consistent with the

predicted role of P. ruminicola 23 in hemicellulose degradation, many of these

genes appear to encode enzymes important for hemicellulose depolymerization

and utilization.

In this study, we report the biochemical properties for two chromosomally

associated glycoside hydrolases with unique domain architectures from

Prevotella ruminicola 23. These two enzymes function synergistically to release

monomeric sugars from a complex hemicellulosic substrate, xylan. This study,

therefore, provides important insight into the molecular strategies for xylan

depolymerization by Prevotella ruminicola 23.

68

2.3 EXPERIMENTAL PROCEDURES

Materials. Prevotella ruminicola 23 (ATCC 19189) was obtained from our culture collection preserved in liquid nitrogen vapor and housed at the

Department of Animal Sciences, University of Illinois at Urbana-Champaign. E. coli JM109 and BL21-CodonPlus (DE3) RIL competent cells were acquired from Stratagene (La Jolla, CA). The pGEM-T TA-cloning vector was purchased from Promega (Madison, WI). The pET-28a expression vector was obtained from

Novagen (San Diego, CA). NdeI and XhoI restriction were purchased from New England Biolabs (Ipswich, MA). ExTaq high fidelity DNA polymerase was obtained from TaKaRa Bio (Madison, WI). The UltraClean

DNA isolation kit was obtained from MO BIO Laboratories, Inc. (Carlsbad, CA).

Xylo-oligosaccharides (X2-X6) were obtained from Megazyme (Bray, Ireland).

Methyl ferulate was obtained from Apin Chemicals Ltd. (Abingdon, UK). Gel filtration standards were obtained from Bio-Rad (Hercules, CA). All other reagents were of the highest possible purity and were purchased from Sigma-

Aldrich (St. Louis, MO).

Gene sequencing, cloning, expression, and purification of xyn10D- fae1A and xyl3A. The genome of Prevotella ruminicola 23 was sequenced by the North American Consortium for Fibrolytic Rumen Bacteria in collaboration with The Institute for Genomic Research (TIGR). Auto-annotation of the P. ruminicola genome identified ORF02827 (xyn10D-fae1A) as a gene encoding a bi-functional xylanase-ferulic acid esterase and ORF02829 (xyl3A) as a gene

69

encoding a putative β-glucosidase. The gene sequences for xyn10D-fae1A and

xyl3A have been deposited within the NCBI GenBank database with accession

numbers FJ713437.1 and FJ713438.1, respectively.

Prevotella ruminicola 23 was maintained and grown in a standard anaerobically prepared medium as described previously (32). DNA was isolated

from mid-log-phase cultures and was purified using the UltraClean DNA

isolation kit. The DNA sequences corresponding to the entire ORFs of xyn10D-

fae1A and xyl3A were amplified from P. ruminicola 23 genomic DNA by PCR

using the primers ORF02827-F (5’-

CATATGAAGAAACTATTAGTAGCGTTATCG-3’) and ORF02827-R (5’-

CTCGAGTTACTTAAAGAGACTCTGAGCCATCTTTTC-3’) for xyn10D-fae1A and

ORF02829-F (5’-CATATGAAATATCAACTATTCTTATCATTGGC-3’) and

ORF02829-R (5’-CTCGAGCTATAAAGTGATGGTCAGTTTTTTCAGATC-3’) for xyl3A. The primer sets were engineered to incorporate 5’-NdeI and 3’-XhoI restriction sites for subsequent directional cloning. The resulting amplicons were

then cloned into the pGEM-T vector via TA-cloning and subcloned in-frame with the hexahistidine (His6) encoding sequence of a modified pET-28a expression

vector by replacing the NdeI−XhoI polylinker. Thus, recombinant expression of

the gene products results in N-terminal polyhistidine fusion proteins. The pET-

28a vector was modified to substitute the gene for kanamycin resistance with that

for ampicillin resistance (9). The integrity of the cloned xyn10D-fae1A and xyl3A

genes was confirmed by DNA sequencing (W. M. Keck Center for Comparative

and Functional Genomics at the University of Illinois at Urbana-Champaign).

70

The resulting plasmid constructs, pET28-xyn10D-fae1A and pET28-xyl3A, were introduced into E. coli BL-21 CodonPlus (DE3) RIL competent cells by heat shock, and grown overnight in lysogeny broth (4, 5) (LB) (10 mL) supplemented with ampicillin (100 µg/mL) and chloramphenicol (50 µg/mL) at 37 oC and with aeration. After 12 h, the starter cultures were diluted into fresh LB (1 L) supplemented with ampicillin and chloramphenicol and cultured at 37 oC with aeration until the optical density at 600 nm reached 0.3. The culture was then incubated at 16 oC with aeration and protein expression was induced by the addition of isopropyl β-D-thiogalactopyranoside (IPTG) at a final concentration of

0.1 mM. After 16 h, the cells were harvested by centrifugation (4000 x g, 15 min,

4 oC), resuspended in 35 mL lysis buffer (50 mM sodium phosphate, 300 mM

NaCl, pH 7.0), and the cells were ruptured by two passages through a French pressure cell (American Instrument Company Inc., Silver Spring, MD). The cell lysate was then clarified by centrifugation at 30,000 x g for 30 min at 4 oC and purified utilizing TALON® Polyhistidine-Tag Purification Resin from Clontech

(Mountain View, CA) as described by the manufacturer. Aliquots of eluted fractions were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) using the method described by Laemmli (43) and protein bands were visualized by staining with Coomassie brilliant blue G-250.

Elution fractions were pooled, dialyzed against storage buffer (50 mM Tris-HCl,

200 mM NaCl, 1 mM dithiothreitol (DTT), pH 7.0) and assays were performed immediately.

71

The protein concentrations were determined by a kit from Bio-Rad

(Hercules, CA) based on the Bradford method (7) with bovine serum albumin

(BSA) as the standard.

Size exclusion chromatography. Size exclusion chromatography was

conducted using an AKTA Purifier 900 fast protein liquid chromatography (FPLC)

system equipped with a Superdex 200 10/300 GL size exclusion column and a

UV detector, all obtained from GE Healthcare (Piscataway, NJ). The proteins,

Xyn10D-Fae1A (1 mg/ml; 100 µl), Xyl3A (1 mg/ml; 100 µl), or a gel filtration standard mixture was injected onto the column pre-equilibrated with a Tris buffer

(50 mM Tris·HCl, 150 mM NaCl; pH 8.0) liquid phase at a flow rate of 0.5 ml/min.

Standard curves were generated by plotting the logarithm of the molecular

weights of the gel filtration standards versus retention times. Experimental

retention times for three independent experiments were used to calculate the

apparent molecular weights of Xyl3A and Xyn10D-Fae1A from the standard

curve.

Evaluation of xylanase and esterase activities on agar plates.

Enzyme-catalyzed hydrolysis of plant cell wall polysaccharides was evaluated by

spotting purified Xyn10D-Fae1A (5 µg) onto agar plates (0.8%) containing oat

spelt xylan (0.1%), carboxymethyl cellulose (0.1%), or locust bean gum (0.1%).

The plates were then incubated at 37 °C for 16 h and were stained by incubating

with congo red (0.1%) for 5 minutes followed by destaining with NaCl (1 M) (63).

72

Esterase activity was assessed by adapting the method described by Lanz and

Williams to an agar plate (45). Briefly, purified Xyn10D-Fae1A (14 µg, 28 µg, or

56 µg) was spotted onto an agar plate (0.8%) containing 1-napthyl butyrate (5

mM) and Tween 80 (2%). The plate was then incubated at 37 °C for 16 h and

was stained with Fast Garnet GBC sulfate (5 mg/mL in 10% SDS) for 10 minutes

followed by washing the plate with Tween 80 (2%).

Hydrolysis of xylo-oligosaccharides and oat spelt xylan. The capacity

of P. ruminicola Xyn10D-Fae1A and Xyl3A to hydrolyze xylo-oligosaccharides

was assessed by resolving and detecting the hydrolysis products utilizing thin

layer chromatography (TLC). Initial biochemical studies had revealed that the

optimum pH for activity for Xyn10D-Fae1A and Xyl3D were 5.0 and 6.0,

respectively (data not shown). Thus, for measuring the hydrolysis of xylo-

oligosaccharides, reaction mixtures (10 µl) were prepared in phosphate buffer

(50 mM sodium phosphate, 100 mM NaCl, pH 6.0 (Xyn10D-Fae1A) or pH 5.0

(Xyl3A)) with X2-X6 (100-150 µg) and reactions were initiated by the addition of

Xyn10D-Fae1A (10 µM, final concentration) or Xyl3A (12 µM, final concentration).

Following incubation at 37 °C for 7.5 h, two volumes of ethanol were added, and

the fluid was evaporated using a SC110A Speed Vac concentrator (Savant;

Ramsey, MN). The dry matter was then dissolved in 2.5 µl of H2O and spotted onto a DC-Plastikfolien Silica Gel 60 F254 TLC plate from Merck (Whitehouse

Station, NJ) to resolve the products (40).

73

To identify the products of hydrolysis of oat spelt xylan following

incubation with Xyn10D-Fae1A or Xyl3A thin layer chromatography was utilized.

Reaction mixtures (250 µl) were prepared in phosphate buffer (50 mM sodium phosphate, 100 mM NaCl, pH 6.0 (Xyn10D-Fae1A) or pH 5.0 (Xyl3A) or pH 5.5

(Xyn10D-Fae1A + Xyl3A) with oat spelt xylan (1%) and reactions were initiated by the addition of Xyn10D-Fae1A (0.50 µM) or Xyl3A (0.50 µM) or both enzymes

(0.50 µM, each). Following incubation at 37 °C for 14 h, the reaction mixtures

were centrifuged and then 2.5 µL or 5.0 µL were spotted onto TLC plates.

Monomeric xylose (X1) and xylo-oligosaccharides (X2-X6) (25 µg each)

were used as standards and the TLC plates were developed in an acetone-ethyl

acetate-acetic acid (2:1:1, v/v/v) mobile phase. The products were then

visualized by spraying the plates with a 1:1 (v/v) mixture of methanolic orcinol

(0.2% w/v) and sulfuric acid (20% v/v) followed by heating the plates at 100 °C

for 5 min (33, 53).

To quantify the amount of reducing equivalents that were released

following incubation of Xyn10D-Fae1A or Xyl3A or both enzymes with oat spelt xylan, the para-hydroxybenzoic acid hydrazide (PAHBAH) assay was performed as described previously by Lever (47).

Hydrolysis of para-nitrophenyl linked sugars. Enzyme-catalyzed hydrolysis of para-nitrophenyl (pNP) linked monosaccharide substrates was assayed using a thermostated Cary 300 UV-Vis spectrophotometer from Varian

Inc. (Palo Alto, CA). pNP substrates (10 mM) in acetate buffer (115 µL; 50 mM

74

sodium acetate, 100 mM NaCl, pH 5.0) were incubated at 37 °C in the presence or absence of Xyl3A (0.74 µM) for 10 min and the level of pNP release was determined by continuously monitoring the absorbance at 400 nm. The extinction coefficient for pNPl at pH 5.0 and a wavelength of 400 nm was measured as 319

M-1 cm-1.

Analysis of methyl ferulate hydrolysis. To evaluate the capacity of

Xyn10D-Fae1A and the Xyn10D-Fae1A site-directed mutants (see below) to hydrolyze the ester linkage of methyl ferulate, we utilized an HPLC based detection system. Reaction mixtures (2 mL) were prepared with methyl ferulate

(5 mM) in phosphate buffer (50 mM sodium phosphate, 100 mM NaCl, pH 6.0) and reactions were initiated by the addition of wild type or mutant Xyn10D-Fae1A

(1.2 µM). Following incubation at 37 °C for 30 minutes, an equal volume of ethyl acetate was added to the mixture, vortexed and then centrifuged at 6,000 x g for

10 minutes. The ethyl acetate layer was then transferred to a new tube, evaporated under N2 gas, and the extracted ferulic acid was dissolved in 1 mL of

50% methanol. The concentration of ferulic acid was then determined by HPLC as described by Wang et al. (67).

Site-directed mutagenesis. Mutagenesis was performed using the

QuikChange® Multi Site-Directed Mutagenesis Kit from Stratagene (La Jolla, CA).

First, mutagenic primers were engineered with the desired mutation in the center of the primer and ~15 bases of correct sequence on either side (Table 2.1).

75

Reaction mixtures were prepared as described in the QuikChange protocol with

pGEMT-xyn10D-fae1A and pGEMT-xyl3A as the DNA template for generation of

the Xyn10D-Fae1A and Xyl3A mutants, respectively. After cycling of the reaction

mixture 18 times in a PCR thermal cycler, the mixture was digested with DpnI,

and the resulting DNA was transformed into chemically competent E. coli JM109 cells by heat shock. The mutant genes were then subcloned into the pET-28a vector and sequenced to ensure that the appropriate mutations were introduced, while the rest of the gene sequences remained unchanged. Expression and purification of the mutant recombinant proteins were performed as described above for wild type Xyn10D-Fae1A and Xyl3A.

Steady-state kinetic measurements for wild type and mutant Xyl3A.

Kinetic studies of the wild type and mutant Xyl3A proteins were performed using a Cary 300 UV-Vis spectrophotometer equipped with a circulating water bath from Varian Inc. (Palo Alto, CA). pNP β-D-xylopyranoside (pNPX) (0-10 mM) was incubated in acetate buffer (50 mM sodium acetate, 100 mM NaCl, pH 5.0) at 37 oC in a 1 cm path-length quartz cuvette and reactions were initiated by the

addition of Xyl3A (0.74 µM). Hydrolysis of pNPX was continuously monitored by

recording the UV signal at 400 nm. Initial rate data were then plotted against the

substrate concentration and kinetic values were estimated by applying a

nonlinear curve fit using GraphPad Prism v5.01 from GraphPad Software (San

Diego, CA). The extinction coefficient for para-nitrophenol at pH 5.0 and a

wavelength of 400 nm was measured as 319 M-1 cm-1.

76

For mutants with very low activities, kcat(apparent) was determined at 10 mM

pNPX concentration and an enzyme concentration of 7.4 µM. Each kcat

determination was performed in triplicate and the rates of spontaneous pNPX

hydrolysis (absence of enzyme) were subtracted from the enzyme catalyzed

reactions.

1H NMR analysis of the stereochemical outcome of Xyl3A-catalyzed pNP-β-D-xylopyranoside hydrolysis. The stereochemistry at the reactive site

(C1) for pNP-β-D-xylopyranoside (pNPX) hydrolysis was assessed using an

INOVA 500NB MHz NMR spectrometer (Varian Inc.; Palo Alto, CA). pNPX (4.5

mM) in deuterated acetate buffer (800 µL; 50 mM sodium acetate, pD 5.0) was

incubated at 25 °C in the presence or absence of Xyl3A (2.5 µM). At t=0 min, 15

min, 6.5 h and 24 h, 1H NMR spectra were recorded with 8 accumulations and

data were analyzed using the Nuts2 NMR Data Processing Software (Acorn

NMR Inc.; Livermore, CA).

2.4 RESULTS

Identification, cloning, and expression of Prevotella ruminicola 23

Xyn10D-Fae1A. To identify potential xylanolytic gene clusters in P. ruminicola 23,

the genome sequence of the bacterium was scanned for putative glycoside

hydrolase (GH) family 10 or GH 11 endo-β-1,4-xylanases, which will generate xylo-oligosaccharides from a xylan source. This scan revealed the presence of three putative GH 10 endo xylanases; however, no GH 11 endoxylanases were

77

identified. One of these putative endo-xylanase genes (ORF02827) was predicted to encode a bi-functional polypeptide, composed of a GH 10 domain in the N-terminal half and a carbohydrate esterase (CE) family 1 ferulic acid esterase domain in the C-terminal half (10). Furthermore, ORF02827 was identified nearby another gene (ORF02829) encoding a putative GH family 3 β- glucosidase (Figure 2.1). Since a family 10 xylanase, XynC, has already been described from this bacterium, the gene encoding the putative bi-functional protein was designated xyn10D-fae1A. The gene was amplified by PCR and cloned into an expression vector for recombinant protein expression in E. coli.

Xyn10D-Fae1A was expressed as an N-terminal hexa-histidine fusion protein to facilitate purification by metal affinity chromatography. The predicted molecular weight for the hexa-histidine fusion protein, his-Xyn10D-Fae1A, is 84 kDa, which is in agreement with the size of the purified protein estimated by comparison with molecular weight markers through SDS-PAGE analysis (Figure 2.2A). To evaluate the quaternary structure of Xyn10D-Fae1A, the apparent molecular weight of purified Xyn10D-Fae1A was determined by size-exclusion chromatography. The molecular weight of Xyn10D-Fae1A was calculated from the retention times of the peak absorbance versus calibration standards of known molecular weights. The apparent molecular weight of Xyn10D-Fae1A was calculated to be 124 ± 1.7 kDa, and since the molecular weight of a monomer of his-Xyn10D-Fae1A is ~84 kDa, this result suggested a protein existing as either a monomer or homodimer in solution (Figure 2.2B).

78

Xyn10D-Fae1A possesses both xylanase and esterase activities. The gene sequence for xyn10D-fae1A is predicted to encode a polypeptide with two distinct catalytic properties (xylanase and esterase activities). To evaluate whether Xyn10D-Fae1A possesses endo-xylanase activity, agar plates containing 0.1% of either oat spelt xylan (OSX), carboxymethyl cellulose (CMC), or locust bean gum (LBG) were prepared. Purified Xyn10D-Fae1A (5 µg) was then spotted on each plate and incubated overnight at 37 °C. Upon staining of agar plates containing OSX with Congo red followed by destaining with 1 M NaCl, a clearing zone, representing hydrolysis of the polysaccharide substrate, was observed (Figure 2.2C). In contrast, clearing zones were not seen in the case of the plates containing CMC and LBG. These results suggested that Xyn10D-

Fae1A possesses xylanase activity, but not endoglucanase or mannanase activities. To evaluate whether Xyn10D-Fae1A possesses esterase activity,

Xyn10D-Fae1A in increasing amounts, 14 µg, 28 µg or 56 µg, was spotted onto agar plates infused with 1-naphthyl butyrate, incubated overnight at 37 °C, and stained with Fast Garnet GBC Sulfate (45). These experiments revealed that incubation with increasing amounts of Xyn10D-Fae1A led to a concomitant increase in 1-naphthol production, apparent from the intensity and size of the spots developed from staining with GBC sulfate (Figure 2.2D). The results, therefore, suggested that in addition to xylanase activity, Xyn10D-Fae1A possesses esterase activity. To determine the relevance of the Xyn10D-Fae1A to plant cell wall hydrolysis in the rumen, however, the specificity of the esterase activity needed determination.

79

Endo-β-1,4-xylanases hydrolyze the β-1,4 linkage between xylose units

within the backbone of xylan polymers. To evaluate the size of xylose products

that Xyn10D-Fae1A releases from xylo-oligosaccharides of different degrees of

polymerization, we incubated Xyn10D-Fae1A with xylo-oligosaccharides (X2-X6) and resolved the products by thin layer chromatography (TLC). In the presence of Xyn10D-Fae1A, xylobiose hydrolysis to xylose was not detected (Figure 2.2E),

however; all xylo-oligosaccharides of higher degrees of polymerization (X3-X6)

were hydrolyzed mostly to xylobiose with some xylose also being released

(Figure 2.2E). These results suggest that Xyn10D-Fae1A is a functional endo-β-

1,4-xylanase. Furthermore, these data indicate that the smallest xylo- oligosaccharide that Xyn10D-Fae1A can effectively hydrolyze is xylotriose.

Identification, cloning, and expression of Prevotella ruminicola 23

ORF02829. Xyn10D-Fae1A encodes a functional xylanase-esterase (Figure 2.2C

and 2-2D), however the major product released from xylo-oligosaccharides

appears to be xylobiose (Figure 2.2E). In order for P. ruminicola 23 to utilize

xylose from longer chain xylo-oligosaccharides or xylan, the bacterium requires

an enzyme that releases xylose or monomeric sugars from the products of

Xyn10D-Fae1A. We, therefore, cloned and expressed the two genes

downstream of the Xyn10D-Fae1A encoding gene. While no catalytic activity has

been detected yet for ORF02828 (hypothetical protein), the product of

ORF02829, expressed as an N-terminal hexa-histidine fusion protein, exhibited

robust catalytic activity as described below. The gene product was purified by

80

metal affinity chromatography, and the predicted molecular weight of the

hexahistidine fusion protein (98 kDa) was in agreement with the size of the

purified protein estimated by comparison with molecular weight markers resolved

by SDS-PAGE (Figure 2.3A). To evaluate the quaternary structure for this fusion

protein, the apparent molecular weight was determined by size-exclusion

chromatography. As shown in figure 2.3B, the elution volume of the putative β-

glucosidase was 8.22 mL, which was just after the exclusion volume of the

column (~7.8 mL), indicating that this protein exists as a large polymeric species

in solution. This peak had a shorter retention time than that for the highest

molecular weight standard tested (Figure 2.3B, inset). Therefore, the apparent

molecular weight for the protein was estimated by extrapolation of the standard

curve. Based on our calculation, the apparent molecular weight for this protein is

868 ± 2.2 kDa, which is close to the value expected for a nonamer (886 kDa).

ORF02829 encodes a bi-functional β-xylosidase/α-L-

arabinofuranosidase. The gene sequence for ORF02829 is predicted to encode

a GH family 3 glycoside hydrolase and was annotated as a β-D-glucosidase. To

determine the substrate specificity for the gene product of ORF02829, the

recombinant protein was screened for activity with a library of α- and β-para- nitrophenol linked monosaccharides. The putative β-D-glucosidase was incubated in the presence of the derivatized sugars and the reactions were monitored spectrophotometrically. These experiments revealed that when incubated with pNP-β-D-glucopyranoside, release of pNP was not detectable

81

(Figure 2.3C). In contrast, hydrolysis of pNP-β-D-xylopyranoside (pNPX) was

readily observed in the presence of the protein (Figure 2.3C). In addition, low

activity was observed with pNP-α-L-arabinofuranoside, however, this activity represented a fraction of the pNP-β-D-xylopyranoside activity (Figure 2.3C).

These results indicated that ORF02829 encodes a functional glycoside hydrolase that exhibits primarily β-xylosidase activity, although it also possesses the capacity to hydrolyze pNP-α-L-arabinofuranoside with lower activity. Based on these results, which indicated that ORF02829 encodes the first GH family 3 xylosidase to be characterized from P. ruminicola 23, the gene was designated,

xyl3A.

To evaluate whether Xyl3A possesses the capacity to hydrolyze β-1,4-

linked xylo-oligosaccharides of various lengths, Xyl3A (0.42 µM) was incubated

with the substrates (X2-X6) and the products were resolved by thin layer

chromatography followed by staining with methanolic orcinol. Upon incubation of

the enzyme with each of the β-1,4-linked xylo-oligosaccharides, the sole products

released were xylose monomers (Figure 2.3D). These experiments revealed that

Xyl3A is capable of hydrolyzing the glycosidic linkages of β-1,4-xylo-

oligosaccharides with varying lengths.

Taken together, these results suggest that xyl3A does not encode an

enzyme with β-D-glucosidase activity as previously annotated; rather this gene

encodes a bi-functional β-D-xylosidase / α-L-arabinofuranosidase that can

hydrolyze the glycosidic linkage of β-1,4-xylo-oligosaccharides.

82

Xyn10D-Fae1A and Xyl3A function synergistically to release xylose from xylan. To evaluate whether the enzymatic properties of Xyn10D-Fae1A and Xyl3A may function coordinately in the depolymerization of xylan, the two enzymes were incubated independently or in combination with oat spelt xylan

(OSX) and the products of hydrolysis were resolved by TLC. In the absence of either enzyme, no depolymerization of the xylan substrate was detected (Figure

2.4A). When OSX was incubated with Xyn10D-Fae1A, depolymerization of the xylan substrate was detected (Figure 2.4A). By comparison with xylo- oligosaccharide standards, the major hydrolysis products are predicted to be xylobiose and several other undefined products. When OSX was incubated with

Xyl3A, a relatively small amount of depolymerization was observed (Figure 2.4B) and the only detectable product was predicted to be xylose (Figure 2.4A).

Incubation of OSX with both Xyn10D-Fae1A and Xyl3A resulted in a similar xylan depolymerization pattern as observed for Xyn10D-Fae1A alone (Figure 2.4A), however conversion of the hydrolytic products to the monosaccharide, xylose, was increased (Figure 2.4B).

These experiments revealed that when incubated together, Xyn10D-

Fae1A and Xyl3A release more xylose from oat spelt xylan than each individual enzyme. Taken together, these results indicate that the catalytic properties of

Xyn10D-Fae1A and Xyl3A are synergistic and that the two enzymes function to release monosaccharides from oat spelt xylan, a complex substrate.

83

Mutational analyses of Xyn10D-Fae1A maps the two catalytic

domains to discrete regions of the polypeptide sequence. To test our

prediction that Xyn10D-Fae1A possesses two distinct catalytic modules, amino

acid substitutions were introduced in the putative catalytic residues for these two

domains. Based upon alignment with the catalytic nucleophile (Glu298) identified

for the extracellular xylanase from Bacillus stearothermophilus (29, 64) we

predicted that Glu280 may be the catalytic nucleophile for the xylanase domain

of Xyn10D-Fae1A (Figure 2.5A). Amino acid alignments also revealed that a

catalytic triad (Ser-His-Glu) that is crucial for catalysis in ferulic acid esterase

enzymes (6, 58) is present and that S629 may be the catalytic nucleophile for the

esterase domain of Xyn10D-Fae1A (Figure 2.5A).

When the putative nucleophile for the xylanase domain was mutated

(Glu280Ser), the capacity for Xyn10D-Fae1A to depolymerize oat spelt xylan was

attenuated (Figure 2.5B, lanes 5-6 and Figure 2.5C). Similarly, when the putative

nucleophile for the esterase domain was mutated (Ser629Ala), the ferulic acid

esterase activity for Xyn10D-Fae1A was attenuated (Figure 2.5D). These results suggest that for Xyn10D-Fae1A, Glu280 is an important residue for xylanase activity and Ser629 is an important residue for ferulic acid esterase activity.

Furthermore, these results indicate that these distinct catalytic activities

(xylanase and esterase) map to distinct regions on the polypeptide.

To evaluate whether these independent mutations influenced the catalytic

activity of the adjacent domains, the Glu280Ser mutant was monitored for ferulic

acid esterase activity and the Ser629Ala mutant was characterized for xylanase

84

activity. When incubated with oat spelt xylan, the Ser629Ala mutant

depolymerized xylan to a similar extent as compared to the wild type Xyn10D-

Fae1A, although a statistically significant decreased was detected for the mutant

protein (Figure 2.5B and 2-5C). However, when the Glu280Ser mutant was

evaluated for ferulic acid esterase activity, the specific activity was increased 3-

fold over the wild type Xyn10D-Fae1A (Figure 2.5D).

Xyl3A is a GH family 3 β-xylosidase with a unique domain

architecture. Comparison of the domain architectures of Xyl3A and other

biochemically characterized members of the GH Family 3 utilizing Pfam (24)

suggested that Xyl3A possesses two domains, as with most members of this family. However, the analysis also predicted that Xyl3A, which belongs to cluster

B, has a PA14 domain insertion in the C-terminal GH family 3 domain (Table 2.2).

It is predicted that the PA14-like insertion sequence may represent a novel carbohydrate binding module (57).

In order to identify conserved amino acids that may play important roles in

catalysis among members of the GH family 3, amino acid sequences for

biochemically characterized GH family 3 β-glycosidases were aligned (data not

shown). Several conserved motifs were identified, and conserved residues

shown to reside within the active site of ExoI, a family 3 glycoside hydrolase from

Hordeum vulgare (Barley) (65), were chosen for site-directed mutagenesis (Table

2.2).

85

Steady state kinetic analysis of wild type and mutant Xyl3A. To compare the catalytic properties for wild type and six mutant forms (R149L,

Y237F, D269A, D269N, E594Q, and E616A) of Xyl3A, we analyzed the steady- state kinetic values of these enzymes in the presence of pNPX. Wild type or mutant Xyl3A was incubated in a sodium acetate buffer in the presence of various concentrations of pNPX and the change in magnitude of the A400nm signal was monitored by UV spectroscopy. Initial rate data for Xyl3A and the mutants were obtained for each of the substrate concentrations and plots of the initial velocity of pNPX hydrolysis (nmol/sec) versus substrate concentration (µM) were generated (data not shown). For mutants that exhibited very small rates, the kcatapp was determined at 10 mM substrate. Steady-state kinetic values for wild type or mutant Xyl3A were obtained by applying a non-linear curve fit to the data

(Table 2.3). These results suggested that the E594Q mutation has no significant effect on catalysis for Xyl3A. Conversely, the Y237F and E616A mutations have moderate effects on catalysis for Xyl3A, and the R149L, D269A, and D269N mutations have severe effects on catalysis for Xyl3A. The circular dichroism (CD) spectra for the wild type and Xyl3A mutants were collected and analyzed as described by Dodd et al. (18). Comparison of the CD spectra for the wild type and Xyl3A mutants revealed that they possess similar overall secondary structures (data not shown).

Xyl3A catalyzes the hydrolysis of pNPX with net retention of stereochemical configuration. To establish whether Xyl3A is an inverting or

86

retaining β-xylosidase, we monitored the stereochemical configuration of the anomeric carbon (C1) of pNPX during the course of Xyl3A-catalyzed hydrolysis by 1H-NMR spectroscopy. pNPX (4.5 mM) was dissolved in deuterated acetate buffer, and Xyl3A (2.5 µM) was added to the reaction. At t=0 min, 15 min, 6.5 h and 24 h, 1H-NMR spectra were collected and data were analyzed using the

Nuts2 NMR Data Processing Software. Prior to addition of the enzyme (t=0 min), a doublet peak centered at δ 5.15 ppm, which represents the C1β proton of pNPX was observed. At t=15 min, the doublet peak at δ 5.15 ppm decreased in size, while two doublet peaks centered at δ 4.48 ppm and δ 5.11 ppm appeared

(Figure 2.6A). The ratio of these two peaks which represent the C1β proton of β- xylose (δ 4.48 ppm) and the C1α proton of α-xylose (δ 5.11 ppm) was 95:5. At t=6.5 h, the doublet at δ 5.15 ppm was further diminished, and the doublet peaks at δ 4.48 ppm and δ 5.11 ppm had both increased. At t=6.5 h, the ratio of the δ

4.48 ppm and δ 5.11 ppm peaks was 65:35. At t=24 h, the doublet peak at δ 5.15 ppm was not detectable, indicating that the pNPX substrate had been hydrolyzed to completion. The final ratio of the peaks centered at δ 4.48 ppm and δ 5.11 ppm after the 24 h incubation was 65:35.

The 95:5 ratio for β-xylose:α-xylose at t=15 min indicates that the product of Xyl3A-catalyzed hydrolysis of pNPX is β-xylose. At later time points, the observed equilibrium ratio of 65:35 (20) for β-xylose:α-xylose is highly suggestive of mutarotation (Figure 2.6B). These experiments revealed that Xyl3A catalyzes the hydrolysis of pNPX with retention of stereochemical configuration.

87

2.5 DISCUSSION

The rumen harbors a complex community of microbes which catalyzes the depolymerization of cellulose and hemicellulose to soluble sugars which are then fermented to short chain fatty acids. These end products of fermentation are utilized by the mammalian host as the major source of energy (3, 41). Among the bacterial genera in the rumen, Prevotella has been reported to be one of the most numerous (8, 60). Prevotella ruminicola 23 is one of the most commonly isolated species (8) from rumen contents and possesses the capacity to degrade plant cell wall polysaccharides, proteins, and peptides (61). Therefore, this species is thought to play an important role in the utilization of xylans and other hemicelluloses within the rumen.

Two xylan-hydrolyzing enzymes have been previously characterized from

P. ruminicola: a bi-functional GH family 5 xylanase / endoglucanase cloned from the type strain, P. ruminicola 23T (70) and a GH family 10 xylanase with an interrupted catalytic domain cloned from the related strain, P. ruminicola D31d

(71). The recently sequenced genome for P. ruminicola 23 harbors genes corresponding to these two previously characterized xylanases (data not shown), and also harbors two additional putative xylanase genes that have not been previously characterized. One of these genes was annotated as a bi-functional xylanase-esterase (designated xyn10D-fae1A). To provide further insight into the molecular basis for xylan depolymerization by P. ruminicola 23, we cloned this gene and characterized the recombinant protein. Our results show that xyn10D- fae1A encodes a bi-functional GH family 10 xylanase/ferulic acid esterase.

88

Furthermore, directed mutagenesis studies mapped the two catalytic functionalities to distinct regions on the polypeptide. A unique result that was obtained during the course of these experiments was that mutation of the putative nucleophile for the xylanase domain (Glu280Ser) led to an approximately 3-fold increase in ferulic acid esterase activity. While a clear explanation for these data will require further biophysical experiments, these results indicate that the xylanase and esterase domains may be functionally coupled.

The recently sequenced genomes for Bacteroides intestinalis DSM 17393

(GenBank accession no. EDV07678.1) and Bacteroides eggerthii DSM 20697

(GenBank accession no. ABVO01000038.1, nucleotides 298769 – 296631) harbor the only putative proteins in the GenBank database with significant sequence identity (64% for both) across the entire protein sequence for Xyn10D-

Fae1A. Two genes from C. thermocellum (xynY and xynZ) encode proteins with both GH family 10 xylanase (26, 31) and ferulic acid esterase activities (6); however in addition, the clostridial proteins have carbohydrate binding modules

(CBMs) and dockerin domains. Despite the presence of these accessory domains in C. thermocellum XynY and XynZ, the natural occurrence of the GH

10 xylanase and CE 1 ferulic acid esterase domains within a single polypeptide sequence from different organisms suggests that this linkage may impart a selective advantage for xylan degradation over the two enzymatic functionalities in isolation. Another rumen bacterium, Ruminococcus flavefaciens 17, harbors a gene predicted to encode a protein with GH 11 xylanase and CE 1 esterase

89

domains (1), however the biochemical properties of this protein have not been

characterized. Thus it is not clear whether the CE 1 domain possesses ferulic

acid esterase activity as is the case for Xyn10D-Fae1A.

A number of different exo-glycosidase functions have been identified for

GH family 3 enzymes including β-D-glucosidase, β-D-xylosidase, α-L- arabinofuranosidase, and N-actetyl-β-D-glucosaminidase (21). When we cloned the putative β-D-glucosidase gene (ORF02829) and characterized the recombinant protein, we found that it releases pNP from pNP-β-D-xylopyranoside and pNP-α-L-arabinofuranoside, but not pNP-β-D-glucopyranoside. The hydrolysis of both pNP-β-D-xylopyranoside and pNP-α-L-arabinofuranoside

(Xyl/Ara) has been reported for some members of the GH family 3 (20, 46, 51,

74). With one exception (68), all of the characterized family 3 β-D-xylosidases possess substrate specificity for Xyl/Ara and do not possess β-D-glucosidase activity. The crystal structure of a family 3 β-D-glucosidase from Barley (Hordeum vulgare) revealed that Asp120 forms a hydrogen bond to the 6'OH group of the glucose substrate bound at the -1 subsite (36). The authors found that Asp120 was highly conserved amongst biochemically defined family 3 β-D-glucosidases and also observed that the Xyl/Ara-active enzymes did not possess this conserved aspartate residue (36). The pentose sugars (β-D-xylopyranose and α-

L-arabinofuranose) do not possess the additional CH2OH substituent from C5

that is present in glucose, thus it is possible that the larger glutamate residue could function to discriminate between hexoses and pentoses on the basis of steric interactions. It is more difficult to explain why the structurally distinct six-

90

membered ring of xylopyranose and the five-membered ring of arabinofuranose

are permitted into the active site (36, 44). To identify a primary amino acid

sequence motif that might distinguish the Xyl/Ara-active enzymes from the

glucose active enzymes, we constructed amino acid sequence alignments for

biochemically defined members of these two distinct groups. These alignments

revealed a conserved motif (WWSEAL) for the Xyl/Ara enzymes that was not

found in the β-D-glucosidase enzymes (data not shown). This motif may be

useful for distinguishing between GH family 3 β-D-glucosidases and Xyl/Ara- active enzymes based solely on primary sequence analysis.

Xyl3A possesses a PA14 domain that forms an insertion within the C- terminal GH family 3 domain (Table 2.2). The precise function for the PA14 domain remains poorly defined, however its presence in bacterial, archaeal, and eukaryotic proteins including glycosidases, glycosyltransferases, proteases, amidases, adhesins, and bacterial toxins (57) and its predominantly β-sheet structure (56) suggests a carbohydrate-binding function. Alignment of the amino acid sequences for Xyl3A with other GH family 3 β-D-xylosidases / α-L- arabinofuranosidases suggests that the PA14 domain may be inserted within the

C-terminal domain (data not shown). The predicted proximity of this region to the active site supports the possibility that the PA14 domain may aid in binding to oligosaccharide substrates. Thus, insertion of the PA14 domain in this region could influence the substrate specificity for the enzyme. A recent report identified a loop within the PA14 domain of two fungal adhesins from Candida glabrata that is responsible for determining the binding specificity (14, 75) to eukaryotic

91

membrane-anchored glycoproteins. The amino acid sequence for the PA14

domain of Xyl3A aligns with the ligand specificity determinants for the fungal cell

surface adhesins Epa6 and Epa7 (data not shown) which supports the possibility

that the PA14 domain could facilitate anchoring of Xyl3A to sugars. Identification

of the true functional role for the PA14 domain within Xyl3A will require further

studies.

There are a number of residues that have been shown to make hydrogen

bond contacts to hydroxyl groups of the glycosyl substrate within the active site

for the β-D-glucan glucohydrolase (ExoI) from H. vulgare (65), and directed

mutagenesis studies of the β-glucosidase from Flavobacterium meningosepticum

have confirmed the importance of these residues for catalysis (49). We identified

analogous residues in Xyl3A by amino acid sequence alignments and based on

these alignments, we constructed site-directed mutants that are predicted to

have important roles in catalysis (residues indicated in Table 2.2). The significant

attenuation in activity for the mutants tested in this study (Table 2.3) provides

support for the classification of Xyl3A as a family 3 glycoside hydrolase. This

assignment is further strengthened by the observation that the hydrolysis of pNP-

β-D-xylopyranoside by Xyl3A occurs with net retention of stereochemical

configuration (Fig. 2.6), a property also shared by GH family 3 members.

In the gene cluster identified in this study, there is a protein just upstream of xyl3A that was annotated as a hypothetical protein. Based on the short intergenic space (31 nucleotides) between this gene and xyl3A, it is possible that a single polycistronic mRNA may encode these two gene products. Comparison

92

of the hypothetical protein with other proteins in the GenBank database revealed

that this putative protein shares significant sequence identity (30%) with an α-

1,2-fucosidase from Bifidobacterium bifidum (39, 54). It is unclear whether this protein may function synergistically with Xyn10D-Fae1A and Xyl3A to hydrolyze xylan, however studies of the substrate specificity for this protein are currently underway in our laboratory.

A putative hybrid two-component system (HTCS) regulator was identified just downstream of the xyl3A gene (Figure 2.1). These HTCS regulators contain all of the regulatory elements found in ‘classical’ two-component regulatory systems, but rather than encoding the sensor and effector domains in separate genes as in the ‘classical’ system, HTCS’s combine all of these functions within a single polypeptide. The domain organization for the putative HTCS regulator downstream of Xyl3A is analogous to that for a HTCS regulator characterized from the related bacterium, Bacteroides thetaiotaomicron VPI-5482 (59), however the sequence similarity across the entire polypeptide is relatively low

(24%). The putative HTCS regulator in P. ruminicola 23 shares considerable amino acid identity (43%) with XynR, a biochemically defined regulator of xylanase gene expression in P. bryantii B14 (52). This regulator was shown to be

important for coordinating the transcription of a xylan utilization gene cluster in

response to growth on xylan in P. bryantii B14. It is possible that the putative

HTCS regulator downstream of Xyl3A in P. ruminicola 23 functions in a similar

fashion. The additional 538 amino acids in the putative P. ruminicola 23 HTCS

map to a region of the polypeptide predicted to be a periplasmic ligand binding

93

domain which represents an extension of this domain relative to that found in

XynR from P. bryantii B14. Thus, it is possible that the putative HTCS in P.

ruminicola 23 could coordinate xylanolytic gene expression similar to XynR from

P. bryantii B14, however, the sensor domain may respond to different signals.

2.6 ACKNOWLEDGEMENTS

This work was conducted under a Sponsored Research Agreement no.

2005-2619-00 with Archer Daniels Midland Company, Decatur, Illinois. Partial support was also supported in part by the USDA Cooperative State Research,

Education and Extension Service, Hatch project ILLU 538-364 (to I. K. O. C.) and

the USDA National Research Initiative (proposal no. AG 2008-35206-18784 (R. I.

M., I. K. O. C., and M. A. S.)).

2.7 REFERENCES

1. Aurilia, V., J. C. Martin, S. I. McCrae, K. P. Scott, M. T. Rincon, and H. J. Flint. 2000. Three multidomain esterases from the cellulolytic rumen anaerobe Ruminococcus flavefaciens 17 that carry divergent dockerin sequences. Microbiology 146:1391-1397.

2. Avgustin, G., H. J. Flint, and T. R. Whitehead. 1992. Distribution of xylanase genes and enzymes among strains of Prevotella (Bacteroides) ruminicola from the rumen. FEMS Microbiol Lett 78:137-143.

3. Bergman, E. N. 1990. Energy contributions of volatile fatty acids from the gastrointestinal tract in various species. Physiol Rev 70:567-590.

4. Bertani, G. 2004. Lysogeny at mid-twentieth century: P1, P2, and other experimental systems. J Bacteriol 186:595-600.

5. Bertani, G. 1951. Studies on lysogenesis. I. The mode of phage liberation by lysogenic Escherichia coli. J Bacteriol 62:293-300.

94

6. Blum, D. L., I. A. Kataeva, X. L. Li, and L. G. Ljungdahl. 2000. Feruloyl esterase activity of the Clostridium thermocellum cellulosome can be attributed to previously unknown domains of XynY and XynZ. J Bacteriol 182:1346-1351.

7. Bradford, M. M. 1976. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72:248-254.

8. Bryant, M. P., N. Small, C. Bouma, and H. Chu. 1958. Bacteroides ruminicola n. sp. and Succinimonas amylolytica; the new genus and species; species of succinic acid-producing anaerobic bacteria of the bovine rumen. J Bacteriol 76:15-23.

9. Cann, I. K., S. Ishino, M. Yuasa, H. Daiyasu, H. Toh, and Y. Ishino. 2001. Biochemical analysis of replication factor C from the hyperthermophilic archaeon Pyrococcus furiosus. J Bacteriol 183:2614- 2623.

10. Cantarel, B. L., P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, and B. Henrissat. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-D238.

11. Coughlan, M. P. 1985. Cellulose hydrolysis: the potential, the problems and relevant research at Galway. Biochem Soc Trans 13:405-406.

12. Cournoyer, B., and D. Faure. 2003. Radiation and functional specialization of the family-3 glycoside hydrolases. J Mol Microbiol Biotechnol 5:190-198.

13. Dan, S., I. Marton, M. Dekel, B. A. Bravdo, S. He, S. G. Withers, and O. Shoseyov. 2000. Cloning, expression, characterization, and nucleophile identification of family 3, Aspergillus niger beta-glucosidase. J Biol Chem 275:4973-4980.

14. de Groot, P. W., and F. M. Klis. 2008. The conserved PA14 domain of cell wall-associated fungal adhesins governs their glycan binding specificity. Mol Microbiol 68:535-537

15. Dehority, B. A. 1969. Pectin-fermenting bacteria isolated from the bovine rumen. J Bacteriol 99:189-196.

16. Deshpande, V., K. E. Eriksson, and B. Pettersson. 1978. Production , purification and partial characterization of 1,4-beta-glucosidase enzymes from Sporotrichum pulverulentum. Eur J Biochem 90:191-198.

95

17. Dodd, D., and I. K. O. Cann. 2009. Enzymatic deconstruction of xylan for biofuel production. GCB Bioenergy 1:2-17.

18. Dodd, D., J. G. Reese, C. R. Louer, J. D. Ballard, M. A. Spies, and S. R. Blanke. 2007. Functional comparison of the two Bacillus anthracis glutamate racemases. J Bacteriol 189:5265-5275.

19. Eneyskaya, E. V., H. Brumer, 3rd, L. V. Backinowsky, D. R. Ivanen, A. A. Kulminskaya, K. A. Shabalin, and K. N. Neustroev. 2003. Enzymatic synthesis of beta-xylanase substrates: transglycosylation reactions of the beta-xylosidase from Aspergillus sp. Carbohydr Res 338:313-325.

20. Eneyskaya, E. V., D. R. Ivanen, K. S. Bobrov, L. S. Isaeva-Ivanova, K. A. Shabalin, A. N. Savel'ev, A. M. Golubev, and A. A. Kulminskaya. 2007. Biochemical and kinetic analysis of the GH3 family beta-xylosidase from Aspergillus awamori X-100. Archives of biochemistry and biophysics 457:225-234.

21. Faure, D. 2002. The family-3 glycoside hydrolases: from housekeeping functions to host-microbe interactions. Appl Environ Microbiol 68:1485- 1490.

22. Faure, D., J. Desair, V. Keijers, M. A. Bekri, P. Proost, B. Henrissat, and J. Vanderleyden. 1999. Growth of Azospirillum irakense KBC1 on the aryl beta-glucoside salicin requires either salA or salB. J Bacteriol 181:3003-3009.

23. Faure, D., B. Henrissat, D. Ptacek, M. A. Bekri, and J. Vanderleyden. 2001. The celA gene, encoding a glycosyl hydrolase family 3 beta- glucosidase in Azospirillum irakense, is required for optimal growth on cellobiosides. Appl Environ Microbiol 67:2380-2383.

24. Finn, R. D., J. Mistry, B. Schuster-Bockler, S. Griffiths-Jones, V. Hollich, T. Lassmann, S. Moxon, M. Marshall, A. Khanna, R. Durbin, S. R. Eddy, E. L. Sonnhammer, and A. Bateman. 2006. Pfam: clans, web tools and services. Nucleic Acids Res 34:D247-D251.

25. Flint, H. J., T. R. Whitehead, J. C. Martin, and A. Gasparic. 1997. Interrupted catalytic domain structures in xylanases from two distantly related strains of Prevotella ruminicola. Biochim Biophys Acta 1337:161- 165.

26. Fontes, C. M., G. P. Hazlewood, E. Morag, J. Hall, B. H. Hirst, and H. J. Gilbert. 1995. Evidence for a general role for non-catalytic thermostabilizing domains in xylanases from thermophilic bacteria. Biochem J 307:151-158.

96

27. Gasparic, A., R. Marinsek-Logar, J. Martin, R. J. Wallace, F. V. Nekrep, and H. J. Flint. 1995. Isolation of genes encoding beta-D- xylanase, beta-D-xylosidase and alpha-L-arabinofuranosidase activities from the rumen bacterium Prevotella ruminicola B(1)4. FEMS Microbiol Lett 125:135-141.

28. Gasparic, A., J. Martin, A. S. Daniel, and H. J. Flint. 1995. A xylan hydrolase gene cluster in Prevotella ruminicola B(1)4: sequence relationships, synergistic interactions, and oxygen sensitivity of a novel enzyme with exoxylanase and beta-(1,4)-xylosidase activities. Appl Environ Microbiol 61:2958-2964.

29. Gat, O., A. Lapidot, I. Alchanati, C. Regueros, and Y. Shoham. 1994. Cloning and DNA sequence of the gene coding for Bacillus stearothermophilus T-6 xylanase. Appl Environ Microbiol 60:1889-1896.

30. Gilbert, H. J., and Hazlewood, G. P. 1993. Bacterial cellulases and xylanases. Journal of General Microbiology 139:187-194.

31. Grepinet, O., M. C. Chebrou, and P. Beguin. 1988. Purification of Clostridium thermocellum xylanase Z expressed in Escherichia coli and identification of the corresponding product in the culture medium of C. thermocellum. J Bacteriol 170:4576-4581.

32. Griswold, K. E., and R. I. Mackie. 1997. Degradation of protein and utilization of the hydrolytic products by a predominant ruminal bacterium, Prevotella ruminicola B(1)4. J Dairy Sci 80:167-175.

33. Han, S. O., H. Yukawa, M. Inui, and R. H. Doi. 2004. Isolation and expression of the xynB gene and its product, XynB, a consistent component of the Clostridium cellulovorans cellulosome. J Bacteriol 186:8347-8355.

34. Henrissat, B. 1992. Analysis of hemicellulases sequences. Relationships to other glycanases, p. 97-110. In J. Visser, Beldman, G., Kusters-van Someren, M. A., and Voragen, A. G. J. (ed.), Xylans and Xylanases, vol. 7. Elsevier, Amsterdam.

35. Hespell, R. B., Akin, D. E., and Dehority, B. A. 1997. Bacteria, fungi, and protozoa of the rumen., p. 59-141. In R. I. Mackie, White, B. A., and Isaacson, R. E. (ed.), Gastrointestinal Microbiology. Chapman and Hall, New York.

36. Hrmova, M., R. De Gori, B. J. Smith, J. K. Fairweather, H. Driguez, J. N. Varghese, and G. B. Fincher. 2002. Structural basis for broad substrate specificity in higher plant beta-D-glucan glucohydrolases. Plant Cell 14:1033-1052.

97

37. Hrmova, M., A. J. Harvey, J. Wang, N. J. Shirley, G. P. Jones, B. A. Stone, P. B. Hoj, and G. B. Fincher. 1996. Barley beta-D-glucan exohydrolases with beta-D-glucosidase activity. Purification, characterization, and determination of primary structure from a cDNA clone. J Biol Chem 271:5277-5286.

38. Jeffries, T. W. 1990. Biodegradation of lignin-carbohydrate complexes. Biodegradation 1:163-176.

39. Katayama, T., A. Sakuma, T. Kimura, Y. Makimura, J. Hiratake, K. Sakata, T. Yamanoi, H. Kumagai, and K. Yamamoto. 2004. Molecular cloning and characterization of Bifidobacterium bifidum 1,2-alpha-L- fucosidase (AfcA), a novel inverting glycosidase (glycoside hydrolase family 95). J Bacteriol 186:4885-4893.

40. Kiyohara, M., Y. Hama, K. Yamaguchi, and M. Ito. 2006. Structure of beta-1,3-xylooligosaccharides generated from Caulerpa racemosa var. laete-virens beta-1,3-xylan by the action of beta-1,3-xylanase. J Biochem (Tokyo) 140:369-373.

41. Krause, D. O., S. E. Denman, R. I. Mackie, M. Morrison, A. L. Rae, G. T. Attwood, and C. S. McSweeney. 2003. Opportunities to improve fiber degradation in the rumen: microbiology, ecology, and genomics. FEMS Microbiol Rev 27:663-693.

42. Kuhad, R. C. 1993. Lignocellulose biotechnology: current and future prospects. Critical Reviews in Biotechnology 13:151-172.

43. Laemmli, U. K. 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680-685.

44. Langston, J., N. Sheehy, and F. Xu. 2006. Substrate specificity of Aspergillus oryzae family 3 beta-glucosidase. Biochim Biophys Acta 1764:972-978.

45. Lanz, W. W., and P. P. Williams. 1973. Characterization of esterases produced by a ruminal bacterium identified as Butyrivibrio fibrisolvens. J Bacteriol 113:1170-1176.

46. Lee, R. C., M. Hrmova, R. A. Burton, J. Lahnstein, and G. B. Fincher. 2003. Bifunctional family 3 glycoside hydrolases from barley with alpha -L- arabinofuranosidase and beta -D-xylosidase activity. Characterization, primary structures, and COOH-terminal processing. J Biol Chem 278:5377-5387.

47. Lever, M. 1972. A new reaction for colorimetric determination of carbohydrates. Anal Biochem 47:273-279.

98

48. Li, Y. K., J. Chir, and F. Y. Chen. 2001. Catalytic mechanism of a family 3 beta-glucosidase and mutagenesis study on residue Asp-247. Biochem J 355:835-840.

49. Li, Y. K., J. Chir, S. Tanaka, and F. Y. Chen. 2002. Identification of the general acid/base catalyst of a family 3 beta-glucosidase from Flavobacterium meningosepticum. Biochemistry 41:2751-2759.

50. Matsushita, O., J. B. Russell, and D. B. Wilson. 1991. A Bacteroides ruminicola 1,4-beta-D-endoglucanase is encoded in two reading frames. J Bacteriol 173:6919-6926.

51. Minic, Z., C. Rihouey, C. T. Do, P. Lerouge, and L. Jouanin. 2004. Purification and characterization of enzymes exhibiting beta-D-xylosidase activities in stem tissues of Arabidopsis. Plant Physiol 135:867-878.

52. Miyazaki, K., H. Miyamoto, D. K. Mercer, T. Hirase, J. C. Martin, Y. Kojima, and H. J. Flint. 2003. Involvement of the multidomain regulatory protein XynR in positive control of xylanase gene expression in the ruminal anaerobe Prevotella bryantii B(1)4. J Bacteriol 185:2219-2226.

53. Morag, E., E. A. Bayer, and R. Lamed. 1990. Relationship of cellulosomal and noncellulosomal xylanases of Clostridium thermocellum to cellulose-degrading enzymes. J Bacteriol 172:6098-6105.

54. Nagae, M., A. Tsuchiya, T. Katayama, K. Yamamoto, S. Wakatsuki, and R. Kato. 2007. Structural basis of the catalytic reaction mechanism of novel 1,2-alpha-L-fucosidase from Bifidobacterium bifidum. J Biol Chem 282:18497-18509.

55. Osborne, J. M., and B. A. Dehority. 1989. Synergism in degradation and utilization of intact forage cellulose, hemicellulose, and pectin by three pure cultures of ruminal bacteria. Appl Environ Microbiol 55:2247-2250.

56. Petosa, C., R. J. Collier, K. R. Klimpel, S. H. Leppla, and R. C. Liddington. 1997. Crystal structure of the anthrax toxin protective antigen. Nature 385:833-838.

57. Rigden, D. J., L. V. Mello, and M. Y. Galperin. 2004. The PA14 domain, a conserved all-beta domain in bacterial toxins, enzymes, adhesins and signaling molecules. Trends Biochem Sci 29:335-339.

58. Schubot, F. D., I. A. Kataeva, D. L. Blum, A. K. Shah, L. G. Ljungdahl, J. P. Rose, and B. C. Wang. 2001. Structural basis for the substrate specificity of the feruloyl esterase domain of the cellulosomal xylanase Z from Clostridium thermocellum. Biochemistry 40:12524-12532.

99

59. Sonnenburg, E. D., J. L. Sonnenburg, J. K. Manchester, E. E. Hansen, H. C. Chiang, and J. I. Gordon. 2006. A hybrid two-component system protein of a prominent human gut symbiont couples glycan sensing in vivo to carbohydrate metabolism. Proc Natl Acad Sci U S A 103:8834-8839.

60. Stevenson, D. M., and P. J. Weimer. 2007. Dominance of Prevotella and low abundance of classical ruminal bacterial species in the bovine rumen revealed by relative quantification real-time PCR. Appl Microbiol Biotechnol 75:165-174.

61. Stewart, C. S., Flint, H. J., and Bryant, M. P. 1997. The Rumen Bacteria, p. 10-72. In P. N. Hobson, and Stewart, C. S. (ed.), The Rumen Microbial Ecosystem, Second ed. Elsevier, New York.

62. Sumida, T., N. Sueyoshi, and M. Ito. 2002. Molecular cloning and characterization of a novel glucocerebrosidase of Paenibacillus sp. TS12. J Biochem 132:237-243.

63. Teather, R. M., and P. J. Wood. 1982. Use of Congo red-polysaccharide interactions in enumeration and characterization of cellulolytic bacteria from the bovine rumen. Appl Environ Microbiol 43:777-780.

64. Teplitsky, A., A. Mechaly, V. Stojanoff, G. Sainz, G. Golan, H. Feinberg, R. Gilboa, V. Reiland, G. Zolotnitsky, D. Shallom, A. Thompson, Y. Shoham, and G. Shoham. 2004. Structure determination of the extracellular xylanase from Geobacillus stearothermophilus by selenomethionyl MAD phasing. Acta Crystallogr D Biol Crystallogr 60:836- 848.

65. Varghese, J. N., M. Hrmova, and G. B. Fincher. 1999. Three- dimensional structure of a barley beta-D-glucan exohydrolase, a family 3 glycosyl hydrolase. Structure 7:179-190.

66. Vercoe, P. E., and K. Gregg. 1992. DNA sequence and transcription of an endoglucanase gene from Prevotella (Bacteroides) ruminicola AR20. Mol Gen Genet 233:284-292.

67. Wang, X., X. Geng, Y. Egashira, and H. Sanada. 2004. Purification and characterization of a feruloyl esterase from the intestinal bacterium Lactobacillus acidophilus. Appl Environ Microbiol 70:2367-2372.

68. Watt, D. K., H. Ono, and K. Hayashi. 1998. Agrobacterium tumefaciens beta-glucosidase is also an effective beta-xylosidase, and has a high transglycosylation activity in the presence of alcohols. Biochim Biophys Acta 1385:78-88.

100

69. Whitehead, T. R. 1993. Analyses of the gene and amino acid sequence of the Prevotella (Bacteroides) ruminicola 23 xylanase reveals unexpected homology with endoglucanases from other genera of bacteria. Curr Microbiol 27:27-33.

70. Whitehead, T. R., and R. B. Hespell. 1989. Cloning and expression in Escherichia coli of a xylanase gene from Bacteroides ruminicola 23. Appl Environ Microbiol 55:893-896.

71. Whitehead, T. R. a. L., D. A. 1991. Cloning and comparison of xylanase genes from ruminal and colonic Bacteroides species. Curr Microbiol 23:15-19.

72. Wilkie, K. C. B. 1983. Hemicellulose. Chemtech 13:306-319.

73. Wulff-Strobel, C. R., and D. B. Wilson. 1995. Cloning, sequencing, and characterization of a membrane-associated Prevotella ruminicola B(1)4 beta-glucosidase with cellodextrinase and cyanoglycosidase activities. J Bacteriol 177:5884-5890.

74. Xiong, J. S., M. Balland-Vanney, Z. P. Xie, M. Schultze, A. Kondorosi, E. Kondorosi, and C. Staehelin. 2007. Molecular cloning of a bifunctional beta-xylosidase/alpha-L-arabinosidase from alfalfa roots: heterologous expression in Medicago truncatula and substrate specificity of the purified enzyme. J Exp Bot 58:2799-2810.

75. Zupancic, M. L., M. Frieman, D. Smith, R. A. Alvarez, R. D. Cummings, and B. P. Cormack. 2008. Glycan microarray analysis of Candida glabrata adhesin ligand specificity. Mol Microbiol 68:547-559.

101

2.8 TABLES AND FIGURES

TABLE 2.1 Primer sequences used for mutagenesis Desired b c Protein a Primer Name → Mutation Primer Sequence (5' 3') Xyn10D xyn10D-fae1A- 5'-ACTATCCACATTACATCGCTGGATCTGCGTACC- Glu280Ser -Fae1A Glu280Ser 3' 5'- Xyn10D xyn10D-fae1A- Ser629Ala CGTGCGATGGCAGGTCTGGCGATGGGTGGTATG -Fae1A Ser629Ala GAAACCAAG-3' 5'- Xyl3A Arg149Leu xyl3A-R149L GACCCTCGTTGGGGACTTGGACAGGAAACCTAT- 3' 5'- Xyl3A Lys188Leu xyl3A-K188L GCTCTGGGCCTGTGCCCTACACTATGCCGTGCAC A-3' 5'- Xyl3A Tyr237Phe xyl3A-Y237F GAGGTGATGTGTGCCTTTCAGCGTCTGGACGAT- 3' 5'- Xyl3A Asp269Ala xyl3A-D269A TACCTCGTTGTGAGCGCATGTGGTGCTGTTAGC-3' 5'- Xyl3A Asp269Asn xyl3A-D269N TACCTCGTTGTGAGCAACTGTGGTGCTGTTAGC-3' 5'-AATATCGATTATCAACAGACCATCGCCCAGCTG- Xyl3A Glu594Gln xyl3A-E594Q 3' Xyl3A Glu616Ala xyl3A-E616A 5'-GCTCCCAGTCTGGCGGGCGAGGAGATG-3' a Residues were selected for conservative mutagenesis due to their conservation as predicted by alignment of the primary amino acid sequences for family 3 glycoside hydrolases with biochemically defined catalytic activities. b Primers were designed in accordance with the QuikChange Multi protocol by Stratagene (La Jolla, CA) and were synthesized by Integrated DNA Technologies (Coralville, IA). c Underlined sequence indicates engineered codon mutations.

102

TABLE 2.2 Activities and domain architecture for experimentally characterized family 3 glycoside hydrolases. GH3 b primary enzymatic a domain architecture organism c cluster activity

Flavobacterium A (A2) β-D-glucosidase(48) meningosepticum

B (B1) Hordeum vulgare β-D-xylosidase(46)

α-L- B (B1) Hordeum vulgare arabinofuranosidase

(46) Aspergillus B (B2) β-D-xylosidase(19) awamori K4

Prevotella β-D-xylosidase (this B (B3) ruminicola 23 study)

Azospirillum C (C1) β-D-glucosidase(22) irakense

C (C2) Aspergillus niger β-D-glucosidase(13)

Phanerochaete C (C2) β-D-glucosidase(16) chrysosporium

Prevotella C (C3) β-D-glucosidase(73) bryantii B14

D Hordeum vulgare β-D-glucosidase(37)

Azospirillum D β-D-glucosidase(23) irakense

Azospirillum F β-D-glucosidase(22) irakense

Paenibacillus sp. F β-D-glucosidase(62) TS12 a Glycoside hydrolase family 3 clusters are assigned based on phylogenetic relationships inferred by Cornoyer and Faure (12). Subclusters where applicable are indicated in parentheses. b Functional domains were assigned utilizing the Pfam online server (http://www.sanger.ac.uk/Software/Pfam/) (24). Domain hits were included if the Expect value (E value) was smaller than 1 x 10-8. c Enzymatic activities were assigned based on results from kinetic assays with purified enzymes as reported in the corresponding cited manuscripts.

103

TABLE 2.3 Steady state kinetic measurements for wild type and mutant Xyl3A with pNP-β-D-xylopyranoside. -1 -1 -1 r Protein kcat (s ) KM (mM) kcat/ KM (mM s ) kcat (%)

Wild type 9.7 ± 0.3 1.8 ± 0.2 5.4 ± 0.6 100 a b R149L 2.5 ± 0.2 × 10-2 N.D. N.D. 0.26 a Y237F 1.6 ± 0.08 3.9 ± 0.5 0.41 ± 0.06 16 b D269A 1.2 ± 0.5 × 10-2 N.D. N.D. 0.12 b D269N 2.2 ± 1.0 × 10-1 N.D. N.D. 2.3 a E594Q 10 ± 0.2 2.0 ± 0.1 5.0 ± 0.3 103 a E616A 3.3 ± 0.1 18 ± 1 0.2 ± 0.01 34 a Error estimates are from propagating errors from non-linear regression analysis utilizing

GraphPad Prism v.5.0. b The reported kcat values for R149L, D269A, and D269N are kcat(apparent), as KM values were not determined; rates for these two mutants were determined at 10 mM pNPX concentration and thus represent a lower boundary on kcat. Each kcat determination was performed in triplicate and the errors reported reflect standard deviations from the mean.

104

Figure 2.1 Identification of a putative xylanase-esterase (xyn10D-fae1A) from the genome of Prevotella ruminicola 23. The genome of Prevotella ruminicola 23 was sequenced by the North American Consortium for Fibrolytic Rumen Bacteria in collaboration with The Institute for Genomic Research (TIGR). Open Reading Frame (ORF) 02827 was annotated as a putative bi-functional xylanase-esterase. Downstream of the xylanase-esterase are genes predicted to encode a hypothetical protein, a β-D- glucosidase, an ABC transporter and a hybrid two-component system regulator.

105

Figure 2.2 xyn10D-fae1A encodes a bi-functional xylanase-esterase. (A) Purification of recombinant Xyn10D-Fae1A. The eluate from cobalt-chelate chromatography was analyzed by 12% SDS-PAGE, followed by Coomassie brilliant blue G-250 staining. (B) Gel filtration chromatography. The size of purified Xyn10D-Fae1A was estimated by size exclusion chromatography. The molecular weight (MW) of Xyn10D-Fae1A was calculated from the retention time of the peak absorbance by comparison with calibration standards having known molecular weights. (C) Depolymerization of oat spelt xylan. Xyn10D-Fae1A was assessed for its capacity to depolymerize oat spelt xylan (OSX) by incubating the protein on an agar plate infused with OSX followed by staining with congo red. (D) Hydrolysis of 1-napthyl butyrate. Xyn10D-Fae1A was assessed for its capacity to hydrolyze 1-naphtyl butyrate (1-NB) by incubating the protein on an agar plate infused with 1-NB followed by development with Fast Garnet GBC sulfate. (E) Hydrolysis of xylo- oligosaccharides. Xyn10D-Fae1A-catalyzed hydrolysis of xylo-oligosaccharides (X2-X6) was assessed by incubating the enzyme with each substrate and then resolving the products by thin layer chromatography followed by staining with methanolic orcinol. For panel B, the molecular weight is reported as the mean ± standard deviation from three independent experiments. mAU, milliabsorbance units.

106

β Figure 2.3 xyl3A encodes a functional -xylosidase. (A) Purification of recombinant Xyl3A. The eluate from cobalt-chelate chromatography was analyzed by 12% SDS- PAGE, followed by Coomassie brilliant blue G-250 staining. (B) Gel filtration chromatography. The size of purified Xyl3A was estimated by size exclusion chromatography. The molecular weight (MW) of Xyl3A was calculated from the retention time of the peak absorbance by comparison with calibration standards having known molecular weights. (C) Hydrolysis of pNP-linked sugars. Xyl3A was assessed for its capacity to hydrolyze several pNP-linked sugars by UV spectroscopy. (D) Hydrolysis of xylo-oligosaccharides. Xyl3A-catalyzed hydrolysis of xylo-oligosaccharides (X2-X6) was assessed by incubating the enzyme with each substrate and then resolving the products by thin layer chromatography followed by staining with methanolic orcinol. For panel B, the molecular weight is reported as the mean ± standard deviation from three independent experiments. mAU, milliabsorbance units. For panel C, the specific activities are reported as the means ± standard deviations from three independent experiments.

107

Figure 2.4 Xyn10D-Fae1A and Xyl3A function synergistically to release xylose from xylan. (A) Thin layer chromatographic separation of products released from oat spelt xylan by Xyn10D-Fae1A. Xyn10D-Fae1A (0.50 µM) or Xyl3A (0.50 µM) were incubated with oat spelt xylan (1 %) and the products were resolved by thin layer chromatography followed by staining with methanolic orcinol. Xylo-oligosaccharide standards X1-X4 were spotted on the plate in lane 1 to serve as markers for the identification of hydrolysis products. In lanes 2-9, the indicated enzymes were incubated with OSX at 37 °C for 14 h and 2.5 µL (lanes 2, 4, 6, 8) or 5.0 µL (lanes 3, 5, 7, 9) of the reaction mixtures were spotted onto the TLC plate. (B) Reducing sugars released from oat spelt xylan by Xyn10D-Fae1A and Xyl3A. Wild type or mutant (Glu280Ser or Ser629Ala) Xyn10D-Fae1A was incubated with oat spelt xylan (1 %) and the reducing sugars were detected by using the para-hydroxybenzoic acid hydrazide (PAHBAH) assay. The reducing sugar concentrations were calculated from the absorbance at 410 nm with comparison to a standard curve generated with known concentrations of glucose. Two-tailed P values were determined by performing an unpaired Student’s t- test.

108

Figure 2.5 Mutational analyses of Xyn10D- Fae1A maps the two catalytic domains to discrete regions of the polypeptide sequence. (A) Domain architecture for Xyn10D-Fae1A. Functional domains were assigned utilizing the Pfam online server (http://www.sanger.ac.uk/Software/Pfam/) (24). Amino acid substitutions were independently introduced into the putative catalytic residue for the xylanase domain (Glu280) and the ferulic acid esterase domain (Ser629) by site-directed mutagenesis. (B) Depolymerization of oat spelt xylan by wild type and mutant (Glu280Ser or Ser629Ala) Xyn10D-Fae1A. Wild type or the mutant (Glu280Ser or Ser629Ala) Xyn10D- Fae1A was incubated with oat spelt xylan (1 %) and the products were resolved by thin layer chromatography followed by staining with methanolic orcinol. Xylo-oligosaccharide standards X1-X4 were spotted on the plate in lane 1 to serve as markers for the identification of hydrolysis products. The wild type enzyme, Glu280Ser mutant, or Ser629Ala mutant (0.93 µM protein in lanes 3, 5, 7, respectively or 1.9 µM protein in lanes 4, 6, 8, respectively) were incubated with OSX at 37 °C for 21 h and 2.5 µL of the reaction mixtures were spotted onto the TLC plate. (C) Reducing sugars released from oat spelt xylan by wild type and mutant (Glu280Ser or Ser629Ala) Xyn10D-Fae1A. Wild type or mutant (Glu280Ser or Ser629Ala) Xyn10D-Fae1A was incubated with oat spelt xylan (1 %) and the reducing sugars were detected by using the para-hydroxybenzoic acid hydrazide (PAHBAH) assay. The reducing sugar concentrations were calculated from the absorbance at 410 nm with comparison to a standard curve generated with known concentrations of glucose. (D) Hydrolysis of methyl ferulate by wild type and mutant (Glu280Ser or Ser629Ala) Xyn10D-Fae1A. Methyl ferulate (5 mM) was incubated with Xyn10D-Fae1A wild type, Glu280Ser, or Ser629Ala (1.2 µM, final concentration). Following incubation at 37 °C for 30 minutes, ferulic acid was determined by HPLC as described by Wang et al. (67). For panels C and D, the reducing ends and specific activities are reported as means ± standard deviations from three independent experiments. Two-tailed P values were determined by performing an unpaired Student’s t-test.

109

Figure 2.6 Xyl3A catalyzes pNPX hydrolysis with retention of stereochemical configuration. (A) The stereochemical configuration of the anomeric carbon of pNPX was monitored during Xyl3A-catalyzed hydrolysis by 1H-NMR spectroscopy. pNPX (4.5 mM) was dissolved in deuterated acetate buffer (50 mM) and Xyl3A (2.5 µM) was added. At t=0 min, 15 min, 6.5 h and 24 h, 1H-NMR spectra were recorded utilizing a Varian INOVA 500NB MHz NMR spectrometer. Data were analyzed using the Nuts2 NMR Data Processing Software from Acorn NMR Inc. (B) Schematic outlining the chemical changes that occur during Xyl3A-catalyzed hydrolysis of pNPX.

110

CHAPTER 3

FUNCTIONAL DIVERSITY OF FOUR GLYCOSIDE HYDROLASE FAMILY 3 ENZYMES FROM THE RUMEN BACTERIUM, PREVOTELLA BRYANTII B(1)4

Published in the Journal of Bacteriology 2010, Vol. 192(9): 2335-2345

3.1 ABSTRACT

Prevotella bryantii B14 is a member of the Bacteroidetes phylum and

contributes to degradation of hemicellulose in the rumen. The genome of P.

bryantii harbors four genes predicted to encode glycoside hydrolase (GH) family

3 enzymes. To evaluate whether these genes encode enzymes with redundant

biological functions, each gene was cloned and expressed in Escherichia coli.

Biochemical analysis of the recombinant proteins revealed that the enzymes exhibit different substrate specificities. One gene encoded a cellodextrinase

(CdxA), and three genes encoded β-xylosidase enzymes (Xyl3A, Xyl3B, Xyl3C) with different specificities for either para-nitrophenyl linked substrates or substituted xylo-oligosaccharides. To identify the amino acid residues that contribute to catalysis and substrate specificity within this family of enzymes, the roles of conserved residues (R177, K214, H215, M251, and D286) in Xyl3B were probed by site-directed mutagenesis. Each mutation led to a severely decreased catalytic efficiency without change in the overall structure of the mutant enzymes.

Through amino acid sequence alignments, an amino acid residue (E115) was identified which when mutated to aspartic acid resulted in a 14-fold decrease in kcat/KM for pNP-β-D-xylopyranoside (pNPX) with a concurrent 1.1-fold increase in

kcat/KM for pNP-β-D-glucopyranoside (pNPG). Amino acid residue E115 may,

111

therefore, contribute to discrimination between β-xylosides and β-glucosides. Our results demonstrate that each of the four GH 3 enzymes has evolved to perform a specific role in ligno-polysaccharide hydrolysis, and provide insight into the role of active site residues in catalysis and substrate specificity for GH 3 enzymes.

3.2 INTRODUCTION

Among the bacterial genera in the bovine rumen, Prevotella are the most numerically abundant species by both culture counts and by analysis of 16S ribosomal RNA abundances (13, 43). These organisms contribute to the breakdown of plant protein and hemicellulose within the rumen microbial ecosystem. Despite these important physiological roles, relatively little is known about the molecular mechanisms for plant polysaccharide utilization by ruminal

Prevotellas. The genomes for P. bryantii B14 and P. ruminicola 23 have recently

been sequenced and harbor a large number of putative glycoside hydrolase

genes (http://www.tigr.org/tdb/rumenomics/). Glycoside hydrolase genes are

grouped into different families according to their amino acid sequences and three

dimensional folds (8). The genome for P. bryantii B14 harbors multiple gene

copies for several GH families. The presence of multiple copies of genes from a

single glycoside hydrolase family suggests that these genes either encode

proteins with redundant or divergent biochemical functions. The genome for P.

bryantii B14 contains four GH family 3 genes, each of which was annotated as

coding for an enzyme with β-glucosidase activity. Previous efforts to identify genes that are important for xylan degradation by its relative, Prevotella

112

ruminicola 23 revealed a GH family 3 protein which was annotated as a β-

glucosidase enzyme, but actually exhibited β-xylosidase activity (12). Thus the goal of the current study was to express these four GH family 3 genes from P. bryantii B14 and to characterize and compare the biochemical properties of their

gene products.

GH family 3 is one of the most abundant families of carbohydrate active

enzymes and includes members that possess distinct enzymatic activities

including β-D-glucosidase, β-D-xylosidase, α-L-arabinofuranosidase and N- acetyl-β-D-glucosaminidase activities (16, 24). The broad substrate specificity in this family of enzymes has implications for their respective physiological roles, and studies have shown that their functions are important for: assimilation of glycosides, recycling of bacterial cell wall components, and modification of free glycosides (reviewed in (16)). GH family 3 enzymes catalyze the hydrolysis of aryl glycosides with net retention of stereochemical configuration at the anomeric carbon, and this reaction involves a pair of residues that act as a catalytic nucleophile (Asp) and a catalytic acid/base (Glu). The catalytic nucleophile in several GH family 3 β-D-glucosidases has been identified by covalent labeling followed by mass spectrometry (2, 9, 11) and X-ray crystallography (30). The

catalytic acid/base that serves to protonate the leaving group oxygen and

subsequently activate an attacking water has been identified for several GH 3 β-

D-glucosidases (9, 35, 40). In addition to these two critical residues, GH 3

enzymes possess three additional highly conserved amino acid residues (Arg,

Lys, and His) that are predicted to be involved with catalysis (27, 29). The

113

residues that contribute to substrate specificity for this family of enzymes,

however, have not been thoroughly investigated.

In this study, we report the biochemical properties of four GH family 3

enzymes from P. bryantii B14 and mutational studies of one of these enzymes.

Results from this study indicate that these GH family 3 genes encode enzymes

with non-redundant biochemical functions and provide important insight into

residues that contribute to catalysis and substrate specificity for this glycoside

hydrolase family.

3.3 EXPERIMENTAL PROCEDURES

Materials. Prevotella bryantii B14 (DSM 11371) was initially isolated by

Bryant et al. (6), and was obtained from our culture collection housed in the

Department of Animal Sciences, University of Illinois at Urbana-Champaign. The

culture was grown on agar maintenance slants and stored in liquid nitrogen vapor

at -105° C (XLC1110, MVE Cryogenics). E. coli JM109 and E. coli BL21-

CodonPlus (DE3) RIL competent cells and the PicoMaxx high fidelity DNA

polymerase were acquired from Stratagene (La Jolla, CA). The pGEM-T TA-

cloning vector was purchased from Promega (Madison, WI). The pET-28a, pET-

15b expression vectors and the pET-46 Ek/LIC Vector Kit were obtained from

Novagen (San Diego, CA). NdeI, DpnI, and XhoI restriction endonucleases were

purchased from New England Biolabs (Ipswich, MA). The RNAprotect bacteria

reagent, RNeasy Mini kit with on-column DNase digestion, and QIAprep Spin

Miniprep kit were obtained from Qiagen (Valencia, CA). Xylo-oligosaccharides,

114

cello-oligosaccharides, aldouronic acids, and wheat arabinoxylan (medium

viscosity – 20 cSt) were obtained from Megazyme (Bray, Ireland). Gel filtration

standards were obtained from Bio-Rad (Hercules, CA). The iScript single-

stranded cDNA synthesis kit was obtained from BioRad (Hercules, CA). The

SYBR green master mix was obtained from Applied Biosystems (Foster City, CA).

All other reagents were of the highest possible purity and were purchased from

Sigma-Aldrich (St. Louis, MO).

Gene cloning, expression, and purification of P. bryantii B14 GH family 3 genes. The genome of Prevotella bryantii B14 was sequenced by the

North American Consortium for Fibrolytic Rumen Bacteria in collaboration with

The Institute for Genomic Research (TIGR). Auto-annotation of the partially

sequenced P. bryantii B14 genome identified ORF0348 (cdxA), ORF0398 (xyl3A),

ORF0911 (xyl3B), and ORF0975 (xyl3C) as genes predicted to encode β-

glucosidase enzymes. The gene sequence for cdxA is available on the GenBank

website (accession no. AAA86753.2) and the sequences for xyl3A, xyl3B, and

xyl3C have been deposited within the NCBI GenBank database with accession

numbers GU564512.1, GU564513.1, and GU564514.1, respectively.

Prevotella bryantii B14 was maintained and grown in a standard

anaerobically prepared medium as described previously (23). DNA was isolated

from mid-log-phase cultures using the DNeasy blood and tissue DNA purification

kit. The DNA sequences corresponding to the entire ORFs of cdxA, xyl3A, xyl3B,

and xyl3C were amplified from P. bryantii B14 genomic DNA with the PicoMaxx

115

high fidelity PCR system with the primers listed in Table 3.2. Putative signal

sequences with predicted cleavage sites were identified for each of the four

genes (CdxA, MNKKIGVFALGICLSGASMA; Xyl3A,

MKLFTKYAVVAILTLPSTATYSLICA; Xyl3B, MRKKILLMCASIALVSCNN; Xyl3C,

MKSKQLITLFIVSTSILSLHA) by using the SignalP v3.0 online server (14). To

ensure the accumulation of recombinant protein in E. coli cells to facilitate

purification, the primers were designed to clone the gene beginning with the

amino acid codon just downstream of the predicted protease cleavage sites.

During the course of the current study, our laboratory made a transition

from traditional restriction enzyme based cloning methods to ligation independent

methods. Thus the cloning procedures employed for the four genes in the current

study differ. The gene cdxA was amplified by PCR with primers engineered to

incorporate 5' NdeI and 3' XhoI restriction sites for subsequent directional cloning.

The resulting amplicon was cloned into pGEM-T via TA-cloning and then a

second primer set (CdxA-SDM1 and CdxA-SDM2) was employed to introduce a

silent mutation in a native NdeI restriction site in the gene (A→T at nucleotide

position 2256) by site-directed mutagenesis. The resulting gene was then

excised from the pGEM-T vector by digestion with NdeI and XhoI and subcloned

in-frame with the hexahistidine (His6) encoding sequence of a modified pET-28a expression vector by replacing the NdeI−XhoI polylinker. The pET-28a vector

was modified by substituting the gene for kanamycin resistance with that for

ampicillin resistance (7).

116

For xyl3A, the gene was amplified with primers engineered to incorporate

5’-GACGACGACAAGA and 3’-ACCGGGCTTCTCCTC extensions to facilitate subsequent annealing with the expression vector pET-46b. The resulting xyl3A amplicon was then digested with the activity of T4 DNA polymerase, annealed with a similarly digested pET-46b vector and introduced into E. coli

JM109 by electroporation.

For xyl3B and xyl3C, the primer sets were engineered to incorporate 5'-

NdeI and 3'-XhoI restriction sites for subsequent directional cloning. The resulting amplicons were then digested with NdeI and XhoI and cloned in-frame with the hexahistidine (His6) encoding sequence of a pET-15b expression vector by replacing the NdeI−XhoI polylinker.

Thus, expression of all four genes resulted in N-terminal polyhistidine fusion proteins. The integrity of the cloned cdxA, xyl3A, xyl3B, and xyl3C genes was confirmed by DNA sequencing (W. M. Keck Center for Comparative and

Functional Genomics at the University of Illinois at Urbana-Champaign).

The resulting plasmid constructs, pET28a-cdxA, pET46b-xyl3A, pET15b- xyl3B, and pET15b-xyl3C were introduced into E. coli BL-21 CodonPlus (DE3)

RIL competent cells by heat shock. The recombinant E. coli cells were then grown overnight in lysogeny broth (3, 4) (LB) (10 mL) supplemented with ampicillin (100 µg/mL) and chloramphenicol (50 µg/mL) at 37 oC with vigorous aeration. After 8 h, the starter cultures were diluted into fresh LB (1 L) supplemented with ampicillin and chloramphenicol and cultured at 37 oC with aeration until the optical density at 600 nm reached 0.3. Gene expression was

117

then induced by addition of isopropyl β-D-thiogalactopyranoside (IPTG) at a final

concentration of 0.1 mM and the temperature for culturing was lowered to 16 °C .

After 16 h, the cells were harvested by centrifugation (4000 x g, 15 min, 4 oC), re-

suspended in 35 mL lysis buffer (50 mM Tris-HCl, 300 mM NaCl, pH 7.5), and ruptured by two passages through an EmulsiFlex C-3 cell homogenizer from

Avestin (Ottawa, Canada). The cell lysates were then clarified by centrifugation

at 20,000 x g for 30 min at 4 oC, and the recombinant proteins were purified

using a two-step purification scheme with an AKTAxpress fast protein liquid

chromatograph (FPLC, GE Healthcare) by coupling nickel affinity

chromatography (5 mL HisTrap FF column, GE Healthcare) with buffer exchange

(HiPrep 26/10 desalting column, GE Healthcare). The proteins were then eluted

from the HisTrap column with a Tris-HCl elution buffer (50 mM Tris-HCl, 300 mM

NaCl, 250 mM imidazole, pH 7.5) and exchanged into a Tris-HCl protein storage

buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5). Aliquots of eluted fractions were

analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-

PAGE) using the method described by Laemmli (33), and protein bands were

visualized by staining with Coomassie brilliant blue G-250.

The protein concentrations were quantified by absorbance spectroscopy

using the method described by Gill and von Hippel (21) with the following

extinction coefficients: 92,710 M-1 cm-1, 130,640 M-1 cm-1, 96,615 M–1 cm–1, and

123,650 M-1 cm-1 for CdxA, Xyl3A, Xyl3B, and Xyl3C, respectively. The

absorbance value for each protein solution was measured at 280 nm with a

NanoDrop 1000 from Thermo Fisher Scientific Inc. (Waltham, MA).

118

Hydrolysis of para-nitrophenyl linked sugars. Enzyme-catalyzed

hydrolysis of para-nitrophenyl (pNP) linked monosaccharide substrates was

assayed using a thermostated Synergy II multi-mode microplate reader from

BioTek Instruments Inc. (Winooski, VT). A library of pNP substrates were

screened for activity including: pNP-α-L-arabinopyranoside, pNP-α-L-

arabinofuranoside, pNP-β-D-fucopyranoside, pNP-α-L-fucopyranoside, pNP-α-D-

galactopyranoside, pNP-β-D-galactopyranoside, pNP-α-D-glucopyranoside, pNP-

β-D-glucopyranoside, pNP-β-D-maltopyranoside, pNP-α-D-maltopyranoside,

pNP-α-D-mannopyranoside, pNP-β-D-mannopyranoside, pNP-α-L-

rhamnopyranoside, pNP-β-D-xylopyranoside, pNP-β-D-cellobioside. The

substrates (1 mM) in citrate buffer (100 µL; 50 mM sodium citrate, 150 mM NaCl,

pH 5.5) were incubated at 37 °C in the presence or absence of CdxA (6 nM),

Xyl3A (0.5 µM), Xyl3B (9 nM), or Xyl3C (0.5 µM) for 30 min, and the level of pNP

release was determined by continuously monitoring the absorbance at 400 nm.

The pathlength correction feature of the instrument was employed to convert the

absorbance values recorded to correspond to a 1 cm pathlength. The extinction

coefficient for pNP at pH 5.5 and a wavelength of 400 nm was measured as

0.673 mM-1cm-1. In preliminary studies, we determined that the enzymatic

properties of Xyl3B were unaffected by the amino-terminal hexa-histidine fusion

peptide (data not shown). Therefore, in subsequent analysis the hexa-histidine

tagged proteins were used.

119

Hydrolysis of neutral and acidic oligosaccharides. To assess the

capacity of P. bryantii GH family 3 proteins to hydrolyze neutral xylo- and cello-

oligosaccharides the enzymes were incubated with cellohexaose or xylohexaose

and the products were analyzed by high performance liquid chromatography

(HPLC). Initial biochemical studies with pNP-linked sugars substrates revealed

that the optimal pH for the four proteins was 5.5 (Figure 3.3). Reaction mixtures

(450 µl) were prepared in citrate buffer (50 mM sodium citrate, 150 mM NaCl, pH

5.5) with xylohexaose or cellohexaose (60 µg/mL, final concentration) at 37 °C and reactions were initiated by the addition of CdxA, Xyl3A, Xyl3B, or Xyl3C (100 nM, final concentration). At regular time intervals (0 min, 15 min, 30 min, 3 h) 100

µL aliquots were removed and the reactions were terminated by the addition of

300 µL of 0.1 M NaOH.

For analysis of aldouronic acid hydrolysis, the enzymes (0.5 µM, final concentration) were incubated with the aldouronic acid mixture (60 µg/mL, final concentration) in citrate buffer (50 mM sodium citrate, 150 mM NaCl, pH 5.5) at

37 °C. After 16 h, 100 µL aliquots were removed and the reactions were terminated by the addition of 300 µL 0.1 M NaOH.

The oligosaccharide composition of the neutral and acidic oligosaccharide mixtures following enzymatic hydrolysis were then assessed by high performance anion exchange chromatography (HPAEC-PAD) with a System

Gold® HPLC instrument from Beckman Coulter (Fullerton, CA) equipped with a

CarboPac™ PA1 guard column (4 x 50 mm) and a CarboPac™ PA1 analytical

column (4 x 250 mm) from Dionex Corporation (Sunnyvale, CA) and a

120

Coulochem® III electrochemical detector from ESA Biosciences (Chelmsford,

MA). Monomeric xylose (X1), glucose (G1), xylo-oligosaccharides (X2-X6), and cello-oligosaccharides (G2-G6) were used as standards. Oligosaccharides were resolved using a mobile phase of 100 mM NaOH with a linear gradient ending with 125 mM NaOAc over 25 min.

Growth of P. bryantii B14 and analysis of gene expression. P. bryantii

B14 was grown anaerobically in a chemically defined medium modified from

Griswold et al. (Table 3.1) with either glucose (0.18% w/v) or wheat arabinoxylan

(0.15% w/v) as the sole carbohydrate sources. Cultures were grown in butyl rubber stoppered serum bottles which were modified to incorporate a sidearm tube to allow for turbidity measurements at 600 nm using a Spectronic 20D+ spectrophotometer from Thermo Fisher Scientific Inc. (Waltham, MA). At mid-log phase of growth (OD600nm = 0.2), 30 mL of culture was removed and combined with two volumes of RNAprotect bacteria reagent and RNA was purified using the

RNeasy kit with the optional on-column DNase treatment step. The RNA was converted to single stranded cDNA using the iScript cDNA synthesis kit.

Quantitative polymerase chain reaction (Q-PCR) experiments were performed using the SYBR green master mix with single stranded cDNA as the template and the primers listed under Q-PCR in Table 3.2. Reaction mixtures (20 µL) were prepared in 384 well PCR plates with a final primer concentration of 500nM and amplification was monitored using a LightCycler® 480 Real-Time PCR System from Roche Applied Science (Indianapolis, IN).

121

To determine the amplification efficiency for each primer set, a plot of the

cycle number at the crossing point vs. the amount of cDNA per reaction for a set

of 6 cDNA dilutions ranging from 0.05 ng to 50 ng per reaction was constructed.

The slope of the line was determined and the amplification efficiency (E) was

calculated using equation 1.

E = 10^(-1/slope) (1)

The relative expression ratio (R) was then calculated for each gene using

equation 2 where E is the amplification efficiency for each primer set, CP is the

cycle number at the crossing point, and the reference gene was the DNA gyrase

subunit A gene (gyrA).

R = ((Etarget)^∆CP(Glucose-WAX))/((Ereference)^∆CP(Glucose-WAX)) (2)

Multiple-sequence alignment. To generate multiple sequence

alignments, the indicated sequences were imported into ClustalW

(http://align.genome.jp/) and aligned using the Blosum62 matrix. The multiple-

sequence alignment file was then entered into the ESPript V2.2 alignment

program (http://espript.ibcp.fr/ESPript/cgi-bin/ESPript.cgi) to apply shading and

boxing schemes.

Site-directed mutagenesis. Mutagenesis was performed using the

QuikChange® Site-Directed Mutagenesis Kit from Stratagene (La Jolla, CA).

First, mutagenic primers were engineered with the desired mutation in the center

of the primer and ~15 bases of correct sequence on either side (Table 3.2).

122

Reaction mixtures were prepared according to the manufacturer with pET15b-

xyl3B as the DNA template for generation of the Xyl3B mutants. After cycling the

reaction mixture 18 times in a PCR thermal cycler, the mixture was digested with

DpnI, and the resulting DNA was transformed into electro-competent E. coli

DH5α cells by electroporation. The mutant genes were sequenced to ensure that

the appropriate mutations were introduced, while the rest of the gene sequences

remained unchanged. Expression and purification of the mutant recombinant

proteins were performed as described above for wild type Xyl3B.

Protease sensitivity assays. Xyl3B wild type and mutant enzymes (2

mg/mL, final concentration) were incubated in a TRIS buffer (50 mM Tris-HCl,

150 mM NaCl, 50 mM DTT, pH 8.0) with varying concentrations of trypsin (0

µg/mL, 3.1 µg/mL, 25 µg/mL, 100 µg/mL) at 25 °C. After 1 h, the reactions were terminated by the addition of an equal volume of 2X SDS sample buffer (4% w/v

SDS, 100 mM Tris-HCl, 0.4 mg bromophenol blue/mL, 0.2 M DTT, 20% v/v glycerol). The samples were boiled for 5 min, and resolved by SDS-PAGE (12% w/v acrylamide; 70 V for 20 min, and then 200 V for an additional 40 min). The gels were soaked in fixing solution (25% isopropanol, 10% v/v acetic acid) for 15 min and then placed in Coomassie stain (10 % v/v acetic acid, 0.06 mg/mL

Coomassie brilliant blue G-250). After 10 h, the gels were destained for 4 h with

10% v/v acetic acid, rinsed twice with ddH2O and then photographed with a

G:Box gel dock imaging system from Syngene (Frederick, MD).

123

Circular dichroism spectroscopy of wild type and mutant Xyl3B.

Circular dichroism (CD) spectra were collected for wild type and mutant Xyl3B in

the far UV range utilizing a J-815 Circular Dichroism Spectropolarimeter from

JASCO (Easton, MD). A rectangular cuvette with a total volume of 350 µL and a path-length of 0.1 cm was used for each assay. The CD spectra of wild type and mutant Xyl3B (1.25 µM, final concentration) in a phosphate buffer (50 mM sodium phosphate, pH 5.5) were recorded from 190 nm to 260 nm at a scan rate of 50 nm/sec with a wavelength step of 0.1 nm and with 5 accumulations.

Data acquisition was coordinated using the JASCO Spectra Manager

v1.54A software. Raw data files were uploaded onto the DICHROWEB on-line

server (http://www.cryst.bbk.ac.uk/cdweb/html/home.html) and analyzed using

the CDSSTR algorithm with reference set #4, which is optimized for the analysis

of data recorded in the range of 190 nm – 240 nm (36).

Determination of steady-state kinetic values for wild type and mutant

Xyl3B. Kinetic studies of the wild type and mutant Xyl3B proteins were

performed using a thermostated Synergy II multi-mode microplate reader from

BioTek Instruments Inc. (Winooski, VT). pNPX or pNPG (0-46.25 mM) were

incubated in a citrate reaction buffer (100 µL; 50 mM sodium citrate, 150 mM

NaCl, pH 5.5) at 37 oC in a 96-well flat bottom microtiter plate and reactions were initiated by the addition of wild type or mutant Xyl3B. In order to obtain initial velocities of the enzyme-catalyzed reactions, the protein concentrations were adjusted such that the slopes of the resulting progress curves (change in

124

absorbance per unit time) were linear. Hydrolysis of pNPX was continuously monitored by recording the UV signal at 400 nm. Initial rate data were then plotted against the substrate concentration and kinetic values were estimated by applying a nonlinear curve fit using GraphPad Prism v5.02 from GraphPad

Software (San Diego, CA). The extinction coefficient for para-nitrophenol at pH

5.5 and a wavelength of 400 nm was measured as 0.673 mM-1cm-1. For mutants with very low activities, kcat(apparent) was determined at 46.25 mM pNPX concentration and an enzyme concentration of 10 µM.

3.4 RESULTS

Identification, cloning, and expression of Prevotella bryantii B14 GH family 3 genes. To identify putative GH family 3 genes in P. bryantii B14, the genome sequence of the bacterium was uploaded onto the Rapid Annotation using Subsystem Technology (RAST) online server (1) and a BLASTp search of the annotated genome was performed using the Barley ExoI enzyme (28) as a query. This search revealed the presence of four putative GH 3 genes (ORF0348,

ORF0398, ORF0911, and ORF0975) which were each annotated as encoding β-

D-glucosidase enzymes (Table 3.3). ORF0348 was previously cloned, expressed in E. coli and its biochemical properties were characterized by David B. Wilson and colleagues (46). This recombinant protein was found to have cellodextrinase and cyanoglycosidase activities, thus the protein was named CdxA. The genomic context for the four enzymes is summarized in Figure 3.1 CdxA is located just downstream of a putative GH family 5 endo-mannanase, and ORF975 is located

125

just upstream of a putative xylose transporter, whereas ORF398 and ORF911 do not appear to be linked to other genes with predicted roles in polysaccharide degradation.

The domain architectures for the coding sequences was analyzed using the Pfam online server (17). These studies suggested that all four proteins possess two conserved domains, as with most GH family 3 proteins (24).

However, the analysis also predicted that ORF0398 and ORF0975 encode proteins that have PA14 domain insertions within the C-terminal GH family 3 domains (Table 3.3). It is predicted that the PA14-like insertion sequence may represent a novel carbohydrate binding module (41). These bioinformatics analyses suggested that the four GH family 3 genes encode proteins with redundant biochemical functions. To test this prediction, we cloned and expressed the four GH family 3 genes as recombinant polyhistidine fusion proteins in E. coli. The gene products were purified by metal affinity chromatography, and the predicted molecular weights (ORF0348, 86 kDa;

ORF0398, 99 kDa; ORF0911, 86 kDa; ORF0975, 96 kDa) were in agreement with the size of the purified proteins estimated by comparison with molecular weight markers resolved by SDS-PAGE (Figure 3.2A).

Prevotella bryantii B14 GH family 3 genes encode enzymes with non- redundant biochemical functions. To determine the substrate specificity for the gene products of ORF0348 ORF0398, ORF0911, and ORF0975, the recombinant proteins were screened for activity with a library of α- and β-para-

126

nitrophenol (pNP) linked sugars. The putative β-D-glucosidase proteins were

incubated in the presence of the derivatized sugars and the reactions were monitored spectrophotometrically. These experiments revealed that ORF0348

(CdxA) catalyzed the release of pNP from pNP-β-D-glucopyranoside and pNP-β-

D-cellobioside (Figure 3.2B-E). When ORF0398 was incubated with the derivatized sugars, pNP was released from pNP-β-D-xylopyranoside (Figure

3.2C). When ORF0911 was incubated with the derivatized sugars, pNP was released from pNP-β-D-xylopyranoside and in addition, low hydrolytic activity was observed with pNP-α-L-arabinofuranoside (Figure 3.2D). When ORF0975 was incubated with the derivatized sugars, pNP was released from pNP-β-D- xylopyranoside, pNP-β-D-glucopyranoside, and pNP-β-D-cellobioside, although this activity was much lower than the activity detected for the three other proteins

(Figure 3.2E). These results indicated that each of the four putative GH family 3 genes encodes a functional glycoside hydrolase enzyme. Furthermore, these data suggest that each of the four GH family 3 proteins in Prevotella bryantii B14

possesses unique substrate specificities.

Prevotella bryantii B14 GH family 3 proteins release

monosaccharides from plant cell wall polysaccharides. Initial data suggested

that the four GH 3 enzymes have activity with β-1,4-linked xylosides or glucosides (Figure 3.2B-E). To evaluate whether the four GH family 3 proteins may contribute to the hydrolysis of plant cell wall polysaccharides that contain glucose or xylose, each protein (0.1 µM) was incubated with cellohexaose or

127

xylohexaose (60 µg/mL each) and the reaction products were monitored over time by HPAEC. During the course of the reaction, CdxA (ORF0348) converted cellohexaose to glucose and cellobiose (Figure 3.4A). In addition, CdxA converted xylohexaose to a mixture of shorter xylo-oligosaccharides, although this activity was much lower than the hydrolytic activity with cellohexaose. These data are in support of previous findings that identified ORF0348 as a cellodextrinase (CdxA) (46).

In contrast to the case for CdxA, the gene products of ORF0398,

ORF0911, and ORF0975 did not hydrolyze cellohexaose to cellobiose and glucose, although some products, mainly G4 and G5, were detectable. These proteins were, however, capable of converting xylohexaose to a mixture of shorter xylo-oligosaccharides and xylose (Figure 3.4B-C). These results suggested that the major enzymatic activity of each of the gene products of

ORF0398, ORF0911, and ORF0975 is β-xylosidase activity, and thus the genes were designated, xyl3A, xyl3B, and xyl3C, respectively.

For several of the reactions, monosaccharides did not accumulate despite the evident depolymerization of the xylohexaose or cellohexaose substrate

(CdxA, xylohexaose; Xyl3A, cellohexaose; Xyl3B, cellohexaose; Xyl3C, cellohexaose). Several GH family 3 enzymes have been reported to exhibit transglycosylase activities (10, 32, 45) and it is possible that the lack of monosaccharide accumulation in the aforementioned reactions is a result of transglycosylation side reactions.

128

Xyl3B and Xyl3C possess unique substrate specificities. Natural xylan

polymers may have a variety of different substituents on the xylose backbone

chain including arabinofuranosyl, acetyl and 4-O-methyl glucuronyl groups. To

test whether substituents on the xylan chain may affect the β-xylosidase activity

of Xyl3B and Xyl3C, the proteins were incubated with aldouronic acids in the

presence or absence of a GH family 67 α-glucuronidase enzyme which was

cloned from Prevotella bryantii B14 (Agu67A) and the hydrolytic products were

analyzed by HPAEC. The aldouronic acid mixture was obtained from Megazyme,

and according to the vendor, was constructed by controlled acid hydrolysis of 4-

O-methyl glucuronoxylan. The cloning, expression, purification and biochemical

characterization for Agu67A will be published elsewhere.

When the aldouronic acid mixture was incubated with Agu67A alone, 4-O-

methyl glucuronic acid (MGA) was released (peak at 16.08 min) and xylo-

oligosaccharides (X2, 8.79 min; X3, 9.62 min; X4, 10.62 min; X5, 11.56 min),

xylose (5.67 min) and two additional peaks (15.31 min and 15.69 min) were

detected (Agu67A: Figure 3.5A-B). In the presence of Xyl3B alone, the level of xylose in the hydrolysate was increased as compared to the control. In the presence of Xyl3C alone, the level of xylose was also increased as compared to the control. In addition, the shoulders of the two peaks centered at 15.31 min and

15.69 min disappeared, resulting in symmetrical peaks being observed at these positions (compare peaks from Xyl3C hydrolysates to peaks from control and also Xyl3B hydrolysates). When Agu67A and Xyl3B were incubated together with the aldouronic acid mixture, a large amount of xylose was released and MGA

129

was detected, however the two peaks at 15.31 min and 15.69 min remained.

When Agu67A and Xyl3C were incubated together with the aldouronic acid

mixture, a large amount of xylose was released and 4-O-methyl glucuronic acid

was detected and the two peaks at 15.31 min and 15.69 min disappeared (Figure

3.5B). In the Agu67A/Xyl3C mixture, small peaks were detected that

corresponded to xylobiose and xylotriose, indicating that cleavage of these two

xylo-oligosaccharides by Xyl3C was incomplete. These results show that

although Xyl3B and Xyl3C both possess β-xylosidase activity, they exhibit

different specificities for components of the aldouronic acid mixture.

Expression of xyl3A is induced during growth of P. bryantii B14 on

xylan compared to glucose. Three of the GH family 3 genes from P. bryantii

B14 encode enzymes with β-xylosidase activity. To evaluate whether expression of these genes is influenced by the carbohydrate growth source, P. bryantii B14

was cultured with glucose or soluble wheat arabinoxylan and the relative level of

gene expression for each of the GH 3 genes was compared after the expression

values were normalized to a housekeeping gene (gyrA). P. bryantii B14 grew

rapidly in the presence of either glucose or wheat arabinoxylan as the sole

carbohydrate source (Figure 3.6A) with specific growth rates (µ) of 0.59 h-1 and

0.62 h-1, respectively. The relative expression ratios for the four genes during

growth of P. bryantii B14 on wheat arabinoxylan (WAX) compared to glucose

were: cdxA, 0.56 ± 0.04; xyl3A, 45 ± 11; xyl3B, 1.4 ± 0.3; xyl3C, 0.96 ± 0.3

(Figure 3.6B). Of the four genes, expression of xyl3A was induced during growth

130

on WAX compared to glucose, whereas expression of xyl3B and xyl3C was

constitutive. Despite the fact that three of the P. bryantii B14 GH 3 genes encode

β-xylosidase enzymes, these results suggest that there are differences in the

transcriptional regulation of these genes.

Mutational analysis of Xyl3B reveals residues that contribute to catalysis and substrate specificity. To identify amino acid residues that

contribute to catalysis and substrate specificity for GH 3 enzymes, Xyl3B was

chosen for mutational analysis due to its less complex domain architecture

(Table 3.3). Amino acid sequence alignments were constructed with

biochemically characterized GH 3 enzymes (Figure 3.7) and these alignments

revealed the presence of five distinct motifs that contain conserved amino acid

residues. Within these motifs, nine amino acid residues (Gly119, Arg171, Arg177,

Glu180, Asp185, Lys214, His215, Met251, and Asp286) were conserved

throughout all of the sequences aligned. To assess whether these conserved

amino acid residues map to the active site for Xyl3B, a three-dimensional

homology model based upon the available crystal structure for the Barley β-

glucan exohydrolase (ExoI) was constructed using the ModWeb online server for

protein structure modeling (15). The homology model for Xyl3B exhibited a Cα

root mean squared deviation (RMSD) value of 17 Å when compared to the Barley

crystal structure. However, of the nine conserved amino acids identified in the

sequence alignment, five residues mapped to within 5 Å of the transition state

analog bound in the active site of the model. Furthermore, the side-chain RMSD

131

values for these residues were appreciably smaller than the overall Cα RMSD for the homology model (Arg177, 1.3 Å; Lys214, 0.51 Å; His215, 0.55 Å; Met251,

0.64 Å; Asp286, 0.31 Å), indicating that the spatial conservation of these residues is much higher than for other residues in the protein. In the homology model, an additional amino acid residue (Glu115) was identified that aligns with

Asp120 from the Barley enzyme (Figure 3.7). In ExoI, Asp120 makes short hydrogen bond contacts with the 4′ and 6′ hydroxyl groups of glucose (Figure

3.9A), thus we hypothesized that the longer glutamic acid residue in Xyl3B could be important for conferring xylosidase activity to Xyl3B because xylose lacks the

6′ hydroxyl group found in glucose (Figure 3.9B).

For each of the amino acid positions indicated above, the residue was mutated to alanine to investigate whether these residues were important for catalysis. In addition, a Glu115→Asp mutation was generated to test the predicted role of this residue in discriminating between xylosides and glucosides by Xyl3B.

To ensure that the amino acid substitutions had not resulted in gross structural changes, the protease sensitivity patterns of wild-type and mutant forms of Xyl3B were compared. These experiments revealed that Xyl3B

Glu115Ala, Glu115Asp, Arg177Ala, Lys214Ala, His215Ala, Met251Ala, and

D286A all exhibited sensitivity to tryptic proteolysis (Figure 3.8) similar to that for wild-type Xyl3B. To further evaluate whether mutations in Xyl3B had any influence on the protein structure, the proportions of secondary structural elements between the mutant and wild type proteins were compared by circular

132

dichroism spectroscopy. These experiments revealed that the overall secondary

structures for the mutant and wild type Xyl3B were similar (Table 3.4).

The catalytic properties for wild-type Xyl3B and the seven mutant proteins

were compared through determination of their steady-state kinetic properties in the presence of pNP-β-D-xylopyranoside (pNPX) and pNP-β-D-glucopyranoside

(pNPG). The results revealed that each of the six residues, when mutated to

alanine, resulted in mutant enzymes that exhibited large losses in catalytic

efficiency (kcat/KM) relative to the wild type enzyme. These losses were

manifested in large decreases in the magnitude of the turnover number (kcat) for

the Glu115Ala, Arg177Ala, Lys214Ala, His215Ala, and D286A mutants, and modest changes in the KM for the enzymes except for a large increase in the KM

for the Met251Ala mutation (Table 3.5).

3.5 DISCUSSION

Prevotella bryantii B14 harbors four GH 3 genes each predicted to encode

a β-glucosidase enzyme. The presence of multiple GH 3 genes in P. bryantii B14

raises the important question of whether these genes encode enzymes with

redundant activities or whether they have each evolved to perform unique

functions within the cell. This gene family is one of the largest glycoside

hydrolase families identified and many organisms that metabolize oligo- or

polysaccharides possess multiple GH 3 genes. As an example, analysis of the 18

available genomes for Bacteroides spp. (which are the closest relatives of the

Prevotella spp.) revealed an average of 12 GH 3 genes per genome, with the B.

133

uniformis genome harboring 23 putative GH 3 genes. Auto-annotation of GH 3 genes almost universally assigns the function of β-glucosidase activity to the corresponding gene products. The consequence of this is that many genes within this family are incorrectly annotated, and this limits the ability to assess the role of these genes within the context of the metabolic repertoire of the organisms in which these genes reside. This is particularly evident within the human gut microbiome where a recent survey identified GH 3 as the most abundant GH family present in this microbial ecosystem (22). Despite the high prevalence of this gene family within the human gut microbial community, the role of these genes in this gut consortium is currently unknown. The GH 3 family also abounds in the rumen microbial ecosystem (5) and as demonstrated by the present study, it is highly represented in ruminal Prevotella spp. In both the human and rumen ecosystems, these genes are involved in energy capture from host-derived or non-digestible dietary components. Understanding the functional diversity of GH family 3 gene products in a Prevotella sp may lead to inferences applicable also to the human gut ecosystem.

Thus, the goal of the current study was two-fold: a) to characterize the substrate specificities of GH 3 enzymes from P. bryantii B14 and b) to identify amino acid residues that contribute to catalysis and substrate specificity for a representative member, Xyl3B. Xyl3B was chosen due to its less complex domain architecture (Table 3.3), rendering it more amenable to homology modeling based on known crystal structures for members of this family.

134

Results from the current study clearly indicate that these four genes

encode functional glycoside hydrolases and furthermore, that each of these

enzymes possesses unique substrate specificities. CdxA was previously found to

have cellodextrinase activity (46), and the results from the current study support

this conclusion. P. bryantii B14 is not cellulolytic, however this bacterium can

rapidly ferment cello-oligosaccharides, obtained via cross-feeding between

cellulolytic rumen bacteria (42). During growth on cellodextrins, longer chain

oligosaccharides are hydrolyzed extracellularly to cellotriose or cellobiose prior to

transport (42). Thus, CdxA may function to depolymerize cellodextrins, the

products of which can subsequently be metabolized by P. bryantii B14.

β-Xylosidases release xylose units from the non-reducing ends of xylo-

oligosaccharides produced by the action of endo-xylanase enzymes. In nature,

these xylo-oligosaccharides are commonly substituted with acetyl,

arabinofuranosyl, or 4-O-methyl glucuronyl groups, and the presence of these

groups likely influences the capacity of β-xylosidases to release xylose from

these substrates. The specificity of GH family 3 β-xylosidases for substituted

xylo-oligosaccharides has not been comprehensively studied, although previous

studies reported that a GH 3 β-xylosidase (Bxl1) from Trichoderma reesei could

release xylose from aldouronic acids in which the penultimate xylose of the reducing end was substituted with 4-O-methyl glucuronic acid (25, 37). We

hypothesize that similar to T. reesei Bxl1, Xyl3C may cleave xylose from the non-

reducing end of aldouronic acids substituted with 4-O-methyl glucuronic acid at

the penultimate xylose residue, however Xyl3B lacked this capacity. Confirmation

135

of this hypothesis will require testing the enzymes on purified substrates in the future. One striking difference between Xyl3C and Xyl3B is their domain organization, with Xyl3C harboring a putative PA14 insertion sequence within the

C-terminal (α/β)6 sandwich domain (Table 3.3). It is predicted that PA14 domains may serve a carbohydrate binding function (41, 47), and it is possible that this domain could be responsible for the differences in specificity observed between

Xyl3C and Xyl3B. Xyl3A also has a PA14 domain, however this enzyme exhibits a similar cleavage pattern as compared to Xyl3B (data not shown), which suggests that factors other than the presence of a PA14 domain contribute to the unique activity of Xyl3C. This conclusion may be supported by the fact that Bxl1 from T. reesei does not possess a PA14 domain.

Xylanase activity in P. bryantii B14 is inducible during growth on xylans

(18-20, 38, 39), and the current study shows that xyl3A is induced during growth on wheat arabinoxylan compared to glucose, whereas cdxA, xyl3B, and xyl3C are not. Although xyl3A, xyl3B, and xyl3C each encodes an enzyme with β- xylosidase activity, the differential expression of these genes further supports the biochemical data which indicate that each gene occupies unique functional roles within the cell.

A gene cluster with a GH 43 β-xylosidase and a GH 10 endoxylanase was previously identified in P. bryantii B14 (19) and these enzymes were shown to function synergistically to produce xylose from xylan substrates. Although the GH

43 β-xylosidase was shown to convert xylopentaose to xylose (20), the activity on substituted xylans was not evaluated. Our study has identified three additional

136

enzymes with β-xylosidase activity from P. bryantii B14. The presence of four enzymes (one GH 43 and three GH 3) that exhibit β-xylosidase activity in this bacterium is somewhat unexpected. However, it is possible that each of these enzymes has evolved to hydrolyze xylosidic linkages adjacent to substitutions and thus these enzymes may function together mutualistically to breakdown heterogeneous xylan substrates present in ingested forage. Another possibility is that yet to be identified xylose-containing oligosaccharide substrates are encountered within the rumen and these enzymes have evolved to release fermentable sugars from these substrates.

Despite diversity in function and considerable variation in amino acid sequences amongst members of this family of enzymes (24), there are a core set of conserved amino acid residues (Figure 3.7) that make hydrogen bond contacts to the substrate in the -1 subsite (44) (Figure 3.9A). Mutation of the five highly conserved residues, Arg177, Lys214, His215, Met251 and Asp286 to alanine resulted in large decreases in activity for Xyl3B (Table 3.5), although the structure of the mutant enzymes remained relatively unperturbed (Figure 3.8,

Table 3.4). These results support the role of Arg177, Lys214, and His215 in hydrogen bonding to the hydroxyl groups of the xylose moiety bound in the active site. The activity for the Asp286→Ala mutation was too low for the KM to be accurately determined, thus the apparent kcat was estimated. The approximately

4-5 orders of magnitude decrease in catalytic activity for this mutant (Table 3.5) coupled with the observation that this residue aligns with previously identified catalytic nucleophilic residues in GH 3 enzymes (2, 9, 11, 31, 34, 35, 40)

137

provides support for the assignment of Asp286 as the catalytic nucleophile for

Xyl3B.

The crystal structure for the GH 3 β-glucan exohydrolase (ExoI) from

Barley revealed that Asp120 forms a hydrogen bond to the 4' OH and 6' OH

groups of the glucose substrate bound at the -1 subsite (26). The pentose sugar

β-D-xylopyranose does not possess the additional CH2OH substituent from C5

that is present in the hexose sugar β-D-glucopyranose, thus it is possible that amino acid residues positioned at this site could function to discriminate between the hexose and the pentose on the basis of steric interactions (12). Amino acid

sequence alignments for Xyl3B with other biochemically characterized GH 3

enzymes revealed the presence of a glutamic acid residue (Glu115) which

aligned with the Asp120 from the Barley enzyme (Figure 3.7). Mutation of this

glutamic acid residue to alanine resulted in a large decrease in the kcat and a

relatively small increase in the KM with both pNPX and pNPG as substrates

(Table 3.5) which suggests that Glu115 is localized in the active site of Xyl3B and

plays a role in catalysis. Mutation of Glu115 to aspartic acid resulted in a

decrease in catalytic efficiency (kcat/KM) for Xyl3B with pNPX as a substrate, but a

slight increase in the catalytic efficiency with pNPG as a substrate. This 15 fold

change in substrate selectivity (WT Xyl3B, (kcat/KM)pNPX/(kcat/KM)pNPG = 35;

Glu115Asp Xyl3B, (kcat/KM)pNPX/(kcat/KM)pNPG = 2.3) for pNPG compared to pNPX suggests that this residue contributes to determining substrate specificity for

Xyl3B (Figure 3.9B). The fact that the Glu115Asp mutant remains slightly more

138

selective for pNPX over pNPG indicates that additional residues within the protein likely contribute to substrate discrimination.

All of the substrates that GH 3 enzymes hydrolyze possess either a β-D- xylose, α-L-arabinose, β-D-glucosamine, or β-D-glucose residue at the -1 subsite

(16). Amongst these sugars, the stereochemical configuration at the C1, C2, and

C3 positions are absolutely conserved. The four residues that are highly conserved in GH family 3 enzymes (Arg-Lys-His-Asp) that participate directly in catalysis (Asp) or in hydrogen bonding interactions (Arg-Lys-His) are located in the (α/β)8 barrel domain. The results reported in the current study support the hypothesis that Arg177, Lys214, and His215 form a highly conserved, core set of charged amino acid residues in GH 3 enzymes and that they make hydrogen bond contacts with the sugar residue at the -1 subsite. Furthermore, our data suggest that differences in specificity at the -1 subsite for members of this enzyme family likely arise as a result of changes in amino acid residues at positions that surround the C4, C5 and C6 positions of these sugars.

The GH family 3 is one of the largest GH families with 3,384 curated sequences currently in the Pfam database (http://pfam.sanger.ac.uk/). Of the bacterial genes identified in a recent survey of the human distal gut microbiome,

GH family 3 was the most highly represented GH family (22). Therefore, the importance of this gene family in the metabolic repertoire of gut bacteria cannot be overemphasized. The results from the current study indicate that many of these genes are likely to encode enzymes with unique, non-redundant biochemical properties and that care should be taken when attempting to assign

139

physiological functions solely from primary amino acid sequences. Our studies have provided insight into the roles of specific amino acids in catalysis and substrate selectivity, however future studies on the role of additional amino acids within these proteins will prove invaluable for furthering our understanding of the metabolic versatility of members within this large gene family. Furthermore, structural studies of representative members in complex with substrates will help to unravel the molecular mechanisms underlying substrate discrimination in this family of enzymes.

3.6 ACKNOWLEDGEMENTS

This research was supported by the Energy Biosciences Institute. The research of DD was partially supported by a James R. Beck fellowship in

Microbiology at the University of Illinois and by an NIH NRSA fellowship

(Fellowship no. 1F30DK084726). We thank Charles M. Schroeder, M. Ashley

Spies, Shosuke Yoshida, Yejun Han, Michael Iakiviak, Young-Hwan Moon, and

Xiaoyun Su of the Energy Biosciences Institute for valuable scientific discussions and Farhan Quader and Amara Hussain for technical assistance. We also thank members of the North American Consortium for Genomics of Fibrolytic Ruminal

Bacteria for the partial genome sequence of P. bryantii B14.

140

3.7 REFERENCES

1. Aziz, R. K., D. Bartels, A. A. Best, M. DeJongh, T. Disz, R. A. Edwards, K. Formsma, S. Gerdes, E. M. Glass, M. Kubal, F. Meyer, G. J. Olsen, R. Olson, A. L. Osterman, R. A. Overbeek, L. K. McNeil, D. Paarmann, T. Paczian, B. Parrello, G. D. Pusch, C. Reich, R. Stevens, O. Vassieva, V. Vonstein, A. Wilke, and O. Zagnitko. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75.

2. Bause, E., and G. Legler. 1974. Isolation and amino acid sequence of a hexadecapeptide from the active site of beta-glucosidase A3 from Aspergillus wentii. Hoppe Seylers Z Physiol Chem 355:438-442.

3. Bertani, G. 2004. Lysogeny at mid-twentieth century: P1, P2, and other experimental systems. J Bacteriol 186:595-600.

4. Bertani, G. 1951. Studies on lysogenesis. I. The mode of phage liberation by lysogenic Escherichia coli. J Bacteriol 62:293-300.

5. Brulc, J. M., D. A. Antonopoulos, M. E. Miller, M. K. Wilson, A. C. Yannarell, E. A. Dinsdale, R. E. Edwards, E. D. Frank, J. B. Emerson, P. Wacklin, P. M. Coutinho, B. Henrissat, K. E. Nelson, and B. A. White. 2009. Gene-centric metagenomics of the fiber-adherent bovine rumen microbiome reveals forage specific glycoside hydrolases. Proc Natl Acad Sci U S A 106:1948-53.

6. Bryant, M. P., N. Small, C. Bouma, and H. Chu. 1958. Bacteroides ruminicola n. sp. and Succinimonas amylolytica; the new genus and species; species of succinic acid-producing anaerobic bacteria of the bovine rumen. J Bacteriol 76:15-23.

7. Cann, I. K., S. Ishino, M. Yuasa, H. Daiyasu, H. Toh, and Y. Ishino. 2001. Biochemical analysis of replication factor C from the hyperthermophilic archaeon Pyrococcus furiosus. J Bacteriol 183:2614- 2623.

8. Cantarel, B. L., P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, and B. Henrissat. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-D238.

9. Chir, J., S. Withers, C. F. Wan, and Y. K. Li. 2002. Identification of the two essential groups in the family 3 beta-glucosidase from Flavobacterium meningosepticum by labelling and tandem mass spectrometric analysis. Biochem J 365:857-863.

141

10. Crombie, H. J., S. Chengappa, A. Hellyer, and J. S. Reid. 1998. A xyloglucan oligosaccharide-active, transglycosylating beta-D-glucosidase from the cotyledons of nasturtium (Tropaeolum majus L) seedlings-- purification, properties and characterization of a cDNA clone. Plant J 15:27-38.

11. Dan, S., I. Marton, M. Dekel, B. A. Bravdo, S. He, S. G. Withers, and O. Shoseyov. 2000. Cloning, expression, characterization, and nucleophile identification of family 3, Aspergillus niger beta-glucosidase. J Biol Chem 275:4973-4980.

12. Dodd, D., S. A. Kocherginskaya, M. A. Spies, K. E. Beery, C. A. Abbas, R. I. Mackie, and I. K. Cann. 2009. Biochemical analysis of a beta-D-xylosidase and a bifunctional xylanase-ferulic acid esterase from a xylanolytic gene cluster in Prevotella ruminicola 23. J Bacteriol 191:3328- 3338.

13. Edwards, J. E., N. R. McEwan, A. J. Travis, and R. J. Wallace. 2004. 16S rDNA library-based anlysis of ruminal bacterial diversity. Antonie van Leeuwenhoek 86:263-281.

14. Emanuelsson, O., S. Brunak, G. von Heijne, and H. Nielsen. 2007. Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2:953-971.

15. Eswar, N., B. John, N. Mirkovic, A. Fiser, V. A. Ilyin, U. Pieper, A. C. Stuart, M. A. Marti-Renom, M. S. Madhusudhan, B. Yerkovich, and A. Sali. 2003. Tools for comparative protein structure modeling and analysis. Nucleic Acids Res 31:3375-3380.

16. Faure, D. 2002. The family-3 glycoside hydrolases: from housekeeping functions to host-microbe interactions. Appl Environ Microbiol 68:1485- 1490.

17. Finn, R. D., J. Mistry, B. Schuster-Bockler, S. Griffiths-Jones, V. Hollich, T. Lassmann, S. Moxon, M. Marshall, A. Khanna, R. Durbin, S. R. Eddy, E. L. Sonnhammer, and A. Bateman. 2006. Pfam: clans, web tools and services. Nucleic Acids Res 34:D247-D251.

18. Gardner, R. G., J. E. Wells, J. B. Russell, and D. B. Wilson. 1995. The effect of carbohydrates on the expression of the Prevotella ruminicola 1,4- beta-D-endoglucanase. FEMS Microbiol Lett 125:305-310.

19. Gasparic, A., R. Marinsek-Logar, J. Martin, R. J. Wallace, F. V. Nekrep, and H. J. Flint. 1995. Isolation of genes encoding beta-D- xylanase, beta-D-xylosidase and alpha-L-arabinofuranosidase activities from the rumen bacterium Prevotella ruminicola B(1)4. FEMS Microbiol Lett 125:135-141. 142

20. Gasparic, A., J. Martin, A. S. Daniel, and H. J. Flint. 1995. A xylan hydrolase gene cluster in Prevotella ruminicola B(1)4: sequence relationships, synergistic interactions, and oxygen sensitivity of a novel enzyme with exoxylanase and beta-(1,4)-xylosidase activities. Appl Environ Microbiol 61:2958-2964.

21. Gill, S. C., and P. H. von Hippel. 1989. Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 182:319-326.

22. Gill, S. R., M. Pop, R. T. Deboy, P. B. Eckburg, P. J. Turnbaugh, B. S. Samuel, J. I. Gordon, D. A. Relman, C. M. Fraser-Liggett, and K. E. Nelson. 2006. Metagenomic analysis of the human distal gut microbiome. Science 312:1355-1359.

23. Griswold, K. E., and R. I. Mackie. 1997. Degradation of protein and utilization of the hydrolytic products by a predominant ruminal bacterium, Prevotella ruminicola B(1)4. J Dairy Sci 80:167-175.

24. Harvey, A. J., M. Hrmova, R. De Gori, J. N. Varghese, and G. B. Fincher. 2000. Comparative modeling of the three-dimensional structures of family 3 glycoside hydrolases. Proteins 41:257-269.

25. Herrmann, M. C., M. Vrsanska, M. Jurickova, J. Hirsch, P. Biely, and C. P. Kubicek. 1997. The beta-D-xylosidase of Trichoderma reesei is a multifunctional beta-D-xylan xylohydrolase. Biochem J 321 ( Pt 2):375-81.

26. Hrmova, M., R. De Gori, B. J. Smith, J. K. Fairweather, H. Driguez, J. N. Varghese, and G. B. Fincher. 2002. Structural basis for broad substrate specificity in higher plant beta-D-glucan glucohydrolases. Plant Cell 14:1033-1052.

27. Hrmova, M., R. De Gori, B. J. Smith, A. Vasella, J. N. Varghese, and G. B. Fincher. 2004. Three-dimensional structure of the barley beta-D- glucan glucohydrolase in complex with a transition state mimic. J Biol Chem 279:4970-4980.

28. Hrmova, M., A. J. Harvey, J. Wang, N. J. Shirley, G. P. Jones, B. A. Stone, P. B. Hoj, and G. B. Fincher. 1996. Barley beta-D-glucan exohydrolases with beta-D-glucosidase activity. Purification, characterization, and determination of primary structure from a cDNA clone. J Biol Chem 271:5277-5286.

29. Hrmova, M., V. A. Streltsov, B. J. Smith, A. Vasella, J. N. Varghese, and G. B. Fincher. 2005. Structural rationale for low-nanomolar binding of transition state mimics to a family GH3 beta-D-glucan glucohydrolase from barley. Biochemistry 44:16529-16539.

143

30. Hrmova, M., J. N. Varghese, R. De Gori, B. J. Smith, H. Driguez, and G. B. Fincher. 2001. Catalytic mechanisms and reaction intermediates along the hydrolytic pathway of a plant beta-D-glucan glucohydrolase. Structure 9:1005-1016.

31. Jager, S., and L. Kiss. 2005. Investigation of the active site of the extracellular β-D-glucosidase from Aspergillus carbonarius. World J Microbiol Biotechnol 21:337-343.

32. Kawai, R., K. Igarashi, M. Kitaoka, T. Ishii, and M. Samejima. 2004. Kinetics of substrate transglycosylation by glycoside hydrolase family 3 glucan (1-->3)-beta-glucosidase from the white-rot fungus Phanerochaete chrysosporium. Carbohydr Res 339:2851-2857.

33. Laemmli, U. K. 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680-685.

34. Li, Y. K., J. Chir, and F. Y. Chen. 2001. Catalytic mechanism of a family 3 beta-glucosidase and mutagenesis study on residue Asp-247. Biochem J 355:835-840.

35. Li, Y. K., J. Chir, S. Tanaka, and F. Y. Chen. 2002. Identification of the general acid/base catalyst of a family 3 beta-glucosidase from Flavobacterium meningosepticum. Biochemistry 41:2751-2759.

36. Lobley, A., L. Whitmore, and B. A. Wallace. 2002. DICHROWEB: an interactive website for the analysis of protein secondary structure from circular dichroism spectra. Bioinformatics 18:211-212.

37. Margolles-Clark, E., M. Tenkanen, T. Nakari-Setala, and M. Penttila. 1996. Cloning of genes encoding alpha-L-arabinofuranosidase and beta- xylosidase from Trichoderma reesei by expression in Saccharomyces cerevisiae. Appl Environ Microbiol 62:3840-6.

38. Miyazaki, K., J. C. Martin, R. Marinsek-Logar, and H. J. Flint. 1997. Degradation and utilization of xylans by the rumen anaerobe Prevotella bryantii (formerly P. ruminicola subsp. brevis) B(1)4. Anaerobe 3:373-381.

39. Miyazaki, K., H. Miyamoto, D. K. Mercer, T. Hirase, J. C. Martin, Y. Kojima, and H. J. Flint. 2003. Involvement of the multidomain regulatory protein XynR in positive control of xylanase gene expression in the ruminal anaerobe Prevotella bryantii B(1)4. J Bacteriol 185:2219-2226.

40. Paal, K., M. Ito, and S. G. Withers. 2004. Paenibacillus sp. TS12 glucosylceramidase: kinetic studies of a novel sub-family of family 3 glycosidases and identification of the catalytic residues. Biochem J 378:141-149.

144

41. Rigden, D. J., L. V. Mello, and M. Y. Galperin. 2004. The PA14 domain, a conserved all-beta domain in bacterial toxins, enzymes, adhesins and signaling molecules. Trends Biochem Sci 29:335-339.

42. Russell, J. B. 1985. Fermentation of cellodextrins by cellulolytic and noncellulolytic rumen bacteria. Appl Environ Microbiol 49:572-576.

43. Stevenson, D. M., and P. J. Weimer. 2007. Dominance of Prevotella and low abundance of classical ruminal bacterial species in the bovine rumen revealed by relative quantification real-time PCR. Appl Microbiol Biotechnol 75:165-174.

44. Varghese, J. N., M. Hrmova, and G. B. Fincher. 1999. Three- dimensional structure of a barley beta-D-glucan exohydrolase, a family 3 glycosyl hydrolase. Structure 7:179-190.

45. Watt, D. K., H. Ono, and K. Hayashi. 1998. Agrobacterium tumefaciens beta-glucosidase is also an effective beta-xylosidase, and has a high transglycosylation activity in the presence of alcohols. Biochim Biophys Acta 1385:78-88.

46. Wulff-Strobel, C. R., and D. B. Wilson. 1995. Cloning, sequencing, and characterization of a membrane-associated Prevotella ruminicola B(1)4 beta-glucosidase with cellodextrinase and cyanoglycosidase activities. J Bacteriol 177:5884-5890.

47. Zupancic, M. L., M. Frieman, D. Smith, R. A. Alvarez, R. D. Cummings, and B. P. Cormack. 2008. Glycan microarray analysis of Candida glabrata adhesin ligand specificity. Mol Microbiol 68:547-559.

145

3.8 TABLES AND FIGURES

a TABLE 3.1 Anaerobic medium for culturing Prevotella bryantii B14. Concentration in Media Ingredients (mg/L) Glucoseb 1800 Wheat Arabinoxylan (medium viscosity)b 1500 Haemin (in solution) 1 Resazurin (in solution) 1 (NH4)2SO4 1000 Methionine 900 Cysteine 250 Sodium Sulfide 250

Na2CO3 4000

Mineral Solution α (mL/L) 50.0 mL KH2PO4 240 Na2SO4 240 NaCl 480 MgCl2 · 6H2O 80 CaCl2 · 2H2O 64

Mineral Solution β (mL/L) 10.0 mL K2HPO4 240

VFA solution (mL/L) 5.0 mL Acetic acid 1.7 mL Propionic acid 0.6 mL Butyric acid 0.4 mL Isobutyric acid 0.1 mL n-Valeric acid 0.1 mL Isovaleric acid 0.1 mL DL-α-methylbutyric acid 0.1 mL

Vitamin Solution (mL/L) 100.0 mL Thiamine 2 Pantothenate 2 Nicotinamide 2 Riboflavin 2 Pyridoxine 2 Aminobenzoic acid 0.10 Biotin 0.025 Folic acid 0.025 Thioctic acid 0.025 Vitamin B12 0.025 a This table was constructed by Michael Iakiviak of the Energy Biosciences Institute. b One carbohydrate source was selected. The final concentration of carbohydrate used corresponds to 10 mM for both wheat arabinoxylan (mixture of xylose and arabinose) and glucose.

146

TABLE 3.2 Oligonucleotide primers used in this study. Primer Primer name a Orientation Sequence (5'→3') Cloning Primers CdxA (ORF0348) Forward 5'-catatgCAGGCTCCTCAGTTACGAGC-3' Reverse 5'-ctcgagTTATTTATTTCTTCTCAATAAGTTAAGCG-3' SDM1 5'-TAGGTGATAATGCTGCTCACATGGTTGGTACAGCTAAAC-3' SDM2 5'-GTTTAGCTGTACCAACCATGTGAGCAGCATTATCACCTA-3' Xyl3A (ORF0398) Forward 5'-GACGACGACAAGATGCTCATCTGCGCTGCTGAAAAG-3' Reverse 5'-GAGGAGAAGCCCGGTTAATTTAACGTATAATGTATCTG-3' Xyl3B (ORF0911) Forward 5'-GCGCcatatgCAAACTATACTTATTAATCAGCAGG-3' Reverse 5'-CGCGctcgagTCACTTGATGACTTCAG-3' Xyl3C (ORF0975) Forward 5'-GCGCcatatgATGAAAAGTAAACAACTAATAAC-3' Reverse 5'-CGCGctcgagTTATTTTAGGTAAATAATTAATTTTTTC-3' Mutagenesis Primers b,c Xyl3B Glu115Ala Forward 5'-GCTCTATTCCACGAAGCAGTGCTCTCGGGTGTT-3' Reverse 5'-AACACCCGAGAGCACTGCTTCGTGGAATAGAGC-3' b,c Xyl3B Glu115Asp Forward 5'-GCTCTATTCCACGAAGATGTGCTCTCGGGTGTTAA -3' Reverse 5'-TTAACACCCGAGAGCACATCTTCGTGGAATAGAGC -3' b,c Xyl3B Arg177Ala Forward 5'-CGAAATCCAAGTTTCAACGCGCTCGAAGAGTCGTATGG-3' Reverse 5'-CCATACGACTCTTCGAGCGCGTTGAAACTTGGATTTCG-3' b,c Xyl3B Lys214Ala Forward 5'-GTGTGGGGGCTTGCAGCGCGCACTATCTCGGATATG-3' Reverse 5'-CATATCCGAGATAGTGCGCGCTGCAAGCCCCCACAC-3' b,c Xyl3B His215Ala Forward 5'-GGGGGCTTGCAGCAAGGCCTATCTCGGATATGGT-3' Reverse 5'-ACCATATCCGAGATAGGCCTTGCTGCAAGCCCCC-3' b,c Xyl3B Met251Ala Forward 5'-CTGGAAGCAAAGCGCTGGCGCCTGGTTATCACGCTG-3' Reverse 5'-CAGCGTGATAACCAGGCGCCAGCGCTTTGCTTCCAG-3' b,c Xyl3B D286A Forward 5'-TGGTATGGTGGTTAGTGCCTATACAGCCATAGACC-3' Reverse 5'-GGTCTATGGCTGTATAGGCACTAACCACCATACCA-3' Q-PCR Primers d GyrAQ-PCR Forward 5'-CCCGTGTAGTAGGTGAGGTTCTTG-3' Reverse 5'-TCCAGTCTTGGCCCATACG-3' d CdxAQ-PCR Forward 5'-TTAGGAATTTGTTTGTCTGGAGCAT-3' Reverse 5'-GCTTTCTCTTCCAACGTCATAGC-3' d Xyl3AQ-PCR Forward 5'-GGCCTGTGCCAAGCACTTT-3' Reverse 5'-CGCGTGGCGAGATGTTATT-3' d Xyl3BQ-PCR Forward 5'-CGCTTCGAGGCAAACAGAAT-3' Reverse 5'-GGGAACGAATAGTCACCACACA-3' d Xyl3CQ-PCR Forward 5'-AAACTCGGTGACTTTGATTCTGATAAC-3' Reverse 5'-TTTGGTAGGCAAGCTGTTTATGTT-3' a Oligonucleotide primers were synthesized by Integrated DNA Technologies (Coralville, IA). Restriction enzyme sites are in lower case and italicized in the primer sequence. b Residues were selected for conservative mutagenesis due to their conservation as predicted by alignment of the primary amino acid sequences for Family 3 glycoside hydrolases with biochemically defined catalytic activities, and primers were designed in accordance with the QuikChange protocol by Stratagene (La Jolla, CA). c Underlined sequence indicates engineered codon mutations. d Primers for Q-PCR were designed using the Primer Express v3.0 program from Applied Biosystems.

147

TABLE 3.3 Domain architecture and predicted activities for GH family 3 proteins from Prevotella bryantii B14. ORF no.a domain architectureb predicted activityc

β-D-glucosidase CdxA (experimental (ORF0348) evidence (46)) Xyl3A β-D-glucosidase (ORF0398)

Xyl3B β-D-glucosidase (ORF0911)

Xyl3C β-D-glucosidase (ORF0975) a Open reading frame assignments were made by The Institute for Genomic Research (TIGR) following autoannotation of the draft genome assembly for Prevotella bryantii B14. b Functional domains were assigned utilizing the Pfam online server (http://www.sanger.ac.uk/Software/Pfam/) (17). Domain hits were included if the expect value (E value) was smaller than 1 x 10-8. c Open reading frame functional annotations were made by The Institute for Genomic Research (TIGR) following autoannotation of the draft genome assembly for Prevotella bryantii B14. ORF0348 was cloned and biochemically characterized by Wulff-Strobel et al. (46).

148

TABLE 3.4 Analysis of CD spectra for Xyl3B wild type and mutant proteins using DICHROWEBa Protein α-helix β-sheet β-turn unordered (%) (%) (%) (%) Wild Type 51 ± 3 22 ± 2 10 ± 2 17 ± 2 Glu115Ala 50 ± 1 22 ± 1 9 ± 1 19 ± 1 Glu115As 48 ± 3 24 ± 1 10 ± 2 18 ± 1 p Arg177Ala 46 ± 2 25 ± 2 8 ± 4 21 ± 2 Lys214Ala 48 ± 2 22 ± 3 11 ± 3 17 ± 1 His215Ala 50 ± 2 21 ± 3 8 ± 2 21 ± 1 Met251Ala 47 ± 1 23 ± 1 10 ± 2 20 ± 1 D286A 52 ± 2 20 ± 3 9 ± 2 19 ± 1 a CD spectra were recorded in the far-UV range utilizing a J-815 CD spectropolarimeter. The spectra were recorded from 190 to 260 nm at a scan rate of 50 nm/s and a 1-nm wavelength step with five accumulations. Each reaction was performed three times, and the data are means ± standard deviations of the means. The spectra were uploaded onto the DICHROWEB online server and analyzed as described in Experimental Procedures.

149

TABLE 3.5 Steady state kinetic parameters for Xyl3B wild type and mutant enzymes. pNP-β-D-xylopyranosidea pNP-β-D-glucopyranosidea

kcat Km kcat/Km kcat Km kcat/Km -1 -1 -1 -1 -1 -1 (s ) (mM) (s mM ) (s ) (mM) (s mM ) Xyl3B WT 250 ± 10 8.4 ± 1 30 ± 4 19 ± 0.6 22 ± 1 0.86 ± 0.05 Xyl3B -2 -2 -3 0.35 ± 0.02 28 ± 3 (1.3 ± 0.2) x 10 (8.7 ± 0.5) x 10 39 ± 4 (2.2 ± 0.3) x 10 Glu115Ala Xyl3B 87 ± 5 39 ± 4 2.2 ± 0.2 16 ± 0.4 17 ± 1 0.94 ± 0.06 Glu115Asp Xyl3B -2 b 0.17 ± 0.01 8.8 ± 1 (1.9 ± 0.2) x 10 N.D. N.D. N.D. Arg177Ala 150 Xyl3B -2 b 0.10 ± 0.01 5.7 ± 0.9 (1.8 ± 0.3) x 10 N.D. N.D. N.D. Lys214Ala

Xyl3B -2 -3 4.0 ± 0.2 19 ± 1 0.21 ± 0.02 (3.8 ± 0.4) x 10 23 ± 4 (1.7 ± 0.3) x 10 His215Ala Xyl3B b 75 ± 6 100 ± 10 0.75 ± 0.1 N.D. N.D. N.D. Met251Ala Xyl3B D286Ab,c (1.20 ± 0.3) x 10-2 N.D. N.D. N.D. N.D. N.D. a The data are reported as means ± standard errors from the mean for three independent experiments. b Kinetic parameters for pNPG were not determined because the activity was below the limit of sensitivity for the assay. c The reported kcat value is kcat(apparent), as Km values were not determined; rates for this mutant was determined at 46.25 mM pNPX concentration and thus represent a lower boundary on kcat.

Figure 3.1 Genomic context for GH family 3 genes in P. bryantii B14. The genome for P. bryantii B14 was uploaded onto the Rapid Annotation using Subsystems Technology (RAST) server and annotated. The genomic context for each GH family 3 gene with 7.5 kB on either end of the corresponding shown. Hyp denotes hypothetical proteins.

151

Figure 3.2 GH family 3 genes from P. bryantii B14 encode functional enzymes. (A) Purification of recombinant GH 3 proteins. The eluate from nickel-chelate chromatography was analyzed by 12% SDS-PAGE, followed by Coomassie brilliant blue G-250 staining. (B- E) Hydrolysis of pNP-linked sugars. CdxA (B), Xyl3A (C), Xyl3B (D), and Xyl3C (E) were assessed for their capacity to hydrolyze several pNP-linked sugars by UV spectroscopy.

The pNPX, pNPA, pNPG, and pNPG2 represent pNP-β-D-xylopyranoside, pNP-α-L- arabinopyranoside, pNP-β-D-glucopyranoside, and pNP-β-D-cellobioside, respectively.

152

Figure 3.3 Optimal pH for GH family 3 genes from P. bryantii B14. (A-D) Each enzyme was incubated with either pNP-β-D-glucopyranoside (CdxA), or pNP-β-D- xylopyranoside (Xyl3A, Xyl3B, Xyl3C) at a substrate concentration of 46.25 mM, and the change in absorbance at 400 nm was monitored continuously. The apparent kcat was estimated from a linear curve fit of the reaction progress curves.

153

Figure 3.4 Hydrolysis of cellohexaose and xylohexaose by P. bryantii B14 GH family 3 enzymes. Hydrolysis of cellohexaose and xylohexaose was assessed by incubating CdxA (A), Xyl3A (B), Xyl3B (C), or Xyl3C (D) with each substrate and then resolving the products by high performance anion exchange chromatography (HPAEC) followed by detection with a pulsed amperometric detector (PAD). The hydrolysis products were identified by comparison of peaks with retention times of purified substrates. Abbreviations for oligosaccharides are as follows: glucose through cellohexaose, G1-G6; xylose through xylohexaose, X1-X6.

154

Figure 3.5 Xyl3B and Xyl3C exhibit differences in specificity for 4-O-methyl glucuronic acid substituted xylo-oligosaccharides. (A-B) Hydrolysis of aldouronic acids. P. bryantii B14 Xyl3B (A) or Xyl3C (B) were incubated with aldouronic acids in the presence or absence of a GH family 67 α-glucuronidase enzyme which was cloned from Prevotella bryantii B14 (Agu67A) and the hydrolytic products were analyzed by HPAEC- PAD. The hydrolysis products were identified by comparison of peaks with retention times of purified substrates.

155

Figure 3.6 Expression of GH 3 genes during growth of P. bryantii B14 with wheat arabinoxylan and glucose. (A) P. bryantii B14 was cultured in a chemically defined growth medium with either wheat arabinoxylan (0.15% w/v) or glucose (0.18% w/v) as the sole carbohydrate source and the optical densities at 600 nm (OD600nm) over time were monitored. (B) At an optical density of 0.2 (mid-log phase of growth), cells were harvested, RNA was extracted, and Q-RT-PCR experiments were performed as described in the experimental procedures. Four technical replicates of the Q-PCR reaction were performed for each of three independent biological replicates, and data are reported as means ± standard errors from the mean.

156

Figure 3.7 Multiple amino acid sequence alignment reveals conserved residues in GH 3 enzymes. Primary amino acid sequences for biochemically characterized β- glycosidases from glycoside hydrolase family 3 were imported into ClustalW (http://align.genome.jp/) and aligned using the Blosum62 matrix. The multiple-sequence alignment file was then entered into the ESPript V.2.2 program (http://espript.ibcp.fr/ESPript/ESPript/) and post-script output files were generated. Only regions within the alignment that exhibit significant amino acid conservation are shown. Arrows indicate the amino acid residues that were targeted for mutagenesis in the current study. The NCBI Protein Database accession numbers are indicated alongside the organism abbreviations. The organisms are abbreviated as follows: Pru, Prevotella ruminicola 23; Pbr, Prevotella bryantii B14; Hvu, Hordeum vulgare; Tet, Thermoanaerobacter ethanolicus; Tbr, Thermoanaerobacter brockii; Sso, Sulfolobus solfataricus P2; Ani, Aspergillus niger; Air, Azospirillum irakense.

157

Figure 3.8 Trypsin sensitivity patterns for wild type and mutant Xyl3B. Trypsin protease sensitivity patterns were generated for the wild- type and mutant forms of Xyl3B. Wild-type or mutant Xyl3B (2 mg/mL) were incubated in a Tris-HCl buffer (50 mM Tris-HCl, 150 mM NaCl, 50 mM DTT; pH 8.0) with various concentrations of trypsin (lane A, 0 µg/ml; lane B, 3.1 µg/ml; lane C, 25 µg/ml; lane D, 100 µg/ml) at 25°C. After incubation for 1 h, the reactions were terminated by addition of SDS sample buffer, and the samples were electrophoresed on a 12% SDS- polyacrylamide gel and stained with Coomassie brilliant blue G-250.

158

Figure 3.9 GH family 3 active site residues. (A) Active site residues for the β-glucan exohydrolase (ExoI) from Barley (Hordeum vulgare) bound to glucose are shown (PDB accession no. 1ex1). The residues in brackets are corresponding residues in Xyl3B that align with the Barley ExoI residues. (B) Predicted model for discrimination between xylosides and glucosides by GH 3 enzymes. Asp120 from Barley ExoI forms hydrogen bond contacts with the 4'OH and 6'OH groups of glucose. Glu115 from P. bryantii B14 Xyl3B is predicted to form hydrogen bond contacts with the 4'OH of xylose and may discriminate between glucose and xylose on the basis of steric interactions.

159

CHAPTER 4

FUNCTIONAL ASSIGNMENT OF ENDO-XYLANASE ACTIVITY TO HYPOTHETICAL GENES FROM GUT BACTEROIDETES IDENTIFIED THROUGH ANALYSIS OF THE PREVOTELLA BRYANTII B(1)4 TRANSCRIPTOME

4.1 ABSTRACT

Enzymatic degradation of lignocellulosic polysaccharides by microbes in

the bovine rumen and the human colon is an important process that contributes

to nutrient acquisition by the host. Prevotella bryantii B14 is an abundant rumen bacterium that efficiently degrades soluble xylan fractions. To identify the repertoire of genes that P. bryantii B14 employs to degrade xylan, the bacterium

was cultured in a synthetic medium with either wheat arabinoxylan (WAX) or a

mixture of the component monosaccharides, xylose and arabinose (XA) and the

transcriptomes were compared using DNA microarrays and RNA sequencing

(RNAseq). A large number of genes with predicted roles in xylan degradation and

utilization were highly induced on WAX relative to XA, including a previously

characterized xylan hydrolase gene cluster. The most highly induced genes

formed an operon which contained putative outer membrane proteins analogous

to the starch utilization system (Sus) previously identified in the prominent human

gut symbiont Bacteroides thetaiotaomicron. The arrangement of this gene cluster

is highly conserved in other xylanolytic organisms within the Bacteroidetes

phylum which suggests that the mechanism of xylan utilization involving these

genes is conserved amongst these bacteria. A large number of genes encoding

proteins of unknown function were also induced on WAX relative to XA which

160

suggests that P. bryantii B14 may employ novel mechanisms for xylan degradation. A hypothetical protein with low homology to glycoside hydrolase

(GH) family 5 which was upregulated during growth on WAX was cloned and heterologously expressed as a hexahistidine fusion protein in E. coli. Biochemical analysis of the purified protein revealed that it possesses endo-xylanase activity thus the enzyme was named PbXyn5A. Two of the most similar proteins in the

GenBank database derive from the human colonic Bacteroides spp. (B. eggerthii

ORF1299 and B. intestinalis ORF4213) and are also annotated as hypothetical proteins. The corresponding genes were expressed and the purified recombinant proteins also exhibited robust endo-xylanase activity. Mutational analysis revealed two carboxylic acid residues that were essential for activity in all three enzymes. These glutamic acid residues are conserved amongst the distantly related members of the GH family 5, which supports the assignment of these enzymes to GH 5. This study provides insight into polysaccharide degradation by

P. bryantii B14 and reveals similarities in the mechanism for xylan degradation between human colonic and ruminant Bacteroidetes.

4.2 INTRODUCTION

The degradation and fermentation of plant polysaccharides is an important metabolic process that occurs within the gut microbial ecosystem of ruminants as well as humans. Xylan is one of the most abundant plant polysaccharides and microbes within the bovine rumen have evolved to efficiently degrade this recalcitrant hemicellulosic substrate. The depolymerization of xylan requires the

161

coordinated action of a number of enzymes including endo-xylanases, β- xylosidases, α-L-arabinofuranosidases, α-glucuronidases, acetylxylan esterases, and ferulic acid esterases (12). Prevotella spp. are the most numerically dominant xylanolytic bacteria in the rumen (15, 39) and therefore the mechanism by which they degrade and utilize xylan has received a great deal of focus (13,

14, 18, 19, 28, 29). P. bryantii B14 can grow with xylan as the sole carbohydrate source (28), however to date only six genes with roles in xylan hydrolysis have been cloned and characterized from P. bryantii B14 including two glycoside hydrolase (GH) family 10 endo-xylanase enzymes (17-19), three GH family 3 β- xylosidase enzymes (13), and one GH family 43 β-xylosidase enzyme (18, 19).

Given the capacity of this organism to grow efficiently with xylan substrates and the complexity in chemical linkages within natural xylans, it is likely that P. bryantii B14 uses additional as yet unidentified enzymes to degrade xylan.

The genome for P. bryantii B14 has recently been partially sequenced (30) and this bacterium harbors at least 109 genes predicted to encode either glycoside hydrolase or carbohydrate esterase genes. In the current study a transcriptional approach was employed to identify genes that P. bryantii B14 uses to degrade xylan. Endo-xylanase activity of P. bryantii B14 is induced by medium to large sized xylo-oligosaccharides but is not induced during growth with monomeric sugars including xylose and arabinose (27). The current study provides insight into the transcriptional profile of P. bryantii B14 cultured either with soluble wheat arabinoxylan (WAX) or with a mixture of xylose and arabinose

(XA). Furthermore, a hypothetical gene in P. bryantii B14 which was upregulated

162

on WAX compared to XA was identified and found to encode a novel endo-

xylanase enzyme distantly related to GH family 5. Homologs of this gene are

present in the recently sequenced genomes for several human colonic

Bacteroides spp. (Bacteroides eggerthii and Bacteroides intestinalis). These

genes were also annotated as coding for hypothetical proteins, however results

from the current study reveal that these genes also encode endo-xylanase

enzymes.

4.3 EXPERIMENTAL PROCEDURES

Materials. Prevotella bryantii B14 (DSM 11371) was initially isolated by

Bryant et al. (5), and was obtained from our culture collection in the Department

of Animal Sciences at the University of Illinois at Urbana-Champaign.

Bacteroides intestinalis (DSM 17393) was a kind gift from Jeffrey I. Gordon

(Washington University in St. Louis) and was originally obtained from the DSMZ

microorganisms culture collection. E. coli JM109 and E. coli BL21-CodonPlus

(DE3) RIL competent cells and the PicoMaxx high fidelity DNA polymerase were

acquired from Stratagene (La Jolla, CA). The pET-46 Ek/LIC Vector Kit was

obtained from Novagen (San Diego, CA). The DpnI restriction was

purchased from New England Biolabs (Ipswich, MA). The RNAprotect bacteria

reagent, RNeasy Mini kit, RNase-free DNase set, DNeasy blood and tissue DNA purification kit, and QIAprep Spin Miniprep kit were all obtained from Qiagen

(Valencia, CA). Xylo-oligosaccharides and wheat arabinoxylan (medium viscosity

– 20 centistokes) were obtained from Megazyme (Bray, Ireland). The Superscript

163

double stranded cDNA synthesis kit and the RiboMinus transcriptome isolation kit

were obtained from Invitrogen (Carlsbad, CA). The One-Color DNA Labeling kit,

Hybridization kit, Sample Tracking Control kit, and Wash Buffer kit were all

obtained from Roche NimbleGen (Madison, WI). All other reagents were of the

highest possible purity and were purchased from Sigma-Aldrich (St. Louis, MO).

Growth of P. bryantii B14 and isolation of RNA. P. bryantii B14 was

grown anaerobically in a chemically defined medium modified from Griswold et al.

(13) with either wheat arabinoxylan (0.15% wt/vol) or a mixture containing both

xylose (0.0885% wt/vol) and arabinose (0.0615% wt/vol) as the sole

carbohydrate sources. The concentrations of XA in the mixture were equivalent

to that in the WAX polysaccharide (59:41, xylose:arabinose). Cultures were

grown in butyl rubber stoppered serum bottles which were modified to

incorporate a sidearm tube to allow for turbidity measurements at 600 nm using a

Spectronic 20D+ spectrophotometer from Thermo Fisher Scientific Inc. (Waltham,

MA). Cells were sub-cultured two times consecutively in triplicate with the respective growth media to ensure complete adaptation to the carbohydrate growth source. At mid-log phase of growth (OD600nm = 0.2), 30 mL of each culture

was removed and immediately combined with 60 mL of RNAprotect bacteria

reagent and RNA was purified using the RNeasy kit with the optional on-column

DNase treatment step. The quality of the RNA was assessed by using a

Bioanalyzer 2100 with a RNA 6000 Nano Assay Reagent kit from Agilent (Santa

Clara, CA).

164

Gene expression analysis using DNA Microarrays. Custom DNA

Microarray slides were constructed by Roche NimbleGen (Madison, WI) using the partial genome sequence of P. bryantii B14. On each slide, four arrays were printed each containing 72,000 features. For each array, a total of 2,551 open reading frames were analyzed each with nine individual probes and with three replicates per probe. The purified RNA was converted to double stranded cDNA by using the Superscript cDNA synthesis kit. The cDNA was then labeled with

Cy3 by using the One-Color DNA Labeling kit which employs Cy3 labeled random nonamer primers. Sample tracking controls were added to each of the labeled cDNA samples and hybridization was performed at 42 °C using the

Hybridization kit and a Maui 4-bay hybridization system from BioMicro Systems

(Salt Lake City, UT). After 18h, the slides were removed, washed using the Wash

Buffer kit, then the slides were scanned using a GenePix 4000B microarray scanner with the laser tuned to a wavelength of 532 nm and a photomultiplier tube (PMT) gain of 680. The files were then analyzed using the NimbleScan v2.5 software and gene expression values were determined using ArrayStar v3.0 from

DNASTAR, Inc. (Madison, WI).

Whole transcriptome analysis by RNA sequencing. For RNAseq analyses, RNA isolated from two individual experiments were used for each growth condition. Bacterial 16S and 23S ribosomal RNA were removed from ~10 ug of total RNA with the Ribominus Bacteria Transcriptome Isolation kit. The

165

enriched mRNA fraction was converted to an RNAseq library using the mRNAseq kit from Illumina Inc. (San Diego, CA) with multiplexing adaptors that tag each library with a unique identifier. Fragments 200-400bp long were size- selected for the final library. Each library was hybridized onto a single lane of an

8 lane flowcell and sequenced with a Genome Analyzer IIx according to the manufacturer's instructions (Illumina, San Diego, CA). The sequences derived from each library were 73 bp long and the overall error rate of the control lane was 0.83%. Each library yielded the following number of reads: WAX1, 11.2 million reads; WAX2, 7.1 million reads; XA1, 8.8 million reads; XA2, 6.8 million reads.

The RNAseq data was analyzed using CLC genomics workbench v3.7 from CLC Bio (Cambridge, MA). The partial genome sequence for P. bryantii B14 was uploaded onto the Rapid Annotation using Subsystem Technology (RAST) server (2) and the annotated genome was exported as a GenBank file (.gbk).

The P. bryantii B14 GenBank file was then uploaded onto the CLC software and used as a reference genome for RNAseq analyses for each of the four samples.

Reads were only assembled if the fraction of the read that aligned with the reference genome was greater than 0.9 and if the read matched other regions of the reference genome at less than 10 nucleotide positions. The RNAseq output files were then analyzed for statistical significance by using the proportion-based test of Baggerly et al. (3).

166

De novo assembly of the P. bryantii B14 transcriptome. To analyze the

entire transcriptome of P. bryantii B14, a de novo assembly of the RNAseq data was performed with 82 million 60bp single end, nucleotide reads from a total of seven illumina lanes. These data comprised the four illumina libraries previously mentioned and included three additional illumina libraries. These three additional libraries had been constructed with a mixture of RNA from WAX and XA grown cultures, but the multiplexing adaptor reaction had failed, and the individual RNA populations could not be separated based upon the cognate growth substrate.

Thus these libraries were not useful for analyzing differential gene expression, but did contain information on transcript coverage and were included in the transcriptome analysis. The reads were filtered for adapter sequences and assembled into contiguous sequences (contigs) using ABySS ver 1.0.12

(http://www.bcgsc.ca/platform/bioinfo/software/abyss) with k-mer lengths of 25,

30, 35, 40, 45 and 50. Twenty-five 2.83 GHz cores, on a cluster with 200 processors with 16 GB per 8 cores, were used for each k-mer assembly. The six different k-mer assemblies were merged and any contig that shared 100% identity to a larger contig and was completely contained within the larger contig was filtered out. The merged Abyss assembly was reassembled using the 64-bit phrap ver 1.080721 with the revise greedy option on a 16-core Intel Xeon 2.93

GHz server with 128 GB DDR2 RAM. The phrap assembly resulted in a total of

1927 contigs with a median length of 2340 bp, and maximum and minimum lengths of 47199 bp and 100 bp, respectively. The N50 contig length was 6593

167

bp and 292 contigs were larger than this size. The contigs were matched to the reference P. bryantii genomic sequence using BLAT (23) at 99% identity.

Cloning, expression, and purification of P. bryantii B14 ORF0150,

Bacteroides eggerthii ORF1299, and Bacteroides intestinalis ORF4213.

Genomic DNA was isolated from P. bryantii B14 (5) and B. intestinalis DSM

17393 (4) using the DNeasy blood and tissue DNA purification kit. B. eggerthii genomic DNA was a kind gift from Abigail A. Salyers (Department of

Microbiology at the University of Illinois at Urbana-Champaign). The DNA sequences for the three genes were amplified from genomic DNA using the

PicoMaxx high fidelity PCR system with the oligonucleotide primers listed in

Table 4.1. Putative signal sequences with corresponding cleavage sites were predicted for each of the three genes (PbXyn5A, MNKKIIIVCLACAISLSSMA;

BeXyn5A, MKNMKNTVSILLFLFVFLSACS; BiXyn5A,

MKNNMRKIIYLLTLLLGVSLAACS) with the SignalP v3.0 online server (16). To ensure that proteins accumulated within the cytoplasm of E. coli cells, the primers were engineered to clone the respective gene beginning with the amino acid just downstream of the predicted cleavage site.

The three genes were cloned into pET-46b by ligation independent cloning as described previously (13). Recombinant expression of the genes cloned into the pET-46b vector results in the production of hexa-histidine fusion proteins which facilitates protein purification. The integrity of the cloned genes was

168

confirmed by DNA sequencing (W. M. Keck Center for Comparative and

Functional Genomics at the University of Illinois at Urbana-Champaign).

The recombinant proteins were expressed in E. coli BL-21 CODON PLUS

RIL cells and purified using an ÄKTAxpress instrument equipped with His-trap and desalting columns as described previously (13). Fractions were analyzed by

SDS-PAGE followed by staining with Coomassie Blue G-250 and purified fractions were pooled and concentrated using Amicon Ultra-15 Centrifugal Filter

Units with molecular weight cutoffs of 50 kDa from Millipore (Billerica, MA). The absorbances of the protein solutions were recorded at 280 nm using a NanoDrop

100 spectrophotometer (Thermo Fisher Scientific Inc.) and the protein concentrations were calculated using the method of Gill and von Hippel (20) with the following extinction coefficients: PbXyn5A, 152.53 cm-1mM-1; BeXyn5A,

137.17 cm-1mM-1; BiXyn5A, 137.17 cm-1mM-1.

Evaluation of xylanase activity on agar plates. The hydrolytic activity of

PbXyn5A with plant polysaccharides as substrate was assessed by using a

Congo red based assay adapted from Teather and Wood (40). Five microliters of

PbXyn5A (10 µM stock) was spotted onto an agarose plate (1% wt/vol) infused with either wheat arabinoxylan (WAX), carboxymethyl cellulose (CMC), locust bean gum (LBG), or lichenan (LIC), each at 1% wt/vol and incubated at 37 °C.

After 15 h, the plates were stained with Congo red (0.1% wt/vol) for 10 min, and then destained with NaCl (1 M).

169

Hydrolysis of xylohexaose and wheat arabinoxylan. PbXyn5A (2 µM, final concentration) was incubated with xylohexaose (1% wt/vol, final concentration) in citrate buffer (50 mM sodium citrate, 150 mM NaCl, pH 5.5) at

37 °C and at regular time intervals (0, 0.5, 1, 3, 5, 10, 20, 30, 60 min) 1 µL aliquots were removed and the products of hydrolysis were analyzed by thin layer chromatography (TLC) as described below.

To evaluate the capacity of PbXyn5A, BeXyn5A, and BiXyn5A to hydrolyze a longer polysaccharide chain such as xylan, each enzyme (0.5 µM, final concentration) was incubated with soluble wheat arabinoxylan (1% wt/vol) in citrate buffer (500 µL, final volume) at 37 °C. After 15 h, the concentration of reducing ends were estimated using the para-hydroxybenzoic acid hydrazide

(PAHBAH) assay as described previously (25) with glucose as a standard. For qualitative identification of the hydrolysis products, the reactions were resolved by thin layer chromatography (TLC) as described previously (14), except that the mobile phase used consisted of n-butanol:acetic acid:H2O, 10:5:1 (vol/vol/vol) and smaller 10 cm x 20 cm plates were used. For more quantitative analysis of the products of hydrolysis, the wheat arabinoxylan hydrolysates were diluted 50- fold or 200-fold in H2O depending on the monosaccharide concentrations, and the samples were analyzed by high performance anion-exchange chromatography (HPAEC).

For HPAEC analyses, 100 µL of each diluted sample was injected onto a

System Gold HPLC instrument from Beckman Coulter (Fullerton, CA) equipped with CarboPac™ PA1 guard (4 x 50 mm) and analytical (4 x 250 mm) columns

170

from Dionex Corporation (Sunnyvale, CA) and a Coulochem III electrochemical

detector from ESA Biosciences (Chelmsford, MA). Xylose, arabinose, and xylo-

oligosaccharides (X2-X6) were used as standards. For quantification of arabinose and xylose concentrations in the samples, calibration curves were generated with known concentrations of xylose and arabinose.

Site-directed mutagenesis. Mutagenesis was performed by use of the

QuikChange site-directed mutagenesis kit from Stratagene (La Jolla, CA). First, mutagenic primers were engineered with the desired mutation in the center of the

primer and ~15 bases of correct sequence on either side (Table 4.1). Reaction

mixtures were prepared according to the manufacturer’s instructions, with either

pET46b-Pbxyn5A, pET46b-Bexyn5A, or pET46b-Bixyn5A as the DNA template

for the generation of PbXyn5A, BeXyn5A, or BiXyn5A mutants, respectively.

After cycling of the reaction mixture 18 times in a PCR thermal cycler, the mixture

was digested with DpnI, and the resulting DNA was transformed into

electrocompetent E. coli JM109 cells by electroporation. The mutant genes were

sequenced to ensure that the appropriate mutations were introduced, while the

rest of the gene sequences remained unchanged. Expression and purification of

the mutant recombinant proteins were performed as described above for wild-

type (WT) PbXyn5A, BeXyn5A, and BiXyn5A.

171

4.4 RESULTS

Transcriptional analysis of P. bryantii B14 using microarrays and

RNAseq. P. bryantii B14 was cultured with either wheat arabinoxylan (WAX) or a

mixture of xylose and arabinose (XA) in the same proportion as they exist in

WAX (59:41, xylose:arabinose) and RNA was extracted at mid-log phase of

growth. RNA samples derived from two biological replicates for each growth

condition were analyzed by both Illumina RNAseq and by using DNA microarrays.

To assess the reproducibility of the microarray and RNAseq analyses, the

expression levels for all annotated genes in the partial genome sequence of P.

bryantii B14 were compared for the two biological replicates using both

technologies. These analyses revealed a high correlation in the expression level

for each gene in the two biological replicates using both DNA microarray and

RNAseq technologies (Figure 4.1). These results suggest that the biological

reproducibility in the gene expression levels was very high for both the

microarray and RNAseq analyses.

A major limitation to DNA microarrays is their relatively small dynamic range for accurately capturing the transcriptional level of highly and lowly expressed genes (42). Conversely, RNAseq has a very large dynamic range (31,

32) and comparison between RNAseq and qPCR has revealed that RNAseq is

very accurate for quantifying gene expression (32). To evaluate how well data

derived from DNA microarrays and RNAseq analyses agree, the expression level

for each gene was compared for each sample using both technologies. These

results revealed that there was a positive correlation in the expression of genes

172

identified using the two different techniques, however this correlation was low

(WAX1, R2=0.649; WAX2, R2=0.662; XA1, R2=0.663; XA2, R2=0.663). The low

correlation between microarray and RNAseq analyses has been documented

before and it is predicted that this may arise due to the fact that microarrays lack

sensitivity for genes expressed at low and high levels (42). Because of the

reported higher dynamic range for gene expression using RNAseq, the

subsequent analyses of transcriptional regulation focused on the results from the

RNAseq experiments rather than the DNA microarray data.

The Illumina reads were assembled onto the reference genome for P.

bryantii B14 and a transcriptome map was generated for the bacterium during

growth with each carbohydrate source. A representative transcriptome map is

shown in Figure 4.2 where the number of assembled reads for each nucleotide position within the genome is plotted on the inner circle. The most highly expressed genes corresponded to fragments of the ribosomal RNA genes (16S and 23S) indicated by the most prominent peak on the transcriptome map and

five of the smaller peaks. The high abundance of reads aligning to rRNA genes

indicates that the removal of ribosomal RNA from the sample using the

Ribominus kit was incomplete. In addition to the ribosomal RNA genes, the other

most highly expressed genes included genes important for housekeeping

functions in the cell such as fatty acid biosynthesis (acyl carrier protein,

ORF0076; FabF, ORF0077), ribosome biogenesis (small and large subunit

ribosomal proteins: ORF2542, ORF2216, ORF1370, ORF1137, ORF1214,

ORF1368, ORF0740, ORF2553, ORF0177, ORF1421, ORF1369), and glycolysis

173

(fructose-bisphosphate aldolase, ORF2552; glyceraldehyde-3-phosphate dehydrogenase, ORF2095) (Table 4.2).

Identification of genes regulated at the transcriptional level during growth of P. bryantii B14 with WAX compared to XA. To provide insight into the transcriptional regulation of xylanolytic genes we compared the transcriptomes for P. bryantii B14 grown with either soluble wheat arabinoxylan

(WAX) or a mixture of xylose and arabinose (XA) as the sole carbohydrate source. During growth with WAX relative to XA, approximately 57 genes were induced greater than 4-fold and 32 genes were repressed greater than 4-fold

(Baggerly's test (3), P < 0.05) using RNAseq analysis (Tables 4-3 and 4-4).

Functional annotation of the induced genes revealed that the majority code for hypothetical proteins, although glycoside hydrolases (GHs) and membrane transporters were also highly enriched and a few additional genes with predicted functions involved with conjugative transposition, protein degradation and carbohydrate ester hydrolysis were also identified (Table 4.3). The GH genes that were highly expressed had predicted functions associated with the hydrolysis of xylan including endo-xylanases (5 genes), β-xylosidases (4 genes), and arabinofuranosidases (2 genes). These results suggest that P. bryantii B14 elaborates a specific transcriptional response in the presence of xylan that induces expression of a subset of genes related to xylan degradation and utilization. Of the 57 induced genes, 77% code for proteins with putative signal sequences as predicted by using the SignalP server (16) (Table 4.3), suggesting

174

that the metabolic repertoire induced during growth of P. bryantii B14 on WAX is polarized towards the periplasm or the extracellular space.

To improve annotation of genes which may encode xylanolytic enzymes, the 57 highly induced genes were analyzed using BLASTp to catalogue genes into different carbohydrate active enzyme (CAZy) families (http://www.cazy.org/).

Of the 57 highly induced genes, 19 clustered into CAZy families with 14 grouped into GH families, three grouped into carbohydrate esterase (CE) families, and two genes predicted to encode polypeptides harboring both GH and CE domains

(Figure 4.3). The most highly represented CAZy family in the subset of induced genes was GH 43 and the top six most highly induced genes included members from GH 10 (ORF1893, ORF1909), GH 31 (ORF2004), GH 43 (ORF1908,

ORF2351), and GH 97 (ORF2350). The trends in these gene expression patterns were confirmed by both DNA microarray and RNAseq approaches (Figure 4.3).

The genes that were repressed during growth on WAX compared to XA were highly enriched for hypothetical genes, glycoside hydrolase genes, and transporters (Table 4.4). The subset of glycoside hydrolase genes that were repressed had predicted functions associated with the depolymerization of polysaccharides such as arabinan, galacturonan, cellulose, and starch.

Furthermore, SusC and SusD homologs were also found to be among the group of repressed genes. These results suggest that during growth on WAX a transcriptional program is initiated which represses the expression of genes involved with degradation of non-xylan containing polysaccharides.

175

Identification of the major xylanolytic gene cluster in P. bryantii B14.

Five of the most highly induced genes clustered together near the edge of a

single contiguous DNA sequence (contig 17) in the partial genome sequence of

P. bryantii B14. The gene which was positioned closest to the end of the contig

(ORF1897) appeared to be truncated at the N-terminal region relative to other

homologous proteins within the GenBank database (data not shown). In order to

evaluate whether the transcriptomic data may aid in completing the sequence for this highly expressed gene, a de novo assembly of the RNAseq data was constructed and the resulting contigs were assembled with the genomic contigs.

Within the assembled transcriptome, a large contig (contig 1388, 12,646 bp) was identified which exhibited overlap with the ends of two genomic contigs (contig 17,

8,656 bp overlap; contig 19, 3,875 bp overlap). These data suggest that contigs

17 and 19 are closely positioned on the P. bryantii B14 chromosome and are

separated by a 104 bp gap (Figure 4.4). To test the results from this assembly,

an oligonucleotide primer set was designed to amplify a 746 bp region that

should include the sequence spanning the predicted gap between these two

contigs. Using this primer set, a single DNA fragment was amplified from P.

bryantii B14 genomic DNA, cloned into a plasmid vector and both DNA strands

were sequenced. Analysis of this sequence revealed that it was continuous

between contigs 17 and 19 and confirmed the result of the transcriptome

assembly in which a 104 bp gap was identified and sequenced between the ends

of the two contigs. These results allowed the two xylanolytic gene clusters to be

176

unequivocally mapped to a single cluster on the P. bryantii B14 genome (Figure

4.5).

Of the 20 most highly induced genes in P. bryantii B14, 12 of them

mapped to this major xylanolytic gene cluster (Table 4.3, Figure 4.5). Functional

annotation of the genes encoded within this gene cluster identified a central gene

(XynR, ORF1907) predicted to encode a hybrid two-component system (HTCS)

regulator flanked by two groups of xylanolytic genes (Figure 4.5). The genes

upstream of XynR are divergently oriented and include a putative bi-functional β-

xylosidase/esterase (ORF1906), and an operon which contained two pairs of

SusC-SusD (ORF1905-ORF1897 and ORF1896-ORF1895) genes in tandem

followed by a hypothetical protein (ORF1894) then an endo-xylanase gene

(ORF1893, Xyn10C). The genes downstream of XynR are also divergently

oriented relative to XynR and include a putative β-xylosidase (ORF1908, XynB) preceded by an endoxylanase gene (ORF1909, Xyn10A), a sugar transporter

(ORF1910, XynD), and a putative esterase gene (ORF1911, XynE). Further downstream from this cluster relative to XynR is a putative α-glucuronidase gene

(ORF1912) which is in the same orientation as XynR. The RNA coverage was higher for the WAX grown cultures relative to the XA grown cultures for all of the genes in this region except for XynR and ORF1913 (Figure 4.5). This result indicates that expression of these genes is strongly induced in the WAX grown cultures relative to the XA grown cultures.

The RNA coverage is continuous throughout ORF1911-ORF1908 which is highly suggestive that these four genes are co-transcribed within a single

177

messenger RNA molecule (Figure 4.5). Furthermore, the RNA coverage is continuous throughout ORF1905-ORF1893, which indicates that these six genes may also be co-transcribed within a single mRNA molecule (Figure 4.5). The

RNA coverage for genes in the putative operon containing ORF1911-ORF1908 was not consistent throughout the entire operon. Specifically, the RNA coverage for ORF1908 and ORF1909 was much higher than for ORF1910 and ORF1911, which may suggest that there are multiple promoter elements present within this cluster and that the RNA coverage represents sequence data resulting from mRNA molecules with different 5′ ends.

Fluctuations in the RNA coverage were evident throughout the length of several of the genes in this cluster, most notably ORF1906 and ORF1905. These fluctuations were observed in both of the biological replicates (WAX1 and WAX2) which provides evidence that the observed patterns in RNA coverage were reproducible. For ORF1906, the RNA coverage was higher at the 5′ end of the gene, then gradually became lower near the center of the gene and increased again around the distal 1/3 of the gene (Figure 4.5). For ORF1905, the RNA coverage was much higher at the 5′ end of the gene and gradually became lower over the remaining portion of the gene. These intragenic variations in coverage may reflect different susceptibilities of regions within the transcripts to nucleolytic cleavage by native enzymes.

Prevotella bryantii B14 ORF0150 encodes a novel glycoside hydrolase family 5 endo-xylanase. Of the 57 genes which were induced

178

greater than four-fold during growth on WAX relative to XA, 19 were predicted to code for proteins of unknown function (hypothetical proteins). Two of these genes (ORF0150 and ORF0336) exhibited homology at the amino acid level to each other (50% identity, 415 residues aligned) and homology to enzymes from the glycoside hydrolase family 5 as later illustrated with an example from a

Bacillus protein. Of these two genes, ORF0150 was selected for further analysis, since this gene was more highly induced on WAX relative to XA in the transcriptional study. For ORF0150, the most similar enzyme with a biochemically defined function in the GenBank database is the alkaline endoglucanase enzyme from Bacillus sp. KSM-635 (33), although the amino acid conservation between these two proteins is low (27% identity over 207 amino acids aligned). These results suggested that ORF0150 may encode for a glycoside hydrolase family 5 enzyme. To test this possibility, the gene was amplified by PCR and cloned into an expression vector for recombinant protein production in E. coli. The gene product of ORF0150 was made as an N-terminal hexa-histidine fusion protein to facilitate purification by metal affinity chromatography. The predicted molecular weight for the hexa-histidine fusion protein, his-PbXyn5A, is 79 kDa, which is in agreement with the size of the purified protein estimated by comparison with molecular weight markers through

SDS-PAGE analysis (Figure 4.6A).

To evaluate whether ORF0150 possesses enzymatic activity, agar plates containing 0.1% wt/vol of wheat arabinoxylan (WAX), carboxymethyl cellulose

(CMC), locust bean gum (LBG), or lichenan (LIC) were prepared. Purified

179

PbXyn5A (50 pmol) was then spotted on each plate and incubated overnight at

37 °C. Upon staining of agar plates containing WAX with Congo red followed by destaining with 1 M NaCl, a zone of Congo red exclusion, representing hydrolysis of the polysaccharide substrate, was observed (Figure 4.6B). In contrast, clearing zones were not seen in the case of the plates containing CMC, LBG, or LIC.

These results suggested that ORF0150 possesses xylanase activity, but not endoglucanase, mannanase, nor lichenenase activities. Since this gene is the first glycoside hydrolase family 5 endo-xylanase enzyme to be identified in P. bryantii B14, the gene was designated, xyn5A.

Endo-β-1,4-xylanase enzymes catalyze the hydrolysis of β-1,4 linkages between xylose units within the backbone of xylan polysaccharides. To further verify the endo-xylanase activity of PbXyn5A, the enzyme was incubated with xylohexaose and aliquots were removed and analyzed by thin layer chromatography (TLC) at specific time intervals. During the course of the reaction, products corresponding to xylobiose, xylotriose, and xylotetraose accumulated (Figure 4.6C). This result further supports the assignment of endoxylanase activity to PbXyn5A. To evaluate the size of xylose products that

Xyn5A releases from xylan, the enzyme was incubated with WAX and the products were resolved by TLC and the reducing end equivalents were analyzed.

In the absence of the enzyme, no depolymerization of the substrate was evident

(Figure 4.6D, lane 3), however, when the enzyme was incubated with the substrate, a smear of oligosaccharides was apparent near the bottom of the TLC plate (Figure 4.6D, lane 4). Furthermore, the concentration of reducing sugars in

180

the reaction increased approximately 9-fold following the addition of PbXyn5A

(Figure 4.6E), indicating depolymerization of the xylan substrate. These results suggested that PbXyn5A releases long oligosaccharides that are not efficiently resolved by thin layer chromatography.

A definitive function of endo-xylanases is their capacity to function synergistically with α-L-arabinofuranosidases and β-D-xylosidases to improve the release of monosaccharides from xylan substrates. To evaluate whether

PbXyn5A exhibits synergistic activity, the enzyme was incubated independently or in combination with an α-L-arabinofuranosidase (Ara43A) and a β-D- xylosidase (Xyl3B) in the presence of soluble wheat arabinoxylan (WAX) and the products of hydrolysis were resolved by high performance anion exchange chromatography (HPAEC). Xylo-oligosaccharides (X2-X6) were used as standards to identify retention times and the concentrations of xylose and arabinose in the enzyme reaction mixtures were determined by comparison of the peak areas with a standard curve generated with known concentrations of xylose or arabinose. Ara43A is a GH family 43 arabinoxylan arabinofuranohydrolase from P. bryantii B14 which is encoded by ORF2351 and was cloned in our laboratory (Moon et al., manuscript in preparation). Xyl3B is a

GH family 3 β-D-xylosidase enzyme from P. bryantii B14 which hydrolyzes the β-

1,4 linkage of xylo-oligosaccharides (13). In the absence of all enzymes, neither monomers nor oligosaccharides were detectable in the substrate (Figure 4.7).

When WAX was incubated with Ara43A, a sizeable amount of arabinose was detected (Figure 4.7A). When WAX was incubated with Xyl3B, small amounts of

181

arabinose and xylose were detected (Figure 4.7). The release of both xylose and

arabinose by Xyl3B is consistent with previous results which demonstrated

activity with both para-nitrophenyl (pNP) α-L-arabinofuranoside and pNP-β-D-

xylopyranoside (13). When WAX was incubated with PbXyn5A, no monosaccharides were detected (Figure 4.7A), however several peaks were present that eluted after the times observed for xylobiose. These peaks did not exhibit similar retention times with any of the purified xylo-oligosaccharide

standards which may suggest that these peaks represent xylan fragments

decorated with arabinose units for which no standards are commercially available.

Incubation of WAX with both PbXyn5A and Xyl3B resulted in an increase in the

concentrations of xylose and arabinose over either PbXyn5A or Xyl3B alone

(Figure 4.7A). When PbXyn5A and Ara43A were incubated together with WAX,

the concentrations of xylose and arabinose were similar as compared to Ara43A

alone (Figure 4.7A), however the pattern of xylo-oligosaccharides was different

compared to PbXyn5A alone with peaks appearing which exhibited similar

retentions times to the xylo-oligosaccharide standards (X2-X6) (Figure 4.7B).

These data revealed that PbXyn5A and Ara43A do not exhibit synergism in the

release of xylose or arabinose, but that they do exhibit synergism in the release

of xylo-oligosaccharides. When all three enzymes were incubated together, the

amount of xylose released was increased relative to any of the other enzyme

combinations, whereas the level of arabinose was similar to Ara43A alone.

Furthermore, no peaks that corresponding to the xylo-oligosaccharide standards

were identified.

182

These experiments revealed that PbXyn5A is an endo-xylanase enzyme

that releases xylo-oligosaccharides to which arabinose units may be appended.

These data also show that PbXyn5A functions synergistically with a bi-functional

β-D-xylosidase/α-L-arabinofuranosidase enzyme to release xylose and arabinose,

and functions synergistically with an arabinoxylan arabinofuranohydrolase

enzyme to release xylo-oligosaccharides from wheat arabinoxylan.

Bacteroides eggerthii ORF1299 and Bacteroides intestinalis

ORF4213 each encode glycoside hydrolase family 5 endo-xylanases. A

BLASTp search of the GenBank non-redundant (nr) database using PbXyn5A as the query revealed that the most closely related proteins derive from members of the human colonic Bacteroidetes (Table 4.5). These proteins do not have biochemically defined functions, therefore to assess whether these proteins may also encode endo-xylanase enzymes, the Bacteroides eggerthii ORF1299 and

Bacteroides intestinalis ORF4213 were expressed for biochemical

characterization. These two genes were amplified from genomic DNA and cloned

into an expression vector for recombinant protein expression in E. coli. The two

proteins were expressed recombinantly as hexahistidine fusion proteins and

were purified by metal affinity chromatography. The predicted molecular weights

for the hexa-histidine fusion proteins his-BeXyn5A and his-BiXyn5A are 72 kDa and 73 kDa, which are in agreement with the sizes of the purified proteins estimated by comparison with molecular weight markers through SDS-PAGE analysis (Figure 4.8A).

183

To assay for endo-xylanase activity, the two proteins were incubated with

WAX and the products were resolved by TLC and the reducing end equivalents were analyzed. In the absence of either enzyme, no depolymerization of the substrate was evident (Figure 4.8B, lane 3), however, when the enzymes were independently incubated with the substrate, oligosaccharides were released

(Figure 4.8B, lane 4-5). For BeXyn5A, spots appeared on the TLC plate which migrated similar distances as compared to the xylotriose and xylotetraose standards, although an additional spot which migrated just below the xylobiose standard was also present (Figure 4.8B, lane 4). A similar pattern was observed for BiXyn5A, except that an additional spot was observed which migrated to a distance similar but not equivalent to the xylose standard. Furthermore, these two enzymes released a large amount of reducing sugars as demonstrated in figure 4.8C. Taken together, these results reveal that BeXyn5A and BiXyn5A both encode functional endo-xylanase enzymes.

To provide insight into the products that are released from WAX by

BeXyn5A and BiXyn5A and to evaluate whether these enzymes also function synergistically with Ara43A and Xyl3B, the enzymes were incubated alone or in combination with Ara43A and Xyl3B and the hydrolysates were analyzed by

HPAEC. The patterns of hydrolysis for Ara43A and Xyl3B were identical to those described previously (Figure 4.7A). Following incubation of WAX with BeXyn5A, arabinose was released (Figure 4.9A), and a mixture of oligosaccharides including xylobiose and xylotetraose as well as a number of other oligosaccharides for which no standards are available were also detected (Figure

184

4.9B). Incubation of WAX with both BeXyn5A and Xyl3B resulted in an increase

in the concentration of xylose over either BeXyn5A or Xyl3B alone (Figure 4.9A),

and the oligosaccharide pattern was different than BeXyn5A alone with no

xylobiose or xylotetraose being detected (Figure 4.9B). When BeXyn5A and

Ara43A were incubated together with WAX, the concentration of arabinose

released was higher than the sum of each enzyme alone (Figure 4.9A) and xylo-

oligosaccharides (X2-X6) were clearly visible. These results demonstrate that

BeXyn5A functions synergistically with Xyl3B to release xylose from WAX, and that BeXyn5A synergizes with Ara43A to release arabinose and xylo- oligosaccharides from WAX. The result that BeXyn5A releases both oligosaccharides and arabinose from WAX indicates that this enzyme is a bi-

functional endo-xylanase/arabinofuranosidase.

When BiXyn5A was incubated with WAX, no monosaccharides were released (Figure 4.9C), however larger oligosaccharides were present in the hydrolysates which included xylobiose and xylotetraose (Figure 4.9D). Incubation of WAX with both BiXyn5A and Xyl3B resulted in an increase in the concentration of xylose over either BiXyn5A or Xyl3B alone (Figure 4.9C), and the oligosaccharide pattern was different than BiXyn5A alone with no xylobiose or xylotetraose being detected (Figure 4.9D). The level of arabinose released by

BiXyn5A and Ara43A in combination was similar to Ara43A alone (Figure 4.9C), however in the presence of both enzymes, xylo-oligosaccharides (X2-X6) were

released (Figure 4.9D). This indicated that BiXyn5A functions synergistically with

Ara43A to improve the release of unbranched xylo-oligosaccharides, however,

185

no synergism was observed between the two enzymes in the release of

arabinose. These results indicate that Bacteroides intestinalis ORF4213 encodes

a functional endo-xylanase enzyme.

Although both BeXyn5A and BiXyn5A have endo-xylanase activity, the

oligosaccharide patterns in the hydrolysates differ. Moreover, BeXyn5A releases

almost twice the amount of arabinose and xylose when co-incubated with Ara43A and Xyl3B relative to BiXyn5A. This observation clearly shows that although the two enzymes encode endo-xylanases within the same GH family, they release different products from arabinoxylan.

Mutational analyses reveal conserved active site residues in

PbXyn5A, BeXyn5A, and BiXyn5A. Proteins in the GenBank database which

are most closely related to PbXyn5A, BeXyn5A, and BiXyn5A at the amino acid

level are classified within the GH family 5. This family of enzymes uses a

conserved set of active site residues to catalyze the hydrolysis of glycosidic

linkages. To test the prediction that these three enzymes are GH family 5

enzymes and that they employ a similar mechanism for hydrolysis as other

biochemically characterized GH 5 enzymes, we probed for conserved amino acid

sequence motifs by using position specific iterative BLAST (PSI-BLAST) (1). This

analysis revealed the presence of two amino acid motifs which contained the

putative glutamic acid catalytic acid/base (Figure 4.10A, motif 2) and the putative

glutamic acid catalytic nucleophile (Figure 4.10A, motif 3). An additional motif

was identified (Figure 4.10A, motif 1) which contained an aspartic acid in

186

PbXyn5A, BeXyn5A, and BiXyn5A that aligned with a conserved tyrosine residue that was shown to make a contact to the cellobiose ligand in the crystal structure for the alkaline cellulase K from Bacillus sp. Strain KSM-635 (37). To evaluate whether these three carboxylic acid residues were important for catalysis for

PbXyn5A, BeXyn5A, and BiXyn5A, each residue was independently mutated to alanine and the mutant proteins were expressed and purified.

Mutation of the conserved aspartic acid residue to alanine (PbXyn5A,

D104; BeXyn5A, D129; BiXyn5A, D127) resulted in small decreases in the reducing sugars released with PbXyn5A and BeXyn5A whereas no significant effect was observed for BiXyn5A (Figure 4.10B-C). These results indicate that

D104 and D129 may contribute to the endo-xylanase activity in PbXyn5A and

BeXyn5A, respectively, but the corresponding residue in BiXyn5A is not critical for the endo-xylanase activity of this enzyme. When either the putative catalytic acid/base residue (PbXyn5A, E203; BeXyn5A, E229; BiXyn5A, E227) or the putative nucleophile (PbXyn5A, E308; BeXyn5A, E349; BiXyn5A, E3477) was mutated to alanine, the activity for all three enzymes was attenuated to the level observed in the control reaction (Figure 4.10B-C). These data reveal that the two glutamic acid residues targeted for mutagenesis are important for endo-xylanase activity in PbXyn5A, BeXyn5A, and BiXyn5A. This finding supports the prediction that these three proteins are GH family 5 enzymes and that they employ similar amino acid residues in catalysis as described for other GH 5 enzymes.

187

4.5 DISCUSSION

Xylans are an abundant group of non-digestible plant polysaccharides

which are fermented by commensal microbes within the rumen (10, 11, 21) and

the human gut (7-9, 26, 34, 35, 43). Despite the abundance of xylanolytic

bacteria in these microbial communities, relatively little is known about the

mechanisms which underlie the degradation and utilization of xylan by these

bacteria.

In the current study, a whole genome transcriptomic approach was

employed to evaluate the repertoire of genes that the rumen hemicellulolytic

bacterium, Prevotella bryantii B14 employs to degrade xylan. The regulation of xylanase activity has been studied previously in P. bryantii B14 and it was found

that xylanase activity is not induced by glucuronic acid, arabinose, xylose or

small xylo-oligosaccharides (X2-X5) (27). Rather, the major inducers of xylanase

activity were found to be medium- to large-sized xylo-oligosaccharides. Our data

confirm the results from this previous study and reveal that the two previously

studied endo-xylanase genes (Xyn10A and Xyn10C) are highly induced during

growth on soluble wheat arabinoxylan (WAX) compared to the component

monosaccharides, xylose and arabinose (XA). While these genes were amongst

the top ten most highly induced genes, a large number of additional genes were

also induced under these conditions. This observation underscores the complexity in the transcriptional response of P. bryantii B14 during growth on this

arabinoxylan polysaccharide.

188

The major xylanolytic gene cluster in P. bryantii B14 identified in this study

contains a genomic DNA fragment that was previously cloned by Gasparic et al.

(19). The genes previously identified include open reading frames 1907 (XynR),

1908 (XynB), 1909 (Xyn10A), 1910 (XynD), 1911 (XynE), and 1912 (XynF),

although in the current study two genes (XynR and XynF) on either end of the

DNA fragment reported by Gasparic et al. were found to be truncated. A

translation start site (ATG) was identified for XynR (ORF1907) that occurs 1602

bp upstream of the start site predicted for this gene by Gasparic et al. and the

assignment of this alternative start site was corroborated by the RNAseq

coverage which is continuous throughout the entire length of ORF1907. This

result suggests that the xylan catabolism regulator (XynR) in P. bryantii B14 is

534 amino acids longer than originally reported. Analysis of the domain

architecture for this protein revealed three functional domains including a putative

N-terminal periplasmic sensing domain, followed by a histidine kinase domain

and a response regulator domain (Figure 4.11). Thus, this HTCS regulator

contains all of the regulatory elements found in ‘classical’ two-component

regulatory systems, but rather than encoding the sensor and effector domains in

separate genes as in the ‘classical’ system, XynR combines all of these functions

within a single polypeptide. The domain organization for XynR is analogous to

that for a HTCS regulator (BT3172) characterized from the related bacterium,

Bacteroides thetaiotaomicron VPI-5482 (38), however the sequence similarity across the entire polypeptide is relatively low (23%, 1405 amino acids aligned).

XynR was shown to be important for coordinating the transcription of a xylan

189

utilization gene cluster in response to growth on xylan in P. bryantii B14 (29), thus this protein may be the primary regulator responsible for the transcriptional response identified in this study. This hypothesis could be tested by constructing a P. bryantii B14 strain carrying a deletion in this gene, however the lack of a genetic system for manipulating P. bryantii B14 precludes the possibility of this

analysis at the current time.

The xylan utilization gene cluster previously characterized by Gasparic et

al. (19) was found to be part of a larger xylanolytic gene cluster which includes

the previously characterized endo-xylanase gene, xyn10C (17). Xyn10C occurs

in a group of six genes which the RNAseq data suggest are co-transcribed within

a single polycistronic messenger RNA molecule. This operon includes six of the

seven most highly induced genes during growth on WAX relative to XA, which

indicates that it is likely to be of critical importance to xylan utilization by P.

bryantii B14. Contained within this operon are two tandem repeats of genes

predicted to encode outer membrane proteins which share homology to the

starch utilization system (Sus) components, SusC and SusD, followed by a

hypothetical gene which is then followed by xyn10C. This arrangement of genes

is similar in certain respects to the starch utilization system identified by Salyers

and coworkers in Bacteroides thetaiotaomicron (36). SusC is predicted to encode

an outer membrane porin which transports oligosaccharides into the periplasm in

a TonB-dependent fashion. SusD harbors a signal peptidase II cleavage site

which may facilitate the tethering of this protein onto the outer leaflet of the outer

membrane where it may play a role in oligosaccharide binding (24). The

190

hypothetical protein and Xyn10C both possess putative signal peptidase II cleavage sites which suggests that these two proteins may also be tethered onto the outside surface of the cell. The function of the hypothetical protein

(ORF1894) is currently unknown, however its location within this highly induced xylan utilization operon suggests that it may be involved in the degradation and utilization of xylan. The gene was cloned, expressed and purified and exhibited a zone of Congo red exclusion following incubation on a WAX-infused agar plate

(data not shown). Further attempts to identify catalytic activity or binding activity to soluble wheat arabinoxylan did not yield any significant activity (data not shown). Analysis of the activity of this protein is an ongoing focus in our laboratory. These observations suggest that this cluster of six proteins may be critical for the binding, depolymerization, and transport of extracellular xylan fragments into the periplasmic space, although further functional studies must be performed to verify this prediction.

The arrangement of this operon is conserved amongst a number of organisms for which genome sequences are available including Prevotella ruminicola, Prevotella copri, Prevotella buccae, Prevotella bergensis, Bacteroides intestinalis, Bacteroides cellulosilyticus, Bacteroides sp. 2_2_4, Bacteroides ovatus, Bacteroides eggerthii, Bacteroides plebeius, and an additional

Bacteroidetes member, Spirosoma linguale which is commonly isolated from soil or freshwater (Figure 4.12). These organisms are members of the phylum

Bacteroidetes and, with the exception of S. linguale, they were isolated from the bovine rumen or the human alimentary tract (Figure 4.13). This is the first

191

evidence of a xylan utilization cluster which is strictly conserved across members of the rumen Prevotella spp. and the human-associated Bacteroides spp. and the conservation of this cluster is highly suggestive that xylanolytic members from these two bacterial genera employ a conserved mechanism to degrade and utilize plant-derived xylan polysaccharides. Apart from the high level of conservation in the orientation of genes within this operon, there are no other strictly conserved arrangements of genes in the regions nearby on the chromosome (Figure 4.12).

The majority of genes which are induced by P. bryantii B14 during growth on WAX compared to XA code for proteins with unknown functions. This observation underscores the fact that there are large gaps in our current understanding of how polysaccharides are metabolized by P. bryantii B14 and other gut-associated bacteria. Whole genome transcriptional profiling represents a useful approach to gain insight into the potential roles of genes with unknown functions. In the current study, a novel subset of genes related to the glycoside hydrolase family 5 has been identified and functional roles have been assigned to them based upon biochemical data. These genes are only found within certain members of the Bacteroidetes phylum which are resident within the alimentary tract of humans or ruminants. Most of these GH 5 genes occur near the conserved xylan utilization cluster on the genome (Figure 4.12) which suggests that they may be directly involved in the degradation of xylan. BeXyn5A and

BiXyn5A contain putative signal peptidase II cleave sites, suggesting that these

192

two proteins may be anchored on the outside of the cell and function coordinately

with the core xylan utilization cluster to degrade xylan.

The two Bacteroides proteins (BeXyn5A and BiXyn5A) share 78% identity

at the amino acid level and also share similar domain organizations with both

proteins possessing a C-terminal bacterial Ig2-like domain. The function of this

C-terminal extension is currently unknown, however the Ig2-like domain is

predicted to be involved in cell surface adhesion (22). Many xylan degrading

enzymes are associated with carbohydrate binding modules (CBMs) within a

single polypeptide, and it is possible that the Ig2-like domain serves a carbohydrate binding function. Further studies in our laboratory will focus on delineating the functional role of this domain in BeXyn5A and BiXyn5A.

Despite the fact that the three enzymes characterized in this study are clearly related to each other at the amino acid sequence level, they exhibit differences in the products that they release from WAX. The observation that

BeXyn5A (ORF1299, Table 4.5) synergizes with Ara43A, whereas BiXyn5A

(ORF4213, Table 4.5) and PbXyn5A (ORF0150, Table 4.5) do not exhibit synergy with Ara43A clearly indicates that there is a fundamental difference in the activities between these enzymes. BeXyn5A releases arabinose from WAX whereas PbXyn5A and BiXyn5A do not, thus it is possible that this additional arabinofuranosidase activity of BeXyn5A potentiates the observed synergism with Ara43A. Based on the predicted role of the C-terminal domain in carbohydrate binding as described above, we predict that changes in the amino

193

acid sequences within the GH 5 domain contribute to the different activities of

BeXyn5A and BiXyn5A.

Recently, a large number of genome sequences have been made available for human colonic bacteria through the human microbiome project (41).

These genome sequences provide a wealth of information on the gene content of commensal microbes within the human gut microbiome, however proper annotation of these genes is critical to interpreting the genomic data in terms of the metabolic repertoire of the microbial community. In the current study, transcriptional profiling of a rumen bacterium led to the assignment of function to a group of proteins within the rumen Prevotella spp. and human colonic

Bacteroides spp. which previously had no assigned functions. Furthermore, the current study has provided unique insight into conserved mechanisms for xylan utilization amongst members of the phylum, Bacteroidetes.

4.6 ACKNOWLEDGEMENTS

This research was supported by the Energy Biosciences Institute. DD was supported by an NIH NRSA fellowship (Fellowship no. 1F30DK084726). We thank Charles M. Schroeder, Shosuke Yoshida, Yejun Han, Michael Iakiviak, and

Xiaoyun Su of the Energy Biosciences Institute for valuable scientific discussions and Young-Hwan Moon and Hiroshi Miyagi for technical assistance. We also thank members of the North American Consortium for Genomics of Fibrolytic

Ruminal Bacteria for the partial genome sequence of P. bryantii B14 and Alvaro

194

Hernandez and Chris Wright of the W. M. Keck Center for Comparative and

Functional Genomics for assistance with Illumina sequencing.

4.7 REFERENCES

1. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389-3402.

2. Aziz, R. K., D. Bartels, A. A. Best, M. DeJongh, T. Disz, R. A. Edwards, K. Formsma, S. Gerdes, E. M. Glass, M. Kubal, F. Meyer, G. J. Olsen, R. Olson, A. L. Osterman, R. A. Overbeek, L. K. McNeil, D. Paarmann, T. Paczian, B. Parrello, G. D. Pusch, C. Reich, R. Stevens, O. Vassieva, V. Vonstein, A. Wilke, and O. Zagnitko. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75.

3. Baggerly, K. A., L. Deng, J. S. Morris, and C. M. Aldaz. 2003. Differential expression in SAGE: accounting for normal between-library variation. Bioinformatics 19:1477-1483.

4. Bakir, M. A., M. Kitahara, M. Sakamoto, M. Matsumoto, and Y. Benno. 2006. Bacteroides intestinalis sp. nov., isolated from human faeces. Int J Syst Evol Microbiol 56:151-154.

5. Bryant, M. P., N. Small, C. Bouma, and H. Chu. 1958. Bacteroides ruminicola n. sp. and Succinimonas amylolytica; the new genus and species; species of succinic acid-producing anaerobic bacteria of the bovine rumen. J Bacteriol 76:15-23.

6. Cantarel, B. L., P. M. Coutinho, C. Rancurel, T. Bernard, V. Lombard, and B. Henrissat. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-D238.

7. Chassard, C., and A. Bernalier-Donadille. 2006. H2 and acetate transfers during xylan fermentation between a butyrate-producing xylanolytic species and hydrogenotrophic microorganisms from the human gut. FEMS Microbiol Lett 254:116-122.

8. Chassard, C., E. Delmas, P. A. Lawson, and A. Bernalier-Donadille. 2008. Bacteroides xylanisolvens sp. nov., a xylan-degrading bacterium isolated from human faeces. Int J Syst Evol Microbiol 58:1008-1013.

195

9. Chassard, C., V. Goumy, M. Leclerc, C. Del'homme, and A. Bernalier- Donadille. 2007. Characterization of the xylan-degrading microbial community from human faeces. FEMS Microbiol Ecol 61:121-131.

10. Dehority, B. A. 1966. Characterization of several bovine rumen bacteria isolated with a xylan medium. J Bacteriol 91:1724-1729.

11. Dehority, B. A. 1967. Rate of isolated hemicellulose degradation and utilization by pure cultures of rumen bacteria. Appl Microbiol 15:987-993.

12. Dodd, D., and I. K. O. Cann. 2009. Enzymatic deconstruction of xylan for biofuel production. GCB Bioenergy 1:2-17.

13. Dodd, D., S. Kiyonari, R. I. Mackie, and I. K. Cann. 2010. Functional diversity of four glycoside hydrolase family 3 enzymes from the rumen bacterium, Prevotella bryantii B14. J Bacteriol 192: 2335-2345

14. Dodd, D., S. A. Kocherginskaya, M. A. Spies, K. E. Beery, C. A. Abbas, R. I. Mackie, and I. K. Cann. 2009. Biochemical analysis of a beta-D-xylosidase and a bifunctional xylanase-ferulic acid esterase from a xylanolytic gene cluster in Prevotella ruminicola 23. J Bacteriol 191:3328- 3338.

15. Edwards, J. E., N. R. McEwan, A. J. Travis, and R. J. Wallace. 2004. 16S rDNA library-based anlysis of ruminal bacterial diversity. Antonie van Leeuwenhoek 86:263-281.

16. Emanuelsson, O., S. Brunak, G. von Heijne, and H. Nielsen. 2007. Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2:953-971.

17. Flint, H. J., T. R. Whitehead, J. C. Martin, and A. Gasparic. 1997. Interrupted catalytic domain structures in xylanases from two distantly related strains of Prevotella ruminicola. Biochim Biophys Acta 1337:161- 165.

18. Gasparic, A., R. Marinsek-Logar, J. Martin, R. J. Wallace, F. V. Nekrep, and H. J. Flint. 1995. Isolation of genes encoding beta-D- xylanase, beta-D-xylosidase and alpha-L-arabinofuranosidase activities from the rumen bacterium Prevotella ruminicola B(1)4. FEMS Microbiol Lett 125:135-141.

19. Gasparic, A., J. Martin, A. S. Daniel, and H. J. Flint. 1995. A xylan hydrolase gene cluster in Prevotella ruminicola B(1)4: sequence relationships, synergistic interactions, and oxygen sensitivity of a novel enzyme with exoxylanase and beta-(1,4)-xylosidase activities. Appl Environ Microbiol 61:2958-2964.

196

20. Gill, S. C., and P. H. von Hippel. 1989. Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 182:319-326.

21. Hespell, R. B., R. Wolf, and R. J. Bothast. 1987. Fermentation of xylans by Butyrivibrio fibrisolvens and other ruminal bacteria. Appl Environ Microbiol 53:2849-2853.

22. Kelly, G., S. Prasannan, S. Daniell, K. Fleming, G. Frankel, G. Dougan, I. Connerton, and S. Matthews. 1999. Structure of the cell-adhesion fragment of intimin from enteropathogenic Escherichia coli. Nat Struct Biol 6:313-8.

23. Kent, W. J. 2002. BLAT--the BLAST-like alignment tool. Genome Res 12:656-664.

24. Koropatkin, N. M., E. C. Martens, J. I. Gordon, and T. J. Smith. 2008. Starch catabolism by a prominent human gut symbiont is directed by the recognition of amylose helices. Structure 16:1105-1115.

25. Lever, M. 1972. A new reaction for colorimetric determination of carbohydrates. Anal Biochem 47:273-279.

26. Mirande, C., E. Kadlecikova, M. Matulova, P. Capek, A. Bernalier- Donadille, E. Forano, and C. Bera-Maillet. 2010. Dietary fibre degradation and fermentation by two xylanolytic bacteria Bacteroides xylanisolvens XB1A(T) and Roseburia intestinalis XB6B4 from the human intestine. J Appl Microbiol.

27. Miyazaki, K., T. Hirase, Y. Kojima, and H. J. Flint. 2005. Medium- to large-sized xylo-oligosaccharides are responsible for xylanase induction in Prevotella bryantii B14. Microbiology 151:4121-4125.

28. Miyazaki, K., J. C. Martin, R. Marinsek-Logar, and H. J. Flint. 1997. Degradation and utilization of xylans by the rumen anaerobe Prevotella bryantii (formerly P. ruminicola subsp. brevis) B(1)4. Anaerobe 3:373-381.

29. Miyazaki, K., H. Miyamoto, D. K. Mercer, T. Hirase, J. C. Martin, Y. Kojima, and H. J. Flint. 2003. Involvement of the multidomain regulatory protein XynR in positive control of xylanase gene expression in the ruminal anaerobe Prevotella bryantii B(1)4. J Bacteriol 185:2219-2226.

30. Morrison, M., S. C. Daugherty, W. C. Nelson, T. Davidsen, and K. E. Nelson. 2009. The FibRumBa database: A resource for biologists with interests in gastrointestinal microbial ecology, plant biomass degradation, and anaerobic microbiology. Microb Ecol 59:212-213.

197

31. Mortazavi, A., B. A. Williams, K. McCue, L. Schaeffer, and B. Wold. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5:621-628.

32. Nagalakshmi, U., Z. Wang, K. Waern, C. Shou, D. Raha, M. Gerstein, and M. Snyder. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320:1344-1349.

33. Ozaki, K., S. Shikata, S. Kawai, S. Ito, and K. Okamoto. 1990. Molecular cloning and nucleotide sequence of a gene for alkaline cellulase from Bacillus sp. KSM-635. J Gen Microbiol 136:1327-1334.

34. Salyers, A. A., J. R. Balascio, and J. K. Palmer. 1982. Breakdown of xylan by enzymes from human colonic bacteria. J Food Biochem 6:39-56.

35. Salyers, A. A., F. Gherardini, and M. O'Brien. 1981. Utilization of xylan by two species of human colonic Bacteroides. Appl Environ Microbiol 41:1065-1068.

36. Shipman, J. A., J. E. Berleman, and A. A. Salyers. 2000. Characterization of four outer membrane proteins involved in binding starch to the cell surface of Bacteroides thetaiotaomicron. J Bacteriol 182:5365-5372.

37. Shirai, T., H. Ishida, J. Noda, T. Yamane, K. Ozaki, Y. Hakamada, and S. Ito. 2001. Crystal structure of alkaline cellulase K: insight into the alkaline adaptation of an industrial enzyme. J Mol Biol 310:1079-1087.

38. Sonnenburg, E. D., J. L. Sonnenburg, J. K. Manchester, E. E. Hansen, H. C. Chiang, and J. I. Gordon. 2006. A hybrid two-component system protein of a prominent human gut symbiont couples glycan sensing in vivo to carbohydrate metabolism. Proc Natl Acad Sci U S A 103:8834-8839.

39. Stevenson, D. M., and P. J. Weimer. 2007. Dominance of Prevotella and low abundance of classical ruminal bacterial species in the bovine rumen revealed by relative quantification real-time PCR. Appl Microbiol Biotechnol 75:165-174.

40. Teather, R. M., and P. J. Wood. 1982. Use of Congo red-polysaccharide interactions in enumeration and characterization of cellulolytic bacteria from the bovine rumen. Appl Environ Microbiol 43:777-780.

41. Turnbaugh, P. J., R. E. Ley, M. Hamady, C. M. Fraser-Liggett, R. Knight, and J. I. Gordon. 2007. The human microbiome project. Nature 449:804-810.

42. Wang, Z., M. Gerstein, and M. Snyder. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10:57-63.

198

43. Weaver, J., T. R. Whitehead, M. A. Cotta, P. C. Valentine, and A. A. Salyers. 1992. Genetic analysis of a locus on the Bacteroides ovatus chromosome which contains xylan utilization genes. Appl Environ Microbiol 58:2764-2770.

4.8 TABLES AND FIGURES

TABLE 4.1 Oligonucleotide primers used in this study. Primer Primer name a Orientation Sequence (5'→3') Cloningb P. bryantii B14 Forward 5'-GACGACGACAAGATGCAATATCGAACTCTAAAC-3' Xyn5A (ORF0150) Reverse 5'-GAGGAGAAGCCCGGTTATAAGCATTTATTGATCTTAAC-3' B. eggerthii Forward 5'-GACGACGACAAGATGGATGATTCGATACCGGAGAAGTTG-3' Xyn5A Reverse 5'-GAGGAGAAGCCCGGTCATTCCAAGTTTACCC-3' B. intestinalis Forward 5'-GACGACGACAAGATGGACGAGACTATTCCTGAGCAG-3' Xyn5A Reverse 5'-GAGGAGAAGCCCGGTTAATTTGCTAAGGAGACATTAGC-3' Sequencingb P1 Forward 5'-GACGACGACAAGATGGAGTGGTTTCATGGATCG-3' P2 Reverse 5'-GAGGAGAAGCCCGGTTACGGCATTACAAGCAAAG-3' c,d Mutagenesis PbXyn5A D104A Forward 5'-CCGCCTACATATGGCTCCCTGCTGGACGA-3' Reverse 5'-TCGTCCAGCAGGGAGCCATATGTAGGCGG-3' PbXyn5A E203A Forward 5'-CGGAATCGAACTGGCTAATGCACCTGTCAATGTACTTAATG- 3' Reverse 5'-CATTAAGTACATTGACAGGTGCATTAGCCAGTTCGATTCCG- 3' PbXyn5A E308A Forward 5'-TCCTATTATGATAACGGCGGTAGACTGGAGTCCAC-3' Reverse 5'-GTGGACTCCAGTCTACCGCCGTTATCATAATAGGA-3' BeXyn5A D129A Forward 5'-TGTACGAATGCACCTCGCTCCTTATTGGAGCGATG-3' Reverse 5'-CATCGCTCCAATAAGGAGCGAGGTGCATTCGTACA-3' BeXyn5A E229A Forward 5'-ATGTTCGAACTTGCCAATGCACCCGTCAAGATAAAAGG-3' Reverse 5'-CCTTTTATCTTGACGGGTGCATTGGCAAGTTCGAACAT-3' BeXyn5A E349A Forward 5'-CCTATCATGGTGACTGCAATAGACTGGGCGCCC-3' Reverse 5'-GGGCGCCCAGTCTATTGCAGTCACCATGATAGG-3' BiXyn5A D127A Forward 5'-GCGTATGCACCTCGCCCCGTATTGGAGTG-3' Reverse 5'-CACTCCAATACGGGGCGAGGTGCATACGC-3' BiXyn5A E227A Forward 5'-TGTTCGAGCTTGCCAATGCACCTGTCAGAATAAAAGG-3' Reverse 5'-CCTTTTATTCTGACAGGTGCATTGGCAAGCTCGAACA-3' BiXyn5A E347A Forward 5'-CCTATCATGGTGACTGCAATAGACTGGGCTCCG-3' Reverse 5'-CGGAGCCCAGTCTATTGCAGTCACCATGATAGG-3' a Oligonucleotide primers were synthesized by Integrated DNA Technologies (Coralville, IA). b Primers were designed to clone the respective gene in frame with an N-terminal hexahistidine fusion tag encoded by the plasmid vector, pET-46b using ligation independent cloning according to the Ek/LIC protocol from Novagen (San Diego, CA). Underlined sequence indicates sequence complementary to the plasmid vector. c Residues were selected for conservative mutagenesis due to their conservation as predicted by alignment of the primary amino acid sequences for family 5 glycoside hydrolases with biochemically defined catalytic activities, and primers were designed in accordance with the QuikChange protocol by Stratagene (La Jolla, CA). d Underlined sequence indicates engineered codon mutations.

199

TABLE 4.2 Top 30 most highly expressed genes in P. bryantii B14 when grown with WAX as carbohydrate source.a Gene expression Gene length No. of unique ORF Annotation (RPKM) (bp) gene reads 2685 Large Subunit Ribosomal RNA; lsuRNA; LSU rRNA 41,770.89 2896 278581 77 3-oxoacyl-[acyl-carrier-protein] synthase, KASII 20,703.64 1263 102198 889 hypothetical protein 20,418.88 204 16280 1334 Major outer membrane protein OmpA 15,216.99 1137 67621 2542 LSU ribosomal protein L19p 15,112.13 363 21440 1415 hypothetical protein 12,404.62 270 13090 76 Acyl carrier protein 12,046.08 237 11158 2216 SSU ribosomal protein S16p 11,855.94 537 24883 2678 Large Subunit Ribosomal RNA; lsuRNA; LSU rRNA 11,714.45 2385 105557 2697 Large Subunit Ribosomal RNA; lsuRNA; LSU rRNA 11,107.11 2912 61301 2599 Large Subunit Ribosomal RNA; lsuRNA; LSU rRNA 10,819.05 2910 32826 2095 NAD-dependent glyceraldehyde-3-phosphate dehydrogenase 8,638.67 1029 34742 2661 Large Subunit Ribosomal RNA; lsuRNA; LSU rRNA 8,272.28 1493 326 200 1136 Integrase, site-specific recombinase 8,039.67 882 27714

2552 Fructose-bisphosphate aldolase class II 7,260.81 1014 28775 1370 LSU ribosomal protein L9p 7,036.76 492 13531 1135 Sigma-54 modulation protein 6,770.65 342 9050 1309 hypothetical protein 6,569.17 1017 26111 1137 SSU ribosomal protein S21p 6,507.19 192 4883 1305 hypothetical protein 6,456.78 408 10296 1214 SSU ribosomal protein S20p 6,062.45 255 6042 1368 SSU ribosomal protein S6p 5,991.09 342 8008 740 LSU ribosomal protein L31p 5,691.94 252 5606 2608 Small Subunit Ribosomal RNA; ssuRNA; SSU rRNA 5,655.85 1486 15437 1310 NifU-related domain containing protein 5,346.15 702 14668 1690 DNA-binding protein HU-β 5,220.93 348 7101 2553 LSU ribosomal protein L17p 5,081.32 477 9473 177 LSU ribosomal protein L28p 4,751.88 264 4903 1421 SSU ribosomal protein S1p 4,734.33 1782 32973 1369 SSU ribosomal protein S18p 4,673.95 273 4987 a RPKM is defined as the number of reads per kilobase of gene per million mapped reads (31).

TABLE 4.3 Genes induced greater than four-fold during growth of P. bryantii a B14 with WAX as compared to XA. Fold Signal ORF Annotationb change P-valuec sequenced 1894 hypothetical protein 402.56 1.25E-13 Yes putative outer membrane protein involved in nutrient Yes 1905 binding (SusC) 371.44 3.10E-20 1893 endo-1,4-β-xylanase A precursor 321.68 3.55E-27 Yes 2351 β-xylosidase 314.24 3.50E-136 Yes putative outer membrane protein involved in nutrient Yes 1895 binding (SusD) 300.94 5.73E-09 putative outer membrane protein involved in nutrient Yes 1896 binding (SusC) 249.31 4.86E-20 putative outer membrane protein involved in nutrient e No 1897 binding (SusD) 228.19 1.66E-11 2004 α-xylosidase 114.36 6.16E-34 Yes 1909 endo-1,4-β-xylanase A precursor 113.41 2.10E-32 Yes 1908 β-xylosidase / Alpha-N- arabinofuranosidase 78.05 2.18E-245 No 2350 maltodextrin glucosidase 77.92 2.12E-16 Yes 150 hypothetical protein 53.53 1.24E-56 Yes 336 hypothetical protein 30.32 8.89E-27 Yes GPH family transporter in unknown oligosaccharide No 1910 utilization 30.29 6.70E-06 1911 putative secreted protein 30.1 2.37E-78 Yes 1928 hypothetical protein 24.73 2.35E-09 yes 284 conjugative transposon protein TraE 21.65 1.03E-05 Yes 96 hypothetical protein 20.54 7.03E-05 No possible β-xylosidase, family 43 of glycosyl Yes 1906 hydrolases 20.28 2.94E-279 2163 hypothetical protein 19.58 4.28E-08 Yes putative outer membrane protein involved in nutrient Yes 1122 binding (SusC) 15.69 2.41E-55 2523 hypothetical protein 14.37 0.03 Yes 1927 hypothetical protein 13.84 1.08E-20 Yes 1917 glucan 1,4-β-glucosidase 13.7 1.47E-46 Yes 99 hypothetical protein 13.44 8.08E-04 Yes 2164 iron-regulated protein A precursor 13.03 2.79E-31 Yes 101 hypothetical protein 12.59 1.82E-03 No 316 endo-1,4-β-xylanase D precursor 11.85 1.74E-211 Yes putative outer membrane protein involved in nutrient Yes 1123 binding (SusD) 11.83 5.78E-57 1912 α-glucuronidase 9.54 1.17E-63 Yes 2003 xylosidase/arabinosidase 9.25 5.18E-06 Yes 1114 endo-1,4-β-xylanase A precursor 9.2 1.35E-64 Yes 2165 lipoprotein, putative 9.03 2.04E-42 Yes 2527 hypothetical protein 7.49 3.85E-11 Yes 285 hypothetical protein 7.37 0.05 No 315 endo-1,4-β-xylanase A precursor 7.35 1.15E-50 Yes 282 Conjugative transposon protein TraG 6.91 1.70E-04 Yes 2166 putative cytochrome C-type biogenesis protein 6.29 6.73E-04 Yes 1456 hypothetical protein 6.13 0.05 No 2316 hypothetical protein 6.05 0.04 No 320 serine protease, subtilase family 5.93 0.03 No

201

TABLE 4.3 (cont.) Fold Signal ORF Annotationb change P-valuec sequenced 2002 arabinan endo-1,5-α-L-arabinosidase 5.59 2.77E-15 Yes 102 cell wall endopeptidase, family M23/M37 5.51 0.05 No putative outer membrane protein involved in nutrient Yes 322 binding (SusC) 5.17 0.02 98 LtrC-like protein 5.14 9.11E-04 No 97 hypothetical protein 5.12 1.20E-03 No 286 DNA primase 5.1 0.04 No 283 conjugative transposon protein TraE 5.09 9.03E-07 Yes 1458 major outer membrane protein OmpA 4.57 1.01E-06 Yes 1131 hypothetical protein 4.39 6.35E-71 Yes 2465 xylulose kinase 4.35 5.32E-288 No 323 lipoprotein, putative 4.35 3.48E-08 Yes 2001 maltodextrin glucosidase 4.28 4.60E-16 Yes 1939 hypothetical protein 4.1 4.72E-06 Yes 1941 ferrichrome-iron receptor 4.09 3.50E-04 Yes 1936 probable zinc protease pqqL 4.08 4.12E-06 Yes 1462 hypothetical protein 4.04 8.48E-03 Yes a P. bryantii B14 was grown in a synthetic medium with either wheat arabinoxylan (WAX) or a mixture of xylose and arabinose (XA) as the sole carbohydrate source. RNA was then extracted and RNAseq experiments were performed as described in experimental procedures. b Open reading frames were predicted and annotations were assigned to the putative gene products using the rapid annotation using subsystems technology (RAST) server (2). c Two-sided P-values are derived from Baggerly’s test of the differences in means for two independent experiments are reported (3). d Signal peptides were predicted using the SignalP v3.0 online server (16). e The predicted start codon of ORF1897 is just downstream of the beginning of a contig and BLASTp analyses revealed that close homologs of this protein posses N-terminal extensions, which suggests that this protein may be truncated at the N-terminus. Thus, the assignment of the absence of a signal peptide for this protein may not be accurate.

202

TABLE 4.4 Genes repressed greater than four-fold during growth of P. a bryantii B14 with WAX as compared to XA. Signal ORF Annotationb Fold change P-valuec sequenced 960 hypothetical protein 15.61 5.72E-03 Yes putative outer membrane protein (SusD No 959 homolog) 11.31 0.05 putative outer membrane protein (SusC Yes 1551 homolog) 7.72 2.29E-03 1862 hypothetical protein 6.95 0.02 Yes 1863 endo-arabinanase 6.92 2.01E-05 Yes putative outer membrane protein (SusC Yes 958 homolog) 6.29 4.15E-08 putative outer membrane protein (SusD Yes 1094 homolog) 5.75 0.02 2354 hypothetical protein 5.56 1.87E-03 No 626 hypothetical protein 5.42 7.28E-04 No 2405 TraX family protein 5.34 7.93E-03 No β-glucanase (GH43-GH88 two domain Yes 1898 protein) 5.22 1.10E-08 1946 rhamnogalacturonan 5.16 2.74E-04 Yes putative outer membrane protein (SusC Yes 1860 homolog) 5.14 5.45E-09 1864 AraC family transcriptional regulator 5.07 0.03 No 2314 exo-poly-α-D-galacturonosidase 4.97 0.03 Yes 1558 hypothetical protein 4.87 4.87E-04 Yes 667 endoglucanase B precursor (EC 3.2.1.4) 4.87 6.98E-03 No putative outer membrane protein (SusD Yes 632 homolog) 4.86 6.38E-03 419 hypothetical protein 4.67 0.03 Yes 1575 histidine decarboxylase 4.66 2.73E-03 Yes 1092 AraC family transcriptional regulator 4.64 0.04 No 1577 hypothetical protein 4.57 4.72E-05 No 1848 α-amylase (EC 3.2.1.1) 4.40 0.03 Yes 611 hypothetical protein 4.36 9.97E-04 Yes 191 hypothetical protein 4.35 0.02 Yes translation elongation factor G-related No 1365 protein 4.32 0 612 L-asparaginase (EC 3.5.1.1) 4.31 0.02 Yes putative outer membrane protein (SusD Yes 1234 homolog) 4.31 0.03 1948 ABC transporter ATP-binding protein 4.21 0 No 1885 Putative polygalacturonase 4.21 0.05 Yes 254 hypothetical protein 4.17 0.02 Yes 487 hypothetical protein 4.05 0 Yes a P. bryantii B14 was grown in a synthetic medium with either wheat arabinoxylan (WAX) or a mixture of xylose and arabinose (XA) as the sole carbohydrate source. RNA was then extracted and RNAseq experiments were performed as described in experimental procedures. b Open reading frames were predicted and annotations were assigned to the putative gene products using the rapid annotation using subsystems technology (RAST) server (2). c Two-sided P-values are derived from Baggerly’s test of the differences in means for two independent experiments are reported (3). d Signal peptides were predicted using the SignalP v3.0 online server (16).

203

TABLE 4.5 Domain architecture for ORF0150 from Prevotella bryantii B14 and top BLAST hits. source a predicted amino acid # of amino domain architecture b (ORF) activity identity acids aligned

Prevotella bryantii B14 hypothetical - - (RAST:0150, TIGR:0125) protein

Prevotella copri cellulase 52% 433 (6092)

Prevotella bryantii B14 hypothetical 50% 415 (RAST:0336, TIGR:1403) protein Prevotella buccae D17 hypothetical 49% 414 (1726) protein Prevotella ruminicola 23 endoglucanase 46% 519 (RAST:2413, TIGR:2805) precursor

204 Prevotella buccae D17 hypothetical 35% 686 (0844) protein

Bacteroides sp. 2_2_4 hypothetical 35% 665 (3750) protein Bacteroides cellulosilyticus hypothetical 35% 657 (3415) protein Bacteroides intestinalis hypothetical 34% 657 (4213) protein Bacteroides eggerthii hypothetical 35% 664 (1299) protein Bacteroides intestinalis hypothetical 34%, 31% 434, 316 (1125) protein Bacteroides cellulosilyticus hypothetical 34%, 28% 434, 428 (2048) protein a Functional domains were assigned utilizing the Conserved Domain Database (CDD) on the NCBI database

(http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml).

Figure 4.1 Biological reproducibility of RNAseq and DNA microarrays. Microarrays and RNAseq experiments were performed as described in Experimental Procedures. The expression levels of each gene for two individual biological replicates are compared and the correlation coefficient is presented. For RNAseq, the expression level used is the number of reads per kilobase of gene per million mapped reads (RPKM) (31), whereas for microarrays the expression level reported is the normalized fluorescence intensity for each gene.

205

Figure 4.2 Transcriptome of P. bryantii B14 grown on WAX. P. bryantii B14 was grown with wheat arabinoxylan (WAX) as the sole carbohydrate source and the transcriptome was analyzed as described in Experimental Procedures. The outermost solid line represents the genome sequence for P. bryantii B14. The next two inner rings represent the forward coding sequences (red) and the reverse coding sequences (blue) as predicted by the annotated genome. The third inner ring depicts the ribosomal RNA genes (purple) and the transfer RNA genes (green). The innermost ring indicates the coverage for each nucleotide after alignment of the RNAseq reads to the P. bryantii B14 genome.

206

Figure 4.3 Carbohydrate esterase or glycoside hydrolase genes upregulated on WAX compared to XA. Each gene that was over-expressed at least four-fold was used as a query in a BLASTp search of the non-redundant (nr) database at GenBank. Genes were assigned to carbohydrate active enzyme (CAZy) (6) families if they exhibited significant similarity (E-value < 1 x 10-5) to biochemically characterized proteins already catalogued in a CAZy family.

207

Figure 4.4 RNAseq data facilitates the closure of a gap between two contiguous sequences in P. bryantii B14. A de novo assembly of RNAseq reads produced contig1388 (red) which exhibited large regions of overlap with the 5′ end of contig17 and the 3′ end of contig19 from the P. bryantii B14 genome assembly. The oligonucleotide primer set (P1 and P2) amplified a 746 bp DNA fragment which spanned the 104 bp gap between contigs 17 and 19.

208

209

Figure 4.5 RNAseq coverage for the major xylanolytic gene cluster in P. bryantii B14. P. bryantii B14 was grown with either wheat arabinoxylan (WAX) or a mixture of xylose and arabinose (XA) as the sole carbohydrate source and the transcriptome was analyzed by RNAseq as described in Experimental Procedures. A detailed view of the nucleotide coverage for the major xylanolytic gene cluster in P. bryantii B14 during growth on WAX (red) or XA (green) is shown. Two biological replicates (WAX1, WAX2; XA1, XA2) for each growth condition were analyzed and the nucleotide coverage for the major xylanolytic gene cluster is shown. The part of the gene cluster including ORF1907-1912 has been previously studied by Flint and colleagues (19). Asterisks denote genes for which biochemical activities have been demonstrated for their cognate gene products: xynB and xynA (19), xynR (29).

Figure 4.6 P. bryantii B14 ORF0150 encodes an enzyme with endo-xylanase activity. (A) Purification of recombinant PbXyn5A. ORF0150 was cloned into an expression vector and expressed heterologously as a hexahistidine fusion protein in E. coli. The protein (PbXyn5A) was purified using cobalt affinity chromatography and the eluate was analyzed by 12% SDS-PAGE, followed by Coomassie brilliant blue G-250 staining. (B) Depolymerization of wheat arabinoxylan. PbXyn5A was assessed for its capacity to depolymerize soluble wheat arabinoxylan (WAX) by incubating the protein on an agar plate infused with WAX followed by staining with congo red. (C) Hydrolysis of xylohexaose. PbXyn5A- catalyzed hydrolysis of xylohexaose was assessed by incubating the enzyme with the substrate, removing aliquots at the indicated time-points and then resolving the products by thin layer chromatography followed by staining with methanolic orcinol. (D) Thin layer chromatography of products released from WAX by PbXyn5A. PbXyn5A (0.50 µM) was incubated with WAX (1%) and the products were resolved by thin layer chromatography followed by staining with methanolic orcinol. Xylo- oligosaccharide standards X1-X5 and arabinose (A1) were spotted on the plate in lanes 1 and 2, respectively to serve as markers for the identification of hydrolysis products. In lane 4, PbXyn5A was incubated with WAX at 37 °C for 15 h and 2.5 µL of the reaction mixtures were spotted onto the TLC plate. (E) Reducing sugars released from WAX by PbXyn5A.

Wild type Xyn5A was incubated with WAX (1 %) and the reducing sugars were detected by using the para-hydroxybenzoic acid hydrazide (PAHBAH) assay. The reducing sugar concentrations were calculated from the absorbance at 410 nm with comparison to a standard curve generated with known concentrations of glucose. For panel E, the values are reported as the mean ± standard deviation from three independent experiments.

210

Figure 4.7 PbXyn5A functions synergistically with β-xylosidase and α-L- arabinofuranosidase enzymes. (A-B) Hydrolysis of wheat arabinoxylan was assessed by incubating the enzyme with the substrate and then resolving the products by high performance anion exchange chromatography (HPAEC) followed by detection with a pulsed amperometric detector (PAD). The products of hydrolysis were identified by comparison of peaks with retention times of purified substrates. Abbreviations are as follows: flow through, FT; arabinose, A1; xylose through xylohexaose, X1-X6. The concentration of xylose and arabinose in each of the reactions was estimated by comparison to a calibration curve constructed with known concentrations of each sugar. The same curves are depicted in (A) and (B), however the scale is adjusted in (B) to reveal changes in the oligosaccharide patterns between different reactions. These experiments were performed three times and single representative curves are shown. The concentrations of xylose and arabinose are reported as means ± standard deviations from the mean.

211

Figure 4.8 Bacteroides eggerthii ORF1299 and Bacteroides intestinalis ORF4213 encode endo-xylanase enzymes. (A) Purification of recombinant BeXyn5A and BiXyn5A. B. eggerthii ORF1299 and B. intestinalis ORF4123 were cloned into expression vectors and expressed heterologously as hexahistidine fusion proteins in E. coli. The proteins were purified using cobalt affinity chromatography and the eluate was analyzed by 12% SDS- PAGE, followed by Coomassie brilliant blue G-250 staining. (B) Thin layer chromatography of products released from WAX by BeXyn5A and BiXyn5A. BeXyn5A or BiXyn5A (0.50 µM, each) were incubated with WAX (1% wt/vol) for 15 h at 37°C and the products were resolved by thin layer chromatography followed by staining with methanolic orcinol. Xylo- oligosaccharide standards X1-X5 and arabinose (A1) were spotted on the plate in lanes 1 and 2, respectively to serve as markers for the identification of hydrolysis products. (C) Reducing sugars released from sWAX by BeXyn5A and BiXyn5A. BeXyn5A or BiXyn5A (0.50 µM, each) were incubated with WAX (1% wt/vol) for 15 h at 37°C and the reducing sugars were detected by using the PAHBAH assay. The reducing sugar concentrations were calculated from the absorbance at 410 nm with comparison to a standard curve generated with known concentrations of glucose.

212

Figure 4.9 BeXyn5A and BiXyn5A release long xylo-oligosaccharides from soluble WAX. Hydrolysis of wheat arabinoxylan by BeXyn5A (A-B) or BiXyn5A (C-D) was assessed by incubating each enzyme with the substrate and then resolving the products by high performance anion exchange chromatography (HPAEC) followed by detection with a pulsed amperometric detector (PAD). The products of hydrolysis were identified by comparison of peaks with retention times of purified substrates. Abbreviations are as follows: FT, flow through; xylobiose, X2, xylotriose, X3; xylotetraose, X4; xylopentaose, X5; xylohexaose, X6. These experiments were performed three times and single representative curves are shown.

213

Figure 4.10 Mutational analysis reveals residues important for catalysis in PbXyn5A,

BeXyn5A, and BiXyn5A. (A) Multiple sequence alignment reveals conserved residues in PbXyn5A, BeXyn5A, and BiXyn5A which align with active site residues for biochemically characterized GH5 enzymes. Primary amino acid sequences for PbXyn5A, BeXyn5A, BiXyn5A and selected GH5 proteins were imported into ClustalW (http://align.genome.jp/) and aligned using the Blosum62 matrix. The multiple-sequence alignment file was then entered into the ESPript V.2.2 program (http://espript.ibcp.fr/ESPript/ESPript/) and post- script output files were generated. Only regions within the alignment that exhibit significant amino acid conservation are shown. Asterisks indicate the amino acid residues that were targeted for mutagenesis in the current study. The NCBI Protein Database accession numbers are indicated alongside the organism abbreviations. The organisms names are abbreviated as follows: Bsp, Bacillus sp. KSM-635; Beg, Bacteroides eggerthii; Bin, Bacteroides intestinalis; Cce, Clostridium cellulovorans; Cjo, Clostridium josui; Pbr,

Prevotella bryantii B14. (B) Thin layer chromatography of products released from WAX by wild type and mutant proteins. Wild type or mutant PbXyn5A, BeXyn5A or BiXyn5A (0.50

µM, each) were incubated with WAX (1% wt/vol) for 15 h at 37°C and the products were resolved by thin layer chromatography followed by staining with methanolic orcinol. Xylo- oligosaccharide standards X1-X5 and arabinose (A1) were spotted on the plate in lanes 1 and 2, respectively to serve as markers for the identification of hydrolysis products. (C)

Reducing sugars released from sWAX by wild type and mutant proteins. The reducing sugars from the reactions in (B) were detected by using the PAHBAH assay. The reducing sugar concentrations were calculated from the absorbance at 410 nm with comparison to a standard curve generated with known concentrations of glucose. In (C), the data are reported as means ± standard deviations from three independent experiments.

214

Figure 4.11 Domain organization for the xylan catabolism regulator, XynR. Functional domains were assigned utilizing the Conserved Domain Database (CDD) on the NCBI database (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). The abbreviations are as follows: YYY, unknown domain with conserved YYY repeat; HK, histidine kinase; RR, response regulator.

215

216

Figure 4.12 The core xylan utilization system is conserved amongst certain species within the phylum, Bacteroidetes. The Prevotella bryantii B14 endoxylanase, Xyn10C, was used as the query sequence in a BLASTp search of the GenBank database. The genomic context is shown for each of the top BLASTp hits. Only the region which contains genes with predicted roles associated with xylan deconstruction are shown.

Figure 4.13 Phylogenetic clustering of organisms containing a core xylan utilization system. The organisms whose genomes contain the core xylan utilization system identified in Figure 4.11 were analyzed for their phylogenetic relationships. The small subunit (16S) ribosomal RNA gene for each organism was aligned and a neighbor-joining tree was constructed using the CLC Genomics Workbench v3.0 software. Each alignment was re- sampled 100 times and the bootstrap values are indicated on the internal branches. The branch length is reported as the expected number of substitutions per nucleotide position.

217

CHAPTER 5

DISCUSSION AND CONCLUSIONS

5.1 BIOINFORMATIC ANALYSIS OF XYLAN DEGRADING ENZYMES IN RUMEN PREVOTELLA SPP.

Enzymatic deconstruction of the abundant plant polysaccharide xylan is an important process that occurs in the foregut of ruminants as well as in the hindgut of humans. Organisms within the phylum Bacteroidetes are highly abundant in these two microbiomes, and a number of cultured representatives have been shown to degrade and ferment xylan. Prevotella represents the most abundant genus in the bovine rumen as demonstrated by culture-independent studies (1, 13). Several of the major strains of ruminal Prevotella are xylanolytic,

including P. bryantii B14 and P. ruminicola 23. The genome sequences for P.

bryantii B14 and P. ruminicola 23 were sequenced by The Institute for Genomic

Research (now the J. Craig Venter Institute) through a grant funded by the

United States Department of Agriculture to the North American Consortium for

Fibrolytic Rumen Bacteria (FibRumBa, http://jcvi.org/rumenomics/). The P.

bryantii B14 genome was partially sequenced leaving a total of ~50 physical gaps

in the genome, whereas the P. ruminicola 23 genome was sequenced to

completion (8). These genome sequences revealed a highly expanded repertoire

of plant polysaccharide degrading genes (glycobiome) relative to many other sequenced organisms which is consistent with their demonstrated roles in hemicellulose degradation. Furthermore, these genomes revealed a great deal of information about the glycoside hydrolase gene content at a molecular detail that

218

had not been previously available for rumen microbiologists. At the outset, our

ultimate goal was to place these genes within the context of the hemicellulose

degrading niche of these important rumen micro-organisms. To this end, we first

used a bioinformatic approach to predict functions of the gene products

associated with xylan degradation and then cloned and expressed the

recombinant proteins to determine their individual biochemical properties.

Analysis of the P. ruminicola 23 genome revealed the presence of a gene

predicted to encode a protein with a unique modular domain architecture that had

not been described for any other protein in the GenBank database. This protein

harbored two distinct domains including a putative N-terminal glycoside hydrolase (GH) family 10 endo-xylanase domain and a C-terminal carbohydrate esterase (CE) family 1 domain. We hypothesized that this unique domain organization may provide the polypeptide with an improved capacity to degrade xylan polymers and that this may reflect an adaptation specific to the ruminal

Prevotella spp. Subsequent cloning of the gene and heterologous expression of the recombinant protein revealed that this enzyme is bi-functional, possessing both endo-xylanase activity and ferulic acid esterase activity, thus the enzyme was named Xyn10D-Fae1A. Therefore, this protein may cleave xylan fragments and also reduce interstrand cross-links between xylan molecules or with lignin which may be mediated by diferulate bridges (17). To provide further insight into the modular architecture of the protein, we introduced site-specific mutations into catalytic active site residues for the two putative domains. These results allowed us to unequivocally map these two biochemical activities to two discrete domains

219

within the polypeptide. These findings have been confirmed by X-ray

crystallographic studies which revealed two separate catalytic domains for this

protein (Bae et al., unpublished data) (Figure 5.1).

The major products released following incubation of Xyn10D-Fae1A with

natural xylan polymers included xylo-oligosaccharides. In order to gain access to xylose monomers for subsequent fermentation, P. ruminicola 23 must possess a

β-xylosidase enzyme capable of converting the xylo-oligosaccharides to xylose.

Further analysis of the genomic context for xyn10D-fae1A revealed the presence

of a GH 3 β-glucosidase downstream of the bi-functional xylanase-esterase. This

family consists of biochemically characterized β-glucosidase and β-xylosidase enzymes, thus it is possible that this gene may be incorrectly annotated as encoding a β-glucosidase. To test this hypothesis, the gene was cloned, and expressed heterologously in E. coli. The purified recombinant protein exhibited strong β-xylosidase activity and no detectable β-glucosidase activity. Thus the protein was named Xyl3A to denote the first GH 3 β-xylosidase described in this bacterium. Furthermore, Xyl3A cleaved xylo-oligosaccharides and functioned

synergistically with Xyn10D-Fae1A to release monosaccharides from a natural xylan polysaccharide. These results provided the first molecular insight into xylan degrading enzymes in P. ruminicola 23. In addition, this study revealed that

annotations for GH 3 genes may be inaccurate.

Xyl3A in P. ruminicola 23 was incorrectly annotated as a β-glucosidase

and that prompted us to investigate the function of additional GH 3 enzymes in

the ruminal Prevotella spp. The genome for P. ruminicola 23 contains 11 putative

220

GH 3 genes, whereas the genome for P. bryantii B14 contains four putative GH 3

genes. Based upon the smaller number of total GH 3 genes, P. bryantii B14 was

selected for further functional analyses of the GH 3 genes in this organism. Each of the GH 3 genes was cloned and expressed recombinantly in E. coli and the recombinant proteins were characterized for catalytic activity with a library of synthetic para-nitrophenyl linked sugar substrates as well as with plant-derived xylo- and cello-oligosaccharides. Our results revealed that one of the proteins had robust β-glucosidase (cellodextrinase) activity, whereas the other three GH 3 enzymes possess β-xylosidase activity. Each of these genes had been annotated

as coding for β-glucosidase enzymes, however our results clearly revealed that

three of the four genes actually encoded β-xylosidase enzymes, thus these

genes were called xyl3A, xyl3B, and xyl3C. This study also addressed an

important question, do the three β-xylosidase enzymes in P. bryantii B14 each have redundant biochemical functions or does each enzyme recognize and cleave distinct chemical linkages within xylan substrates? Transcriptional

analysis of P. bryantii B14 during growth on arabinoxylan revealed that xyl3A was

induced relative to growth on glucose, whereas the expression of xyl3B and

xyl3C was constitutive. Furthermore, Xyl3C was found to exhibit a different

cleavage pattern during the hydrolysis of 4-O-methyl glucuronoxylan substituted

xylo-oligosaccharides relative to Xyl3B, suggesting a fundamental difference in

substrate specificity between these enzymes. These results provided evidence

that these three β-xylosidase enzymes possess different transcriptional and

biochemical profiles and support the prediction that each enzyme fulfills a

221

specific and unique functional role within the cell. In addition, mutational analysis of Xyl3B revealed an amino acid residue which contributes to determining substrate specificity for this enzyme and this amino acid position may aid in assignment of biochemical function based upon analysis of the amino acid sequence for members in this family.

Although bioinformatic analysis facilitated our discovery of a number of xylanolytic enzymes in P. bryantii B14 and P. ruminicola 23, these studies also revealed that there are definite limitations to predicting biochemical function for glycoside hydrolase enzymes based upon amino acid sequence alone. One of the major goals of the human microbiome project is to sequence a large number of microbial genomes in order to understand the genetic content of this microbial community and how these genes expand the metabolic repertoire of the human host (14). Our studies reveal that the bioinformatic analysis of these genomic data is not trivial, especially as it pertains to the glycobiome. Thus, to facilitate proper interpretation of the genomic content of saccharolytic micro-organisms, functional studies which improve our understanding of the molecular determinants of substrate specificity for glycoside hydrolase enzymes will be important.

One of the major unresolved issues in these studies was the functional role for the PA14 domain which forms an insertion sequence within the C- terminal β-sandwich domain of some GH 3 enzymes (P. ruminicola 23, Xyl3A; P. bryantii B14, Xyl3A, Xyl3C). The precise function for the PA14 domain remains poorly defined, however its presence in bacterial, archaeal, and eukaryotic

222

proteins including glycosidases, glycosyltransferases, proteases, amidases, and adhesins, and bacterial toxins (10) and its predominantly β-sheet structure (9) suggests a carbohydrate binding function. This domain is widespread amongst

GH 3 proteins with 228 sequenced genes predicted to encode GH 3 enzymes with this insertion domain. To probe the potential role for the PA14 domain in

Xyl3A and Xyl3B, we constructed truncational mutant proteins in which the PA14 domain was deleted. When expressed as recombinant proteins in E. coli, neither of the two truncational mutants exhibited detectable hydrolytic activity (data not shown). These results suggested that either the PA14 domain was critical for glycoside hydrolase activity, or that the truncational mutants did not properly fold.

Future studies into the functional role of the PA14 domain will be important for determining the contribution of this domain to the proteins which harbor them.

5.2 TRANSCRIPTOMIC ANALYSIS OF XYLAN DEGRADING ENZYMES IN PREVOTELLA BRYANTII B(1)4

While the bioinformatic approach that we employed allowed us to identify genes in P. ruminicola 23 and P. bryantii B14 with potential roles in xylan degradation, it also presented several limitations. The genome for P. bryantii B14 contains at least 108 putative carbohydrate active enzymes and using bioinformatics alone to identify the subset of these genes that are specifically involved in xylan deconstruction may not be possible. Furthermore, novel genes may be involved in xylan degradation and bioinformatic analyses would likely not detect these potentially unique enzymes. A previous study had shown that xylanase activity is inducible in P. bryantii B14 by medium to large-sized xylan

223

fragments and is not induced by monosaccharides or short oligosaccharides (6).

Therefore, we chose to use a transcriptional approach to identify genes that are induced in P. bryantii B14 during growth on intact arabinoxylan relative to the monosaccharide components of this polysaccharide. In addition to DNA microarray studies, massively parallel DNA sequencing technologies were used to sequence the entire messenger RNA population and then the sequence reads were mapped back onto the P. bryantii B14 genome.

Results from the transcriptional analysis revealed a subset of approximately 57 genes which were induced greater than four-fold during growth on wheat arabinoxylan (WAX) relative to a mixture of xylose and arabinose (XA).

Amongst these genes were several highly induced genes with predicted roles in xylan degradation such as endo-xylanases, β-xylosidases, α-L- arabinofuranosidases, and esterases. By performing a de novo assembly of the transcriptomic data, a gap between two contiguous DNA sequences in the genome was closed and this revealed the presence of a large xylanolytic gene cluster in P. bryantii B14. This xylan hydrolase gene cluster included previously characterized xylan degrading genes (2-4, 7) and also a unique operon which contained protein homologs of the starch utilization system previously identified in Bacteroides thetaiotaomicron (11). This operon contained the most highly induced genes during growth on WAX compared to XA which suggests that it may be important for the utilization of xylan by P. bryantii B14.

In addition to genes with assigned functions, a number of highly induced hypothetical genes were identified during growth of P. bryantii B14 on WAX

224

relative to XA. One of these hypothetical genes was predicted to encode a protein with low homology to GH family 5, and was cloned and expressed heterologously in E. coli. Biochemical analysis of the purified recombinant protein revealed that it possess endo-xylanase activity, thus the protein was named

Xyn5A implying the first characterized GH 5 endo-xylanase in this bacterium. The closest homologs within the GenBank database for this protein are hypothetical proteins that derive from the human colonic Bacteroides spp. It was hypothesized that these proteins serve a similar xylan-degrading function for these organisms.

To test this hypothesis, two of these genes (Bacteroides intestinalis ORF4123 and Bacteroides eggerthii ORF1299) were cloned, expressed heterologously in E. coli, and the biochemical properties of the purified proteins were determined.

These studies revealed that the two hypothetical proteins exhibit robust endo- xylanase activity which supports the functional assignment of endo-xylanase to these proteins. To identify whether these three novel endo-xylanase enzymes are related to the GH family 5 enzymes, amino acid sequence alignments were constructed with biochemically verified GH family 5 enzymes and amino acid substitutions were introduced into putative catalytic residues. Following mutation of the putative catalytic residues, endo-xylanase activity for the three novel endo- xylanase enzymes was completely attenuated which supports the assignment of these proteins as GH family 5 enzymes.

Through this study, we identified and characterized the transcriptional profile of genes involved in xylan degradation and utilization by P. bryantii B14.

This study represents the first transcriptional analysis of xylan degrading genes

225

from a rumen bacterium and provides important insight into the mechanisms that gut microbiota employ to degrade and utilize xylan. Furthermore, we identified and assigned functions to a novel subset of glycoside hydrolases which are specifically conserved amongst the rumen Prevotella spp. and the human colonic

Bacteroides spp.

5.3 CONSERVED MECHANISMS FOR XYLAN DEGRADATION BY RUMEN PREVOTELLA SPP. AND HUMAN COLONIC BACTEROIDES SPP.

Members of the phylum Bacteroidetes are dominant within the rumen and the human colonic microbial ecosystems. The major genera which comprise the

Bacteroidetes differ between these environments with Prevotella dominating in the rumen and members of the genus Bacteroides dominating in the human gut ecosystem. At the outset of the current study, genome sequences were available for just two rumen Prevotella spp. (P. bryantii B14 and P. ruminicola 23) and two human colonic Bacteroides spp. (B. thetaiotaomicron and B. fragilis). With the recent implementation of the National Institutes of Health (NIH) funded human microbiome project, the genomes of a large number of commensal bacteria are being sequenced. At the current time (March 2010), there are a total of 30

Bacteroides spp. and 13 Prevotella spp. (including eleven Prevotella spp. isolated from the human alimentary tract) with genome sequences available. This rapid increase in genomic sequence information has revolutionized our view of the metabolic transformations that these bacteria catalyze within their respective environments. Furthermore, these data have revealed a considerable conservation in the polysaccharide utilization loci among these organisms (5).

226

Analysis of the genome sequences for several xylanolytic Bacteroides spp.

and Prevotella spp. revealed a highly conserved gene cluster which may be

important for xylan utilization by these bacteria. In P. bryantii B14, this operon

includes the most highly induced genes in the genome during growth on WAX

relative to XA (Figure 5.2A). This cluster harbors genes predicted to encode

outer membrane proteins which may be involved in the binding, fragmentation,

and transport of xylan fragments across the outer membrane in a mechanism

which is analogous to the starch utilization system (Sus) identified in Bacteroides

thetaiotaomicron (Figure 5.2B). The components within this gene cluster include

tandem repeats of genes predicted to encode SusC and SusD homologs (xusA,

xusB, xusC, xusD), followed by a hypothetical protein encoding gene (xusE), and then followed by an endoxylanase encoding gene (xyn10C). The SusC homologs

(XusA, XusC) are predicted to be TonB-dependent receptors which are involved in the transport of xylan fragments across the outer membrane by coupling this transport to the inner membrane proton motive force. The SusD homologs (XusB,

XusD) are predicted to be involved in the binding of polysaccharide fragments and delivering these fragments to the outer membrane transporters (XusA,

XusC). The function of XusE is unknown, however the presence of an N-terminal signal peptidase II cleavage site supports the prediction that this protein is anchored onto the outside leaflet of the outer membrane. Finally, Xyn10C is a functional GH family 10 endo-xylanase enzyme which possesses an insertion sequence within the catalytic domain that has homology to members of

carbohydrate binding module (CBM) family 4 (unpublished data).

227

The precise role for genes within this cluster are currently unknown, therefore future studies should be performed to investigate the roles for proteins in this cluster. One of the major limitations to functional analysis of the roles of genes in the ruminal Prevotella spp. is that they are highly resistant to genetic manipulation (12). In order for complete functional studies of the physiological roles for xylan-degrading genes to be pursued, considerable improvements in the tools available for genetic manipulation in these organisms will be necessary.

This gene cluster is strictly conserved across members of the rumen Prevotella spp. and the human colonic Bacteroides spp. and the conservation of this cluster is highly suggestive that xylanolytic members from these two bacterial genera employ similar mechanisms to degrade and utilize xylan. Therefore an alternative approach to understanding the function of genetic components within this xylan utilization system is to use xylanolytic members of the Bacteroides spp. as a model to investigate the conserved mechanisms that these organisms may employ for xylan degradation and utilization. Indeed, a genetic system is available for Bacteroides ovatus including transposon mutagenesis approaches

(15) as well as tools for insertional mutagenesis within the chromosome (16).

Emerging technologies including high throughput proteomics may also prove of considerable value for probing the function of components within the xylan utilization loci for these organisms.

228

5.4 REFERENCES

1. Edwards, J. E., N. R. McEwan, A. J. Travis, and R. J. Wallace. 2004. 16S rDNA library-based anlysis of ruminal bacterial diversity. Antonie van Leeuwenhoek 86:263-281.

2. Flint, H. J., T. R. Whitehead, J. C. Martin, and A. Gasparic. 1997. Interrupted catalytic domain structures in xylanases from two distantly related strains of Prevotella ruminicola. Biochim Biophys Acta 1337:161- 165.

3. Gasparic, A., R. Marinsek-Logar, J. Martin, R. J. Wallace, F. V. Nekrep, and H. J. Flint. 1995. Isolation of genes encoding β-D-xylanase, β-D-xylosidase and α-L-arabinofuranosidase activities from the rumen bacterium Prevotella ruminicola B(1)4. FEMS Microbiol Lett 125:135-141.

4. Gasparic, A., J. Martin, A. S. Daniel, and H. J. Flint. 1995. A xylan hydrolase gene cluster in Prevotella ruminicola B(1)4: sequence relationships, synergistic interactions, and oxygen sensitivity of a novel enzyme with exoxylanase and β-(1,4)-xylosidase activities. Appl Environ Microbiol 61:2958-2964.

5. Martens, E. C., N. M. Koropatkin, T. J. Smith, and J. I. Gordon. 2009. Complex glycan catabolism by the human gut microbiota: the Bacteroidetes Sus-like paradigm. J Biol Chem 284:24673-24677.

6. Miyazaki, K., T. Hirase, Y. Kojima, and H. J. Flint. 2005. Medium- to large-sized xylo-oligosaccharides are responsible for xylanase induction in Prevotella bryantii B14. Microbiology 151:4121-4125.

7. Miyazaki, K., H. Miyamoto, D. K. Mercer, T. Hirase, J. C. Martin, Y. Kojima, and H. J. Flint. 2003. Involvement of the multidomain regulatory protein XynR in positive control of xylanase gene expression in the ruminal anaerobe Prevotella bryantii B(1)4. J Bacteriol 185:2219-2226.

8. Morrison, M., S. C. Daugherty, W. C. Nelson, T. Davidsen, and K. E. Nelson. 2009. The FibRumBa database: A resource for biologists with interests in gastrointestinal microbial ecology, plant biomass degradation, and anaerobic microbiology. Microb Ecol 59:212-213.

9. Petosa, C., R. J. Collier, K. R. Klimpel, S. H. Leppla, and R. C. Liddington. 1997. Crystal structure of the anthrax toxin protective antigen. Nature 385:833-838.

229

10. Rigden, D. J., L. V. Mello, and M. Y. Galperin. 2004. The PA14 domain, a conserved all-β domain in bacterial toxins, enzymes, adhesins and signaling molecules. Trends Biochem Sci 29:335-339.

11. Shipman, J. A., J. E. Berleman, and A. A. Salyers. 2000. Characterization of four outer membrane proteins involved in binding starch to the cell surface of Bacteroides thetaiotaomicron. J Bacteriol 182:5365-5372.

12. Shoemaker, N. B., K. L. Anderson, S. L. Smithson, G. R. Wang, and A. A. Salyers. 1991. Conjugal transfer of a shuttle vector from the human colonic anaerobe Bacteroides uniformis to the ruminal anaerobe Prevotella (Bacteroides) ruminicola B(1)4. Appl Environ Microbiol 57:2114-2120.

13. Stevenson, D. M., and P. J. Weimer. 2007. Dominance of Prevotella and low abundance of classical ruminal bacterial species in the bovine rumen revealed by relative quantification real-time PCR. Appl Microbiol Biotechnol 75:165-174.

14. Turnbaugh, P. J., R. E. Ley, M. Hamady, C. M. Fraser-Liggett, R. Knight, and J. I. Gordon. 2007. The human microbiome project. Nature 449:804-810.

15. Valentine, P. J., P. Arnold, and A. A. Salyers. 1992. Cloning and partial characterization of two chromosomal loci from Bacteroides ovatus that contain genes essential for growth on guar gum. Appl Environ Microbiol 58:1541-1548.

16. Weaver, J., T. R. Whitehead, M. A. Cotta, P. C. Valentine, and A. A. Salyers. 1992. Genetic analysis of a locus on the Bacteroides ovatus chromosome which contains xylan utilization genes. Appl Environ Microbiol 58:2764-2770.

17. Wong, D. W. 2006. Feruloyl esterase: a key enzyme in biomass degradation. Appl Biochem Biotechnol 133:87-112.

230

5.5 FIGURES

Figure 5.1 Overview of the crystal structure for Xyn10D-Fae1A. (A) Xyn10D-Fae1A possess two distinct domains which are connected by a proline-rich linker sequence. (B- C) Xyn10D-Fae1A possess two distinct active sites which contain a catalytic triad (Ser- His-Asp) within the ferulic acid esterase domain (B) and a pair of catalytic carboxylic acid residues within the endo-xylanase domain (C).

231

Figure 5.2 Model for the xylan utilization system in xylanolytic Bacteroidetes. (A) Genetic structure and RNAseq coverage of the major xylan utilization system in P. bryantii B14. (B) Predicted model for binding of xylan fragments, cleavage, and transport of xylan fragments across the outer membrane.

232

APPENDIX A

FUNCTIONAL COMPARISON OF THE TWO BACILLUS ANTHRACIS GLUTAMATE RACEMASES

Published in the Journal of Bacteriology 2007, Vol. 189(14): 5265-5275

A.1 ABSTRACT

Glutamate racemase activity in B. anthracis is of significant interest with respect to chemotherapeutic drug design, because L-glutamate stereoisomerization to D-glutamate is predicted to be closely associated with peptidoglycan and capsule biosynthesis, which are important for growth and virulence, respectively. In contrast to most bacteria that harbor a single glutamate racemase gene, the genomic sequence of Bacillus anthracis predicts two genes encoding glutamate racemases, racE1 and racE2. To evaluate whether racE1 and racE2 encode functional glutamate racemases, we cloned and expressed racE1 and racE2 in E. coli. Size exclusion chromatography of the two purified recombinant proteins suggested differences in their quaternary structures, as

RacE1 eluted primarily as a monomer, while RacE2 demonstrated characteristics of a higher-ordered species. Analysis of purified recombinant RacE1 and RacE2 revealed that both proteins catalyze the reversible stereoisomerization of L- glutamate and D-glutamate with similar, but not identical, steady state kinetic properties. Analysis of the pH-dependence of L-glutamate stereoisomerization suggested that RacE1 and RacE2 both possess two titratable active site residues important for catalysis. Moreover, directed mutagenesis of predicted active site residues resulted in complete attenuation of enzymatic activities in both RacE1

233

and RacE2. Homology modeling of RacE1 and RacE2 revealed potential

differences within the active site pocket that might affect the design of inhibitory

pharmacophores. These results suggest that racE1 and racE2 both encode

functional glutamate racemases with similar, but not identical, active site features.

A.2 INTRODUCTION

Inhalational anthrax is a complex human disease that begins with

deposition of Bacillus anthracis spores into lungs, followed by infiltration and

proliferation of vegetative bacteria in the blood and several organs, ultimately

leading to death of the host (23, 25, 36, 42). Current antibiotic treatments are

effective during early stages of infection; however, the emergence of engineered

(55) and naturally-occurring (4) antibiotic-resistant strains underscores the need to identify new antibacterial targets (7). Moreover, due to the potential deployment of B. anthracis as a bio-weapon, the development and stockpiling of new anthrax countermeasures have been prioritized by the US government (6).

Inhibition of cell wall synthesis remains an effective approach for

preventing bacterial growth (53, 54). For most eubacteria, an important

constituent of the peptidoglycan cell wall is D-glutamate (38, 39, 41, 44, 45). D- glutamate is not typically available within the environment, but instead is generated by the enzyme glutamate racemase [E.C. 5.1.1.3], which catalyzes the reversible stereoinversion of L-glutamate (5, 14, 31). Insights into these - independent amino acid racemases have begun to emerge from biochemical studies of enzymes isolated from several organisms, including B. subtilis, B.

234

pumilus, B. sphaericus, E. coli, L. fermenti, L. brevis, A. pyrophilus, S.

haemolyticus, B. lactofermentum and M. tuberculosis (1, 2, 5, 10, 14, 18, 19, 21,

28, 29, 33, 35, 40, 47, 58, 61), as well as a recent crystal structure for RacE-D-

glutamate from B. subtilis (43). Several studies have identified glutamate

racemase as an essential gene in B. subtilis and E. coli, which has led to the

prediction that glutamate racemase activity is important for peptidoglycan

biosynthesis in these organisms (14, 31). Because glutamate racemases are not

found in mammals, these enzymes have emerged as excellent targets for the

design of a new class of antibacterial agents (3, 12, 22, 28, 43, 59).

Analogous to other eubacteria, D-glutamate is predicted to be an

important constituent of B. anthracis peptidoglycan (48). However, for B.

anthracis, D-glutamate is also the sole component of the poly-γ-D-glutamic acid

(PDGA) capsule (24), an important virulence factor required for dissemination in

a murine model of inhalational anthrax (16), and presumed to be required for

disease in humans as well (16, 36, 46). Therefore, in addition to its predicted role

in cell wall biosynthesis, glutamate racemase is also predicted to be the major

source of D-glutamate for PDGA capsule synthesis in B. anthracis (9). In contrast

to most bacteria that possess only one glutamate racemase gene (5, 10, 14, 18,

19, 26, 29, 33, 35, 40, 61), the B. anthracis genome contains two genes

(BAS0806 [GenBank] and BAS4379 [GenBank]) predicted to encode glutamate racemases, called racE1 and racE2. Recently, inactivation of racE2 was reported

to cause a severe growth defect in B. anthracis, while inactivation of racE1 only

moderately inhibited growth (52). However, the underlying reasons for these

235

growth phenotypes, especially for the racE2 mutant, were not identified, and thus cannot be attributed at this time to defects in peptidoglycan synthesis resulting from insufficient D-glutamate availability (52). Further complicating the interpretation of the phenotypes reported for the racE1 and racE2 mutants is uncertainty as to whether either of these two genes encodes functional enzymes with glutamate racemase activity. Thus, validation of RacE1 or RacE2 as potential drug targets awaits elucidation of their enzymatic and biochemical properties.

Here, we characterized and compared recombinant wild type and mutant forms of RacE1 and RacE2 from B. anthracis. The racE1 and racE2 genes were cloned from B. anthracis Sterne 7702, and expressed as recombinant proteins in

E. coli. These studies revealed that within cell free assays, B. anthracis RacE1 and RacE2 both catalyze stereoinversion of L-glutamate to D-glutamate.

However, the two enzymes differ in ways that could influence future inhibitor development.

A.3 EXPERIMENTAL PROCEDURES

Materials. Bacillus anthracis Sterne 7702 (56) was obtained from Dr.

Theresa M. Koehler (Houston, TX). Escherichia coli [XL1-Blue] was obtained from Stratagene (La Jolla, CA). Escherichia coli T7 lysogen [BL21(DE3)] was obtained from Novagen (Madison, WI). The DNeasy Tissue Kit, DNA PCR purification kit, and DNA gel extraction kit were acquired from Qiagen (Valencia,

CA). DNA oligonucleotides were synthesized at Integrated DNA Technologies

236

Incorporated (Coralville, IA). Picomaxx DNA polymerase for PCR-amplification and the QuikChange Site-Directed Mutagenesis Kit were acquired from

Stratagene (La Jolla, CA). XhoI, SalI, and BamHI restriction endonucleases were

obtained from New England Biolabs (Ipswich, MA). Ni-NTA His•Bind metal affinity resin and pET-15b were purchased from Novagen (Madison, WI). DNA sequencing was performed at the Roy J. Carver Biotechnology Center (Urbana,

IL). The plasmid miniprep kit and gel filtration standards were obtained from

Biorad (Hercules, CA). Components for Luria-Bertani (LB) and Bacto Brain Heart

Infusion (BHI) medium were purchased from BD Diagnostics (Franklin Lakes, NJ).

Isopropyl-β-D-thiogalactoside (IPTG) was obtained from Fisher Scientific (Fair

Lawn, NJ). L-glutamate, D-glutamate, ampicillin, dithiothreitol (DTT), and tris(2- carboxyethyl)phosphine (TCEP) were obtained from Sigma Aldrich (St. Louis,

MO). Amicon centrifugal filter devices with a MWCO of 10,000 Daltons were obtained from Millipore (Billerica, MA).

Multiple Sequence Alignment. To compare the primary amino acid sequence between B. subtilis RacE and B. anthracis RacE1 and RacE2, a

multiple sequence alignment was generated. Amino acid sequences for B.

anthracis Sterne RacE1 (BAS0806 [GenBank]), B. anthracis Sterne RacE2

(BAS4379 [GenBank]), and B. subtilis RacE (BSU2835 [GenBank]) were aligned

using ClustalW (http://align.genome.jp/). The multiple sequence alignment file

was then entered into the ESPript V2.2 alignment program

(http://mail.bic.nus.edu.sg/ESPript/cgi-bin/ESPript.cgi). Secondary structural

237

elements and solvent accessibility indices were imported from the crystal

structure of Bacillus subtilis 168 RacE (PDB 1ZUW) (43).

Cloning of RacE1, RacE2, and preparation of expression strains. To obtain genomic DNA, B. anthracis Sterne 7702 were cultivated at 37 °C and with aeration on a rotary shaker in BHI (3.7% Bacto Brain Heart Infusion, Millipore deionized water, 0.5% glycerol shaker) medium. DNA was isolated from mid-log cultures and purified using the DNeasy Tissue Kit. B. anthracis Sterne racE1

(BAS0806 [GenBank]) and B. anthracis Sterne racE2 (BAS4379 [GenBank])

were PCR amplified using primers corresponding to the 5’ and 3’ ends of each

gene (Table A.1). These primers were engineered such that 5’ XhoI (racE1) or

SalI (racE2) and 3’ BamHI (racE1 and racE2) restriction sites were incorporated.

Each PCR product was purified using the PCR purification kit. The purified

amplicons were incubated with XhoI (racE1) or SalI (racE2) and BamHI to

generate directional annealing sites. The amplicons were then ligated with pET-

15b, to replace the XhoI - BamHI fragment within the polylinker region. The

ligation mixtures were introduced into E. coli XL1-Blue by electroporation. The

integrity of each gene from individual clones was confirmed by DNA sequencing.

pET-15b-racE1 or pET-15b-racE2 were isolated using the plasmid miniprep kit

and introduced by electroporation into E. coli T7 lysogen [BL21(DE3)].

Expression of racE1 or racE2. E. coli [BL21(DE3)] transformed with

either pET-15b-racE1 or pET-15b-racE2 were grown overnight in Luria-Burtani

broth (5 mL) supplemented with ampicillin (100 µg/mL) at 37 °C and with aeration.

After 10-12 h, the starter cultures were back-diluted into fresh Luria-Burtani broth

238

(500 mL) supplemented with ampicillin (100 µg/mL) at 37 °C and grown with

aeration until the optical density at 600 nm reached 0.5. Expression of racE1 or

racE2 was induced by the addition of IPTG (0.1 mM). The cultures were grown

for an additional 4 h, and then harvested by centrifugation at 5000 x g for 15 min

at 4 °C. The cell pellets were re-suspended in 20 mL Ni-NTA binding buffer (50

mM phosphate, 300 mM NaCl, 10 mM imidazole, 0.5 mM TCEP, pH 8.0) and

disrupted by sonication (3 X 20 sec, 23 kHz, 20 watts) using a Sonic

Dismembrator Model 100 from Fisher Scientific (Fair Lawn, NJ). The cell lysate

was then clarified by centrifugation at 30,000 x g for 30 min, and applied to a 2

mL bed volume of Ni-NTA His•Bind metal affinity resin. After 1 h of gentle

rotation at 4 °C, the bound protein was washed with two 10 mL volumes of Ni-

NTA wash buffer (20 mM imidazole, 50 mM phosphate, 300 mM NaCl, 0.5 mM

TCEP, pH 8.0), then eluted with two 6 mL volumes of Ni-NTA elution buffer (250 mM imidazole, 50 mM phosphate, 300 mM NaCl, 0.5 mM TCEP, pH 8.0). The column eluant was then exchanged into storage buffer (50 mM TRIS, 100 mM

NaCl, 0.2 mM DTT, pH 8.0) utilizing an Amicon centrifugal filter device.

Protein concentrations were quantified by absorbance spectroscopy using the method described by Gill and von Hippel (20). Briefly, the extinction coefficient at 280 nm (ε280) was calculated based upon the amino acid

composition of the polyhistidine-tagged recombinant proteins. The resulting

extinction coefficients (17420 M-1cm-1 for RacE1 and 21430 M-1cm-1 for RacE2)

were then used to determine the concentration of the appropriate protein solution

utilizing a Multiskan Spectrum UV Spectrophotometer from Thermo Electron

239

Company (Waltham, MA). The protein stocks were stored at -20 °C in storage buffer (50 mM TRIS, 100 mM NaCl, 0.2 mM DTT, pH 8.0) with 20% glycerol at a final concentration of 10 mg/mL for up to 2 months.

Preliminary experiments revealed no detectable differences in the kinetic properties of RacE1 or RacE2 in the presence or absence of the amino-terminal polyhistidine fusion peptides (data not shown). Based on these data, purified proteins with amino-terminal polyhistidine fusion peptides were utilized for all of the experiments described in this study.

Circular Dichroism Spectroscopy of Purified RacE1 and RacE2.

Circular dichroism (CD) spectra were collected for RacE1 and RacE2 in the far

UV range utilizing a J-720 Circular Dichroism Spectropolarimeter from JASCO

(Easton, MD). A cylindrical cuvette with a total volume of 350 µL and a path- length of 0.1 cm was used for each assay. The CD spectra of RacE1 (2.7 µM) or

RacE2 (2.4 µM) in an optically clear borate buffer (50 mM potassium borate, pH

8.0) were recorded from 190 nm to 260 nm at a scan rate of 50 nm/sec with a wavelength step of 1 nm and with 5 accumulations.

Data acquisition was coordinated using the JASCO Spectra Manager v1.54A software. Raw data files were uploaded onto the DICHROWEB on-line server (http://www.cryst.bbk.ac.uk/cdweb/html/home.html) and analyzed using the CDSSTR algorithm with reference set #4 , which is optimized for the analysis of data recorded in the range of 190 nm – 240 nm (34).

Size exclusion Chromatography. Size exclusion chromatography was conducted using an AKTA Purifier 900 FPLC system, equipped with a Superdex

240

200 10/300 GL size exclusion column and a UV detector, all from Amersham

Pharmacia Biotech (Little Chalfont, United Kingdom). RacE1 (5 mg/mL; 100 µL),

RacE2 (5 mg/mL; 100 µL), or a gel filtration standard mixture was injected onto the column pre-equilibrated with potassium borate buffer (50 mM boric acid, 100 mM KCl, 0.2 mM DTT, pH 8.0) liquid phase at a flow rate of 0.5 mL/min.

Standard curves were generated by plotting the log of the molecular weights

(provided by the supplier) of the gel filtration standards versus retention times.

Experimental retention times were used to calculate the apparent molecular weights of RacE1 and RacE2 from the standard curve.

Racemization Assays. Enzyme-catalyzed stereoisomerization of D- or L- glutamate was assayed using a J-720 Circular Dichroism Spectropolarimeter from JASCO (Easton, MD). A thermostated, cylindrical cuvette with a capacity of

700 µL and a path-length of 1 cm was used for each assay. D- or L-glutamate (5 mM) in an optically clear potassium borate buffer (50 mM boric acid, 100 mM KCl,

0.2 mM DTT, pH 8.0) was incubated at 25 °C in the absence or presence of

RacE1 or RacE2, at the indicated concentrations (0.08, 0.33 or 1.3 µM). D- or L- glutamate stereoisomerization was monitored by recording the CD signal at 217 nm. Data acquisition was performed using the JASCO Spectra Manager v1.54A software and a non-linear curve-fit was applied using GraphPad Prism V4.03 from GraphPad Software (San Diego, CA).

Determination of Steady State Kinetic Parameters. Racemase assays were carried out as described above, with the exception that D-glutamate concentrations were varied from 0.2 mM to 10 mM while the concentrations of L-

241

glutamate were varied from 5 mM to 200 mM. In addition, a 700 µL cuvette with a

1 cm pathlength was used for reactions below 5 mM substrate concentration and

a 350 µL cuvette with a 0.1 cm pathlength was used for reactions with higher

substrate concentrations. Reactions were initiated by the addition of RacE1 (0.78

µM) or RacE2 (0.78 µM) and the levels of D- or L-glutamate were monitored by

recording the CD signal at 215 nm for D- or L-glutamate concentrations from 0.2

mM to 10 mM and 225 nm for L-glutamate concentrations from 30 mM to 200

mM. Data acquisition was performed using the JASCO Spectra Manager v1.54A

software and a non-linear curve-fit was applied using GraphPad Prism V4.03

from GraphPad Software (San Diego, CA).

pH Rate Profile. The stereoisomerization of glutamate, in the L→D direction, by RacE1 and RacE2 in buffers of varying pH was assayed using a J-

720 circular dichroism spectropolarimeter from JASCO (Easton, MD). A

cylindrical cuvette with a total volume of 350 µL and a path-length of 0.1 cm was

used for each assay. Seven different buffers were prepared spanning a range in

pH from 6.5 to 9.5 with increments of 0.5 pH units. To maintain a an optically clear and well-buffered system, both phosphate and borate buffer formulations

were utilized as follows: pH 6.5, 7.0, 7.5 (50 mM potassium phosphate, 100 mM

KCl, 200 mM L-glutamate, 0.2 mM DTT), pH 8.0, 8.5, 9.0, 9.5 (50 mM boric acid,

100 mM KCl, 200 mM L-glutamate, 0.2 mM DTT). Each buffer was prepared with

200 mM L-glutamate, which yields fully saturating conditions for RacE1 and near

saturating conditions (83% saturating) for RacE2, such that initial rate data would

report true kcat values. Reactions were initiated by the addition of RacE1 (0.78

242

µM) or RacE2 (0.78 µM) and the levels of L-glutamate were monitored by recording the CD signal at 225 nm. Data acquisition was performed using the

JASCO Spectra Manager v1.54A software and a user-defined curve-fit was applied using GraphPad Prism V4.03 from GraphPad Software (San Diego, CA).

Homology Models. The homology models for RacE1 and RacE2 were constructed using The Chemical Computing Group’s Molecular Operating

Environment (MOE) 2006.08. The template for both models was the B. subtilis glutamate racemase-D-glutamate structure (PDB 1ZUW), which was aligned with the sequences for RacE1 and RacE2 using the Blosum62 substitution matrix.

Ten intermediate homology models resulting from a permutational selection of different loop candidates and sidechain rotamers were built for RacE1 and

RacE2. The intermediate model which scored best according to a packing evaluation function was chosen as the final model. Each of the intermediate models was subjected to a degree of energy minimization using the forcefield

MMFF94x, with a distance dependent dielectric (i.e., simulates the polar environment of water).

Docking of Compound 69 to RacE1 and RacE2. A conformational database was generated for (2R,4R)-2-amino-4-(2-benzo[b]thienyl)methyl pentanedioic acid (compound 69) (12) by using a stochastic conformational search, as implemented in MOE 2006.08 (Chemical Computing Group, Inc.).

This program employs a variation of the method of Ferguson (17), in which bonds are randomly rotated, rather than using perturbation of Cartesian coordinates. The forcefield was MMFF94x. Minimization was performed for each

243

conformation up to RMS gradient of 0.001. Any two conformers were considered identical if their optimal heavy atom RMS superposition distance was less than a tolerance value of 0.1 Å. All conformations with an energy greater than 7 kcal/mol were excluded from the database. This yielded a conformational database of 17 unique conformations of compound 69, which were used in the docking procedure. The docking of compound 69 into RacE1 and RacE2 was performed with the Dock function within MOE 2006.08, using the Alpha Triangle placement method, and the London dG Scoring method for free energy estimation.

Site-directed Mutagenesis. Mutagenesis was performed using the

QuikChange Mutagenesis Kit from Stratagene (La Jolla, CA). First, complimentary mutagenic primers (Table A.2) were engineered with the desired mutation in the center of the primer and 10-15 bases of correct sequence on either side. Reactions were prepared as described in the QuikChange protocol with pET-15b-racE1 or pET-15b-racE2 as the plasmid template for generation of the RacE1 and RacE2 mutants. After cycling the reaction 18 times in a thermal cycler, the resulting mix was digested with Dpn I, and the resulting DNA was transformed into super competent E. coli XL1-Blue cells. The resulting mutant plasmids were isolated and the entire gene was sequenced to ensure that the appropriate mutations were introduced, while the rest of the gene sequences remained unchanged.

Protease sensitivity assays. RacE1 wild type (0.13 mM), RacE1 C77A

(0.13 mM) and RacE1 C188A (0.13 mM) were incubated in a TRIS buffer (50 mM

TRIS-HCl, 100 mM NaCl, 2 mM DTT, pH 8.0) with varying concentrations of

244

chymotrypsin (0 µg/mL, 53 µg/mL, 213 µg/mL, 640 µg/mL) at 4 °C. RacE2 wild

type (0.13 mM), RacE2 C74A (0.13 mM) and RacE2 C185A (0.13 mM) were

incubated in a TRIS buffer (50 mM TRIS-HCl, 100 mM NaCl, 2 mM DTT, pH 8.0)

with varying concentrations of chymotrypsin (0 µg/mL, 10 µg/mL, 40 µg/mL, 160

µg/mL) at 4 °C. After 1 h, the reactions were stopped by the addition of an equal

volume of 2x SDS sample buffer (4% SDS, 100 mM TRIS, 0.4 mg bromophenol

blue/mL, 0.2 M DTT, 20% glycerol). The samples were boiled for 3 min, and

resolved by SDS-PAGE (16% acrylamide; 70 V for 25 min, and then 200 V for an

additional 100 min). The gels were soaked in fixing solution (25% isopropanol,

10% acetic acid) for 15 min and then placed in Coomassie stain (10% acetic acid,

0.06 mg/mL Coomassie brilliant blue G-250). After 10 h, the gels were rinsed

twice with H2O, preserved by soaking in water with 3% glycerol for 10 h, and

dried between gel drying films.

A.4 RESULTS

Identification and Cloning of B. anthracis RacE1 and RacE2. The two

genes (racE1, BAS0806 [GenBank]) and (racE2, BAS4379 [GenBank]) predicted

to encode glutamate racemases from B. anthracis were identified by sequence homology to racE (BSU2835 [GenBank]) from B. subtilis. racE1 and racE2 were

PCR-amplified from chromosomal DNA purified from B. anthracis Sterne 7702

(Table A.1), and cloned into an expression vector for recombinant expression in

E. coli. racE1 and racE2 are predicted to encode proteins of 32,532 and 32,316

Da, respectively. RacE1 and RacE2 share 51% sequence identity and 67%

245

sequence similarity (Table A.1, Figure A.1), and differ in the lengths of their respective amino- and carboxyl-termini; RacE1 possesses amino-terminal (MSV) and carboxyl-terminal (NARICN) amino acid extensions that are absent in RacE2

(Figure A.1).

Expression and Purification of B. anthracis RacE1 and RacE2. Both

RacE1 and RacE2 were expressed as soluble, recombinant proteins in E. coli, each with an amino-terminal hexa-histidine fusion peptide to facilitate purification via nickel-chelate affinity chromatography. In this single chromatography step, both RacE1 and RacE2 were purified to greater than 98% purity, as estimated by

SDS-PAGE analysis (Figure A.2A). Preliminary experiments to compare properties of the recombinant proteins before and after the amino-terminal polyhistidine fusion peptide was removed by thrombin cleavage indicated that the presence of the amino-terminal polyhistidine fusion peptide had no detectable effects on the kinetic properties of either RacE1 or RacE2 (data not shown).

RacE1 and RacE2 demonstrate similar secondary structures. To compare the secondary structural composition of recombinant RacE1 and RacE2,

CD spectra in the far UV region (190 nm – 260 nm) were collected for RacE1 or

RacE2. These experiments revealed that recombinant RacE1 and RacE2 both yielded CD spectra indicating the presence of α-helix, β-sheet, and β-turn secondary structural elements. Moreover, comparison of the CD spectra revealed that the relative percentages of α-helix, β-sheet, and β-turn secondary structural elements were nearly identical for RacE1 and RacE2 (Figure A.2B;

246

Table A.3). These data suggest that when expressed as recombinant proteins,

RacE1 and RacE2 were similar in their overall secondary structural properties.

RacE1 and RacE2 differ in their solution quaternary structures. To compare the quaternary structures of RacE1 and RacE2, and to rule out the possibility that the purified proteins exist as aggregates, the apparent molecular weights of purified RacE1 and RacE2 were determined by size-exclusion FPLC.

These experiments revealed that both RacE1 and RacE2 eluted well after the exclusion volume of the column, indicating that neither of these proteins exist as large aggregates in solution. However, RacE2 demonstrated a shorter retention time than RacE1, suggesting that, in solution, RacE2 exists in a higher molecular weight form than RacE1 (Figure A.2C). The molecular weights of RacE1 and

RacE2 were calculated from the retention times of the peak absorbance versus calibration standards of known molecular weights. The apparent molecular weight of RacE1 was calculated to be 32.3 ± 4.6 kDa, which is the predicted molecular weight of monomeric RacE1 (32.3 kDa). In contrast, the apparent molecular weight of RacE2 was calculated to be 51.5 ± 5.2 kDa, which is lower than that expected for dimeric protein (64.6 kDa), suggesting that in solution

RacE2 may be polydisperse, existing as both monomers and higher-ordered complexes. These results suggest that although RacE1 and RacE2 share significant sequence similarity, these proteins demonstrate different quaternary structural properties.

Purified RacE1 and RacE2 are both functional in cell-free assays. To evaluate whether racE1 and racE2 encode functional glutamate racemase

247

enzymes, RacE1 and RacE2 were assessed for their capacity to convert L-

glutamate to the corresponding D-enantiomer by using CD to directly observe the

loss of L-glutamate as it is converted to D-glutamate. These experiments revealed that in the absence of RacE1 or RacE2, stereoisomerization of L- glutamate was not detectable (Figure A.2D). In contrast, the consumption of L- glutamate was readily observed in the presence of either RacE1 or RacE2.

Moreover, the rate of L-glutamate consumption increased as a function of RacE1 or RacE2 concentration. These data indicated that under cell-free and highly defined conditions, RacE1 and RacE2 both catalyzed the conversion of L- glutamate to D-glutamate in a concentration-dependent fashion. These results also established, for the first time, that, despite reported differences in phenotypes of the null mutants (52), racE1 and racE2 both encode functional enzymes that demonstrate glutamate racemase activity.

RacE1 and RacE2 both catalyze the reverse reaction: conversion of

D-glutamate to L-glutamate. In addition to catalyzing the conversion of L- glutamate to the corresponding D-enantiomer, glutamate racemases also catalyze the reverse reaction; e.g. conversion of D-glutamate to the corresponding L-enantiomer. To evaluate whether RacE1 and RacE2 share this canonical property of the glutamate racemase family, we used CD to directly measure the stereoisomerization of D-glutamate to the corresponding L- enantiomer. The experiments revealed a time-dependent increase in the CD signal, corresponding to the loss of D-glutamate, in the presence of either RacE1 or RacE2 (Figure A.2E). Under the conditions of the reaction, the initial rate of

248

stereoisomerization of D-glutamate was higher for RacE2 than for RacE1. By comparison, the rates of stereoisomerization of L-glutamate were similar for

RacE2 and RacE1. Finally, these experiments revealed that for both RacE1 and

RacE2, the rate of D-glutamate conversion is lower than the rate of L-glutamate conversion. These results indicated that RacE1 and RacE2 share one of the canonical properties of glutamate racemases, which is the capacity to catalyze the stereoisomerization of either glutamate enantiomer.

Steady-state kinetic analysis of RacE1 and RacE2. To compare the catalytic properties of RacE1 and RacE2 in more detail, we analyzed the steady- state kinetic parameters of the two enzymes in the presence of D- or L-glutamate.

RacE1 or RacE2 was incubated in a potassium borate buffer in the presence of varying concentrations of D-glutamate or L-glutamate and the change in magnitude of the CD signal was monitored. Initial rate data were obtained for

RacE1 and RacE2 in each of the different substrate concentrations, and a plot of the rate of glutamate racemization (nmol/sec) versus substrate concentration

(mM) was generated for RacE1 and RacE2 in the presence of L- and D- glutamate (Figure A.3). Steady state kinetic parameters for RacE1 and RacE2 in the presence of L- or D-glutamate were obtained by applying a non-linear curve fit to the data (Table A.4). In the L→D direction, differences in the individual kinetic parameters for RacE1 and RacE2 were statistically significant (RacE1, kcat

-1 -1 = 12 ± 0.6 sec ; RacE2, kcat = 18 ± 0.6 sec , P = 0.0003; RacE1, KM = 19 ± 4 mM; RacE2, KM = 38 ± 6 mM; P = 0.01). In the D→L direction, differences in the individual kinetic parameters for RacE1 and RacE2 were not statistically

249

-1 -1 significant (RacE1, kcat = 1.8 ± 0.1 sec , RacE2, kcat = 2.0 ± 0.1 sec , P = 0.07;

RacE1, KM = 1.0 ± 0.2 mM; RacE2, KM = 0.77 ± 0.1 mM, P = 0.1). These findings

are in contrast to those for B. subtilis, where approximately 100-fold differences

in the catalytic efficiencies exist between RacE and YrpC (1, 2). Overall, these

data indicated that RacE1 and RacE2 demonstrate similar, but not identical,

steady-state kinetic properties under cell-free conditions.

pH dependence of RacE1 and RacE2 catalysis. To probe active site

features of RacE1 and RacE2 that might contribute to catalysis, we investigated

the pH dependence of RacE1- and RacE2-catalyzed glutamate racemase activity.

These experiments revealed that the rates of stereoisomerization in the L→D

direction catalyzed by RacE1 or RacE2 are both pH-dependent, with pH optima

of 8.2 and 8.1, respectively. The kcat vs. pH data fit a bell-shaped curve (Eq. 1)

representative of an enzyme utilizing two ionizable side-chains for catalysis

(Figure A.4). kcatapp corresponds to the measured kcat, while kcatmax reports the

true kcat and pKA1 and pKA2 correspond to the pKAs of two ionizable groups important for enzyme activity.

kcatapp = kcatmax/(1+10^(pKA1-pH)+10^(pH-pKA2)) (1)

The pKA values for the two ionizable groups were calculated to be 6.3 ±

0.1 and 9.7 ± 0.1 for RacE1 and 6.1 ± 0.1 and 9.7 ± 0.1 for RacE2. These data

suggest that the racemization reactions catalyzed by RacE1 and RacE2 are both

dependent upon the ionization state of two catalytic residues. Moreover, the pKA

values for the two residues are strikingly similar between the two enzymes,

suggesting that both RacE1 and RacE2 utilize similar residues for catalysis.

250

Site directed mutagenesis reveals active site residues important for

catalysis in both RacE1 and RacE2. To further compare active site features of

RacE1 and RacE2, we constructed three-dimensional homology models for both

B. anthracis RacE1 and RacE2, based upon the available crystal structure for B.

subtilis RacE (43). RacE shares 53% and 59% amino acid sequence identity with

RacE1 and RacE2, respectively. The three-dimensional homology models

generated for B. anthracis RacE1 and RacE2 based on the B. subtilis RacE-D-

glutamate template exhibited Cα root mean square deviation (RMSD) values of

2.6 Å and 7.5 Å, respectively, indicating that RacE2 may share significant

structural similarities with B. subtilis RacE, while RacE1 may possess a more divergent structural arrangement. The homology models for both RacE1 and

RacE2 revealed that two cysteine residues (C77 and C188 for RacE1 and C74 and C185 for RacE2) are predicted to be in close proximity to the predicted location of the D-glutamate substrate (Figure A.5A), suggesting that these two

cysteine residues may be important for the enzymatic activities of RacE1 or

RacE2.

To determine whether the predicted RacE1 and RacE2 active site

residues are important for catalysis, RacE1 C77 and C188, and RacE2 C74 and

C185 were independently changed to alanine by site-directed mutagenesis.

These experiments revealed that, for RacE1, alanine substitutions at C77 or

C188 attenuated detectable racemase activity by at least 100-fold (Figure A.5B).

Likewise, for RacE2, alanine substitutions at C74 or C185A attenuated detectable activity by at least 100-fold. The lower limit of detectable activity was

251

obtained by diluting RacE1 or RacE2 until racemase activity was no longer

detectable (data not shown). To ensure that the alanine substitution had not

resulted in gross structural changes, we compared protease sensitivity patterns

of wild-type and mutant forms of RacE1 and RacE2. These experiments revealed

similar chymotrypsin sensitivity patterns between full-length RacE1 and RacE1

C77A or RacE1 C188A, as well as for full-length RacE2 and RacE2 C74A or

RacE2 C185A (Figure A.5C), indicating a lack of gross structural differences

between wild type and mutant forms of the full-length proteins. Taken together,

these data indicate that C77/C188 and C74/C185 are important for the

racemization activities of RacE1 and RacE2, respectively, and suggest that

RacE1 and RacE2 may possess several similar active site features. In addition,

these results suggest that the homology models we generated for RacE1 and

RacE2 may be useful for generating future predictions about the structure-

function relationships underlying the activities of these enzymes.

A.5 DISCUSSION

While most bacteria possess a single glutamate racemase gene, B.

anthracis has two genes, racE1 and racE2, each predicted to encode a

glutamate racemase. Only several other low G+C Gram-positive bacteria,

including B. subtilis, are predicted, or have been shown experimentally, to encode two glutamate racemases (30). The presence of two putative glutamate

racemases in B. anthracis is potentially of great importance from a

chemotherapeutic standpoint because such functional redundancy could

252

mandate that both enzymes be effectively inhibited to successfully prevent the growth of bacilli during infection. On one hand, if both enzymes possess similar enzymatic properties and active site features, a single compound might be sufficient to inhibit both enzymes. On the other hand, the RacE1 and RacE2 active sites may be sufficiently divergent that a single compound might be ineffective at preventing D-glutamate production by B. anthracis.

A recent study reported a significantly more severe growth defect for a B. anthracis racE2 deletion mutant than a racE1 deletion mutant (52), suggesting fundamental differences between these two genes. However, it was not demonstrated that the growth defects were due specifically to reduced stereoisomerization of L-glutamate to D-glutamate by the mutant bacilli.

Moreover, at the time of this study, neither racE1 nor racE2 had been demonstrated to encode functional glutamate racemases. The overall objective of the current study was therefore two-fold: First, to establish if either racE1 or racE2 encode a functional glutamate racemase and, if so, to explore whether fundamental differences exist in the biochemical properties between RacE1 and

RacE2, which could provide insights into the growth phenotype differences between the racE1 and racE2 mutant strains (52). Notably, the two glutamate racemase isoenzymes in B. subtilis (RacE and YrpC) were demonstrated to possess markedly different catalytic properties, with RacE exhibiting 100-fold higher catalytic efficiency over YrpC (1, 2). In addition, racE is an essential gene, while yrpC is dispensable for growth in rich medium, supporting the idea that in B.

253

subtilis, only one of two glutamate racemases is necessary for converting L- glutamate to the D-glutamate required for rapid proliferation (31).

Our data indicated that racE1 and racE2 both encode functional glutamate racemases. In addition, we demonstrated that in a highly defined, cell-free system, RacE1 and RacE2 both catalyze the stereoisomerization of glutamate. In contrast to the disparate properties demonstrated for B. subtilis RacE and YrpC

(1, 2), steady state kinetic analysis identified B. anthracis as the first organism harboring genes encoding two glutamate racemases, which in cell free assays, demonstrate similar, although not identical, catalytic properties. Analysis of the pH dependence of L-glutamate racemization suggested RacE1 and RacE2 each possess two active site residues with titratable side chains of nearly identical pKA values. Furthermore, based on homology models we generated for RacE1 and

RacE2, we used directed mutagenesis to demonstrate the importance of two cysteine residues predicted to be in the active sites of both enzymes. Taken together, these results suggested that RacE1 and RacE2 may share similar active site features. However, a more detailed comparison of the predicted active sites geometries of the RacE1 and RacE2 homology models (based on the B. subtilis RacE-D-glutamate crystal structure) suggested that several conserved active site residues may be positioned differently relative to bound D-glutamate

(Figure A.6A). For example, distances from the nitrogen atom of D-glutamate to the analogous oxygen atoms of two conserved active site side chains varied between RacE1 (3.1 Å for T189, 3.0 Å for S15) and RacE2 (3.4 Å for T186, 2.6 Å for S12). Furthermore, the proximity of the α-carbon of D-glutamate to the

254

analogous sulfur atoms of RacE1 C77 (3.1 Å) and RacE2 C74 (3.4 Å) was

different.

Differences in non-conserved residues surrounding the RacE1 and RacE2

active sites may also impart altered sensitivity to inhibitors. (2R,4R)-2-amino-4-

(2-benzo[b]thienyl)methyl pentanedioic acid (compound 69) is a potent inhibitor of Streptococcus pneumoniae RacE (IC50=0.036 ug/mL), yet its inhibitory

properties are highly attenuated for Staphylococcus aureus and Moraxella

catarrhalis RacE (12). It has been suggested that the substitution of an alanine at

position 149 with a valine may be responsible for these altered

pharmacodynamic properties between S. pneumoniae and S. aureus (43).

Interestingly, B. anthracis RacE1 possesses an alanine (A152) and RacE2

possesses a valine (V149) at the same positions in the predicted structural

models, which suggests that these two enzymes may exhibit altered sensitivity to

inhibition by compound 69. To explore this possibility in further detail, we

computationally generated a library of conformers of compound 69 and scored

the conformer library for high affinity binding in the active site of the RacE1 or

RacE2 homology models (see Experimental Procedures). Compound 69 bound

tightly (-6.17 kcal/mol) within the active site of RacE1 and adopted a structure

analogous to that of the bound substrate. However, entrance of compound 69

into the active site of RacE2 was occluded by the presence of V149 (Figure

A.6B) and no favorable active site docking conformations could be identified.

These results suggest that fitting of a pharmacophore into the catalytic pocket of

RacE1 and RacE2 may have different requirements, underscoring the potential

255

importance of considering the active sites of both enzymes during the process of pharmacophore design.

The results from these studies suggest that differences in the racE1 and racE2 phenotypes reported earlier are unlikely to be due solely to differences between the intrinsic catalytic efficiencies of the two enzymes (52). However, we cannot rule out that the catalytic properties of RacE1 and RacE2 may be substantially different within the bacterium. Several other possible levels of regulation of racE1 and racE2 inside the organism may contribute to the observed phenotypic differences in the respective deletion mutants. Transcript levels of virulence factors including the PDGA capsule are regulated by the global transcriptional regulator, atxA (8, 11, 32, 60). The increased production of

PDGA capsule that occurs in response to CO2 induction may lead to a larger demand for cytoplasmic levels of D-glutamate, thus it is plausible to speculate that racE1 and racE2 could be differentially regulated at the transcriptional level.

In addition, the relative importance or roles of RacE1 and RacE2 may be associated with differential localization of the two proteins. Differential intracellular localization of proteins has been shown to be important and tightly coupled with the life cycle of B. subtilis (49-51). Putative localization signals were not computationally identified for RacE1 or RacE2 (data not shown). An alternative possibility is that the enzymatic activities of RacE1 and RacE2 are regulated at the post-translational level. Indeed, the enzymatic activity of glutamate racemase from E. coli is regulated by a peptidoglycan precursor (15,

27), a function that raises the possibility that a metabolite or peptidoglycan

256

precursor could regulate the enzymatic activity of RacE1 or RacE2. Finally, our

experiments revealed RacE1 and RacE2 demonstrated distinct properties in

solution when analyzed by gel filtration. The source of the unexpected apparent

molecular weight for RacE2 in solution (51.5 ± 5.2 kDa; Figure A.2) is currently

unknown, but could be due one or more here-to-fore unrecognized factors,

including polydispersity, the binding of co-factors, post-translational modifications, etc. Moreover, the significance of these apparent quaternary structural differences between RacE1 and RacE2 in solution, and whether these differences are associated with regulation of enzymatic activities within the cell, is unclear. Notably, within the family of glutamate racemases, quaternary structure differences apparently exist, as enzymes from B. subtilis (YrpC) (1), B. pumilus

(33), L. fermenti (13, 19) and P. pentosaceus (37) have been reported to be

monomers, while a dimeric form has been reported for A. pyrophilus (29), and

conflicting reports exist about the quaternary structure of E. coli enzyme (15, 62).

Furthermore RacE from B. subtilis has been reported to exist in equilibrium

between a monomer and a dimer (57). Studies are currently underway in the

laboratory to further explore the cellular roles of RacE1 and RacE2 and elucidate

the basis of regulation of their cellular activities.

In summary, we have demonstrated that racE1 and racE2 encode

functional glutamate racemases, suggesting that the enzymatic activities of both

enzymes should be targeted for inhibition. Although we have demonstrated

experimentally that RacE1 and RacE2 possess similar, but not identical,

enzymatic properties and catalytic residues, several differences in active site

257

features are predicted for RacE1 and RacE2, suggesting that both active sites will need to be considered when designing effective drugs for effective inhibition of glutamate racemase activity in B. anthracis.

A.6 ACKNOWLEDGEMENTS

This work was supported by an NIH-NIAID Award to the Western Regional

Center of Excellence for Biodefense and Emerging Infectious Disease Research

U54-AI057156 (SRB; P.I. D. Walker).

A.7 REFERENCES

1. Ashiuchi, M., K. Soda, and H. Misono. 1999. Characterization of yrpC gene product of Bacillus subtilis IFO 3336 as glutamate racemase isozyme. Biosci Biotechnol Biochem 63:792-798.

2. Ashiuchi, M., K. Tani, K. Soda, and H. Misono. 1998. Properties of glutamate racemase from Bacillus subtilis IFO 3336 producing poly- gamma-glutamate. J Biochem (Tokyo) 123:1156-1163.

3. Ashiuchi, M., T. Yoshimura, N. Esaki, H. Ueno, and K. Soda. 1993. Inactivation of glutamate racemase of Pediococcus pentosaceus with L- serine O-sulfate. Biosci Biotechnol Biochem 57:1978-1979.

4. Athamna, A., M. Athamna, N. Abu-Rashed, B. Medlej, D. J. Bast, and E. Rubinstein. 2004. Selection of Bacillus anthracis isolates resistant to antibiotics. J Antimicrob Chemother 54:424-428.

5. Baltz, R. H., J. A. Hoskins, P. J. Solenberg, and P. J. Treadway. 1999. Method for knockout mutagenesis in Streptococcus pneumoniae. U.S. Patent 5,981,281.

6. Borio, L. L., and G. K. Gronvall. 2005. Anthrax countermeasures: current status and future needs. Biosecur Bioterror 3:102-112.

7. Bossi, P., D. Garin, A. Guihot, F. Gay, J. M. Crance, T. Debord, B. Autran, and F. Bricaire. 2006. Bioterrorism: management of major biological agents. Cell Mol Life Sci 63:2196-2212.

258

8. Bourgogne, A., M. Drysdale, S. G. Hilsenbeck, S. N. Peterson, and T. M. Koehler. 2003. Global effects of virulence gene regulators in a Bacillus anthracis strain with both virulence plasmids. Infect Immun 71:2736-2743.

9. Candela, T., and A. Fouet. 2006. Poly-gamma-glutamate in bacteria. Mol Microbiol 60:1091-1098.

10. Choi, S. Y., N. Esaki, T. Yoshimura, and K. Soda. 1991. Overproduction of glutamate racemase of Pediococcus pentosaceus in Escherichia coli clone cells and its purification. Protein Expr Purif 2:90-93.

11. Dai, Z., J. C. Sirard, M. Mock, and T. M. Koehler. 1995. The atxA gene product activates transcription of the anthrax toxin genes and is essential for virulence. Mol Microbiol 16:1171-1181.

12. de Dios, A., L. Prieto, J. A. Martin, A. Rubio, J. Ezquerra, M. Tebbe, B. Lopez de Uralde, J. Martin, A. Sanchez, D. L. LeTourneau, J. E. McGee, C. Boylan, T. R. Parr, Jr., and M. C. Smith. 2002. 4-Substituted D-glutamic acid analogues: the first potent inhibitors of glutamate racemase (MurI) enzyme with antibacterial activity. J Med Chem 45:4559- 4570.

13. Diven, W. F. 1969. Studies on amino acid racemases. II. Purification and properties of the glutamate racemase from Lactobacillus fermenti. Biochim Biophys Acta 191:702-726.

14. Doublet, P., J. van Heijenoort, J. P. Bohin, and D. Mengin-Lecreulx. 1993. The murI gene of Escherichia coli is an essential gene that encodes a glutamate racemase activity. J Bacteriol 175:2970-2979.

15. Doublet, P., J. van Heijenoort, and D. Mengin-Lecreulx. 1994. The glutamate racemase activity from Escherichia coli is regulated by peptidoglycan precursor UDP-N-acetylmuramoyl-L-alanine. Biochemistry 33:5285-5290.

16. Drysdale, M., S. Heninger, J. Hutt, Y. Chen, C. R. Lyons, and T. M. Koehler. 2005. Capsule synthesis by Bacillus anthracis is required for dissemination in murine inhalation anthrax. Embo J 24:221-227.

17. Ferguson, D. M., Raber, J. 1989. A New Approach to Probing Conformational Space with Molecular Mechanics: Random Incremental Pulse Search. J Am Chem Soc 111:4371-4378.

18. Fotheringham, I. G., S. A. Bledig, and P. P. Taylor. 1998. Characterization of the genes encoding D-amino acid transaminase and glutamate racemase, two D-glutamate biosynthetic enzymes of Bacillus sphaericus ATCC 10208. J Bacteriol 180:4319-4323.

259

19. Gallo, K. A., and J. R. Knowles. 1993. Purification, cloning, and cofactor independence of glutamate racemase from Lactobacillus. Biochemistry 32:3981-90.

20. Gill, S. C., and P. H. von Hippel. 1989. Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182:319-326.

21. Glaser, L. 1960. Glutamic acid racemase from Lactobacillus arabinosus. J Biol Chem 235:2095-2098.

22. Glavas, S., Tanner, M.E. 1997. The inhibition of glutamate racemase by D-N-hydroxyglutamate. Bioorg Med Chem Lett 7:2265-2270.

23. Guidi-Rontani, C., M. Weber-Levy, E. Labruyere, and M. Mock. 1999. Germination of Bacillus anthracis spores within alveolar macrophages. Mol Microbiol 31:9-17.

24. Hanby, W. E., Rydon, H. N. 1946. The Capsular Substance of Bacillus anthracis. Biochem J 40:297-309.

25. Hanna, P. 1998. Anthrax pathogenesis and host response. Curr Top Microbiol Immunol 225:13-35.

26. Harth, G., S. Maslesa-Galic, M. V. Tullius, and M. A. Horwitz. 2005. All four Mycobacterium tuberculosis glnA genes encode glutamine synthetase activities but only GlnA1 is abundantly expressed and essential for bacterial homeostasis. Mol Microbiol 58:1157-1172.

27. Ho, H. T., P. J. Falk, K. M. Ervin, B. S. Krishnan, L. F. Discotto, T. J. Dougherty, and M. J. Pucci. 1995. UDP-N-acetylmuramyl-L-alanine functions as an activator in the regulation of the Escherichia coli glutamate racemase activity. Biochemistry 34:2464-2470.

28. Hwang, K. Y., C. S. Cho, S. S. Kim, H. C. Sung, Y. G. Yu, and Y. Cho. 1999. Structure and mechanism of glutamate racemase from Aquifex pyrophilus. Nat Struct Biol 6:422-426.

29. Kim, S. S., I. G. Choi, S. H. Kim, and Y. G. Yu. 1999. Molecular cloning, expression, and characterization of a thermostable glutamate racemase from a hyperthermophilic bacterium, Aquifex pyrophilus. Extremophiles 3:175-183.

30. Kimura, K., L. S. Tran, and Y. Itoh. 2004. Roles and regulation of the glutamate racemase isogenes, racE and yrpC, in Bacillus subtilis. Microbiology 150:2911-2920.

31. Kobayashi, K., Ehrich, S.D., Albertini, A., et. al. 2003. Essential Bacillus subtilis genes. Proc Natl Acad Sci USA 100:4678-4683.

260

32. Koehler, T. M., Z. Dai, and M. Kaufman-Yarbray. 1994. Regulation of the Bacillus anthracis protective antigen gene: CO2 and a trans-acting element activate transcription from one of two promoters. J Bacteriol 176:586-595.

33. Liu, L., T. Yoshimura, K. Endo, N. Esaki, and K. Soda. 1997. Cloning and expression of the glutamate racemase gene of Bacillus pumilus. J Biochem (Tokyo) 121:1155-1161.

34. Lobley, A., Whitmore, L., and Wallace, B. A. 2002. DICHROWEB: an interactive website for the analysis of protein secondary structure from circular dichroism spectra Bioinformatics 18:211-212.

35. Malathi, K. C., M. Wachi, and K. Nagai. 1999. Isolation of the murI gene from Brevibacterium lactofermentum ATCC 13869 encoding D-glutamate racemase. FEMS Microbiol Lett 175:193-196.

36. Mock, M., and A. Fouet. 2001. Anthrax. Annu Rev Microbiol 55:647-671.

37. Nakajima, N., Tanizawa, K., Tanaka, H., Soda, K. 1986. Cloning and expression in Escherichia coli of the glutamate racemase gene from Pediococcus pentosaceus. Agricul Biol Chem 50:2823-2830.

38. Nanninga, N. 1998. Morphogenesis of Escherichia coli. Microbiol Mol Biol Rev 62:110-129.

39. Park, J. T. 1996. p. 48. In F. C. Neidhardt, Curtiss, R. (ed.), Escherichia coli and Salmonella: Cellular and Molecular Biology ASM Press, Washington D.D.

40. Pucci, M. J., J. A. Thanassi, H. T. Ho, P. J. Falk, and T. J. Dougherty. 1995. Staphylococcus haemolyticus contains two D-glutamic acid biosynthetic activities, a glutamate racemase and a D-amino acid transaminase. J Bacteriol 177:336-342.

41. Rogers, H. J., Perkins, H.R., Ward, J.B. 1980. Microbial cell walls and membranes. Chapman & Hall, London.

42. Ross, J. M. 1955. On the histopathology of experimental anthrax in the guinea-pig. Br J Exp Pathol 36:336-339.

43. Ruzheinikov, S. N., M. A. Taal, S. E. Sedelnikova, P. J. Baker, and D. W. Rice. 2005. Substrate-induced conformational changes in Bacillus subtilis glutamate racemase and their implications for drug discovery. Structure (Camb) 13:1707-1713.

44. Salton, M. R. J. 1994. p. 1. In J. M. Ghuysen, Hackenbeck, R. (ed.), Bacterial cell wall. Elsevier Science Pub. Co. , Amsterdam.

261

45. Schleifer, K. H., and O. Kandler. 1972. Peptidoglycan types of bacterial cell walls and their taxonomic implications. Bacteriol Rev 36:407-477.

46. Scorpio, A., D. J. Chabot, W. A. Day, K. O'Brien D, N. J. Vietri, Y. Itoh, M. Mohamadzadeh, and A. M. Friedlander. 2007. Poly-gamma- glutamate capsule-degrading enzyme treatment enhances phagocytosis and killing of encapsulated Bacillus anthracis. Antimicrob Agents Chemother 51:215-222.

47. Sengupta, S., M. Shah, and V. Nagaraja. 2006. Glutamate racemase from Mycobacterium tuberculosis inhibits DNA gyrase by affecting its DNA-binding. Nucleic Acids Res 34:5567-5576.

48. Severin, A., K. Tabei, and A. Tomasz. 2004. The structure of the cell wall peptidoglycan of Bacillus cereus RSVF1, a strain closely related to Bacillus anthracis. Microb Drug Resist 10:77-82.

49. Shapiro, L., and R. Losick. 2000. Dynamic spatial regulation in the bacterial cell. Cell 100:89-98.

50. Shapiro, L., and R. Losick. 1997. Protein localization and cell fate in bacteria. Science 276:712-718.

51. Shapiro, L., H. H. McAdams, and R. Losick. 2002. Generating and exploiting polarity in bacteria. Science 298:1942-1946.

52. Shatalin, K. Y., and A. A. Neyfakh. 2005. Efficient gene inactivation in Bacillus anthracis. FEMS Microbiol Lett 245:315-319.

53. Silver, L. L. 2006. Does the cell wall of bacteria remain a viable source of targets for novel antibiotics? Biochem Pharmacol 71:996-1005.

54. Silver, L. L. 2003. Novel inhibitors of bacterial cell wall synthesis. Curr Opin Microbiol 6:431-438.

55. Stepanov, A. V., L. I. Marinin, A. P. Pomerantsev, and N. A. Staritsin. 1996. Development of novel vaccines against anthrax in man. J Biotechnol 44:155-160.

56. Sterne, M. 1946. Onderstepoorte J Vet Sci Anim Ind 21:41-43.

57. Taal, M. A., S. E. Sedelnikova, S. N. Ruzheinikov, P. J. Baker, and D. W. Rice. 2004. Expression, purification and preliminary X-ray analysis of crystals of Bacillus subtilis glutamate racemase. Acta Crystallogr D Biol Crystallogr 60:2031-2034.

262

58. Tanaka, M., Y. Kato, and S. Kinoshita. 1961. Glutamic acid racemase from Lactobacillus fermenti. Purification and properties. Biochem Biophys Res Commun 4:114-117.

59. Tanner, M. E., Miao, S. 1994. The synthesis and stability of aziridinoglutamate, an irreversible inhibitor of glutamate racemase. Tetrahedron Lett 35:4073-4076.

60. Uchida, I., J. M. Hornung, C. B. Thorne, K. R. Klimpel, and S. H. Leppla. 1993. Cloning and characterization of a gene whose product is a trans-activator of anthrax toxin synthesis. J Bacteriol 175:5329-5338.

61. Yagasaki, M., K. Iwata, S. Ishino, M. Azuma, and A. Ozaki. 1995. Cloning, purification, and properties of a cofactor-independent glutamate racemase from Lactobacillus brevis ATCC 8287. Biosci Biotechnol Biochem 59:610-614.

62. Yoshimura, T., M. Ashiuchi, N. Esaki, C. Kobatake, S. Y. Choi, and K. Soda. 1993. Expression of glr (murI, dga) gene encoding glutamate racemase in Escherichia coli. J Biol Chem 268:24242-24246.

A.8 TABLES AND FIGURES

TABLE A.1 Cloning of Bacillus anthracis glutamate racemase genes a d Organism Gene Primer Sequence (5’→3’)c Predicted MW Accession (Da) Numberb Bacillus racE1 Forward 32532 anthracis BAS0806 5’-GCTGCTCGAGTCTGTATGTCATAAAC-3’ Sterne Reverse 7702 5’-AGTGGATCCCTAATTACAGATGCGA-3’ Bacillus racE2 Forward 32316 anthracis BAS4379 5’-GTGATGTCGACAAGTTGAATAGAGCAATCGGTG-3’ Sterne Reverse 7702 5’-GCTAGGATCCTTATTCTTTTTCTAAATGAATATG-3’ a Bacillus anthracis Sterne 7702 (pXO1+ pXO2-) was used in these studies. b DNA sequences for racE1 and racE2 were obtained from The Institute for Genomic Research Comprehensive Microbial Resource (http://cmr.tigr.org/tigr-scripts/CMR/CmrHomePage.cgi). c Oligonucleotide primers were engineered such that 5’ Xho I (racE1) or Sal I (racE2) and 3’ Bam H I (racE1 and racE2) restriction sites were incorporated. Primers were synthesized by Integrated DNA Technologies (Skokie, IA). d Coding sequences and molecular weight calculations were predicted based upon additional residues contributed by the pET-15b vector N-terminal polyhistidine-tag (Novagen).

263

TABLE A.2 Primer sequences used for mutagenesis b Gene Desired Primer Name Primer Sequence (5’→3’)c Mutationa racE1 Cys77Ala racE1C77AFor 5’-GGCTCTAGTTGTAGCAGCGAATACTGCTGCAGCTGC-3’ racE1C77ARev 5’-GCAGCTGCAGCAGTATTCGCTGCTACAACTAGAGCC-3’ racE1 Cys188Ala racE1C188AFor 5’-GATACGTTAATTCTTGGGGCGACGCATTATCCACTTTTAGAG-3’ racE1C188ARev 5’-CTCTAAAAGTGGATAATGCGTCGCCCCAAGAATTAACGTATC-3’ racE2 Cys74Ala racE2C74AFor 5’-CAAAATGTTAGTTATTGCAGCGAATACAGCAACTGCAGTTGTATTAG-3’ racE2C74ARev 5’-CTAATACAACTGCAGTTGCTGTATTCGCTGCAATAACTAACATTTTG-3’ racE2 Cys185Ala racE2C185AFor 5’-GATACACTTATTTTAGGTGCGACACATTATCCGATTTTAGGTCC-3’ racE2C185ARev 5’-GGACCTAAAATCGGATAATGTGTCGCACCTAAAATAAGTGTATC-3’ a Residues were selected for conservative mutagenesis due to their close proximity to D-glutamate in the homology models generated from the B. subtilis RacE crystal structure (43). b Primers were designed in accordance with the Quikchange Site-Directed Mutagenesis protocol by Stratagene (La Jolla, CA) and were synthesized by Integrated DNA Technologies (Coralville, IA). c Underlined sequence indicates engineered codon mutations. 264

TABLE A.3 Analysis of CD spectraa, b using DICHROWEBc Protein α- β- β-turn unordered helix sheet (%) (%) (%) (%) RacE1 38 ± 3 15 ± 2 20 ± 1 28 ± 1 RacE2 37 ± 3 16 ± 2 20 ± 1 28 ± 1 a Circular dichroism spectra were recorded in the far UV range utilizing a J-720 Circular Dichroism Spectropolarimeter, with spectra recorded from 190 nm to 260 nm at a scan rate of 50 nm/sec, a wavelength step of 1 nm, and 5 accumulations. b Each reaction was performed with three independent enzyme preparations and error bars represent standard deviations from the mean. c Spectra were uploaded onto the DICHROWEB on-line server and analyzed as described under Materials and Methods.

265

TABLE A.4 Kinetic parameters for RacE1 and RacE2 Racemization (L→D)a, b Racemization (D→L)a, b kcat KM kcat/KM kcat KM kcat/KM (sec-1)c,d (mM)c,e (1x103M-1s-1)c,f (sec-1)c,g (mM) c,h (1x103M-1s-1) c,i RacE1 12 ± 0.6 19 ± 4 0.64 ± 0.2 1.8 ± 0.1 1.0 ± 0.2 1.7 ± 0.2 RacE2 18 ± 0.6 38 ± 6 0.48 ± 0.2 2.0 ± 0.1 0.77 ± 0.1 2.6 ± 0.2 a The racemization of both D- and L- glutamate by RacE1 and RacE2 were analyzed as described under Materials and Methods. b Each reaction was performed with three independent enzyme preparations and error bars represent standard deviations from the mean. c The kinetic parameters for RacE1 and RacE2 were assessed for statistically significant differences by the d e f g h i student’s t-test: , P = 0.0003; , P = 0.01; , P = 0.4; , P = 0.07; , P = 0.1; , P = 0.005.

266

Figure A.1 B. anthracis RacE1 and RacE2 possess significant sequence homology to B. subtilis RacE. Primary amino acid sequences for B. anthracis RacE1, RacE2 and B. subtilis RacE were aligned utilizing ESPript V.2.2 (http://espript.ibcp.fr/ESPript/ESPript/). Secondary structural elements and solvent accessibility indices were imported from the crystal structure of Bacillus subtilis 168 RacE (PDB 1ZUW) (43). For solvent accessibility indices, the darker shaded regions indicate sections in the folded proteins that are exposed to solvent.

267

Figure A.2 RacE1 and RacE2 are both functional enzymes in a cell-free system. (A) Purification of recombinant RacE1 and RacE2. The RacE1 and RacE2 clarified lysates and fractions from nickel-chelate affinity chromatography were analyzed by 12% SDS-PAGE followed by Coomassie brilliant blue G-250 staining. (B) RacE1 and RacE2 secondary structure. Circular dichroism spectra in the far UV region (190 nm – 260 nm) were recorded for RacE1 or RacE2 (2.7 µM or 2.4 µM, respectively: both in 50 mM potassium borate buffer, pH 8.0). (C) Gel filtration chromatography. The sizes of purified RacE1 and RacE2 were scored by size-exclusion FPLC. The molecular weights of RacE1 and RacE2 were calculated from the retention times of the peak absorbance versus calibration standards of known molecular weights. (D) Racemization of glutamate in the L→D direction. RacE1 and RacE2 were assessed for their capacity to convert L-glutamate to the corresponding D-enantiomer by using circular dichroism to directly observe the loss of L- glutamate as it is converted to D-glutamate. L-glutamate (5 mM) was incubated in the absence or presence of RacE1 or RacE2 (1.3 µM, 0.33 µM, or 0.08 µM) and the CD signal was recorded for 2.25 hours. (E) Racemization of glutamate in the D→L direction . RacE1 and RacE2 were assessed for their capacity to catalyze the forward (L→D-glutamate) and the reverse (D→L-glutamate) reactions by circular dichroism spectroscopy. L-glutamate (5 mM) or D-glutamate (5 mM) was incubated in the presence of RacE1 (0.31 µM) or RacE2 (0.31 µM) and the CD signal was recorded for 2.25 hours. For (A-E), at least three separate experiments were performed. For each independent experiment, we used RacE1 or RacE2 from one of three independent enzyme preparations, as well as assay reagents from one of three independent preparations. For (A-E), representative data from a single experiment are shown. For (C), the molecular weights are reported as the mean from three independent experiments, with standard deviations.

268

Figure A.3 Steady-State Kinetic Analysis of RacE1 and RacE2. RacE1 (0.78 µM) or RacE2 (0.78 µM) was incubated in a potassium borate buffer (50 mM boric acid, 100 mM KCl, 0.2 mM DTT, pH 8.0) in the presence of varying concentrations of D-glutamate (A) (0.1 mM – 10 mM) or L-glutamate (B) (5 mM – 200 mM) and the change in magnitude of the CD signal was monitored. (A) Steady-state kinetic parameters for the racemization of glutamate in the L→D direction by RacE1 and RacE2. The data are displayed as initial rate of racemization for RacE1 and RacE2 as a function of L-glutamate concentration. Steady state kinetic parameters for RacE1 and RacE2 in the presence of L- glutamate were obtained by applying a non-linear curve fit to the data. (B) Steady-state kinetic parameters for the racemization glutamate in the D→L direction by RacE1 and RacE2. The data are displayed as initial rate of racemization for RacE1 and RacE2 as a function of D-glutamate concentration. Steady state kinetic parameters for RacE1 and RacE2 in the presence of D- glutamate were obtained by applying a non-linear least squares regression utilizing GraphPad Prisim V4.03. For all studies, at least three independent experiments were performed. For each independent experiment, we used RacE1 or RacE2 from one of three independent enzyme preparations, as well as assay reagents from one of three independent preparations. The data are presented as the mean of the data from three independent experiments, with error bars indicating standard deviations from the mean.

269

Figure A.4 pH dependence of RacE1 and RacE2 catalysis. RacE1 and RacE2 were assayed in buffers of varying pH from 6.5 to 9.5 with increments of 0.5 pH units. Each buffer was prepared with a concentration of glutamate (200 mM) that was fully saturating for RacE1 and near saturating (83% saturating) for RacE2, such that initial rate data would report true kcat values. Initial rate data was then obtained for RacE1 (0.78 µM) and RacE2 (0.78 µM) in each of the different buffer formulations by measuring the change in magnitude of the CD signal over time. The data are displayed as turnover number (kcat) as a function of the reaction pH. The results are presented as the mean of the data from three independent experiments. For each independent experiment, we used RacE1 or RacE2 from one of three independent enzyme preparations, as well as assay reagents from one of three independent preparations. Error bars represent standard deviations from the mean.

270

Figure A.5 Site directed mutagenesis reveals active site residues important for catalysis in both RacE1 and RacE2. (A) Three-dimensional homology models. Homology models for RacE1 and RacE2 were constructed using the Chemical Computing Group’s Molecular Operating Environment (MOE) 2006.08. (B) Demonstration of residues important for catalysis. The two putative catalytic cysteine residues in RacE1 (Cys77 and Cys188) and RacE2 (Cys74 and Cys185) were independently changed to alanine by site-directed mutagenesis and assessed for their capacity to racemize glutamate in the L→D direction. (C) Chymotrypsin sensitivity patterns. Chymotrypsin protease sensitivity patterns were generated for the wild-type and mutant forms of RacE1 and RacE2. RacE1 wild type, C77A, or C188A (0.13 mM, each) were incubated with varying concentrations of chymotrypsin (0 µg/mL (A), 53 µg/mL (B), 213 µg/mL (C), 640 µg/mL (D)) at 4 °C. RacE2 wild type, C74A, or C185A (0.13 mM, each) were incubated with varying concentrations of chymotrypsin (0 µg/mL (A), 10 µg/mL (B), 40 µg/mL (C), 160 µg/mL (D)) at 4 °C. After incubation for one hour, the reactions were analyzed by 16 % SDS-PAGE. Experiments in (B) and (C) were performed three separate times.

271

Figure A.6 Three-dimensional homology models reveal differences in RacE1 and RacE2 fine active site features. Three-dimensional homology models for RacE1 and RacE2 were constructed using the Chemical Computing Group’s Molecular Operating Environment (MOE) 2006.08. The template for both models was the B. subtilis glutamate racemase-D- glutamate structure (PDB 1ZUW), which was aligned with the sequences for B. anthracis RacE1 (BAS0806 [GenBank]) and RacE2 (BAS4379 [GenBank]) using the Blosum62 substitution matrix. (A) RacE1 and RacE2 exhibit differences in the spatial arrangement of active site residues. Residues predicted to be important for catalysis in B. anthracis RacE1 (green) and RacE2 (purple) and B. subtilis RacE (gray) were aligned. The root-mean-square deviation (RMSD) of the superposition of Cα atoms between B. anthracis RacE1 and B. subtilis RacE is 7.5 Å, while the RMSD between B. anthracis RacE2 and B. subtilis RacE 2.6 Å. (B and C) Differences in inhibitor docking to the active sites of RacE1 and RacE2. The glutamate racemase inhibitor, (2R,4R)- 2-Amino-4-(2-benzo[b]thienyl)methyl pentanedioic acid (compound 69) (12) is shown docked within the active site of B. anthracis RacE1 (B) and overlaid with the active site of B. anthracis RacE2 (C). Residues that exist at the entrance to the hydrophobic binding pocket of B. anthracis RacE1 (green) and RacE2 (purple) are shown. Compound 69 was unable to dock within the active site of RacE2, presumably due to the larger side chain of RacE2 V149, which aligns with A152 in RacE1.

272

APPENDIX B

DIVERGENT KINETIC PROPERTIES OF THE TWO ALANINE RACEMASES FROM BACILLUS ANTHRACIS

B.1 ABSTRACT

The genome of Bacillus anthracis harbors two genes, alr and dal, predicted to encode alanine racemase enzymes. To evaluate whether alr and dal both encode functional alanine racemases, we expressed, purified, and characterized recombinant forms of Alr and Dal. These studies indicated that alr and dal both encode functional alanine racemases, but with significantly different steady state kinetic properties. The greater catalytic efficiency of Alr than Dal may, in part, reflect physiological roles that are unique to each enzyme.

B.2 RESULTS AND DISCUSSION

The D-isomer of alanine is a key component within the bacterial cell walls of both Gram-negative and Gram-positive bacteria (16). D-alanine is used to form

D-Ala-D-Ala dipeptide, which cross-links adjacent peptidoglycan strands.

However, D-alanine is generally not present at levels within the environment sufficient for peptidoglycan synthesis. Rather, bacteria employ alanine racemases (EC 5.1.1.1) to catalyze the stereoisomerization of L-alanine to D- alanine. The importance of alanine racemase is underscored by genetic studies indicating that mutant strains of bacteria lacking this enzyme demonstrate defects in cell wall biosynthesis and/or are dependent on the addition of exogenous D-alanine for growth (5, 8, 12, 13, 18, 24, 25). The importance and

273

restriction of alanine racemases to bacteria suggest that these enzymes are efficacious targets for the development of antimicrobial agents. While many bacteria have a single gene encoding alanine racemase, including Bacillus stearothermophilus (22) and Bacillus subtilis (4), other species, including

Escherichia coli (26) and Salmonella typhimurium (6, 23), harbor two genes that each encode alanine racemases (7). For S. typhimurium, one enzyme (Alr) is predicted to be important for peptidoglycan biosynthesis, whereas the DadB enzyme is involved in L-alanine catabolism (24).

For Bacillus anthracis, the causative agent of anthrax, the genome contains two genes (dal and alr) predicted to encode alanine racemases (14). Alr has been identified in the exosporium, which is the outermost layer of B. anthracis spores, and has been predicted to play a functional role in both the efficiency of sporulation (2) and the auto-inhibition of germination (11). Several studies have demonstrated that alr encodes a functional alanine racemase (3, 9).

In contrast, it is not clear whether dal encodes a functional alanine racemase, and, if so, whether Alr and Dal share identical or, alternatively, different enzymatic properties. To clarify these gaps in knowledge, the genes predicted to encode alanine racemases, alr (BAS0238 [GenBank]) and dal (BAS1932

[GenBank]), were PCR amplified from genomic DNA isolated from B. anthracis

Sterne 7702 (19) using primers corresponding to the 5′ and 3′ ends of each gene

(Table B.1). The amplicons were digested with the corresponding endonucleases and ligated with a pET-15b expression vector (Novagen) to replace the XhoI -

BamHI fragment within the polylinker region. pET-15b-alr or pET-15b-dal

274

plasmids were isolated from individual clones and the integrity of each gene was

confirmed by DNA sequencing.

Both Alr and Dal were expressed as soluble, recombinant proteins in E.

coli [BL21(DE3)], each with an amino-terminal hexa-histidine fusion peptide to

facilitate purification via nickel-chelate affinity chromatography. In this single chromatography step, both Alr and Dal were purified to greater than 98% purity, as estimated by SDS-PAGE analysis (Figure B.2A). Three independent preparations of Alr and Dal, each originating from an independent transformant of

E. coli [BL21(DE3)], were used for the biochemical and kinetic studies described below.

To evaluate whether alr and dal encode functional alanine racemase enzymes, recombinant Alr and Dal were assessed for their capacity to convert L- alanine to the corresponding D-enantiomer by using circular dichroism (CD) spectropolarimetry to directly observe the loss of L-alanine as it is converted to

D-alanine. L-alanine (5 mM) in an optically clear potassium borate buffer (50 mM boric acid, 100 mM KCl, 0.2 mM DTT, pH 8.0) was incubated at 25 °C in the absence or presence of Alr or Dal (0.16 µM, each). L-alanine stereoisomerization was monitored by recording the CD signal at 225 nm using a J-720 CD

Spectropolarimeter from JASCO (Easton, MD). Data acquisition was performed using the JASCO Spectra Manager v1.54A software and the data were plotted using GraphPad Prism V4.03 from GraphPad Software (San Diego, CA).

These experiments revealed that in the absence of Alr or Dal, stereoisomerization of L-alanine was not detectable (Figure B.2B). In contrast,

275

the consumption of L-alanine was readily observed in the presence of either Alr

or Dal (Figure B.2B). In preliminary studies, we determined that the enzymatic

properties of Alr and Dal were unaffected by the amino-terminal hexa-histidine

fusion peptide (data not shown). These data indicated that under cell-free and highly defined conditions, Alr and Dal both catalyzed the conversion of L-alanine to D-alanine. These results also established, for the first time, that B. anthracis dal encodes a functional enzyme that demonstrates alanine racemase activity.

Our initial studies suggested that Alr converted L-alanine to D-alanine much more rapidly than Dal (Figure B.2B). To obtain additional insights into the differences in catalytic properties between Alr and Dal, we determined the steady-state kinetic parameters of the two enzymes in the presence of L-alanine.

Alr (0.16 µM) or Dal (0.16 µM) was incubated in the presence of varying concentrations of L-alanine (5 mM – 50 mM). Initial rate data were obtained for

Alr and Dal in each of the different substrate concentrations, and a plot of the rate of alanine racemization (nmol/sec) versus L-alanine concentration (mM) was generated for Alr (Figure B.3A) and Dal (Figure B.3B). Steady state kinetic parameters for Alr- and Dal-catalyzed stereoisomerization of L-alanine were obtained by applying a non-linear curve fit to the data (Table B.2). For stereoisomerization of L-alanine, differences in the individual kinetic parameters

-1 for Alr and Dal were statistically significant (Alr, kcat = 960 ± 50 sec ; Dal, kcat =

-1 37 ± 1 sec , P < 0.0001; Alr, KM = 21 ± 2 mM; Dal, KM = 12 ± 0.9 mM; P = 0.002).

These data indicated that Alr is a more efficient enzyme than Dal, as indicated by

the enzyme’s greater kcat/Km values (Table B.2).

276

Because PLP is critical for the stereoisomerization of alanine catalyzed by

alanine racemases, it was important to confirm that the differences in steady

state kinetic parameters we had measured for Alr and Dal were not due to

differences between the two enzymes in PLP incorporation. Alr or Dal (1 mg/mL),

from three independent enzyme preparations, was evaluated

spectrophotometrically to generate the ratios of the absorbance values at 280 nm

(protein absorbance) and 420 nm (Schiff-base absorbance). The absorbance

value at 280 nm was normalized for the difference in tryptophan/tyrosine content

-1 -1 - between the two proteins (Alr, ε280nm = 56.84 mM cm ; Dal, ε280nm = 32.32 mM

1 -1 cm ). These experiments revealed that the A280nm,norm/A420nm ratios for Alr and

Dal were not significantly different (Alr, A280nm,norm/A420nm = 6.2 ± 0.5; Dal,

A280nm/A420nm = 7.4 ± 0.6; P = 0.06). These results indicate that PLP associates

with both recombinant Alr and Dal, and that differences in the steady state kinetic

properties are not due to differences in PLP incorporation between these two

enzymes.

To evaluate the possibility that differences in Alr and Dal steady state

kinetic properties might be due the misfolding of recombinant Dal, we

investigated whether recombinant preparations of Dal might contain high-

molecular weight aggregates, using size-exclusion FPLC (AKTA Purifier 900 fast

protein liquid chromatography system equipped with a Superdex 200 10/300 GL

size exclusion column). These experiments revealed that neither Alr nor Dal was

detected within the exclusion volume of the column (corresponding to proteins

with a molecular mass greater than 1300 kDa, indicating that neither of these

277

proteins exist as large aggregates in solution (data not shown). Experimental retention times were used to calculate the apparent molecular masses from standard curves revealed the presence of proteins with apparent molecular masses of 77.7 (± 1.1) kDa and 81.7 (± 1.8) kDa, which are slightly lower than the predicted molecular masses of 91.9 kDa and 91.6 kDa predicted for dimeric

Alr and Dal, respectively. Notably, several studies have indicated that alanine racemases are likely to function as dimers (17, 20, 27).

The structure-function relationships underlying the differences in kinetic properties between Alr and Dal are poorly understood. Active site residues known to be important for alanine racemization (17) are highly conserved in both enzymes. One exception is a highly conserved carbamylated lysine (Lys131) found in Dal that is instead an asparagine (Asn131) in Alr. Interestingly, a chloride ion was recently identified in the active site of Alr in the space normally occupied by the side chain of carbamylated Lys131 (3). While it is tempting to speculate that the novel active site chloride ion of Alr may contribute to the higher catalytic efficiency of this enzyme, testing this idea will require future directed mutagenesis studies.

While the physiological significance of retaining two genes encoding alanine racemases is unclear, our finding that the kinetic properties of the two enzymes are distinct, suggests that the two enzymes may contribute to B. anthracis growth and survival in fundamentally different ways. Indeed, the enrichment of Alr within the exosporium of dormant B. anthracis spores (15), along with recent genetic (2) and biochemical (11) data indicating that alanine

278

racemase activity modulates germination initiation, suggests that Alr is not important for cell wall biogenesis, but may be a key factor in regulating the germination efficiency of dormant spores. The greater catalytic efficiency of Alr

(relative to Dal) may be required for B. anthracis to generate D-alanine rapidly enough to effectively suppress germination during spore development (2), as well as to auto-inhibit germination initiation in the presence of abundant germinant

(11). These data support an expanded role for D-alanine in the case of B. anthracis, it remains to be seen if such a functional role is conserved in other spore-forming Bacillus species such as B. thuringiensis, B. cereus, and B. subtilis which also have associated exosporium and whose genomes contain two putative alanine racemase genes. Furthermore, there is a second alanine racemase gene present in non-spore-forming organisms such as E. coli (10), S. typhimurium (24), and P. aeruginosa (21), and the presence of this gene in close genomic association with a D-alanine dehydrogenase gene is highly suggestive of a cellular function related to the catabolic metabolism of L-alanine (23).

B.3 REFERENCES

1. Au, K., J. Ren, T. S. Walter, K. Harlos, J. E. Nettleship, R. J. Owens, D. I. Stuart, and R. M. Esnouf. 2008. Structures of an alanine racemase from Bacillus anthracis (BA0252) in the presence and absence of (R)-1- aminoethylphosphonic acid (L-Ala-P). Acta Crystallogr Sect F Struct Biol Cryst Commun 64:327-333.

2. Chesnokova, O. N., S. A. McPherson, C. T. Steichen, and C. L. Turnbough, Jr. 2009. The spore-specific alanine racemase of Bacillus anthracis and its role in suppressing germination during spore development. J Bacteriol 191:1303-1310.

279

3. Counago, R. M., M. Davlieva, U. Strych, R. E. Hill, and K. L. Krause. 2009. Biochemical and structural characterization of alanine racemase from Bacillus anthracis (Ames). BMC Struct Biol 9:53.

4. Ferrari, E., D. J. Henner, and M. Y. Yang. 1985. Isolation of an alanine racemase gene from Bacillus subtilis and its use for plasmid maintenance in B. subtilis. Nat Biotechnol 3:1003-1007.

5. Franklin, F. C., and W. A. Venables. 1976. Biochemical, genetic, and regulatory studies of alanine catabolism in Escherichia coli K12. Mol Gen Genet 149:229-237.

6. Galakatos, N. G., E. Daub, D. Botstein, and C. T. Walsh. 1986. Biosynthetic alr alanine racemase from Salmonella typhimurium: DNA and protein sequence determination. Biochemistry 25:3255-3260.

7. Hayashi, H., H. Wada, T. Yoshimura, N. Esaki, and K. Soda. 1990. Recent topics in pyridoxal 5'-phosphate enzyme studies. Annu Rev Biochem 59:87-110.

8. Hols, P., C. Defrenne, T. Ferain, S. Derzelle, B. Delplace, and J. Delcour. 1997. The alanine racemase gene is essential for growth of Lactobacillus plantarum. J Bacteriol 179:3804-3807.

9. Kanodia, S., S. Agarwal, P. Singh, and R. Bhatnagar. 2009. Biochemical characterization of alanine racemase--a spore protein produced by Bacillus anthracis. BMB Rep 42:47-52.

10. Lambert, M. P., and F. C. Neuhaus. 1972. Mechanism of D-cycloserine action: alanine racemase from Escherichia coli W. J Bacteriol 110:978- 987.

11. McKevitt, M. T., K. M. Bryant, S. M. Shakir, J. L. Larabee, S. R. Blanke, J. Lovchik, C. R. Lyons, and J. D. Ballard. 2007. Effects of endogenous D-alanine synthesis and autoinhibition of Bacillus anthracis germination on in vitro and in vivo infections. Infect Immun 75:5726-5734.

12. Milligan, D. L., S. L. Tran, U. Strych, G. M. Cook, and K. L. Krause. 2007. The alanine racemase of Mycobacterium smegmatis is essential for growth in the absence of D-alanine. J Bacteriol 189:8381-8386.

13. Palumbo, E., C. F. Favier, M. Deghorain, P. S. Cocconcelli, C. Grangette, A. Mercenier, E. E. Vaughan, and P. Hols. 2004. Knockout of the alanine racemase gene in Lactobacillus plantarum results in septation defects and cell wall perforation. FEMS Microbiol Lett 233:131- 138.

280

14. Read, T. D., S. N. Peterson, N. Tourasse, L. W. Baillie, I. T. Paulsen, K. E. Nelson, H. Tettelin, D. E. Fouts, J. A. Eisen, S. R. Gill, E. K. Holtzapple, O. A. Okstad, E. Helgason, J. Rilstone, M. Wu, J. F. Kolonay, M. J. Beanan, R. J. Dodson, L. M. Brinkac, M. Gwinn, R. T. DeBoy, R. Madpu, S. C. Daugherty, A. S. Durkin, D. H. Haft, W. C. Nelson, J. D. Peterson, M. Pop, H. M. Khouri, D. Radune, J. L. Benton, Y. Mahamoud, L. Jiang, I. R. Hance, J. F. Weidman, K. J. Berry, R. D. Plaut, A. M. Wolf, K. L. Watkins, W. C. Nierman, A. Hazen, R. Cline, C. Redmond, J. E. Thwaite, O. White, S. L. Salzberg, B. Thomason, A. M. Friedlander, T. M. Koehler, P. C. Hanna, A. B. Kolsto, and C. M. Fraser. 2003. The genome sequence of Bacillus anthracis Ames and comparison to closely related bacteria. Nature 423:81-86.

15. Redmond, C., L. W. Baillie, S. Hibbs, A. J. Moir, and A. Moir. 2004. Identification of proteins in the exosporium of Bacillus anthracis. Microbiology 150:355-363.

16. Schleifer, K. H., and O. Kandler. 1972. Peptidoglycan types of bacterial cell walls and their taxonomic implications. Bacteriol Rev 36:407-477.

17. Shaw, J. P., G. A. Petsko, and D. Ringe. 1997. Determination of the structure of alanine racemase from Bacillus stearothermophilus at 1.9-Å resolution. Biochemistry 36:1329-1342.

18. Steen, A., E. Palumbo, M. Deghorain, P. S. Cocconcelli, J. Delcour, O. P. Kuipers, J. Kok, G. Buist, and P. Hols. 2005. Autolysis of Lactococcus lactis is increased upon D-alanine depletion of peptidoglycan and lipoteichoic acids. J Bacteriol 187:114-124.

19. Sterne, M. 1946. Avirulent anthrax vaccine. Onderstepoort J Vet Sci Anim Ind 21:41-43.

20. Strych, U., and M. J. Benedik. 2002. Mutant analysis shows that alanine racemases from Pseudomonas aeruginosa and Escherichia coli are dimeric. J Bacteriol 184:4321-4325.

21. Strych, U., H. C. Huang, K. L. Krause, and M. J. Benedik. 2000. Characterization of the alanine racemases from Pseudomonas aeruginosa PAO1. Curr Microbiol 41:290-294.

22. Tanizawa, K., A. Ohshima, A. Scheidegger, K. Inagaki, H. Tanaka, and K. Soda. 1988. Thermostable alanine racemase from Bacillus stearothermophilus: DNA and protein sequence determination and secondary structure prediction. Biochemistry 27:1311-1316.

281

23. Wasserman, S. A., E. Daub, P. Grisafi, D. Botstein, and C. T. Walsh. 1984. Catabolic alanine racemase from Salmonella typhimurium: DNA sequence, enzyme purification, and characterization. Biochemistry 23:5182-5187.

24. Wasserman, S. A., C. T. Walsh, and D. Botstein. 1983. Two alanine racemase genes in Salmonella typhimurium that differ in structure and function. J Bacteriol 153:1439-1450.

25. Wijsman, H. J. 1972. The characterization of an alanine racemase mutant of Escherichia coli. Genet Res 20:269-277.

26. Wild, J., J. Hennig, M. Lobocka, W. Walczak, and T. Klopotowski. 1985. Identification of the dadX gene coding for the predominant isozyme of alanine racemase in Escherichia coli K12. Mol Gen Genet 198:315-322.

27. Yokoigawa, K., Y. Okubo, and K. Soda. 2003. Subunit interaction of monomeric alanine racemases from four Shigella species in catalytic reaction. FEMS Microbiol Lett 221:263-267.

B.4 TABLES AND FIGURES

TABLE B.1 Cloning of Bacillus anthracis alanine racemase genes a Organism Gene Primer Sequence (5’→3’)c Accession Numberb Bacillus alr Forward anthracis BAS0238 5’-GCCGCTCGAGGAAGAAGCACCATTTTATC-3’ Sterne Reverse 7702 5’-CGGCTGATCACTATATATCGTTCAAATAATTAATTACTTCCAC-3’ Bacillus dal Forward anthracis BAS1932 5’-GCCGCTCGAGAGTTTGAAATATGGAAGAG-3’ Sterne Reverse 7702 5’-CGGCGGATCCTCAGTTTTTTCTTAGTATATTTAC-3’ a Bacillus anthracis Sterne 7702 (pXO1+, pXO2-) was used in these studies. b DNA sequences for alr and dal were obtained from The Institute for Genomic Research Comprehensive Microbial Resource (http://cmr.tigr.org/tigr-scripts/CMR/CmrHomePage.cgi). c Oligonucleotide primers were engineered such that 5’ Xho I (alr and dal) and 3' Bcl I (alr) or Bam H I (dal) restriction sites were incorporated. Primers were synthesized by Integrated DNA Technologies (Skokie, IA). d Coding sequences and molecular weight calculations were predicted based upon additional residues contributed by the pET-15b vector N-terminal polyhistidine-tag (Novagen).

282

TABLE B.2 Kinetic parameters for Alr and Dal Racemization (L→D)a, b kcat KM kcat/KM -1 c,d c,e 3 -1 -1 c,f (sec ) (mM) (1x10 M s ) Alr 960 ± 50 21 ± 2 45 ± 5 Dal 37 ± 1 12 ± 0.9 3.2 ± 0.3 a The racemization of L- alanine by Alr and Dal was analyzed as described under Materials and Methods. b Each reaction was performed with three independent enzyme preparations and the error indicates standard deviations from the mean. c The kinetic parameters for Alr and Dal were assessed for d statistically significant differences by the student’s t-test: , P e f < 0.0001; , P = 0.002; , P = 0.0001

283

Figure B.1 B. anthracis Alr and Dal share significant sequence homology. Primary amino acid sequences for B. anthracis Sterne Alr (BAS0238) and B. anthracis Sterne Dal (BAS1932) were aligned utilizing ESPript V.2.2 (http://espript.ibcp.fr/ESPript/ESPript/). Secondary structural elements and solvent accessibility indices were imported from the crystal structure of B. anthracis Alr (Protein Data Bank accession no. 2VD8) (1). For solvent accessibility indices, the darker shaded regions indicate sections in the folded proteins that are exposed to solvent.

284

Figure B.2 Alr and Dal are both functional enzymes in a cell-free system. (A) Purification of recombinant Alr and Dal. The Alr and Dal elution fractions from nickel-chelate affinity chromatography were analyzed by 12% SDS- PAGE, followed by Coomassie brilliant blue G- 250 staining. (B) Racemization of L-alanine. Alr and Dal were assessed for the capacity to convert L-alanine to the corresponding D- enantiomer by using CD to directly observe the loss of L-alanine as it was converted to D- alanine. For panels A and B, three separate experiments were performed and data from representative experiments are shown. For each experiment, Alr or Dal from one of three independent enzyme preparations, as well as assay reagents from one of three independent preparations were used.

285

Figure B.3 Steady-state kinetic analysis of Alr and Dal. Alr (A) or Dal (B) was incubated in a potassium borate buffer in the presence of various concentrations of L-alanine (5 to 50 mM), and the change in magnitude of the CD signal was monitored. For all studies, three independent enzyme preparations were used, as well as assay reagents from one of three independent preparations. The symbols indicate the means of the data from three independent experiments, and the error bars indicate the standard deviations from the means.

286

CURRICULUM VITAE Dylan Dodd

EDUCATION Doctorate of Medicine, Expected June 2013 University of Illinois, Urbana-Champaign, IL Doctorate of Philosophy, Microbiology, Expected May 2010 University of Illinois, Urbana-Champaign, IL Advisor: Isaac K. O. Cann, PhD Masters of Science, Microbiology, August 2007 University of Illinois, Urbana-Champaign, IL Advisor: Steven R. Blanke, PhD Bachelor of Science, Biochemistry, June 2005 University of California, Davis, CA Associates of Science, Chemistry, June 2003 Santa Rosa Junior College, Santa Rosa, CA

HONORS / AWARDS Recipient, Francis and Harlie Clark Microbiology Fellowship, [Fall 2009] Recipient, Thomas E. Buetow Memorial Fund Travel Grant, Carle Hospital, [2009] Recipient, Midwest Trainee Travel Award, Central Society for Clinical Research, [2009] Recipient, The Tylenol Scholarship, Tylenol, [2008] Recipient, Thomas E. Buetow Memorial Fund Travel Grant, Carle Hospital, [2008] Recipient, Outstanding Poster Award, American Society for Clinical Investigation Annual Meeting, [2008] Recipient, James R. Beck Microbiology Fellowship, [Fall 2008] Recipient, Molecular and Cellular Biology Graduate College Travel Award, UIUC, [2008] Recipient, Midwest Trainee Travel Award, Central Society for Clinical Research, [2008] Recipient, Corporate Activities Program Student Travel Grant, American Society for Microbiology, [2007] Recipient, Thomas E. Buetow Memorial Fund Travel Grant, Carle Hospital, [2007] Recipient, Midwest Trainee Travel Award, Central Society for Clinical Research, [2007] Recipient, MacArthur Foundation Fellowship, Program in Arms Control, Disarmament and International Security (ACDIS), UIUC, [01/2007 – 06/2007] Recipient, Molecular Biology Fellowship, University of Illinois at Urbana-Champaign, [Fall 2005] Recipient, Presidents Undergraduate Research Fellowship, UC Davis, [2004 – 2005] Recipient, Gary Y. Fujita Memorial Chemistry Scholarship, [2003] Recipient, Ellis Nixon Memorial Biology Scholarship, [2003] Recipient, Luther Burbank Botany Scholarship, [2003] Recipient, Graton Community Club Scholarship, [2003] Recipient, Independent Research Grant in Chemistry, SRJC, [2001 – 2002] Recipient, Doyle Scholarship, SRJC, [2000-2003]

287

TEACHING EXPERIENCE University of Illinois at Urbana-Champaign, Department of Molecular and Cellular Biology ♦ Teaching Assistant, Techniques in Biochemistry and Biotechnology [08/2009 – 12/2009] ♦ Teaching Assistant, MCB Merit Program [01/2006 – 05/2006]

Santa Rosa Junior College Chemistry Department ♦ Instructor, Chemistry for Kids [06/2003 – 07/2003]

CURRENT FUNDING National Institutes of Health ♦ Ruth L. Kirschstein National Research Service Awards for Individual Predoctoral MD/PhD Fellows (F30), National Institute of Diabetes and Digestive and Kidney Disorders (NIDDK) [01/2010 – 05/2013]

PUBLICATIONS Dodd, D., Moon, Y., Swaminathan, K., Mackie, R. I., and Cann, I. K. O. Transcriptomic analyses of xylan degradation by Prevotella bryantii and insights into energy acquisition by xylanolytic Bacteroidetes. (submitted)

Han, Y., Dodd, D., Schroeder, C. M., Mackie, R. I., and Cann, I. K. O. Comparative analysis of two thermostable bi-functional enzymes with endoglucanase and endomannanse activities from Caldanaerobius polysaccharolyticus. (submitted)

Dodd, D., Reese, J. G., Ballard, J. D., Spies, M. A., and Blanke, S. R. Divergent kinetic properties of the two alanine racemases from Bacillus anthracis. (submitted)

Dodd, D., Kiyonari, S., Mackie, R. I., and Cann, I. K. O. Functional diversity of four glycoside hydrolase family 3 enzymes from the rumen bacterium, Prevotella bryantii B14. Journal of Bacteriology. (in press; doi:10.1128/JB.01654-09)

Yeoman, C., Han, Y., Dodd, D., Schroeder, C. M., Mackie, R. I., and Cann, I. K. O. 2010. Thermostable enzymes as biocatalysts in the biofuels industry. Advances in Applied Microbiology. 70: 1-55.

Spies, M. A., Reese, J. G., Dodd, D., Pankow, K. L., Blanke, S. R., and Baudry, J. Y. 2009. Determinants of catalytic power and ligand binding in glutamate racemase. Journal of the American Chemical Society. 131(14): 5274-5284.

Dodd, D.*, Kocherginskaya*, S., Spies, M. A., Abbas, C. A., Beery, K. E., Mackie, R. I., and Cann, I. K. O. 2009. Biochemical analysis of a β-D-xylosidase and a bifunctional xylanase-rerulic acid esterase from a xylanolytic gene cluster in Prevotella ruminicola 23. Journal of Bacteriology. 191(10): 3328-3338.

*These authors contributed equally to the work.

288

Dodd, D. and Cann, I. K. O. 2009. Enzymatic deconstruction of xylan for biofuel production. Global Change Biology Bioenergy. 1(1): 2-17.

Cann, I. K. O., Dodd, D., Nair, S. K., Mackie, R., Spies, M. A. 2007. Bacterial genome mining to advance enzymology for plant cell wall degradation. ©China Agric. Univ. Press 2007. The Proceedings of the VII International Symposium on the Nutrition of Herbivores. 17-21, Sep. 2007. Beijing, China. Edited by Q.X. Meng.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., Blanke, S. R. 2007. Functional comparison of the Bacillus anthracis glutamate racemases. Journal of Bacteriology. 189(14): 5265-5275.

PUBLISHED ABSTRACTS Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., and Cann, I. K. 2009. Functional and biochemical analysis of two glycoside hydrolases provides insight into the utilization of recalcitrant polysaccharides by gut microorganisms. Journal of Investigative Medicine. 57(3): 537.

Dodd, D., Baudry, J., Pankow, K., Reese, J. G., Blanke, S. R., Spies, M. A. 2008. Cyclic rransition states in the catalysis of proton transfer in glutamate racemase: implications for antibacterial drug design. Journal of Investigative Medicine. 56(3): 653.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. 2007. Bacillus anthracis racE1 and racE2 encode functional glutamate racemases with distinct properties. Journal of Investigative Medicine. 55(2): S354.

ORAL PRESENTATIONS Dodd, D., Kiyonari, S., Iakiviak, M., Quader, F., Mackie, R. I., and Cann, I. K. Biochemical analysis of the glycoside hydrolase family 3 genes from the rumen bacterium, Prevotella bryantii B14. (2009). Department of Microbiology Student Seminar Series. Urbana, IL.

Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Biochemical analysis of two glycoside hydrolases provides insight into the utilization of recalcitrant polysaccharides by rumen Prevotella spp. (2009). Conference on Gastrointestinal Function. Chicago, IL.

Dodd, D., Quader, F., Mackie, R. I., and Cann, I. K. Biochemical characterization of a highly active family 3 β-D-xylosidase from the rumen bacterium, Prevotella bryantii B14. (2008). Energy Biosciences Institute. Berkeley, CA.

Dodd, D., Kiyonari, S., Quader, F., Mackie, R. I., and Cann, I. K. Depolymerization of hemicellulose by glycosyl hydrolases from rumen bacteria. (2008). Energy Biosciences Institute Seminar Series. Urbana, IL.

289

Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Biochemical characterization of a family 3 glycosyl hydrolase gene cloned from the rumen bacterium, Prevotella ruminicola 23. (2008). Department of Microbiology Student Seminar Series. Urbana, IL.

Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Functional characterization of two glycosyl hydrolases from the rumen bacterium, Prevotella ruminicola 23. (2007). Allerton Microbiology Research Conference. Monticello, IL.

Dodd, D., Marquez, J. P., Torres, J. V. Vaccination with Hypervariable Epitope Constructs provides protection in a murine human papillomavirus tumor model. (2003). University of California, Davis Undergraduate Research Conference, Davis, CA.

Dodd, D., Dodd, R. S., and DeCicco, G. Foliar Epicuticular Analysis of Umbellularia californica. (2003). Association of North Bay Scientists Conference, Solano, CA.

ABSTRACTS Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Functional and biochemical analysis of two glycosyl hydrolases provides insight into the utilization of recalcitrant polysaccharides by gut microorganisms. American Society for Clinical Investigation and Association of American Physicians Joint Annual Meeting. Chicago, IL, April 25th, 2009.

Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Functional and biochemical analysis of two glycosyl hydrolases provides insight into the utilization of recalcitrant polysaccharides by gut microorganisms. Department of Microbioogy Annual Retreat. Champaign, IL, September 20th, 2008.

Dodd, D., Kocherginskaya, S. A., Spies, M. A., Beery, K. E., Abbas, C. A., Mackie, R. I., Cann, I. K. O. Functional and biochemical analysis of two glycosyl hydrolases provides insight into the utilization of recalcitrant polysaccharides by gut microorganisms. Medical Scholars Program Research Symposium. Monticello, IL, August 23rd, 2008.

Dodd, D., Baudry, J., Pankow, K., Reese, J. G., Blanke, S. R., Spies, M. A. Cyclic Transition states in the catalysis of proton transfer in glutamate racemase: implications for antibacterial drug design. American Society for Clinical Investigation and Association of American Physicians Joint Annual Meeting. Chicago, IL, April 26th, 2008. (Awarded a $1000 poster award)

290

Dodd, D., Baudry, J., Pankow, K., Reese, J. G., Blanke, S. R., Spies, M. A. Cyclic Transition states in the catalysis of proton transfer in glutamate racemase: implications for antibacterial drug design. Central Society for Clinical Investigation and the Midwestern Section for the American Federation for Medical Research Combined Annual Meeting. Chicago, IL, April 24, 2008. (Selected as a moderated poster)

Dodd, D., Baudry, J., Pankow, K., Blanke, S. R., Spies, M. A. Computational Analysis of the Bacillus subtilis Glutamate racemase reveals a novel class of inhibitors for this important enzyme. Medical Scholars Program Research Symposium. Monticello, IL, August 25th, 2007.

Dodd, D., Baudry, J., Pankow, K., Blanke, S. R., Spies, M. A. Computational Analysis of the Bacillus subtilis glutamate racemase provides insight into the mechanism for catalysis. Biochemistry Fall Research Conference. Champaign, IL, September 21, 2007.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. Functional comparison of the two Bacillus anthracis glutamate racemases. 107th Annual American Society for Microbiology General Meeting, Champaign, IL, April 19, 2007.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. Glutamate racemase as a target for a new class of antimicrobials. 10th Annual Conference on New and Re-emerging Infectious Diseases, Champaign, IL, April 20, 2007.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. Functional analysis of the two Bacillus anthracis glutamate racemases provides insight into the design of Anthrax therapeutics. University of Illinois College of Medicine Research Symposium, Champaign, IL, April 19, 2007.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. Functional characterization of the two B. anthracis glutamate racemases. American Society for Clinical Investigation and Association of American Physicians Joint Meeting, Chicago, IL, April 14, 2007.

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., and Blanke, S. R. Bacillus anthracis racE1 and racE2 encode functional glutamate racemases with distinct properties. Central Society for Clinical Research and Midwestern section of the American Federation for Medical Research Combined Annual Meeting, Chicago, IL, April 13, 2007.

291

Dodd, D., Reese, J. G., Louer, C. R., Ballard, J. D., Spies, M. A., Blanke, S. R. (2006). Bacillus anthracis racE1 and racE2 encode functional glutamate racemases with distinct functional roles. UIUC Microbiology Allerton Conference, Monticello, IL.

292