US008426158B2

(12) United States Patent
Xu et al.
(10) Patent No.: US 8,426,158 B2
(45) Date of Patent: Apr. 23, 2013

(54) METHODS FOR INCREASING ENZYMATIC HYDROLYSIS OF CELLULOSIC MATERIAL IN THE PRESENCE OF A PEROXIDASE

(75) Inventors: Feng Xu, Davis, CA (US); Jason Quinlan, Emeryville, CA (US)

(73) Assignee: Novozymes, Inc., Davis, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 568 days.

(21) Appl. No.: 12/638,920

(22) Filed: Dec. 15, 2009

(65) Prior Publication Data
US 2010/0159509 A1 Jun. 24, 2010

Related U.S. Application Data
(60) Provisional application No. 61/139,373, filed on Dec. 19, 2008.

(51) Int. Cl.
C12P 1/00 (2006.01)
C12P 13/04 (2006.01)
C12P 7/00 (2006.01)
C12P 7/26 (2006.01)
C12P 7/02 (2006.01)

(52) U.S. Cl.
USPC ...... 435/41; 435/106; 435/132; 435/148; 435/155

(58) Field of Classification Search ...... 435/41, 435/106, 132, 148, 155
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
8,148,495 B2 * 4/2012 Harris et al. ...... 530/350
2007/0077630 A1 4/2007 Harris et al.

FOREIGN PATENT DOCUMENTS
WO WO 2005/074656 8/2005
WO WO 2008/134259 11/2008
WO WO 2010/012579 2/2010

OTHER PUBLICATIONS
Ramachandra et al., "Characterization of an Extracellular Lignin Peroxidase of the Lignocellulolytic Actinomycete Streptomyces viridosporus", Dec. 1988, Applied and Environmental Microbiology, vol. 54, No. 12, pp. 3057-3063.*
Nicholls et al., Enzymology and structure of catalases, 2001, Advances in Inorganic Chemistry 51: 51-106.
Masaki et al., Differential role of catalase and glutathione peroxidase in cultured human fibroblasts under exposure of H2O2 or ultraviolet B light, 1998, Archives of Dermatological Research 290: 113-118.
Kedderis et al., Characterization of the N-Demethylation Reactions Catalyzed by Horseradish Peroxidase, 1983, J. Biol. Chem. 258: 8129-8138.
Jonsson et al., Detoxification of wood hydrolysates with laccase and peroxidase from the white-rot fungus Trametes versicolor, 1998, Appl Microbiol Biotechnol 49: 691-697.
Lobarzewski et al., Lignocellulose biotransformation with immobilized cellulose, D-glucose oxidase and fungal peroxidases, 1985, Enzyme Microb. Technol. 7: 564-566.
Steffen et al., Differential degradation of oak (Quercus petraea) leaf litter by litter-decomposing basidiomycetes, 2007, Research in Microbiology 158: 447-455.
Kedderis et al., Characterization of the N-Demethylation Reactions Catalyzed by Horseradish Peroxidase, 1983, J. of Biological Chemistry 258: 8129-8138.
Masaki et al., Differential role of catalase and glutathione peroxidase in cultured human fibroblasts under exposure of H2O2 or ultraviolet B light, 1998, Arch Dermatol Res 290: 113-118.

* cited by examiner

Primary Examiner: Herbert J Lilling
(74) Attorney, Agent, or Firm: Robert L. Starnes

(57) ABSTRACT
The present invention relates to methods for increasing hydrolysis of a cellulosic material, comprising: hydrolyzing the cellulosic material with an enzyme composition in the presence of a polypeptide having peroxidase activity.

12 Claims, 3 Drawing Sheets

U.S. Patent Apr. 23, 2013 Sheet 2 of 3 US 8,426,158 B2

Fig. 2 [plot; axis label: (HRP) (J/mL)]

U.S. Patent Apr. 23, 2013 Sheet 3 of 3 US 8,426,158 B2

Fig. 3

US 8,426,158 B2

METHODS FOR INCREASING ENZYMATIC HYDROLYSIS OF CELLULOSIC MATERIAL IN THE PRESENCE OF A PEROXIDASE

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 61/139,373, filed Dec. 19, 2008, which application is incorporated herein by reference.

REFERENCE TO A SEQUENCE LISTING

This application contains a Sequence Listing filed electronically by EFS, which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to methods for increasing hydrolysis of cellulosic material with an enzyme composition.

2. Description of the Related Art

Cellulose is a polymer of the simple sugar glucose linked by beta-1,4-bonds. Many microorganisms produce enzymes that hydrolyze beta-linked glucans. These enzymes include endoglucanases, cellobiohydrolases, and beta-glucosidases. Endoglucanases digest the cellulose polymer at random locations, opening it to attack by cellobiohydrolases. Cellobiohydrolases sequentially release molecules of cellobiose from the ends of the cellulose polymer. Cellobiose is a water-soluble beta-1,4-linked dimer of glucose. Beta-glucosidases hydrolyze cellobiose to glucose.

It is well known in the art that oxidation of biomolecules such as DNA, lipids, or protein is a significant issue in biological systems. Consequently, treatment with a peroxidase may improve the performance of cellulose-hydrolyzing enzyme systems.

Different peroxide-decomposing enzymes often have different specificities and potencies. For example, catalase is very efficient only at high levels of hydrogen peroxide (0.1 M or above) because of its high Michaelis constant, Km, on this substance (Km ranges from 0.1 to 1 M; see Nicholls et al., 2001, Advances in Inorg. Chem. 51: 52-106; and Masaki et al., 1998, Archives of Dermatological Research 290: 113-118). At low peroxide levels, a peroxidase can be significantly more efficient (than catalase) to decompose the peroxide, because of the enzyme's high affinity (sub-mM ranges) for the peroxide. For example, horseradish peroxidase, an archetypical peroxidase, has a Km of 0.02 mM on hydrogen peroxide or ethyl hydroperoxide (Kedderis and Hollenberg, 1983, J. Biol. Chem. 258: 8129-8138), and glutathione peroxidase has a Km of 0.025-0.06 mM (Masaki et al., 1998, supra). Since many biomass conversion techniques are prone to generate low level peroxide, peroxidase may be more effective than catalase to remove the peroxide to improve cellulose hydrolysis.

The present invention provides methods for increasing hydrolysis of cellulosic materials with enzyme compositions.

SUMMARY OF THE INVENTION

The present invention relates to methods for degrading or converting a cellulosic material, comprising: treating the cellulosic material with an enzyme composition in the presence of a polypeptide having peroxidase activity.

The present invention also relates to methods for producing a fermentation product, comprising:
(a) saccharifying a cellulosic material with an enzyme composition in the presence of a polypeptide having peroxidase activity;
(b) fermenting the saccharified cellulosic material with one or more (several) fermenting microorganisms to produce the fermentation product; and
(c) recovering the fermentation product from the fermentation.

The present invention further relates to methods of fermenting a cellulosic material, comprising: fermenting the cellulosic material with one or more (several) fermenting microorganisms, wherein the cellulosic material is hydrolyzed with an enzyme composition in the presence of a polypeptide having peroxidase activity.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows horseradish peroxidase mitigation of cellulase-inhibition by cellobiose dehydrogenase. Fractional cellulose conversion is plotted for various reaction conditions. Solid bars: 1 day of hydrolysis; hatched bars: 3 days of hydrolysis.

FIG. 2 shows horseradish peroxidase enhancement of PCS hydrolysis. Circles: 1 day of hydrolysis; squares: 3 days of hydrolysis.

FIG. 3 shows the effect of various peroxidases on PCS hydrolysis. Solid bars: 1 day of hydrolysis; hatched bars: 3 days of hydrolysis. Numbers indicate the volumes of the stock peroxidases added, as indicated in the text. Mn-perox: manganese peroxidase; lignin-perox: lignin-peroxidase; and cellulase: Trichoderma reesei cellulase composition.

DEFINITIONS

Peroxide-generating system: The term "peroxide-generating system" is defined herein as either a peroxide-generating enzyme as defined below, or as a chemical reaction leading to production of peroxide. Common examples of chemical methods of peroxide generation include, but are not limited to, UV-irradiation of Rose Bengal; the Riedl-Pfleiderer process of autooxidation of 2-ethyl-9,10-dihydroxyanthracene + O2 to 2-ethylanthraquinone + H2O2; reaction of singlet state molecular oxygen 1O2 with ascorbate; the oxidation of organic alcohols by molecular oxygen in the presence of various metal and metal complex catalysts; and the oxidation of unsaturated lipid by oxygen (after a radical initiation) to form lipid peroxide.

Peroxide-generating enzyme: The term "peroxide-generating enzyme" is defined herein as a donor:oxygen oxidoreductase (E.C. number 1.1.3.x) that catalyzes the reaction reduced substrate (2e-) + O2 -> oxidized substrate + H2O2, such as glucose oxidase that catalyzes the reaction glucose + O2 -> gluconolactone + H2O2, and a donor:superoxide oxidoreductase (E.C. 1.15.1.x), such as superoxide dismutase that catalyzes the reaction 2O2- + 2H+ -> O2 + H2O2. Other examples of peroxide-generating enzymes are provided herein. Alternatively, oxidoreductases with side activities, wherein molecular oxygen can be used as electron acceptor by the enzyme, are also included within the term hydrogen peroxide-generating enzyme. In addition to hydrogen peroxide, other peroxides may also be generated by these enzymes.

Peroxidase activity: The term "peroxidase activity" is defined herein as an enzyme activity that converts a peroxide, e.g., hydrogen peroxide, to a less oxidative species, e.g., water. It is understood herein that a polypeptide having peroxidase activity encompasses a peroxide-decomposing enzyme (defined below).

Peroxide-decomposing enzyme: The term "peroxide-decomposing enzyme" is defined herein as a donor:peroxide oxidoreductase (E.C. number 1.11.1.x) that catalyzes the reaction reduced substrate (2e-) + ROOR' -> oxidized substrate + ROH + R'OH; such as horseradish peroxidase that catalyzes the reaction phenol + H2O2 -> quinone + H2O, and catalase that catalyzes the reaction H2O2 + H2O2 -> O2 + 2H2O. In addition to hydrogen peroxide, other peroxides may also be decomposed by these enzymes.

Cellobiose dehydrogenase: The term "cellobiose dehydrogenase" is defined herein as a cellobiose:acceptor 1-oxidoreductase (E.C. 1.1.99.18) that catalyzes the conversion of cellobiose in the presence of an acceptor to cellobiono-1,5-lactone and a reduced acceptor. 2,6-Dichloroindophenol can act as acceptor, as can iron, especially Fe(SCN)3, molecular oxygen, ubiquinone, or cytochrome C, and likely many other polyphenolics. Substrates of the enzyme include cellobiose, cello-oligosaccharides, lactose, and D-glucosyl-1,4-beta-D-mannose, glucose, maltose, mannobiose, thiocellobiose, galactosyl-mannose, xylobiose, and xylose. Electron donors are preferably beta-1-4 dihexoses with glucose or mannose at the reducing end, though alpha-1-4 hexosides, hexoses, pentoses, and beta-1-4 pentomers have also been shown to act as substrates for these enzymes (Henriksson et al., 1998, Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology 1383: 48-54; and Schou et al., 1998, Biochem. J. 330: 565-571).

Cellobiose dehydrogenases comprise two families, 1 and 2, differentiated by the presence of a cellulose binding motif (CBM). The 3-dimensional structure of cellobiose dehydrogenase features two globular domains, each containing one of two cofactors: a heme or a flavin. The active site lies at a cleft between the two domains. The catalytic cycle of cellobiose dehydrogenase follows an ordered sequential mechanism. Oxidation of cellobiose occurs via 2-electron transfer from cellobiose to the flavin, generating cellobiono-1,5-lactone and reduced flavin. The active FAD is regenerated by electron transfer to the heme group, leaving a reduced heme. The native state heme is regenerated by reaction with the oxidizing substrate at the second active site.

The oxidizing substrate is preferentially iron ferricyanide, cytochrome C, or an oxidized phenolic compound such as dichloroindophenol (DCIP), a substrate commonly used for colorimetric assays. Metal ions and O2 are also substrates, but for most cellobiose dehydrogenases the reaction rate for these substrates is several orders of magnitude lower than that observed for iron or organic oxidants. Following cellobionolactone release, the product may undergo spontaneous ring-opening to generate cellobionic acid (Hallberg et al., 2003, J. Biol. Chem. 278: 7160-7166).

Cellulolytic activity: The term "cellulolytic activity" is defined herein as a biological activity that hydrolyzes a cellulosic material. The two basic approaches for measuring cellulolytic activity include: (1) measuring the total cellulolytic activity, and (2) measuring the individual cellulolytic activities (endoglucanases, cellobiohydrolases, and beta-glucosidases) as reviewed in Zhang et al., Outlook for cellulase improvement: Screening and selection strategies, 2006, Biotechnology Advances 24: 452-481. Total cellulolytic activity is usually measured using insoluble substrates, including Whatman No. 1 filter paper, microcrystalline cellulose, bacterial cellulose, algal cellulose, cotton, pretreated lignocellulose, etc. The most common total cellulolytic activity assay is the filter paper assay using Whatman No. 1 filter paper as the substrate. The assay was established by the International Union of Pure and Applied Chemistry (IUPAC) (Ghose, 1987, Measurement of cellulase activities, Pure Appl. Chem. 59: 257-68).

For purposes of the present invention, cellulolytic activity is determined by measuring the increase in hydrolysis of a cellulosic material by cellulolytic enzyme(s) under the following conditions: 1-20 mg of cellulolytic protein/g of cellulose in PCS for 3-7 days at 50-65° C. compared to a control hydrolysis without addition of cellulolytic protein. Typical conditions are 1 ml reactions, washed or unwashed PCS, 5% insoluble solids, 50 mM sodium acetate pH 5, 1 mM MnSO4, 50-65° C., 72 hours, sugar analysis by AMINEX® HPX-87H column (Bio-Rad Laboratories, Inc., Hercules, Calif., USA).

Endoglucanase: The term "endoglucanase" is defined herein as an endo-1,4-(1,3;1,4)-beta-D-glucan 4-glucanohydrolase (E.C. 3.2.1.4), which catalyses endohydrolysis of 1,4-beta-D-glycosidic linkages in cellulose, cellulose derivatives (such as carboxymethyl cellulose and hydroxyethyl cellulose), lichenin, beta-1,4 bonds in mixed beta-1,3 glucans such as cereal beta-D-glucans or xyloglucans, and other plant material containing cellulosic components. Endoglucanase activity can be determined based on a reduction in substrate viscosity or increase in reducing ends determined by a reducing sugar assay (Zhang et al., 2006, Biotechnology Advances 24: 452-481). For purposes of the present invention, endoglucanase activity is determined using carboxymethyl cellulose (CMC) hydrolysis according to the procedure of Ghose, 1987, Pure and Appl. Chem. 59: 257-268.

Cellobiohydrolase: The term "cellobiohydrolase" is defined herein as a 1,4-beta-D-glucan cellobiohydrolase (E.C. 3.2.1.91), which catalyzes the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose, cellooligosaccharides, or any beta-1,4-linked glucose containing polymer, releasing cellobiose from the reducing or non-reducing ends of the chain (Teeri, 1997, Crystalline cellulose degradation: New insight into the function of cellobiohydrolases, Trends in Biotechnology 15: 160-167; Teeri et al., 1998, Trichoderma reesei cellobiohydrolases: why so efficient on crystalline cellulose?, Biochem. Soc. Trans. 26: 173-178). For purposes of the present invention, cellobiohydrolase activity is determined using a fluorescent disaccharide derivative 4-methylumbelliferyl-beta-D-lactoside according to the procedures described by van Tilbeurgh et al., 1982, FEBS Letters 149: 152-156 and van Tilbeurgh and Claeyssens, 1985, FEBS Letters 187: 283-288.

Beta-glucosidase: The term "beta-glucosidase" is defined herein as a beta-D-glucoside glucohydrolase (E.C. 3.2.1.21), which catalyzes the hydrolysis of terminal non-reducing beta-D-glucose residues with the release of beta-D-glucose. For purposes of the present invention, beta-glucosidase activity is determined according to the basic procedure described by Venturi et al., 2002, Extracellular beta-D-glucosidase from Chaetomium thermophilum var. coprophilum: production, purification and some biochemical properties, J. Basic Microbiol. 42: 55-66. One unit of beta-glucosidase activity is defined as 1.0 μmole of p-nitrophenol produced per minute at 40° C., pH 5 from 1 mM p-nitrophenyl-beta-D-glucopyranoside as substrate in 100 mM sodium citrate containing 0.01% TWEEN® 20.

Cellulolytic enhancing activity: The term "cellulolytic enhancing activity" is defined herein as a biological activity that enhances the hydrolysis of a cellulosic material by polypeptides having cellulolytic activity. For purposes of the present invention, cellulolytic enhancing activity is determined by measuring the increase in reducing sugars or the increase of the total of cellobiose and glucose from the hydrolysis of a cellulosic material by cellulolytic protein under the following conditions: 1-50 mg of total protein/g of cellulose in PCS, wherein total protein is comprised of 50-99.5% w/w cellulolytic protein and 0.5-50% w/w protein of cellulolytic enhancing activity for 1-7 days at 50-65° C. compared to a control hydrolysis with equal total protein loading without cellulolytic enhancing activity (1-50 mg of cellulolytic protein/g of cellulose in PCS). In a preferred aspect, a mixture of CELLUCLAST® 1.5 L (Novozymes A/S, Bagsvaerd, Denmark) in the presence of 3% of total protein weight Aspergillus oryzae beta-glucosidase (recombinantly produced in Aspergillus oryzae according to WO 02/095014) or 3% of total protein weight Aspergillus fumigatus beta-glucosidase (recombinantly produced in Aspergillus oryzae as described in WO 2002/095014) of cellulase protein loading is used as the source of the cellulolytic activity.

The polypeptides having cellulolytic enhancing activity enhance the hydrolysis of a cellulosic material catalyzed by proteins having cellulolytic activity by reducing the amount of cellulolytic enzyme required to reach the same degree of hydrolysis preferably at least 1.01-fold, more preferably at least 1.05-fold, more preferably at least 1.10-fold, more preferably at least 1.25-fold, more preferably at least 1.5-fold, more preferably at least 2-fold, more preferably at least 3-fold, more preferably at least 4-fold, more preferably at least 5-fold, even more preferably at least 10-fold, and most preferably at least 20-fold.

Family 61 glycoside hydrolase: The term "Family 61 glycoside hydrolase" or "Family GH61" is defined herein as a polypeptide falling into the glycoside hydrolase Family 61 according to Henrissat B., 1991, A classification of glycosyl hydrolases based on amino-acid sequence similarities, Biochem. J. 280: 309-316, and Henrissat B., and Bairoch A., 1996, Updating the sequence-based classification of glycosyl hydrolases, Biochem. J. 316: 695-696. Presently, Henrissat lists the GH61 Family as unclassified, indicating that properties such as mechanism, catalytic nucleophile/base, and catalytic proton donors are not known for polypeptides belonging to this family.

Xylan degrading activity: The terms "xylan degrading activity" or "xylanolytic activity" are defined herein as a biological activity that hydrolyzes xylan-containing material. The two basic approaches for measuring xylanolytic activity include: (1) measuring the total xylanolytic activity, and (2) measuring the individual xylanolytic activities (endoxylanases, beta-xylosidases, arabinofuranosidases, alpha-glucuronidases, acetylxylan esterases, feruloyl esterases, and alpha-glucuronyl esterases). Recent progress in assays of xylanolytic enzymes was summarized in several publications including Biely and Puchard, Recent progress in the assays of xylanolytic enzymes, 2006, Journal of the Science of Food and Agriculture 86(11): 1636-1647; Spanikova and Biely, 2006, Glucuronoyl esterase - Novel carbohydrate esterase produced by Schizophyllum commune, FEBS Letters 580(19): 4597-4601; Herrmann, Vrsanska, Jurickova, Hirsch, Biely, and Kubicek, 1997, The beta-D-xylosidase of Trichoderma reesei is a multifunctional beta-D-xylan xylohydrolase, Biochemical Journal 321: 375-381.

Total xylan degrading activity can be measured by determining the reducing sugars formed from various types of xylan, including oat spelt, beechwood, and larchwood xylans, or by photometric determination of dyed xylan fragments released from various covalently dyed xylans. The most common total xylanolytic activity assay is based on production of reducing sugars from polymeric 4-O-methylglucuronoxylan as described in Bailey, Biely, Poutanen, 1992, Interlaboratory testing of methods for assay of xylanase activity, Journal of Biotechnology 23(3): 257-270.

For purposes of the present invention, xylan degrading activity is determined by measuring the increase in hydrolysis of birchwood xylan (Sigma Chemical Co., Inc., St. Louis, Mo., USA) by xylan-degrading enzyme(s) under the following typical conditions: 1 ml reactions, 5 mg/ml substrate (total solids), 5 mg of xylanolytic protein/g of substrate, 50 mM sodium acetate pH 5, 50° C., 24 hours, sugar analysis using p-hydroxybenzoic acid hydrazide (PHBAH) assay as described by Lever, 1972, A new reaction for colorimetric determination of carbohydrates, Anal. Biochem 47: 273-279.

Xylanase activity: The term "xylanase activity" is defined herein as a 1,4-beta-D-xylan-xylohydrolase activity (E.C. 3.2.1.8) that catalyzes the endo-hydrolysis of 1,4-beta-D-xylosidic linkages in xylans. For purposes of the present invention, xylanase activity is determined using birchwood xylan as substrate. One unit of xylanase activity is defined as 1.0 μmole of reducing sugar (measured in glucose equivalents as described by Lever, 1972, A new reaction for colorimetric determination of carbohydrates, Anal. Biochem 47: 273-279) produced per minute during the initial period of hydrolysis at 50° C., pH 5 from 2 g of birchwood xylan per liter as substrate in 50 mM sodium acetate containing 0.01% TWEEN® 20.

Beta-xylosidase activity: The term "beta-xylosidase activity" is defined herein as a beta-D-xyloside xylohydrolase (E.C. 3.2.1.37) that catalyzes the exo-hydrolysis of short beta (1->4)-xylooligosaccharides, to remove successive D-xylose residues from the non-reducing termini. For purposes of the present invention, one unit of beta-xylosidase activity is defined as 1.0 μmole of p-nitrophenol produced per minute at 40° C., pH 5 from 1 mM p-nitrophenyl-beta-D-xyloside as substrate in 100 mM sodium citrate containing 0.01% TWEEN® 20.

Acetylxylan esterase activity: The term "acetylxylan esterase activity" is defined herein as a carboxylesterase activity (EC 3.1.1.72) that catalyses the hydrolysis of acetyl groups from polymeric xylan, acetylated xylose, acetylated glucose, alpha-naphthyl acetate, and p-nitrophenyl acetate. For purposes of the present invention, acetylxylan esterase activity is determined using 0.5 mM p-nitrophenylacetate as substrate in 50 mM sodium acetate pH 5.0 containing 0.01% TWEEN™ 20. One unit of acetylxylan esterase activity is defined as the amount of enzyme capable of releasing 1 μmole of p-nitrophenolate anion per minute at pH 5, 25° C.

Feruloyl esterase activity: The term "feruloyl esterase activity" is defined herein as a 4-hydroxy-3-methoxycinnamoyl-sugar hydrolase activity (EC 3.1.1.73) that catalyzes the hydrolysis of the 4-hydroxy-3-methoxycinnamoyl (feruloyl) group from an esterified sugar, which is usually arabinose in "natural" substrates, to produce ferulate (4-hydroxy-3-methoxycinnamate). Feruloyl esterase is also known as ferulic acid esterase, hydroxycinnamoyl esterase, FAE-III, cinnamoyl ester hydrolase, FAEA, cinnAE, FAE-I, or FAE-II. For purposes of the present invention, feruloyl esterase activity is determined using 0.5 mM p-nitrophenylferulate as substrate in 50 mM sodium acetate pH 5.0. One unit of feruloyl esterase activity equals the amount of enzyme capable of releasing 1 μmole of p-nitrophenolate anion per minute at pH 5, 25° C.

Alpha-glucuronidase activity: The term "alpha-glucuronidase activity" is defined herein as an alpha-D-glucosiduronate glucuronohydrolase activity (EC 3.2.1.139) that catalyzes the hydrolysis of an alpha-D-glucuronoside to D-glucuronate and an alcohol. For purposes of the present invention, alpha-glucuronidase activity is determined according to de Vries, 1998, J. Bacteriol. 180: 243-249. One unit of alpha-glucuronidase activity equals the amount of enzyme capable of releasing 1 μmole of glucuronic or 4-O-methylglucuronic acid per minute at pH 5, 40° C.

Alpha-L-arabinofuranosidase activity: The term "alpha-L-arabinofuranosidase activity" is defined herein as an alpha-L-arabinofuranoside arabinofuranohydrolase activity (EC 3.2.1.55) that catalyzes the hydrolysis of terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabinosides. The enzyme activity acts on alpha-L-arabinofuranosides, alpha-L-arabinans containing (1,3)- and/or (1,5)-linkages, arabinoxylans, and arabinogalactans. Alpha-L-arabinofuranosidase is also known as arabinosidase, alpha-arabinosidase, alpha-L-arabinosidase, alpha-arabinofuranosidase, polysaccharide alpha-L-arabinofuranosidase, alpha-L-arabinofuranoside hydrolase, L-arabinosidase, or alpha-L-arabinanase. For purposes of the present invention, alpha-L-arabinofuranosidase activity is determined using 5 mg of medium viscosity wheat arabinoxylan (Megazyme International Ireland, Ltd., Bray, Co. Wicklow, Ireland) per ml of 100 mM sodium acetate pH 5 in a total volume of 200 μl for 30 minutes at 40° C. followed by arabinose analysis by AMINEX® HPX-87H column chromatography (Bio-Rad Laboratories, Inc., Hercules, Calif., USA).

Cellulosic material: The cellulosic material can be any material containing cellulose. The predominant polysaccharide in the primary cell wall of biomass is cellulose, the second most abundant is hemicellulose, and the third is pectin. The secondary cell wall, produced after the cell has stopped growing, also contains polysaccharides and is strengthened by polymeric lignin covalently cross-linked to hemicellulose. Cellulose is a homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in complex branched structures with a spectrum of substituents. Although generally polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other hemicelluloses, which help stabilize the cell wall matrix.

Cellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees. The cellulosic material can be, but is not limited to, herbaceous material, agricultural residue, forestry residue, municipal solid waste, waste paper, and pulp and paper mill residue (see, for example, Wiselogel et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp. 105-118, Taylor & Francis, Washington D.C.; Wyman, 1994, Bioresource Technology 50: 3-16; Lynd, 1990, Applied Biochemistry and Biotechnology 24/25: 695-719; Mosier et al., 1999, Recent Progress in Bioconversion of Lignocellulosics, in Advances in Biochemical Engineering/Biotechnology, T. Scheper, managing editor, Volume 65, pp. 23-40, Springer-Verlag, New York). It is understood herein that the cellulose may be in the form of lignocellulose, a plant cell wall material containing lignin, cellulose, and hemicellulose in a mixed matrix. In a preferred aspect, the cellulosic material is lignocellulose.

In one aspect, the cellulosic material is herbaceous material. In another aspect, the cellulosic material is agricultural residue. In another aspect, the cellulosic material is forestry residue. In another aspect, the cellulosic material is municipal solid waste. In another aspect, the cellulosic material is waste paper. In another aspect, the cellulosic material is pulp and paper mill residue.

In another aspect, the cellulosic material is corn stover. In another aspect, the cellulosic material is corn fiber. In another aspect, the cellulosic material is corn cob. In another aspect, the cellulosic material is orange peel. In another aspect, the cellulosic material is rice straw. In another aspect, the cellulosic material is wheat straw. In another aspect, the cellulosic material is switch grass. In another aspect, the cellulosic material is miscanthus. In another aspect, the cellulosic material is bagasse.

In another aspect, the cellulosic material is microcrystalline cellulose. In another aspect, the cellulosic material is bacterial cellulose. In another aspect, the cellulosic material is algal cellulose. In another aspect, the cellulosic material is cotton linter. In another aspect, the cellulosic material is amorphous phosphoric-acid treated cellulose. In another aspect, the cellulosic material is filter paper.

The cellulosic material may be used as is or may be subjected to pretreatment, using conventional methods known in the art, as described herein. In a preferred aspect, the cellulosic material is pretreated.

Pretreated corn stover: The term "PCS" or "Pretreated Corn Stover" is defined herein as a cellulosic material derived from corn stover by treatment with heat and dilute sulfuric acid.

Xylan-containing material: The term "xylan-containing material" is defined herein as any material comprising a plant cell wall polysaccharide containing a backbone of beta-(1-4)-linked xylose residues. Xylans of terrestrial plants are heteropolymers possessing a beta-(1-4)-D-xylopyranose backbone, which is branched by short carbohydrate chains. They comprise D-glucuronic acid or its 4-O-methyl ether, L-arabinose, and/or various oligosaccharides, composed of D-xylose, L-arabinose, D- or L-galactose, and D-glucose. Xylan-type polysaccharides can be divided into homoxylans and heteroxylans, which include glucuronoxylans, (arabino)glucuronoxylans, (glucurono)arabinoxylans, arabinoxylans, and complex heteroxylans. See, for example, Ebringerova et al., 2005, Adv. Polym. Sci. 186: 1-67.

In the methods of the present invention, any material containing xylan may be used. In a preferred aspect, the xylan-containing material is lignocellulose.

Isolated polypeptide: The term "isolated polypeptide" as used herein refers to a polypeptide that is isolated from a source. In a preferred aspect, the polypeptide is at least 1% pure, preferably at least 5% pure, more preferably at least 10% pure, more preferably at least 20% pure, more preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, and most preferably at least 90% pure, as determined by SDS-PAGE.

Substantially pure polypeptide: The term "substantially pure polypeptide" denotes herein a polypeptide preparation that contains at most 10%, preferably at most 8%, more preferably at most 6%, more preferably at most 5%, more preferably at most 4%, more preferably at most 3%, even more preferably at most 2%, most preferably at most 1%, and even most preferably at most 0.5% by weight of other polypeptide material with which it is natively or recombinantly associated. It is, therefore, preferred that the substantially pure polypeptide is at least 92% pure, preferably at least 94% pure, more preferably at least 95% pure, more preferably at least 96% pure, more preferably at least 97% pure, more preferably at least 98% pure, even more preferably at least 99% pure, most preferably at least 99.5% pure, and even most preferably 100% pure by weight of the total polypeptide material present in the preparation. The polypeptides are preferably in a substantially pure form, i.e., that the polypeptide preparation is essentially free of other polypeptide material with which it is natively or recombinantly associated. This can be accomplished, for example, by preparing the polypeptide by well-known recombinant methods or by classical purification methods.

Mature polypeptide: The term "mature polypeptide" is defined herein as a polypeptide in its final form following translation and any post-translational modifications, such as N-terminal processing, C-terminal truncation, glycosylation, phosphorylation, etc.

Mature polypeptide coding sequence: The term "mature polypeptide coding sequence" is defined herein as a nucleotide sequence that encodes a mature polypeptide.

Identity: The relatedness between two amino acid sequences or between two nucleotide sequences is described by the parameter "identity".

For purposes of the present invention, the degree of identity between two amino acid sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, J. Mol. Biol. 48: 443-453) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, Trends in Genetics 16: 276-277), preferably version 3.0.0 or later. The optional parameters used are gap open penalty of 10, gap extension penalty of 0.5, and the EBLOSUM62 (EMBOSS version of BLOSUM62) substitution matrix. The output of Needle labeled "longest identity" (obtained using the -nobrief option) is used as the percent identity and is calculated as follows:

(Identical Residues x 100)/(Length of Alignment - Total Number of Gaps in Alignment)

For purposes of the present invention, the degree of identity between two deoxyribonucleotide sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, supra) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, supra), preferably version 3.0.0 or later. The optional parameters used are gap open penalty of 10, gap extension penalty of 0.5, and the EDNAFULL (EMBOSS version of NCBI NUC4.4) substitution matrix. The output of Needle labeled "longest identity" (obtained using the -nobrief option) is used as the [...]

[...] populations. Gene mutations can be silent (no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene.

Isolated polynucleotide: The term "isolated polynucleotide" as used herein refers to a polynucleotide that is isolated from a source. In a preferred aspect, the polynucleotide is at least 1% pure, preferably at least 5% pure, more preferably at least 10% pure, more preferably at least 20% pure, more preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, and most preferably at least 90% pure, as determined by agarose electrophoresis.

Substantially pure polynucleotide: The term "substantially pure polynucleotide" as used herein refers to a polynucleotide preparation free of other extraneous or unwanted nucleotides and in a form suitable for use within genetically engineered protein production systems. Thus, a substantially pure polynucleotide contains at most 10%, preferably at most 8%, more preferably at most 6%, more preferably at most 5%, more preferably at most 4%, more preferably at most 3%, even more preferably at most 2%, most preferably at most 1%, and even most preferably at most 0.5% by weight of other polynucleotide material with which it is natively or recombinantly associated. A substantially pure polynucleotide may, however, include naturally occurring 5' and 3' untranslated regions, such as promoters and terminators. It is preferred that the substantially pure polynucleotide is at least 90% pure, preferably at least 92% pure, more preferably at least 94% pure, more preferably at least 95% pure, more preferably at least 96% pure, more preferably at least 97% pure, even more preferably at least 98% pure, most preferably at least 99% pure, and even most preferably at least 99.5% pure by weight. The polynucleotides are preferably in a substantially pure form, i.e., that the polynucleotide preparation is essentially free of other polynucleotide material with which it is natively or recombinantly associated. The polynucleotides may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

Coding sequence: When used herein the term "coding sequence" means a nucleotide sequence, which directly specifies the amino acid sequence of its protein product.
The percent identity and is calculated as follows: boundaries of the coding sequence are generally determined by an open reading frame, which usually begins with the ATG (Identical Deoxyribonucleotidesx100)/(Length of 45 start codon or alternative start codons such as GTG and TTG Alignment-Total Number of Gaps in Alignment) and ends with a stop codon such as TAA, TAG, and TGA. The Homologous sequence: The term "homologous sequence' coding sequence may be a DNA, cDNA, synthetic, or recom is defined herein as a predicted protein having an E value (or binant nucleotide sequence. expectancy score) of less than 0.001 in a tifasty search (Pear cDNA: The term “cDNA is defined herein as a DNA son, W. R., 1999, in Bioinformatics Methods and Protocols, 50 molecule that can be prepared by reverse transcription from a S. Misener and S. A. Krawetz, ed., pp. 185-219) with a mature, spliced, mRNA molecule obtained from a eukaryotic polypeptide of interest. cell. cDNA lacks intron sequences that may be present in the Polypeptide fragment: The term “polypeptide fragment' is corresponding genomic DNA. The initial, primary RNA tran defined herein as a polypeptide having one or more (several) Script is a precursor to mRNA that is processed through a amino acids deleted from the amino and/or carboxyl terminus 55 series of steps before appearing as mature spliced mRNA. of a mature polypeptide or a homologous sequence thereof, These steps include the removal of intron sequences by a wherein the fragment has biological activity. process called splicing. cDNA derived from mRNA lacks, Subsequence: The term “subsequence' is defined hereinas therefore, any intron sequences. 
a nucleotide sequence having one or more (several) nucle Nucleic acid construct: The term “nucleic acid construct’ otides deleted from the 5' and/or 3' end of a mature polypep 60 as used herein refers to a nucleic acid molecule, either single tide coding sequence or a homologous sequence thereof, or double-stranded, which is isolated from a naturally occur wherein the Subsequence encodes a polypeptide fragment ring gene or which is modified to contain segments of nucleic having biological activity. acids in a manner that would not otherwise exist in nature or Allelic variant: The term “allelic variant denotes herein which is synthetic. The term nucleic acid construct is synony any of two or more alternative forms of a gene occupying the 65 mous with the term “expression cassette' when the nucleic same chromosomal locus. Allelic variation arises naturally acid construct contains the control sequences required for through mutation, and may result in polymorphism within expression of a coding sequence. US 8,426,158 B2 11 12 Control sequences: The term "control sequences” is polypeptide having peroxidase activity. In one aspect, the defined herein to include all components necessary for the fermenting of the cellulosic material produces a fermentation expression of a polynucleotide encoding a polypeptide. Each product. In another aspect, the method further comprises control sequence may be native or foreign to the nucleotide recovering the fermentation product from the fermentation. sequence encoding the polypeptide or native or foreign to In each of the methods described above, the presence of the each other. Such control sequences include, but are not lim polypeptide having peroxidase activity increases the hydroly ited to, a leader, polyadenylation sequence, propeptide sis of the cellulosic material compared to the absence of the sequence, promoter, signal peptide sequence, and transcrip polypeptide having peroxidase activity. tion terminator. 
At a minimum, the control sequences include Preferably, the K of the peroxide-decomposing enzyme a promoter, and transcriptional and translational stop signals. 10 or peroxidase is in the range of preferably 0.0001 to 50 mM, The control sequences may be provided with linkers for the more preferably 0.001 to 10 mM, even more preferably 0.005 purpose of introducing specific restriction sites facilitating to 1 mM, and most preferably 0.01 to 0.1 mM. In one aspect, ligation of the control sequences with the coding region of the the K of the peroxide-decomposing enzyme or peroxidase is nucleotide sequence encoding a polypeptide. in the range of 0.0001 to 50 mM. In another aspect, the K of Operably linked: The term “operably linked denotes 15 the peroxide-decomposing enzyme or peroxidase is in the hereina configuration in which a control sequence is placed at range of 0.001 to 10 mM. In another aspect, the K of the an appropriate position relative to the coding sequence of the peroxide-decomposing enzyme or peroxidase is in the range polynucleotide sequence such that the control sequence of 0.005 to 1 mM. In another aspect, the K of the peroxide directs the expression of the coding sequence of a polypep decomposing enzyme or peroxidase is in the range of 0.01 to tide. 0.1 mM. Expression: The term “expression' includes any step In one aspect, in each of the methods described above, the involved in the production of a polypeptide including, but not enzyme composition further comprises a peroxide-generat limited to, transcription, post-transcriptional modification, ing system. In another aspect, the cellulosic material com translation, post-translational modification, and secretion. prises a peroxide-generating system. 
The presence of the Expression vector: The term “expression vector” is defined 25 peroxide-generating system and the polypeptide having per herein as a linear or circular DNA molecule that comprises a oxidase activity increases the hydrolysis of the cellulosic polynucleotide encoding a polypeptide and is operably linked material compared to the presence of the peroxide-generating to additional nucleotides that provide for its expression. system and the absence of the polypeptide having peroxidase Host cell: The term “host cell, as used herein, includes any activity. cell type that is Susceptible to transformation, transfection, 30 In another aspect, the peroxide-generating system is a transduction, and the like with a nucleic acid construct or hydrogen peroxide-generating enzyme. expression vector comprising a polynucleotide of the present The methods of the present invention can be used to sac invention. charify a cellulosic material to fermentable Sugars and con Modification: The term “modification” means herein any Vert the fermentable Sugars to many useful Substances, e.g., chemical modification of a polypeptide, as well as genetic 35 fuel, potable ethanol, and/or fermentation products (e.g., manipulation of the DNA encoding the polypeptide. The acids, alcohols, ketones, gases, and the like). The production modification can be a Substitution, a deletion and/or an inser of a desired fermentation product from cellulosic material tion of one or more (several) amino acids as well as replace typically involves pretreatment, enzymatic hydrolysis (sac ments of one or more (several) amino acid side chains. charification), and fermentation. Artificial variant: When used herein, the term “artificial 40 The processing of cellulosic material according to the variant’ means a polypeptide produced by an organism present invention can be accomplished using processes con expressing a modified polynucleotide sequence encoding a ventional in the art. 
Moreover, the methods of the present polypeptide variant. The modified nucleotide sequence is invention can be implemented using any conventional biom obtained through human intervention by modification of the ass processing apparatus configured to operate in accordance polynucleotide sequence. 45 with the invention. Hydrolysis (Saccharification) and fermentation, separate or DETAILED DESCRIPTION OF THE INVENTION simultaneous, include, but are not limited to, separate hydrolysis and fermentation (SHF); simultaneous saccharifi The present invention relates to methods for degrading or cation and fermentation (SSF); simultaneous saccharification converting a cellulosic material, comprising: treating the cel 50 and cofermentation (SSCF); hybrid hydrolysis and fermen lulosic material with an enzyme composition in the presence tation (HHF); separate hydrolysis and co-fermentation of a polypeptide having peroxidase activity. In one aspect, the (SHCF); hybrid hydrolysis and fermentation (HHCF); and method further comprises recovering the degraded or con direct microbial conversion (DMC). SHF uses separate pro verted cellulosic material. cess steps to first enzymatically hydrolyze cellulosic material The present invention also relates to methods for producing 55 to fermentable Sugars, e.g., glucose, cellobiose, cellotriose, a fermentation product, comprising: (a) Saccharifying a cel and pentose Sugars, and then ferment the fermentable Sugars lulosic material with an enzyme composition in the presence to ethanol. In SSF, the enzymatic hydrolysis of cellulosic of a polypeptide having peroxidase activity; (b) fermenting material and the fermentation of Sugars to ethanol are com the saccharified cellulosic material with one or more (several) bined in one step (Philippidis, G. 
P., 1996, Cellulose biocon fermenting microorganisms to produce the fermentation 60 version technology, in Handbook on Bioethanol: Production product; and (c) recovering the fermentation product from the and Utilization, Wyman, C. E., ed., Taylor & Francis, Wash fermentation. ington, D.C., 179–212). SSCF involves the cofermentation of The present invention further relates to methods of fer multiple sugars (Sheehan, J., and Himmel, M., 1999, menting a cellulosic material, comprising: fermenting the Enzymes, energy and the environment: A Strategic perspec cellulosic material with one or more (several) fermenting 65 tive on the U.S. Department of Energy's research and devel microorganisms, wherein the cellulosic material is hydro opment activities for bioethanol, Biotechnol. Prog. 15: 817 lyzed with an enzyme composition in the presence of a 827). HHF involves a separate hydrolysis step, and in addition US 8,426,158 B2 13 14 a simultaneous saccharification and hydrolysis step, which The cellulosic material can be pretreated before hydrolysis can be carried out in the same reactor. The steps in an HHF and/or fermentation. Pretreatment is preferably performed process can be carried out at different temperatures, i.e., high prior to the hydrolysis. Alternatively, the pretreatment can be temperature enzymatic saccharification followed by SSF at a carried out simultaneously with enzyme hydrolysis to release lower temperature that the fermentation strain can tolerate. 5 fermentable Sugars, such as glucose, Xylose, and/or cello DMC combines all three processes (enzyme production, biose. In most cases the pretreatment step itself results in hydrolysis, and fermentation) in one or more (several) steps Some conversion of biomass to fermentable Sugars (even in where the same organism is used to produce the enzymes for absence of enzymes). conversion of the cellulosic material to fermentable Sugars Steam Pretreatment. 
In steam pretreatment, cellulosic and to convert the fermentable Sugars into a final product 10 (Lynd, L.R., Weimer, P.J. van Zyl. W. H., and Pretorius, I.S., material is heated to disrupt the plant cell wall components, 2002, Microbial cellulose utilization: Fundamentals and bio including lignin, hemicellulose, and cellulose to make the technology, Microbiol. Mol. Biol. Reviews 66:506-577). It is cellulose and other fractions, e.g., hemicellulose, accessible understood herein that any method known in the art compris to enzymes. Cellulosic material is passed to or through a ing pretreatment, enzymatic hydrolysis (Saccharification), 15 reaction vessel where steam is injected to increase the tem fermentation, or a combination thereof, can be used in the perature to the required temperature and pressure and is practicing the methods of the present invention. retained therein for the desired reaction time. Steam pretreat A conventional apparatus can include a fed-batch stirred ment is preferably done at 140-230° C., more preferably reactor, a batch stirred reactor, a continuous flow stirred reac 160-200° C., and most preferably 170-190° C., where the tor with ultrafiltration, and/or a continuous plug-flow column optimal temperature range depends on any addition of a reactor (Fernanda de Castilhos Corazza, Flávio Faria de chemical catalyst. Residence time for the Steam pretreatment Moraes, Gisella Maria Zanin and Ivo Neitzel, 2003, Optimal is preferably 1-15 minutes, more preferably 3-12 minutes, control in fed-batch reactor for the cellobiose hydrolysis, and most preferably 4-10 minutes, where the optimal resi Acta Scientiarum. Technology 25:33-38; Gusakov, A.V., and dence time depends on temperature range and any addition of Sinitsyn, A. P., 1985, Kinetics of the enzymatic hydrolysis of 25 a chemical catalyst. Steam pretreatment allows for relatively cellulose: 1. 
A mathematical model for a batch reactor pro high Solids loadings, so that cellulosic material is generally cess, Enz. Microb. Technol. 7: 346-352), an attrition reactor only moist during the pretreatment. The steam pretreatment is (Ryu, S. K., and Lee, J. M., 1983, Bioconversion of waste often combined with an explosive discharge of the material cellulose by using an attrition bioreactor, Biotechnol. Bioeng. after the pretreatment, which is known as steam explosion, 25: 53-65), or a reactor with intensive stirring induced by an 30 electromagnetic field (Gusakov, A.V. Sinitsyn, A. P. Davy that is, rapid flashing to atmospheric pressure and turbulent dkin, I.Y., Davydkin, V.Y., Protas, O.V., 1996, Enhancement flow of the material to increase the accessible surface area by of enzymatic cellulose hydrolysis using a novel type of biore fragmentation (Duff and Murray, 1996, Bioresource Technol actor with intensive stirring induced by electromagnetic field, ogy 855: 1-33; Galbe and Zacchi, 2002, Appl. Microbiol. Appl. Biochem. Biotechnol. 56: 141-153). Additional reactor 35 Biotechnol. 59: 618-628; U.S. Patent Application No. types include: fluidized bed, upflow blanket, immobilized, 20020164730). During steam pretreatment, hemicellulose and extruder type reactors for hydrolysis and/or fermentation. acetyl groups are cleaved and the resulting acid autocatalyzes Pretreatment. In practicing the methods of the present partial hydrolysis of the hemicellulose to monosaccharides invention, any pretreatment process known in the art can be and oligosaccharides. Lignin is removed to only a limited used to disrupt plant cell wall components of cellulosic mate 40 eXtent. rial (Chandra et al., 2007, Substrate pretreatment: The key to A catalyst such as HSO or SO (typically 0.3 to 3% w/w) effective enzymatic hydrolysis of lignocellulosics? Adv. Bio is often added prior to steam pretreatment, which decreases chem. Engin./Biotechnol. 
108: 67-93; Galbe and Zacchi, the time and temperature, increases the recovery, and 2007, Pretreatment of lignocellulosic materials for efficient improves enzymatic hydrolysis (Ballesteros et al., 2006, bioethanol production, Adv. Biochem. Engin./Biotechnol. 45 Appl. Biochem. Biotechnol. 129-132: 496-508; Varga et al., 108: 41-65; Hendriks and Zeeman, 2009, Pretreatments to 2004, Appl. Biochem. Biotechnol. 113-116:509-523: Sassner enhance the digestibility of lignocellulosic biomass, Biore et al., 2006, Enzyme Microb. Technol. 39: 756-762). source Technol. 100: 10-18; Mosier et al., 2005, Features of Chemical Pretreatment: The term “chemical treatment” promising technologies for pretreatment of lignocellulosic refers to any chemical pretreatment that promotes the sepa biomass, Bioresource Technol. 96: 673-686; Taherzadeh and 50 ration and/or release of cellulose, hemicellulose, and/or lig Karimi, 2008, Pretreatment of lignocellulosic wastes to nin. Examples of Suitable chemical pretreatment processes improve ethanol and biogas production: A review, Int. J. of include, for example, dilute acid pretreatment, lime pretreat Mol. Sci. 9; 1621-1651; Yang and Wyman, 2008, Pretreat ment, wet oxidation, ammonia fiber/freeze explosion ment: the key to unlocking low-cost cellulosic ethanol, Bio (AFEX), ammonia percolation (APR), and organosolv pre fiels Bioproducts and Biorefining-Bio?pr. 2: 26-40). 55 treatmentS. The cellulosic material can also be subjected to particle In dilute acid pretreatment, cellulosic material is mixed size reduction, pre-soaking, wetting, washing, or condition with dilute acid, typically HSO, and water to form a slurry, ing prior to pretreatment using methods known in the art. heated by steam to the desired temperature, and after a resi Conventional pretreatments include, but are not limited to, dence time flashed to atmospheric pressure. 
The dilute acid steam pretreatment (with or without explosion), dilute acid 60 pretreatment can be performed with a number of reactor pretreatment, hot waterpretreatment, alkaline pretreatment, designs, e.g., plug-flow reactors, counter-current reactors, or lime pretreatment, wet oxidation, wet explosion, ammonia continuous counter-current shrinking bed reactors (Duff and fiber explosion, organosolv pretreatment, and biological pre Murray, 1996, supra; Schellet al., 2004, Bioresource Technol. treatment. Additional pretreatments include ammonia perco 91: 179-188: Lee et al., 1999, Adv. Biochem. Eng. Biotechnol. lation, ultrasound, electroporation, microwave, Supercritical 65 65:93-115). CO, Supercritical H2O, oZone, and gamma irradiation pre Several methods of pretreatment under alkaline conditions treatmentS. can also be used. These alkaline pretreatments include, but US 8,426,158 B2 15 16 are not limited to, lime pretreatment, wet oxidation, ammonia 160-220° C., and more preferably 165-195°C., for periods percolation (APR), and ammonia fiber/freeze explosion ranging from seconds to minutes to, e.g., 1 second to 60 (AFEX). minutes. Lime pretreatment is performed with calcium carbonate, In another aspect, pretreatment is carried out as an ammo Sodium hydroxide, or ammonia at low temperatures of 5 nia fiber explosion step (AFEX pretreatment step). 85-150° C. and residence times from 1 hour to several days In another aspect, pretreatment takes place in an aqueous (Wyman et al., 2005, Bioresource Technol. 96: 1959-1966: slurry. In preferred aspects, cellulosic material is present dur Mosier et al., 2005, Bioresource Technol. 96: 673-686). WO ing pretreatment in amounts preferably between 10-80 wt %, 2006/110891, WO 2006/11899, WO 2006/11900, and WO more preferably between 20-70 wt %, and most preferably 10 between 30-60 wt %, such as around 50 wt %. The pretreated 2006/110901 disclose pretreatment methods using ammonia. 
cellulosic material can be unwashed or washed using any Wet oxidation is a thermal pretreatment performed typi method known in the art, e.g., washed with water. cally at 180-200° C. for 5-15 minutes with addition of an Mechanical Pretreatment: The term "mechanical pretreat oxidative agent such as hydrogen peroxide or over-pressure ment” refers to various types of grinding or milling (e.g., dry of oxygen (Schmidt and Thomsen, 1998, Bioresource Tech 15 milling, wet milling, or vibratory ball milling). mol. 64: 139-151; Palonen et al., 2004, Appl. Biochem. Bio Physical Pretreatment: The term “physical pretreatment” technol. 117: 1-17: Varga et al., 2004, Biotechnol. Bioeng. 88: refers to any pretreatment that promotes the separation and/or 567-574; Martin et al., 2006, J. Chem. Technol. Biotechnol. release of cellulose, hemicellulose, and/or lignin from cellu 81: 1669-1677). The pretreatment is performed at preferably losic material. For example, physical pretreatment can 1-40% dry matter, more preferably 2-30% dry matter, and involve irradiation (e.g., microwave irradiation), Steaming/ most preferably 5-20% dry matter, and often the initial pH is steam explosion, hydrothermolysis, and combinations increased by the addition of alkali Such as sodium carbonate. thereof. A modification of the wet oxidation pretreatment method, Physical pretreatment can involve high pressure and/or known as wet explosion (combination of wet oxidation and high temperature (steam explosion). In one aspect, high pres steam explosion), can handle dry matter up to 30%. In wet 25 sure means pressure in the range of preferably about 300 to explosion, the oxidizing agent is introduced during pretreat about 600 psi, more preferably about 350 to about 550 psi, ment after a certain residence time. The pretreatment is then and most preferably about 400 to about 500 psi, such as ended by flashing to atmospheric pressure (WO 2006/ around 450 psi. In another aspect, high temperature means 032282). 
temperatures in the range of about 100 to about 300° C. 30 preferably about 140 to about 235°C. In a preferred aspect, Ammonia fiber explosion (AFEX) involves treating cellu mechanical pretreatment is performed in a batch-process, losic material with liquid or gaseous ammonia at moderate steam gun hydrolyzer system that uses high pressure and high temperatures such as 90-100° C. and high pressure such as temperature as defined above, e.g., a Sunds Hydrolyzer avail 17-20 bar for 5-10 minutes, where the dry matter content can able from Sunds Defibrator AB, Sweden. be as high as 60% (Gollapalli et al., 2002, Appl. Biochem. 35 Combined Physical and Chemical Pretreatment: Cellulo Biotechnol. 98: 23-35; Chundawat et al., 2007, Biotechnol. sic material can be pretreated both physically and chemically. Bioeng. 96: 219-231: Alizadeh et al., 2005, Appl. Biochem. For instance, the pretreatment step can involve dilute or mild Biotechnol. 121: 1133-1141; Teymouri et al., 2005, Biore acid treatment and high temperature and/or pressure treat source Technol. 96: 2014-2018). AFEX pretreatment results ment. The physical and chemical pretreatments can be carried in the depolymerization of cellulose and partial hydrolysis of 40 out sequentially or simultaneously, as desired. A mechanical hemicellulose. Lignin-carbohydrate complexes are cleaved. pretreatment can also be included. Organosolv pretreatment delignifies cellulosic material by Accordingly, in a preferred aspect, cellulosic material is extraction using aqueous ethanol (40-60% ethanol) at 160 Subjected to mechanical, chemical, or physical pretreatment, 200° C. for 30-60 minutes (Pan et al., 2005, Biotechnol. or any combination thereof, to promote the separation and/or Bioeng.90: 473–481; Panet al., 2006, Biotechnol. Bioeng.94: 45 release of cellulose, hemicellulose, and/or lignin. 851-861; Kurabietal., 2005, Appl. Biochem. Biotechnol. 121: Biological Pretreatment: The term “biological pretreat 219-230). 
Sulphuric acid is usually added as a catalyst. In ment” refers to any biological pretreatment that promotes the organosolv pretreatment, the majority of hemicellulose is separation and/or release of cellulose, hemicellulose, and/or removed. lignin from cellulosic material. Biological pretreatment tech Other examples of suitable pretreatment methods are 50 niques can involve applying lignin-Solubilizing microorgan described by Schellet al., 2003, Appl. Biochem. and Biotech isms (see, for example, Hsu, T-A., 1996, Pretreatment of mol. Vol. 105-108, p. 69-85, and Mosier et al., 2005, Biore biomass, in Handbook on Bioethanol. Production and Utili source Technology 96: 673-686, and U.S. Published Applica zation, Wyman, C. E., ed., Taylor & Francis, Washington, tion 2002/0164730. D.C., 179-212; Ghosh and Singh, 1993, Physicochemical and In one aspect, the chemical pretreatment is preferably car 55 biological treatments for enzymatic/microbial conversion of ried out as an acid treatment, and more preferably as a con cellulosic biomass, Adv. Appl. Microbiol. 39: 295-333; tinuous dilute and/or mild acid treatment. The acid is typi McMillan, J. D., 1994, Pretreating lignocellulosic biomass: a cally Sulfuric acid, but other acids can also be used, such as review, in Enzymatic Conversion of Biomass for Fuels Pro acetic acid, citric acid, nitric acid, phosphoric acid, tartaric duction, Himmel, M.E., Baker, J.O., and Overend, R.P., eds., acid, Succinic acid, hydrogen chloride, or mixtures thereof. 60 ACS Symposium Series 566, American Chemical Society, Mild acid treatment is conducted in the pH range of prefer Washington, D.C., chapter 15: Gong, C. S., Cao, N.J., Du, J., ably 1-5, more preferably 1-4, and most preferably 1-3. In one and Tsao, G. 
T., 1999, Ethanol production from renewable aspect, the acid concentration is in the range from preferably resources, in Advances in Biochemical Engineering/Biotech 0.01 to 20 wt % acid, more preferably 0.05 to 10 wt % acid, nology, Scheper, T., ed., Springer-Verlag Berlin Heidelberg, even more preferably 0.1 to 5 wt % acid, and most preferably 65 Germany, 65: 207-241; Olsson and Hahn-Hagerdal, 1996, 0.2 to 2.0 wt % acid. The acid is contacted with cellulosic Fermentation of lignocellulosic hydrolysates for ethanol pro material and held at a temperature in the range of preferably duction, Enz. Microb. Tech. 18:312-331; and Vallander and US 8,426,158 B2 17 18 Eriksson, 1990, Production of ethanol from lignocellulosic about 0.01 to about 40 mg, more preferably about 0.01 to materials: State of the art, Adv. Biochem. Eng./Biotechnol. 42: about 30 mg, more preferably about 0.01 to about 20 mg. 63–95). more preferably about 0.01 to about 10 mg, more preferably Saccharification. In the hydrolysis step, also known as about 0.01 to about 5 mg, more preferably at about 0.025 to saccharification, the pretreated cellulosic material is hydro 5 about 1.5 mg, more preferably at about 0.05 to about 1.25 mg. lyzed to break down cellulose and alternatively also hemicel more preferably at about 0.075 to about 1.25 mg, more pref lulose to fermentable Sugars, such as glucose, cellobiose, erably at about 0.1 to about 1.25 mg, even more preferably at Xylose, Xylulose, arabinose, mannose, galactose, and/or about 0.15 to about 1.25 mg, and most preferably at about soluble oligosaccharides. The hydrolysis is performed enzy 0.25 to about 1.0 mg per g of cellulosic material. matically by an enzyme composition in the presence of a 10 polypeptide having peroxidase activity of the present inven In another preferred aspect, an effective amount of a tion. 
polypeptide having peroxidase activity to a hydrogen peroxide-generating enzyme is about 0.005 to about 1.0 g, preferably at about 0.01 to about 1.0 g, more preferably at about 0.15 to about 0.75 g, more preferably at about 0.15 to about 0.5 g, more preferably at about 0.1 to about 0.5 g, even more preferably at about 0.1 to about 0.5 g, and most preferably at about 0.05 to about 0.2 g per g of hydrogen peroxide-generating enzyme.

In another preferred aspect, an effective amount of polypeptide(s) having cellulolytic enhancing activity to cellulolytic protein(s) is about 0.005 to about 1.0 g, preferably at about 0.01 to about 1.0 g, more preferably at about 0.15 to about 0.75 g, more preferably at about 0.15 to about 0.5 g, more preferably at about 0.1 to about 0.5 g, even more preferably at about 0.1 to about 0.5 g, and most preferably at about 0.05 to about 0.2 g per g of cellulolytic protein(s).

Fermentation. The fermentable sugars obtained from the pretreated and hydrolyzed cellulosic material can be fermented by one or more (several) fermenting microorganisms capable of fermenting the sugars directly or indirectly into a desired fermentation product. “Fermentation” or “fermentation process” refers to any fermentation process or any process comprising a fermentation step. Fermentation processes also include fermentation processes used in the consumable alcohol industry (e.g., beer and wine), dairy industry (e.g., fermented dairy products), leather industry, and tobacco industry. The fermentation conditions depend on the desired fermentation product and fermenting organism and can easily be determined by one skilled in the art.

In the fermentation step, sugars, released from cellulosic material as a result of the pretreatment and enzymatic hydrolysis steps, are fermented to a product, e.g., ethanol, by a fermenting organism, such as yeast. Hydrolysis (saccharification) and fermentation can be separate or simultaneous, as described herein.

Any suitable hydrolyzed cellulosic material can be used in the fermentation step in practicing the present invention. The material is generally selected based on the desired fermentation product, i.e., the substance to be obtained from the fermentation, and the process employed, as is well known in the art.

The term “fermentation medium” is understood herein to refer to a medium before the fermenting microorganism(s) is(are) added, such as, a medium resulting from a saccharification process, as well as a medium used in a simultaneous saccharification and fermentation process (SSF).

“Fermenting microorganism” refers to any microorganism, including bacterial and fungal organisms, suitable for use in a desired fermentation process to produce a fermentation product. The fermenting organism can be C₆ and/or C₅ fermenting organisms, or a combination thereof.

The composition can further comprise one or more (several) hemicellulolytic enzymes. The enzymes of the compositions can also be added sequentially.

Enzymatic hydrolysis is preferably carried out in a suitable aqueous environment under conditions that can be readily determined by one skilled in the art. In a preferred aspect, hydrolysis is performed under conditions suitable for the activity of the enzyme(s), i.e., optimal for the enzyme(s). The hydrolysis can be carried out as a fed batch or continuous process where the pretreated cellulosic material (substrate) is fed gradually to, for example, an enzyme containing hydrolysis solution.

The saccharification is generally performed in stirred-tank reactors or fermentors under controlled pH, temperature, and mixing conditions. Suitable process time, temperature and pH conditions can readily be determined by one skilled in the art. For example, the saccharification can last up to 200 hours, but is typically performed for preferably about 12 to about 96 hours, more preferably about 16 to about 72 hours, and most preferably about 24 to about 48 hours. The temperature is in the range of preferably about 25° C. to about 70° C., more preferably about 30° C. to about 65° C., and more preferably about 40° C. to 60° C., in particular about 50° C. The pH is in the range of preferably about 3 to about 8, more preferably about 3.5 to about 7, and most preferably about 4 to about 6, in particular about pH 5. The dry solids content is in the range of preferably about 5 to about 50 wt %, more preferably about 10 to about 40 wt %, and most preferably about 20 to about 30 wt %.

The optimum amounts of the enzymes and polypeptides having cellulolytic enhancing activity depend on several factors including, but not limited to, the mixture of component cellulolytic enzymes, the cellulosic substrate, the concentration of cellulosic substrate, the pretreatment(s) of the cellulosic substrate, temperature, time, pH, and inclusion of fermenting organism (e.g., yeast for Simultaneous Saccharification and Fermentation).

In a preferred aspect, an effective amount of cellulolytic enzyme(s) to cellulosic material is about 0.5 to about 50 mg, preferably at about 0.5 to about 40 mg, more preferably at about 0.5 to about 25 mg, more preferably at about 0.75 to about 20 mg, more preferably at about 0.75 to about 15 mg, even more preferably at about 0.5 to about 10 mg, and most preferably at about 2.5 to about 10 mg per g of cellulosic material.

In another preferred aspect, an effective amount of a polypeptide having peroxidase activity to cellulosic material is about 0.001 to about 50 mg, preferably at about 0.01 to about 40 mg, more preferably at about 0.02 to about 25 mg, more preferably at about 0.03 to about 20 mg, more preferably at about 0.04 to about 15 mg, even more preferably at
about 0.04 to about 10 mg, and most preferably at about 0.05 to about 5 mg per g of cellulosic material.

In another preferred aspect, an effective amount of polypeptide(s) having cellulolytic enhancing activity to cellulosic material is about 0.01 to about 50.0 mg, preferably

Both C₆ and C₅ fermenting organisms are well known in the art. Suitable fermenting microorganisms are able to ferment, i.e., convert, sugars, such as glucose, xylose, xylulose, arabinose, maltose, mannose, galactose, or oligosaccharides, directly or indirectly into the desired fermentation product.

Examples of bacterial and fungal fermenting organisms producing ethanol are described by Lin et al., 2006, Appl. Microbiol. Biotechnol. 69: 627-642.

Examples of fermenting microorganisms that can ferment C₆ sugars include bacterial and fungal organisms, such as yeast. Preferred yeast includes strains of the Saccharomyces spp., preferably Saccharomyces cerevisiae.

Examples of fermenting organisms that can ferment C₅ sugars include bacterial and fungal organisms, such as yeast. Preferred C₅ fermenting yeast include strains of Pichia, preferably Pichia stipitis, such as Pichia stipitis CBS 5773; strains of Candida, preferably Candida boidinii, Candida brassicae, Candida sheatae, Candida diddensii, Candida pseudotropicalis, or Candida utilis.

Other fermenting organisms include strains of Zymomonas, such as Zymomonas mobilis; Hansenula, such as Hansenula anomala; Kluyveromyces, such as K. fragilis; Schizosaccharomyces, such as S. pombe; and E. coli, especially E. coli strains that have been genetically modified to improve the yield of ethanol.

In a preferred aspect, the yeast is a Saccharomyces spp. In a more preferred aspect, the yeast is Saccharomyces cerevisiae. In another more preferred aspect, the yeast is Saccharomyces distaticus. In another more preferred aspect, the yeast is Saccharomyces uvarum. In another preferred aspect, the yeast is a Kluyveromyces. In another more preferred aspect, the yeast is Kluyveromyces marxianus. In another more preferred aspect, the yeast is Kluyveromyces fragilis. In another preferred aspect, the yeast is a Candida. In another more preferred aspect, the yeast is Candida boidinii. In another more preferred aspect, the yeast is Candida brassicae. In another more preferred aspect, the yeast is Candida diddensii. In another more preferred aspect, the yeast is Candida pseudotropicalis. In another more preferred aspect, the yeast is Candida utilis. In another preferred aspect, the yeast is a Clavispora. In another more preferred aspect, the yeast is Clavispora lusitaniae. In another more preferred aspect, the yeast is Clavispora opuntiae. In another preferred aspect, the yeast is a Pachysolen. In another more preferred aspect, the yeast is Pachysolen tannophilus. In another preferred aspect, the yeast is a Pichia. In another more preferred aspect, the yeast is a Pichia stipitis. In another preferred aspect, the yeast is a Bretannomyces. In another more preferred aspect, the yeast is Bretannomyces clausenii (Philippidis, G. P., 1996, Cellulose bioconversion technology, in Handbook on Bioethanol: Production and Utilization, Wyman, C. E., ed., Taylor & Francis, Washington, D.C., 179-212).

Bacteria that can efficiently ferment hexose and pentose to ethanol include, for example, Zymomonas mobilis and Clostridium thermocellum (Philippidis, 1996, supra).

In a preferred aspect, the bacterium is a Zymomonas. In a more preferred aspect, the bacterium is Zymomonas mobilis. In another preferred aspect, the bacterium is a Clostridium. In another more preferred aspect, the bacterium is Clostridium thermocellum.

Commercially available yeast suitable for ethanol production includes, e.g., ETHANOL RED™ yeast (available from Fermentis/Lesaffre, USA), FALI™ (available from Fleischmann's Yeast, USA), SUPERSTART™ and THERMOSACC™ fresh yeast (available from Ethanol Technology, WI, USA), BIOFERM™ AFT and XR (available from NABC-North American Bioproducts Corporation, GA, USA), GERT STRAND™ (available from Gert Strand AB, Sweden), and FERMIOL™ (available from DSM Specialties).

In a preferred aspect, the fermenting microorganism has been genetically modified to provide the ability to ferment pentose sugars, such as xylose utilizing, arabinose utilizing, and xylose and arabinose co-utilizing microorganisms.

The cloning of heterologous genes into various fermenting microorganisms has led to the construction of organisms capable of converting hexoses and pentoses to ethanol (cofermentation) (Chen and Ho, 1993, Cloning and improving the expression of Pichia stipitis xylose reductase gene in Saccharomyces cerevisiae, Appl. Biochem. Biotechnol. 39-40: 135-147; Ho et al., 1998, Genetically engineered Saccharomyces yeast capable of effectively cofermenting glucose and xylose, Appl. Environ. Microbiol. 64: 1852-1859; Kotter and Ciriacy, 1993, Xylose fermentation by Saccharomyces cerevisiae, Appl. Microbiol. Biotechnol. 38: 776-783; Walfridsson et al., 1995, Xylose-metabolizing Saccharomyces cerevisiae strains overexpressing the TKL1 and TAL1 genes encoding the pentose phosphate pathway enzymes transketolase and transaldolase, Appl. Environ. Microbiol. 61: 4184-4190; Kuyper et al., 2004, Minimal metabolic engineering of Saccharomyces cerevisiae for efficient anaerobic xylose fermentation: a proof of principle, FEMS Yeast Research 4: 655-664; Beall et al., 1991, Parametric studies of ethanol production from xylose and other sugars by recombinant Escherichia coli, Biotech. Bioeng. 38: 296-303; Ingram et al., 1998, Metabolic engineering of bacteria for ethanol production, Biotechnol. Bioeng. 58: 204-214; Zhang et al., 1995, Metabolic engineering of a pentose metabolism pathway in ethanologenic Zymomonas mobilis, Science 267: 240-243; Deanda et al., 1996, Development of an arabinose-fermenting Zymomonas mobilis strain by metabolic pathway engineering, Appl. Environ. Microbiol. 62: 4465-4470; WO 2003/062430, xylose isomerase).

In a preferred aspect, the genetically modified fermenting microorganism is Saccharomyces cerevisiae. In another preferred aspect, the genetically modified fermenting microorganism is Zymomonas mobilis. In another preferred aspect, the genetically modified fermenting microorganism is Escherichia coli. In another preferred aspect, the genetically modified fermenting microorganism is Klebsiella oxytoca. In another preferred aspect, the genetically modified fermenting microorganism is Kluyveromyces sp.

It is well known in the art that the organisms described above can also be used to produce other substances, as described herein.

The fermenting microorganism is typically added to the degraded lignocellulose or hydrolysate and the fermentation is performed for about 8 to about 96 hours, such as about 24 to about 60 hours. The temperature is typically about 26° C. to about 60° C., in particular about 32° C. or 50° C., and the pH is about pH 3 to about pH 8, such as around pH 4-5, 6, or 7.

In a preferred aspect, the yeast and/or another microorganism is applied to the degraded cellulosic material and the fermentation is performed for about 12 to about 96 hours, such as typically 24-60 hours. In a preferred aspect, the temperature is preferably about 20° C. to about 60° C., more preferably about 25° C. to about 50° C., and most preferably about 32° C. to about 50° C., in particular about 32° C. or 50° C., and the pH is generally from about pH 3 to about pH 7, preferably around pH 4-7. However, some fermenting organisms, e.g., bacteria, have higher fermentation temperature optima. Yeast or another microorganism is preferably applied in amounts of approximately 10⁵ to 10¹², preferably from approximately 10⁷ to 10¹⁰, especially approximately 2×10⁸ viable cell count per ml of fermentation broth. Further guidance in respect of using yeast for fermentation can be found in, e.g., “The Alcohol Textbook” (Editors K. Jacques, T. P. Lyons and D. R. Kelsall, Nottingham University Press, United Kingdom 1999), which is hereby incorporated by reference.

For ethanol production, following the fermentation the fermented slurry is distilled to extract the ethanol. The ethanol obtained according to the methods of the invention can be used as, e.g., fuel ethanol, drinking ethanol, i.e., potable neutral spirits, or industrial ethanol.

A fermentation stimulator can be used in combination with any of the processes described herein to further improve the fermentation process, and in particular, the performance of the fermenting microorganism, such as, rate enhancement and ethanol yield. A “fermentation stimulator” refers to stimulators for growth of the fermenting microorganisms, in particular, yeast. Preferred fermentation stimulators for growth include vitamins and minerals. Examples of vitamins include multivitamins, biotin, pantothenate, nicotinic acid, meso-inositol, thiamine, pyridoxine, para-aminobenzoic acid, folic acid, riboflavin, and Vitamins A, B, C, D, and E. See, for example, Alfenore et al., Improving ethanol production and viability of Saccharomyces cerevisiae by a vitamin feeding strategy during fed-batch process, Springer-Verlag (2002), which is hereby incorporated by reference. Examples of minerals include minerals and mineral salts that can supply nutrients comprising P, K, Mg, S, Ca, Fe, Zn, Mn, and Cu.

Fermentation products: A fermentation product can be any substance derived from the fermentation. The fermentation product can be, without limitation, an alcohol (e.g., arabinitol, butanol, ethanol, glycerol, methanol, 1,3-propanediol, sorbitol, and xylitol); an organic acid (e.g., acetic acid, acetonic acid, adipic acid, ascorbic acid, citric acid, 2,5-diketo-D-gluconic acid, formic acid, fumaric acid, glucaric acid, gluconic acid, glucuronic acid, glutaric acid, 3-hydroxypropionic acid, itaconic acid, lactic acid, malic acid, malonic acid, oxalic acid, oxaloacetic acid, propionic acid, succinic acid, and xylonic acid); a ketone (e.g., acetone); an amino acid (e.g., aspartic acid, glutamic acid, glycine, lysine, serine, and threonine); and a gas (e.g., methane, hydrogen (H₂), carbon dioxide (CO₂), and carbon monoxide (CO)). The fermentation product can also be protein as a high value product.

In a preferred aspect, the fermentation product is an alcohol. It will be understood that the term “alcohol” encompasses a substance that contains one or more hydroxyl moieties. In a more preferred aspect, the alcohol is arabinitol. In another more preferred aspect, the alcohol is butanol. In another more preferred aspect, the alcohol is ethanol. In another more preferred aspect, the alcohol is glycerol. In another more preferred aspect, the alcohol is methanol. In another more preferred aspect, the alcohol is 1,3-propanediol. In another more preferred aspect, the alcohol is sorbitol. In another more preferred aspect, the alcohol is xylitol. See, for example, Gong, C. S., Cao, N. J., Du, J., and Tsao, G. T., 1999, Ethanol production from renewable resources, in Advances in Biochemical Engineering/Biotechnology, Scheper, T., ed., Springer-Verlag Berlin Heidelberg, Germany, 65: 207-241; Silveira, M. M., and Jonas, R., 2002, The biotechnological production of sorbitol, Appl. Microbiol. Biotechnol. 59: 400-408; Nigam, P., and Singh, D., 1995, Processes for fermentative production of xylitol - a sugar substitute, Process Biochemistry 30 (2): 117-124; Ezeji, T. C., Qureshi, N. and Blaschek, H. P., 2003, Production of acetone, butanol and ethanol by Clostridium beijerinckii BA101 and in situ recovery by gas stripping, World Journal of Microbiology and Biotechnology 19 (6): 595-603.

In another preferred aspect, the fermentation product is an organic acid. In another more preferred aspect, the organic acid is acetic acid. In another more preferred aspect, the organic acid is acetonic acid. In another more preferred aspect, the organic acid is adipic acid. In another more preferred aspect, the organic acid is ascorbic acid. In another more preferred aspect, the organic acid is citric acid. In another more preferred aspect, the organic acid is 2,5-diketo-D-gluconic acid. In another more preferred aspect, the organic acid is formic acid. In another more preferred aspect, the organic acid is fumaric acid. In another more preferred aspect, the organic acid is glucaric acid. In another more preferred aspect, the organic acid is gluconic acid. In another more preferred aspect, the organic acid is glucuronic acid. In another more preferred aspect, the organic acid is glutaric acid. In another preferred aspect, the organic acid is 3-hydroxypropionic acid. In another more preferred aspect, the organic acid is itaconic acid. In another more preferred aspect, the organic acid is lactic acid. In another more preferred aspect, the organic acid is malic acid. In another more preferred aspect, the organic acid is malonic acid. In another more preferred aspect, the organic acid is oxalic acid. In another more preferred aspect, the organic acid is propionic acid. In another more preferred aspect, the organic acid is succinic acid. In another more preferred aspect, the organic acid is xylonic acid. See, for example, Chen, R., and Lee, Y. Y., 1997, Membrane-mediated extractive fermentation for lactic acid production from cellulosic biomass, Appl. Biochem. Biotechnol. 63-65: 435-448.

In another preferred aspect, the fermentation product is a ketone. It will be understood that the term “ketone” encompasses a substance that contains one or more ketone moieties. In another more preferred aspect, the ketone is acetone. See, for example, Qureshi and Blaschek, 2003, supra.

In another preferred aspect, the fermentation product is an amino acid. In another more preferred aspect, the amino acid is aspartic acid. In another more preferred aspect, the amino acid is glutamic acid. In another more preferred aspect, the amino acid is glycine. In another more preferred aspect, the amino acid is lysine. In another more preferred aspect, the amino acid is serine. In another more preferred aspect, the amino acid is threonine. See, for example, Richard, A., and Margaritis, A., 2004, Empirical modeling of batch fermentation kinetics for poly(glutamic acid) production and other microbial biopolymers, Biotechnology and Bioengineering 87 (4): 501-515.

In another preferred aspect, the fermentation product is a gas. In another more preferred aspect, the gas is methane. In another more preferred aspect, the gas is H₂. In another more preferred aspect, the gas is CO₂. In another more preferred aspect, the gas is CO. See, for example, Kataoka, N., A. Miya, and K. Kiriyama, 1997, Studies on hydrogen production by continuous culture system of hydrogen-producing anaerobic bacteria, Water Science and Technology 36 (6-7): 41-47; and Gunaseelan V. N. in Biomass and Bioenergy, Vol. 13 (1-2), pp. 83-114, 1997, Anaerobic digestion of biomass for methane production: A review.

Recovery. The fermentation product(s) can be optionally recovered from the fermentation medium using any method known in the art including, but not limited to, chromatography, electrophoretic procedures, differential solubility, distillation, or extraction. For example, alcohol is separated from the fermented cellulosic material and purified by conventional methods of distillation. Ethanol with a purity of up to about 96 vol. % can be obtained, which can be used as, for example, fuel ethanol, drinking ethanol, i.e., potable neutral spirits, or industrial ethanol.

Polypeptides Having Cellulolytic Enhancing Activity and Polynucleotides Thereof

In the methods of the present invention, any polypeptide having cellulolytic enhancing activity can be used.

In a first aspect, the polypeptide having cellulolytic enhancing activity comprises the following motifs:

[ILMV]-P-X(4,5)-G-X-Y-[ILMV]-X-R-X-[EQ]-X(4)-[HNQ] and [FW]-[TF]-K-[AIV],

wherein X is any amino acid, X(4,5) is any amino acid at 4 or 5 contiguous positions, and X(4) is any amino acid at 4 contiguous positions.

The polypeptide comprising the above-noted motifs may further comprise:

H-X(1,2)-G-P-X(3)-[YW]-[AILMV],
[EQ]-X-Y-X(2)-C-X-[EHQN]-[FILV]-X-[ILV], or
H-X(1,2)-G-P-X(3)-[YW]-[AILMV] and [EQ]-X-Y-X(2)-C-X-[EHQN]-[FILV]-X-[ILV],

wherein X is any amino acid, X(1,2) is any amino acid at 1 position or 2 contiguous positions, X(3) is any amino acid at 3 contiguous positions, and X(2) is any amino acid at 2 contiguous positions. In the above motifs, the accepted IUPAC single letter amino acid abbreviation is employed.

In a preferred aspect, the polypeptide having cellulolytic enhancing activity further comprises H-X(1,2)-G-P-X(3)-[YW]-[AILMV]. In another preferred aspect, the isolated polypeptide having cellulolytic enhancing activity further comprises [EQ]-X-Y-X(2)-C-X-[EHQN]-[FILV]-X-[ILV]. In another preferred aspect, the polypeptide having cellulolytic enhancing activity further comprises H-X(1,2)-G-P-X(3)-[YW]-[AILMV] and [EQ]-X-Y-X(2)-C-X-[EHQN]-[FILV]-X-[ILV].

In a second aspect, the polypeptide having cellulolytic enhancing activity comprises the following motif:

[ILMV]-P-X(4,5)-G-X-Y-[ILMV]-X-R-X-[EQ]-X(3)-A-[HNQ],

wherein X is any amino acid, X(4,5) is any amino acid at 4 or 5 contiguous positions, and X(3) is any amino acid at 3 contiguous positions. In the above motif, the accepted IUPAC single letter amino acid abbreviation is employed.

In a third aspect, the polypeptide having cellulolytic enhancing activity comprises an amino acid sequence that has a degree of identity to the mature polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, or SEQ ID NO: 16 of preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, even more preferably at least 90%, most preferably at least 95%, and even most preferably at least 96%, at least 97%, at least 98%, or at least 99% (hereinafter “homologous polypeptides”). In a preferred aspect, the mature polypeptide sequence is amino acids 20 to 326 of SEQ ID NO: 2, amino acids 18 to 239 of SEQ ID NO: 4, amino acids 20 to 258 of SEQ ID NO: 6, amino acids 19 to 226 of SEQ ID NO: 8, amino acids 20 to 304 of SEQ ID NO: 10, amino acids 16 to 317 of SEQ ID NO: 12, amino acids 23 to 250 of SEQ ID NO: 14, or amino acids 20 to 249 of SEQ ID NO: 16.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 2 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 2. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 2. In another preferred aspect, the polypeptide comprises amino acids 20 to 326 of SEQ ID NO: 2, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 20 to 326 of SEQ ID NO: 2. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 2 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 2. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 2. In another preferred aspect, the polypeptide consists of amino acids 20 to 326 of SEQ ID NO: 2 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 20 to 326 of SEQ ID NO: 2.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 4 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 4. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 4. In another preferred aspect, the polypeptide comprises amino acids 18 to 239 of SEQ ID NO: 4, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 18 to 239 of SEQ ID NO: 4. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 4 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 4. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 4. In another preferred aspect, the polypeptide consists of amino acids 18 to 239 of SEQ ID NO: 4 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 18 to 239 of SEQ ID NO: 4.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 6 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 6. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 6. In another preferred aspect, the polypeptide comprises amino acids 20 to 258 of SEQ ID NO: 6, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 20 to 258 of SEQ ID NO: 6. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 6 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 6. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 6. In another preferred aspect, the polypeptide consists of amino acids 20 to 258 of SEQ ID NO: 6 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 20 to 258 of SEQ ID NO: 6.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 8 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 8. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 8. In another preferred aspect, the polypeptide comprises amino acids 19 to 226 of SEQ ID NO: 8, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 19 to 226 of SEQ ID NO: 8. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 8 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 8. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 8. In another preferred aspect, the polypeptide consists of amino acids 19 to 226 of SEQ ID NO: 8 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 19 to 226 of SEQ ID NO: 8.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 10 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 10. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 10. In another preferred aspect, the polypeptide comprises amino acids 20 to 304 of SEQ ID NO: 10, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 20 to 304 of SEQ ID NO: 10. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 10 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 10. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 10. In another preferred aspect, the polypeptide consists of amino acids 20 to 304 of SEQ ID NO: 10 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 20 to 304 of SEQ ID NO: 10.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 12 or an allelic variant thereof; or a fragment thereof having cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 12. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 12. In another preferred aspect, the polypeptide comprises amino acids 16 to 317 of SEQ ID NO: 12, or an allelic variant thereof; or a fragment thereof having cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 16 to 317 of SEQ ID NO: 12. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 12 or an allelic variant thereof; or a fragment thereof having cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 12. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 12. In another preferred aspect, the polypeptide consists of amino acids 16 to 317 of SEQ ID NO: 12 or an allelic variant thereof; or a fragment thereof having cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 16 to 317 of SEQ ID NO: 12.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 14 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 14. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 14. In another preferred aspect, the polypeptide comprises amino acids 23 to 250 of SEQ ID NO: 14, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 23 to 250 of SEQ ID NO: 14. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 14 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 14. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 14. In another preferred aspect, the polypeptide consists of amino acids 23 to 250 of SEQ ID NO: 14 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 23 to 250 of SEQ ID NO: 14.

A polypeptide having cellulolytic enhancing activity preferably comprises the amino acid sequence of SEQ ID NO: 16 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In a preferred aspect, the polypeptide comprises the amino acid sequence of SEQ ID NO: 16. In another preferred aspect, the polypeptide comprises the mature polypeptide of SEQ ID NO: 16. In another preferred aspect, the polypeptide comprises amino acids 20 to 249 of SEQ ID NO: 16, or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide comprises amino acids 20 to 249 of SEQ ID NO: 16. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 16 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of the amino acid sequence of SEQ ID NO: 16. In another preferred aspect, the polypeptide consists of the mature polypeptide of SEQ ID NO: 16. In another preferred aspect, the polypeptide consists of amino acids 20 to 249 of SEQ ID NO: 16 or an allelic variant thereof; or a fragment thereof that has cellulolytic enhancing activity. In another preferred aspect, the polypeptide consists of amino acids 20 to 249 of SEQ ID NO: 16.

Preferably, a fragment of the mature polypeptide of SEQ ID NO: 2 contains at least 277 amino acid residues, more preferably at least 287 amino acid residues, and most preferably at least 297 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 4 contains at least 185 amino acid residues, more preferably at least 195 amino acid residues, and most preferably at least 205 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 6 contains at least 200 amino acid residues, more preferably at least 212 amino acid residues, and most preferably at least 224 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 8 contains at least 175 amino acid residues, more preferably at least 185 amino acid residues, and most preferably at least 195 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 10 contains at least 240 amino acid residues, more preferably at least 255 amino acid residues, and most preferably at least 270 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 12 contains at least 255 amino acid residues, more preferably at least 270 amino acid residues, and most preferably at least 285 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 14 contains at least 175 amino acid residues, more preferably at least 190 amino acid residues, and most preferably at least 205 amino acid residues. Preferably, a fragment of the mature polypeptide of SEQ ID NO: 16 contains at least 200 amino acid residues, more preferably at least 210 amino acid residues, and most preferably at least 220 amino acid residues.

Preferably, a subsequence of the mature polypeptide coding sequence of SEQ ID NO: 1 contains at least 831 nucleotides, more preferably at least 861 nucleotides, and most preferably at least 891 nucleotides. Preferably, a subsequence of the mature polypeptide coding sequence of SEQ ID NO: 3 contains at least 555 nucleotides, more preferably at least 585 nucleotides, and most preferably at least 615 nucleotides. Preferably, a subsequence of the mature polypeptide coding sequence of SEQ ID NO: 5 contains at least 600 nucleotides,

SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, or SEQ ID NO: 16, or a fragment thereof, may be used to design a nucleic acid probe to identify and clone DNA encoding polypeptides having cellulolytic enhancing activity from strains of different genera or species according to methods well known in the art. In particular, such probes can be used for hybridization with the genomic or cDNA of the genus or species of interest, following standard Southern blotting procedures, in order to identify and isolate the corresponding gene therein.
Such probes can more preferably at least 636 nucleotides, and most preferably be considerably shorter than the entire sequence, but should at least 672 nucleotides. Preferably, a subsequence of the be at least 14, preferably at least 25, more preferably at least mature polypeptide coding sequence of SEQID NO: 7 con 35, and most preferably at least 70 nucleotides in length. It is, tains at least 525 nucleotides, more preferably at least 555 15 however, preferred that the nucleic acid probe is at least 100 nucleotides, and most preferably at least 585 nucleotides. nucleotides in length. For example, the nucleic acid probe Preferably, a Subsequence of the mature polypeptide coding may be at least 200 nucleotides, preferably at least 300 nucle sequence of SEQID NO:9 contains at least 720 nucleotides, otides, more preferably at least 400 nucleotides, or most more preferably at least 765 nucleotides, and most preferably preferably at least 500 nucleotides in length. Even longer at least 810 nucleotides. Preferably, a subsequence of the probes may be used, e.g., nucleic acid probes that are prefer mature polypeptide coding sequence of SEQID NO: 11 con ably at least 600 nucleotides, more preferably at least 700 tains at least 765 nucleotides, more preferably at least 810 nucleotides, even more preferably at least 800 nucleotides, or nucleotides, and most preferably at least 855 nucleotides. most preferably at least 900 nucleotides in length. Both DNA Preferably, a Subsequence of the mature polypeptide coding and RNA probes can be used. The probes are typically labeled sequence of nucleotides 67 to 796 of SEQID NO: 13 contains 25 for detecting the corresponding gene (for example, with P. at least 525 nucleotides, more preferably at least 570 nucle H. S. biotin, or avidin). Such probes are encompassed by otides, and most preferably at least 615 nucleotides. Prefer the present invention. 
ably, a Subsequence of the mature polypeptide coding A genomic DNA or cDNA library prepared from such sequence of SEQID NO: 15 contains at least 600 nucleotides, other strains may, therefore, be screened for DNA that hybrid more preferably at least 630 nucleotides, and most preferably 30 izes with the probes described above and encodes a polypep at least 660 nucleotides. tide having cellulolytic enhancing activity. Genomic or other In a fourth aspect, the polypeptide having cellulolytic DNA from such other strains may be separated by agarose or enhancing activity is encoded by a polynucleotide that polyacrylamide gel electrophoresis, or other separation tech hybridizes under at least very low stringency conditions, pref niques. DNA from the libraries or the separated DNA may be erably at least low stringency conditions, more preferably at 35 transferred to and immobilized on nitrocellulose or other least medium stringency conditions, more preferably at least suitable carrier material. In order to identify a clone or DNA medium-high Stringency conditions, even more preferably at that is homologous with SEQ ID NO: 1, or a subsequence least high Stringency conditions, and most preferably at least thereof, the carrier material is preferably used in a Southern very high Stringency conditions with (i) the mature polypep blot. tide coding sequence of SEQID NO: 1, SEQID NO:3, SEQ 40 For purposes of the present invention, hybridization indi ID NO. 
5, SEQID NO: 7, SEQID NO:9, SEQID NO: 11, cates that the nucleotide sequence hybridizes to a labeled SEQID NO: 13, or SEQID NO: 15, (ii) the cDNA sequence nucleic acid probe corresponding to the mature polypeptide contained in the mature polypeptide coding sequence of SEQ coding sequence of SEQID NO: 1, SEQID NO:3, SEQID ID NO: 1, SEQID NO:3, SEQID NO:5, or SEQID NO:13, NO:5, SEQID NO: 7, SEQID NO:9, SEQID NO: 11, SEQ or the genomic DNA sequence comprising the mature 45 ID NO: 13, or SEQID NO: 15 the cDNA sequence contained polypeptide coding sequence of SEQID NO: 7, SEQID NO: in the mature polypeptide coding sequence of SEQID NO: 1. 9, SEQID NO: 11, or SEQID NO: 15, (iii) a subsequence of SEQID NO:3, SEQ ID NO: 5, or SEQ ID NO: 13, or the (i) or (ii), or (iv) a full-length complementary Strand of (i), (ii), genomic DNA sequence comprising the mature polypeptide or (iii) (J. Sambrook, E. F. Fritsch, and T. Maniatus, 1989, coding sequence of SEQID NO: 7, SEQID NO: 9, SEQID Supra). A Subsequence of the mature polypeptide coding 50 NO: 11, or SEQ ID NO: 15, its full-length complementary sequence of SEQID NO: 1, SEQID NO:3, SEQ ID NO: 5, Strand, or a Subsequence thereof, under very low to very high SEQID NO:7, SEQID NO:9, SEQID NO: 11, SEQID NO: stringency conditions, as described Supra. 13, or SEQ ID NO: 15 contains at least 100 contiguous In a preferred aspect, the nucleic acid probe is the mature nucleotides or preferably at least 200 contiguous nucleotides. polypeptide coding sequence of SEQ ID NO: 1. In another Moreover, the Subsequence may encode a polypeptide frag 55 preferred aspect, the nucleic acid probe is nucleotides 388 to ment that has cellulolytic enhancing activity. In a preferred 1332 of SEQ ID NO: 1. 
In another preferred aspect, the aspect, the mature polypeptide coding sequence is nucle nucleic acid probe is a polynucleotide sequence that encodes otides 388 to 1332 of SEQID NO: 1, nucleotides 98 to 821 of the polypeptide of SEQID NO: 2, or a subsequence thereof. SEQ ID NO: 3, nucleotides 126 to 978 of SEQ ID NO: 5, In another preferred aspect, the nucleic acid probe is SEQID nucleotides 55 to 678 of SEQID NO: 7, nucleotides 58 to 912 60 NO: 1. In another preferred aspect, the nucleic acid probe is of SEQID NO:9, nucleotides 46 to 951 of SEQID NO: 11, the polynucleotide sequence contained in plasmid pH.JG 120 nucleotides 67 to 796 of SEQID NO: 13, or nucleotides 77 to which is contained in E. coli NRRL B-30699, wherein the 766 of SEQID NO: 15. polynucleotide sequence thereof encodes a polypeptide hav The nucleotide sequence of SEQID NO: 1, SEQID NO:3, ing cellulolytic enhancing activity. In another preferred SEQID NO:5, SEQID NO: 7, SEQID NO:9, SEQID NO: 65 aspect, the nucleic acid probe is the mature polypeptide cod 11, SEQ ID NO: 13, or SEQ ID NO: 15, or a subsequence ing sequence contained in plasmid pEJG 120 which is con thereof; as well as the amino acid sequence of SEQID NO: 2, tained in E. coli NRRL B-30699. US 8,426,158 B2 29 30 In another preferred aspect, the nucleic acid probe is the NO: 11. In another preferred aspect, the nucleic acid probe is mature polypeptide coding sequence of SEQ ID NO: 3. In the polynucleotide sequence contained in plasmid pTter61F another preferred aspect, the nucleic acid probe is nucleotides which is contained in E. coli NRRL B-50044, wherein the 98 to 821 of SEQID NO:3. In another preferred aspect, the polynucleotide sequence thereof encodes a polypeptide hav nucleic acid probe is a polynucleotide sequence that encodes ing cellulolytic enhancing activity. In another preferred the polypeptide of SEQID NO: 4, or a subsequence thereof. 
aspect, the nucleic acid probe is the mature polypeptide cod In another preferred aspect, the nucleic acid probe is SEQID ing region contained in plasmid pTter61F which is contained NO:3. In another preferred aspect, the nucleic acid probe is in E. coli NRRL B-50044. the polynucleotide sequence contained in plasmid pTter61C In another preferred aspect, the nucleic acid probe is the which is contained in E. coli NRRL B-30813, wherein the 10 polynucleotide sequence thereof encodes a polypeptide hav mature polypeptide coding sequence of SEQ ID NO: 13. In ing cellulolytic enhancing activity. In another preferred another preferred aspect, the nucleic acid probe is nucleotides aspect, the nucleic acid probe is the mature polypeptide cod 67 to 796 of SEQID NO: 13. In another preferred aspect, the ing sequence contained in plasmid pTter61C which is con nucleic acid probe is a polynucleotide sequence that encodes tained in E. coli NRRL B-30813. 15 the polypeptide of SEQID NO: 14, or a subsequence thereof. In another preferred aspect, the nucleic acid probe is the In another preferred aspect, the nucleic acid probe is SEQID mature polypeptide coding sequence of SEQ ID NO: 5. In NO: 13. In another preferred aspect, the nucleic acid probe is another preferred aspect, the nucleic acid probe is nucleotides the polynucleotide sequence contained in plasmid p)ZA2-7 126 to 978 of SEQID NO: 5. In another preferred aspect, the which is contained in E. coli NRRL B-30704, wherein the nucleic acid probe is a polynucleotide sequence that encodes polynucleotide sequence thereof encodes a polypeptide hav the polypeptide of SEQID NO: 6, or a subsequence thereof. ing cellulolytic enhancing activity. In another preferred In another preferred aspect, the nucleic acid probe is SEQID aspect, the nucleic acid probe is the mature polypeptide cod NO: 5. 
In another preferred aspect, the nucleic acid probe is ing sequence contained in plasmid p)ZA2-7 which is con the polynucleotide sequence contained in plasmid pTter61D tained in E. coli NRRL B-30704. which is contained in E. coli NRRL B-30812, wherein the 25 In another preferred aspect, the nucleic acid probe is the polynucleotide sequence thereof encodes a polypeptide hav mature polypeptide coding sequence of SEQ ID NO: 15. In ing cellulolytic enhancing activity. In another preferred another preferred aspect, the nucleic acid probe is nucleotides aspect, the nucleic acid probe is the mature polypeptide cod 77 to 766 of SEQID NO: 15. In another preferred aspect, the ing sequence contained in plasmid pTter61D which is con nucleic acid probe is a polynucleotide sequence that encodes tained in E. coli NRRL B-30812. 30 the polypeptide of SEQID NO: 16, or a subsequence thereof. In another preferred aspect, the nucleic acid probe is the In another preferred aspect, the nucleic acid probe is SEQID mature polypeptide coding sequence of SEQID NO: 7. In NO: 15. In another preferred aspect, the nucleic acid probe is another preferred aspect, the nucleic acid probe is nucleotides the polynucleotide sequence contained in plasmid pTr333 55 to 678 of SEQID NO: 7. In another preferred aspect, the which is contained in E. coli NRRL B-30878, wherein the nucleic acid probe is a polynucleotide sequence that encodes 35 polynucleotide sequence thereof encodes a polypeptide hav the polypeptide of SEQID NO: 8, or a subsequence thereof. ing cellulolytic enhancing activity. In another preferred In another preferred aspect, the nucleic acid probe is SEQID aspect, the nucleic acid probe is the mature polypeptide cod NO: 7. In another preferred aspect, the nucleic acid probe is ing sequence contained in plasmidpTr333 which is contained the polynucleotide sequence contained in plasmid pTter61E in E. coli NRRL B-3O878. which is contained in E. 
coli NRRL B-30814, wherein the polynucleotide sequence thereof encodes a polypeptide having cellulolytic enhancing activity. In another preferred aspect, the nucleic acid probe is the mature polypeptide coding sequence contained in plasmid pTter61E which is contained in E. coli NRRL B-30814.

In another preferred aspect, the nucleic acid probe is the mature polypeptide coding sequence of SEQ ID NO: 9. In another preferred aspect, the nucleic acid probe is nucleotides 58 to 912 of SEQ ID NO: 9. In another preferred aspect, the nucleic acid probe is a polynucleotide sequence that encodes the polypeptide of SEQ ID NO: 10, or a subsequence thereof. In another preferred aspect, the nucleic acid probe is SEQ ID NO: 9. In another preferred aspect, the nucleic acid probe is the polynucleotide sequence contained in plasmid pTter61G which is contained in E. coli NRRL B-30811, wherein the polynucleotide sequence thereof encodes a polypeptide having cellulolytic enhancing activity. In another preferred aspect, the nucleic acid probe is the mature polypeptide coding sequence contained in plasmid pTter61G which is contained in E. coli NRRL B-30811.

In another preferred aspect, the nucleic acid probe is the mature polypeptide coding sequence of SEQ ID NO: 11. In another preferred aspect, the nucleic acid probe is nucleotides 46 to 951 of SEQ ID NO: 11. In another preferred aspect, the nucleic acid probe is a polynucleotide sequence that encodes the polypeptide of SEQ ID NO: 12, or a subsequence thereof. In another preferred aspect, the nucleic acid probe is SEQ ID

ces coelicolor, Streptomyces griseus, or Streptomyces lividans polypeptide having cellulolytic enhancing activity.

The polypeptide having cellulolytic enhancing activity

For long probes of at least 100 nucleotides in length, very low to very high stringency conditions are defined as prehybridization and hybridization at 42° C. in 5X SSPE, 0.3% SDS, 200 µg/ml sheared and denatured salmon sperm DNA, and either 25% formamide for very low and low stringencies, 35% formamide for medium and medium-high stringencies, or 50% formamide for high and very high stringencies, following standard Southern blotting procedures for 12 to 24 hours optimally.

For long probes of at least 100 nucleotides in length, the carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS preferably at 45° C. (very low stringency), more preferably at 50° C. (low stringency), more preferably at 55° C. (medium stringency), more preferably at 60° C. (medium-high stringency), even more preferably at 65° C. (high stringency), and most preferably at 70° C. (very high stringency).

For short probes of about 15 nucleotides to about 70 nucleotides in length, stringency conditions are defined as prehybridization, hybridization, and washing post-hybridization at about 5° C. to about 10° C. below the calculated Tm using the calculation according to Bolton and McCarthy (1962, Proceedings of the National Academy of Sciences USA 48: 1390) in 0.9 M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1X Denhardt's solution, 1 mM sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per ml following standard Southern blotting procedures for 12 to 24 hours optimally.

For short probes of about 15 nucleotides to about 70 nucleotides in length, the carrier material is washed once in 6X SSC plus 0.1% SDS for 15 minutes and twice each for 15 minutes using 6X SSC at 5° C. to 10° C. below the calculated Tm.
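The long-probe stringency levels described above pair a formamide concentration during hybridization with a final wash temperature, and the short-probe conditions are stated relative to a calculated Tm. A minimal Python sketch of that mapping may help when comparing levels. The numeric values are taken from the text; the names `LONG_PROBE_CONDITIONS`, `LongProbeProtocol`, `long_probe_protocol`, and `short_probe_temp_window` are hypothetical and introduced only for illustration. The Tm itself is assumed to be supplied by the caller and is not computed here.

```python
from dataclasses import dataclass
from typing import Tuple

# Values as stated in the text for long probes (at least 100 nucleotides):
# stringency -> (formamide % during hybridization, final wash temperature in deg C).
LONG_PROBE_CONDITIONS = {
    "very low": (25, 45),
    "low": (25, 50),
    "medium": (35, 55),
    "medium-high": (35, 60),
    "high": (50, 65),
    "very high": (50, 70),
}


@dataclass(frozen=True)
class LongProbeProtocol:
    """Illustrative container; not a structure defined by the source text."""
    stringency: str
    hybridization_temp_c: int   # 42 deg C for every long-probe stringency level
    formamide_percent: int
    wash_temp_c: int            # washes use 2X SSC, 0.2% SDS
    wash_count: int             # three washes
    wash_minutes: int           # 15 minutes each


def long_probe_protocol(stringency: str) -> LongProbeProtocol:
    """Look up the long-probe conditions for a named stringency level."""
    key = stringency.strip().lower()
    if key not in LONG_PROBE_CONDITIONS:
        raise ValueError(f"unknown stringency level: {stringency!r}")
    formamide, wash_temp = LONG_PROBE_CONDITIONS[key]
    return LongProbeProtocol(key, 42, formamide, wash_temp, 3, 15)


def short_probe_temp_window(tm_c: float) -> Tuple[float, float]:
    """Return (lowest, highest) working temperature for short probes:
    10 deg C to 5 deg C below a caller-supplied calculated Tm."""
    return (tm_c - 10.0, tm_c - 5.0)
```

For example, `long_probe_protocol("high")` reports 50% formamide and a 65° C. wash, and `short_probe_temp_window(60.0)` gives a working range of 50° C. to 55° C.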
In a fifth aspect, the polypeptide having cellulolytic enhancing activity is encoded by a polynucleotide comprising or consisting of a nucleotide sequence that has a degree of identity to the mature polypeptide coding sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, or SEQ ID NO: 15 of preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, even more preferably at least 90%, most preferably at least 95%, and even most preferably at least 96%, at least 97%, at least 98%, or at least 99%.

In a preferred aspect, the mature polypeptide coding sequence is nucleotides 388 to 1332 of SEQ ID NO: 1, nucleotides 98 to 821 of SEQ ID NO: 3, nucleotides 126 to 978 of SEQ ID NO: 5, nucleotides 55 to 678 of SEQ ID NO: 7, nucleotides 58 to 912 of SEQ ID NO: 9, nucleotides 46 to 951 of SEQ ID NO: 11, nucleotides 67 to 796 of SEQ ID NO: 13, or nucleotides 77 to 766 of SEQ ID NO: 15.

In a sixth aspect, the polypeptide having cellulolytic enhancing activity is an artificial variant comprising a substitution, deletion, and/or insertion of one or more (or several) amino acids of the mature polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, or SEQ ID NO: 16; or a homologous sequence thereof. Methods for preparing such an artificial variant are described supra.

The total number of amino acid substitutions, deletions and/or insertions of the mature polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, or SEQ ID NO: 16, is 10, preferably 9, more preferably 8, more preferably 7, more preferably at most 6, more preferably 5, more preferably 4, even more preferably 3, most preferably 2, and even most preferably 1.

A polypeptide having cellulolytic enhancing activity may be obtained from microorganisms of any genus. In a preferred aspect, the polypeptide obtained from a given source is secreted extracellularly.

A polypeptide having cellulolytic enhancing activity may be a bacterial polypeptide. For example, the polypeptide may be a Gram positive bacterial polypeptide such as a Bacillus, Streptococcus, Streptomyces, Staphylococcus, Enterococcus, Lactobacillus, Lactococcus, Clostridium, Geobacillus, or Oceanobacillus polypeptide having cellulolytic enhancing activity, or a Gram negative bacterial polypeptide such as an E. coli, Pseudomonas, Salmonella, Campylobacter, Helicobacter, Flavobacterium, Fusobacterium, Ilyobacter, Neisseria, or Ureaplasma polypeptide having cellulolytic enhancing activity.

may also be a fungal polypeptide, and more preferably a yeast polypeptide such as a Candida, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia polypeptide having cellulolytic enhancing activity; or more preferably a filamentous fungal polypeptide such as an Acremonium, Agaricus, Alternaria, Aspergillus, Aureobasidium, Botryosphaeria, Ceriporiopsis, Chaetomidium, Chrysosporium, Claviceps, Cochliobolus, Coprinopsis, Coptotermes, Corynascus, Cryphonectria, Cryptococcus, Diplodia, Exidia, Filibasidium, Fusarium, Gibberella, Holomastigotoides, Humicola, Irpex, Lentinula, Leptosphaeria, Magnaporthe, Melanocarpus, Meripilus, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Piromyces, Poitrasia, Pseudoplectania, Pseudotrichonympha, Rhizomucor, Schizophyllum, Scytalidium, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trichoderma, Trichophaea, Verticillium, Volvariella, or Xylaria polypeptide having cellulolytic enhancing activity.

In a preferred aspect, the polypeptide is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis polypeptide having cellulolytic enhancing activity.

In another preferred aspect, the polypeptide is an Acremonium cellulolyticus, Aspergillus aculeatus, Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium tropicum, Chrysosporium merdarium, Chrysosporium inops, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium zonatum, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola grisea, Humicola insolens, Humicola lanuginosa, Irpex lacteus, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium funiculosum, Penicillium purpurogenum, Phanerochaete chrysosporium, Thielavia achromatica, Thielavia albomyces, Thielavia albopilosa, Thielavia australeinsis, Thielavia fimeti, Thielavia microspora, Thielavia ovispora, Thielavia peruviana, Thielavia spededonium, Thielavia setosa, Thielavia subthermophila, Thielavia terrestris, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, or Trichophaea saccata polypeptide having cellulolytic enhancing activity.
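The identity thresholds of the fifth aspect (60% up to 99%) and the limit of at most 10 substitutions, deletions, and/or insertions in the sixth aspect can be checked mechanically once two sequences have been aligned. The Python sketch below is illustrative only. It assumes the caller supplies a pairwise alignment as two equal-length strings with "-" marking gaps, and it assumes identity is computed as identical positions times 100, divided by the alignment length minus the number of gapped columns. That convention is an assumption here, since this passage does not restate how identity is calculated. The names `IDENTITY_TIERS`, `percent_identity`, `highest_tier_met`, and `count_changes` are hypothetical.

```python
from typing import Optional

# Identity thresholds (percent) listed in the fifth aspect of the text.
IDENTITY_TIERS = (60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identity over a pre-computed pairwise alignment.

    Assumed convention: identical columns * 100 / (alignment length - gapped columns).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    gapped = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            gapped += 1
        elif x.upper() == y.upper():
            identical += 1
    compared = len(aligned_a) - gapped
    if compared == 0:
        raise ValueError("alignment has no ungapped columns")
    return 100.0 * identical / compared


def highest_tier_met(identity: float) -> Optional[int]:
    """Return the highest listed threshold that the identity reaches, if any."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


def count_changes(aligned_ref: str, aligned_variant: str) -> int:
    """Count alignment columns that differ: substitutions, deletions, and insertions."""
    if len(aligned_ref) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_ref, aligned_variant) if x.upper() != y.upper())
```

A variant would fall within the sixth aspect's stated limit when `count_changes(ref, variant) <= 10`. The alignment step itself is deliberately left out, since the choice of alignment program and parameters affects the result.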
In a preferred aspect, the polypeptide is a Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, or Bacillus thuringiensis polypeptide having cellulolytic enhancing activity.

In another preferred aspect, the polypeptide is a Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, or Streptococcus equi subsp. zooepidemicus polypeptide having cellulolytic enhancing activity.

In another preferred aspect, the polypeptide is a Streptomyces achromogenes, Streptomyces avermitilis, Streptomy

It will be understood that for the aforementioned species the invention encompasses both the perfect and imperfect states, and other taxonomic equivalents, e.g., anamorphs, regardless of the species name by which they are known. Those skilled in the art will readily recognize the identity of appropriate equivalents.

Strains of these species are readily accessible to the public in a number of culture collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL).

Furthermore, polypeptides having cellulolytic enhancing activity may be identified and obtained from other sources including microorganisms isolated from nature (e.g., soil, composts, water, etc.) using the above-mentioned probes. Techniques for isolating microorganisms from natural habitats are well known in the art. The polynucleotide may then be obtained by similarly screening a genomic or cDNA library of such a microorganism. Once a polynucleotide encoding a polypeptide has been detected with the probe(s), the polynucleotide can be isolated or cloned by utilizing techniques that are well known to those of ordinary skill in the art (see, e.g., Sambrook et al., 1989, supra).

Polynucleotides comprising nucleotide sequences that encode a polypeptide having cellulolytic enhancing activity can be isolated and utilized to express the polypeptide having cellulolytic enhancing activity for evaluation in the methods of the present invention, as described herein.

The polynucleotides comprise nucleotide sequences that have a degree of identity to the mature polypeptide coding sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, or SEQ ID NO: 15 of preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, even more preferably at least 90%, most preferably at least 95%, and even most preferably at least 96%, at least 97%, at least 98%, or at least 99%, which encode a polypeptide having cellulolytic enhancing activity.

The polynucleotide may also be a polynucleotide encoding a polypeptide having cellulolytic enhancing activity that hybridizes under at least very low stringency conditions, preferably at least low stringency conditions, more preferably at least medium stringency conditions, more preferably at least medium-high stringency conditions, even more preferably at least high stringency conditions, and most preferably at least very high stringency conditions with (i) the mature polypeptide coding sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, or SEQ ID NO: 15, (ii) the cDNA sequence contained in the mature polypeptide coding sequence of SEQ

Examples of peroxide-generating enzymes include the following:
E.C. 1.1.3.x - donor:oxygen oxidoreductase, glucose oxidase (E.C. 1.1.3.4), hexose oxidase (E.C. 1.1.3.5), aryl-alcohol oxidase (E.C. 1.1.3.7), D-arabinono-1,4-lactone oxidase (E.C. 1.1.3.37), vanillyl-alcohol oxidase (E.C. 1.1.3.38), xylitol oxidase (E.C. 1.1.3.41)
E.C. 1.1.99.8 - alcohol dehydrogenase
E.C. 1.1.99.18 - cellobiose dehydrogenase
E.C. 1.2.3.x - aldehyde oxidase (E.C. 1.2.3.1), aryl-aldehyde oxidase (E.C. 1.2.3.9)
E.C. 1.3.3.x - dihydroorotate oxidase (E.C. 1.3.3.1), pyrroloquinoline-quinone synthase (E.C. 1.3.3.11)
E.C. 1.4.3.x - L-amino acid oxidase (E.C. 1.4.3.2), L-glutamate oxidase (E.C. 1.4.3.11)
E.C. 1.5.3.x - polyamine oxidase (E.C. 1.5.3.11)
E.C. 1.6.3.1 - NADPH oxidase
E.C. 1.7.3.x - urate oxidase (E.C. 1.7.3.3), hydroxylamine oxidase (E.C. 1.7.3.4)
E.C. 1.8.3.x - thiol oxidase (E.C. 1.8.3.2), glutathione oxidase (E.C. 1.8.3.3)
E.C. 1.9.3.1 - cytochrome c oxidase
E.C. 1.13.11.12 - lipoxygenase
E.C. 1.13.11.31 - arachidonate 12-lipoxygenase
E.C. 1.13.11.33 - arachidonate 15-lipoxygenase
E.C. 1.13.11.34 - arachidonate 5-lipoxygenase
E.C. 1.13.11.40 - arachidonate 8-lipoxygenase
E.C. 1.13.11.45 - linoleate 11-lipoxygenase
E.C. 1.15.1.1 - superoxide dismutase
E.C. 1.17.3.x - xanthine oxidase (E.C. 1.17.3.2)

The peroxide-generating enzyme may be obtained from microorganisms of any genus. In one aspect, the polypeptide obtained from a given source is secreted extracellularly.

The peroxide-generating enzyme may be a bacterial peroxide-generating enzyme. For example, the peroxide-generating enzyme may be a Gram positive bacterial peroxide-generating enzyme such as a Bacillus, Streptococcus, Streptomyces, Staphylococcus, Enterococcus, Lactobacillus, Lactococcus, Clostridium, Geobacillus, or Oceanobacillus peroxide-generating enzyme, or a Gram negative bacterial peroxide-generating enzyme such as an E.
coli, Pseudomo ID NO: 1, SEQID NO:3, SEQID NO:5, or SEQID NO:13, nas, Salmonella, Campylobacter, Helicobacter, Flavobacte or the genomic DNA sequence comprising the mature rium, Fusobacterium, Ilyobacter, Neisseria, or Ureaplasma polypeptide coding sequence of SEQID NO: 7, SEQID NO: peroxide-generating enzyme. 9, SEQ ID NO: 11, or SEQ ID NO: 15, or (iii) a full-length 45 In one aspect, the peroxide-generating enzyme is a Bacillus complementary strand of (i) or (ii); or allelic variants and alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, subsequences thereof (Sambrook et al., 1989, supra), as Bacillus circulans, Bacillus clausii, Bacillus coagulans, defined herein. In a preferred aspect, the mature polypeptide Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus coding sequence is nucleotides 388 to 1332 of SEQID NO: 1. licheniformis, Bacillus megaterium, Bacillus pumilus, Bacil nucleotides 98 to 821 of SEQID NO: 3, nucleotides 126 to 50 lus Stearothermophilus, Bacillus subtilis, or Bacillus thuring 978 of SEQID NO: 5, nucleotides 55 to 678 of SEQID NO: iensis peroxide-generating enzyme. 7, nucleotides 58 to 912 of SEQID NO: 9, nucleotides 46 to In another aspect, the peroxide-generating enzyme is a 951 of SEQID NO: 11, nucleotides 67 to 796 of SEQID NO: Streptococcus equisimilis, Streptococcus pyogenes, Strepto 13, or nucleotides 77 to 766 of SEQID NO: 15. coccus uberis, or Streptococcus equi Subsp. Zooepidemicus As described earlier, the techniques used to isolate or clone 55 peroxide-generating enzyme. a polynucleotide encoding a polypeptide are known in the art In another aspect, the peroxide-generating enzyme is a and include isolation from genomic DNA, preparation from Streptomyces achromogenes, Streptomyces avermitilis, cDNA, or a combination thereof. Streptomyces coelicolor; Streptomyces griseus, or Streptomy Peroxide-Generating Enzymes ces lividans peroxide-generating enzyme. 
In the methods of the present invention, the peroxide-generating enzyme can be any peroxide-generating enzyme. The peroxide-generating enzyme, e.g., hydrogen peroxide-generating enzyme, may be present as an enzyme activity in the enzyme composition, a component in one or more (several) proteins added to the composition, and/or an enzyme component present in the cellulosic material. In one aspect, the peroxide is hydrogen peroxide.

The peroxide-generating enzyme may also be a fungal peroxide-generating enzyme, and more preferably a yeast peroxide-generating enzyme such as a Candida, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia peroxide-generating enzyme; or more preferably a filamentous fungal peroxide-generating enzyme such as an Acremonium, Agaricus, Alternaria, Aspergillus, Aureobasidium, Botryosphaeria, Ceriporiopsis, Chaetomidium, Chrysosporium, Claviceps, Cochliobolus, Coprinopsis, Coptotermes, Corynascus, Cryphonectria, Cryptococcus, Diplodia, Exidia, Filibasidium, Fusarium, Gibberella, Holomastigotoides, Humicola, Irpex, Lentinula, Leptosphaeria, Magnaporthe, Melanocarpus, Meripilus, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Piromyces, Poitrasia, Pseudoplectania, Pseudotrichonympha, Rhizomucor, Schizophyllum, Scytalidium, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trichoderma, Trichophaea, Verticillium, Volvariella, or Xylaria peroxide-generating enzyme.

In another aspect, the peroxide-generating enzyme is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis peroxide-generating enzyme.

In another aspect, the peroxide-generating enzyme is an Acremonium cellulolyticus, Aspergillus aculeatus, Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium tropicum, Chrysosporium merdarium, Chrysosporium inops, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium zonatum, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola grisea, Humicola insolens, Humicola lanuginosa, Irpex lacteus, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium funiculosum, Penicillium purpurogenum, Phanerochaete chrysosporium, Thielavia achromatica, Thielavia albomyces, Thielavia albopilosa, Thielavia australeinsis, Thielavia fimeti, Thielavia microspora, Thielavia ovispora, Thielavia peruviana, Thielavia spededonium, Thielavia setosa, Thielavia subthermophila, Thielavia terrestris, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride peroxide-generating enzyme.

Examples of peroxide-generating enzymes and their sources include Cohn, 1958, The enzymatic formation of oxalacetic acid by nonpyridine nucleotide malic dehydrogenase of Micrococcus lysodeikticus, J. Biol. Chem. 233: 299-304; Yamashita et al., 2000, Isolation, characterization and molecular cloning of a thermostable xylitol oxidase from Streptomyces sp. IKD472, J. Biosci. Bioeng. 89: 350-360 (Accession No. Q9KX73); Seo et al., 2000, The Arabidopsis aldehyde oxidase 3 (AAO3) gene product catalyzes the final step in abscisic acid biosynthesis in leaves, Proc. Natl. Acad. Sci. USA 97: 12908-12913 (Accession No. Q7G191); Aurich et al., 1972, Purification and properties of L-amino acid oxidase from Neurospora crassa, Acta Biol. Med. Ger. 28: 209-220 (Accession No. P23623); Hoober and Thorpe, 2002, Flavin-dependent sulfhydryl oxidases in protein disulfide bond formation, Methods Enzymol. 348: 30-34; Baum and Scandalios, 1981, Isolation and characterization of the cytosolic and mitochondrial superoxide dismutases of maize, Arch. Biochem. Biophys. 206: 249-64 (Accession No. P09233); Holdom et al., 1996, The Cu,Zn superoxide dismutases of Aspergillus flavus, Aspergillus niger, Aspergillus nidulans, and Aspergillus terreus: purification and biochemical comparison with the Aspergillus fumigatus Cu,Zn superoxide dismutase, Infect. Immun. 64: 3326-3332 (Accession No. Q3MSU9); Lamarre et al., 2001, Candida albicans expresses an unusual cytoplasmic manganese-containing superoxide dismutase (SOD3 gene product) upon the entry and during the stationary phase, J. Biol. Chem. 276: 43784-43791 (Accession No. O13401); Dufernez et al., 2006, The presence of four iron-containing superoxide dismutase isozymes in trypanosomatidae: characterization, subcellular localization, and phylogenetic origin in Trypanosoma brucei, Free Radic. Biol. Med. 40(2):193-5 (Accession Nos. AY894557, AY894558, AY894559 and AY894560); Hjalmarsson et al., 1987, Isolation and sequence of complementary DNA encoding human extracellular superoxide dismutase, Proc. Natl. Acad. Sci. USA 84: 6340-6344 (Accession No. J02947); Yamada et al., 1999, Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana, Nature 402: 761-768 (Accession No. Q97PY2); Kriechbaum et al., 1989, Cloning and DNA sequence analysis of the glucose oxidase gene from Aspergillus niger NRRL-3, FEBS Lett. 255(1): 63-66 (Accession No. P13006); Kiess et al., 1998, Glucose oxidase from Penicillium amagasakiense. Primary structure and comparison with other glucose-methanol-choline (GMC) oxidoreductases, Eur. J. Biochem. 252(1): 90-99 (Accession No. P81156); Nierman et al., 2005, Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus, Nature 438: 1151-1156 (Accession No. Q4WZA6); Toyama et al., 2005, Molecular cloning and structural analysis of quinohemoprotein alcohol dehydrogenase ADH-IIG from Pseudomonas putida HK5, J. Mol. Biol. 352(1): 91-104 (Accession No. Q4W6G0).

Examples of cellobiose dehydrogenase and their sources include Xu et al., 2001, Humicola insolens cellobiose dehydrogenase: cloning, redox chemistry, and "logic gate"-like dual functionality, Enz. Microb. Technol. 28: 744-753 (Accession No. Q9P8H5); Nozaki et al., 1999, Cloning and expression of cellobiose dehydrogenase from Irpex lacteus, Submitted (AUG-2004) to the EMBL/GenBank/DDBJ databases (Accession No. Q6AW20); Moukha et al., 1999, Cloning and analysis of Pycnoporus cinnabarinus cellobiose dehydrogenase, Gene 234: 23-33 (Accession No. O74253); Li et al., 1996, Cloning of a cDNA encoding cellobiose dehydrogenase, a hemoflavoenzyme from Phanerochaete chrysosporium, Appl. Environ. Microbiol. 62: 1329-1335 (Accession No. Q01738); Kajisa et al., 2004, Characterization and molecular cloning of cellobiose dehydrogenase from the brown-rot fungus Coniophora puteana, Biosci. Bioeng. 98: 57-63 (Accession No. Q6BDD5); Zamocky et al., Phylogenetic analysis of cellobiose dehydrogenases, Submitted (NOV-2002) to the EMBL/GenBank/DDBJ databases (Accession No. Q7Z975); Yoshida et al., 2002, Molecular cloning and characterization of a cDNA encoding cellobiose dehydrogenase from the wood-rotting fungus Grifola frondosa, FEMS Microbiol. Lett. 217: 225-230 (Accession No. Q8J2T4); Stapleton et al., 2004, Molecular cloning of the cellobiose dehydrogenase gene from Trametes versicolor and expression in Pichia pastoris, Enzyme Microb. Technol. 34: 55-63 (Accession No. Q875J3); Dumonceaux et al., 1998, Cloning and sequencing of a gene encoding cellobiose dehydrogenase from Trametes versicolor, Gene 210: 211-219 (Accession No. O42729); Nierman et al., 2005, Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus, Nature 438: 1151-1156 (Accession No. Q4WIN9); Raices et al., 1995, Cloning and characterization of a cDNA encoding a cellobiose dehydrogenase from the white rot fungus Phanerochaete chrysosporium, FEBS Lett. 369: 233-238 (Accession No. Q12661); Zamocky et al., 2008, Cloning, sequence analysis and heterologous expression in Pichia pastoris of a gene encoding a thermostable cellobiose dehydrogenase from Myriococcum thermophilum, Protein Expr. Purif. 59: 258-265 (Accession No. A9XK88); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. A1CFV0); Subramaniam et al., Biochemical and molecular biological characterization of cellobiose dehydrogenase from Sporotrichum thermophilum, Submitted (June-1998) to the EMBL/GenBank/DDBJ databases (Accession No. O74240); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. A1CYG2); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. B0XVQ8); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. A1C890); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. A1DIY3); Zamocky et al., 2008, Cloning, sequence analysis and heterologous expression in Pichia pastoris of a gene encoding a thermostable cellobiose dehydrogenase from Myriococcum thermophilum, Protein Expr. Purif. 59: 258-265 (Accession No. A9XK87); Birren et al., The Broad Institute Genome Sequencing Platform "Genome Sequence of Pyrenophora tritici-repentis", Submitted (March-2007) to the EMBL/GenBank/DDBJ databases (Accession No. B2WHIT); Birren et al., The Broad Institute Genome Sequencing Platform "Genome Sequence of Pyrenophora tritici-repentis", Submitted (March-2007) to the EMBL/GenBank/DDBJ databases (Accession No. B2WJX3); Fedorova et al., Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus, PloS (Accession No. Q4WC40); and Pel et al., 2007, Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88, Nat. Biotechnol. 25: 221-231 (Accession No. A2OD75).

In one aspect, the cellobiose dehydrogenase is a Humicola insolens cellobiose dehydrogenase. In another aspect, the cellobiose dehydrogenase is a Humicola insolens DSM 1800 cellobiose dehydrogenase, e.g., the polypeptide comprising SEQ ID NO: 18 encoded by SEQ ID NO: 17, or a fragment thereof having cellobiose dehydrogenase activity (see U.S. Pat. No. 6,280,976).

In another aspect, the cellobiose dehydrogenase is a Myceliophthora thermophila cellobiose dehydrogenase. In another aspect, the cellobiose dehydrogenase is a Myceliophthora thermophila CBS 117.65 cellobiose dehydrogenase.

It will be understood that for the aforementioned species the invention encompasses both the perfect and imperfect states, and other taxonomic equivalents, e.g., anamorphs, regardless of the species name by which they are known. Those skilled in the art will readily recognize the identity of appropriate equivalents.

Strains of these species are readily accessible to the public in a number of culture collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL).

Non-Enzymatic Peroxide-Generating Systems

In the present invention, the peroxide generation system can be any peroxide-generating chemical reaction or system of reactions. The peroxide-generating system may be present as a reaction between components in the enzyme composition and/or one or more (several) of the components of the biomass and/or one or more (several) chemical components added to the composition. In one aspect, the peroxide is hydrogen peroxide.

Examples of peroxide generating systems include, but are not limited to, UV-irradiation of Rose Bengal (Wright et al., 2000, Singlet Oxygen-Mediated Protein Oxidation: Evidence for the Formation of Reactive Peroxides, Redox Report, 5: 159-161); the Riedl-Pfleiderer process of autooxidation of 2-ethyl-9,10-dihydroxyanthracene + O₂ to 2-ethylanthraquinone + H₂O₂ (Eul et al., 2001, Hydrogen peroxide, in Kirk-Othmer Encyclopedia of Chemical Technology, Wiley, New York); reaction of singlet state molecular oxygen ¹O₂ with ascorbate (Kramarenko et al., 2006, Ascorbate Reacts with Singlet Oxygen to Produce Hydrogen Peroxide, Photochem. Photobiol. 82(6): 1634-1637); the oxidation of unsaturated lipid (after radical initiation) to lipid peroxide (Benzie, 1996, Lipid peroxidation: A review of causes, consequences, measurement and dietary influences, International Journal of Food Sciences and Nutrition 47(3): 233-261); and the oxidation of organic alcohols by molecular oxygen in the presence of various metal and metal complex catalysts (Bortolo et al., 2000, Production of Hydrogen Peroxide from Oxygen and Alcohols, Catalyzed by Palladium Complexes, J. Mol. Cat. A. Chem., 153: 25-29).

Polypeptides Having Peroxidase Activity

In the methods of the present invention, the polypeptide having peroxidase activity can be any polypeptide having peroxidase activity. The polypeptide having peroxidase activity may be present as an enzyme activity in the enzyme composition and/or as one or more (several) protein components added to the composition. In a preferred aspect, the polypeptide having peroxidase activity is foreign to one or more (several) components of the enzyme composition.

Examples of peroxidase or peroxide-decomposing enzymes include, but are not limited to, the following:

E.C. 1.11.1.1 NADH peroxidase
E.C. 1.11.1.2 NADPH peroxidase
E.C. 1.11.1.3 fatty acid peroxidase
E.C. 1.11.1.5 di-heme cytochrome c peroxidase
E.C. 1.11.1.5 cytochrome c peroxidase
E.C. 1.11.1.6 catalase
E.C. 1.11.1.6 manganese catalase
E.C. 1.11.1.7 invertebrate peroxinectin
E.C. 1.11.1.7 eosinophil peroxidase
E.C. 1.11.1.7 lactoperoxidase
E.C. 1.11.1.7 myeloperoxidase
E.C. 1.11.1.8 thyroid peroxidase
E.C. 1.11.1.9 glutathione peroxidase
E.C. 1.11.1.10 chloride peroxidase
E.C. 1.11.1.11 ascorbate peroxidase
E.C. 1.11.1.12 other glutathione peroxidase
E.C. 1.11.1.13 manganese peroxidase
E.C. 1.11.1.14 lignin peroxidase
E.C. 1.11.1.15 cysteine peroxiredoxin
E.C. 1.11.1.16 versatile peroxidase
E.C. 1.11.1.B2 chloride peroxidase
E.C. 1.11.1.B4 haloperoxidase
E.C. 1.11.1.B4 no-heme vanadium haloperoxidase
E.C. 1.11.1.B6 iodide peroxidase
E.C. 1.11.1.B7 bromide peroxidase
E.C. 1.11.1.B8 iodide peroxidase

In one aspect, the peroxidase is a NADH peroxidase. In another aspect, the peroxidase is a NADPH peroxidase. In another aspect, the peroxidase is a fatty acid peroxidase. In another aspect, the peroxidase is a di-heme cytochrome c peroxidase. In another aspect, the peroxidase is a cytochrome c peroxidase. In another aspect, the peroxidase is a catalase. In another aspect, the peroxidase is a manganese catalase. In another aspect, the peroxidase is an invertebrate peroxinectin. In another aspect, the peroxidase is an eosinophil peroxidase.
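The E.C. classification listed above can be held in a simple lookup table. The following is a minimal Python sketch, not part of the patent: the names `EC_PEROXIDASES` and `enzymes_for` are hypothetical, and only a few entries whose names appear explicitly in the list are included.

```python
# Illustrative lookup of a few E.C. 1.11.1.x entries named in the list above.
# EC_PEROXIDASES and enzymes_for are hypothetical names chosen for this sketch.
EC_PEROXIDASES = {
    "1.11.1.1": ["NADH peroxidase"],
    "1.11.1.2": ["NADPH peroxidase"],
    "1.11.1.3": ["fatty acid peroxidase"],
    "1.11.1.6": ["catalase", "manganese catalase"],
    "1.11.1.9": ["glutathione peroxidase"],
    "1.11.1.13": ["manganese peroxidase"],
}


def enzymes_for(ec_number):
    """Return the enzyme names recorded for an E.C. number, or an empty list."""
    # Accept both "E.C. 1.11.1.6" and the bare "1.11.1.6" form.
    key = ec_number.replace("E.C.", "").strip()
    return list(EC_PEROXIDASES.get(key, []))
```

One E.C. number can carry more than one entry (as with E.C. 1.11.1.6), which is why each key maps to a list rather than a single name.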
In another aspect, the peroxidase is a lactoperoxidase. In another aspect, the peroxidase is a myeloperoxidase. In another aspect, the peroxidase is a thyroid peroxidase. In another aspect, the peroxidase is a glutathione peroxidase. In another aspect, the peroxidase is a chloride peroxidase. In another aspect, the peroxidase is an ascorbate peroxidase. In another aspect, the peroxidase is a glutathione peroxidase. In another aspect, the peroxidase is a manganese peroxidase. In another aspect, the peroxidase is a lignin peroxidase. In another aspect, the peroxidase is a cysteine peroxiredoxin. In another aspect, the peroxidase is a versatile peroxidase. In another aspect, the peroxidase is a chloride peroxidase. In another aspect, the peroxidase is a haloperoxidase. In another aspect, the peroxidase is a no-heme vanadium haloperoxidase. In another aspect, the peroxidase is an iodide peroxidase. In another aspect, the peroxidase is a bromide peroxidase. In another aspect, the peroxidase is an iodide peroxidase.

Examples of polypeptides having peroxidase activity include, but are not limited to, Coprinus cinereus peroxidase (Baunsgaard et al., 1993, Amino acid sequence of Coprinus macrorhizus peroxidase and cDNA sequence encoding Coprinus cinereus peroxidase. A new family of fungal peroxidases, Eur. J. Biochem. 213(1): 605-611 (Accession number P28314)); horseradish peroxidase (Fujiyama et al., 1988, Structure of the horseradish peroxidase isozyme C genes, Eur. J. Biochem. 173(3): 681-687 (Accession number P15232)); peroxiredoxin (Singh and Shichi, 1998, A novel glutathione peroxidase in bovine eye. Sequence analysis, mRNA level, and translation, J. Biol. Chem. 273(40): 26171-26178 (Accession number O77834)); lactoperoxidase (Dull et al., 1990, Molecular cloning of cDNAs encoding bovine and human lactoperoxidase, DNA Cell Biol. 9(7): 499-509 (Accession number P80025)); eosinophil peroxidase (Fornhem et al., 1996, Isolation and characterization of porcine cationic eosinophil granule proteins, Int. Arch. Allergy Immunol. 110(2): 132-142 (Accession number P80550)); versatile peroxidase (Ruiz-Duenas et al., 1999, Molecular characterization of a novel peroxidase isolated from the ligninolytic fungus Pleurotus eryngii, Mol. Microbiol. 31(1): 223-235 (Accession number O94753)); turnip peroxidase (Mazza and Welinder, 1980, Covalent structure of turnip peroxidase 7. Cyanogen bromide fragments, complete structure and comparison to horseradish peroxidase C, Eur. J. Biochem. 108(2): 481-489 (Accession number P00434)); myeloperoxidase (Morishita et al., 1987, Chromosomal gene structure of human myeloperoxidase and regulation of its expression by granulocyte colony-stimulating factor, J. Biol. Chem. 262(31): 15208-15213 (Accession number P05164)); peroxidasin and peroxidasin homologs (Horikoshi et al., 1999, Isolation of differentially expressed cDNAs from p53-dependent apoptotic cells: activation of the human homologue of the Drosophila peroxidasin gene, Biochem. Biophys. Res. Commun. 261(3): 864-869 (Accession number Q92626)); lignin peroxidase (Tien and Tu, 1987, Cloning and sequencing of a cDNA for a ligninase from Phanerochaete chrysosporium, Nature 326(6112): 520-523 (Accession number P06181)); manganese peroxidase (Orth et al., 1994, Characterization of a cDNA encoding a manganese peroxidase from Phanerochaete chrysosporium: genomic organization of lignin and manganese peroxidase-encoding genes, Gene 148(1): 161-165 (Accession number P78733)); alpha-dioxygenase, dual oxidase, peroxidasin, invertebrate peroxinectin, short peroxidockerin, lactoperoxidase, myeloperoxidase, non-mammalian vertebrate peroxidase, catalase, catalase-lipoxygenase fusion, di-heme cytochrome c peroxidase, methylamine utilization protein, DyP-type peroxidase, haloperoxidase, ascorbate peroxidase, catalase peroxidase, hybrid ascorbate-cytochrome c peroxidase, lignin peroxidase, manganese peroxidase, versatile peroxidase, other class II peroxidase, class III peroxidase, alkylhydroperoxidase D, other alkylhydroperoxidases, no-heme, no metal haloperoxidase, no-heme vanadium haloperoxidase, manganese catalase, NADH peroxidase, glutathione peroxidase, cysteine peroxiredoxin, thioredoxin-dependent thiol peroxidase, and AhpE-like peroxiredoxin (Passard et al., 2007, Phytochemistry 68: 1605-1611).

The polypeptide having peroxidase activity may be obtained from microorganisms of any genus. In one aspect, the polypeptide obtained from a given source is secreted extracellularly.

The polypeptide having peroxidase activity may be a bacterial polypeptide. For example, the polypeptide may be a Gram positive bacterial polypeptide such as a Bacillus, Streptococcus, Streptomyces, Staphylococcus, Enterococcus, Lactobacillus, Lactococcus, Clostridium, Geobacillus, or Oceanobacillus polypeptide having peroxidase activity, or a Gram negative bacterial polypeptide such as an E. coli, Pseudomonas, Salmonella, Campylobacter, Helicobacter, Flavobacterium, Fusobacterium, Ilyobacter, Neisseria, or Ureaplasma polypeptide having peroxidase activity.

In one aspect, the polypeptide is a Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, or Bacillus thuringiensis polypeptide having peroxidase activity.

In another aspect, the polypeptide is a Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, or Streptococcus equi subsp. zooepidemicus polypeptide having peroxidase activity.

In another aspect, the polypeptide is a Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, or Streptomyces lividans polypeptide having peroxidase activity.

The polypeptide having peroxidase activity may also be a fungal polypeptide, and more preferably a yeast polypeptide such as a Candida, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia polypeptide having peroxidase activity; or more preferably a filamentous fungal polypeptide such as an Acremonium, Agaricus, Alternaria, Aspergillus, Aureobasidium, Botryosphaeria, Ceriporiopsis, Chaetomidium, Chrysosporium, Claviceps, Cochliobolus, Coprinopsis, Coptotermes, Corynascus, Cryphonectria, Cryptococcus, Diplodia, Exidia, Filibasidium, Fusarium, Gibberella, Holomastigotoides, Humicola, Irpex, Lentinula, Leptosphaeria, Magnaporthe, Melanocarpus, Meripilus, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Piromyces, Poitrasia, Pseudoplectania, Pseudotrichonympha, Rhizomucor, Schizophyllum, Scytalidium, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trichoderma, Trichophaea, Verticillium, Volvariella, or Xylaria polypeptide having peroxidase activity.

In another aspect, the polypeptide is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis polypeptide having peroxidase activity.

In another aspect, the polypeptide is an Acremonium cellulolyticus, Aspergillus aculeatus, Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium tropicum, Chrysosporium merdarium, Chrysosporium inops, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium zonatum, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola grisea, Humicola insolens, Humicola lanuginosa, Irpex lacteus, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium funiculosum, Penicillium purpurogenum, Phanerochaete chrysosporium, Thielavia achromatica, Thielavia albomyces, Thielavia albopilosa, Thielavia australeinsis, Thielavia fimeti, Thielavia microspora, Thielavia ovispora, Thielavia peruviana, Thielavia spededonium, Thielavia setosa, Thielavia subthermophila, Thielavia terrestris, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride polypeptide having peroxidase activity.

In another aspect, the peroxidase is horseradish peroxidase. In another aspect, the peroxidase is Coprinus cinereus peroxidase.

Techniques used to isolate or clone a polynucleotide encoding a polypeptide having peroxidase activity are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the polynucleotides of the present invention from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York. Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligation activated transcription (LAT) and nucleotide sequence-based amplification (NASBA) may be used.

Enzyme Compositions

In the methods of the present invention, the enzyme composition may comprise any protein involved in the processing of a cellulose-containing material to glucose and/or cellobiose, or hemicellulose to xylose, mannose, galactose, and/or arabinose.

The enzyme composition preferably comprises enzymes having cellulolytic activity and/or xylan degrading activity. In one aspect, the enzyme composition comprises one or more (several) cellulolytic enzymes. In another aspect, the enzyme composition comprises one or more (several) xylan degrading enzymes. In another aspect, the enzyme composition comprises one or more (several) cellulolytic enzymes and one or more (several) xylan degrading enzymes.

The one or more (several) cellulolytic enzymes are preferably selected from the group consisting of an endoglucanase, a cellobiohydrolase, and a beta-glucosidase. The one or more (several) xylan degrading enzymes are preferably selected from the group consisting of a xylanase, an acetylxylan esterase, a feruloyl esterase, an arabinofuranosidase, a xylosidase, and a glucuronidase.

In another aspect, the enzyme composition may further or even further comprise one or more (several) additional enzyme activities to improve the degradation of the cellulose-containing material. Preferred additional enzymes are hemicellulases (e.g., alpha-D-glucuronidases, alpha-L-arabinofuranosidases, endo-mannanases, beta-mannosidases, alpha-galactosidases, endo-alpha-L-arabinanases, beta-galactosidases), carbohydrate-esterases (e.g., acetyl-xylan esterases, acetyl-mannan esterases, ferulic acid esterases, coumaric acid esterases, glucuronoyl esterases), pectinases, proteases, ligninolytic enzymes (e.g., laccases, manganese peroxidases, lignin peroxidases, H₂O₂-producing enzymes, oxidoreductases), expansins, swollenins, or mixtures thereof. In the methods of the present invention, the additional enzyme(s) can be added prior to or during fermentation, e.g., during saccharification or during or after propagation of the fermenting microorganism(s).

One or more (several) components of the enzyme composition may be wild-type proteins, recombinant proteins, or a combination of wild-type proteins and recombinant proteins. For example, one or more (several) components may be native proteins of a cell, which is used as a host cell to express recombinantly one or more (several) other components of the enzyme composition. One or more (several) components of the enzyme composition may be produced as monocomponents, which are then combined to form the enzyme composition. The enzyme composition may be a combination of multicomponent and monocomponent protein preparations.

The enzymes used in the methods of the present invention may be in any form suitable for use in the processes described herein, such as, for example, a crude fermentation broth with or without cells removed, a cell lysate with or without cellular debris, a semi-purified or purified enzyme preparation, or a host cell as a source of the enzymes. The enzyme composition may be a dry powder or granulate, a non-dusting granulate, a liquid, a stabilized liquid, or a stabilized protected enzyme. Liquid enzyme preparations may, for instance, be stabilized by adding stabilizers such as a sugar, a sugar alcohol or another polyol, and/or lactic acid or another organic acid according to established processes.

A polypeptide having cellulolytic enzyme activity or xylan degrading activity may be a bacterial polypeptide. For example, the polypeptide may be a Gram positive bacterial polypeptide such as a Bacillus, Streptococcus, Streptomyces, Staphylococcus, Enterococcus, Lactobacillus, Lactococcus, Clostridium, Geobacillus, or Oceanobacillus polypeptide having cellulolytic enzyme activity or xylan degrading activity, or a Gram negative bacterial polypeptide such as an E. coli, Pseudomonas, Salmonella, Campylobacter, Helicobacter, Flavobacterium, Fusobacterium, Ilyobacter, Neisseria, or Ureaplasma polypeptide having cellulolytic enzyme activity or xylan degrading activity.

In a preferred aspect, the polypeptide is a Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus stearothermophilus, Bacillus subtilis, or Bacillus thuringiensis polypeptide having cellulolytic enzyme activity or xylan degrading activity.

In another preferred aspect, the polypeptide is a Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, or Streptococcus equi subsp. zooepidemicus polypeptide having cellulolytic enzyme activity or xylan degrading activity.

In another preferred aspect, the polypeptide is a Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, or Streptomyces lividans polypeptide having cellulolytic enzyme activity or xylan degrading activity.

The polypeptide having cellulolytic enzyme activity or xylan degrading activity may also be a fungal polypeptide, and more preferably a yeast polypeptide such as a Candida, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia polypeptide having cellulolytic enzyme activity or xylan degrading activity; or more preferably a filamentous fungal polypeptide such as an Acremonium, Agaricus, Alternaria, Aspergillus, Aureobasidium, Botryosphaeria, Ceriporiopsis, Chaetomidium, Chrysosporium, Claviceps, Cochliobolus, Coprinopsis, Coptotermes, Corynascus, Cryphonectria, Cryptococcus, Diplodia, Exidia, Filibasidium, Fusarium, Gibberella, Holomastigotoides, Humicola, Irpex, Lentinula, Leptosphaeria, Magnaporthe, Melanocarpus, Meripilus, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Piromyces, Poitrasia, Pseudoplectania, Pseudotrichonympha, Rhizomucor, Schizophyllum, Scytalidium, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trichoderma, Trichophaea, Verticillium, Volvariella, or Xylaria polypeptide having cellulolytic enzyme activity or xylan degrading activity.

In a preferred aspect, the polypeptide is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis polypeptide having cellulolytic enzyme activity or xylan degrading activity.

In another preferred aspect, the polypeptide is an Acremonium cellulolyticus, Aspergillus aculeatus, Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium tropicum, Chrysosporium merdarium, Chrysosporium inops, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium zonatum, Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola grisea, Humicola insolens, Humicola lanuginosa, Irpex lacteus, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium funiculosum, Penicillium purpurogenum, Phanerochaete chrysosporium, Thielavia achromatica, Thielavia albomyces, Thielavia albopilosa, Thielavia australeinsis, Thielavia fimeti, Thielavia microspora, Thielavia ovispora, Thielavia peruviana, Thielavia spededonium, Thielavia setosa, Thielavia subthermophila, Thielavia terrestris, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, or Trichophaea saccata polypeptide having cellulolytic enzyme activity or xylan degrading activity.

Chemically modified or protein engineered mutants of polypeptides having cellulolytic enzyme activity or xylan degrading activity may also be used.

One or more (several) components of the enzyme composition may be a recombinant component, i.e., produced by cloning of a DNA sequence encoding the single component and subsequent cell transformed with the DNA sequence and expressed in a host (see, for example, WO 91/17243 and WO 91/17244). The host is preferably a heterologous host (enzyme is foreign to host), but the host may under certain conditions also be a homologous host (enzyme is native to host). Monocomponent cellulolytic proteins may also be prepared by purifying such a protein from a fermentation broth.

Examples of commercial cellulolytic protein preparations suitable for use in the present invention include, for example, CELLIC™ Ctec (Novozymes A/S), CELLUCLAST™ (Novozymes A/S), NOVOZYM™ 188 (Novozymes A/S), CELLUZYME™ (Novozymes A/S), CEREFLO™ (Novozymes A/S), and ULTRAFLO™ (Novozymes A/S), ACCELERASE™ (Genencor Int.), LAMINEX™ (Genencor Int.), SPEZYME™ CP (Genencor Int.), ROHAMENT™ 7069 W (Rohm GmbH), FIBREZYME® LDI (Dyadic International, Inc.), FIBREZYME® LBR (Dyadic International, Inc.), or VISCOSTAR® 150L (Dyadic International, Inc.). The cellulase enzymes are added in amounts effective from about 0.001 to about 5.0 wt % of solids, more preferably from about 0.025 to about 4.0 wt % of solids, and most preferably from about 0.005 to about 2.0 wt % of solids.

Examples of bacterial endoglucanases that can be used in the methods of the present invention, include, but are not limited to, an Acidothermus cellulolyticus endoglucanase (WO 91/05039; WO 93/15186; U.S. Pat. No. 5,275,944; WO 96/02551; U.S. Pat. No. 5,536,655; WO 00/70031; WO 05/093050); Thermobifida fusca endoglucanase III (WO 05/093050); and Thermobifida fusca endoglucanase V (WO 05/093050).

Examples of fungal endoglucanases that can be used in the methods of the present invention, include, but are not limited to, a Trichoderma reesei endoglucanase I (Penttila et al., 1986, Gene 45: 253-263; GENBANK™ accession no. M15665); Trichoderma reesei endoglucanase II (Saloheimo, et al., 1988, Gene 63: 11-22; GENBANK™ accession no. M19373); Trichoderma reesei endoglucanase III (Okada et al., 1988, Appl. Environ. Microbiol. 64: 555-563; GENBANK™ accession no. AB003694); Aspergillus aculeatus endoglucanase (Ooi et al., 1990, Nucleic Acids Research 18: 5884); Aspergillus kawachii endoglucanase (Sakamoto et al., 1995, Current Genetics 27: 435-439); Erwinia carotovara endoglucanase (Saarilahti et al., 1990, Gene 90: 9-14); Fusarium oxysporum endoglucanase (GENBANK™ accession no. L29381); Humicola grisea var. thermoidea endoglucanase (GENBANK™ accession no. AB003107); Melanocarpus albomyces endoglucanase (GENBANK™ accession no. MAL515703); Neurospora crassa endoglucanase (GENBANK™ accession no. XM_324477); Humicola insolens endoglucanase V (SEQ ID NO: 20); Myceliophthora thermophila CBS 117.65 endoglucanase (SEQ ID NO: 22); basidiomycete CBS 495.95 endoglucanase (SEQ ID NO: 24); basidiomycete CBS 494.95 endoglucanase (SEQ ID NO: 26); Thielavia terrestris NRRL 8126 CEL6B endoglucanase (SEQ ID NO: 28); Thielavia terrestris NRRL 8126 CEL6C endoglucanase (SEQ ID NO: 30); Thielavia terrestris NRRL 8126 CEL7C endoglucanase (SEQ ID NO: 32); Thielavia terrestris NRRL 8126 CEL7E endoglucanase (SEQ ID NO: 34); Thielavia terrestris NRRL 8126 CEL7F endoglucanase (SEQ ID NO: 36); Cladorrhinum foecundissimum ATCC 62373 CEL7A endoglucanase (SEQ ID NO: 38); and Trichoderma reesei strain No. VTT-D-80133 endoglucanase (SEQ ID NO: 40; GENBANK™ accession no. M15665). The endoglucanases of SEQ ID NO: 20, SEQ ID NO: 22, SEQ ID NO: 24, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, and SEQ ID NO: 40 described above are encoded by the mature polypeptide coding sequence of SEQ ID NO: 19, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 29, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, respectively.

Examples of cellobiohydrolases useful in the methods of the present invention include, but are not limited to, Trichoderma reesei cellobiohydrolase I (SEQ ID NO: 42); Trichoderma reesei cellobiohydrolase II (SEQ ID NO: 44); Humicola insolens cellobiohydrolase I (SEQ ID NO: 46), Myceliophthora thermophila cellobiohydrolase II (SEQ ID NO: 48 and SEQ ID NO: 50), Thielavia terrestris cellobiohydrolase II (CEL6A) (SEQ ID NO: 52), Chaetomium thermophilum cellobiohydrolase I (SEQ ID NO: 54), and Chaetomium thermophilum cellobiohydrolase II (SEQ ID NO: 56). The cellobiohydrolases of SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, and SEQ ID NO: 54 described above are encoded by the mature polypeptide coding sequence of SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 53, and SEQ ID NO: 55, respectively.

Examples of beta-glucosidases useful in the methods of the present invention include, but are not limited to, Aspergillus oryzae beta-glucosidase (SEQ ID NO: 58); Aspergillus fumigatus beta-glucosidase (SEQ ID NO: 60); Penicillium brasilianum IBT 20888 beta-glucosidase (SEQ ID NO: 62); Aspergillus niger beta-glucosidase (SEQ ID NO: 64); and Aspergillus aculeatus beta-glucosidase (SEQ ID NO: 66). The beta-glucosidases of SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, and SEQ ID NO: 66 described above are encoded by the mature polypeptide coding sequence of SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 63, and SEQ ID NO: 65, respectively.

043980, WO 2004/048592, WO 2005/001065, WO 2005/028636, WO 2005/093050, WO 2005/093073, WO 2006/074005, WO 2006/117432, WO 2007/071818, WO 2007/071820, WO 2008/008070, WO 2008/008793, U.S. Pat. No. 4,435,307, U.S. Pat. No. 5,457,046, U.S. Pat. No. 5,648,263, U.S. Pat. No. 5,686,593, U.S. Pat. No. 5,691,178, U.S. Pat. No. 5,763,254, and U.S. Pat. No. 5,776,757.

Examples of commercial xylan degrading enzyme preparations suitable for use in the present invention include, for example, SHEARZYME™ (Novozymes A/S), CELLIC™ Htec (Novozymes A/S), VISCOZYME® (Novozymes A/S), ULTRAFLO® (Novozymes A/S), PULPZYME® HC (Novozymes A/S), MULTIFECT® Xylanase (Genencor), ECOPULP® TX-200A (AB Enzymes), HSP 6000 Xylanase (DSM), DEPOL™ 333P (Biocatalysts Limit, Wales, UK), DEPOL™ 740L (Biocatalysts Limit, Wales, UK), and DEPOL™ 762P (Biocatalysts Limit, Wales, UK).

Examples of xylanases useful in the methods of the present invention include, but are not limited to, Aspergillus aculeatus xylanase (GeneSeqP:AAR63790; WO 94/21785), Aspergillus fumigatus xylanases (WO 2006/078256), and Thielavia terrestris NRRL 8126 xylanases (WO 2009/079210).

Examples of beta-xylosidases useful in the methods of the present invention include, but are not limited to, Trichoderma reesei beta-xylosidase (UniProtKB/TrEMBL accession num
ber Q92458), Talaromyces emersonii (SwissProt accession The Aspergillus Oryzae polypeptide having beta-glucosi number Q8x212), and Neurospora crassa (SwissProt acces dase activity can be obtained according to WO 2002/095.014. sion number Q7SOW4). The Aspergillus fumigatus polypeptide having beta-glucosi- 30 Examples of acetylxylan esterases useful in the methods of dase activity can be obtained according to WO 2005/047499. the present invention include, but are not limited to, Hypocrea The Penicillium brasilianum polypeptide having beta-glu jecorina acetylxylan esterase (WO 2005/001036), Neuro cosidase activity can be obtained according to WO 2007/ spora Crassa acetylxylan esterase (UniProt accession number 019442. The Aspergillus niger polypeptide having beta-glu q7s259), Thielavia terrestris NRRL 8126 acetylxylan cosidase activity can be obtained according to Dan et al., 35 esterase (WO 2009/042846), Chaetomium globosum 2000, J. Biol. Chem. 275: 4973-4980. The Aspergillus acetylxylan esterase (Uniprot accession number Q2GWX4), aculeatus polypeptide having beta-glucosidase activity can Chaetomium gracile acetylxylan esterase (GeneSeqP acces be obtained according to Kawaguchi et al., 1996, Gene 173: sion number AAB82124), Phaeosphaeria nodorum acetylxy 287-288. lan esterase (Uniprotaccession number QOUHJ1), and Humi The beta-glucosidase may be a fusion protein. In one 40 cola insolens DSM 1800 acetylxylan esterase (WO 2009/ aspect, the beta-glucosidase is the Aspergillus Oryzae beta 073709). glucosidase variant BG fusion protein of SEQID NO: 68 or Examples offerulic acid esterases useful in the methods of the Aspergillus Oryzae beta-glucosidase fusion protein of the present invention include, but are not limited to, Humicola SEQ ID NO: 70. 
In another aspect, the Aspergillus oryzae insolens DSM 1800 feruloyl esterase (WO 2009/076122), beta-glucosidase variant BG fusion protein is encoded by the 45 Neurospora crassa feruloyl esterase (UniProt accession num polynucleotide of SEQID NO: 67 or the Aspergillus Oryzae ber Q9HGR3), and Neosartorya fischeri feruloyl esterase beta-glucosidase fusion protein is encoded by the polynucle (UniProt Accession number A1D9T4). otide of SEQID NO: 69. Examples of arabinofuranosidases useful in the methods of Other endoglucanases, cellobiohydrolases, and beta-glu the present invention include, but are not limited to, Humicola cosidases are disclosed in numerous Glycosyl Hydrolase 50 insolens DSM 1800 arabinofuranosidase (WO 2009/073383) families using the classification according to Henrissat B., and Aspergillus niger arabinofuranosidase (GeneSeqPacces 1991. A classification of glycosylhydrolases based on amino sion number AAR94170). acid sequence similarities, Biochem. J. 280: 309-316, and Examples of alpha-glucuronidases useful in the methods of Henrissat B., and Bairoch A., 1996, Updating the sequence the present invention include, but are not limited to, Aspergil based classification of glycosylhydrolases, Biochem. J. 316: 55 lus clavatus alpha-glucuronidase (UniProt accession number 695-696. 
alcc12), Trichoderma reesei alpha-glucuronidase (Uniprot Other cellulolytic enzymes that may be used in the present accession number Q99024), Talaromyces emersonii alpha invention are described in EP 495,257, EP 531,315, EP 531, glucuronidase (UniProt accession number Q8X211), 372, WO 89/09259, WO 94/07998, WO 95/24471, WO Aspergillus niger alpha-glucuronidase (Uniprot accession 96/11262, WO 96/29397, WO 96/034108, WO 97/14804, 60 number Q96WX9), Aspergillus terreus alpha-glucuronidase WO 98/08940, WO 98/O12307, WO 98/13465, WO (SwissProt accession number Q0CJP9), and Aspergillus 98/015619, WO 98/015633, WO 98/028411, WO 99/06574, filmigatus alpha-glucuronidase (SwissProt accession number WO 99/10481, WO 99/025846, WO 99/025847, WO Q4WW45). 99/031255, WO 2000/009707, WO 2002/050245, WO 2002/ The enzymes and proteins used in the methods of the 0076792, WO 2002/101078, WO 2003/027306, WO 2003/ 65 present invention may be produced by fermentation of the 052054, WO 2003/052055, WO 2003/052056, WO 2003/ above-noted microbial strains on a nutrient medium contain 052057, WO 2003/052118, WO 2004/016760, WO 2004/ ing Suitable carbon and nitrogen sources and inorganic salts, US 8,426,158 B2 47 48 using procedures known in the art (see, e.g., Bennett, J. W. cell are promoters obtained from the genes for Aspergillus and LaSure, L. (eds.), More Gene Manipulations in Fungi, Oryzae TAKA amylase, Rhizomucor mieheiaspartic protein Academic Press, CA, 1991). Suitable media are available ase, Aspergillus niger neutral alpha-amylase, Aspergillus from commercial Suppliers or may be prepared according to niger acid stable alpha-amylase, Aspergillus niger or published compositions (e.g., in catalogues of the American 5 Aspergillus awamori glucoamylase (glaA), Rhizomucor mie Type Culture Collection). 
Temperature ranges and other con hei lipase, Aspergillus Oryzae alkaline protease, Aspergillus ditions suitable for growth and enzyme production are known Oryzae triose phosphate isomerase, Aspergillus nidulans in the art (see, e.g., Bailey, J. E., and Ollis, D.F., Biochemical acetamidase, Fusarium venematum amyloglucosidase (WO Engineering Fundamentals, McGraw-Hill Book Company, 00/56900), Fusarium venenatum Daria (WO 00/56900), NY, 1986). 10 The fermentation can be any method of cultivation of a cell Fusarium venenatum Quinn (WO 00/56900), Fusarium resulting in the expression or isolation of an enzyme. Fermen Oxysporum trypsin-like protease (WO 96/00787), Tricho tation may, therefore, be understood as comprising shake derma reesei beta-glucosidase, Trichoderma reesei cellobio flask cultivation, or Small- or large-scale fermentation (in hydrolase I, Trichoderma reesei cellobiohydrolase II, Tricho cluding continuous, batch, fed-batch, or Solid state fermenta- 15 derma reesei endoglucanase I, Trichoderma reesei tions) in laboratory or industrial fermentors performed in a endoglucanase II, Trichoderma reesei endoglucanase III, Tri Suitable medium and under conditions allowing the enzyme choderma reesei endoglucanase IV. Trichoderma reesei to be expressed or isolated. The resulting enzymes produced endoglucanase V. Trichoderma reesei Xylanase I, Tricho by the methods described above may be recovered from the derma reesei Xylanase II, Trichoderma reesei beta-xylosi fermentation medium and purified by conventional proce- 20 dase, as well as the NA2-tpi promoter (a modified promoter dures. 
including a gene encoding a neutral alpha-amylase in Nucleic Acid Constructs Aspergilli in which the untranslated leader has been replaced Nucleic acid constructs comprising an isolated polynucle by an untranslated leader from a gene encoding triose phos otide encoding a polypeptide of interest, e.g., one or more phate isomerase in Aspergilli; non-limiting examples include (several) cellulolytic enzymes, a polypeptide having peroxi- 25 modified promoters including the gene encoding neutral dase activity, or a polypeptide having cellulolytic enhancing alpha-amylase in Aspergillus niger in which the untranslated activity, operably linked to one or more (several) control leader has been replaced by an untranslated leader from the sequences may be constructed that direct the expression of the gene encoding triose phosphate isomerase in Aspergillus coding sequence in a Suitable host cell under conditions com nidulans or Aspergillus Oryzae); and mutant, truncated, and patible with the control sequences. 30 hybrid promoters thereof. The isolated polynucleotide may be manipulated in a vari In a yeast host, useful promoters are obtained from the ety of ways to provide for expression of the polypeptide. genes for Saccharomyces cerevisiae enolase (ENO-1), Sac Manipulation of the polynucleotide's sequence prior to its charomyces cerevisiae galactokinase (GAL1), Saccharomy insertion into a vector may be desirable or necessary depend ces cerevisiae alcohol dehydrogenase/glyceraldehyde-3- ing on the expression vector. The techniques for modifying 35 phosphate dehydrogenase (ADH1, ADH2/GAP), polynucleotide sequences utilizing recombinant DNA meth Saccharomyces cerevisiae triose phosphate isomerase (TPI), ods are well known in the art. Saccharomyces cerevisiae metallothionein (CUP1), and Sac The control sequence may be an appropriate promoter charomyces cerevisiae 3-phosphoglycerate kinase. 
Other sequence, a nucleotide sequence that is recognized by a host useful promoters for yeast host cells are described by cell for expression of a polynucleotide encoding a polypep- 40 Romanos et al., 1992, Yeast 8: 423-488. tide of the present invention. The promoter sequence contains The control sequence may also be a suitable transcription transcriptional control sequences that mediate the expression terminator sequence, a sequence recognized by a host cell to of the polypeptide. The promoter may be any nucleotide terminate transcription. The terminator sequence is operably sequence that shows transcriptional activity in the host cell of linked to the 3' terminus of the nucleotide sequence encoding choice including mutant, truncated, and hybrid promoters, 45 the polypeptide. Any terminator that is functional in the host and may be obtained from genes encoding extracellular or cell of choice may be used in the present invention. intracellular polypeptides either homologous or heterologous Preferred terminators for filamentous fungal host cells are to the host cell. obtained from the genes for Aspergillus Oryzae TAKA amy Examples of suitable promoters for directing the transcrip lase, Aspergillus niger glucoamylase, Aspergillus nidulans tion of the nucleic acid constructs of the present invention, 50 anthranilate synthase, Aspergillus niger alpha-glucosidase, especially in a bacterial host cell, are the promoters obtained and Fusarium oxysporum trypsin-like protease. from the E. colilac operon, Streptomyces coelicolor agarase Preferred terminators for yeast host cells are obtained from gene (dagA), Bacillus subtilis levanSucrase gene (sacB), the genes for Saccharomyces cerevisiae enolase, Saccharo Bacillus licheniformis alpha-amylase gene (amyl), Bacillus myces cerevisiae cytochrome C (CYC1), and Saccharomyces Stearothermophilus maltogenic amylase gene (amyM), 55 cerevisiae glyceraldehyde-3-phosphate dehydrogenase. 
Bacillus amyloliquefaciens alpha-amylase gene (amyO), Other useful terminators for yeast host cells are described by Bacillus licheniformis penicillinase gene (penP), Bacillus Romanos et al., 1992, supra. subtilis XylA and XylB genes, and prokaryotic beta-lactamase The control sequence may also be a suitable leader gene (VIIIa-Kamaroff et al., 1978, Proceedings of the sequence, a nontranslated region of an mRNA that is impor National Academy of Sciences USA 75: 3727-3731), as well 60 tant for translation by the host cell. The leader sequence is as the tac promoter (DeBoer et al., 1983, Proceedings of the operably linked to the 5' terminus of the nucleotide sequence National Academy of Sciences USA 80: 21-25). Further pro encoding the polypeptide. Any leader sequence that is func moters are described in “Useful proteins from recombinant tional in the host cell of choice may be used in the present bacteria” in Scientific American, 1980, 242: 74-94; and in invention. Sambrook et al., 1989, supra. 65 Preferred leaders for filamentous fungal host cells are Examples of suitable promoters for directing the transcrip obtained from the genes for Aspergillus Oryzae TAKA amy tion of the nucleic acid constructs in a filamentous fungal host lase and Aspergillus nidulans triose phosphate isomerase. US 8,426,158 B2 49 50 Suitable leaders for yeast host cells are obtained from the converted to a mature active polypeptide by catalytic or auto genes for Saccharomyces cerevisiae enolase (ENO-1), Sac catalytic cleavage of the propeptide from the propolypeptide. charomyces cerevisiae 3-phosphoglycerate kinase, Saccha The propeptide coding sequence may be obtained from the romyces cerevisiae alpha-factor, and Saccharomyces cerevi genes for Bacillus subtilis alkaline protease (aprE), Bacillus siae alcohol dehydrogenase/glyceraldehyde-3-phosphate subtilis neutral protease (nprT), Saccharomyces cerevisiae dehydrogenase (ADH2/GAP). 
alpha-factor, Rhizomucor miehei aspartic proteinase, and The control sequence may also be a polyadenylation Myceliophthora thermophila laccase (WO95/33836). sequence, a sequence operably linked to the 3' terminus of the Where both signal peptide and propeptide sequences are nucleotide sequence and, when transcribed, is recognized by present at the amino terminus of a polypeptide, the propeptide the host cell as a signal to add polyadenosine residues to 10 sequence is positioned next to the amino terminus of a transcribed mRNA. Any polyadenylation sequence that is polypeptide and the signal peptide sequence is positioned functional in the host cell of choice may be used in the present next to the amino terminus of the propeptide sequence. invention. It may also be desirable to add regulatory sequences that Preferred polyadenylation sequences for filamentous fun allow the regulation of the expression of the polypeptide gal host cells are obtained from the genes for Aspergillus 15 relative to the growth of the host cell. Examples of regulatory Oryzae TAKA amylase, Aspergillus niger glucoamylase, systems are those that cause the expression of the gene to be Aspergillus nidulans anthranilate synthase, Fusarium turned on or offin response to a chemical or physical stimu Oxysporum trypsin-like protease, and Aspergillus niger lus, including the presence of a regulatory compound. Regu alpha-glucosidase. latory systems in prokaryotic systems include the lac, tac, and Useful polyadenylation sequences for yeast host cells are trp operator systems. In yeast, the ADH2 system or GAL1 described by Guo and Sherman, 1995, Molecular Cellular system may be used. In filamentous fungi, the TAKA alpha Biology 15:5983-5990. amylase promoter, Aspergillus niger glucoamylase promoter, The control sequence may also be a signal peptide coding and Aspergillus Oryzae glucoamylase promoter may be used sequence that encodes a signal peptide linked to the amino as regulatory sequences. 
Other examples of regulatory terminus of a polypeptide and directs the encoded polypep 25 sequences are those that allow for gene amplification. In tide into the cell's secretory pathway. The 5' end of the coding eukaryotic systems, these regulatory sequences include the sequence of the nucleotide sequence may inherently contain dihydrofolate reductase gene that is amplified in the presence a signal peptide coding sequence naturally linked in transla of methotrexate, and the metallothionein genes that are tion reading frame with the segment of the coding sequence amplified with heavy metals. In these cases, the nucleotide that encodes the secreted polypeptide. Alternatively, the 5' 30 sequence encoding the polypeptide would be operably linked end of the coding sequence may contain a signal peptide with the regulatory sequence. coding sequence that is foreign to the coding sequence. The Expression Vectors foreign signal peptide coding sequence may be required Recombinant expression vectors comprising a polynucle where the coding sequence does not naturally contain a signal otide encoding a polypeptide of interest, e.g., one or more peptide coding sequence. Alternatively, the foreign signal 35 (several) cellulolytic enzymes, a polypeptide having peroxi peptide coding sequence may simply replace the native signal dase activity, or a polypeptide having cellulolytic enhancing peptide coding sequence in order to enhance secretion of the activity, a promoter, and transcriptional and translational stop polypeptide. However, any signal peptide coding sequence signals may be constructed for expression of the polypeptide that directs the expressed polypeptide into the secretory path in a suitable host cell. The various nucleic acids and control way of a host cell of choice, i.e., secreted into a culture 40 sequences described herein may be joined together to pro medium, may be used in the present invention. 
duce a recombinant expression vector that may include one or Effective signal peptide coding sequences for bacterial more (several) convenient restriction sites to allow for inser host cells are the signal peptide coding sequences obtained tion or Substitution of the nucleotide sequence encoding the from the genes for Bacillus NCIB 11837 maltogenic amylase, polypeptide at Such sites. Alternatively, a polynucleotide Bacillus Stearothermophilus alpha-amylase, Bacillus licheni 45 sequence may be expressed by inserting the nucleotide formis subtilisin, Bacillus licheniformis beta-lactamase, sequence or a nucleic acid construct comprising the sequence Bacillus Stearothermophilus neutral proteases (nprT, nprS. into an appropriate vector for expression. In creating the nprM), and Bacillus subtilis prSA. Further signal peptides are expression vector, the coding sequence is located in the vector described by Simonen and Palva, 1993, Microbiological so that the coding sequence is operably linked with the appro Revie's 57: 109-137. 50 priate control sequences for expression. Effective signal peptide coding sequences for filamentous The recombinant expression vector may be any vector fungal host cells are the signal peptide coding sequences (e.g., a plasmidor virus) that can be conveniently subjected to obtained from the genes for Aspergillus Oryzae TAKA amy recombinant DNA procedures and can bring about expression lase, Aspergillus niger neutral amylase, Aspergillus niger of the nucleotide sequence. The choice of the vector will glucoamylase, Rhizomucor miehei aspartic proteinase, 55 typically depend on the compatibility of the vector with the Humicola insolens cellulase, Humicola insolens endogluca host cell into which the vector is to be introduced. The vectors nase V, and Humicola lanuginosa lipase. may be linear or closed circular plasmids. 
Useful signal peptides for yeast host cells are obtained The vector may be an autonomously replicating vector, i.e., from the genes for Saccharomyces cerevisiae alpha-factor a vector that exists as an extrachromosomal entity, the repli and Saccharomyces cerevisiae invertase. Other useful signal 60 cation of which is independent of chromosomal replication, peptide coding sequences are described by Romanos et al., e.g., a plasmid, an extrachromosomal element, a minichro 1992, supra. mosome, or an artificial chromosome. The vector may con The control sequence may also be a propeptide coding tain any means for assuring self-replication. Alternatively, the sequence that encodes a propeptide positioned at the amino vector may be one that, when introduced into the host cell, is terminus of a polypeptide. The resultant polypeptide is 65 integrated into the genome and replicated together with the known as a proenzyme or propolypeptide (or a Zymogen in chromosome(s) into which it has been integrated. Further Some cases). A propeptide is generally inactive and can be more, a single vector or plasmid or two or more vectors or US 8,426,158 B2 51 52 plasmids that together contain the total DNA to be introduced 917.5; WO 00/24883). Isolation of the AMA1 gene and con into the genome of the host cell, or a transposon, may be used. struction of plasmids or vectors comprising the gene can be The vectors preferably contain one or more (several) accomplished according to the methods disclosed in WO selectable markers that permit easy selection of transformed, OOf 24883. transfected, transduced, or the like cells. A selectable marker 5 More than one copy of a polynucleotide may be inserted is a gene the product of which provides for biocide or viral into a host cell to increase production of the gene product. An resistance, resistance to heavy metals, prototrophy to aux increase in the copy number of the polynucleotide can be otrophs, and the like. 
obtained by integrating at least one additional copy of the Examples of bacterial selectable markers are the dal genes sequence into the host cell genome or by including an ampli from Bacillus subtilis or Bacillus licheniformis, or markers 10 fiable selectable marker gene with the polynucleotide where that confer antibiotic resistance Such as amplicillin, kanamy cells containing amplified copies of the selectable marker cin, chloramphenicol, or tetracycline resistance. Suitable gene, and thereby additional copies of the polynucleotide, can markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, be selected for by cultivating the cells in the presence of the MET3, TRP1, and URA3. Selectable markers for use in a appropriate selectable agent. filamentous fungal host cell include, but are not limited to, 15 The procedures used to ligate the elements described above amdS (acetamidase), argB (ornithine carbamoyltransferase), to construct the recombinant expression vectors are well bar (phosphinothricin acetyltransferase), hph (hygromycin known to one skilled in the art (see, e.g., Sambrook et al., phosphotransferase), nial) (nitrate reductase), pyrC (oroti 1989, supra). dine-5'-phosphate decarboxylase), SC (Sulfate adenyltrans Host Cells ferase), and trpC (anthranilate synthase), as well as equiva The nucleic acid constructs or expression vectors compris lents thereof. Preferred for use in an Aspergillus cell are the ing an isolated polynucleotide encoding a polypeptide of amdS and pyrC genes of Aspergillus nidulans or Aspergillus interest, e.g., one or more (several) cellulolytic enzymes, a Oryzae and the bar gene of Streptomyces hygroscopicus. 
polypeptide having peroxidase activity, or a polypeptide hav The vectors preferably contain an element(s) that permits ing cellulolytic enhancing activity, may be introduced into integration of the vector into the host cell's genome or 25 recombinant host cells for the recombinant production of the autonomous replication of the vector in the cell independent polypeptides. A vector comprising a polynucleotide is intro of the genome. duced into a host cell so that the vector is maintained as a For integration into the host cell genome, the vector may chromosomal integrant or as a self-replicating extra-chromo rely on the polynucleotide's sequence encoding the polypep somal vector as described earlier. The term "host cell encom tide or any other element of the vector for integration into the 30 passes any progeny of a parent cell that is not identical to the genome by homologous or nonhomologous recombination. parent cell due to mutations that occur during replication. The Alternatively, the vector may contain additional nucleotide choice of a host cell will to a large extent depend upon the sequences for directing integration by homologous recombi gene encoding the polypeptide and its source. nation into the genome of the host cell at a precise location(s) The host cell may be any cell useful in the recombinant in the chromosome(s). To increase the likelihood of integra 35 production of a polypeptide, e.g., a prokaryote or a eukaryote. tion at a precise location, the integrational elements should The prokaryotic host cell may be any Gram positive bac preferably contain a sufficient number of nucleic acids, Such terium or a Gram negative bacterium. 
Gram positive bacteria as 100 to 10,000 base pairs, preferably 400 to 10,000 base include, but not limited to, Bacillus, Streptococcus, Strepto pairs, and most preferably 800 to 10,000 base pairs, which myces, Staphylococcus, Enterococcus, Lactobacillus, Lacto have a high degree of identity to the corresponding target 40 coccus, Clostridium, Geobacillus, and Oceanobacillus. sequence to enhance the probability of homologous recom Gram negative bacteria include, but not limited to, E. coli, bination. The integrational elements may be any sequence Pseudomonas, Salmonella, Campylobacter; Helicobacter, that is homologous with the target sequence in the genome of Flavobacterium, Fusobacterium, Ilvobacter; Neisseria, and the host cell. Furthermore, the integrational elements may be Ureaplasma. non-encoding or encoding nucleotide sequences. On the 45 The bacterial host cell may be any Bacillus cell. Bacillus other hand, the vector may be integrated into the genome of cells include, but are not limited to, Bacillus alkalophilus, the host cell by non-homologous recombination. Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circu For autonomous replication, the vector may further com lans, Bacillus clausii, Bacillus coagulans, Bacillus firmus, prise an origin of replication enabling the vector to replicate Bacillus lautus, Bacillus lentus, Bacillus licheniformis, autonomously in the host cell in question. The origin of rep 50 Bacillus megaterium, Bacillus pumilus, Bacillus Stearother lication may be any plasmid replicator mediating autono mophilus, Bacillus subtilis, and Bacillus thuringiensis cells. mous replication that functions in a cell. The term “origin of In a preferred aspect, the bacterial host cell is a Bacillus replication' or “plasmid replicator is defined herein as a amyloliquefaciens, Bacillus lentus, Bacillus licheniformis, nucleotide sequence that enables a plasmid or vector to rep Bacillus Stearothermophilus or Bacillus subtilis cell. 
licate in vivo.

Examples of bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060, and pAMβ1 permitting replication in Bacillus.

Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and CEN6.

Examples of origins of replication useful in a filamentous fungal cell are AMA1 and ANS1 (Gems et al., 1991, Gene 98: 61-67; Cullen et al., 1987, Nucleic Acids Research 15: 9163

In a more preferred aspect, the bacterial host cell is a Bacillus amyloliquefaciens cell. In another more preferred aspect, the bacterial host cell is a Bacillus clausii cell. In another more preferred aspect, the bacterial host cell is a Bacillus licheniformis cell. In another more preferred aspect, the bacterial host cell is a Bacillus subtilis cell.

The bacterial host cell may also be any Streptococcus cell. Streptococcus cells include, but are not limited to, Streptococcus equisimilis, Streptococcus pyogenes, Streptococcus uberis, and Streptococcus equi subsp. Zooepidemicus cells.

In a preferred aspect, the bacterial host cell is a Streptococcus equisimilis cell. In another preferred aspect, the bacterial host cell is a Streptococcus pyogenes cell. In another preferred aspect, the bacterial host cell is a Streptococcus uberis cell. In another preferred aspect, the bacterial host cell is a Streptococcus equi subsp. Zooepidemicus cell.

The bacterial host cell may also be any Streptomyces cell. Streptomyces cells include, but are not limited to, Streptomyces achromogenes, Streptomyces avermitilis, Streptomyces coelicolor, Streptomyces griseus, and Streptomyces lividans cells.

In a preferred aspect, the bacterial host cell is a Streptomyces achromogenes cell. In another preferred aspect, the bacterial host cell is a Streptomyces avermitilis cell. In another preferred aspect, the bacterial host cell is a Streptomyces coelicolor cell. In another preferred aspect, the bacterial host cell is a Streptomyces griseus cell. In another preferred aspect, the bacterial host cell is a Streptomyces lividans cell.

The introduction of DNA into a Bacillus cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168: 111-115), by using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56: 209-221), by electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques 6: 742-751), or by conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169: 5271-5278). The introduction of DNA into an E. coli cell may, for instance, be effected by protoplast transformation (see, e.g., Hanahan, 1983, J. Mol. Biol. 166: 557-580) or electroporation (see, e.g., Dower et al., 1988, Nucleic Acids Res. 16: 6127-6145). The introduction of DNA into a Streptomyces cell may, for instance, be effected by protoplast transformation and electroporation (see, e.g., Gong et al., 2004, Folia Microbiol. (Praha) 49: 399-405), by conjugation (see, e.g., Mazodier et al., 1989, J. Bacteriol. 171: 3583-3585), or by transduction (see, e.g., Burke et al., 2001, Proc. Natl. Acad. Sci. USA 98: 6289-6294). The introduction of DNA into a Pseudomonas cell may, for instance, be effected by electroporation (see, e.g., Choi et al., 2006, J. Microbiol. Methods 64: 391-397) or by conjugation (see, e.g., Pinedo and Smets, 2005, Appl. Environ. Microbiol. 71: 51-57). The introduction of DNA into a Streptococcus cell may, for instance, be effected by natural competence (see, e.g., Perry and Kuramitsu, 1981, Infect. Immun. 32: 1295-1297), by protoplast transformation (see, e.g., Catt and Jollick, 1991, Microbios. 68: 189-207), by electroporation (see, e.g., Buckley et al., 1999, Appl. Environ. Microbiol. 65: 3800-3804) or by conjugation (see, e.g., Clewell, 1981, Microbiol. Rev. 45: 409-436). However, any method known in the art for introducing DNA into a host cell can be used.

The host cell may also be a eukaryote, such as a mammalian, insect, plant, or fungal cell.

In a preferred aspect, the host cell is a fungal cell. "Fungi" as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra).

In a more preferred aspect, the fungal host cell is a yeast cell. "Yeast" as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980).

In an even more preferred aspect, the yeast host cell is a Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell.

In a most preferred aspect, the yeast host cell is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis, or Saccharomyces oviformis cell. In another most preferred aspect, the yeast host cell is a Kluyveromyces lactis cell. In another most preferred aspect, the yeast host cell is a Yarrowia lipolytica cell.

In another more preferred aspect, the fungal host cell is a filamentous fungal cell. "Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are generally characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.

In an even more preferred aspect, the filamentous fungal host cell is an Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes, or Trichoderma cell.

In a most preferred aspect, the filamentous fungal host cell is an Aspergillus awamori, Aspergillus fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger or Aspergillus oryzae cell. In another most preferred aspect, the filamentous fungal host cell is a Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides, or Fusarium venenatum cell. In another most preferred aspect, the filamentous fungal host cell is a Bjerkandera adusta, Ceriporiopsis aneirina, Ceriporiopsis caregiea, Ceriporiopsis gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa, Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Chrysosporium keratinophilum, Chrysosporium lucknowense, Chrysosporium tropicum, Chrysosporium merdarium, Chrysosporium inops, Chrysosporium pannicola, Chrysosporium queenslandicum, Chrysosporium zonatum, Coprinus cinereus, Coriolus hirsutus, Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Phanerochaete chrysosporium, Phlebia radiata, Pleurotus eryngii, Thielavia terrestris, Trametes villosa, Trametes versicolor, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride cell.

Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for transformation of Aspergillus and Trichoderma host cells are described in EP 238 023 and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-1474. Suitable methods for transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156, and WO 96/00787.
Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J. N. and Simon, M. I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al., 1978, Proceedings of the National Academy of Sciences USA 75: 1920.

Methods of Production

A polypeptide of interest, e.g., one or more (several) cellulolytic enzymes, a polypeptide having peroxidase activity, or a polypeptide having cellulolytic enhancing activity, can be produced by (a) cultivating a cell, which in its wild-type form produces the polypeptide, under conditions conducive for production of the polypeptide; and (b) recovering the polypeptide.

The polypeptide of interest can also be produced by (a) cultivating a recombinant host cell, as described herein, under conditions conducive for production of the polypeptide; and (b) recovering the polypeptide.

In the production methods, the cells are cultivated in a nutrient medium suitable for production of the polypeptide using methods well known in the art. For example, the cell may be cultivated by shake flask cultivation, and small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted into the medium, it can be recovered from cell lysates.

The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme assay may be used to determine the activity of the polypeptide as described herein.

The resulting polypeptide may be recovered using methods known in the art. For example, the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.

The polypeptides may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989) to obtain substantially pure polypeptides.

The present invention is further described by the following examples that should not be construed as limiting the scope of the invention.

EXAMPLES

Example 1

Growth of Myceliophthora thermophila CBS 117.65

Two plugs from a PDA plate of Myceliophthora thermophila CBS 117.65 were inoculated into a 500 ml shake flask containing 100 ml of shake flask medium to obtain culture broth for the purification of a cellobiose dehydrogenase. PDA plates were composed of 39 g of potato dextrose agar and deionized water to 1 liter. The shake flask medium was composed of 15 g of glucose, 4 g of K2HPO4, 1 g of NaCl, 0.2 g of MgSO4·7H2O, 2 g of MES free acid, 1 g of Bacto Peptone, 5 g of yeast extract, 2.5 g of citric acid, 0.2 g of CaCl2·H2O, 5 g of NH4NO3, 1 ml of trace elements solution, and deionized water to 1 liter. The trace elements solution was composed of 1.2 g of FeSO4·7H2O, 10 g of ZnSO4·7H2O, 0.7 g of MnSO4·H2O, 0.4 g of CuSO4·5H2O, 0.4 g of Na2B4O7·10H2O, 0.8 g of Na2MoO4·2H2O, and deionized water to 1 liter. The shake flask was incubated at 45° C. on an orbital shaker at 200 rpm for 48 hours. Fifty ml of the shake flask broth was used to inoculate a 2 liter fermentation vessel.

A total of 1.8 liters of the fermentation batch medium was added to a two liter glass jacketed fermentor (Applikon Biotechnology, Schiedam, Netherlands). Fermentation feed medium was dosed at a rate of 4 g/l/hour for a period of 72 hours. Fermentation batch medium was composed of 5 g of yeast extract, 176 g of powdered cellulose, 2 g of glucose, 1 g of NaCl, 1 g of Bacto Peptone, 4 g of K2HPO4, 0.2 g of CaCl2·2H2O, 0.2 g of MgSO4·7H2O, 2.5 g of citric acid, 5 g of NH4NO3, 1.8 ml of anti-foam, 1 ml of trace elements solution, and deionized water to 1 liter. The fermentation vessel was maintained at a temperature of 45° C. and pH was controlled using an Applikon 1030 control system (Applikon Biotechnology, Schiedam, Netherlands) to a set-point of 5.6 ± 0.1. Air was added to the vessel at a rate of 1 vvm and the broth was agitated by a Rushton impeller rotating at 1100 to 1300 rpm. At the end of the fermentation, whole broth was harvested from the vessel and centrifuged at 3000×g to remove the biomass.

Example 2

Purification of Myceliophthora thermophila CBS 117.65 Cellobiose Dehydrogenase

The Myceliophthora thermophila CBS 117.65 harvested broth described in Example 1 was centrifuged in 500 ml bottles at 13,000×g for 20 minutes at 4° C. and then sterile filtered using a 0.22 µm polyethersulfone membrane (Millipore, Bedford, Mass., USA). The filtered broth was concentrated and buffer exchanged with 20 mM Tris-HCl pH 8.5 using a tangential flow concentrator (Pall Filtron, Northborough, Mass., USA) equipped with a 10 kDa polyethersulfone membrane (Pall Filtron, Northborough, Mass., USA). To decrease the amount of pigment, the concentrate was applied to a 60 ml Q-SEPHAROSE BIG BEAD™ column (GE Healthcare, Piscataway, N.J., USA) equilibrated with 20 mM Tris-HCl pH 8.5, and eluted stepwise with equilibration buffer containing 600 mM NaCl. Flow-through and eluate fractions were analyzed by SDS-PAGE using 8-16% CRITERION™ SDS-PAGE gels (Bio-Rad Laboratories, Inc., Hercules, Calif., USA) and stained with GELCODE™ Blue protein stain (Thermo Fisher Scientific, Waltham, Mass., USA). The eluate fraction contained cellobiose dehydrogenase (CBDH) as judged by the presence of a band corresponding to the apparent molecular weight of approximately 100 kDa by SDS-PAGE (Schou et al., 1998, Biochem. J. 330: 565-571).

The eluate fraction was concentrated using an AMICON™ ultrafiltration device (Millipore, Bedford, Mass., USA) equipped with a 10 kDa polyethersulfone membrane, and buffer-exchanged into 20 mM Tris-HCl pH 8.5 using a HIPREP® 26/10 desalting column (GE Healthcare, Piscataway, N.J., USA). The desalted material was loaded onto a MONO Q® column (HR 16/10, GE Healthcare, Piscataway, N.J., USA) equilibrated with 20 mM Tris-HCl pH 8.5. Bound proteins were eluted with a linear NaCl gradient from 0 to 500 mM (18 column volumes) in 20 mM Tris-HCl pH 8.5. Fractions were analyzed by SDS-PAGE as described above and the cellobiose dehydrogenase eluted at approximately 350-400 mM NaCl.

Fractions containing cellobiose dehydrogenase were pooled (60 ml) and mixed with an equal volume of 20 mM Tris-HCl pH 7.5 containing 3.4 M ammonium sulfate to yield a final concentration of 1.7 M ammonium sulfate. The sample was filtered (0.2 µm syringe filter, polyethersulfone membrane, Whatman, Maidstone, United Kingdom) to remove particulate matter prior to loading onto a Phenyl Superose column (HR 16/10, GE Healthcare, Piscataway, N.J., USA) equilibrated with 1.7 M ammonium sulfate in 20 mM Tris-HCl pH 7.5. Bound proteins were eluted with a decreasing 1.7 to 0 M ammonium sulfate gradient (12 column volumes) in 20 mM Tris-HCl pH 7.5. Fractions were analyzed by SDS-PAGE as described above and the cellobiose dehydrogenase eluted at approximately 800 mM ammonium sulfate. The cellobiose dehydrogenase fraction was >90% pure as judged by SDS-PAGE. CBDH activity was confirmed by a 2,6-dichlorophenolindophenol (DCIP) reduction assay in the presence of cellobiose, as described by Schou et al., 1998, supra.

Fractions containing cellobiose dehydrogenase were pooled, concentrated, and buffer exchanged into 20 mM Tris-HCl pH 7.5 by centrifugal concentration in a SORVALL® RT7 centrifuge (Thermo Fisher Scientific, Waltham, Mass., USA) using VIVASPIN™ 20 centrifugal concentrators (10 kDa polyethersulfone membrane; Sartorius, Göttingen, Germany) at 1877×g. Protein concentration was determined using a Microplate BCA™ Protein Assay Kit (Thermo Fisher Scientific, Waltham, Mass., USA) in which bovine serum albumin was used as a protein standard.

Example 3

Pretreatment of Corn Stover

Corn stover was pretreated at the U.S. Department of Energy National Renewable Energy Laboratory (NREL) using dilute sulfuric acid. The following conditions were used for the pretreatment: 1.4 wt % sulfuric acid at 165° C. and 107 psi for 8 minutes. According to NREL, the water-insoluble solids in the pretreated corn stover (PCS) contained 56.5% cellulose, 4.6% hemicellulose and 28.4% lignin. Cellulose and hemicellulose were determined by a two-stage sulfuric acid hydrolysis with subsequent analysis of sugars by high performance liquid chromatography using NREL Standard Analytical Procedure #002. Lignin was determined gravimetrically after hydrolyzing the cellulose and hemicellulose fractions with sulfuric acid using NREL Standard Analytical Procedure #003. The PCS was washed with a large volume of DDI water on a glass filter.

Example 4

The Effect of Peroxidase on Hydrolysis of Pretreated Corn Stover (PCS) in the Presence and Absence of Cellobiose Dehydrogenase

The effect of horseradish peroxidase (HRP) on hydrolysis of PCS was evaluated in the presence and absence of Myceliophthora thermophila CBS 117.65 cellobiose dehydrogenase (CDH).

The hydrolysis of PCS was performed using 2.2 ml deep well plates (Axygen, Union City, Calif., USA) in a total reaction volume of 1.0 ml. The hydrolysis was performed with 50 mg of unwashed PCS per ml of 50 mM sodium acetate pH 5.0 buffer containing 1 mM manganese sulfate and a Trichoderma reesei cellulase composition (CELLUCLAST® supplemented with Aspergillus oryzae beta-glucosidase available from Novozymes A/S, Bagsvaerd, Denmark; the cellulase composition is designated herein in the Examples as "Trichoderma reesei cellulase composition") at 4 mg per g of PCS. Cellobiose dehydrogenase was added at concentrations between 0 and 10% (w/w) of total protein. The plate was then sealed using an ALPS-300™ plate heat sealer (Abgene, Epsom, United Kingdom), mixed thoroughly, and incubated at 50° C. for 72 hours with shaking at 150 rpm. All experiments were performed in triplicate.

At various time points between 24 and 72 hours of incubation, 100 µl aliquots were removed and the extent of hydrolysis was assayed by HPLC using the protocol described below. Cellobiose dehydrogenase-dependent cellulase inhibition was established, and then horseradish peroxidase was added to eliminate any peroxide produced. Horseradish peroxidase (Invitrogen, Carlsbad, Calif., USA) was added at final concentrations indicated in FIG. 1 from a stock solution of 1 unit per µl. One unit of horseradish peroxidase is defined as that quantity of enzyme necessary to form 1.0 mg of purpurogallin from pyrogallol in 20 seconds at pH 6.0 at 20° C. High concentrations of horseradish peroxidase were added to ensure sufficient peroxidase activity was present. No Amplex Red or other electron acceptor was added.

For HPLC analysis, samples were filtered with a 0.45 µm MULTISCREEN® 96-well filter plate (Millipore, Bedford, Mass., USA) and filtrates analyzed for sugar content as described below. When not used immediately, filtered aliquots were frozen at -20° C. The sugar concentrations of samples diluted in 0.005 M H2SO4 were measured using a 4.6×250 mm AMINEX® HPX-87H column (Bio-Rad Laboratories, Inc., Hercules, Calif., USA) by elution with 0.5% w/w benzoic acid-5 mM H2SO4 at a flow rate of 0.6 ml per minute at 65° C. for 11 minutes, and quantitation by integration of glucose and cellobiose signals from refractive index detection (CHEMSTATION®, AGILENT® 1100 HPLC, Agilent Technologies, Santa Clara, Calif., USA) calibrated by pure sugar samples. The resultant equivalents were used to calculate the percentage of cellulose conversion for each reaction. The extent of each hydrolysis was determined as the fraction of total cellulose converted to cellobiose+glucose, and was not corrected for soluble sugars present in PCS liquor.

All HPLC data processing was performed using Kaleidagraph software (Synergy Software, Reading, Pa., USA). Measured sugar concentrations were adjusted for the appropriate dilution factor. Glucose and cellobiose were chromatographically separated and integrated and their respective concentrations determined independently. However, to calculate total conversion the glucose and cellobiose values were combined. Fractional hydrolysis was determined as the overall mass conversion to (glucose+cellobiose)/total cellulose. Triplicate data points were averaged and standard deviation was calculated.

The results (FIG. 1) demonstrated that increasing concentrations of M. thermophila cellobiose dehydrogenase led to reduced hydrolysis. Under these specific hydrolysis conditions, the addition of 10% cellobiose dehydrogenase resulted in approximately 5% loss of hydrolysis. At 4% cellobiose dehydrogenase, where the Trichoderma reesei cellulase composition was modestly inhibited, the extent of hydrolysis was completely restored at the lowest concentration of 1 unit of horseradish peroxidase per ml.

Example 5

The Effect of Horseradish Peroxidase on Hydrolysis of Pretreated Corn Stover (PCS)

The effect of horseradish peroxidase (HRP) on PCS hydrolysis by the Trichoderma reesei cellulase composition was determined in the absence of Myceliophthora thermophila CBS 117.65 cellobiose dehydrogenase.

The hydrolysis of PCS was conducted using 2.2 ml deep well plates in a total reaction volume of 1.0 ml. The hydrolysis was performed with 50 mg of washed PCS per ml of 50 mM sodium acetate pH 5.0 buffer containing 1 mM manganese sulfate and the Trichoderma reesei cellulase composition at 4 mg per g of PCS. Horseradish peroxidase was added at 0-4 units per ml. The plate was then sealed using an ALPS-300™ plate heat sealer, mixed thoroughly, and incubated at 50° C. for 72 hours with shaking at 150 rpm. All experiments were performed in triplicate.

HPLC analysis of the extent of hydrolysis was performed according to the procedure described in Example 4.

The results (FIG. 2) demonstrated that the extent of hydrolysis increased linearly with horseradish peroxidase activity over the range of peroxidase activity tested, and the dependence of the extent of hydrolysis on horseradish peroxidase activity was approximately 2-fold higher at 3 days of incubation. Addition of horseradish peroxidase at 4 units per ml increased hydrolysis approximately 10% over reactions performed without horseradish peroxidase, both at day 1 and day 3 of hydrolysis.

Example 6

Preparation of Coprinus cinereus Peroxidase

Coprinus cinereus peroxidase was purified as described by WO 1992/016634, and Xu et al., 2003, "Fusion proteins containing Coprinus cinereus peroxidase and the cellulose binding domain of Humicola insolens family 45 endoglucanase" in Application of Enzymes to Lignocellulosics (Mansfield, S. D. and Saddler, J. N. eds.) pp. 382-402, American Chemical Society, Washington, D.C. The purification scheme comprised ultrafiltration and anion-exchange chromatography. Cell-free broth of a Coprinus cinereus peroxidase (pH 7.7, 11 mS conductivity) was filtered with Whatman #2 paper and ultrafiltered with a polyethersulfone membrane (30 kD molecular weight cutoff). The washed and concentrated broth (pH 7.7, 1 mS) was then loaded onto a Q-SEPHAROSE BIG BEAD™ column pre-equilibrated with 5 mM CaCl2-10 mM Tris-HCl pH 7.6 (Buffer A). The active fraction eluted by 5% Buffer B (Buffer A plus 2 M NaCl) was washed (with 5 mM CaCl2) to 1 mS, then applied to a MONO-Q™ column (GE Healthcare, Piscataway, N.J., USA) equilibrated with Buffer A. Buffer B was used again for the elution. Fractions were analyzed for peroxidase activity and by SDS-PAGE. Specific peroxidase activity was assayed at 30° C. with 0.1 M sodium phosphate pH 7, 0.9 mM H2O2, and 1.7 mM 2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS), by monitoring the absorption increase at 418 nm. A stock concentration of 630 µM peroxidase was used.

Example 7

The Effect of Various Peroxidases on Hydrolysis of Pretreated Corn Stover (PCS)

The effect of various peroxidases on PCS hydrolysis by the Trichoderma reesei cellulase composition was determined. The peroxidases included manganese peroxidase (Sigma Chemical Co., St. Louis, Mo., USA), bovine milk lactoperoxidase (Sigma Chemical Co., St. Louis, Mo., USA), lignin peroxidase (Sigma Chemical Co., St. Louis, Mo., USA), and Coprinus cinereus peroxidase (Example 6). One unit of manganese peroxidase is defined as the amount of enzyme necessary to oxidize 1 µmole of Mn2+ per minute to Mn3+ at pH 4.5 and 25° C. One unit of lactoperoxidase is defined as that quantity of enzyme necessary to form 1.0 mg of purpurogallin from pyrogallol in 20 seconds at pH 6.0 and 20° C. One unit of lignin peroxidase is defined as the amount of enzyme necessary to oxidize 1 µmole of 3,4-dimethoxybenzyl alcohol per minute at pH 3.0 and 30° C. One unit of Coprinus cinereus peroxidase is defined as the quantity of enzyme necessary to consume 1 µmole of H2O2 per minute.

The hydrolysis of PCS was conducted using 2.2 ml deep well plates in a total reaction volume of 1.0 ml. The hydrolysis was performed with 50 mg of washed PCS per ml of 50 mM sodium acetate pH 5.0 containing 1 mM manganese sulfate and the Trichoderma reesei cellulase composition at 4 mg per g of PCS. Manganese peroxidase, lactoperoxidase, lignin peroxidase, and Coprinus cinereus peroxidase were added at volumes of 0-100 µl, to give the final concentrations indicated in FIG. 3, from the following stock solutions: Mn peroxidase: 0.005 unit per µl, 50 µg per µl; lignin peroxidase: 0.04 unit per µl, 20 µg per µl; lactoperoxidase: 0.2 unit per µl, 5 µg per µl; and Coprinus cinereus peroxidase: 630 µM. The plate was then sealed using an ALPS-300™ plate heat sealer, mixed thoroughly, and incubated at 50° C. for 72 hours with shaking at 150 rpm. All experiments were performed in triplicate.

HPLC analysis of the extent of hydrolysis was performed according to the procedure described in Example 5.

The results (FIG. 3) demonstrated that each of the peroxidases, with the exception of manganese peroxidase, enhanced PCS hydrolysis in a concentration-dependent manner. The effect did not saturate under the concentrations utilized.

The invention described and claimed herein is not to be limited in scope by the specific aspects herein disclosed, since these aspects are intended as illustrations of several aspects of the invention. Any equivalent aspects are intended to be within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. In the case of conflict, the present disclosure including definitions will control.
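By way of illustration only, the fractional hydrolysis calculation described in Example 4 (glucose plus cellobiose relative to total cellulose, adjusted for dilution, with triplicates averaged) may be sketched as follows. The sketch is not part of the disclosed procedure. The function names, the mass conversion factors for cellobiose and cellulose, and the application of the 56.5% cellulose content of the water-insoluble solids to the full 50 mg of PCS per ml are assumptions made here for the example; the Examples do not state them.

```python
from statistics import mean, stdev

# Assumed mass factors (not stated in the Examples): grams of glucose
# obtainable per gram of cellobiose and per gram of cellulose, from
# molar masses of glucose (180.16), cellobiose (342.30) and the
# anhydroglucose unit of cellulose (162.14).
GLUCOSE_PER_CELLOBIOSE = 2 * 180.16 / 342.30
GLUCOSE_PER_CELLULOSE = 180.16 / 162.14


def fractional_hydrolysis(glucose_g_per_l, cellobiose_g_per_l,
                          dilution_factor=1.0,
                          pcs_g_per_l=50.0, cellulose_fraction=0.565):
    """Fraction of cellulose recovered as glucose plus cellobiose.

    Concentrations are those measured by HPLC in the diluted sample;
    dilution_factor scales them back to the hydrolysis reaction.
    pcs_g_per_l (50 mg per ml) and cellulose_fraction (56.5%) are taken
    from Examples 3 and 4; treating the whole PCS loading as
    water-insoluble solids is an assumption of this sketch.
    """
    glucose_equivalents = dilution_factor * (
        glucose_g_per_l + cellobiose_g_per_l * GLUCOSE_PER_CELLOBIOSE
    )
    theoretical_glucose = pcs_g_per_l * cellulose_fraction * GLUCOSE_PER_CELLULOSE
    return glucose_equivalents / theoretical_glucose


def summarize_triplicate(fractions):
    """Mean and sample standard deviation of replicate conversions."""
    return mean(fractions), stdev(fractions)


if __name__ == "__main__":
    # Hypothetical HPLC readings (g/l glucose, g/l cellobiose) for one
    # triplicate, measured after a hypothetical 5-fold dilution.
    readings = [(3.10, 0.20), (3.05, 0.22), (3.20, 0.18)]
    fractions = [fractional_hydrolysis(g, c, dilution_factor=5.0)
                 for g, c in readings]
    avg, sd = summarize_triplicate(fractions)
    print(f"fractional hydrolysis: {avg:.3f} +/- {sd:.3f}")
```

As in Example 4, the sketch makes no correction for soluble sugars already present in the PCS liquor.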

<400> SEQUENCE: 2

Met Lys Ser Phe Thr Ile Ala Ala Leu Ala Ala Leu Trp Ala Gln Glu 1 5 10 15

Ala Ala Ala His Ala Thr Phe Gln Asp Leu Trp Ile Asp Gly Val Asp 20 25

Tyr Gly Ser Gln Cys Val Arg Leu Pro Ala Ser Asn Ser Pro Val Thr 35 40 45

Asn Val Ala Ser Asp Asp Ile Arg Cys Asn Val Gly Thr Ser Arg Pro 50 55 60

Thr Val Lys Cys Pro Val Lys Ala Gly Ser Thr Val Thr Ile Glu Met 65 70 75 80

His Gln Gln Pro Gly Asp Arg Ser Cys Ala Asn Glu Ala Ile Gly Gly 85 90 95

Asp His Tyr Gly Pro Val Met Val Tyr Met Ser Val Asp Asp Ala 100 105 110

Val Thr Ala Asp Gly Ser Ser Gly Trp Phe Lys Val Phe Gln Asp Ser 115 120 125

Trp Ala Lys Asn Pro Ser Gly Ser Thr Gly Asp Asp Asp Trp Gly 130 135 140

Thr Lys Asp Leu Asn Ser Cys Cys Gly Lys Met Asn Val Ile Pro 145 150 155 160

Glu Asp Ile Glu Pro Gly Asp Tyr Leu Leu Arg Ala Glu Val Ile Ala 165 170 175

Leu His Val Ala Ala Ser Ser Gly Gly Ala Gln Phe Met Ser Cys 180 185 190

Tyr Gln Leu Thr Val Thr Gly Ser Gly Ser Ala Thr Pro Ser Thr Val 195 200

Asn Phe Pro Gly Ala Tyr Ser Ala Ser Asp Pro Gly Ile Leu Ile Asn 210 215 220

Ile His Ala Pro Met Ser Thr Tyr Val Val Pro Gly Pro Thr Val Tyr 225 230 235 240

Ala Gly Gly Ser Thr Lys Ser Ala Gly Ser Ser Ser Gly Cys Glu 245 250 255

Ala Thr Cys Thr Val Gly Ser Gly Pro Ser Ala Thr Leu Thr Gln Pro 260 265 270

Thr Ser Thr Ala Thr Ala Thr Ser Ala Pro Gly Gly Gly Gly Ser Gly 275 280 285

Cys Thr Ala Ala Lys Tyr Gln Gln Cys Gly Gly Thr Gly Thr Gly 290 295 300

Cys Thr Thr Cys Ala Ser Gly Ser Thr Cys Ser Ala Val Ser Pro Pro 305 310 315 320

Tyr Tyr Ser Gln Cys Leu 325

<210> SEQ ID NO 3
<211> LENGTH: 880
<212> TYPE: DNA
<213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 3

accc.cgggat cactgc.ccct aggaaccagc acacct cqgt c caat catgc ggttctgacgc 60

cct citcc.gcc ct cqct cittg cqc.cgcttgt gigotggccac c cagotacat 120

catcggcggc aaaacct atc cc.ggctacga gggct tctic cctgcct cqa gcc.cgc.cgac 180

gatcCagtac Cagtggc.ccg act acaac cc gaccctgagc gtgaccgacc cgaagatgcg 240

Ctgcaacggc ggcaccitcgg cagagcticag cgc.gc.ccgtc Caggc.cggcg agaacgtgac

aag cagtgga cccaccagca aggcc.ccgtc atggtctgga tgttcaagtg 360 cc.ccggcgac ttct cqtcgt. gcc acggcga cggcaagggc tggttcaaga tcqaccagot gggCCtgttgg ggcaacaacc t caact cqaa Caactggggg accgcgatcg tctacaagac

Cctic cagtgg agcaa.ccc.ga tcc ccaagaa Cct cogcc.g ggcaact acc t catcc.gc.ca 54 O cgagctgctic gcc ctdcacc aggccaacac gcc.gcagttc tacgc.cgagt gcgcc.ca.gct ggtcgt.ct Co ggcagoggct cc.gc.cctgcc cc.cgt.ccgac tacct Ctaca gcatcc.ccgt. 660 citacgc.gc.cc Cagaacgacc ccggcatcac cgtgagtggg citt cogttcc gcggcgagct 72 O

Ctgtggaaat Cttgctgacg atgggctagg ttgacat cita Caacggcggg cit tacct cott acacccc.gc.c cggcggcc cc gtctggtctg gct tcgagtt ttaggcgcat 84 O gctacgaggg gaaggcatct gttcgcatga gcgtggg tac

<210> SEQ ID NO 4
<211> LENGTH: 239
<212> TYPE: PRT
<213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 4

Met Arg Phe Asp Ala Leu Ser Ala Leu Ala Leu Ala Pro Leu Val Ala 1 5 10 15

Gly His Gly Ala Val Thr Ser Tyr Ile Ile Gly Gly Thr Tyr Pro 25

Gly Glu Gly Phe Ser Pro Ala Ser Ser Pro Pro Thr Ile Gln Tyr 35 40 45

Gln Pro Asp Tyr Asn Pro Thr Leu Ser Val Thr Asp Pro Lys Met 55 60

Arg Asn Gly Gly Thr Ser Ala Glu Leu Ser Ala Pro Val Gln Ala 65 70 75

Gly Glu Asn Val Thr Ala Val Trp Lys Gln Trp Thr His Gln Gln Gly 85 90 95

Pro Val Met Val Trp Met Phe Lys Cys Pro Gly Asp Phe Ser Ser Ser 105 110

His Gly Asp Gly Lys Gly Trp Phe Lys Ile Asp Gln Leu Gly Leu Trp 115 120 125

Gly Asn Asn Leu Asn Ser Asn Asn Trp Gly Thr Ala Ile Val 130 135 140

Thr Leu Gln Trp Ser Asn Pro Ile Pro Lys Asn Leu Ala Pro Gly Asn 145 150 155 160

Leu Ile Arg His Glu Leu Leu Ala Leu His Gln Ala Asn Thr Pro 165 170 175

Gln Phe Ala Glu Cys Ala Gln Leu Val Val Ser Gly Ser Gly Ser 180 185 190

Ala Leu Pro Pro Ser Asp Tyr Leu Tyr Ser Ile Pro Val Ala Pro 195

Gln Asn Asp Pro Gly Ile Thr Val Asp Ile Tyr Asn Gly Gly Leu Thr 210 215 220

Ser Thr Pro Pro Gly Gly Pro Val Trp Ser Gly Phe Glu Phe 225 230 235

<210> SEQ ID NO 5
<211> LENGTH: 1000
<212> TYPE: DNA


180 185 190

Ala Gln Val Val Ile Thr Gly Ser Gly Thr Ala Gln Pro Asp Ala Ser 195 200 205

Lys Ala Ala Ile Pro Gly Tyr Cys Asn Gln Asn Asp Pro Asn Ile 210 215 220

Lys Val Pro Ile Asn Asp His Ser Ile Pro Gln Thr Tyr Lys Ile Pro 225 230 235 240

Gly Pro Pro Val Phe Lys Gly Thr Ala Ser Lys Ala Arg Asp Phe 245 250 255

Thr Ala

<210> SEQ ID NO 7
<211> LENGTH: 681
<212> TYPE: DNA
<213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 7

atgct cqcaa acggtgc.cat cqtct tcc td gcc.gc.cgc.cc tcggcgt cag tggccactac 60

acctggccac gggittaacga C9gcgc.cgac tggcaac agg tcc.gtaaggc gga caactgg 120

Caggacaacg gctacgt.cgg ggatgtcacg tcgc.cacaga tcc.gctgttt cCaggcgacc 18O ccgt.ccc.cgg c cc catcc.gt cct caacacc acggc.cggct cgaccgtgac Ctactgggcc 24 O alacc cc.gacg tctaccaccc cqggcctgtg cagttttaca gcc.cgatggc 3OO gaggacat Ca act cqtggala C9gcgacggc gcc.gtgttggt t calaggtgta cgaggac cat 360 cctacctttg gcgcticagot cacatggc cc agcacgggca agagctcgtt cgcggitt CCC atcc ccc.cgt. gcatcaagtic cqgctactac Ctcct Coggg cggagcaaat cggcctgcac gtc.gc.ccaga gcgtaggcgg agcgcagttc taCatct Cat gcgcc.ca.gct cagcgtcacc 54 O ggcggcggca gCaccgagcc gcc.gaacaag gtggcct tcc ccggcgctta Cagtgcgacg gacccgggca ttctgat caa catctactac cctgttcc.ca cgtcc tacca gaaccc.cggc 660 ccggcc.gt ct t cagctgctg a 681

<210> SEQ ID NO 8
<211> LENGTH: 226
<212> TYPE: PRT
<213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 8

Met Leu Ala Asn Gly Ala Ile Val Phe Leu Ala Ala Ala Leu Gly Val 1 10 15

Ser Gly His Tyr Thr Trp Pro Arg Val Asn Asp Gly Ala Asp Trp Gln 20 25 30

Gln Val Arg Lys Ala Asp Asn Trp Gln Asp Asn Gly Tyr Val Gly Asp 35 40 45

Val Thr Ser Pro Gln Ile Arg Cys Phe Gln Ala Thr Pro Ser Pro Ala 50 55 60

Pro Ser Val Leu Asn Thr Thr Ala Gly Ser Thr Val Thr Trp Ala 65 70 75

Asn Pro Asp Val Tyr His Pro Gly Pro Val Gln Phe Met Ala Arg 85 90 95

Val Pro Asp Gly Glu Asp Ile Asn Ser Trp Asn Gly Asp Gly Ala Val 100 105 110

Trp Phe Lys Val Tyr Glu Asp His Pro Thr Phe Gly Ala Gln Leu Thr 115 120 125


Gly Ala Gly Thr Asp Thr Val Thr Val Lys Ala Gly Asp Gln Phe Thr 65 70 75 80

Phe Thr Leu Asp Thr Pro Val Tyr His Gln Gly Pro Ile Ser Ile Tyr 85 90 95

Met Ser Ala Pro Gly Ala Ala Ser Asp Tyr Asp Gly Ser Gly Gly 105 110

Trp Phe Lys Ile Lys Asp Trp Gly Pro Thr Phe Asn Ala Asp Gly Thr 115 120 125

Ala Thr Trp Asp Met Ala Gly Ser Tyr Thr Tyr Asn Ile Pro Thr Cys 130 135 140

Ile Pro Asp Gly Asp Tyr Leu Leu Arg Ile Gln Ser Leu Ala Ile His 145 150 155 160

Asn Pro Trp Pro Ala Gly Ile Pro Gln Phe Tyr Ile Ser Ala Gln 165 170 175

Ile Thr Val Thr Gly Gly Gly Asn Gly Asn Pro Gly Pro Thr Ala Leu 180 185 190

Ile Pro Gly Ala Phe Lys Asp Thr Asp Pro Gly Tyr Thr Val Asn Ile 195

Thr Asn Phe His Asn Tyr Thr Val Pro Gly Pro Glu Val Phe Ser 210 215 220

Cys Asn Gly Gly Gly Ser Asn Pro Pro Pro Pro Val Ser Ser Ser Thr 225 230 235 240

Pro Ala Thr Thr Thr Leu Val Thr Ser Thr Arg Thr Thr Ser Ser Thr 245 250 255

Ser Ser Ala Ser Thr Pro Ala Ser Thr Gly Gly Thr Val Ala Lys 260 265 270

Trp Gly Gln Cys Gly Gly Asn Gly Tyr Thr Gly Thr Thr Cys Ala 275 285

Ala Gly Ser Thr Cys Ser Lys Gln Asn Asp Tyr Tyr Ser Gln Cys Leu 290 295 300

<210> SEQ ID NO 11 <211> LENGTH: 954 <212> TYPE: DNA <213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 11
atgaagggcc tcagcctcct tcggcagcga ctgctcatac catcttcgtg 60
cagctcgagt cagggggaac gacctatccg gtatcctacg gcatccggga ccctagctac 120
gacggtccca tcaccgacgt cacctccgac tcactggctt gcaatggtcc cccgaacccc 180
acgacgccgt ccccgtacat catcaacgtc accgccggca ccacggtcgc ggcgatctgg 240
aggcacaccc tcacatccgg ccccgacgat gtcatggacg ccagccacaa ggggccgacc 300
ctggcctacc tcaagaaggt cgatgatgcc ttgaccgaca cggcggctgg 360
ttcaagatcc aggaggccgg ttacgacaat ggcaattggg ctaccagcac ggtgatcacc 420
aacggtggct tccaatatat tgacatcccc gcctgcattc ccaacggcca gtatctgctc 480
cgcgccgaga tgatcgcgct ccacgccgcc agcacgcagg gctctacatg 540
gagtgcgcgc agatcaacgt ggtgggcggc tccggcagcg ccagcccgca gacgtacagc 600
atcccgggca tctaccaggc aaccgacccg ggcctgctga tcaacatcta ctccatgacg 660
ccgtccagcc agtacaccat cccctgttca cctgcagcgg cagcggcaac 720
aacggcggcg gcagcaaccc gtcgggcggg cagaccacga cacgacgacg 780
acggcggcga ccaccacctc ctccgccgct cctaccagca gccagggggg cagcagcggt 840
tgcaccgttc cccagtggca gcagtgcggt ggcatctcgt tcaccggctg caccacctgc 900
gcggcgggct acacctgcaa gtatctgaac gactattact cgcaatgcca gtaa 954

<210> SEQ ID NO 12 <211> LENGTH: 317 <212> TYPE: PRT <213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 12
Met Lys Gly Leu Ser Leu Leu Ala Ala Ala Ser Ala Ala Thr Ala His 1 5 10 15
Thr Ile Phe Val Gln Leu Glu Ser Gly Gly Thr Thr Tyr Pro Val Ser 20 25 30
Tyr Gly Ile Arg Asp Pro Ser Tyr Asp Gly Pro Ile Thr Asp Val Thr 35 40 45
Ser Asp Ser Leu Ala Cys Asn Gly Pro Pro Asn Pro Thr Thr Pro Ser 50 55 60
Pro Tyr Ile Ile Asn Val Thr Ala Gly Thr Thr Val Ala Ala Ile Trp 65 70 75 80
Arg His Thr Leu Thr Ser Gly Pro Asp Asp Val Met Asp Ala Ser His 85 90 95
Lys Gly Pro Thr Leu Ala Tyr Leu Lys Lys Val Asp Asp Ala Leu Thr 100 105 110
Asp Thr Gly Ile Gly Gly Gly Trp Phe Lys Ile Gln Glu Ala Gly Tyr 115 120 125
Asp Asn Gly Asn Trp Ala Thr Ser Thr Val Ile Thr Asn Gly Gly Phe 130 135 140
Gln Tyr Ile Asp Ile Pro Ala Cys Ile Pro Asn Gly Gln Tyr Leu Leu 145 150 155 160
Arg Ala Glu Met Ile Ala Leu His Ala Ala Ser Thr Gln Gly Gly Ala 165 170 175
Gln Leu Tyr Met Glu Cys Ala Gln Ile Asn Val Val Gly Gly Ser Gly 180 185 190
Ser Ala Ser Pro Gln Thr Tyr Ser Ile Pro Gly Ile Tyr Gln Ala Thr 195 200 205
Asp Pro Gly Leu Leu Ile Asn Ile Tyr Ser Met Thr Pro Ser Ser Gln 210 215 220
Tyr Thr Ile Pro Gly Pro Pro Leu Phe Thr Cys Ser Gly Ser Gly Asn 225 230 235 240
Asn Gly Gly Gly Ser Asn Pro Ser Gly Gly Gln Thr Thr Thr Ala Lys 245 250 255
Pro Thr Thr Thr Thr Ala Ala Thr Thr Thr Ser Ser Ala Ala Pro Thr 260 265 270
Ser Ser Gln Gly Gly Ser Ser Gly Cys Thr Val Pro Gln Trp Gln Gln 275 280 285
Cys Gly Gly Ile Ser Phe Thr Gly Cys Thr Thr Cys Ala Ala Gly Tyr 290 295 300
Thr Cys Lys Tyr Leu Asn Asp Tyr Tyr Ser Gln Cys Gln 305 310 315

<210> SEQ ID NO 13 <211> LENGTH: 799 <212> TYPE: DNA <213> ORGANISM: Thermoascus aurantiacus

<400> SEQUENCE: 13
atgtcctttt ccaagataat tgctactgcc ggcgttcttg cctctgcttc tctagtggct 60
ggccatggct tcgttcagaa catcgtgatt gatggtaaaa agtatgtcat tgcaagacgc 120
acataagcgg caacagctga caatcgacag ttatggcggg tatctagtga accagtatcc 180
atacatgtcc aatcctccag aggtcatcgc ctggtctact acggcaactg atcttggatt 240
tgtggacggt actggatacc aaaccccaga tatcatctgc cataggggcg ccaagcctgg 300
agccctgact gctccagtct ctccaggagg aactgttgag cttcaatgga ctccatggcc 360
tgattctcac catggcccag ttatcaacta ccttgctccg tgcaatggtg attgttccac 420
tgtggataag acccaattag aattcttcaa aattgccgag agcggtctca tcaatgatga 480
caatcctcct gggatctggg cttcagacaa tctgatagca gccaacaaca gctggactgt 540
caccattcca accacaattg cacctggaaa ctatgttctg aggcatgaga ttattgctct 600
tcactcagct cagaaccagg atggtgccca gaactatccc cagtgcatca atctgcaggt 660
cactggaggt ggttctgata accctgctgg aactcttgga acggcactct accacgatac 720
cgatcctgga attctgatca acatctatca gaaactttcc agctatatca tccctggtcc 780
tcctctgtat actggttaa 799

<210> SEQ ID NO 14 <211> LENGTH: 250 <212> TYPE: PRT <213> ORGANISM: Thermoascus aurantiacus

<400> SEQUENCE: 14

Met Ser Phe Ser Lys Ile Ile Ala Thr Ala Gly Val Leu Ala Ser Ala 1 5 10 15

Ser Leu Val Ala Gly His Gly Phe Val Gln Asn Ile Val Ile Asp Gly 20 25 30

Lys Lys Tyr Tyr Gly Gly Tyr Leu Val Asn Gln Tyr Pro Tyr Met Ser 35 40 45

Asn Pro Pro Glu Val Ile Ala Trp Ser Thr Thr Ala Thr Asp Leu Gly 50 55 60

Phe Val Asp Gly Thr Gly Tyr Gln Thr Pro Asp Ile Ile Cys His Arg 65 70 75 80

Gly Ala Lys Pro Gly Ala Leu Thr Ala Pro Val Ser Pro Gly Gly Thr 85 90 95

Val Glu Leu Gln Trp Thr Pro Trp Pro Asp Ser His His Gly Pro Val 100 105 110

Ile Asn Tyr Leu Ala Pro Cys Asn Gly Asp Cys Ser Thr Val Asp Lys 115 120 125

Thr Gln Leu Glu Phe Phe Lys Ile Ala Glu Ser Gly Leu Ile Asn Asp 130 135 140

Asp Asn Pro Pro Gly Ile Trp Ala Ser Asp Asn Leu Ile Ala Ala Asn 145 150 155 160

Asn Ser Trp Thr Val Thr Ile Pro Thr Thr Ile Ala Pro Gly Asn Tyr 165 170 175

Val Leu Arg His Glu Ile Ile Ala Leu His Ser Ala Gln Asn Gln Asp 180 185 190

Gly Ala Gln Asn Tyr Pro Gln Cys Ile Asn Leu Gln Val Thr Gly Gly 195 200 205

Gly Ser Asp Asn Pro Ala Gly Thr Leu Gly Thr Ala Leu Tyr His Asp 210 215 220

145 150 155 160

Gly Ser Ser Ser Thr Ser Gln Gly Tyr Leu Val Leu Gly Arg Ala Ser 165 170 175

Ala Arg Arg Gly Val Val Gly Pro Thr Pro Asp Thr Ala Thr Phe 180 185 190

Gly Phe His Asp Asn Gly Phe Gly Gln Trp Gly Val Gly Leu Glu Asn 195 200 205

Ala Val Ser Glu Gln Tyr Ser Glu Trp Ala Ser Leu Pro Gly Leu Thr 210 215 220

Val Glu Thr Thr Glu Gly Ser Gly Pro Gly Glu Ala Gln Val 225 230 235 240

Pro Ala Pro Glu Glu Thr Asp Ile Val Val Gly Ala Gly Ala 245 250 255

Gly Gly Ile Pro Val Ala Asp Leu Ser Glu Ala Gly His Val 260 265 270

Leu Leu Ile Glu Gly Pro Pro Ser Thr Gly Arg Trp Gln Gly Thr 275 280 285

Met Lys Pro Glu Trp Leu Glu Gly Thr Asp Leu Thr Arg Phe Asp Val 290 295 300

Pro Gly Leu Asn Gln Ile Trp Val Asp Ser Ala Gly Ile Ala Cys 305 310 315 320

Thr Asp Thr Asp Gln Met Ala Gly Val Leu Gly Gly Gly Thr Ala 325 330 335

Val Asn Ala Gly Leu Trp Trp Pro Ile Asp Leu Asp Trp Asp Glu 340 345 350

Asn Phe Pro Glu Gly Trp His Ser Gln Asp Leu Ala Ala Ala Thr Glu 355 360 365

Arg Val Phe Glu Arg Ile Pro Gly Thr Trp His Pro Ser Met Asp Gly 370 375 380

Lys Leu Tyr Arg Asp Glu Gly Val Leu Ser Ser Gly Leu Ala 385 390 395 400

Glu Ser Gly Trp Lys Glu Val Val Ala Asn Glu Val Pro Asn Glu 405 410 415

Asn Arg Thr Phe Ala His Thr His Phe Met Phe Ala Gly Gly Glu Arg 420 425 430

Asn Gly Pro Leu Ala Thr Leu Val Ser Ala Asp Ala Arg Glu Asn 435 440 445

Phe Ser Leu Trp Thr Asn Thr Ala Val Arg Arg Ala Val Arg Thr Gly 450 455 460

Gly Lys Val Thr Gly Val Glu Leu Glu Leu Thr Asp Gly Gly Tyr 465 470 475 480

Ser Gly Ile Val Lys Leu Asn Glu Gly Gly Gly Val Ile Phe Ser Ala 485 490 495

Gly Ala Phe Gly Ser Ala Leu Leu Phe Arg Ser Gly Ile Gly Pro 500 505 510

Glu Asp Gln Leu Arg Val Val Ala Ser Ser Asp Gly Glu Asp Phe 515 520 525

Ile Asp Glu Asp Trp Ile Leu Pro Val Gly Asn Leu Ile 530 535 540

Asp His Leu Asn Thr Asp Leu Ile Leu Thr His Pro Asp Val Val Phe 545 550 555 560

Asp Phe Glu Ala Trp Thr Thr Pro Ile Glu Ala Asp Lys Gln 565 570 575

taccatcagt gcctgtagaa ttc 923

<210> SEQ ID NO 20 <211> LENGTH: 305 <212> TYPE: PRT <213> ORGANISM: Humicola insolens

<400> SEQUENCE: 20
Met Arg Ser Ser Pro Leu Leu Arg Ser Ala Val Val Ala Ala Leu Pro 1 5 10 15
Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys 20 25 30
Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro 35 40 45
Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala 50 55 60
Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln 65 70 75 80
Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr 85 90 95
Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu 100 105 110
Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln 115 120 125
Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn 130 135 140
Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe 145 150 155 160
Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu 165 170 175
Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe 180 185 190
Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val 195 200 205
Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp 210 215 220
Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser 225 230 235 240
Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr 245 250 255
Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu 260 265 270
Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys 275 280 285
Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys 290 295 300
Leu 305

<210> SEQ ID NO 21 <211> LENGTH: 1188 <212> TYPE: DNA <213> ORGANISM: Myceliophthora thermophila

<400> SEQUENCE: 21
cgacttgaaa cgccccaaat gaagtcctcc atcctcgcca gcgtcttcgc cacgggcgcc 60

ccatcggtgc cgagcggttg caggctgcga ctcaatggtt gaagcagaac aacctcaagg 1020
gcttcctggg cgagatcggc gccggctcta actccgcttg catcagcgct gtgcagggtg 1080
cgttgtgttc gatgcagcaa tctggtgtgt ggctcggcgc tctctggtgg gctgcgggcc 1140
cgtggtgggg cgactactac cagtccatcg agccgccctc 1200
tcctcccgca ggccctgctg ccgttcgcgt aa 1232

<210> SEQ ID NO 24 <211> LENGTH: 397 <212> TYPE: PRT <213> ORGANISM: Basidiomycete CBS 495.95

<400> SEQUENCE: 24

Met Lys Ser Leu Phe Leu Ser Leu Val Ala Thr Val Ala Leu Ser Ser 1 5 10 15

Pro Val Phe Ser Val Ala Val Trp Gly Gln Cys Gly Gly Ile Gly Phe 20 25 30

Ser Gly Ser Thr Val Cys Asp Ala Gly Ala Gly Cys Val Lys Leu Asn 35 40 45

Asp Tyr Ser Gln Cys Gln Pro Gly Ala Pro Thr Ala Thr Ser Ala 50 55 60

Ala Pro Ser Ser Asn Ala Pro Ser Gly Thr Ser Thr Ala Ser Ala Pro 65 70 75 80

Ser Ser Ser Leu Cys Ser Gly Ser Arg Thr Pro Phe Gln Phe Phe Gly 85 90 95

Val Asn Glu Ser Gly Ala Glu Phe Gly Asn Leu Asn Ile Pro Gly Val 100 105 110

Leu Gly Thr Asp Tyr Thr Trp Pro Ser Pro Ser Ser Ile Asp Phe Phe 115 120 125

Met Gly Lys Gly Met Asn Thr Phe Arg Ile Pro Phe Leu Met Glu Arg 130 135 140

Leu Val Pro Pro Ala Thr Gly Ile Thr Gly Pro Leu Asp Gln Thr Tyr 145 150 155 160

Leu Gly Gly Leu Gln Thr Ile Val Asn Tyr Ile Thr Gly Lys Gly Gly 165 170 175

Phe Ala Leu Ile Asp Pro His Asn Phe Met Ile Tyr Asn Gly Gln Thr 180 185 190

Ile Ser Ser Thr Ser Asp Phe Gln Lys Phe Trp Gln Asn Leu Ala Gly 195 200 205

Val Phe Ser Asn Ser His Val Ile Phe Asp Val Met Asn Glu Pro 210 215 220

His Asp Ile Pro Ala Gln Thr Val Phe Gln Leu Asn Gln Ala Ala Val 225 230 235 240

Asn Gly Ile Arg Ala Ser Gly Ala Thr Ser Gln Leu Ile Leu Val Glu 245 250 255

Gly Thr Ser Trp Thr Gly Ala Trp Thr Trp Thr Thr Ser Gly Asn Ser 260 265 270

Asp Ala Phe Gly Ala Ile Lys Asp Pro Asn Asn Asn Val Ala Ile Gln 275 280 285

Met His Gln Tyr Leu Asp Ser Asp Gly Ser Gly Thr Ser Gln Thr Cys 290 295 300

Val Ser Pro Thr Ile Gly Ala Glu Arg Leu Gln Ala Ala Thr Gln Trp 305 310 315 320

Leu Lys Gln Asn Asn Leu Lys Gly Phe Leu Gly Glu Ile Gly Ala Gly

Trp Ser Gly Ser Thr Val Asp Ala Gly Leu Ala Cys Val Ile Leu 35 40 45

Asn Ala Phe Gln Cys Leu Thr Pro Ala Ala Gly Gln Thr Thr 50 55 60

Thr Gly Ser Gly Ala Pro Ala Ser Thr Ser Thr Ser His Ser Thr Val 65 70 75 80

Thr Thr Gly Ser Ser His Ser Thr Thr Gly Thr Thr Ala Thr Lys Thr 85 90 95

Thr Thr Thr Pro Ser Thr Thr Thr Thr Leu Pro Ala Ile Ser Val Ser 100 105 110

Gly Arg Val Ser Gly Ser Arg Thr Phe Phe Phe Gly Val 115 120 125

Asn Glu Ser Ala Glu Phe Gly Asn Thr Ala Trp Pro Gly Gln Leu 130 135 140

Gly Asp Thr Trp Pro Ser Pro Ser Ser Val Asp Phe Met 145 150 155 160

Gly Ala Gly Phe Asn Thr Phe Arg Ile Thr Phe Leu Met Glu Arg Met 165 170 175

Ser Pro Pro Ala Thr Gly Leu Thr Gly Pro Phe Asn Gln Thr Leu 180 185 190

Ser Gly Leu Thr Thr Ile Val Asp Ile Thr Asn Lys Gly Gly 195 200 205

Ala Leu Ile Asp Pro His Asn Phe Met Arg Asn Asn Gly Ile Ile 210 215 220

Ser Ser Thr Ser Asp Phe Ala Thr Trp Trp Ser Asn Leu Ala Thr Val 225 230 235 240

Phe Ser Thr Lys Asn Ala Ile Phe Asp Ile Gln Asn Glu Pro 245 250 255

Gly Ile Asp Ala Gln Thr Val Glu Leu Asn Gln Ala Ala Ile Asn 260 265 270

Ser Ile Arg Ala Ala Gly Ala Thr Ser Gln Leu Ile Leu Val Glu Gly 275 280 285

Thr Ser Thr Gly Ala Trp Thr Trp Val Ser Ser Gly Asn Gly Ala 290 295 300

Ala Phe Ala Ala Val Thr Asp Pro Asn Asn Thr Ala Ile Glu Met 305 310 315 320

His Gln Leu Asp Ser Asp Gly Ser Gly Thr Asn Glu Asp Cys Val 325 330 335

Ser Ser Thr Ile Gly Ser Gln Arg Leu Gln Ala Ala Thr Ala Trp Leu 340 345 350

Gln Gln Thr Gly Leu Gly Phe Leu Gly Glu Thr Gly Ala Gly Ser 355 360 365

Asn Ser Gln Ile Asp Ala Val Phe Asp Glu Leu Met Gln 370 375 380

Gln Gln Gly Gly Ser Trp Ile Gly Ala Leu Trp Trp Ala Ala Gly Pro 385 390 395 400

Trp Trp Gly Thr Tyr Ile Ser Ile Glu Pro Pro Ser Gly Ala Ala 405 410 415

Ile Pro Glu Val Leu Pro Gln Gly Leu Ala Pro Phe Leu 420 425

<210> SEQ ID NO 27 <211> LENGTH: 1580 <212> TYPE: DNA <213> ORGANISM: Thielavia terrestris

Val Ala Asp Val Gly Thr Phe Leu Trp Leu Asp Ser Ile Glu Asn 85 90 95

Ile Gly Leu Glu Pro Ala Ile Gln Asp Val Pro Cys Glu Asn Ile 100 105 110

Leu Gly Leu Val Ile Tyr Asp Leu Pro Gly Arg Asp Cys Ala Ala Lys 115 120 125

Ala Ser Asn Gly Glu Leu Lys Val Gly Glu Ile Asp Arg Lys Thr 130 135 140

Glu Ile Asp Lys Ile Val Ser Ile Leu Lys Ala His Pro Asn Thr 145 150 155 160

Ala Phe Ala Leu Val Ile Glu Pro Asp Ser Leu Pro Asn Leu Val Thr 165 170 175

Asn Ser Asn Leu Asp Thr Cys Ser Ser Ser Ala Ser Gly Tyr Arg Glu 180 185 190

Gly Val Ala Tyr Ala Leu Lys Asn Leu Asn Leu Pro Asn Val Ile Met 195 200 205

Leu Asp Ala Gly His Gly Gly Trp Leu Gly Trp Asp Ala Asn Leu 210 215 220

Gln Pro Gly Ala Gln Glu Leu Ala Lys Ala Tyr Asn Ala Gly Ser 225 230 235 240

Pro Gln Leu Arg Gly Phe Ser Thr Asn Val Ala Gly Trp Asn Ser 245 250 255

Trp Gln Ser Pro Gly Glu Phe Ser Gln Ala Ser Asp Ala 260 265 270

Asn Cys Gln Asn Glu Lys Ile Tyr Val Ser Thr Phe Gly Ser Ala 275 280 285

Leu Gln Ser Ala Gly Met Pro Asn His Ala Ile Val Asp Thr Gly Arg 290 295 300

Asn Gly Val Thr Gly Leu Arg Lys Glu Trp Gly Asp Trp Asn Val 305 310 315 320

Asn Gly Ala Gly Phe Gly Val Arg Pro Thr Ser Asn Thr Gly Leu Glu 325 330 335

Leu Ala Asp Ala Phe Val Trp Val Lys Pro Gly Gly Glu Ser Asp Gly 340 345 350

Thr Ser Asp Ser Ser Ser Pro Arg Tyr Asp Ser Phe Cys Gly 355 360 365

Asp Ala Phe Lys Pro Ser Pro Glu Ala Gly Thr Trp Asn Glu Ala Tyr 370 375 380

Phe Glu Met Leu Leu Lys Asn Ala Val Pro Ser Phe 385 390 395

<210> SEQ ID NO 29 <211> LENGTH: 1203 <212> TYPE: DNA <213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 29
atgaagtacc tcaacctcct cgcagctctc ctcgccgtcg ctcctctctc cctcgctgca 60
cccagcatcg aggccagaca gtcgaacgtc aacccataca tcggcaagag cccgctcctt 120
attaggtcgt acgcccaaaa gcttgaggag accgtcagga ccttccagca acgtggcgac 180
cagctcaacg ctgcgaggac acggacggtg cagaacgttg cgactttcgc ctggatctcg 240
gataccaatg gtattggagc cattcgacct ctcatccaag atgctctcgc ccagcaggct 300
cgcactggac agaaggtcat cgtccaaatc gtcgtctaca acctcccaga tcgcgactgc 360

caggtctcgg gaggcggcaa cggcggctcg accaccacca cgtcgaccac cacgctgagg 1260
acctcgacca cgaccaccac caccgccccg acggccactg ccacgcactg gggacaatgc 1320
ggcggaatcg gggtacgtca accccctcct gcattctgtt gaggaagtta actaacgtgg 1380
cctacgcagt ggactggacc gaccgtctgc gaatcgccgt acgcatgcaa ggagctgaac 1440
ccctggtact accagtgcct ctaaagtatt gcagtgaagc catactccgt 1500

9

<210> SEQ ID NO 32 <211> LENGTH: 464 <212> TYPE: PRT <213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 32

Met Gly Gln Lys Thr Leu His Gly Phe Ala Ala Thr Ala Leu Ala Val 1 5 10 15

Leu Pro Phe Val Lys Ala Gln Gln Pro Gly Asn Phe Thr Pro Glu Val 20 25 30

His Pro Gln Leu Pro Thr Trp Lys Cys Thr Thr Ala Gly Gly Cys Val 35 40 45

Gln Gln Asp Thr Ser Val Val Leu Asp Trp Asn Tyr Arg Trp Ile His 50 55 60

Asn Ala Asp Gly Thr Ala Ser Cys Thr Thr Ser Ser Gly Val Asp His 65 70 75 80

Thr Leu Pro Asp Glu Ala Thr Cys Ala Lys Asn Phe Val Glu 85 90 95

Gly Val Asn Tyr Thr Ser Ser Gly Val Thr Thr Ser Gly Ser Ser Leu 100 105 110

Thr Met Arg Gln Tyr Phe Lys Gly Ser Asn Gly Gln Thr Asn Ser Val 115 120 125

Ser Pro Arg Leu Tyr Leu Leu Gly Ser Asp Gly Asn Val Met Leu 130 135 140

Lys Leu Leu Gly Gln Glu Leu Ser Phe Asp Val Asp Leu Ser Thr Leu 145 150 155 160

Pro Gly Glu Asn Gly Ala Leu Tyr Leu Ser Glu Met Asp Ala Thr 165 170 175

Gly Arg Asn Gln Tyr Asn Thr Gly Gly Ala Asn Gly Ser Gly 180 185 190

Asp Ala Gln Cys Pro Val Gln Thr Trp Met Asn Gly Thr Leu 195 200 205

Asn Thr Asn Gly Gln Gly Tyr Cys Cys Asn Glu Met Asp Ile Leu Glu 210 215 220

Ala Asn Ser Arg Ala Asn Ala Met Thr Pro His Pro Ala Asn Gly 225 230 235 240

Ser Asp Leu Asn Pro Ala Glu Gly Tyr 245 250 255

Ser Tyr Gly Pro Gly Leu Thr Val Asp Thr Ser Lys Pro Phe 260 265 270

Thr Ile Ile Thr Arg Phe Ile Thr Asp Asp Gly Thr Thr Ser Gly Thr 275 280 285

Leu Asn Gln Ile Gln Arg Ile Tyr Val Gln Asn Gly Thr Val Ala 290 295 300

Ser Ala Ala Ser Gly Gly Asp Ile Ile Thr Ala Ser Gly Thr Ser 305 310 315 320

gcagcacgta caagagcgag ticagocact agagtagagc ttgtaatt 1368

<210> SEQ ID NO 34 <211> LENGTH: 423 <212> TYPE: PRT <213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 34
Met Ala Pro Lys Ser Thr Val Leu Ala Ala Trp Leu Leu Ser Ser Leu 1 5 10 15
Ala Ala Ala Gln Gln Ile Gly Lys Ala Val Pro Glu Val His Pro Lys 20 25 30
Leu Thr Thr Gln Lys Cys Thr Leu Arg Gly Gly Cys Lys Pro Val Arg 35 40 45
Thr Ser Val Val Leu Asp Ser Ser Ala Arg Ser Leu His Lys Val Gly 50 55 60
Asp Pro Asn Thr Ser Cys Ser Val Gly Gly Asp Leu Cys Ser Asp Ala 65 70 75 80
Lys Ser Cys Gly Lys Asn Cys Ala Leu Glu Gly Val Asp Tyr Ala Ala 85 90 95
His Gly Val Ala Thr Lys Gly Asp Ala Leu Thr Leu His Gln Trp Leu 100 105 110
Lys Gly Ala Asp Gly Thr Tyr Arg Thr Val Ser Pro Arg Val Tyr Leu 115 120 125
Leu Gly Glu Asp Gly Lys Asn Tyr Glu Asp Phe Lys Leu Leu Asn Ala 130 135 140
Glu Leu Ser Phe Asp Val Asp Val Ser Gln Leu Val Cys Gly Met Asn 145 150 155 160
Gly Ala Leu Tyr Phe Ser Glu Met Glu Met Asp Gly Gly Arg Ser Pro 165 170 175
Leu Asn Pro Ala Gly Ala Thr Tyr Gly Thr Gly Tyr Cys Asp Ala Gln 180 185 190
Cys Pro Lys Leu Asp Phe Ile Asn Gly Glu Leu Asn Thr Asn His Thr 195 200 205
Tyr Gly Ala Cys Cys Asn Glu Met Asp Ile Trp Glu Ala Asn Ala Leu 210 215 220
Ala Gln Ala Leu Thr Pro His Pro Cys Asn Ala Thr Arg Val Tyr Lys 225 230 235 240
Cys Asp Thr Ala Asp Glu Cys Gly Gln Pro Val Gly Val Cys Asp Glu 245 250 255
Trp Gly Cys Ser Tyr Asn Pro Ser Asn Phe Gly Val Lys Asp Tyr Tyr 260 265 270
Gly Arg Asn Leu Thr Val Asp Thr Asn Arg Lys Phe Thr Val Thr Thr 275 280 285
Gln Phe Val Thr Ser Asn Gly Arg Ala Asp Gly Glu Leu Thr Glu Ile 290 295 300
Arg Arg Leu Tyr Val Gln Asp Gly Val Val Ile Gln Asn His Ala Val 305 310 315 320
Thr Ala Gly Gly Ala Thr Tyr Asp Ser Ile Thr Asp Gly Phe Cys Asn 325 330 335
Ala Thr Ala Thr Trp Thr Gln Gln Arg Gly Gly Leu Ala Arg Met Gly 340 345 350
Glu Ala Ile Gly Arg Gly Met Val Leu Ile Phe Ser Leu Trp Val Asp 355 360 365
Asn Gly Gly Phe Met Asn Trp Leu Asp Ser Gly Asn Ala Gly Pro Cys

100 105 110

Asp Leu Leu Asn Gln Glu Leu Ser Val Asp Val Asp Phe Ser Ala Leu 115 120 125

Pro Cys Gly Glu Asn Gly Ala Phe Tyr Leu Ser Glu Met Ala Ala Asp 130 135 140

Gly Arg Gly Asp Ala Gly Ala Gly Asp Gly Tyr Asp Ala Gln Cys 145 150 155 160

Gln Gly Tyr Cys Cys Asn Glu Met Asp Ile Leu Glu Ala Asn Ser Met 165 170 175

Ala Thr Ala Met Thr Pro His Pro Cys Lys Gly Asn Asn Cys Asp Arg 180 185 190

Ser Gly Cys Gly Tyr Asn Pro Tyr Ala Ser Gly Gln Arg Gly Phe Tyr 195 200 205

Pro Gly Lys Thr Val Asp Thr Ser Lys Pro Phe Thr Val Val Thr 210 215 220

Phe Ala Ala Ser Gly Gly Lys Leu Thr Gln Ile Thr Arg 225 230 235 240

Gln Asn Gly Arg Glu Ile Gly Gly Gly Gly Thr Ile Ser Ser Cys 245 250 255

Ser Glu Ser Ser Thr Gly Gly Leu Thr Gly Met Gly Glu Ala Leu 260 265 270

Arg Gly Met Val Leu Ala Met Ser Ile Trp Asn Asp Ala Ala Gln 275 280 285

Met Ala Trp Leu Asp Ala Gly Asn Asn Gly Pro Ala Ser Gly 290 295 300

Gly Ser Pro Ser Val Ile Gln Ser Gln His Pro Asp Thr His Val 305 310 315 320

Phe Ser Asn Ile Arg Trp Gly Asp Ile Gly Ser Thr Thr Lys Asn 325 330 335

<210> SEQ ID NO 37 <211> LENGTH: 1480 <212> TYPE: DNA <213> ORGANISM: Cladorrhinum foecundissimum

<400> SEQUENCE: 37
gatccgaatt cctcctctcg ttctttagtc acagaccaga catctgccca cgatggttca 60
caagttcgcc ctcctcaccg gcctcgccgc ctccctcgca tctgcccagc agatcggcac 120
cgtcgtcccc gagtctcacc ccaagcttcc caccaagcgc tgcactctcg ccggtggctg 180
ccagaccgtc gacacctcca tcgtcatcga cgccttccag cgtcccctcc acaagatcgg 240
cgacccttcc actccttgcg tcgtcggcgg ccctctctgc cccgacgcca agtcctgcgc 300
tgagaactgc gcgctcgagg gtgtcgacta tgcctcctgg ggcatcaaga ccgagggcga 360
cgccctaact ctcaaccagt ggatgcccga cccggcgaac cctggccagt acaagacgac 420
tactccccgt acttaccttg ttgctgagga cggcaagaac tacgaggatg tgaagctcct 480
ggctaaggag atctcgtttg atgccgatgt cagcaacctt ccctgcggca tgaacggtgc 540
tttctacttg tctgagatgt tgatggatgg tggacgtggc gacctcaacc ctgctggtgc 600
cgagtatggt accggttact gtgatgcgca gtgcttcaag ttggatttca tcaacggcga 660
ggccaacatc gaccaaaagc acggcgcctg ctgcaacgaa atggacattt tcgaatccaa 720
ctcgcgcgcc aagaccttcg tcccccaccc ctgcaacatc acgcaggtct acaagtgcga 780
aggcgaagac gagtgcggcc cgtgtgcgac aagtgggggt gcggcttcaa 840
cgagtacaaa tggggcgtcg agtccttcta cggccggggc tcgcagttcg ccatcgactc 900
ctccaagaag ttcaccgtca ccacgcagtt cctgaccgac aacggcaagg aggacggcgt 960
cctcgtcgag atccgccgct ttggcacca ggatggcaag ctgatcaaga acaccgctat 1020
ccaggttgag gagaactaca gcacggactc ggtgagcacc gagttctgcg agaagactgc 1080
ttctttcacc atgcagcgcg gtggtctgaa ggcgatgggc gaggctatcg gtcgtggtat 1140
ggtgctggtt ttcagcatct gggcggatga ttcgggtttt atgaactggt tatgcgga 1200
gggtaatggc ccttgcagcg cactgaggg catccgaag gagattgtca agaataagcc 1260
ggatgctagg gttacgttct caaacattag gattggtgag gttggtagca cgtatgctcc 1320
gggtgggaag toggtgtta agagcagggt totaggggg cttactgctt cttaaggggg 1380
gtgttgaagag aggaggaggt gttgttgggg gttggagatg attaattgggc gagatggtgt 1440
agagcgggtt ggttggatat gaatacgttg aattggatgt 1480

<210> SEQ ID NO 38 <211> LENGTH: 440 <212> TYPE: PRT <213> ORGANISM: Cladorrhinum foecundissimum

<400> SEQUENCE: 38
Met Val His Lys Phe Ala Leu Leu Thr Gly Leu Ala Ala Ser Leu Ala 1 5 10 15
Ser Ala Gln Gln Ile Gly Thr Val Val Pro Glu Ser His Pro Lys Leu 20 25 30
Pro Thr Lys Arg Cys Thr Leu Ala Gly Gly Cys Gln Thr Val Asp Thr 35 40 45
Ser Ile Val Ile Asp Ala Phe Gln Arg Pro Leu His Lys Ile Gly Asp 50 55 60
Pro Ser Thr Pro Cys Val Val Gly Gly Pro Leu Cys Pro Asp Ala Lys 65 70 75 80
Ser Cys Ala Glu Asn Cys Ala Leu Glu Gly Val Asp Tyr Ala Ser Trp 85 90 95
Gly Ile Lys Thr Glu Gly Asp Ala Leu Thr Leu Asn Gln Trp Met Pro 100 105 110
Asp Pro Ala Asn Pro Gly Gln Tyr Lys Thr Thr Thr Pro Arg Thr Tyr 115 120 125
Leu Val Ala Glu Asp Gly Lys Asn Tyr Glu Asp Val Lys Leu Leu Ala 130 135 140
Lys Glu Ile Ser Phe Asp Ala Asp Val Ser Asn Leu Pro Cys Gly Met 145 150 155 160
Asn Gly Ala Phe Tyr Leu Ser Glu Met Leu Met Asp Gly Gly Arg Gly 165 170 175
Asp Leu Asn Pro Ala Gly Ala Glu Tyr Gly Thr Gly Tyr Cys Asp Ala 180 185 190
Gln Cys Phe Lys Leu Asp Phe Ile Asn Gly Glu Ala Asn Ile Asp Gln 195 200 205
Lys His Gly Ala Cys Cys Asn Glu Met Asp Ile Phe Glu Ser Asn Ser 210 215 220
Arg Ala Lys Thr Phe Val Pro His Pro Cys Asn Ile Thr Gln Val Tyr 225 230 235 240
Lys Cys Glu Gly Glu Asp Glu Cys Gly Gln Pro Val Gly Val Cys Asp 245 250 255
Lys Trp Gly Cys Gly Phe Asn Glu Tyr Lys Trp Gly Val Glu Ser Phe 260 265 270

ttctccaaca tccgctgggg agacattggg tctactacga actcgactgc gcccccgccc 1200
ccgcctgcgt ccagcacgac gttttcgact acacggagga gctcgacgac ttcgagcagc 1260
ccgagctgca cgcagactca ctgggggcag tgcggtggca ttgggtacag cgggtgcaag 1320
acgtgcacgt cgggcactac gtgccagtat agcaacgact actactcgca atgcctttag 1380

<210> SEQ ID NO 40 <211> LENGTH: 459 <212> TYPE: PRT <213> ORGANISM: Trichoderma reesei

<400> SEQUENCE: 40

Met Ala Pro Ser Val Thr Leu Pro Leu Thr Thr Ala Ile Leu Ala Ile 1 5 10 15
Ala Arg Leu Val Ala Ala Gln Gln Pro Gly Thr Ser Thr Pro Glu Val 20 25 30
His Pro Lys Leu Thr Thr Tyr Lys Cys Thr Lys Ser Gly Gly Cys Val 35 40 45
Ala Gln Asp Thr Ser Val Val Leu Asp Trp Asn Tyr Arg Trp Met His 50 55 60
Asp Ala Asn Tyr Asn Ser Cys Thr Val Asn Gly Gly Val Asn Thr Thr 65 70 75 80
Leu Cys Pro Asp Glu Ala Thr Cys Gly Lys Asn Cys Phe Ile Glu Gly 85 90 95
Val Asp Tyr Ala Ala Ser Gly Val Thr Thr Ser Gly Ser Ser Leu Thr 100 105 110
Met Asn Gln Tyr Met Pro Ser Ser Ser Gly Gly Tyr Ser Ser Val Ser 115 120 125
Pro Arg Leu Tyr Leu Leu Asp Ser Asp Gly Glu Tyr Val Met Leu Lys 130 135 140
Leu Asn Gly Gln Glu Leu Ser Phe Asp Val Asp Leu Ser Ala Leu Pro 145 150 155 160
Cys Gly Glu Asn Gly Ser Leu Tyr Leu Ser Gln Met Asp Glu Asn Gly 165 170 175
Gly Ala Asn Gln Tyr Asn Thr Ala Gly Ala Asn Tyr Gly Ser Gly Tyr 180 185 190
Cys Asp Ala Gln Cys Pro Val Gln Thr Trp Arg Asn Gly Thr Leu Asn 195 200 205
Thr Ser His Gln Gly Phe Cys Cys Asn Glu Met Asp Ile Leu Glu Gly 210 215 220
Asn Ser Arg Ala Asn Ala Leu Thr Pro His Ser Cys Thr Ala Thr Ala 225 230 235 240
Cys Asp Ser Ala Gly Cys Gly Phe Asn Pro Tyr Gly Ser Gly Tyr Lys 245 250 255
Ser Tyr Tyr Gly Pro Gly Asp Thr Val Asp Thr Ser Lys Thr Phe Thr 260 265 270
Ile Ile Thr Gln Phe Asn Thr Asp Asn Gly Ser Pro Ser Gly Asn Leu 275 280 285
Val Ser Ile Thr Arg Lys Tyr Gln Gln Asn Gly Val Asp Ile Pro Ser 290 295 300
Ala Gln Pro Gly Gly Asp Thr Ile Ser Ser Cys Pro Ser Ala Ser Ala 305 310 315 320
Tyr Gly Gly Leu Ala Thr Met Gly Lys Ala Leu Ser Ser Gly Met Val 325 330 335
Leu Val Phe Ser Ile Trp Asn Asp Asn Ser Gln Tyr Met Asn Trp Leu

acaacttgcc aggtcctgaa cccttactac tctcagtgcc tgtaa 1545

<210> SEQ ID NO 42 <211> LENGTH: 514 <212> TYPE: PRT <213> ORGANISM: Trichoderma reesei

<400> SEQUENCE: 42

Met Tyr Arg Lys Leu Ala Val Ile Ser Ala Phe Leu Ala Thr Ala Arg 1 5 10 15

Ala Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr 20 25 30

Trp Gln Lys Ser Ser Gly Gly Thr Thr Gln Gln Thr Gly Ser 35 40 45

Val Val Ile Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser 50 55 60

Thr Asn Asp Gly Asn Thr Trp Ser Ser Thr Leu Pro Asp 65 70 75 80

Asn Glu Thr Ala Asn Leu Asp Gly Ala Ala Tyr Ala 85 90 95

Ser Thr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe 100 105 110

Val Thr Gln Ser Ala Gln Asn Val Gly Ala Arg Leu Tyr Leu Met 115 120 125

Ala Ser Asp Thr Thr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe 130 135 140

Ser Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala 145 150 155 160

Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Val Ser Tyr Pro 165 170 175

Thr Asn Thr Ala Gly Ala Gly Thr Asp Ser Gln 180 185 190

Pro Arg Asp Leu Phe Ile Asn Gly Ala Asn Val Glu Gly 195 200 205

Trp Glu Pro Ser Ser Asn Asn Ala Asn Thr Ile Gly Gly His Gly 210 215 220

Ser Ser Glu Met Asp Ile Trp Glu Asn Ser Ile Ser Glu 225 230 235 240

Ala Leu Thr Pro His Pro Thr Thr Val Gln Glu Ile Cys Glu 245 250 255

Gly Asp Gly Cys Gly Gly Thr Ser Asp Asn Arg Gly Gly Thr 260 265 270

Asp Pro Asp Gly Asp Trp Asn Pro Arg Leu Gly Asn Thr 275 280 285

Ser Phe Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr 290 295 300

Leu Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr 305 310 315 320

Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly 325 330 335

Ser Ser Gly Asn Glu Leu Asn Asp Asp Thr Ala Glu Glu 340 345 350

Ala Glu Phe Gly Gly Ser Ser Phe Ser Asp Gly Gly Leu Thr Gln 355 360 365

ggtcgatcgg gaaagcagcc taccggacag caacagtggg gagactggtg caatgtgatc 1380
ggcaccggat ttggtattcg cccatccgca aacactgggg actcgttgct ggattcgttt 1440
gtctgggtca agccaggcgg cgagtgtgac ggcaccagcg acagcagtgc gccacgattt 1500
gactcccact gtgcgctccc agatgccttg caaccggcgc ctcaagctgg tgcttggttc 1560
caagcctact ttgtgcagct tctcacaaac gcaaacccat cgttcctgta a 1611

<210> SEQ ID NO 44 <211> LENGTH: 471 <212> TYPE: PRT <213> ORGANISM: Trichoderma reesei

<400> SEQUENCE: 44

Met Ile Val Gly Ile Leu Thr Thr Leu Ala Thr Leu Ala Thr Leu Ala 1 5 10 15

Ala Ser Val Pro Leu Glu Glu Arg Gln Ala Cys Ser Ser Val Trp Gly 20 25 30

Gln Gly Gly Gln Asn Trp Ser Gly Pro Thr Cys Cys Ala Ser Gly 35 40 45

Ser Thr Val Tyr Ser Asn Asp Tyr Tyr Ser Gln Cys Leu Pro Gly 50 55 60

Ala Ala Ser Ser Ser Ser Ser Thr Arg Ala Ala Ser Thr Thr Ser Arg 65 70 75 80

Val Ser Pro Thr Thr Ser Arg Ser Ser Ser Ala Thr Pro Pro Pro Gly 85 90 95

Ser Thr Thr Thr Arg Val Pro Pro Val Gly Ser Gly Thr Ala Thr Tyr 100 105 110

Ser Gly Asn Pro Phe Val Gly Val Thr Pro Trp Ala Asn Ala Tyr Tyr 115 120 125

Ala Ser Glu Val Ser Ser Leu Ala Ile Pro Ser Leu Thr Gly Ala Met 130 135 140

Ala Thr Ala Ala Ala Ala Val Ala Lys Val Pro Ser Phe Met Trp Leu 145 150 155 160

Asp Thr Leu Asp Lys Thr Pro Leu Met Glu Gln Thr Leu Ala Asp Ile 165 170 175

Arg Thr Ala Asn Lys Asn Gly Gly Asn Tyr Ala Gly Gln Phe Val Val 180 185 190

Asp Leu Pro Asp Arg Asp Cys Ala Ala Leu Ala Ser Asn Gly Glu 195 200 205

Ser Ile Ala Asp Gly Gly Val Ala Lys Tyr Lys Asn Tyr Ile Asp 210 215 220

Thr Ile Arg Gln Ile Val Val Glu Tyr Ser Asp Ile Arg Thr Leu Leu 225 230 235 240

Val Ile Glu Pro Asp Ser Leu Ala Asn Leu Val Thr Asn Leu Gly Thr 245 250 255

Pro Ala Asn Ala Gln Ser Ala Tyr Leu Glu Cys Ile Asn Tyr 260 265 270

Ala Val Thr Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala 275 280 285

Gly His Ala Gly Trp Leu Gly Trp Pro Ala Asn Gln Asp Pro Ala Ala 290 295 300

Gln Leu Phe Ala Asn Val Tyr Lys Asn Ala Ser Ser Pro Arg Ala Leu 305 310 315 320

Arg Gly Leu Ala Thr Asn Val Ala Asn Tyr Asn Gly Trp Asn Ile Thr

tgccaacggc gatctcggcg agatcaagcg cttctacgtc caggatggca agatcatccc 1380
caactccgag tccaccatcc ccggcgtcga gggcaattcc atcacccagg actggtgcga 1440
ccgccagaag gttgcctttg gcgacattga cgacttcaac cgcaagggcg gcatgaagca 1500
gatgggcaag gccccatggt cctggtcatg tccatctggg atgaccacgc 1560
ctccaacatg ctctggctcg actcaccitt ccctgtcgat gccgctggca agcccggcgc 1620
cgagcgcggt gccticccga ccacctcggg tgtccctgct gaggttgagg ccgaggcccc 1680
caacagcaac gtcgtcttct ccaacatccg cttcggcccc atcggctica ccgttgctgg 1740
tctccccggc gcggcaacaa cggcggcaac cccccgcccc ccaccaccac 1800
cacctcctcg gctccggcca ccaccaccac cgccagcgct ggccccaagg ctggccgctg 1860
gcagcagtgc ggcggcatcg gcttcactgg cccgacccag tgcgaggagc cctacatttg 1920
caccaagctc aacgactggt actctcagtg cctgtaaatt ctgagtcgct gactcgacga 1980
tcacggccgg tttttgcatg aaaggaaaca aacgaccgcg ataaaaatgg agggtaatga 2040
gatgtc 2046

<210> SEQ ID NO 46
<211> LENGTH: 525
<212> TYPE: PRT
<213> ORGANISM: Humicola insolens

<400> SEQUENCE: 46

Met Arg Thr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ser Ala 1 5 10 15

Ala Ala Gln Gln Ala Cys Ser Leu Thr Thr Glu Arg His Pro Ser Leu 25 30

Ser Trp Asn Lys Cys Thr Ala Gly Gly Gln Cys Gln Thr Val Gln Ala 35 40 45

Ser Ile Thr Leu Asp Ser Asn Trp Arg Trp Thr His Gln Val Ser Gly 50 55 60

Ser Thr Asn Cys Tyr Thr Gly Asn Thr Ser Ile Cys Thr 65 70 80

Asp Ala Ser Cys Ala Gln Asn Cys Cys Val Asp Gly Ala Asp Tyr 85 90 95

Thr Ser Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ser Leu Ser Leu Lys 105 110

Phe Val Thr Lys Gly Gln His Ser Thr Asn Val Gly Ser Arg Thr Tyr 115 120 125

Leu Met Asp Gly Glu Asp Lys Tyr Gln Thr Phe Glu Leu Leu Gly Asn 130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Asn Ile Gly Cys Gly Leu Asn 145 150 155 160

Gly Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Leu Ser Arg 165 170

Pro Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr 180 185 190

Ala Gln Cys Pro Arg Asp Ile Lys Phe Ile Asn Gly Glu Ala Asn Ile 195 205

Glu Gly Trp Thr Gly Ser Thr Asn Asp Pro Asn Ala Gly Ala Gly Arg 210 215 220

Tyr Gly Thr Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Met 225 230 235 240

Ala Thr Ala Phe Thr Pro His Pro Cys Thr Ile Ile Gly Gln Ser Arg 245 250 255

Glu Gly Asp Ser Cys Gly Gly Thr Tyr Ser Asn Glu Arg Tyr Ala 260 265 270

Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly 275 285

Asn Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr 290 295 300

Ile Thr Val Val Thr Gln Phe Leu Lys Asp Ala Asn Gly Asp Leu Gly 305 310 315 320

Glu Ile Arg Phe Tyr Val Gln Asp Gly Lys Ile Ile Pro Asn Ser 325 330 335

Glu Ser Thr Ile Pro Gly Val Glu Gly Asn Ser Ile Thr Gln Asp Trp 340 345 350

Asp Arg Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg 355 360 365

Gly Gly Met Lys Gln Met Gly Lys Ala Leu Ala Gly Pro Met Val 370 375

Leu Val Met Ser Ile Trp Asp Asp His Ala Ser Asn Met Leu Trp Leu 385 390 395 400

Asp Ser Thr Phe Pro Val Asp Ala Ala Gly Lys Pro Gly Ala Glu Arg 405 410 415

Gly Ala Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu 425 430

Ala Pro Asn Ser Asn Val Val Phe Ser Asn Ile Arg Phe Gly Pro Ile 435 440 445

Gly Ser Thr Val Ala Gly Leu Pro Gly Ala Gly Asn Gly Gly Asn Asn 450 455 460

Gly Gly Asn Pro Pro Pro Pro Thr Thr Thr Thr Ser Ser Ala Pro Ala 465 470 475 480

Thr Thr Thr Thr Ala Ser Ala Gly Pro Lys Ala Gly Arg Trp Gln Gln 485 490 495

Gly Gly Ile Gly Phe Thr Gly Pro Thr Gln Glu Glu Pro Tyr 500 505 510

Ile Thr Lys Leu Asn Asp Trp Tyr Ser Gln Leu 515 525

<210> SEQ ID NO 47
<211> LENGTH: 1812
<212> TYPE: DNA
<213> ORGANISM: Myceliophthora thermophila

<400> SEQUENCE: 47

atggccaaga agcttitt cat caccgcc.gc.c Cttgcggctg ggc.ccc.cgt.c 60 attgaggagc gcc agaactg cgg.cgctgttg tggtaagaaa agttt cocat 120 gactittctica tcgagtaatg gCatalaggcc caccc ctitcg actgactgtg agaatcgatc 180 aaatcc agga Ctcaatgcgg cggcaacggg tggCagggtc ccacatgctg cgc.ct cqggc 240 tcgacctg.cg ttgcgcagaa cgagtggtac t ct cagtgcc tgcc.caacaa t caggtgacg 300 agttccaa.ca citc.cgt.cgt.c gact tccacc tcgcagcgca gcago agcac ctic cago agc 360 agcaccagga gcggcagctic ctic ct cost co accaccacgc c cc ct cocqt ctic cago ccc gtgact agca titc.ccggcgg tgc gaccacc acggcgagct actctggcaa cccctitctog 480 ggcgt.ccggc t ctitcgc.caa cgact actac aggtocgagg to Cacaatct cgc.catt cot 540


145 150 155 160

Val Ala Glu Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile 165 170 175

Asp Thr Leu Met Val Gln Thr Leu Ser Gln Ile Arg Ala Ala Asn Asn 180 185 190

Ala Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu Val Val Asp Leu 195

Pro Asp Arg Asp Cys Ala Ala Ala Ala Ser Asn Gly Glu Phe Ser Ile 210 215 220

Ala Asn Gly Gly Ala Ala Asn Tyr Arg Ser Tyr Ile Asp Ala Ile Arg 225 230 235 240

Lys His Ile Ile Glu Tyr Ser Asp Ile Arg Ile Ile Leu Val Ile Glu 245 250 255

Pro Asp Ser Met Ala Asn Met Val Thr Asn Met Asn Val Ala 260 265 270

Ser Asn Ala Ala Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys 275 285

Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His Ala 290 295 300

Gly Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala Asp Leu Phe 305 310 315 320

Ala Gly Ile Tyr Asn Asp Ala Gly Lys Pro Ala Ala Val Arg Gly Leu 325 330 335

Ala Thr Asn Val Ala Asn Tyr Asn Ala Trp Ser Ile Ala Ser Ala Pro 340 345 350

Ser Tyr Thr Ser Pro Asn Pro Asn Tyr Asp Glu His Ile Glu 355 360 365

Ala Phe Ser Pro Leu Leu Asn Ala Ala Gly Phe Pro Ala Arg Phe Ile 370 375

Val Asp Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp 385 390 395 400

Gly Asp Trp Cys Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr 405 410 415

Ala Asn Thr Gly His Asp Leu Val Asp Ala Phe Val Trp Val Lys Pro 425 430

Gly Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala Ala Arg 435 440 445

Tyr His Cys Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu Ala Gly 450 455 460

Gln Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr Asn Ala Asn Pro 465 470 475 480

Pro Phe

<210> SEQ ID NO 49
<211> LENGTH: 1725
<212> TYPE: DNA
<213> ORGANISM: Trichoderma reesei

<400> SEQUENCE: 49

gagggcagct cacctgaaga ggcttgtaag at CaCCCtct gtgtattgca c catgattgt 60 cggcattct c accacgctgg Ctacgctggc cacacticgca gct agtgtgc Ctctagagga 120 gcggcaa.gct tct caag.cg tctggggc.ca atgttggtggc Cagaattggit cgggit Cogac 180 ttgctgtgct tcc.ggaagca catgcgt.cta ctic caacgac tat tact coc agtgtct tcc 240


115 120 125

Ala Ser Glu Val Ser Ser Leu Ala Ile Pro Ser Leu Thr Gly Ala Met 130 135 140

Ala Thr Ala Ala Ala Ala Val Ala Val Pro Ser Phe Met Trp Leu 145 150 155 160

Asp Thr Leu Asp Lys Thr Pro Leu Met Glu Gln Thr Leu Ala Asp Ile 165 170 175

Arg Thr Ala Asn Lys Asn Gly Gly Asn Ala Gly Gln Phe Val Val 180 185 190

Asp Leu Pro Asp Arg Asp Cys Ala Ala Leu Ala Ser Asn Gly Glu 195

Ser Ile Ala Asp Gly Gly Val Ala Lys Asn Ile Asp 210 215 220

Thr Ile Arg Gln Ile Val Val Glu Ser Asp Ile Arg Thr Leu Leu 225 230 235 240

Val Ile Glu Pro Asp Ser Leu Ala Asn Leu Val Thr Asn Leu Gly Thr 245 250 255

Pro Ala Asn Ala Gln Ser Ala Leu Glu Ile Asn 260 265 270

Ala Val Thr Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala 275 280 285

Gly His Ala Gly Trp Leu Gly Trp Pro Ala Asn Gln Asp Pro Ala Ala 290 295 300

Gln Leu Phe Ala Asn Val Asn Ala Ser Ser Pro Arg Ala Leu 305 310 315

Arg Gly Leu Ala Thr Asn Val Ala Asn Tyr Asn Gly Trp Asn Ile Thr 325 330 335

Ser Pro Pro Ser Tyr Thr Gln Gly Asn Ala Val Asn Glu Leu 340 345 350

Ile His Ala Ile Gly Pro Leu Leu Ala Asn His Gly Trp Ser Asn 355 360 365

Ala Phe Phe Ile Thr Asp Gln Gly Arg Ser Gly Lys Gln Pro Thr Gly 370 375

Gln Gln Gln Trp Gly Asp Trp Asn Val Ile Gly Thr Gly Phe Gly 385 390 395 400

Ile Arg Pro Ser Ala Asn Thr Asp Ser Leu Leu Asp Ser Phe Val 405 415

Trp Val Pro Gly Gly Glu Asp Gly Thr Ser Asp Ser Ser Ala 425 430

Pro Arg Phe Asp Ser His Ala Leu Pro Asp Ala Leu Gln Pro Ala 435 440 445

Pro Gln Ala Gly Ala Trp Phe Gln Ala Tyr Phe Val Gln Leu Leu Thr 450 455 460

Asn Ala Asn Pro Ser Phe Leu 465 470

<210> SEQ ID NO 51
<211> LENGTH: 1446
<212> TYPE: DNA
<213> ORGANISM: Thielavia terrestris

<400> SEQUENCE: 51

atggct caga agcticcitt ct cqc.cgc.cgcc Cttgcggcca gcgcc ct cqc totc.ccgt.c gtcgaggagc gcc agaactg. C9gttc.cgt.c tdgagcc-aat gcggcgg cat totggtcC 120


Ala Ile Pro Ser Met Thr Gly Ala Met Ala Thr Lys Ala Ala Glu Val 145 150 155 160

Ala Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile Asp 165 170 175

Thr Leu Phe Ala His Thr Leu Ser Gln Ile Arg Ala Ala Asn Gln Lys 180 185 190

Gly Ala Asn Pro Pro Ala Gly Ile Phe Val Val Tyr Asp Leu Pro 195

Asp Arg Asp Ala Ala Ala Ala Ser Asn Gly Glu Phe Ser Ile Ala 210 215

Asn Asn Gly Ala Ala Asn Thr Ile Asp Ala Ile Arg Ser 225 230 235 240

Leu Val Ile Gln Tyr Ser Asp Ile Arg Ile Ile Phe Val Ile Glu Pro 245 250 255

Asp Ser Leu Ala Asn Met Val Thr Asn Leu Asn Val Ala Lys Ala 260 265 270

Asn Ala Glu Ser Thr Glu Leu Thr Val Ala Leu Gln Gln 280 285

Leu Asn Leu Pro Asn Val Ala Met Leu Asp Ala Gly His Ala Gly 290 295 300

Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala Asn Leu Phe Ala 305 310 315

Glu Ile Thr Ser Ala Gly Pro Ala Ala Val Arg Gly Leu Ala 325 330 335

Thr Asn Val Ala Asn Asn Gly Trp Ser Leu Ala Thr Pro Pro Ser 340 345 350

Thr Gln Gly Asp Pro Asn Tyr Asp Glu Ser His Tyr Val Gln Ala 355 360 365

Leu Ala Pro Leu Leu Thr Ala Asn Gly Phe Pro Ala His Phe Ile Thr 370 375

Asp Thr Gly Arg Asn Gly Gln Pro Thr Gly Gln Arg Gln Trp Gly 385 390 395 400

Asp Trp Asn Val Ile Gly Thr Gly Phe Gly Val Arg Pro Thr Thr 405 415

Asn Thr Gly Leu Asp Ile Glu Asp Ala Phe Val Trp Val Lys Pro Gly 425 430

Gly Glu Cys Asp Gly Thr Ser Asn Thr Thr Ser Pro Arg Asp Tyr 435 440 445

His Cys Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu Ala Gly Thr 450 455 460

Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr Asn Ala Asn Pro Pro 465 470 475 480

Phe

<210> SEQ ID NO 53
<211> LENGTH: 1593
<212> TYPE: DNA
<213> ORGANISM: Chaetomium thermophilum

<400> SEQUENCE: 53

atgatgtaca agaagttcgc cgct Ctcgcc gcc ct cqtgg Ctggcgc.cgc cqCCC agcag gcttgctic cc ticaccactga gacccacc cc agact cactt ggaag.cgctg. cacct ctdgc 120 ggcaactgct caccgtgaa cigcgc.cgt.c accatcgatg C caactggcg Ctggact cac 180


Leu Met Glu Asn Asp Thr Lys Gln Met Phe Glu Leu Leu Gly Asn 130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Gly Leu Asn 145 150 155 160

Gly Ala Leu Phe Val Ser Met Asp Ala Asp Gly Met Ser 165 170 175

Ser Gly Asn Lys Ala Gly Ala Lys Gly Thr Tyr Asp 180 185 190

Ala Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Ala Asn Ile 195

Glu Asn Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Phe Gly Arg 210 215

Tyr Gly Ser Cys Ser Glu Met Asp Ile Trp Asp Asn Asn Met 225 230 235 240

Ala Thr Ala Phe Thr Pro His Pro Thr Ile Ile Gln Ser Arg 245 250 255

Glu Gly Asn Ser Gly Gly Thr Ser Ser Arg Ala 260 265 270

Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ala Tyr Arg Gln Gly 275 285

Asp Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr 290 295 300

Met Thr Val Val Thr Gln Phe His Asn Ser Ala Gly Val Leu Ser 305 310 315 320

Glu Ile Arg Phe Val Gln Asp Gly Ile Ile Ala Asn Ala 325 330 335

Glu Ser Ile Pro Gly Asn Pro Gly Asn Ser Ile Thr Gln Glu Trp 340 345 350

Asp Ala Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg 355 360 365

Gly Gly Met Ala Gln Met Ser Ala Leu Glu Gly Pro Met Val 370 375

Leu Val Met Ser Val Trp Asp Asp His Ala Asn Met Leu Trp Leu 385 390 395 400

Asp Ser Thr Pro Ile Asp Ala Gly Thr Pro Gly Ala Glu Arg 405 410 415

Gly Ala Pro Thr Thr Ser Gly Val Pro Ala Glu Ile Glu Ala Gln 425 430

Val Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile 435 440 445

Gly Ser Thr Val Pro Gly Leu Asp Gly Ser Thr Pro Ser Asn Pro Thr 450 455 460

Ala Thr Val Ala Pro Pro Thr Ser Thr Thr Thr Ser Val Arg Ser Ser 465 470

Thr Thr Gln Ile Ser Thr Pro Thr Ser Gln Pro Gly Gly Thr Thr 485 490 495

Gln Trp Gly Gln Gly Gly Ile Gly Thr Gly Cys Thr Asn 500 505 510

Val Ala Gly Thr Thr Thr Glu Leu Asn Pro Trp Ser Gln 515 520 525

Leu 530

<210> SEQ ID NO 55


Val Thr Thr Ala Pro Pro Thr Thr Thr Ile Pro Gly Gly Ala Ser Ser 100 105 110

Thr Ala Ser Asn Gly Asn Pro Phe Ser Gly Val Gln Leu Trp Ala 115 120 125

Asn Thr Ser Ser Glu Val His Thr Leu Ala Ile Pro Ser Leu 130 135 140

Ser Pro Glu Leu Ala Ala Ala Ala Val Ala Glu Val Pro Ser 145 150 155 160

Phe Gln Trp Leu Asp Arg Asn Val Thr Val Asp Thr Leu Phe Ser Gly 165 170 175

Thr Leu Ala Glu Ile Arg Ala Ala Asn Gln Arg Gly Ala Asn Pro Pro 180 185 190

Ala Gly Ile Phe Val Val Tyr Asp Leu Pro Asp Arg Asp Ala 195

Ala Ala Ala Ser Asn Gly Glu Trp Ser Ile Ala Asn Asn Gly Ala Asn 210 215 220

Asn Arg Tyr Ile Asp Arg Ile Arg Glu Leu Leu Ile Gln Tyr 225 230 235 240

Ser Asp Ile Arg Thr Ile Leu Val Ile Glu Pro Asp Ser Leu Ala Asn 245 250 255

Met Val Thr Asn Met Asn Val Gln Lys Ser Asn Ala Ala Ser Thr 260 265 270

Glu Leu Thr Val Ala Leu Gln Leu Asn Leu Pro His 285

Val Ala Met Met Asp Ala Gly His Ala Gly Trp Leu Gly Trp Pro 290 295 300

Ala Asn Ile Gln Pro Ala Ala Glu Leu Phe Ala Gln Ile Arg Asp 305 310 315

Ala Gly Arg Pro Ala Ala Val Arg Gly Leu Ala Thr Asn Val Ala Asn 325 330 335

Asn Ala Trp Ser Ile Ala Ser Pro Pro Ser Thr Ser Pro Asn 340 345 350

Pro Asn Tyr Asp Glu His Tyr Ile Glu Ala Phe Ala Pro Leu Leu 355 360 365

Arg Asn Gln Gly Asp Ala Phe Ile Val Asp Thr Gly Arg Asn 370 375

Gly Gln Pro Gly Gln Leu Glu Trp Gly His Trp Asn Val 385 390 395 400

Gly Thr Gly Gly Val Arg Pro Thr Ala Asn Thr Gly His Glu 405 415

Leu Val Asp Ala Val Trp Val Lys Pro Gly Gly Glu Ser Asp Gly 425 430

Thr Ser Ala Asp Ser Ala Ala Arg Asp His Gly Leu 435 440 445

Ser Asp Ala Leu Pro Ala Pro Glu Ala Gly Gln Trp Phe Gln Ala 450 455 460

Tyr Phe Glu Gln Leu Leu Ile Asn Ala Asn Pro Pro Leu 465 470 475

<210> SEQ ID NO 57
<211> LENGTH: 2586
<212> TYPE: DNA
<213> ORGANISM: Aspergillus oryzae

<400> SEQUENCE: 57

tittgagcgta t t cacttggc ccctt.cgcag gaggc.cgtgt gga caacgac cct taccc.gt 2460 cgtgaccttg caaactggga C9ttt cqgct Caggactgga ccgt.c act cc ttaccc.caag 2520 acgatctacg ttggaaactic ct cacggaala Ctgcc.gcticc aggcct cqct gcctaaggcc 2580 cagtaa 2586

<210> SEQ ID NO 58
<211> LENGTH: 861
<212> TYPE: PRT
<213> ORGANISM: Aspergillus oryzae

<400> SEQUENCE: 58

Met Lys Leu Gly Trp Ile Glu Val Ala Ala Leu Ala Ala Ala Ser Val 1 5 10 15

Val Ser Ala Lys Asp Asp Leu Ala Tyr Ser Pro Pro Phe Tyr Pro Ser 25

Pro Trp Ala Asp Gly Gln Gly Glu Trp Ala Glu Val Tyr Arg Ala 35 40 45

Val Asp Ile Val Ser Gln Met Thr Leu Thr Glu Lys Val Asn Leu Thr 50 55 60

Thr Gly Thr Gly Trp Gln Leu Glu Arg Val Gly Gln Thr Gly Ser 65 70

Val Pro Arg Leu Asn Ile Pro Ser Leu Cys Leu Gln Asp Ser Pro Leu 85 90 95

Gly Ile Arg Phe Ser Asp Asn Ser Ala Phe Pro Ala Gly Val Asn 105 110

Val Ala Ala Thr Trp Asp Thr Leu Ala Leu Arg Gly Gln Ala 115 120 125

Met Gly Glu Glu Phe Ser Asp Gly Ile Asp Val Gln Leu Gly Pro 130 135 140

Ala Ala Gly Pro Leu Gly Ala His Pro Asp Gly Gly Arg Asn Trp Glu 145 150 155 160

Gly Phe Ser Pro Asp Pro Ala Leu Thr Gly Val Leu Phe Ala Glu Thr 165 170

Ile Gly Ile Gln Asp Ala Gly Val Ile Ala Thr Ala Lys His 180 185 190

Ile Met Asn Glu Gln Glu His Phe Arg Gln Gln Pro Glu Ala Ala 195

Gly Phe Asn Val Ser Asp Ser Leu Ser Ser Asn Val Asp Asp 210 215 220

Thr Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp Ala Val Arg Ala 225 230 235 240

Gly Val Gly Ala Val Met Ser Asn Gln Ile Asn Asn Ser 245 250 255

Gly Glu Asn Ser Glu Thr Leu Asn Leu Leu Ala Glu Leu 260 265 270

Gly Phe Gln Gly Phe Val Met Ser Asp Trp Thr Ala His His Ser Gly 275 280 285

Val Gly Ala Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp Val 290 295 300

Thr Phe Asp Ser Gly Thr Ser Phe Trp Gly Ala Asn Leu Thr Val Gly 305 310 315

Val Leu Asn Gly Thr Ile Pro Gln Trp Arg Val Asp Asp Met Ala Val 325 330 335

Arg Ile Met Ala Ala Val Gly Arg Asp Thr Thr 340 345 350

Pro Pro Asn Phe Ser Ser Trp Thr Arg Asp Glu Gly Phe Ala His 355 360 365

Asn His Val Ser Glu Gly Ala Glu Arg Val Asn Glu Phe Val Asp 370 375

Val Gln Arg Asp His Ala Asp Leu Ile Arg Arg Ile Gly Ala Gln Ser 385 390 395 400

Thr Val Leu Leu Lys Asn Gly Ala Leu Pro Leu Ser Arg Lys Glu 405 415

Leu Val Ala Leu Leu Gly Glu Asp Ala Gly Ser Asn Ser Trp Gly 425 430

Ala Asn Gly Asp Asp Arg Gly Asp Asn Gly Thr Leu Ala Met 435 440 445

Ala Trp Gly Ser Gly Thr Ala Asn Phe Pro Leu Val Thr Pro Glu 450 455 460

Gln Ala Ile Gln Asn Glu Val Leu Gln Gly Arg Gly Asn Val Phe Ala 465 470

Val Thr Asp Ser Trp Ala Leu Asp Ile Ala Ala Ala Ala Arg Gln 485 490 495

Ala Ser Val Ser Leu Val Phe Val Asn Ser Asp Ser Gly Glu Gly 500 505

Leu Ser Val Asp Gly Asn Glu Gly Asp Arg Asn Asn Ile Thr Leu Trp 515 525

Asn Gly Asp Asn Val Val Thr Ala Ala Asn Asn Asn Asn 530 535 540

Thr Val Val Ile Ile His Ser Val Gly Pro Val Leu Ile Asp Glu Trp 545 550 555 560

Asp His Pro Asn Val Thr Gly Ile Leu Trp Ala Gly Leu Pro Gly 565 570 575

Gln Glu Ser Gly Asn Ser Ile Ala Asp Val Leu Gly Arg Val Asn 585 590

Pro Gly Ala Ser Pro Phe Thr Trp Gly Thr Arg Glu Ser 595 605

Gly Ser Pro Leu Val Asp Ala Asn Asn Gly Asn Gly Ala Pro Gln 610 615

Ser Asp Phe Thr Gln Gly Val Phe Ile Asp Tyr Arg His Phe Asp Lys 625 630 635 640

Phe Asn Glu Thr Pro Ile Glu Phe Gly Gly Leu Ser Tyr Thr 645 650 655

Thr Phe Glu Leu Ser Asp Leu His Val Gln Pro Leu Asn Ala Ser Arg 660 665 670

Thr Pro Thr Ser Gly Met Thr Glu Ala Ala Asn Phe Gly Glu 675 685

Ile Gly Asp Ala Ser Glu Tyr Val Pro Glu Gly Leu Glu Arg Ile 690 695 700

His Glu Phe Ile Tyr Pro Trp Ile Asn Ser Thr Asp Leu Ala Ser 705

Ser Asp Asp Ser Asn Gly Trp Glu Asp Ser Ile Pro Glu 725 730 735

Gly Ala Thr Asp Gly Ser Ala Gln Pro Arg Leu Pro Ala Ser Gly Gly 740 745 750

Ala Gly Gly Asn Pro Gly Leu Tyr Glu Asp Leu Phe Arg Val Ser Val 755 760 765

Met Gly Glu Glu Phe Asn Asp Gly Val Asp Ile Leu Leu Gly Pro 130 135 140

Ala Ala Gly Pro Leu Gly Lys Pro Asp Gly Gly Arg Ile Trp Glu 145 150 155 160

Gly Phe Ser Pro Asp Pro Val Leu Thr Gly Val Leu Phe Ala Glu Thr 165 170

Ile Gly Ile Gln Asp Ala Gly Val Ile Ala Thr Ala Lys His 180 185 190

Ile Leu Asn Glu Gln Glu His Phe Arg Gln Val Gly Glu Ala Gln 195

Gly Tyr Asn Ile Thr Glu Thr Ile Ser Ser Asn Val Asp Asp 210 215 220

Thr Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp Ala Val Arg Ala 225 230 235 240

Gly Val Gly Ala Val Met Ser Asn Gln Ile Asn Asn Ser 245 250 255

Gly Gln Asn Ser Gln Thr Leu Asn Leu Leu Ala Glu Leu 260 265 270

Gly Phe Gln Gly Phe Val Met Ser Asp Trp Ser Ala His His Ser Gly 275 280 285

Val Gly Ala Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp Ile 290 295 300

Ser Phe Asp Asp Gly Leu Ser Phe Trp Gly Thr Asn Leu Thr Val Ser 305 310 315

Val Leu Asn Gly Thr Val Pro Ala Trp Arg Val Asp Asp Met Ala Val 325 330 335

Arg Ile Met Thr Ala Val Gly Arg Asp Arg Leu Arg Ile 340 345 350

Pro Pro Asn Phe Ser Ser Trp Thr Arg Asp Glu Tyr Gly Trp Glu His 355 360 365

Ser Ala Val Ser Glu Gly Ala Trp Thr Val Asn Asp Phe Val Asn 370 375

Val Gln Arg Ser His Ser Gln Ile Ile Arg Glu Ile Gly Ala Ala Ser 385 390 395 400

Thr Val Leu Leu Lys Asn Thr Gly Ala Leu Pro Leu Thr Gly Lys Glu 405 415

Val Val Gly Val Leu Gly Glu Asp Ala Gly Ser Asn Pro Trp Gly 420 425 430

Ala Asn Gly Cys Pro Asp Arg Gly Cys Asp Asn Gly Thr Leu Ala Met 435 440 445

Ala Trp Gly Ser Gly Thr Ala Asn Phe Pro Leu Val Thr Pro Glu 450 455 460

Gln Ala Ile Gln Arg Glu Val Ile Ser Asn Gly Gly Asn Val Phe Ala 465 470 475

Val Thr Asp Asn Gly Ala Leu Ser Gln Met Ala Asp Val Ala Ser Gln 485 490 495

Ser Ser Val Ser Leu Val Phe Val Asn Ala Asp Ser Gly Glu Gly Phe 500 505

Ile Ser Val Asp Gly Asn Glu Gly Asp Arg Asn Leu Thr Leu Trp 515 525

Asn Gly Glu Ala Val Ile Asp Thr Val Val Ser His Asn Asn 530 535 540

Thr Ile Val Val Ile His Ser Val Gly Pro Val Leu Ile Asp Arg Trp 545 550 555 560

Asp Asn Pro Asn Val Thr Ala Ile Ile Trp Ala Gly Leu Pro Gly 565 570 575

Gln Glu Ser Gly Asn Ser Leu Val Asp Val Leu Gly Arg Val Asn 585 590

Pro Ser Ala Lys Thr Pro Phe Thr Trp Gly Lys Thr Arg Glu Ser Tyr 595 605

Gly Ala Pro Leu Leu Thr Glu Pro Asn Asn Gly Asn Gly Ala Pro Gln 610 615

Asp Asp Phe Asn Glu Gly Val Phe Ile Asp Tyr Arg His Phe Asp Lys 625 630 635 640

Arg Asn Glu Thr Pro Ile Tyr Glu Phe Gly His Gly Leu Ser Tyr Thr 645 650 655

Thr Phe Gly Tyr Ser His Leu Arg Val Gln Ala Leu Asn Ser Ser Ser 660 665 670

Ser Ala Tyr Val Pro Thr Ser Gly Glu Thr Lys Pro Ala Pro Thr Tyr 675 685

Gly Glu Ile Gly Ser Ala Ala Asp Tyr Leu Tyr Pro Glu Gly Leu Lys 690 695 700

Arg Ile Thr Lys Phe Ile Tyr Pro Trp Leu Asn Ser Thr Asp Leu Glu 705 710

Asp Ser Ser Asp Asp Pro Asn Tyr Gly Trp Glu Asp Ser Glu Tyr Ile 725 730 735

Pro Glu Gly Ala Arg Asp Gly Ser Pro Gln Pro Leu Leu Lys Ala Gly 740 745 750

Gly Ala Pro Gly Gly Asn Pro Thr Leu Tyr Gln Asp Leu Val Arg Val 760 765

Ser Ala Thr Ile Thr Asn Thr Gly Asn Val Ala Gly Glu Val Pro 770 775

Gln Leu Tyr Val Ser Leu Gly Gly Pro Asn Glu Pro Arg Val Val Leu 790 795 800

Arg Phe Asp Arg Ile Phe Leu Ala Pro Gly Glu Gln Lys Val Trp 805 810 815

Thr Thr Thr Leu Asn Arg Arg Asp Leu Ala Asn Trp Asp Val Glu Ala 825 830

Gln Asp Trp Val Ile Thr Lys Tyr Pro Lys Lys Val His Val Gly Ser 835 840 845

Ser Ser Arg Lys Leu Pro Leu Arg Ala Pro Leu Pro Arg Val Tyr 850 855 860

<210> SEQ ID NO 61
<211> LENGTH: 2800
<212> TYPE: DNA
<213> ORGANISM: Penicillium brasilianum

<400> SEQUENCE: 61

tgaaaatgca gggttctaca atc.tttctgg citttcgcctc atgggcgagc Caggttgctg 60 c cattgcgca gcc catacag aag cacgagg tttgtttitat cittgct catg gacgtgctitt 120 gacttgacta attgttttac atacagc.ccg gatttctgca cgggc.cccala gcc at agaat 180 cgttct caga accogttctac ccgt.cgcc ct ggatgaatcc tCacgc.cgag gigctgggagg 240 cc.gcatat ca gaaagct caa gattttgtct cgcaact cac tat cittggag aaaataaatc 300 tgaccaccgg tttgggtaa gtct citcc.ga Ctgcttctgg gtcacggtgc gacgagccac 360 tgacttitttg aagctgggaa aatgggcc.gt gtgtaggaaa cactggat.ca atticcitcgt.c


<210> SEQ ID NO 62
<211> LENGTH: 878
<212> TYPE: PRT
<213> ORGANISM: Penicillium brasilianum

<400> SEQUENCE: 62

Met Gln Gly Ser Thr Ile Phe Leu Ala Phe Ala Ser Trp Ala Ser Gln 1 5 15

Val Ala Ala Ile Ala Gln Pro Ile Gln His Glu Pro Gly Phe Leu 25 30

His Gly Pro Gln Ala Ile Glu Ser Phe Ser Glu Pro Phe Pro Ser 35 40 45

Pro Trp Met Asn Pro His Ala Glu Gly Trp Glu Ala Ala Gln 50 55 60

Ala Gln Asp Phe Val Ser Gln Leu Thr Ile Leu Glu Ile Asn Leu 65 70 75

Thr Thr Gly Val Gly Trp Glu Asn Gly Pro Val Gly Asn Thr Gly 85 90 95

Ser Ile Pro Arg Leu Gly Phe Gly Phe Thr Gln Asp Ser Pro 105 110

Gln Gly Val Arg Phe Ala Asp Tyr Ser Ser Ala Phe Thr Ser Ser Gln 115 120 125

Met Ala Ala Ala Thr Phe Asp Arg Ser Ile Leu Tyr Gln Arg Gly Gln 130 135 140

Ala Met Ala Gln Glu His Ala Gly Ile Thr Ile Gln Leu Gly 145 150 155 160

Pro Val Ala Gly Pro Leu Gly Arg Ile Pro Glu Gly Gly Arg Asn Trp 165 170 175

Glu Gly Phe Ser Pro Asp Pro Val Leu Thr Gly Ile Ala Met Ala Glu 180 185 190

Thr Ile Lys Gly Met Gln Asp Thr Gly Val Ile Ala Cys Ala His 195

Ile Gly Asn Glu Gln Glu His Phe Arg Gln Val Gly Glu Ala Ala 210 215 220

Gly His Gly Tyr Thr Ile Ser Asp Thr Ile Ser Ser Asn Ile Asp Asp 225 230 235 240

Arg Ala Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp Ala Val Arg 245 250 255

Ala Gly Val Gly Ser Phe Met Ser Ser Gln Ile Asn Asn Ser 260 265 270

Gly Cys Gln Asn Ser Gln Thr Leu Asn Lys Leu Leu Ser Glu 275 285

Leu Gly Phe Gln Gly Phe Val Met Ser Asp Trp Gly Ala His His Ser 290 295 300

Gly Val Ser Ser Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp 305 310 315

Thr Glu Phe Asp Ser Gly Leu Ser Phe Trp Gly Ser Asn Leu Thr Ile 325 330 335

Ala Ile Leu Asn Gly Thr Val Pro Glu Trp Arg Leu Asp Asp Met Ala 340 345 350

Met Arg Ile Met Ala Ala Phe Val Gly Leu Thr Ile Glu Asp 355 360 365

Gln Pro Asp Val Asn Phe Asn Ala Trp Thr His Asp Thr Gly Tyr 370 375 380

Lys Ala Tyr Ser Lys Glu Asp Glu Gln Val Asn Trp His Val 385 390 395 400

Asp Val Arg Ser Asp His Asn Leu Ile Arg Glu Thr Ala Ala 405 410 415

Gly Thr Val Leu Leu Asn Asn Phe His Ala Leu Pro Leu Gln 425 430

Pro Arg Phe Val Ala Val Val Gly Gln Asp Ala Gly Pro Asn Pro 435 440 445

Gly Pro Asn Gly Cys Ala Asp Arg Gly Asp Gln Gly Thr Leu Ala 450 455 460

Met Gly Trp Gly Ser Gly Ser Thr Glu Phe Pro Leu Val Thr Pro 465 470

Asp Thr Ala Ile Gln Ser Val Leu Glu Gly Gly Arg Tyr Glu 485 490 495

Ser Ile Phe Asp Asn Asp Asp Asn Ala Ile Leu Ser Leu Val Ser 500 505

Gln Pro Asp Ala Thr Ile Val Phe Ala Asn Ala Asp Ser Gly Glu 515 525

Gly Tyr Ile Thr Val Asp Asn Asn Trp Gly Asp Arg Asn Asn Leu Thr 530 535 540

Leu Trp Gln Asn Ala Asp Gln Val Ile Ser Thr Val Ser Ser Arg Cys 545 550 555 560

Asn Asn Thr Ile Val Val Leu His Ser Val Gly Pro Val Leu Leu Asn 565 570 575

Gly Ile Glu His Pro Asn Ile Thr Ala Ile Val Trp Ala Gly Met 585 590

Pro Gly Glu Glu Ser Gly Asn Ala Leu Val Asp Ile Leu Trp Gly Asn 595 605

Val Asn Pro Ala Gly Arg Thr Pro Phe Thr Trp Ala Ser Arg Glu 610 615

Asp Gly Thr Asp Ile Met Glu Pro Asn Asn Gln Arg Ala 625 630 635 640

Pro Gln Gln Asp Phe Thr Glu Ser Ile Tyr Leu Asp Arg His Phe 645 650 655

Asp Ala Gly Ile Glu Pro Ile Tyr Glu Phe Gly Phe Gly Leu Ser 660 665 670

Thr Thr Phe Glu Ser Asp Leu Arg Val Val Lys Val 675 685

Gln Pro Ser Pro Thr Thr Gly Thr Gly Ala Gln Ala Pro Ser Ile 690 695 700

Gly Gln Pro Pro Ser Gln Asn Leu Asp Thr Tyr Phe Pro Ala Thr 705 715

Ile Lys Thr Phe Ile Pro Leu Asn Ser Thr Val 725 730 735

Ser Leu Arg Ala Ala Ser Asp Pro Glu Gly Arg Thr Asp Phe 740 745 750

Ile Pro Pro His Ala Arg Asp Gly Ser Pro Gln Pro Leu Asn Pro Ala 760 765

Gly Asp Pro Val Ala Ser Gly Gly Asn Asn Met Leu Asp Glu Leu 770 775

Tyr Glu Val Thr Ala Gln Ile Asn Thr Gly Asp Val Ala Gly Asp 790 795

Glu Val Val Gln Leu Val Asp Leu Gly Gly Asp Asn Pro Pro Arg 805 810 815

ggcaagactic gtgaggccta c caagacitac ttggit caccg agcc.caacaa cggcaacgga 1860 gcc cct Cagg aag actttgt cgagggcgtC tt cattgact accgtggatt tgacaag.cgc 1920 aacgaga.ccc cgatctacga gttcggct at ggtctgagct acaccactitt caact acticg 1980 alaccttgagg tgcaggtgct gag.cgc.ccct gCatacgagc Ctgct tcggg tgaga.ccgag gcagcgc.caa Cct tcggaga ggttggaaat gcgt.cggatt accticta CCC Cagcggattg 2100 cagaga atta c caagttcat ctacc cct gg citcaacggta ccgat ct cqa ggcatct tcc 2160 ggggatgcta gctacgggca ggact cotcc gacitat ctitc CC9agggagc Caccgatggc 2220 tctg.cgcaac cgatcc tigcc tgc.cggtggc ggtCctggcg gcaac cct cq cctgtacgac 2280 gagct catcc gcgtgtcagt gac catcaag aac accggca aggttgctgg tgatgaagtt 2340 c cccaactgt atgttt coct tggcggtc. cc aatgagcc.ca agat.cgtgct gcqtcaattic 2400 gagggcatca cgctgcagcc gtcggaggag acgaagtgga gcacgactict gacgc.gc.cgt 2460 gaccttgcaa actggaatgt tgaga agcag gactgggaga ttacgt.cgta tcc caagatg 2520 gtgtttgtcg gaagct cotc gcggaagctg cc.gct Coggg cgt.ct ctdcc tactgttcac 2580 taa 2583

<210> SEQ ID NO 64
<211> LENGTH: 860
<212> TYPE: PRT
<213> ORGANISM: Aspergillus niger

<400> SEQUENCE: 64

Met Arg Phe Thr Leu Ile Glu Ala Val Ala Leu Thr Ala Val Ser Leu 1 5 10 15

Ala Ser Ala Asp Glu Leu Ala Tyr Ser Pro Pro Pro Ser Pro 25

Trp Ala Asn Gly Gln Gly Asp Trp Ala Gln Ala Gln Arg Ala Val 35 40 45

Asp Ile Val Ser Gln Met Thr Leu Asp Glu Lys Val Asn Leu Thr Thr 50 55 60

Gly Thr Gly Trp Glu Leu Glu Leu Cys Val Gly Gln Thr Gly Gly Val 65 70 80

Pro Arg Leu Gly Val Pro Gly Met Cys Leu Gln Asp Ser Pro Leu Gly 85 90 95

Val Arg Asp Ser Asp Tyr Asn Ser Ala Phe Pro Ala Gly Met Asn Val 105 110

Ala Ala Thr Trp Asp Lys Asn Leu Ala Tyr Leu Arg Gly Ala Met 115 120 125

Gly Gln Glu Phe Ser Asp Lys Gly Ala Asp Ile Gln Leu Gly Pro Ala 130 135 140

Ala Gly Pro Leu Gly Arg Ser Pro Asp Gly Gly Arg Asn Trp Glu Gly 145 150 155 160

Phe Ser Pro Asp Pro Ala Leu Ser Gly Val Leu Phe Ala Glu Thr Ile 165 170 175

Gly Ile Gln Asp Ala Gly Val Val Ala Thr Ala His Tyr Ile 180 185 190

Ala Glu Gln Glu His Phe Arg Gln Ala Pro Glu Ala Gln Gly Phe 195 205

Gly Phe Asn Ile Ser Glu Ser Gly Ser Ala Asn Leu Asp Asp Lys Thr 210 215 220

Met His Glu Leu Tyr Leu Trp Pro Phe Ala Asp Ala Ile Arg Ala Gly 225 230 235 240

Ala Gly Ala Val Met Ser Asn Gln Ile Asn Asn Ser Tyr Gly 245 250 255

Gln Asn Ser Tyr Thr Leu Asn Lys Leu Leu Ala Glu Leu Gly 260 265 270

Phe Gln Gly Phe Val Met Ser Asp Trp Ala Ala His His Ala Gly Val 285

Ser Gly Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp Val Asp 290 295 300

Tyr Asp Ser Gly Thr Ser Trp Gly Thr Asn Leu Thr Ile Ser Val 305 310 315

Leu Asn Gly Thr Val Pro Gln Trp Arg Val Asp Asp Met Ala Val Arg 325 330 335

Ile Met Ala Ala Tyr Val Gly Arg Asp Arg Leu Trp Thr Pro 340 345 350

Pro Asn Phe Ser Ser Trp Thr Arg Asp Glu Gly Tyr 355 360 365

Val Ser Glu Gly Pro Tyr Glu Val Asn Gln Val Asn Val 370 375

Gln Arg Asn His Ser Glu Leu Ile Arg Arg Ile Gly Ala Asp Ser Thr 385 390 395 400

Val Leu Leu Asn Asp Gly Ala Leu Pro Leu Thr Gly Glu Arg 405 415

Leu Val Ala Leu Ile Gly Glu Asp Ala Gly Ser Asn Pro Tyr Gly 425 430

Asn Gly Cys Ser Asp Arg Gly Cys Asp Asn Gly Thr Leu Ala Met 435 440 445

Trp Gly Ser Gly Thr Ala Asn Phe Pro Leu Val Thr Pro Glu 450 455 460

Ala Ile Ser Asn Glu Val Leu His Asn Gly Val Phe Thr 465 470

Thr Asp Asn Trp Ala Ile Asp Gln Ile Glu Ala Leu Ala Thr 485 490 495

Ser Val Ser Leu Val Phe Val Asn Ala Asp Ser Gly Glu Gly 500 505

Asn Val Asp Gly Asn Leu Gly Asp Arg Arg Asn Leu Thr Leu Trp Arg 515 525

Asn Gly Asp Asn Val Ile Lys Ala Ala Ala Ser Asn Asn Asn Thr 530 535 540

Ile Val Val Ile His Ser Val Gly Pro Val Leu Val Asn Glu Trp Tyr 545 550 555 560

Asp Asn Pro Asn Val Thr Ala Ile Leu Trp Gly Gly Leu Pro Gly Gln 565 570 575

Glu Ser Gly Asn Ser Leu Ala Asp Val Leu Gly Arg Val Asn Pro 585 590

Gly Ala Lys Ser Pro Phe Thr Trp Gly Thr Arg Glu Ala Gln 595 605

Asp Tyr Leu Val Thr Glu Pro Asn Asn Asn Gly Ala Pro Gln Glu 610 615

Asp Phe Val Glu Gly Val Phe Ile Asp Arg Gly Phe Asp Arg 625 630 635 640

Asn Glu Thr Pro Ile Glu Phe Gly Tyr Gly Leu Ser Thr Thr 645 650 655


Ile Arg Asp Ser Asp Asn Ser Ala Phe Pro Ala Gly Val Asn Val 105 110

Ala Ala Thr Trp Asp Asn Leu Ala Leu Arg Gly Gln Ala Met 115 120 125

Gly Gln Glu Phe Ser Asp Lys Gly Ile Asp Val Gln Leu Gly Pro Ala 130 135 140

Ala Gly Pro Leu Gly Arg Ser Pro Asp Gly Gly Arg Asn Trp Glu Gly 145 150 155 160

Phe Ser Pro Asp Pro Ala Leu Thr Gly Val Leu Phe Ala Glu Thr Ile 165 170 175

Gly Ile Gln Asp Ala Gly Val Val Ala Thr Ala His Ile 180 185 190

Leu Asn Glu Gln Glu His Phe Arg Gln Val Ala Glu Ala Ala Tyr 195

Gly Phe Asn Ile Ser Asp Thr Ile Ser Ser Asn Val Asp Asp Thr 210 215 220

Ile His Glu Met Tyr Leu Trp Pro Phe Ala Asp Ala Val Arg Ala Gly 225 230 235 240

Val Gly Ala Ile Met Ser Asn Gln Ile Asn Asn Ser Tyr Gly 245 250 255

Gln Asn Ser Tyr Thr Leu Asn Lys Leu Leu Ala Glu Leu Gly 260 265 270

Phe Gln Gly Phe Val Met Ser Asp Trp Gly Ala His His Ser Gly Val 285

Gly Ser Ala Leu Ala Gly Leu Asp Met Ser Met Pro Gly Asp Ile Thr 290 295 300

Phe Asp Ser Ala Thr Ser Phe Trp Gly Thr Asn Leu Thr Ile Ala Val 305 310 315

Leu Asn Gly Thr Val Pro Gln Trp Arg Val Asp Asp Met Ala Val Arg 325 330 335

Ile Met Ala Ala Tyr Val Gly Arg Asp Arg Leu Tyr Gln Pro 340 345 350

Pro Asn Phe Ser Ser Trp Thr Arg Asp Glu Gly Phe Phe 355 360 365

Pro Gln Glu Gly Pro Tyr Glu Val Asn His Phe Val Asn Val 370 375

Gln Arg Asn His Ser Glu Val Ile Arg Leu Gly Ala Asp Ser Thr 385 390 395 400

Val Leu Leu Asn Asn Asn Ala Leu Pro Leu Thr Gly Glu Arg 405 410 415

Val Ala Ile Leu Gly Glu Asp Ala Gly Ser Asn Ser Tyr Gly 420 425 430

Asn Gly Cys Ser Asp Arg Gly Cys Asp Asn Gly Thr Leu Ala Met 435 440 445

Trp Gly Ser Gly Thr Ala Glu Phe Pro Leu Val Thr Pro Glu 450 455 460

Ala Ile Gln Ala Glu Val Leu His Gly Ser Val Ala 465 470

Thr Asp Asn Trp Ala Leu Ser Gln Val Glu Thr Leu Ala Gln 485 490 495

Ser Val Ser Leu Val Phe Val Asn Ser Asp Ala Gly Glu Gly 500 505 510

Ser Val Asp Gly Asn Glu Gly Asp Arg Asn Asn Leu Thr Leu Trp Lys 515 525

Asn Gly Asp Asn Leu Ile Lys Ala Ala Ala Asn Asn Cys Asn Asn Thr 530 535 540

Ile Val Val Ile His Ser Val Gly Pro Val Leu Val Asp Glu Trp Tyr 545 550 555 560

Asp His Pro Asn Val Thr Ala Ile Leu Trp Ala Gly Leu Pro Gly Gln 565 570 575

Glu Ser Gly Asn Ser Leu Ala Asp Val Leu Tyr Gly Arg Val Asn Pro 585 590

Gly Ala Lys Ser Pro Phe Thr Trp Gly Thr Arg Glu Ala 595 605

Asp Tyr Leu Val Arg Glu Leu Asn Asn Asn Gly Ala Pro Gln Asp 610 615

Asp Phe Ser Glu Gly Val Phe Ile Asp Arg Gly Phe Asp 625 630 635 640

Asn Glu Thr Pro Ile Tyr Glu Phe Gly His Gly Leu Ser Thr Thr 645 650 655

Phe Asn Tyr Ser Gly Leu His Ile Gln Val Leu Asn Ala Ser Ser Asn 660 665 670

Ala Gln Val Ala Thr Glu Thr Gly Ala Ala Pro Thr Phe Gly Gln Val 675 685

Gly Asn Ala Ser Asp Tyr Val Tyr Pro Glu Gly Leu Thr Arg Ile Ser 690 695 700

Lys Phe Ile Tyr Pro Trp Leu Asn Ser Thr Asp Leu Ala Ser Ser 705 710 720

Gly Asp Pro Tyr Tyr Gly Val Asp Thr Ala Glu His Val Pro Glu Gly 725 730 735

Ala Thr Asp Gly Ser Pro Gln Pro Val Leu Pro Ala Gly Gly Gly Ser 740 745 750

Gly Gly Asn Pro Arg Leu Tyr Asp Glu Leu Ile Arg Val Ser Val Thr 760 765

Val Lys Asn Thr Gly Arg Val Ala Gly Asp Ala Val Pro Gln Leu Tyr 770 775

Val Ser Leu Gly Gly Pro Asn Glu Pro Lys Val Val Leu Arg Lys Phe 785 790 795 800

Asp Arg Leu Thr Leu Lys Pro Ser Glu Glu Thr Val Trp Thr Thr Thr 805 810 815

Leu Thr Arg Arg Asp Leu Ser Asn Trp Asp Val Ala Ala Gln Asp Trp 825 830

Val Ile Thr Ser Tyr Pro Lys Lys Val His Val Gly Ser Ser Ser Arg 835 840 845

Gln Leu Pro Leu His Ala Ala Leu Pro Lys Val Gln 850 855 860

<210> SEQ ID NO 67 <211> LENGTH: 3294 <212> TYPE: DNA <213> ORGANISM: Aspergillus oryzae

<400> SEQUENCE: 67

atgcgttcct c cc ccctic ct cc.gct cogcc CCCtgc.cggit gttggcc ctt 60

gcc.gctgatg gCaggit coac cc.gct actgg gactgctgca agcct tcgtg cggctgggCC 120

aagaaggctic ccgtgalacca gcc tdtctitt tcc tigcaacg CCaact tcca gcgitat cacg 180

gactitcgacg C caagt ccgg cct acticgtg cgc.cgaccag 240