Ravn et al. Biotechnol Biofuels (2021) 14:150 https://doi.org/10.1186/s13068-021-01995-x Biotechnology for Biofuels

RESEARCH Open Access CAZyme prediction in ascomycetous yeast genomes guides discovery of novel xylanolytic species with diverse capacities for hemicellulose hydrolysis Jonas L. Ravn1, Martin K. M. Engqvist1, Johan Larsbrink1,2 and Cecilia Geijer1*

Abstract Background: Ascomycetous yeasts from the kingdom fungi inhabit every biome in nature. While flamentous fungi have been studied extensively regarding their enzymatic degradation of the complex polymers comprising lignocel- lulose, yeasts have been largely overlooked. As yeasts are key organisms used in industry, understanding their enzy- matic strategies for biomass conversion is an important factor in developing new and more efcient cell factories. The aim of this study was to identify polysaccharide-degrading yeasts by mining CAZymes in 332 yeast genomes from the phylum . Selected CAZyme-rich yeasts were then characterized in more detail through growth and enzymatic activity assays. Results: The CAZyme analysis revealed a large spread in the number of CAZyme-encoding genes in the ascomycet- ous yeast genomes. We identifed a total of 217 predicted CAZyme families, including several CAZymes likely involved in degradation of plant polysaccharides. Growth characterization of 40 CAZyme-rich yeasts revealed no cellulolytic yeasts, but several species from the Trichomonascaceae and CUG-Ser1 clades were able to grow on xylan, mixed- linkage β-glucan and xyloglucan. Blastobotrys mokoenaii, Sugiyamaella lignohabitans, Spencermartinsiella europaea and several Schefersomyces species displayed superior growth on xylan and well as high enzymatic activities. These species possess genes for several putative xylanolytic enzymes, including ones from the well-studied xylanase- containing glycoside hydrolase families GH10 and GH30, which appear to be attached to the cell surface. B. mokoenaii was the only species containing a GH11 xylanase, which was shown to be secreted. Surprisingly, no known xylanases were predicted in the xylanolytic species Wickerhamomyces canadensis, suggesting that this yeast possesses novel xylanases. In addition, by examining non-sequenced yeasts closely related to the xylanolytic yeasts, we were able to identify novel species with high xylanolytic capacities. Conclusions: Our approach of combining high-throughput bioinformatic CAZyme-prediction with growth and enzyme characterization proved to be a powerful pipeline for discovery of novel xylan-degrading yeasts and enzymes. The identifed yeasts display diverse profles in terms of growth, enzymatic activities and xylan substrate preferences, pointing towards diferent strategies for degradation and utilization of xylan. Together, the results provide

*Correspondence: [email protected] 1 Department of Biology and Biological Engineering, Chalmers University of Technology, 412 96 Gothenburg, Sweden Full list of author information is available at the end of the article

© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat​ iveco​ ​ mmons.org/​ licen​ ses/​ by/4.​ 0/​ . The Creative Commons Public Domain Dedication waiver (http://creat​ iveco​ mmons.​ org/​ publi​ cdoma​ in/​ ​ zero/1.0/​ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 2 of 18

novel insights into how yeast degrade xylan, which can be used to improve cell factory design and industrial biocon- version processes. Keywords: Ascomycota, Non-conventional yeasts, CAZymes, Xylanase, Xylan, Xylanolytic yeasts

Background breakdown of biomass requires multiple exo-, endo- and Revolutionizing the use of biomass is one of the most auxiliary CAZymes to hydrolyze the diversity of polysac- promising pathways to a more sustainable production of charide backbones and side chains [16, 17]. CAZymes liquid fuels, chemicals and materials and a reduced fos- are divided into classes and families in the carbohy- sil fuel dependence. Te global benefts of a ‘green shift’ drate-active enzymes database (CAZy, www.​cazy.​org; towards a circular, biobased economy are numerous and [18]) based on their sequence similarities, which in turn include lower ­CO2 emissions, resilient product and food determine their structures and functions, e.g., catalysis chains and creation of stimulating high-skilled jobs [1]. reactions [19]. Te enzyme classes in CAZy comprise However, for it to be realized, many technological hur- glycoside hydrolases (GHs), glycosyl transferases (GTs), dles and biochemical challenges in waste minimization polysaccharide lyases (PLs), carbohydrate esterases (CEs) and resource conversion efciency must be overcome [1, and auxiliary activities (AAs). Te database also com- 2]. prises the non-catalytic carbohydrate-binding modules Lignocellulosic biomass is mainly composed of the (CBMs), which are often found linked to degradative homopolysaccharide cellulose (40–60% of dry weight), CAZymes, where their main function is to provide addi- various hemicellulosic heteropolysaccharides (20–35% of tional substrate-binding capabilities and improve overall dry weight), and the aromatic polymer lignin (15–40% of enzyme efciency [20]. dry weight) [3]. Cellulose is a linear polysaccharide con- Knowledge on CAZymes targeting the plant cell wall sisting of β-1,4-linked d-glucose units that form crystal- has mainly been generated from research on flamentous line and insoluble microfbrils [4]. Hemicelluloses coat fungi and bacteria [21–23]. Various yeast species have the cellulose fbrils and their proportions and abundances been shown to grow on diverse and complex substrates, difer between plant species. In industrially important but their contribution to biomass degradation and as grasses and hardwoods, xylans are the most abundant a source of CAZymes has been largely overlooked [21, hemicellulose type, while in other species galacto-glu- 24]. Tus, yeast species represent an untapped source comannans, xyloglucans and mixed-linkage β-glucans of CAZymes of potential industrial relevance. Moreo- are more abundant [5–7]. Xylans comprise a backbone ver, yeasts capable of growing on diverse and complex of β-1,4-linked d-xylose residues which are commonly substrates in challenging environments combined with a O-acylated and further substituted by α-1,2- or α-1,3- unicellular growth pattern, ease of cultivation and genetic linked arabinosyl units and α-1,2-linked (methyl)-glucu- manipulation make them attractive candidates as future ronic acid moieties, and these carbohydrate decorations biorefnery cell factories for consolidated bioprocessing can in turn be further substituted in various patterns. (CBP) [25, 26]. Te xylans are typically grouped into arabinoxylan (AX), Based on known CAZyme protein domains, it has glucuronoxylan (GX) and glucuronoarabinoxylan (GAX) recently become possible to annotate and predict [8]. Te arabinosyl substitutions found on xylans can be CAZymes in whole genomes in a high-throughput man- esterifed with ferulic acid that can in turn form phenolic ner using the automated online meta server dbCAN2 crosslinks to other feruloylated xylans or the hydropho- [27]. Moreover, advances in next-generation sequencing bic lignin polymers in the plant cell wall [9], thereby and bioinformatical tools have considerably increased exerting biomechanical contributions to cellulose fbril- knowledge of yeast genetics and evolution [28] and about lar networks [10]. For biorefning purposes, the complex a quarter of the approx. 1500 yeast species described and heterogenous carbohydrate matrix in the plant cell to date have been sequenced [26, 29]. Together, these wall represents one of the main challenges in efcient technical advances provide an opportunity to identify and rapid conversion of biomass and biowastes to value- polysaccharide-degrading yeast species through bioin- added chemicals and fuels [11, 12]. formatic mining, complementing time-consuming and Microbes and their carbohydrate-active enzymes labor-intensive bioprospecting approaches. Te aim (CAZymes) are central for depolymerization of the of this study was to identify polysaccharide-degrading complex lignocellulosic polysaccharides in the global yeasts by mining 332 yeast genomes from the Ascomy- carbon cycle as well as in industrial bioconversion pro- cota phylum [26]. We used the results of the initial pre- cesses [13]–[15]. Complete or semi-complete enzymatic diction of the species’ CAZyme repertoires to select a Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 3 of 18

subgroup of CAZyme-rich yeasts for more in-depth char- Culture Collection, USA (NRRL; https://​nrrl.​ncaur.​usda.​ acterization of polysaccharide metabolism and enzymatic gov/). Te selected sequenced species that we managed activities. Tis bioinformatic-based approach allowed to cultivate in the lab are listed in Table 1. Te six non- us to map phylogenetic clades rich in xylanolytic yeast sequenced species were: Sugiyamaella novakii (CBS species and identify additional highly xylanolytic non- 8402), Sugiyamaella smithiae (CBS 5657), Blastobot- sequenced yeast species. rys malaysiensis (CBS 10336), Blastobotrys illinoisensis (CBS 10339), Blastobotrys parvus (CBS 6147) and Schef- Methods fersomyces shehatae (CBS 5813). All strains were either Prediction of CAZymes by dbCAN2 in 332 ascomycetous received freeze-dried in ampules which were re-grown yeasts in liquid yeast extract–peptone–dextrose (YPD) at room A bioinformatic analysis was carried out to identify temperature, or as agar slants which were re-streaked on CAZymes in the 332 ascomycetous yeasts [26]. Fasta fles YPD agar plates and grown at room temperature. YPD containing protein sequences were downloaded from contained 10 g L­ −1 yeast extract, 20 g L­ −1 peptone and Figshare (https://​doi.​org/​10.​6084/​m9.​fgsh​are.​58546​ 20 g ­L−1 glucose. 92) in November 2019. Te protein sequences in each fasta fle were de-duplicated by clustering at 98% iden- Yeast growth characterization tity using CD-HIT [30] and cluster representatives were Growth on polysaccharides was measured in both carried forward for further analysis. Hidden Markov semi-solid and liquid media. Polysaccharides included Models (HMMs) for CAZymes were downloaded from wheat arabinoxylan (Megazyme, Ireland), birchwood dbCAN (http://​bcb.​unl.​edu/​dbCAN2/, version 8) [27]. glucuronoxylan (Sigma-Aldrich, Germany), xylo- Each sequence in the fasta fles was matched against glucan (tamarind, Megazyme, Ireland), mixed-link- these HMMs using HMMER3 [31] with the -E fag set age β-1,3/1,4-glucan (barley, Megazyme, Ireland), to flter hits with e-values below 10­ –15 as well as with galactomannan (guar/locust bean gum, Sigma-Aldrich, the—domtblout fag to obtain an easily parsable output Germany), glucomannan (konjac, Sigma-Aldrich, fle. Hits covering less than 35% of the corresponding Germany), curdlan (Merck, USA), poly-methylgalac- HMM model were removed. Additionally, if two domains turonan (Sigma-Aldrich, Germany), pectin (citrus, showed more than 20% overlap on a single protein, only Sigma-Aldrich, Germany), carboxymethyl cellulose the domain with a better e-value score was retained. For (Sigma-Aldrich, Germany), Avicel (Sigma-Aldrich, each of the enzymes in the fasta fles, potential signal Germany) and potato starch (Sigma-Aldrich, Ger- peptides, indicating secretion, were also predicted using many). For semi-solid growth, agar plates were pre- SignalP (http://​www.​cbs.​dtu.​dk/​servi​ces/​Signa​lP/) [32]. pared using autoclaved Delft minimal medium with In these runs the -org fag was set to "euk" (for eukary- different polysaccharides 0.2% (w/v) and 2% agar (w/v). ote), and the -format fag to "short" to obtain easily pars- The Delft media contained 5 g L­ −1 ammonium sulfate, able output fles. Te fnal data was visualized, using the 3 g ­L−1 potassium phosphate, 1 g ­L−1 magnesium sul- ETE toolkit, on phylogenetic trees comprising the 332 fate, vitamins and trace metals as described previously yeast species, likewise retrieved from the Figshare record [36], and pH was adjusted to 5 using 2 M KOH. Yeasts (https://​doi.​org/​10.​6084/​m9.​fgsh​are.​58546​92). were inoculated in Delft medium 2% glucose (w/v) and Te entire bioinformatic analysis was carried out in an grown at 30 °C, 150 rpm for 24 h before harvested, Anaconda environment (https://​www.​anaco​nda.​com/; washed, and resuspended in water to a cell density of version 2018.12) using the Python programming lan- ­OD600 = 5.10 µl of the cell suspensions were spotted on guage (version 3.7.1) on a Unix system. Python librar- plates that were then sealed with parafilm and kept at ies used include Pandas version 0.25.1, re version 2.2.1, room temperature for 10 days before scoring growth. BioPython version 1.72 [33], ETE3 version 3.1.2 [34] All strains were also spotted on agar plates either with- and Matplotlib version: 3.3.2 [35]. Other dependencies out any carbon source (where no strains were expected include CD-HIT version 4.8.1, HMMER version 3.2.1, to grow) or with 2% glucose (where all strains were and SignalP version 5.0b. expected to grow). The Saccharomyces cerevisiae strain CEN.PK 113-7D that is unable to grow on polysaccha- Yeast selection rides was also included as a negative control. Growth Yeasts were selected based on their total number of was scored by visual inspection of colony thickness predicted CAZymes and CAZyme functional activ- and size (including hyphae) in comparison to cell drop- ity clustering in polysaccharide degradation (Addi- lets on plates without carbon source. For growth in tional fle 1: Table S1). In total, 40 sequenced yeasts and liquid cultures, yeasts were inoculated with a starting six non-sequenced yeasts were ordered from the ARS ­OD600 = 0.05 in Delft minimal media containing 10 g Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 4 of 18

Table 1 Overview of budding yeast growth assessment on agar plates and liquid cultures using diferent polysaccharides

Clade Species NRRLWheat AX Birch GX Xyloglucan β-glucan GluMan GalMan Pectin Poly-MeGal Curdlan Avicel CMC Saccharom. Saccharomyces cerevisiae (CENPK) - - - - + + - W - - - Lipomyc. Lipomyces arxii, CBS 7333 Y-17921 ------+ - - - Spencermartinsiella europaea, CBS 11730 Y-48265 ++ ++ + + W - - + - - - Sugiyamaella lignohabitans, CBS 10342YB-1473 + ++ - W W - ++ ++ - - - Diddensiella caesifluorescens, CBS 12613Y-48781 W W - +-- - W - - - Blastobotrys peoriensis, CBS 10340YB-2290 ++ ++ - ++ - - - + - - - Blastobotrys muscicola, CBS 10338Y-7993 W W - W - W - ++ - - - Trichomon Blastobotrys mokoenaii, CBS 8435 Y-27120 +++ ++ +++ W + +++ - + W - - ascaceae Blastobotrys americana, CBS 10337Y-6844 W W - + - W - + W - - Blastobotrys adeninivorans, CBS 7370 Y-17693 + + - ++ - + - + - - - Blastobotrys proliferans, CBS 522.75 Y-17577 ++ W W ++ + ++ - ++ W - - Blastobotrys serpentis, CBS 10541 Y-48249 + + + W W W - + - - - Blastobotrys raffinosifermentans, CBS 6800 Y-27150 + + + + + - - + + - - Ambrosiozyma philentoma, CBS 6276 Y-7523 W W - + + - - + W - - Pichiaceae Ambrosiozyma monospora, CBS 6392 Y-7403 W - - + - - W + - - - Ambrosiozyma oregonensis, CBS 5560 Y-6106 W W - ++ - - W + - - - Spathaspora passalidarum, CBS 10155Y-27907 + ++ - - W - - ++ - - - Debaryomyces subglobosus, CBS 792Y-6666- W - - - - - + - - - Debaryomyces fabryi, CBS 5948 Y-1453 - W - - - - - + - - - Debaryomyces prosopidis, CBS 8450 Y-27369 - W - - - - - + - - - CUG-Ser1 Debaryomyces nepalensis, CBS 5921 Y-7108 + W - - - - - + - W - Candida gorgasii, CBS 9880 Y-27707 W + - - - - - + - - - Debaryomyces maramus, CBS 1958 Y-2171 W W - - - - - + - W - Scheffersomyces lignosus, CBS 4705 Y-12856 ++ ++ - - - - - + - - - Sceffersomyces stipitis, CBS 6054 Y-7124 + + - - - - - + - - - Wickerhamomyces anomalus, CBS 5759 Y-17698 W - W - - - - + - - - Phaffomyce Wickerhamomyces ciferrii, CBS 111Y-1031 W - - W + - W ++ - - - taceae Wickerhamomyces canadensis, CBS1992Y-1888 + + - - - - - + - - - Wickerhamomyces hampshirensis, CBS 7208 YB-4128 W + - - - W W W W - - CUG-Ser2 Saccharomycopsis capsularis, CBS 2519 Y-17639 + + W W - W - W W - - Saccharomycopsis malanga, CBS 6267 Y-7175 - - + + ------Growth was scored by visual comparison to a negative control plate not containing a carbon source and by the diference in colony thickness and size (including hyphae, if present) Growth was ranked from to , where was regular growth and extensive growth, while W indicates weak growth and no growth. Growth after 72 h in + +++ + +++ − liquid cultures > OD 0.2 is indicated by a green color = AX, arabinoxylan; GX, glucuronoxylan; GluMan, glucomannan; GalMan, galactomannan; Poly-MeGal, poly-methylgalacturonan; CMC, carboxymethyl cellulose; Saccharom., Saccharomycetaceae; Lipomyc.,

Liquidculture 72 h >OD=0.2

­L−1 (w/v) of the different polysaccharides except curd- Xylanolytic activity determination lan, CMC and Avicel, and cultivated at 30 °C, 150 rpm To quantify the xylanolytic yeasts’ secretome and cell- for 72 h before determining growth through optical associated xylanase activities, the fnal cultures from density ­(OD600) measurements. Yeast cultures that dis- the GrowthProfler experiment were collected by cen- trifugation (2000 g 15 min) and xylanase activity was played optical densities of ­OD600 ≥ 0.2 were considered × as growing on the respective polysaccharide. assayed in the cell-free supernatant or the intact cell To follow growth on xylan substrates over time, pellets, respectively. Te assay mixture consisted of a 1 selected species were precultured at 30 °C, 150 rpm 175 µl xylan suspension of 10 g ­L− wheat AX or birch- for 24 h in Delft medium containing 2% xylose (w/v). wood GX and 50 mM sodium acetate bufer (pH 5.5) Here, xylose was selected as carbon source as it has added to cell pellets or 25 µl cell-free supernatants previously been shown to induce expression of xyla- mixed in a 96-well plate. Te mixture was incubated at nases in other xylanolytic yeasts [24, 37]. Precultured 30 °C for 30 min followed by immediate chilling on ice. cells were then inoculated in 250 µl Delft medium sup- Reducing sugar ends released by xylanases was deter- plemented with 10 g ­L−1 xylan (either wheat AX or mined by the dinitrosalicylic acid (DNS) method [38] as end point assay. All enzymatic measurements were birchwood GX) to a starting ­OD600 = 0.2. While wheat AX was soluble in Delft medium, birchwood GX was performed in triplicates. One unit of enzyme activ- not fully soluble. All yeast strains were grown in bio- ity was defned as the amount of enzyme required to logical triplicates in a 96-well plate setup in a Growth- release 1 µmol of reducing sugars in 1 min under the 1 Profiler 960 (Enzyscreen, Netherlands). ‘Green Values’ assay conditions. Volumetric activity (U ­mL− ) was cal- (GV) measured by the GrowthProfiler correspond to culated by converting mM reduced sugar to Units by growth based on pixel counts, and GV changes were multiplying with total assay volume (L), dividing with recorded every 20 min for 72 h at 30 °C and 150 rpm. Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 5 of 18

assay time (min) and then dividing with sample volume of CAZymes is likely incomplete. Nonetheless, we identi- (L) as described previously [39]. fed a total 217 diferent CAZyme families, with GT and GH as the most predominant classes. Te full genetic Phylogenetic analysis CAZyme prediction and associated protein sequences Phylogenetic trees of GH10 and 11 xylanases were con- can be downloaded from Zenodo (https://​zenodo.​org/​ structed using the identifed yeast enzymes as well as record/​45483​36/​export/​hx#.​YLSzG​qgzaUk, https://​doi.​ sequences from 259 characterized GH10 members org/​10.​5281/​zenodo.​45483​35). and 208 characterized GH11 members retrieved from Te 332 yeasts encoded on average 152 CAZymes with the CAZy database (www.​cazy.​org), respectively. Te 20 yeasts having more than 200 CAZymes. Yeasts con- sequences were aligned using MUSCLE (https://​www.​ taining the highest number of CAZymes primarily belong ebi.​ac.​uk/​Tools/​msa/) [40], and then submitted for tree to the Trichomonascaceae clade, followed by species building using the online Iqtree tool http://​iqtree.​cibiv.​ in the Lipomycetaceae and the Pichiaceae clades. Spe- univie.​ac.​at/ with 1000 bootstrap alignments and viewed cies with low amounts of CAZymes include Hansenias- in MEGA-X as Newick trees [41]. For species phyloge- pora and Eremothecium species. With our focus being netic analysis, Internal Transcribed Spacer (ITS) nucle- on potential polysaccharide-degrading yeasts and their otide sequences from xylanolytic yeasts, their closely respective CAZymes, GTs involved in biosynthesis of related species and Schizosaccharomyces pombe as out- disaccharides were not included in subsequent analyses. group were aligned using ClustalW, and a maximum like- An overview of the abundance of CAZymes (except GTs) lihood (ML) phylogenetic tree with bootstrap value 1000 in individual yeast species throughout the phylogenetic was constructed using MEGA-X. tree can be viewed in Fig. 1, with increasing CAZyme numbers represented by yellow to dark red color. From In‑gel proteomics of the GH11 enzyme from Blastobotrys this initial analysis, we can conclude that some clades mokoenaii more than others appear to be hotspots for identifcation Supernatants from yeast cultures of Blastobotrys mokoe- and characterization of yeast CAZymes. naii grown in Delft minimal medium with 10 g ­L−1 birchwood GX or wheat AX (72 h, 30 °C, 150 rpm) CAZyme annotation in putative polysaccharide‑degrading were concentrated using 10 kDa ultra centrifugal flters yeasts (Amicon, Merck, Germany) by centrifugation (2000×g, To relate and confrm genetic CAZyme predictions to 10 min, repeated 3×). Secreted proteins were identifed real capacity of polysaccharide utilization, 40 yeasts from by sodium dodecyl sulphate–polyacrylamide gel electro- six diferent phylogenetic clades were selected for fur- phoresis (SDS-PAGE). A protein at ~ 24 kDa was cut out ther characterization. Te species were chosen based on from the gel using a scalpel and kept at − 20 °C before their total number of predicted CAZymes and the clus- sent for proteomic analysis. Te protein identity was con- tering of their CAZymes by functional activity involved frmed by MS/MS analysis as a 23.19-kDa GH11 xyla- in polysaccharide degradation. Te distribution of the nase with the following predicted protein sequence (217 diferent enzyme classes (excluding GTs) and CBMs in residues): the selected yeasts is shown in Fig. 2a. Te GHs showed MKLSNAITAICAAAVLAAPLEEEEVAKRSVTPSST- the highest variation in number among species while GTNNGYYYSFWSDGGGDVTYTNGNGGSYSVEWT- relatively few CBM and PL families were predicted. Te NCGNFVGGKGWNPGAAREINFSGSFNPSGNGYLS- yeast with highest number of CAZymes (excluding GTs) VYGWTTNPLVEYYIVESYGDYNPGTAGTFLGT- was Spencermartinsiella europaea with 204 predicted VDSDGSTYDIYKAVRTNAPSIEGTATFDQYWSIR- CAZymes followed by Blastobotrys proliferans (203) RNHRTSGTVNTGNHFNAWAQHGLQLGTHNYQI- from the same clade, then Lipomyces starkeyi (167) from VATEGYQSSGSSSITVS. the Lipomycetaceae clade. Tese numbers are around twice the number of CAZymes found in the more com- Results monly studied yeasts such as Saccharomyces cerevi- CAZyme abundance and distribution in ascomycetous siae (79) and Schizosaccharomyces pombe [22]. Notably, yeasts another fve Blastobotrys species from the Trichomon- To identify polysaccharide-degrading yeasts, the ascaceae clade also ranked among the top 25 yeasts in dbCAN2 meta server was used to scan the genomes of terms of absolute CAZyme numbers. In addition, we 332 yeast species within the Ascomycota phylum to pre- grouped the yeasts’ CAZymes by predicting functional dict and compile their encoded CAZymes [26, 27]. It is polysaccharide degradation activities (Additional fle 1: important to note that the 332 genomes included in the Table S1), e.g., β-glucanases, cellulases, chitinases, lignin- dataset are not all complete [26], and therefore, the list degrading enzymes, mannanases, pectinases, starch Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 6 of 18

W

s i

c

s si

W k 3

s Eremot W is i n n e 4

i

s

a Ere r

c

i 2 ra h c ns Klu n La c ire S k s i C e k i a

e Ere e t i n

e s

i a m i s K m a b 2 sh c Ca r r l a i yver Starme tol r n ae

h

Eremothe a hanc r ade n lu o i u si

h ri n o o h m f e t o l

a d m i l w Kluyve p n a a p y s ndid m mp l s yveromyc thec ec l Kluyveromyce i 200 m a i an i e

a c s o Lac m t a d a othe rm te c n omyc y li r s a

h p i o c n h a s i e a e u c o a i a s ra m i s t a ens opun e

a olen fa e m pon a a h i m a s i s e h u a t p u ra n ss e s es t s La y ance c a c m m m m

e y yc m a r o v Lach an s ae

sineca c L Ashb e y La om i es b a y raki e y u l c y yc a quercu r

c e c l e c a m m

z L z ur i n

s e cymbalariae z turn uma g m ma ua e c h tas i m ce y o t s y yc om 180 c c um m acha es o s o ti o

s y derosa h o schu s a an y a do t t z i ha han la t o r i y t s a sa c t a a m m ces i t m y acl l ii d o b o amomyce i tic i mi a

y i ii l e oz n e n o t e in c ma ossy om ra b i f id t s

o era m n f t ncea a a h m en c d f amom ce a a u coryl e n t ra L n f a e n r ha a

z d fomy ce v a i CAZyme families (# n r l va nce r o h r lo a s n r La r a ac du n e e n u u a a a ne ian r

hanski af rso i n zar aestu r ne jad y a a l a a fre r a a e s af rham n a nom a n m s i esc lactis a h r i La thermoto a xianus ceri h er tio x c h i r d te cuum e B h indn i m B d r pi B ic q C a k l nd lv a P ra fab hanc ance ma P dner ner e r Torulas d meyers i Ca ke ngat 160 ote ick a Tor ch ue P c di in u i B i rli e p e f asi er n L n Ba ic n a tativo T Wicke rl nd n r achancea ance othofagi nu arii W b rli a m quer elo T b W e li or ulaspor Lac nsis i be d o ea e en W y e r a a lac va Ca y dnera er i ul rul e ida v in c d C yb l n dner ycetangi s Zyg Z C d i m ia aspor T p ha l e s n hylecoet Zygos y aspor ferm wa erans n i C yb m s var or or iran i i Cyb m en g s an rli osac ot nce s C s 140 ul yber a a a i C berl e dnera sc Zygot a s b n o a m aspor fra en C erlindn idea nsi r pretori tina l li scen s a ul a k a ci ti Cy r chyder e c Z l yb e h ccharo nci uy t i Cy ndid e ulve yg a d ati chyderf ni nsi aromyc sporaic el C a e osaccharomyceso roel a mal ve d Ca yb fulv g Zy ru b s ri en cae C lloasco ispa N r ri oroparopa ae g mycela uec A ca ns p 120 au o s lip si o sonia e s flore Sp a h bubul t a e por e s d lyti a m s k soi e k Sp ia keeluno t ccha dsoniadi a o s a ae ii Nad a lip forma m Na vo omb n de n i lg us b mrak Na a w u um zym r isporus tina s rrow de Ka om C ro wia a iv omycesdidu u Ya d us perm 100 o ch wi a id Kaza zachstaniav a y i Yar rro can as Kaz oz d ce aen ba i wi oven lb iculatusr Ka y aire s Ya o lh et Kaz ma ili rr s a Ka z ac chsta r s i Yarro e hum gen s t a ne ou i e ata zachs c ac h s Ya idd tric s c s h s castelliin x ascu cu lav

hsta ) stan tani ni solicola si ii M c nsis lis e 80 a aerobia s Geo od te ti tan a uni p e ne a Kazac ia transvaaln odasusiomya ia rs cqia Kazac ia ia s Di h e Ka Dip incommuniind v y iame s a a a z h aku p Magnproc d ll domer hsta achs st or a ma ie ol an s a S ndi zy lla ic 60 K h ns a b a aza ni tania ia t imae e i a l a b aia nsis s C ako rham bom la chstan e hamie icol Kazachst rom ne ns D cke la p ico na ker a cacticont el g n is Wi c erel lla a Kaz i i an s e nf ola a ac i is Wi rm er a i c 40 a a int earumshi ta amiellal s Kaz chsta n e i S rm h ti Kazachstani ia m st a er musci n Ka ac in St ck rpe na z hsta nia ar a i rys e ica ach tin lis W ot s s Ka afric ia ickerhamielob nsis s za sta ni e W st otry amer ans c n a vitic a la b orie hs ia a turic na B trys pe nivoran ta kunashir o o s eni Sacc nia e la Blasto iferment Ka s nsis astob otry s ad inos ii h zachstape Bl try f na Sacch aromyce nc en astob bo s raf oe er sis Bl o aromyc nia oru ast mok sis Sacc s rosi m Bl obotryrys en Sacc par st ot ivea ntic haromycese ad n Bla ob liferans h s cerev oxus ii ast trys n Sa aromyc Bl pro salma c isia stobo yma ch e mikata Bla is aromyces kudriavzevi e aldoz aens Sa Blastobotrysew n Sacc cch s e s ofu rae ns aromyc ar Groen e ta haromyc boric i ascu s mey aea es u o Zygo scu lignohabi op es va la a lla a e ur Naka Ca eu ru Zygo mae siell cens ndid baya m rtin seomyc a nivar n a ifluores us Sugiyacerm caes es ien en a lis Candid de sis Sp iell abi lph ddens vari a bra ensi Di ria Na Ca car s gonopsis ina ila kaseomyce ndida ensis Tri is v todoph glabrata nema s bac Trigonopsma illisporus Tet Candid Botryozy starmeri rapisis a pora pora caste Tortis anteri namn llii ora g tica Tetra aonens Tortisp aseinoly pisispora is pora c Tetrapis flee Tortis fer is tii yces lipo Tetrap pora pha Lipom icus isispora ffii es japon Vand iriomote Lipomyc njongii erwaltozym nsis s doo re a polys pomyce oae Tetra pora Li nenk pisisp kono ora bla t Lipomyces Yueo tae starkeyi myces s ine Lipomyces Hans nsis arxii eniaspora Lipomyces singularis brius Hanseniaspor Lipomyces mesem a v albyensis Hansen pomyces oligophaga iaspora uvarum Li H Lipomyces suomiensis anseniaspora clerm ontiae Hanseniaspora pseudoguilliermondi i Babjeviellainositovora Kloeckera h atyaiensis Millerozyma acaciae Hanseniaspora osmophila Debaryomyces m aramus Hanseniaspora vineae Debaryomyces nepalen dea asiatica sis Ascoi Debaryomyce s hansenii Ascoidea rubescens Deb aryomyces pros psis capsularis Debaryomyc opidis Saccharomyco a es f sis malang Deb abryi haromycop aryomyces s Sacc polymorpha Priceo ubglo Ogataea myces bosus morpha Pric carso parapoly eomyce nii Ogataea endri s medi ea philod Priceom us Ogata Priceomycesyces ca aea kodamae stillae Ogat minuta Sch ha ataea effe plop Og ntans Schef rsomyce hilus nferme i ferso s lignos aea no a henrici Spathas myces us Ogat ae pini pora stip Ogat aea Spa itis Ogat thas passa zyma Spa pora lidarum luco ii tha girio taea g zsolt Spathasspor i Oga aea a a rbo nens Sp po ra Ogat bsti athas ra riae loa ae Ca p gorw eha ialb n ora iae ea tr pul rsa Candiddida hagerda ata ea po e soj Og toav s Ca ae liae Ogata nitra sensi n a t r ea ili ila did opicalis ata ea p ph Can a albi Og gata halo ica C dida c O tre ol and d an a an s id ub s atae th tan Lodderoma c lin Og a me en o ien tae erm oma Can ryda sis a of e d y li Og bin ilent sia Ca ida ce ra ph ro s ndi ortho s a a ma si Acic d elo did y amb en a par p ng sioz on ra Ko ulo sil isp Can yma eg o i da co a os oru oz or osp fti C ma nid psil is s Ambro a n rkli i and i osi ym mo e ifti K ea um s Ambrosiioz nd od ida la ac s ma a erkl S amae re e uleat bro ozy dov nd uho s tipor u a acolaae Su tin Am se a v g i my a ga i um mbrosi p m ina ee di S ho c ohm e A ma al u m es y kash m ee S hom yc p eri ioz a a a h asei u es y os ym hi is Wic ho y ra r oz c nak ns m ces ta lid mb Ambrosiozy Pi e i T k yc n a A siozymrosi g evi eun er emberorumza e o chia ve s Teunom h es wae br Pi r vz ali o am Amb no ria Teu my canb n Am a ent ua ia f s i ud d ig s C c e is a no yc es luo rr Pich ia k cci a Ca n m a ch o a ex s d e gatu re e i ia lo a Can n ida yce s s ns P ichi nifacieny l d k ce is P a x C i s c ru ne Pich o co is didad au i n ns b rri s Clavisa a retensii si or e n nd heve ris s s e eri Can ia t ip gl ra id or membra h c a o C a f si id c o p is p eg ic s d i s i C an di o r ola n P a h is n o d uc o ichia rad d Ca a di ra nensis P Ca por a ait e nd a tu ser r Mets d bl lus s o arue e n i a i a rnis a s nca a Ca di da a i r o n t t or tu isp a z ila M d t ta an p d h n c a go w e e is or n Me et r ia Sa p a silv is di h anc me rn turn s r M sch n e u a rnispome o xuum s da i t S rni u esop u M et tschni kow lub h d a er ia Sa M ets ni haw atu Sat or abi lftens s s e p a flu e nsis M etsch ch k ia bi vi n S urnisp omal e la i ow iae rij ll M ets c ni k a i nis n Me hn o iian r yma ja d e etsch k i c Sat i x iphi ishi la Me c o wi a u z rvan r s a a ni ik o n ru c i M t hnik w a k a s i e e an Me s ow i pi Satu g a b ico ni e t ko ia p in yc g i M s c hi d rv ustersianusuc ivor d tsc ch h n u c ata a Metschn ik ow w ia a bisc k a e m es yl l n Metsch e t nik t Kre g c a na oi schn i berdee ae a v s d ment a is Metsch tschni hn ni ow a shivo Mart e no i M r e a su ow ia i a c ea eth hi atae s i Me k d a my a b ns M i i K y nd a c et k o a pr r r o m d e ul s M iko ake a r i s og Metsch o w ia cap e bi li Mets t s b ga e tensis et w o an Ca ris Metsc t sch ch ko i n c om gat a i D n a o Brett t a ea a o r a s ikow w ip t t ii C i e nsberg iae uspi n C n r e to ta s ik a O a H a c w c e a a t hi m H n i o ae i H a c i a s Cand hnik a Y a kow o Y o al Cand s us i Y n Y astori C n cer m a ishi a pop yp ik ia i ylosa l i h g i y n Bret hawaii l lt i c as n y nt m i a n wi d ia a i a i p a i siamensi s a s x a i is a etan v l p Og a d e n ko oe rett a i h ow ma ra p l e p at a i h idus d m d s s a el c n m h lo holst m h i r m s mat r a ik ra i h B Og u p s n i l n a n a f h s h i o n t n d a a i opic ni i o ensi g d kowi w ka c en a es b a o n d iko a a ow a o K a i Kur a p a z m t d o g w ai a i ha a e a a a hhea ac b d p at d daru e i k i a to e r d d p d y onen ella o i d ma au e r h t gata d g ow i atae ur f r m s a a a i s alb i o a m ha a al a i nnophi c p a c r wia i a myce a e s a r c a r n h hagi o K a z mak u d g z s z h o pseudo s z a ve a ci is ym h ta e a c y t a e t r i ia wa y n y i y i a u l o a scal d i a ama a l gata a a aw c l m simi sis sc h m a i m a m o eromyce c i i ri ii a a t m bo va a z d a a cleri d heim d s i i b n en u i zo i i e iteromyc Koma a a o C a a h r a nui ie a l a eroz m Cite m ekor a d d u t g u Cit t d r C Peterozym k o a w m o i a y s k ns a t l n n p o n n r ana m is mata l a e r p le d z a c t u an ar d en a h Kom halo a a o i Nakazawa i Pe a n n o gataella Na n o o h i e i i siae i i s k p r m l n is C d C to h i l u e a a l id e nsi a

e o s y y a e i n p i n n i i s n g t r z C z a e y s zmaniell a Ce e i s um t a s s om o a ru Pachys e o i r C C s i e w

K s m

M e m

a

a

y Kurt a

e

e

M Lipomycetaceae Saccharomycetaceae Trigonopsidaceae Saccharomycodaceae / CUG-Ser2 clade Trichomonascaceae Pichiaceae Sporopachydermia clade CUG-Ala clade Alloascoideaceae CUG-Ser1 clade Phaffomycetaceae Fig. 1 CAZyme abundance in 332 budding yeasts. The total number of predicted CAZymes (GTs excluded) in each yeast species is represented by a heat signature ranging from light yellow to dark red with increasing numbers of predicted CAZymes

degrading enzymes, xylanases and xyloglucanases [18] portfolios and with a particular enrichment of mannan-, and created a heatmap based on the resulting number of xylan-, xyloglucan-, and cellulose-degrading CAZymes. enzymes (Fig. 2b). Te analysis suggests that the yeasts Te Lipomycetaceae clade appears rich in starch degrad- from the Trichomonascaceae clade have diverse enzyme ing CAZymes, while Aciculoconidium aculeatum in Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 7 of 18 n an ca e n

ablu uc -g n gl in in ch nna llulos ta ct ar gn Chitin St Be Ce Li Ma Pe Xyla Xylo

e Lipomyces starkeyi Lipomyces starkeyi Lipomyces arxii Lipomyces arxii Lipomyces kononenkoae Lipomyces kononenkoae Lipomyces lipofer Lipomyces lipofer Lipomycetaceae Lipomycetacea Lipomyces doorenjongii Lipomyces doorenjongii Spencermartinsiella europaea Spencermartinsiella europaea Sugiyamaella lignohabitans Sugiyamaella lignohabitans Diddensiella caesifluorescens Diddensiella caesifluorescens e Blastobotrys peoriensis Blastobotrys peoriensis Blastobotrys nivea Blastobotrys nivea Blastobotrys muscicola Blastobotrys muscicola Blastobotrys mokoenaii Blastobotrys mokoenaii Blastobotrys americana Blastobotrys americana ichomonascaceae ichomonascacea Tr Tr Blastobotrys adeninivorans Blastobotrys adeninivorans 60 Blastobotrys proliferans Blastobotrys proliferans CA Zy Blastobotrys serpentis Blastobotrys serpentis 50 Blastobotrys raffinosifermentans Blastobotrys raffinosifermentans Ambrosiozyma philentoma Ambrosiozyma philentoma me families (# Ambrosiozyma ambrosiae Ambrosiozyma ambrosiae 40 Ambrosiozyma monospora Ambrosiozyma monospora Pichiaceae Pichiacea e Ambrosiozyma oregonensis Ambrosiozyma oregonensis 30 Spathaspora passalidarum Spathaspora passalidarum Debaryomyces subglobosus Debaryomyces subglobosus Debaryomyces fabryi Debaryomyces fabryi 20 ) Debaryomyces prosopidis Debaryomyces prosopidis Debaryomyces hansenii Debaryomyces hansenii 10 Debaryomyces nepalensis Debaryomyces nepalensis CUG-Ser1 CUG-Ser1 Candida gorgasii Candida gorgasii Debaryomyces maramus Debaryomyces maramus 0 Scheffersomyces lignosus Scheffersomyces lignosus Scheffersomyces stipitis Scheffersomyces stipitis Aciculoconidium aculeatum Aciculoconidium aculeatum c c Wickerhamomyces anomalus Wickerhamomyces anomalus my my Wickerhamomyces ciferrii Wickerhamomyces ciferrii Wickerhamomyces canadensis Wickerhamomyces canadensis Phaffo etaceae Phaffo etaceae Wickerhamomyces hampshirensis Wickerhamomyces hampshirensis Saccharomycopsis capsularis Saccharomycopsis capsularis Saccharomycopsis malanga Saccharomycopsis malanga Ascoidea asiatica Ascoidea asiatica CUG-Ser2 CUG-Ser2 Ascoidea rubescens Ascoidea rubescens

Fig. 2 Total number of CAZymes (except GTs) in the 40 selected yeasts and their grouping by function. a Total number of CAZymes in each selected species. b CAZyme families from the same species grouped by predicted function in polysaccharide degradation. Dark red and red-colored squares indicate high number (#) of CAZymes with predicted activity towards the listed polysaccharide. Please note that the heatmap is depicting the total number of CAZyme-encoding genes belonging to families known to degrade specifc polysaccharides, and thus heat signatures from polysaccharides with very few CAZymes needed for depolymerization (e.g., β-glucan) may be skewed compared to more complex polysaccharides (such as xylan) requiring many CAZymes. Poly-specifc enzyme families such as GH5 and GH3 may also show false positive activities as their members have shown activities on several diferent β-1,4-linked glycans, e.g., xylanase, mannanase, glucanase, glucosidase, galactanase [19]. GH5 enzymes were assigned to cellulose, mannan, xylan, and xyloglucan, while GH3 were assigned to β-glucan, cellulose, xylan and xyloglucan. CBM, carbohydrate-binding module; CE, carbohydrate esterase; GH, glycoside hydrolases; PL, polysaccharide lyase the CUG-Ser1 clade contains multiple enzymes for chi- ascomycetous yeast genomes (Fig. 2b). Collectively, the tin degradation with a total of 57 predicted GH18 chi- results suggest that the assessed yeasts are equipped with tinases. In general, relatively few CAZymes associated a range of diferent polysaccharide-degrading enzymes, with pectin and lignin degradation were predicted in the where some species seem specialized to degrade specifc Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 8 of 18

polysaccharides while others appears to be polysaccha- β-glucan, glucomannan and galactomannan. Also yeasts ride generalists. from the CUG-Ser1 and Phafomycetaceae clades showed growth on xylan, whereas those from the Pichiaceae clade Growth characterization on diferent polysaccharides did not. Some of the herein characterized species have To determine if polysaccharides could support growth been identifed as xylan-growers also in other screens, for for the 40 selected ascomycetous yeasts, the yeasts were example Schefersomyces stipitis, Sugiyamaella lignohab- cultivated on agar plates with semisolid minimal media itans and Spencermartinsiella sp. [42] while, to the best (Delft) supplemented with diferent polysaccharides as of our knowledge, other species such as Blastobotrys ser- the sole carbon source. Growth on xylan, xyloglucan, pentis, Blastobotrys peoriensis and Schefersomyces ligno- β-glucan, galactomannan, glucomannan, pectin and sus have so far escaped attention in this regard [43, 44]. poly-methylgalacturonan polysaccharides was also con- In opposite to hemicellulosic substrates, the assessed frmed in liquid cultures and the accumulated growth yeasts did not grow well on cellulose despite predictions results are shown in Table 1. Several species—Lipomyces of cellulase activities (Table 1, Fig. 2b). In line with these doorenjongii, Lipomyces kononenkoae, Lipomyces lipofer, results, a large-scale screen to identify wild cellulolytic Lipomyces starkeyi, Aciculoconidium aculeatum, Ambro- yeasts showed that only 16 of 390 strains grew on cellu- siozyma ambrosiae, Ascoidea rubescens and Blastobotrys lose and just 5 had signifcant enzyme activity levels [45], nivea—did not grow in the pre-cultures and were there- indicating that most yeasts are unable to utilize crystal- fore discarded from further analysis. In general, growth line cellulose [24]. Overall, we can conclude that the on agar plates corresponded well with the increased polysaccharide-degrading ascomycetous yeasts identifed optical density (OD­ 600 > 0.2) observed in liquid cultures, in this study display better growth on hemicellulosic sub- though some species from the CUG-Ser1 clade, particu- strates compared to cellulosic substrates in accordance larly Schefersomyces species, showed better growth in with previous studies. liquid culture than on agar plates with mannan-based, pectin and xyloglucan polysaccharides (Table 1). In Growth and enzymatic activities of xylan‑utilizing yeasts accordance with the CAZyme heatmap (Fig. 2b), species To further characterize the top xylan-utilizing yeast from the Trichomonascaceae clade showed substantial species, we determined their growth profles over time growth on hemicellulosic substrates, particularly xylans, in both wheat AX and birchwood GX (Fig. 3). Te

ab Wheat arabinoxylan Birchwood glucuronoxylan 30 30

25 25

20 20

15 15 GV 10 GV 10

5 5

0 0

-5 0612 18 24 30 36 42 48 -5 04812162024283236404448

Time (h) Time (h) B. adeninivorans B. mokoenaii B. adeninivorans B. mokoenaii B. peoriensis B. proliferans B. peoriensis B. proliferans B. raffinosifermentans B. serpentis B. raffinosifermentans B. serpentis S. europaea Sp. passalidarum S. europaea Sp. passalidarum W. canadensis Su. lignohabitans W. canadensis Su. lignohabitans Sc. lignosus Sc. stipitis Sc. lignosus Sc. stipitis Fig. 3 Growth profles of 12 xylanolytic yeasts in Delft minimal medium containing 10 g/L of either a wheat arabinoxylan or b birchwood glucuronoxylan. GV Green Value (corresponding to growth based on pixel counts, as determined by a GrowthProfler instrument). Growth profles = are shown as averages of triplicates Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 9 of 18

xylanolytic yeasts showed diferent growth profles and the correlation between measured xylanase activity and reached stationary phase between 8–28 h, with some growth characteristics was ambiguous. For example, species showing biphasic growth curves. Notably, B. yeasts such as B. adeninivorans and B. peoriensis with mokoenaii, Sp. europaea, Sc. lignosus and Wickerhamo- intermediate growth in both xylans showed only modest myces canadensis reached the highest optical densities in xylanolytic activities (0.2–0.3 U ­mL−1), whereas Sc. stipi- both xylans, although the yeasts’ growth profles difered tis and Su. lignohabitans showed high xylanase activi- somewhat between the two substrates (Fig. 3a, b). ties (0.4–2.8 U mL­ −1) but only moderate xylan growth Next, the xylanolytic yeasts were characterized in (Figs. 3, 4). Overall, the diverse profles in terms of terms of xylanase activities. Both the secretome and growth, enzymatic activities and xylan substrate prefer- cell-associated enzymatic activities were assayed to gain ences point towards diferent yeast strategies for degra- deeper insight into the xylanolytic strategies used by dation and utilization of xylan. these species. Xylanase activity of the secretome was par- ticularly high in B. mokoenaii for both types of xylans, CAZyme analysis in xylanolytic yeast species with a higher activity on wheat AX compared to birch- To connect the experimentally measured xylanolytic wood GX (3.6 and 2.3 U mL­ −1, respectively) (Fig. 4a). activities with the predicted CAZymes, we identifed Tese values were 7.2-fold higher than those of Sc. ligno- all putative xylanolytic CAZymes for each of the top 12 sus that had the second highest secretome activity values, xylan-growing yeasts (Table 2). Overall, the yeasts, com- and also higher than what has been reported previously ing from three clades, have similar numbers of genes on yeasts that secrete xylanases [37]. Tis indicates that encoding CEs with expected roles in de-acylation of B. mokoenaii possesses a unique xylanolytic strategy polysaccharides, and GH3 enzymes predicted to act among the studied species. B. mokoenaii also had a high as exo-β-glycosidases on oligosaccharides. Te species cell-associated xylanase activity on both wheat AX and from the Trichomonascaceae clade have a more diverse birchwood GX, a feature shared with several other spe- and abundant xylanolytic CAZyme distribution com- cies. Tese included the other top xylan-growing species pared to yeasts from other clades. Te top-performing Sp. europaea, Sc. lignosus, and W. canadensis (Figs. 3 and xylanolytic yeast B. mokoenaii encodes a putative GH11 4b), which all showed good correlation between enzyme xylanase, which is a unique trait within the whole 332 activity and growth. However, for several other yeasts, yeast dataset. We were able to detect the GH11 protein

ab

Secreted Wheat arabinoxylan Cell associated Wheat arabinoxylan Birchwood glucuronoxylan Birchwood glucuronoxylan a o fo m. ff W.canadensis Ph m.

Ph W. canadensis af 1 Sc. stipitis Sc. stipitis -Ser Sc. lignosus Sc. lignosus G-Ser1 UG Sp. passalidarum

CU Sp. passalidarum B. raffinosifermentans B. raffinosifermentans eC B. serpentis e B. serpentis B. proliferans B. proliferans B. adeninivorans B. adeninivorans B. mokoenaii B. mokoenaii B. peoriensis B. peoriensis Trichomonascacea Su. lignohabitans Trichomonascacea Su. lignohabitans Sp. europaea Sp. europaea 01234 01234 U mL-1 U mL-1 Fig. 4 Xylanolytic yeast activities in liquid cultures. Volumetric activities of a secretome xylanases and b yeast cell-associated xylanases in wheat arabinoxylan (grey) and birchwood glucuronoxylan (black) determined at 30 °C after growth on xylan in liquid medium for 72 h. Phafom. Phafomycetaceae clade = Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 10 of 18 Total 33 23 36 27 18 27 22 32 GH115 (28/332) GH115 GH115(3) GH115(2) GH67 (4/332) GH67 GH67 GH67 GH67 GH62 (1/332) GH62 GH51 (39/332) GH51 GH51 GH51 GH51(3) GH51(2) GH51(3) GH51 GH51 GH43_24 GH43_24 GH43 (22/332) GH43_14, GH43_6, GH43_6 GH43_6 GH30_7 GH30_7 GH30 (11/332) GH30_5, GH30_3(3) GH30_7 GH30_5, GH30_3 GH30_3(2) GH30_3 GH30_3 GH11 (1/332) GH11 GH10(5/332) GH10(2) GH10 GH10(2) GH5_9(2), GH5_12(2), GH5_22(4), GH5_49 GH5_7 (2), GH5_9(2), GH5_12, GH5_22(2), GH5_49 GH5_12(2), GH5_22(5), GH5_49 GH5_12(2), GH5_22(2), GH5_49 GH5_12(2), GH5_49 GH5_9(3), GH5_12, GH5_22, GH5_49 GH5_9(2), GH5_12, GH5_31, GH5_49 GH5_12(2), GH5_44, GH5_47 GH5 (324/332) GH5_5(2), GH5, GH5_5, GH5_9(3), GH5_9(2), GH5_9(3), GH5, GH5_5(3), GH5_9(3), GH3 (263/332) GH3(11) GH3(8) GH3(15) GH3(5) GH3(9) GH3(5) GH3(12) GH3(8) CE15 CE4(3), CE15 CE5(6) CE (332/332) CE1, CE4, CE1, CE4 CE1, CE4 CE1(2), CE1, CE4(2), CE1, CE4 CE1(2), CE4 CE1, CE4(2) - - itans mentans vorans Species Sp. europaea Sp. B. mokoenaii B. B. peoriensis B. Su. lignohab Su. B. rafnosifer B. B. serpentis B. B. proliferans B. B. adenini - B. Xylanolytic CAZyme predicted from whole-genome from predicted CAZyme Xylanolytic sequenced xylanolytic yeasts ascaceae 2 Table Clade - Trichomon Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 11 of 18 Total 19 20 14 17 GH115 (28/332) GH115 GH115 GH115 GH67 (4/332) GH62 (1/332) GH51 (39/332) GH43 (22/332) GH30 (11/332) GH11 (1/332) GH10(5/332) GH10(2) GH10 GH5_9(3), GH5_22, GH5_49 GH5_9(3), GH5_12, GH5_22(2), GH5_49 GH5_12, GH5_22 (3), GH5_49 GH5_12, GH5_22, GH5_49 GH5 (324/332) GH5, GH5_5, GH5_9(2), GH5_9(2), Phafomycetace GH3 (263/332) GH3(9) GH3(7) GH3(7) GH3(5) CE (332/332) CE1, CE4(2) CE1, CE4, CE1, CE4, CE1(3), CE4 darum sis Species Spa. passali - Sc. lignosus Sc. stipitis - canaden W. (continued) 2 Table Clade number of each enzyme in (x/332). Copy family is marked is stated by each CAZyme number of species containing Total unique enzymes dataset. families marked in bold are CAZyme the species within 332-yeast to next the enzyme to parenthesis Phaf. hydrolase, GH glycoside esterase, CE carbohydrate CUG-Ser1 Phaf. Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 12 of 18

Fig. 5 Phylogenetic analysis of GH11 and GH10. a Phylogenetic placement of the GH11 xylanase from B. mokoenaii (orange triangle) and b Phylogenetic placement of the GH10 xylanases from Sp. europaea (yellow circles), Su. lignohabitans (green circles), B. peoriensis (red circle), Sc. lignosus (blue squares) and Sc. stipitis (purple square). The molecular phylogenetic analysis was performed using full protein sequences from 259 GH10 and 208 GH11 characterized enzymes using Newick tree model from MUSCLE alignment with 1000 boot strap replicates. The numbers at each branch indicate bootstrap values and tree topology confdence. Trees are drawn to scale with branch lengths measured in numbers of substitutions per site. Scale bars represent 1.0 substitutions per nucleotide position

with a molecular size of 23.19 kDa in the secretome of (B. mokoenaii) xylanases (Fig. 4, Table 2). An interest- B. mokoenaii grown in medium containing wheat AX or ing exception is W. canadensis, which does not appear birchwood GX, using in-gel proteomic MS/MS analy- to encode either GH11, GH10 or GH30 xylanases. How- sis (Additional fle 2: Fig. S1). Te GH11 gene can be ever, it possesses putative GH5_9, GH5_22 and GH5_49 found in the genome position 29298–29948 in the Gen- CAZymes in common with most of the xylanolytic spe- Bank sequence ID: PPJM02000065.1. B. mokoenaii is cies listed in Table 2, suggesting that some of these also unique in that it possesses two gene copies for GH5 CAZymes may be novel xylanases. No xylanase activities enzymes from subfamily 7 (GH5_7; putative endo-β- have yet been confrmed in the mentioned GH5 subfami- 1,4-mannanases) and a GH62 α-l-arabinofuranosidase. lies, and in fact no GH5_49 enzymes have to date been Further, B. mokoenaii encodes a GH30_7 enzyme (puta- biochemically characterized [18]. Although we can- tive exo-β-1,4-xylanase or glucuronoxylanase) in com- not completely rule out that the lack of genes encoding mon with only two other yeasts that also scored high in known xylanases in W. canadensis is due to an incom- our assays: Su. lignohabitans and Sp. europaea. Indeed, plete genome assembly, this species and its putative xyla- all eight species in the Trichomonascaceae clade have nases deserve further characterization outside the scope predicted GH30 enzymes and some species have puta- of this study. tive GH67 α-glucuronidases as well as GH43 and GH51 enzymes predicted to be α-l-arabinofuranosidases), indi- Phylogenetic analysis of GH10 and GH11 xylanases cating abilities to target complex GAX. A similar setup is To investigate the origin of the genes encoding the iden- not found in the CUG-Ser1 and Phafomycetaceae clades. tifed GH10 and GH11 members in the ascomycetous However, the CUG-Ser1 clade species possess a putative yeasts, we determined the phylogenetic relationships GH115 α-glucuronidase, potentially enabling them to of these enzymes with 259 characterized enzymes from hydrolyze glucuronic acid side chains present in birch- GH10 and 208 from GH11, listed in the CAZy data- wood GX. base [18]. Phylogenetic trees displaying all character- Species displaying good xylanolytic activity (2–4 U ized enzymes can be viewed in Additional fle 3: Fig. S2. ­mL−1) almost all possess predicted GH10 (Sp. europaea, Te GH11 xylanase from B. mokoenaii shows the high- Su. lignohabitans, Sc. stipitis and Sc. lignosus) or GH11 est sequence identify (71.95%) to the xlnB xylanase from Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 13 of 18

(See fgure on next page.) Fig. 6 Characterization of non-sequenced xylanolytic yeasts. a Phylogenetic analysis of 19 Blastobotrys, Sugiyamaella and Schefersomyces species as well as Schizosaccharomyces pombe serving as outgroup. The molecular phylogenetic analysis was based on ITS sequences using maximum-likelihood model from ClustalW alignment with 1000 bootstrap replicates. The numbers at each branch indicate bootstrap values and tree topology confdence. The tree is drawn to scale, with branch lengths measured in the number (0.2) of substitutions per site. Growth profles 1 of xylanolytic yeasts grown in Delft medium containing 10 g ­L− of b wheat arabinoxylan and c birchwood glucuronoxylan. Yeasts were grown for 48 h at 30 °C. GV Green Value (corresponding to growth based on pixel counts, as determined by a GrowthProfler instrument). d Secretome and e = cell-associated volumetric xylanase activities on wheat arabinoxylan (grey) and birchwood glucuronoxylan (black) determined at 30 °C after growth in xylan-containing liquid medium for 72 h. Stars (*) symbolizes non-sequenced species

Aspergillus nidulans FGSCA4 with confrmed ability Sc. shehatae have so far largely escaped scientifc atten- to hydrolyze oat-spelt xylan [46], suggesting a similar tion, and genomic information except for ITS and riboso- function of the putative B. mokoenaii enzyme (Fig. 5a). mal RNA sequences is almost completely missing [37, 42, All GH10 copies from Sp. europaea, Su. lignohabitans, 44, 48]. Te yeast growth profles in Delft minimal media B. peoriensis, Sc. lignosus and Sc. stipitis clustered to containing wheat AX and birchwood GX as carbon the same branch of the phylogenetic tree, together with sources and the secretome and cell-associated xylano- characterized xylanases from the flamentous fungi Tal- lytic activities and can be seen in Fig. 6b-e. All six species aromyces leycettanus, Penicillium canescens and Bispora showed secreted or cell-associated xylanolytic activities, sp.MEY-1 (Fig. 5b). Tus, we can conclude that all yeast or both, and all except B. parvus grew on both xylan sub- GH10 and GH11 are of Ascomycota origin and likely strates. Tis species, however, displayed high xylanolytic have xylanolytic activity. Te presence of these genes in activity (1.8–2.8 U mL­ −1) on both xylan types (Fig. 6b- the yeast genomes could be a result of horizontal gene e). Interestingly, Sc. shehatae reached the highest green transfer within the phylum, or that these genes have values (26.7 GV) in wheat AX out of all the xylanolytic been specifcally retained by a small number of (ances- yeasts characterized in this study (Fig. 6b) and Su. smith- tral) yeast species after the split between Pezizomyco- iae and Su. novakii showed high activity on both xylans tina (flamentous fungi) and (yeasts) (in contrast to Su. lignohabitans which seem to pre- in the Ascomycota phylum. In favor for the gene reten- fer birchwood GX) (Figs. 4, 6d, e). Overall, these results tion explanation model, Morel and co-authors have show that CAZyme-rich clades are treasure troves for shown that the genome of the yeast Geotrichum candi- identifying xylanolytic yeast species, and sequencing and dum within the Trichomonascaceae clade contains a few characterization of the new yeasts will most likely lead to hundred genes that are orthologous to predicted genes additional discoveries of CAZymes with potential indus- in flamentous fungi rather than other sequenced Sac- trial value. charomycotina yeasts [47]. Moreover, B. mokoenaii pos- sesses several other unique CAZymes, including ones Discussion from GH62 and GH12 that also show high sequence Complete depolymerization of complex lignocellulosic identity to CAZymes from Aspergillus species (77% to A. polysaccharides requires a repertoire of enzymes that pseudonomiae and 62% to A. favus, respectively), further act together on the diferent chemical bonds [14]. While supporting this model. CAZyme systems from flamentous fungi and bacte- ria have been studied for decades, yeast species have Identifcation of novel xylanolytic species by phylogenetic received considerably less attention. However, since association yeasts are key industrial workhorses, elucidating their Te successful approach of using CAZyme predic- plant cell wall-degrading potential may be of great beneft tion to identify xylan-degrading yeasts in CAZyme- for the development of efcient CBP strains able to both rich clades, prompted us to scout for additional, novel produce the enzymes needed for biomass degradation xylanolytic species through phylogenetic association. and convert the released sugars into valuable products Six non-sequenced yeast species phylogenetically closely [49]. Non-conventional, xylanolytic yeasts can potentially related to the highest scoring xylanolytic species found be developed into future CBP cell factories. Alternatively, in this study were therefore included in another round the strategies these yeasts use may be directly transfer- of characterization; Sugiyamaella novakii (CBS 8402), rable to industrial Saccharomyces species in a manner Sugiyamaella smithiae (CBS 5657), Blastobotrys malay- that is not feasible for systems used by flamentous fungi siensis (CBS 10,336), Blastobotrys illinoisensis (CBS or bacteria. We here present a strategy of high-through- 10,339), Blastobotrys parvus (CBS 6147) and Schefer- put mining of genomes for putative CAZymes followed somyces shehatae (CBS 5813) (Fig. 6a). All species except by growth studies and enzymatic investigation, through Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 14 of 18

a

bc Wheat arabinoxylan Birchwood glucuronoxylan 30 30 25 25 20 20 15 15 GV 10 GV 10 5 5 0 0 -5 -5 04812162024283236404448 04812162024283236404448 Time (h) Time (h) B. parvus B. illinoisensis B. parvus B. illinoisensis Su. novakii Su. smithiae Su. novakii Su. smithiae Sc. shehatae B. malaysiensis Sc. shehatae B. malaysiensis de Secreted Wheat arabinoxylan Cell-associated Wheat arabinoxylan Birchwood glucuronoxylan Birchwood glucuronoxylan Sc. shehatae * Sc. shehatae * B.parvus * B. parvus * B. malaysiensis * B. malaysiensis * B. illinoisensis * B. illinoisensis * Su. smithiae * Su. smithiae * Su.novakii * Su. novakii * 012345 012345 U mL-1 U mL-1 Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 15 of 18

which we identifed several novel yeast species displaying enzymatic activities, but our results clearly emphasize high xylanolytic activities and seemingly diverse strate- the need to confrm bioinformatic predictions with wet gies for hemicellulose utilization. lab characterization.

Correlation between predicted CAZymes, growth Xylanolytic strategies in ascomycetous yeasts clades and enzymatic activities Te xylanolytic yeasts identifed in this study display Identifcation of CAZyme-encoding genes in a genome volumetric xylanase activities that range between 0.9 and is a promising lead in fnding polysaccharide-utiliz- 4.3U ­mL−1 (cell-associated and secreted activities com- ing species, but it does not per say reveal whether the bined), which can be compared to the xylanolytic activi- genes are actually expressed into functional enzymes by ties of 28–30 U mL­ −1 reported for flamentous fungi the organism. Moreover, a series of enzymatic activi- Aspergillus niger, Aspergillus favus and Trichoderma vir- ties is required to degrade complex polysaccharides ide [57, 58] and also to previous reports of Sc. stipitis (1.6 into oligo- and monosaccharides that can be taken up U ­mL−1) and Su. lignohabitans (0.29 U mL­ −1) [48, 59]. and metabolized by the microorganism. To fnd yeasts Strikingly, the diferent yeast species characterized here equipped with all enzymes needed to hydrolyze and seem to have versatile approaches to degrade xylan. Most grow on plant polysaccharides, we initially selected of the species that showed high enzymatic activities also CAZyme-rich yeasts rather than cherry-picked spe- possess CAZymes from the well-studied xylan-associated cies containing single GHs of interest. By doing so, our GH families GH5, GH10, GH11 and GH30. GH10 mem- hope was also to increase the chances of capturing spe- bers were found in species in the Trichomonascaceae and cies equipped with CAZymes from poly-specifc GH CUG-Ser1 clades that preferentially display cell-associ- families or even GH families not yet associated with ated xylanolytic activities. In contrast, B. mokoenaii was the targeted polysaccharides. Overall, this proved to be the only species identifed with a GH11 enzyme, which a highly successful approach, as we identifed multiple seems to be secreted in a soluble form, i.e. not attached polysaccharide-degrading species not captured through to the yeast cell surface. In addition to GH10 and previous bioprospecting campaigns [37, 42, 48]. GH11, species in the Trichomonascaceae clade encoded Among the CAZyme-rich species characterized in enzymes from a range of additional families potentially this study, we identifed several yeast species from the containing xylan-acting enzymes that are completely Trichomonascaceae, CUG-Ser1 and the Phafomyc- lacking in the other clades. For example, enzymes from etaceae clades that were able to grow on xylan sub- subfamily GH30_5 (putative endo-β-1,6-galactanase) strates. However, not all species enriched in CAZymes were present in B. mokoenaii and Sp. europaea, while a associated with xylan degradation grew according to putative exo-β-1,4-xylanase or glucuronoxylanase from the CAZyme-predictions and some surprisingly dis- GH30_7 was present in B. mokoenaii, Su. lignohabitans played relatively high xylanolytic activities but only and Sp. europaea – three of the top scoring xylanolytic modest xylan-growth, suggesting an inability to utilize species found in our study. Yeasts in this clade were the released oligosaccharides. Moreover, the CAZyme also found to contain genes for diferent xylanolytic de- heatmap indicated potential for cellulolytic activ- branching enzymes, e.g., likely GH43, GH51 and GH62 ity in several species, but this was not observed in the α-l-arabinofuranosidases and GH67 α-glucuronidases. growth studies for the soluble cellulose analog CMC Tese could enable de-branching of complex xylans and or crystalline and insoluble Avicel. For efcient deg- reduce steric hindrances thus promoting the access to the radation of cellulose to glucose, endoglucanase, cel- xylan backbone for main-chain cleaving xylanases [60]. lobiohydrolase, lytic polysaccharide monooxygenase Along with the Trichomonascaceae species, we also and β-glucosidase activities are generally regarded as identifed xylanolytic species such as W. canadensis needed [50–53]. Te inability of yeasts to grow on cel- where the expected xylan-depolymerizing CAZyme lulose may be due to the absence of one or several of setup seems to be absent. As mentioned above, the lack the enzyme families known to encode such enzymes, of predicted xylanolytic CAZymes in W. canadensis could such as GH6 (totally absent in the 332 yeasts) and GH7 be due to an incomplete genomic sequence [26]. How- (only present in B. peoriensis) and GH12 (only found in ever, other xylanolytic yeasts such as Sp. passalidarum B. mokoenaii). Tese results are well correlated with the and Candida intermedia [37, 42] for which complete lack of literature describing cellulose-growing yeasts, or almost complete genomes exist, obvious xylano- and only a few studies have reported cellulolytic enzy- lytic enzyme-candidates are also missing (this study matic activities in yeasts [45, 55, 56]. Further analysis is and unpublished results). Tis supports the hypothesis needed to decipher the relationships between bioinfor- that these yeasts possess novel xylanases that should be matic CAZyme prediction and measured growth and explored more thoroughly. Further studies in terms of Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 16 of 18

gene deletion studies and/or heterologous expression Abbreviations AX: Arabinoxylan; CAZyme: Carbohydrate active enzyme; CBM: Carbohydrate- and characterization of individual enzymes are needed binding module; CE: Carbohydrate esterase; CMC: Carboxymethyl cellulose; to conclusively assign diferent enzyme activities of these CBP: Consolidated bioprocessing; DNS: Dinitro salicylate; GalMan: Galacto- putative enzymes. mannan; GH: Glycoside hydrolase; GluMan: Glucomannan; GT: Glycoside transferases; GX: Glucuronoxylan; PL: Polysaccharide lyase; poly-MeGal: Poly- methylgalacturonan; SDS-PAGE: Sodium dodecyl sulphate-polyacrylamide gel Future outlook electrophoresis. We here present the identification of several novel ascomycetous yeast species that can grow on polysac- Supplementary Information charides and particularly on xylans. Future research The online version contains supplementary material available at https://​doi.​ includes careful physiological characterization of org/​10.​1186/​s13068-​021-​01995-x. select species from the Trichomonascaceae, CUG- Ser1 and Phaffomycetaceae clades, to determine Additional fle 1: Table S1. Table S1. CAZyme families grouped by poly- saccharide degradation function for heatmap generation. their precise growth requirements and full substrate Additional fle 2: Fig. S1. SDS-PAGE gel showing proteins secreted by B. ranges, tolerance-levels to industrial stressors and mokoenaii after three days of growth in xylan containing Delft medium. product portfolios. Additionally, xylanases are in high Additional fle 3: Fig. S2. Phylogenetic analysis of GH11 and GH10 industrial demand for production of textiles, pulp and xylanases. paper as well as in modern biotechnology for pro- duction of, for example, functional foods and feeds. Acknowledgements Characterization of the many putative CAZymes The authors would like to thank Dr. Marcel Taillefer for his expertise and analy- identified in yeasts may provide new enzyme features sis on MS/MS proteomics and Amanda Sörensen Ristinmaa for her assistance in the agar growth plate studies. The authors would also like to thank the ARS in terms of fold, stability and specificity with poten- Culture Collection for providing yeast cultures. tial to improve current processes and enable new applications. To assign physiological roles and sub- Authors’ contributions All authors conceived the project; JLR, CG and JL designed the experiments; strate specificities to these enzymes, heterologous JLR performed the experiments; ME performed bioinformatical work. JLR and expression, purification and biochemical characteri- CG interpreted the data and wrote the manuscript. JLR, CG, JL and ME revised zation will be needed. the manuscript. All authors read and approved the fnal manuscript.

Funding Open access funding provided by Chalmers University of Technology. JLR, CG Conclusions and JL would like to acknowledge Carl Trygger Fonden (Grant nr. CTS 18:118) for fnancial support of this research. Yeast biodiversity presents a huge, untapped resource for present and future industrial applications such as Availability of data and materials CBP in terms of desirable microbial phenotypes and All data generated and analyzed in this study are included in the publication article and its additional information fles. novel CAZyme discovery. In this study, we have devel- All code and data used in this study are made freely available for re-use. The oped a bioinformatic pipeline to rapidly process and pre- bioinformatic analysis code is available as a GitHub repository (https://​github.​ dict CAZymes in a large number of genome sequenced com/​Engqv​istLab/​yeast_​xylan​ase) and the main output fles, with informa- tion on predicted CAZyme domains, are downloadable from Zenodo (https://​ ascomycetous yeasts. Te resulting CAZyme predictions zenodo.​org/​depos​it/​45483​36; https://​doi.​org/​10.​5281/​zenodo.​45483​36). combined with growth and enzymatic activities assays enabled identifcation of several novel xylanolytic yeasts. Declarations Moreover, additional non-sequenced species with xylan- degrading capacity were identifed through phylogenetic Ethics approval and consent to participate Not applicable. association. Many species identifed and characterized here show equal or better xylanolytic activities compared Consent for publication to described species in literature such as Schefersomyces All authors consented on the publication of this work. and Sugiyamaella species, highlighting the potential of Competing interests the approach. Collectively, the results presented expand The authors declare that they have no competing interests. our current knowledge on polysaccharide-degrading Author details ascomycetous yeasts and opens up for numerous follow- 1 Department of Biology and Biological Engineering, Chalmers University up studies on yeast physiology and CAZyme characteri- of Technology, 412 96 Gothenburg, Sweden. 2 Wallenberg Wood Science zation. Te knowledge generated through such studies Center, Chalmers University of Technology, 412 96 Gothenburg, Sweden. will be of high importance for the optimization of ligno- Received: 12 April 2021 Accepted: 14 June 2021 cellulosic biomass conversion processes. Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 17 of 18

References 23. Despres J, et al. Xylan degradation by the human gut Bacteroides xylani- 1. Fritsche U. Future transitions for the bioeconomy towards sustainable solvens XB1AT involves two distinct gene clusters that are linked at the development and a climate-neutral. Publications Ofce of the European transcriptional level. BMC Genomics. 2016;17(1):1–14. https://​doi.​org/​10.​ Union, JRC121212. 2020. https://​doi.​org/​10.​2760/​667966 1186/​s12864-​016-​2680-8. 2. Sheldon RA. The: E factor 25 years on: The rise of green chemistry and 24. Biely P, Kremnický L. Yeasts and their enzyme systems degrading cel- sustainability. Green Chem. 2017;19(1):18–43. https://​doi.​org/​10.​1039/​ lulose, hemicelluloses and pectin. Food Technology and Biotechnology. c6gc0​2157c. 1998;36(4):305–12. 3. Zoghlami A, Paës G. Lignocellulosic biomass: understanding recalcitrance 25. Mukherjee V, Radecka D, Aerts G, Verstrepen KJ, Lievens B, Thevelein JM. and predicting hydrolysis. Front Chem. 2019;7:2. https://​doi.​org/​10.​3389/​ Phenotypic landscape of non-conventional yeast species for diferent fchem.​2019.​00874. stress tolerance traits desirable in bioethanol fermentation. Biotechnol 4. Heinze T. Cellulose chemistry and properties: fbers, nanocelluloses and Biofuels. 2017;10(1):1–19. https://​doi.​org/​10.​1186/​s13068-​017-​0899-5. advanced materials. Adv Polym Sci. 2015;271:47. 26. Shen XX, et al. Tempo and mode of genome evolution in the budding 5. Lundqvist J, et al. Isolation and characterization of galactoglucomannan yeast subphylum. Cell. 2018;175(6):1533-1545.e20. https://​doi.​org/​10.​ from spruce (Picea abies). Carbohydr Polym. 2002;48(1):29–39. https://​doi.​ 1016/j.​cell.​2018.​10.​023. org/​10.​1016/​S0144-​8617(01)​00210-7. 27. Zhang H, et al. DbCAN2: a meta server for automated carbohydrate- 6. Park YB, Cosgrove DJ. Xyloglucan and its interactions with other com- active enzyme annotation. Nucleic Acids Res. 2018;46(W1):W95–101. ponents of the growing cell wall. Plant Cell Physiol. 2015;56(2):180–94. https://​doi.​org/​10.​1093/​nar/​gky418. https://​doi.​org/​10.​1093/​pcp/​pcu204. 28. Libkind D, et al. Into the wild: new yeast genomes from natural environ- 7. Wierzbicki MP, Maloney V, Mizrachi E, Myburg AA. Xylan in the middle: ments and new tools for their analysis. FEMS Yeast Res. 2020;20(2):1–15. understanding xylan biosynthesis and its metabolic dependencies https://​doi.​org/​10.​1093/​femsyr/​foaa0​08. toward improving wood fber for industrial processing. Front Plant Sci. 29. Kurtzman CP, Fell JW, Boekhout T. Defnition, classifcation and nomencla- 2019;10(February):1–29. https://​doi.​org/​10.​3389/​fpls.​2019.​00176. ture of the yeasts, vol. 1. New York: Elsevier; 2011. 8. Ebringerová A, Heinze T. Xylan and xylan derivatives - Biopolymers with 30. Li W, Godzik A. Cd-hit: a fast program for clustering and compar- valuable properties, 1: naturally occurring xylans structures, isolation pro- ing large sets of protein or nucleotide sequences. Bioinformatics. cedures and properties. Macromol Rapid Commun. 2000;21(9):542–56. 2006;22(13):1658–9. https://​doi.​org/​10.​1093/​bioin​forma​tics/​btl158. 9. Mnich E, et al. Phenolic cross-links: building and de-constructing the 31. Eddy SR. Accelerated profle HMM searches. PLoS Comput Biol. 2011;7:10. plant cell wall. Nat Prod Rep. 2020. https://​doi.​org/​10.​1039/​c9np0​0028c. https://​doi.​org/​10.​1371/​journ​al.​pcbi.​10021​95. 10. Berglund J, et al. Wood hemicelluloses exert distinct biomechanical con- 32. AlmagroArmenteros JJ, et al. SignalP 5.0 improves signal peptide predic- tributions to cellulose fbrillar networks. Nat Commun. 2020;11(1):1–16. tions using deep neural networks. Nat Biotechnol. 2019;37(4):420–3. https://​doi.​org/​10.​1038/​s41467-​020-​18390-z. https://​doi.​org/​10.​1038/​s41587-​019-​0036-z. 11. De Buck V, Polanska M, Van Impe J. Modeling biowaste biorefneries: a 33. Cock PJA, et al. Biopython: Freely available Python tools for computational review. Front Sust Food Syst. 2020;4:7. https://​doi.​org/​10.​3389/​fsufs.​2020.​ molecular biology and bioinformatics. Bioinformatics. 2009;25(11):1422– 00011. 3. https://​doi.​org/​10.​1093/​bioin​forma​tics/​btp163. 12. Den W, Sharma VK, Lee M, Nadadur G, Varma RS. Lignocellulosic biomass 34. Huerta-Cepas J, Serra F, Bork P. ETE 3: Reconstruction, Analysis, and Visu- transformations via greener oxidative pretreatment processes: access to alization of Phylogenomic Data. Mol Biol Evol. 2016;33(6):1635–8. https://​ energy and value added chemicals. Front Chem. 2018;6:1–23. https://​doi.​ doi.​org/​10.​1093/​molbev/​msw046. org/​10.​3389/​fchem.​2018.​00141. 35. Thiruvathukal EGK, Hunter BJD. M Atplotlib: a 2D G Raphics E Nviron- 13. Arntzen M, Bengtsson O, Várnai A, Delogu F, Mathiesen G, Eijsink VGH. ment. Berlin: Springer; 2007. p. 90–5. Quantitative comparison of the biomass-degrading enzyme repertoires 36. Hendriks ATWM, van Lier JB, de Kreuk MK. Growth media in anaerobic of fve flamentous fungi. Sci Rep. 2020;10(1):1–17. https://​doi.​org/​10.​ fermentative processes: The underestimated potential of thermophilic 1038/​s41598-​020-​75217-z. fermentation and anaerobic digestion. Biotechnol Adv. 2018;36(1):1–13. 14. Chukwuma OB, Rafatullah M, Tajarudin HA, Ismail N. Lignocellulolytic https://​doi.​org/​10.​1016/j.​biote​chadv.​2017.​08.​004. enzymes in biotechnological and industrial processes: a review. Sustain. 37. Lara CA, et al. “Identifcation and characterisation of xylanolytic yeasts 2020;12(18):1–31. https://​doi.​org/​10.​3390/​su121​87282. isolated from decaying wood and sugarcane bagasse in Brazil”, Antonie 15. Chettri D, Verma AK, Verma AK. Innovations in CAZyme gene diver- van Leeuwenhoek. Int J Gen Mol Microbiol. 2014;105(6):1107–19. https://​ sity and its modifcation for biorefnery applications. Biotechnol Rep. doi.​org/​10.​1007/​s10482-​014-​0172-x. 2020;28:8. https://​doi.​org/​10.​1016/j.​btre.​2020.​e00525. 38. Miller GL. Use of Dinitrosalicylic acid reagent for determination of reduc- 16. Minty JJ, Lin XN. Engineering synthetic microbial consortia for consoli- ing sugar. Anal Chem. 1959;31(3):426–8. https://​doi.​org/​10.​1021/​ac601​ dated bioprocessing of lignocellulosic biomass into valuable fuels and 47a030. chemicals. New York: Elsevier; 2015. 39. Ghose TK, Bisaria VS. Measurement of hemicellulase activities part 1: xyla- 17. Claes A, Deparis Q, Foulquié-Moreno MR, Thevelein JM. Simultaneous nases. Pure Appl Chem. 1987;59(12):1739–51. https://​doi.​org/​10.​1351/​ secretion of seven lignocellulolytic enzymes by an industrial second-gen- pac19​87591​21739. eration yeast strain enables efcient ethanol production from multiple 40. Edgar RC. MUSCLE: Multiple sequence alignment with high accuracy and polymeric substrates. Metab Eng. 2020;59:131–41. https://​doi.​org/​10.​ high throughput. Nucleic Acids Res. 2004;32(5):1792–7. https://​doi.​org/​ 1016/j.​ymben.​2020.​02.​004. 10.​1093/​nar/​gkh340. 18. Lombard V, GolacondaRamulu H, Drula E, Coutinho PM, Henrissat B. The 41. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: Molecular evo- carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. lutionary genetics analysis across computing platforms. Mol Biol Evol. 2014;42:490–5. https://​doi.​org/​10.​1093/​nar/​gkt11​78. 2018;35(6):1547–9. https://​doi.​org/​10.​1093/​molbev/​msy096. 19. Davies G, Gilbert H, Henrissat B, Svensson B, Vocadlo D, Williams S. 42. Morais CG, Cadete RM, Uetanabaro APT, Rosa LH, Lachance MA, Rosa CA. Ten years of CAZypedia: a living encyclopedia of carbohydrate-active D-xylose-fermenting and xylanase-producing yeast species from rotting enzymes. Glycobiology. 2018;28(1):3–8. https://​doi.​org/​10.​1093/​glycob/​ wood of two Atlantic Rainforest habitats in Brazil. Fungal Genet Biol. cwx089. 2013;60:19–28. https://​doi.​org/​10.​1016/j.​fgb.​2013.​07.​003. 20. Boraston AB, Bolam DN, Gilbert HJ, Davies GJ. Carbohydrate-binding 43. Suh SO, Houseknecht JL, Gujjari P, Zhou JJ. Schefersomyces parasheha- modules: fne-tuning polysaccharide recognition. Biochem J. tae f.a., sp. nov., Schefersomyces xylosifermentans f.a., sp. nov., Candida 2004;382(3):769–81. https://​doi.​org/​10.​1042/​BJ200​40892. broadrunensis sp. nov. and Candida manassasensis sp. Nov., novel yeasts 21. Sun Y, Cheng J. Hydrolysis of lignocellulosic materials for ethanol produc- associated with wood-ingesting insects, and their ecological and biofuel tion: a review. Bioresour Technol. 2002;83(1):1–11. https://​doi.​org/​10.​ implications. Int J Syst Evol Microbiol. 2013;63:4330–9. https://​doi.​org/​10.​ 1016/​S0960-​8524(01)​00212-7. 1099/​ijs.0.​053009-0. 22. Zhao Z, Liu H, Wang C, Xu JR. Correction to Comparative analysis of fun- 44. Kurtzman CP. Blastobotrys americana sp. nov., Blastobotrys illinoisensis gal genomes reveals diferent plant cell wall degrading capacity in fungi. sp. nov., Blastobotrys malaysiensis sp. nov., Blastobotrys muscicola sp. BMC Genomics. 2014;15:144. https://​doi.​org/​10.​1186/​1471-​2164-​15-6. nov., Blastobotrys peoriensis sp. nov. and Blastobotrys rafnosifermentans Ravn et al. Biotechnol Biofuels (2021) 14:150 Page 18 of 18

sp. nov., novel anamorphic yeast species. Int J Syst Evol Microbiol. 54. Kulon U, Park N. Screening of amylolytic and cellulolytic yeast from 2007;57(5):1154–62. https://​doi.​org/​10.​1099/​ijs.0.​64847-0. Dendrobium spathilingue in Bali botanical garden, Indonesia screening 45. Goldbeck R. Screening and identifcation of cellulase producing of amylolytic and cellulolytic yeast from Dendrobium spathilingue in Bali yeast-like microorganisms from Brazilian biomes. African J Biotechnol. botanical garden, Indonesia. AIP Conf Proc. 2020;2242:050013. 2012;11(53):11595–603. https://​doi.​org/​10.​5897/​ajb12.​422. 55. Shirnalli G, Ashwini M, Gachhi D. Isolation and evaluation of cellulo- 46. Pérez-González JA, De Graaf LH, Visser J, Ramön D. Molecular cloning lytic yeasts for production of ethanol from wheat straw. PLoS ONE. and expression in Saccharomyces cerevisiae of two Aspergillus nidulans 2018;7(12):2215–21. xylanase genes. Appl Environ Microbiol. 1996;62(6):2179–82. https://​doi.​ 56. Kim J, et al. Isolation and characterization of cellulolytic yeast belonging org/​10.​1128/​aem.​62.6.​2179-​2182.​1996. to Moesziomyces sp. from the gut of Grasshopper. Korean J. Microbiol. 47. Morel G, et al. Diferential gene retention as an evolutionary mechanism 2019;55(3):234–41. to generate biodiversity and adaptation in yeasts. Sci Rep. 2015;5:1–18. 57. Oyedeji O, Iluyomade A, Egbewumi I, Odufuwa A. Isolation and screening https://​doi.​org/​10.​1038/​srep1​1571. of xylanolytic fungi from soil of botanical garden: xylanase production 48. Sena LMF, et al. “d-Xylose fermentation, xylitol production and xylanase from Aspergillus favus and Trichoderma viride. J Microbiol Res. 2018;8(1):9– activities by seven new species of Sugiyamaella”, Antonie van Leeuwen- 18. https://​doi.​org/​10.​5923/j.​micro​biolo​gy.​20180​801.​02. hoek. Int J Gen Mol Microbiol. 2017;110(1):53–67. https://​doi.​org/​10.​1007/​ 58. Sridevi A, Sandhya A, Ramanjaneyulu G, Narasimha G, Devi PS. Biocata- s10482-​016-​0775-5. lytic activity of Aspergillus niger xylanase in paper pulp biobleaching. 3 49. Olson DG, McBride JE, Joe Shaw A, Lynd LR. Recent progress in consoli- Biotech. 2016;6(2):1–7. https://​doi.​org/​10.​1007/​s13205-​016-​0480-0. dated bioprocessing. Curr Opin Biotechno. 2012;23(3):396–405. https://​ 59. Lee H, Biely P, Latta RK. Utilization of xylan by yeasts and its conversion to doi.​org/​10.​1016/j.​copbio.​2011.​11.​026. ethanol by Pichia stipitis strains. Appl Environ Microbiol. 1986;52(2):320–4. 50. Keller MB, Sørensen TH, Krogh KBRM, Wogulis M, Borch K, Westh P. https://​doi.​org/​10.​1128/​aem.​52.2.​320-​324.​1986. Biotechnology for biofuels activity of fungal β-glucosidases on cel- 60. Jia L, et al. Synergistic degradation of arabinoxylan by free and immobi- lulose. Biotechnol Biofuels. 2020;2020(7):1–7. https://​doi.​org/​10.​1186/​ lized xylanases and arabinofuranosidase. Biochem Eng J. 2016;114:268– s13068-​020-​01762-4. 75. https://​doi.​org/​10.​1016/j.​bej.​2016.​07.​013. 51. Rahikainen J, Ceccherini S, Molinier M, Gro S, Suurna A. Efect of cellulase family and structure on modifcation of wood fbres at high consistency, vol. 0123456789. Berlin: Springer; 2019. p. 5085–103. Publisher’s Note 52. Østby H, Hansen LD, Horn SJ, Eijsink VGH, Várnai A. Enzymatic processing Springer Nature remains neutral with regard to jurisdictional claims in pub- of lignocellulosic biomass: principles, recent advances and perspectives, lished maps and institutional afliations. vol. 47. Berlin: Springer; 2020. p. 9–10. 53. Horn SJ, Vaaje-Kolstad G, Westereng B, Eijsink VG. Novel enzymes for the degradation of cellulose. Biotechnol Biofuels. Biotechnol Biofuels. 2012;5:45.

Ready to submit your research ? Choose BMC and benefit from:

• fast, convenient online submission • thorough peer review by experienced researchers in your field • rapid publication on acceptance • support for research data, including large and complex data types • gold Open Access which fosters wider collaboration and increased citations • maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress.

Learn more biomedcentral.com/submissions