<<

bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1 Comment on “Seasonal cycling in the gut microbiome of the Hadza hunter-gatherers of Tanzania” 2 3 Stephanie L. Schnorr1, Marco Candela2, Simone Rampelli2, Silvia Turroni2, and Amanda G. Henry3§, 4 Alyssa N. Crittenden4§ 5 6 7 1Konrad Lorenz Institute for Evolution and Cognition Research, Klosterneuburg, Austria; 2Unit of 8 Microbial Ecology of Health (UMEH), Department of Pharmacy and Biotechnology, University of 9 Bologna, Bologna Italy; 3Department of Archaeological Sciences, Faculty of Archaeology, Leiden 10 University, Netherlands; 4Laboratory of , Anthropometry, and Nutrition, Department of 11 Anthropology, University of Nevada, Las Vegas, Las Vegas, Nevada, USA. 12 13 §Corresponding authors. Email: [email protected]; [email protected] 14 15 16 17 Abstract 18 In a recent paper, Smits et al. (Science, 25 August 2017, p. 802) report on seasonal changes in the gut 19 microbiome of Hadza hunter-gatherers. They argue that seasonal volatility of some bacterial taxa 20 corresponds to seasonal dietary changes. We address the authors’ insufficient reporting of relevant data 21 and problematic areas in their assumptions about Hadza diet that yield inconsistencies in their results and 22 interpretations. 23 24 25 Main text 26 27 To date, robust data has been generated on the gut microbiome of more than a dozen small-scale non- 28 industrial populations practicing varying forms of mixed subsistence, including foraging, farming, 29 horticulture, or pastoralism (1). It is critical for research of this nature, which focuses on biological 30 samples collected from traditionally marginalized groups, many of them indigenous populations, to 31 include an anthropological perspective. This not only allows for the contextualization of dietary and 32 behavioral data, but also acts to inform our understanding of any links to more general conclusions about 33 patterns of human health and disease (1). Microbiome biobanking and large scale data sharing are now a 34 reality (2) and microbes from small-scale non-industrial populations are discussed as a valuable

1 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

35 commodity that may be used in the near future for commercial enterprises and health interventions in the 36 post-industrialized west, particularly as therapies for microbiota-driven diseases (3). Therefore, 37 commercial exploitation is a paramount danger, both for vulnerable traditional communities who are the 38 wellspring for microbial biotech inspiration, and for the communities targeted by commercial enterprise 39 with the promise of therapeutic benefits (4). This “unchartered ethical landscape” (5) poses many 40 challenges that require interdisciplinary work between clinical scientists, microbiologists, and 41 anthropologists so that careful attention can be paid to the relevant ethnographic and demographic 42 characteristics of a population under study. As the outcomes of such work often inform strategies of 43 health-cultivation in the cultural west, it is essential that we accurately portray the associative living 44 context of the human populations from which these informative data derive. 45 46 Our understanding of how the gut microbiome varies across populations and in response to ecology, 47 subsistence, and health patterns is increasing at a rapid rate. In recent years, several extensive meta- 48 analyses have found that diet composition is one of the drivers of microbiome variation, yet diet in sum 49 may only explain a small fraction of ecosystem diversity found across and within populations 50 (complicated by non-standardized dietary classifications), with often greater mechanistic influence 51 inferable from certain isolated dietary components (6–8). Furthermore, the few studies that have directly 52 explored seasonal variation or short time-series alterations in diet composition have yielded variable 53 results, yet it is important to note that, in each instance detailed diet composition and lifestyle data were 54 collected concomitantly (9–12). Therefore, the effects of temporal change (e.g. seasonality), both short- 55 and long-term duration, on the gut microbiome are not well known and any attempt to investigate the 56 relationship of seasonality on microbiota variation requires detailed supporting data. 57 58 In a recent study of seasonal changes in the gut microbiome of Hadza hunter-gatherers, Smits et al. 59 (13) contribute to this burgeoning field by undertaking longitudinal fecal sampling, however, we argue 60 that their characterization of Hadza diet, depiction of seasonal effects, and interpretations of the gut 61 microbiome (of both Hadza and urban industrial populations) are untenable due to insufficient data, 62 inaccurate reporting, and inappropriate analyses. Therefore, we present the following concerns regarding 63 the report by Smits et al.: (1) apparent sample stratification within seasons challenges the conclusions that 64 taxonomic and diversity differences match “seasonally associated functions” rather than specific dietary 65 fluctuations; (2) CAZyme annotations are not provided and CAZyme families are qualitatively 66 summarized using undisclosed criteria (see Figure 2 and Table S9 in Smits et al.), resulting in an incorrect 67 and reductive oversimplification of the array of reported and the potential microbiome functions 68 (see Table 1); and critically, (3) inaccurate depictions of the Hadza diet throughout the text and a lack of

2 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

69 data on diet composition during time of collections (for either Hadza or urban reference sample 70 populations) make evaluation of the results potentially inaccurate and interpretations of seasonal impacts 71 on the microbiome unfeasible. 72 73 The Hadza live in a subtropical environment marked by an annual cycle of wet and dry months that 74 determines availability of wild foods. While general trends in food availability are documented in a robust 75 literature going back 60 years, primary production tracks rainfall patterns and often varies annually (14). 76 Smits et al. report higher phylogenetic diversity and observed OTU metrics in dry versus wet season 77 samples and state that, “wet-season microbiotas possess a significantly less diverse CAZyome compared 78 to the dry-season microbiotas,” implying that the more speciose plant-based diet consumed by the Hadza 79 in the wet season (15,16) produces a less diverse taxonomic and CAZyme profile than the meat-rich dry 80 season diet. This finding challenges some prior conclusions about the microbiome-diet relationship that 81 have been made based on observations among post-industrialized populations (11), as well as 82 comparative studies looking at traditional and urban-industrial populations. Thus, the conclusions of 83 Smits et al. contest the hypothesis that more diverse and complex dietary components 84 confer greater gut microbiome diversity (17), which is potentially very interesting and impactful, but 85 requires a sound basis for support. Unfortunately, Smits et al. omit reporting of any observations of 86 dietary intake during sampling dates, do not supply data on rainfall patterns that would corroborate 87 seasonality demarcations during sampling dates, and mischaracterize the prominent seasonal dietary 88 features in their interpretations of CAZyme patterns. It is these concerns that led us to reassess the data 89 reported by Smits and colleagues. The files and R code used to conduct this reanalysis are provided with 90 the supporting documentation. 91 92 (1) Seasonal stratification and diversity 93 94 We downloaded the 16S rRNA data from NCBI SRA and conducted data curation and OTU table 95 creation following the same bioinformatic procedures reported in the author’s methods: QIIME 1.9.1 used 96 for quality filtering, OTU clustering with UCLUST at 97% identity against Greengenes 13.8, count table 97 rarefied to 11,000 observations, and diversity analysis accomplished based on the output of the PyNAST 98 aligner and FastTree also implemented in QIIME 1.9.1 (18–20). Our reanalysis does not replicate the 99 finding that taxonomic diversity is significantly greater in dry season samples based on the Observed 100 Species (p=0.1248) and Phylogenetic Diversity (p=0.6448) metrics using a Wilcoxon rank sum test 101 (Figure 1). Similar results were obtained when we used USEARCH UPARSE denovo OTU clustering 102 (results not shown). Importantly, while broad segregation between wet and dry season samples is

3 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

103 observed based on ordinations of the Bray-Curtis and UniFrac beta-diversity calculations (Figure 2), 104 interspersion between early and late phases of each season (sub-seasons) is readily apparent. Early and 105 late phases within the same season significantly separate from each other along PCo2 of the unweighted 106 UniFrac PCoA (Figure 2; p=8.206e-12 for wet season, and p=1.065e-13 for dry season), yet the 2014 107 late-wet and early-dry samples intersperse, and are not significantly varying in their coordinate values 108 (p=0.147; Wilcoxon). These sub-season stratification features, while undisclosed, are nevertheless visible 109 upon close inspection of Smits et al. Figure 1A right panel and 1B bottom panel, and signify stochastic 110 microbiome rearrangements not predicted by seasonality alone. A further problem lies with the early-dry 111 season samples, which were all collected in July 2014, the month of the year when berries often 112 disproportionately contribute to the diet (15,16). Yet Smits et al. characterize dietary trade-offs between 113 wet and dry seasons as follows: “… berry foraging and honey consumption are more frequent during the 114 wet season, whereas hunting is most successful during the dry”. The authors cite Frank Marlowe's 2010 115 book as the source of this information, though in the text of this book Marlowe presents data from a 116 previously published study, which found that berries can be a major source of calories during the early 117 dry and the end of the late dry periods (see Fig. 3 of Marlowe and Berbesque, 2009). The specific timing 118 of berry availability depends on rainfall patterns, which can vary significantly from year to year. In order 119 to make accurate statements about the timing of seasons in the East African tropical climate and 120 consequent changes to the diet, it is therefore imperative to record rainfall data during the period of study 121 (14,21). Still, the consistent and cumulative assessments of the Hadza diet across the seasons 122 demonstrates that berry foraging is most prevalent in the early dry season and the end of the late dry 123 season and beginning of the early wet season (15). In this regard, the depiction of Hadza diet seasonality 124 given by Smits et al. is inaccurate with great consequence, as it leads to misunderstandings in further 125 interpretations of their results and a much broader problem for interpreting the drivers in patterns of 126 microbiome taxonomic and functional fluctuations both for Hadza and for the western population samples 127 of the HMP dataset. We must underscore that, despite their incorrect characterization of potential seasonal 128 differences in diet, we nevertheless failed to replicate their reported microbiome patterns for diversity 129 differences and fluctuations in the microbiome that correspond to seasonality. 130 131 (2) CAZyme annotations and classifications 132 133 We reanalyzed the Smits et al. shotgun metagenomic data for CAZymes (22) using hidden Markov model 134 (HMM) profiles on reads, following the same search methods and criteria given in Smits et al. methods: 135 coding sequences, or open reading frames (ORFs), were found from reads with FragGeneScan (23); 136 CAZymes from dbCAN database identified using HMMER (24) with e-value<1e-5, and resultant

4 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

137 CAZyme family abundances normalized by the number of ORFs per sample. Descriptive plots of the 138 number of ORFs by number of reads, CAZyme observations and CAZyme hits with sampling estimation 139 curves are presented in Figure 3. Smits et al. report that CAZyme diversity is greater for dry season 140 samples using the Shannon index, which evaluates evenness of abundance across observed “species” (25) 141 and is not necessarily the meaning intuited by “high” or “low” diversity. Shannon diversity is, indeed, 142 greater for dry season samples (p=5.432e-05; Wilcoxon), but the Chao1 measure of richness is equivocal 143 and the “Observed Species” metric is slightly greater for wet season samples (Figure 4). Thus, wet season 144 samples appear to contain greater numbers of different types of CAZyme genes than dry season samples. 145 Conclusions based on differences in overall abundance of CAZyme gene hits depends on whether sub- 146 seasons are lumped or split, whether abundance is normalized by number of ORFs only (Figure 4), or if 147 totals (both rarefied and non-rarefied) are used, indicating potential discrepancies in sampling 148 completeness, bacterial genome size, and gene copy numbers. These details are important because they 149 alter interpretations regarding the size and direction of environmental impacts, and the depth of 150 metagenomic analysis offered by Smits and colleagues does not provide sound conclusions about the 151 adaptiveness of the microbiome functionality, particularly with regards to seasonal variation. 152 153 An ordination based on Bray-Curtis dissimilarity index and a heatplot of the normalized CAZyme 154 abundances offers an objective view of the stratification of samples based on their profiled functional 155 potential (Figure 5). Again, interspersion between wet and dry season samples is apparent, resulting in 156 three dominant clusters both of individual samples and of CAZymes. The CAZyme families appear 157 alternately abundant in dry and wet seasons and 105 CAZyme families significantly differ in abundance 158 between wet and dry season samples (reduced to 80 after FDR-correction, Wilcoxon; see Table 1). While 159 these clusters could potentially illuminate broad functional associations, it is impractical to ascribe 160 CAZyme families to qualitative categories like “plant”, “animal”, “mucin”, and “/fructose” 161 because these enzymes may operate in hundreds if not thousands of different pathways and often on 162 secondarily derived molecules rather than the original source (see Table 1 annotations, and see 163 the online CAZy database for a comprehensive list of annotations at http://www.cazy.org/). Additionally, 164 several family categories provided by Smits et al. seem subjectively assigned; many CAZymes were 165 assigned to more than one qualitative category, and so whether a particular CAZyme reflected “plant” or 166 “animal” for a given sample is unclear. As the reported methods do not describe how this process was 167 conducted, the categorization of CAZyme families by Smits et al. is nonreplicable. Another problem is 168 that the CAZyme categorizations given by Smits et al. are misidentified. For example, CAZyme families 169 including are assigned to the category “plant”, despite chitin being a polysaccharide comprising 170 arthropod exoskeletons and fungi cell walls. Hadza do consume bee larvae from honeycomb, but no

5 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

171 particular fungi are known to be a part of their diet. Smits et al. interpret “sucrose/fructose” CAZyme 172 enrichment for wet season samples as associating with berry consumption, which is problematic given 173 data on the timing of berry foraging that we addressed above. It is further perplexing that the authors 174 ignore the overwhelming contribution of honey to the diet during the wet season, which would be a much 175 more parsimonious explanation of “sucrose/fructose” metabolism. Curiously, pectin enzymes are enriched 176 in dry season samples, which is additionally suggestive of either berry consumption or honey 177 consumption having occurred around the time of sample acquisition (August, September, and October) 178 and signifying alterations in typical rainfall patterns (14,16). 179 180 The authors go on to state that, “In the dry seasons, the Hadza consume more meat, which corresponds 181 with the enrichment of CAZymes related to animal .” If microbiome fluctuations across the 182 year are a result of dietary seasonal cycling, we would expect the following to be true: greater diversity 183 and abundance of plant foods in the wet season corresponds with enrichment of CAZymes related to plant 184 carbohydrate. However, this logical corollary is not upheld by their results, and acknowledging statements 185 are avoided in the body of the text, but implied in the conclusions. We would expect these reciprocal 186 relationships, at the very least, to be tested and reported. That they are not is a surprising omission, given 187 that the explanation for seasonal cycling posited by Smits et al. rests squarely on seasonal differences in 188 diet, and is a core thesis in their analysis: “Systematic seasonal differences in the Hadza microbiota led us 189 to hypothesize that seasonal dietary changes might lead to related changes in the functional capacity of 190 the microbial community”. Thus, the CAZyme analysis strategy employed by Smits et al. is problematic 191 and distorts conclusions about dietary components that may be contributing to actual observed 192 microbiome features. 193 194 Metabolome data reported in Smits et al. Figure S4 are not provided with the manuscript or data files and, 195 therefore, cannot be assessed. However, the KEGG assigned pathways of the metagenomic reads are 196 accessible in a supplementary data file (Table S4) as “relative abundances of genes assigned to KEGG 197 Carbohydrate Metabolism pathways”. We analyzed the KEGG pathway relative abundance data in Table 198 S4 to assess the claim that consistent functional representation is observed across dry season samples, 199 with similar patterns as depicted in the CAZyme analysis. A simple visualization of the KEGG 200 carbohydrate pathway relative abundance data does not yield similar patterns as those depicted in the 201 CAZyme analysis in their Figure 2. In fact, 9 of 15 total KEGG pathways are significantly enriched in wet 202 season samples (Figure 6; FDR < 0.05, Wilcoxon). By plotting the KEGG pathway data provided, we see 203 that dry season samples again intersperse with wet season samples (Figure 7). The relative abundance of 204 carbohydrate pathways are weighted in favor of wet season samples; pathways indicative of plant xylan

6 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

205 and metabolism (pentose-phosphate pathway, galactose metabolism), fermentation 206 (glycolysis and pyruvate metabolism), general carbohydrate metabolism of indigestible and fiber 207 (propanoate and butanoate metabolism), and metabolism of plant phytates or lecithins (inositol 208 metabolism). This information provides a picture of the seasonal differences in functional potential, 209 whereby wet season samples are actually enriched in carbohydrate metabolism of putatively plant-based 210 (relative to total metabolic function), contrary to what is depicted in Smits et al. Figure 2 211 and to statements made in the text about analogous patterns of functional enrichments. 212 213 In general, we find the results and conclusions reported in Smits et al. to be problematic for the reasons 214 outlined above. This extends to the relevant conclusions made at the end of the paper about “seasonally 215 associated functions” and functional association to the “cyclic succession of species”. Neither were 216 seasonal functions unambiguously demonstrated, nor were associations between species fluctuations and 217 gene functions explicitly tested. 218 219 (3) Lack of diet composition data 220 221 Perhaps our greatest concern is that dietary information previously indicated as associated with these data 222 were not reported. We would expect, minimally, to see analysis of the “short diet questionnaires for each 223 subject” reportedly collected and analyzed by the team (26). The inclusion of even snapshot dietary 224 composition data at the time of collection could resolve many of the issues outlined here. Since diet is 225 invoked as the main driver for the reported seasonal differences, nutritional information is indispensable. 226 In nearly all instances where population-specific gut microbiome traits are investigated, particularly in 227 studies on seasonality, research teams have provided extensive quantified data on diet (10,12,27–30). 228 Such data are critical when interpreting seasonal impacts on the gut microbiome from populations 229 consuming a seasonally variable diet (10). Furthermore, it is imperative that reported characterizations of 230 diet composition based on findings in the literature are accurately portrayed, and that the dietary 231 conditions under which the data were collected are reported. Only the introduction provides any 232 discussion of seasonal availability of Hadza foods, but this concise summary does neither justice to the 233 complexity of Hadza diet, nor to the robust canon of literature on Hadza diet (see Blurton Jones 2016 for 234 summary of diet related work to date). While we are not suggesting a comprehensive summary of this 235 body of work was in order, we do have reservations that the abridged discussion of seasonality presented 236 by Smits et al. allows for a full interpretation of the data. 237

7 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

238 The results as presented by Smits et al. posit that taxonomic fluctuations and functional capacities vary by 239 season in a manner contrary to what we might expect given the historical data on Hadza diet and current 240 paradigms about the influence of plant on the gut microbiome, thus their claims should 241 be reinforced by relevant evidence (e.g. dietary and climatic) or include a discussion that addresses 242 discrepancies. Finally, Smits et al. conclude that the loss of seasonal microbiome cycling in western 243 microbiomes may relate to the lack of certain “volatile” microorganisms from the guts of people living in 244 the post-industrialized west. However, the absence of a seasonally sampled correspondent dataset on an 245 urban western population precludes such an inference. Without seasonal environmental or dietary data for 246 the Hadza or for urbanized western controls, it is impossible to interpret what ecological pressures are 247 driving temporal microbiome variation and/or compare microbiome variation in the context of human 248 subsistence. Holding the effects of diet constant, it remains unclear whether or not taxa seen in any 249 population, urban or hunter-gatherer, are, in fact, seasonally volatile.

8 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

250 References 251 252 1. Crittenden AN, Schnorr SL. Current views on hunter-gatherer nutrition and the evolution of the 253 human diet. Am J Phys Anthropol. 2017;162(Yearbook):84–109. 254 2. Chuong KH, Hwang DM, Tullis DE, Waters VJ, Yau YCW, Guttman DS, et al. Navigating social 255 and ethical challenges of biobanking for human microbiome research. BMC Med Ethics. 256 2017;18(1). 257 3. Dominguez-Bello MG, Peterson D, Noya-Alarcon O, Bevilacqua M, Rojas N, Rodriguez R, et al. 258 Ethics of exploring the microbiome of native peoples. Nat Microbiol. 2016;1(16097). 259 4. Lewis CM, Obregon-Tito AJ, Tito RY, Foster MW, Spicer PG. The Human Microbiome Project: 260 lessons from human genomics. Trends Microbiol. 2012;20(1):1–4. 261 5. McGuire AL, Colgrove J, Whitney SN, Diaz CM, Bustillos D, Versalovic J. Ethical, legal, and 262 social considerations in conducting the Human Microbiome Project. Genome Res. 263 2008;18(12):1861–4. 264 6. Falony G, Joossens M, Vieira-Silva S, Wang J, Darzi Y, Faust K, et al. Population-level analysis 265 of gut microbiome variation. Science (80- ). 2016 Apr 29;352(6285):560 LP-564. 266 7. Zhernakova A, Kurilshikov A, Bonder MJ, Tigchelaar EF, Schirmer M, Vatanen T, et al. 267 Population-based analysis reveals markers for gut microbiome composition and 268 diversity. Science (80- ). 2016 Apr 29;352(6285):565 LP-569. 269 8. Holmes AJ, Chew YV, Colakoglu F, Cliff JB, Klaassens E, Read MN, et al. Diet-Microbiome 270 Interactions in Health Are Controlled by Intestinal Nitrogen Source Constraints. Cell Metab. 271 Elsevier; 2018 Mar 27;25(1):140–51. 272 9. Hisada T, Endoh K, Kuriki K. Inter- and intra-individual variations in seasonal and daily stabilities 273 of the human gut microbiota in Japanese. Arch Microbiol. 2015;197(7):919–34. 274 10. Davenport ER, Mizrahi-man O, Michelini K, Barreiro LB, Ober C, Gilad Y. Seasonal Variation in 275 Human Gut Microbiome Composition. PLoS One. 2014;9(3):e90731. 276 11. David LA, Maurice CF, Carmody RN, Gootenberg DB, Button JE, Wolfe BE, et al. Diet rapidly 277 and reproducibly alters the human gut microbiome. Nature. Nature Publishing Group; 2014 Dec 278 11;505:559–63. 279 12. Zhang J, Guo Z, An A, Lim Q, Zheng Y, Koh EY, et al. Mongolians core gut microbiota and its 280 correlation with seasonal dietary changes. Sci Rep. 2014;4(5001):1–11. 281 13. Smits SA, Leach J, Sonnenburg ED, Gonzalez CG, Lichtman JS, Reid G, et al. Seasonal cycling in 282 the gut microbiome of the Hadza hunter-gatherers of Tanzania. Science (80- ). American 283 Association for the Advancement of Science; 2017;357(6353):802–6. 284 14. Blurton Jones NG. Demography and evolutionary ecology of Hadza hunter-gatherers. 1st ed. 285 Cambridge University Press; 2016. 508 p. 286 15. Marlowe FW, Berbesque CJ. Tubers as fallback foods and their impact on Hadza hunter-gatherers. 287 Am J Phys Anthropol. 2009 Dec;140(4):751–8. 288 16. Marlowe FW. The Hadza: Hunter-Gatherers of Tanzania. 1st ed. London: University of California 289 Press; 2010. 336 p. 290 17. Sonnenburg ED, Smits SA, Tikhonov M, Higginbottom SK, Wingree NS, Sonnenburg JL. Diet- 291 induced extinctions in the gut microbiota compound over generations. Nature. 2016;529:212–5. 292 18. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, et al. Greengenes, a 293 Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB. Appl 294 Environ Microbiol. 2006;72(7):5069–72. 295 19. Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles 296 instead of a distance matrix. Mol Biol Evol. 2009;26:1641–50. 297 20. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger J, Bushman FD, Costello EK, et al. QIIME 298 allows analysis of high-throughput community sequencing data. Nat Methods. 2010;7:335–6. 299 21. Murray SS, Schoeninger MJ, Bunn HT, Pickering TR, Marlett JA. Nutritional Composition of 300 Some Wild Plant Foods and Honey Used by Hadza Foragers of Tanzania. J Food Compos Anal.

9 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

301 2001 Feb;14(1):3–13. 302 22. Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B. The Carbohydrate-active 303 enzymes database (CAZy) in 2013. Nucleic Acids Res. 2014;42:D490–5. 304 23. Rho M, Tang H, Ye Y. FragGeneScan: Predicting Genes in Short and Error-prone Reads. Nucleic 305 Acids Res. 2010;38(20):e191. 306 24. Eddy SR. Profile Hidden Markov Models. Bioinformatics. 1998;14:755–63. 307 25. Hughes JB, Hellmann JJ, Ricketts TH, Bohannan BJM. Counting the Uncountable: Statistical 308 Approaches to Estimating Microbial Diversity. Appl Environ Microbiol. 2001;67(10):4399–406. 309 26. De Vrieze J. Gut instinct. Science (80- ). 2014;343(6168):241–3. 310 27. De Filippo C, Cavalieri D, Di Paola M, Ramazzotti M, Poullet JB, Massart S, et al. Impact of diet 311 in shaping gut microbiota revealed by a comparative study in children from Europe and rural 312 Africa. Proc Natl Acad Sci. 2010 Aug 17;107(33):14691–6. 313 28. Martinez I, Stegen JC, Maldonado-Gomez M, Eren AM, Siba PM, Greenhill AR, et al. The Gut 314 Microbiota of Rural Papua New Guineans: Composition, Diversity Patterns, and Ecological 315 Processes. Cell Rep. 2015;11:527–38. 316 29. Obregon-Tito AJ, Tito RY, Metcalf J, Sankaranarayanan K, Clemente JC, Ursell LK, et al. 317 Subsistence strategies in traditional societies distinguish gut microbiomes. Nat Commun. Nature 318 Publishing Group; 2015;6:1–9. 319 30. Gomez A, Petrzelkova KJ, Burns MB, White BA, Leigh SR, Blekhman R, et al. Gut Microbiome 320 of Coexisting BaAka Pygmies and Bantu Reflects Gradients of Traditional Subsistence Patterns. 321 Cell Rep. 2016;14:1–12. 322

10 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

323 Figure and table legends 324 325 Figure 1. Diversity analysis of 16S rRNA gene data 326 Reanalysis of the 16S rRNA gene data results in non-significant differences, contrasting reported 327 diversity analysis in Smits et al 2017. Observed features (or OTUs) by read count indicates that wet- 328 season samples in general had greater numbers of reads generated, but with no discernable pattern in 329 OTU diversity. The Observed Species and Phylogenetic Diversity metrics were calculated on OTU tables 330 rarefied to 11,000 reads, and no significant difference is seen between dry and wet season samples based 331 on Wilcoxon test. Samples and plots are colored by season: red = dry; blue = wet. 332 333 Figure 2. PCoA of 16S rRNA gene data shows stratification by season and sub-season along PCo2 334 of unweighted UniFrac distance ordination 335 Ordinations of 16S rRNA gene data show broad seasonal stratification, but sub-seasonal interspersion. A 336 plot of samples along PCo2 of the unweighted UniFrac distance, similar to Figure 1A and 1B of Smits et 337 al., shows sub-season stratification that disrupts clear patterns of seasonal fluctuation. Instead, distribution 338 along PCo2 indicates that the two main clusters apparent in ordination plots above are driven by 339 clustering of the late-dry season data separate from the other sub-seasons. Therefore, seasonal 340 differentiation appears driven only by one sub-seasonal window, which is a time of year when more game 341 meat is available. The late-dry sub-seasons of concurrent years are not significantly varying in their 342 coordinate distribution along PCo2 (p=0.1006), however, significant differences are seen between the 343 early-wet and late-wet sample distribution (p=8.206e-12), between the early-dry and late-dry sample 344 distributions (p=1.065e-13), and between the 2013 dry season samples compared to all 2014 dry season 345 samples (p=5.663e-11). The late-wet season samples and the early-dry season samples do not 346 significantly vary along PCo2 (p=0.147). Dot plots on the left are colored by season (red = dry; blue = 347 wet), and those on the right as well as the bottom panel are colored by sub-season (orange = 2013 late- 348 dry; purple = 2014 early-wet; blue = 2014 late-wet; red = 2014 early-dry; dark red = 2014 late-dry). 349 350 Figure 3. Shotgun data relationships between reads, ORFs, and CAZyme HMM abundances by 351 sub-season 352 Annotation of ORFs follows a linear pattern based on input read abundance, indicating that ORF 353 prediction follows depth of sequencing, and that CAZyme family HMMs found within each sample scale 354 linearly with the number of ORFs. The number of different observed CAZymes and rarefaction curves 355 indicate a sharp plateau above 200 observations. 356

11 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

357 Figure 4. Diversity metrics for CAZyme HMM abundance table and comparisons of total 358 annotations viewed by season and sub-season 359 Alpha-diversity metrics were calculated on the unrarefied abundance table of CAZyme HMM hits, 360 demonstrating that when other measures of diversity are considered, seasonal differences in diversity are 361 multimodal. Wet season samples appear to have more different types of CAZymes, while dry season 362 samples have a more even pattern of CAZyme abundances. Boxplots of CAZyme abundances normalized 363 per million ORFs clearly show that dry season samples have a larger fraction of the gene-coding regions 364 dedicated to CAZymes, however the late-wet season samples are enriched in total CAZyme HMM counts. 365 This is likely due to read generation being higher for these samples. Gene annotation abundances though 366 must account for genome sizes and gene copy numbers, which vary based on the taxonomic 367 representations in the microbiome, and therefore confound robust interpretations from this simplistic 368 broad-overview analysis strategy. Boxplots colored by season (red = dry; blue = wet), and sub-season 369 (orange = 2014 early-dry; turquoise = 2014 early-wet; red = 2013 late-dry; blue = 2014 late-wet). 370 371 Figure 5. Ordination and heatplot of CAZyme HMM normalized abundance plotted by sub-season 372 shows clustering with interspersion 373 Ordination using the Bray-Curtis dissimilarity index on the normalized CAZyme HMM abundance table 374 shows variation along PCo1 by wet and dry season samples, with no apparent pattern of distribution 375 based on season or sub-season along PCo2. Heatplot produced from the made4 package in R using 376 Ward’s method for clustering on the normalized CAZyme abundance table with samples in columns, 377 colored by season: orange = early-dry; turquoise = early-wet; red = late-dry; blue = late-wet. CAZyme 378 genes are also clustered in the rows showing three major clusters with sub-clustering of the third. Wet and 379 dry season samples are split along two major clusters, with the wet-season samples sub-clustering and 380 interspersed by some early-dry samples. 381 382 Figure 6. Boxplots for abundance of KEGG pathways in carbohydrate metabolism from Smits et al. 383 Table S4 data trends in favor of wet-season carbohydrate metabolism specialization. 384 KEGG pathway relative abundance supplementary table is visualized with boxplots for only the Hadza 385 samples, colored by season (red = dry; blue = wet). Wilcoxon tests between seasons for each pathway 386 resulted in the finding that 9 of the 15 total pathways are significantly enriched for wet-season samples, 387 with significant p-values displayed in bold in the middle of the plots. These pathways largely correspond 388 with activities for degradation or metabolism of complex polysaccharides and pentose sugars. 389

12 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

390 Figure 7. Overview of KEGG pathway abundances for Hadza and HMP samples and ordination 391 plus heatplot of Hadza samples. 392 Broadly, the variation in KEGG pathway relative abundance is greatest between Hadza samples and HMP 393 samples, regardless of season. Hadza samples tend to be inversely enriched in KEGG pathways relative to 394 HMP. Clustering within Hadza samples only shows again separation between wet and dry seasons, but 395 with interspersion. Sub-clustering within the wet season samples indicates trade-off between fiber 396 degradation pathways and starch/simple-sugar metabolism. More KEGG pathways appear enriched in wet 397 season samples than dry season samples. 398 399 Table 1 400 List of the significantly differentially abundance CAZyme families between dry and wet season, 401 determined by Wilcoxon test in R. The median, mean, p-values, and f-values are shown for each CAZyme 402 family. The qualitative category assigned by Smits et al. is provided when available for CAZyme 403 families, along with the actual class type and annotation given from the cazy.org database referencing 404 site.

13 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Observed features by number of reads Observed Species● Phylogenetic Diversity

DRY ● ●

WET ● NS NS ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ●● ●● ●● ●● ● ●●● ● ●●● ● ● ●● ●●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●● ● ●● ● ●● ●● ●● ● ●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ● ● ●●● ● ● ●● ● ●●● ●● ●●●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ●●●●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●● ● ● ● TUs ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● O ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●●● ●●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 600 800 1000 1200 1400 1600 1800 ● ● ● ● ● ● 600 800 1000 1200 1400 1600 1800

● ● ● ● ● ● ● ● 30 40 50 60 70 80 90 100

0 10000 20000 30000 40000 50000 DRY WET DRY WET

number of reads bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Bray−Curtis Bray−Curtis MDS2 MDS2

Weighted UniFrac Weighted UniFrac MDS1 MDS1 MDS2 MDS2

Unweighted UniFrac Unweighted UniFrac MDS1 MDS1 MDS2 MDS2

● ● MDS1 ● ● MDS1● ● ● 0 DRY WET 0 2013−LD 2014−ED 2014−EW 2014−LD 2014−LW

0 0

Sub−seasons along PCo2 2013−LD

2014−EW

2014−LW

2014−ED

2014−LD

−1.5 −1.0 −0.5 0.0 0.5 1.0

Unweighted UniFrac PCo2 bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

number ORFs by reads Observed CAZymes by number of ORFs orfs um diff CAZymes E−DRY14 n E−DRY14 E−WET14 E−WET14

L−DRY13 170 180 190 200 210 L−DRY13

2e+06 4e+06 6e+06 8e+06 1e+07 L−WET14 L−WET14

5.0e+06 1.0e+07 1.5e+07 2e+06 4e+06 6e+06 8e+06 1e+07

reads number of ORFs Total CAZyme HMMs by number of ORFs Rarefaction Species

total CAZyme HMMs E−DRY14 E−WET14 L−DRY13 L−WET14 2e+04 6e+04 1e+05 0 50 100 150 200 250

2e+06 4e+06 6e+06 8e+06 1e+07 0e+00 1e+05 2e+05 3e+05 4e+05 5e+05

number of ORFs Sample Size bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

● Chao1 Observed Species Shannon

● ● ● ● ● ● ● ● p=5.43e−5 ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●

● ● ●

● ● ●

● 180 190 200 210 220 230 5.4 5.5 5.6 5.7 5.8 5.9 6.0

● ● ●

170 180 190 200 210 ●

DRY WET DRY WET● DRY WET

CAZyme● HMM abundance by ORFs CAZyme HMM abundance by ORFs ●

● ● ●

● p=1.91e−5 ● ●

● ● ●

● ●

● ●

● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ●

● ● ● CAZyme HMMs per 1M ORFs CAZyme HMMs per 1M ORFs

● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ●

● ● ● 9000 10000 11000 12000 13000 9000 10000 11000 12000 13000

● E−DRY14● E−WET14 L−DRY13 L−WET14 DRY WET

Total CAZyme HMM abundances Total CAZyme HMM abundances

● ●

● ●

● ●

● ●

● ● ● ●

● ● ● ● ● ● ● ● ● ● ● CAZyme HMMs CAZyme HMMs

● ● ● ● ● ●

● ● ● ● ● ● ● ● ● ● ●

● ● ●●● ●

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● 20000 40000 60000 ● ● ● ●

2e+04 4e+04 6e+04 8e+04● 1e+05 ●

E−DRY14 E−WET14 L−DRY13 L−WET14 DRY WET certified bypeerreview)istheauthor/funder,whohasgrantedbioRxivalicensetodisplaypreprintinperpetuity.Itmadeavailableunder M DS2 bioRxiv preprint

-0.5 0.0 0.5 - ET14 L-W 13 L-DRY ET14 E-W 14 E-DRY doi: https://doi.org/10.1101/284513 -0.5 0.0 a ; CC-BY-NC 4.0Internationallicense this versionpostedApril25,2018. 0.5 The copyrightholderforthispreprint(whichwasnot .

SRR5763459 SRR5763462 SRR5763455 SRR5763446 SRR5763467 SRR5763469 SRR5763461 SRR5763466 SRR5763464 SRR5763470 SRR5763445 SRR5763463 SRR5763460 SRR5763471 SRR5763468 SRR5763479 SRR5763472 SRR5763475 SRR5763452 SRR5763450 SRR5763473 SRR5763476 SRR5763483 SRR5763484 SRR5763454 SRR5763449 SRR5763451 SRR5763453 SRR5763480 SRR5763447 SRR5763448 SRR5763458 SRR5763456 SRR5763477 SRR5763481 SRR5763482 SRR5763465 SRR5763478 SRR5763457 SRR5763474 CBM2.hmm CBM4.hmm GH94.hmm GH53.hmm GH31.hmm GH133.hmm CBM61.hmm GH16.hmm GH25.hmm CE2.hmm GH130.hmm CBM20.hmm GH117.hmm CBM6.hmm GH5.hmm CE1.hmm GH10.hmm GH26.hmm GH98.hmm GH76.hmm PL22.hmm GH2.hmm GH8.hmm GH115.hmm GH67.hmm GH127.hmm GH43.hmm GH51.hmm GH88.hmm PL9.hmm CE12.hmm PL11.hmm GH105.hmm CE8.hmm GH28.hmm GT9.hmm GT83.hmm CE11.hmm GT19.hmm GH97.hmm GT3.hmm GH84.hmm GH35.hmm CE7.hmm GH95.hmm CE6.hmm GT30.hmm GH19.hmm GH128.hmm CBM57.hmm GH121.hmm PL1.hmm GT65.hmm GH135.hmm GH73.hmm GT42.hmm GH123.hmm GH29.hmm GH125.hmm GH27.hmm GH89.hmm GH20.hmm GH92.hmm CBM50.hmm GT51.hmm CBM77.hmm GT41.hmm GH23.hmm GH57.hmm AA3.hmm AA8.hmm GT81.hmm CBM46.hmm AA6.hmm GH120.hmm GH65.hmm GH129.hmm GH1.hmm GH4.hmm CBM34.hmm GH32.hmm GH36.hmm GH13.hmm GH42.hmm GH112.hmm GH77.hmm CE10.hmm CBM48.hmm CE9.hmm GT35.hmm GT5.hmm CBM26.hmm CBM74.hmm CBM21.hmm CBM25.hmm GT40.hmm GT4.hmm GT2.hmm GT32.hmm GT37.hmm GT90.hmm CBM27.hmm GT6.hmm GH3.hmm PL12.hmm GT50.hmm CBM63.hmm GH113.hmm GT71.hmm GH68.hmm GH70.hmm CBM47.hmm GT59.hmm GH37.hmm GT20.hmm GT45.hmm GH101.hmm CBM71.hmm GH104.hmm AA10.hmm GH17.hmm AA1.hmm CBM13.hmm GH79.hmm GH7.hmm GT44.hmm GT24.hmm GT33.hmm GH107.hmm GT58.hmm GT77.hmm CBM42.hmm GT31.hmm CBM41.hmm CBM51.hmm AA7.hmm GT1.hmm GT66.hmm GT84.hmm CE4.hmm GH38.hmm dockerin.hmm SLH.hmm CBM9.hmm CE5.hmm PL2.hmm CBM22.hmm GH44.hmm GH48.hmm GH9.hmm CE3.hmm PL17.hmm CBM76.hmm CBM79.hmm CBM40.hmm CBM66.hmm GH72.hmm GT62.hmm GT47.hmm GH75.hmm GT61.hmm CBM23.hmm CBM78.hmm GT12.hmm GT2_Cellulose_synt CBM56.hmm CBM68.hmm GH114.hmm GH100.hmm GH111.hmm GT53.hmm CBM43.hmm GT85.hmm GT25.hmm GH124.hmm CBM12.hmm CBM5.hmm GT87.hmm CBM16.hmm GT76.hmm GH116.hmm CBM38.hmm GT89.hmm AA2.hmm GH46.hmm GH91.hmm GT57.hmm CBM14.hmm GH56.hmm GH62.hmm GT15.hmm GH55.hmm GH78.hmm GH6.hmm AA5.hmm GH47.hmm GT48.hmm GT64.hmm AA4.hmm CE14.hmm CBM65.hmm GT21.hmm GT46.hmm CE13.hmm GT26.hmm GT75.hmm GT7.hmm CBM11.hmm GH45.hmm GT13.hmm CBM80.hmm GT11.hmm GH81.hmm GT73.hmm GH64.hmm GH30.hmm CE15.hmm PL10.hmm GH52.hmm PL4.hmm GT23.hmm AA12.hmm GH106.hmm CBM67.hmm CBM58.hmm GH50.hmm GH109.hmm GH110.hmm GH33.hmm GT94.hmm GT10.hmm GT82.hmm GH59.hmm GH74.hmm CBM37.hmm CBM3.hmm GH126.hmm PL14.hmm CBM36.hmm GT52.hmm GH18.hmm GH39.hmm GT17.hmm GH24.hmm GH93.hmm cohesin.hmm GT39.hmm CBM30.hmm GT60.hmm CBM35.hmm CBM54.hmm GH11.hmm CBM75.hmm CBM44.hmm GH119.hmm GH71.hmm GH102.hmm GH103.hmm GT56.hmm GT99.hmm GT49.hmm PL7.hmm GH15.hmm PL21.hmm GT27.hmm GT28.hmm GH108.hmm GT22.hmm GH54.hmm GT34.hmm PL6.hmm GH87.hmm GH99.hmm GH14.hmm GT29.hmm GT72.hmm CBM62.hmm PL13.hmm CBM32.hmm GH63.hmm GH66.hmm PL15.hmm GH85.hmm PL8.hmm GT14.hmm GT8.hmm GT92.hmm bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

● Amino.sugar.and.nucleotide.sugar.metabolism Ascorbate.and.aldarate.metabolism● Butanoate.metabolism

● ● ● ● ● ● 0.0403 0.0063 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ●

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●

● ● ● ● ● ● ● ●

● 0.0085 0.0100 0.0115 0.0285 0.0300 0.0315 dry wet 0.0018 0.0024 0.0030 dry wet dry wet C5.Branched.dibasic.acid.metabolism Citrate.cycle..TCA.cycle. Fructose.and.mannose.metabolism

● ● ● ●

● ● ● 0.0036 0.0085 ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.017 0.020 ● 0.0046 0.0052 0.0058 0.008 0.010 0.012

●dry wet dry wet dry wet Galactose.metabolism Glycolysis...Gluconeogenesis Glyoxylate.and.dicarboxylate.metabolism

● ● ●

● ● ● 0.0437 ● ● ● 0.0036 ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● 0.022 0.026 ● 0.012 0.014 ● ● ● ● ●

0.0070 0.0085 ● dry wet dry wet dry wet Inositol.phosphate.metabolism Pentose.and.glucuronate.interconversions Pentose.phosphate.pathway

● ● ● ● ● ● 0.0063 ● 0.0028 ● ● ● ●● ● ● 0.0213 ● ● ●● ● ● ●●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● 0.016 0.018 0.020 0.009 0.011 0.013

0.0020 0.0030 0.0040 ● dry wet dry wet dry wet Propanoate.metabolism Pyruvate.metabolism Starch.and.sucrose.metabolism ●

● ● ● ● ● ●● ● ● ●● ● ● 0.0403 0.0213 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ●● ● ● ● ●

● ●● ● ● 0.0190 0.0205 0.0220 0.017 0.020 0.023 0.008 0.010 0.012 dry wet dry wet dry wet bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not Amino.sugar.and.nucleotide.sugar.metabolismcertified by peer review) is the author/funder, who hasAscorbate.and.aldarate.metabolism granted bioRxiv a license to display the preprint in perpetuity.Butanoate.metabolism It is made available under aCC-BY-NC 4.0 International license. ● ● ● ●

● 0.008 0.009 0.010 0.011 0.012 0.0020 0.0025 0.0030 0.0035 0.026 0.028 0.030 0.032

2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP

C5.Branched.dibasic.acid.metabolism Citrate.cycle..TCA.cycle. Fructose.and.mannose.metabolism

● 0.008 0.010 0.012 0.014 0.016 0.0035 0.0045 0.0055 0.016 0.018 0.020 0.022 0.024 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP

Galactose.metabolism Glycolysis...Gluconeogenesis Glyoxylate.and.dicarboxylate.metabolism

● ● ● 0.022 0.024 0.026 0.028 0.012 0.014 0.016 0.018 0.007 0.009 0.011 0.013

2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP

Inositol.phosphate.metabolism Pentose.and.glucuronate.interconversions Pentose.phosphate.pathway

● ● ●

● 0.010 0.014 0.018 0.016 0.017 0.018 0.019 0.020 0.0020 0.0025 0.0030 0.0035 0.0040 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP

Propanoate.metabolism Pyruvate.metabolism Starch.and.sucrose.metabolism

● ● ●

● ● ● ●

● 0.018 0.020 0.022 0.024 0.018 0.020 0.022 0.024 0.008 0.009 0.010 0.011 0.012 0.013

2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP 2013−DRY 2014−DRY 2014−WET HMP bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

RDA

2013-DRY 2014-DRY 2014-W ET HMP

KEGG pathway by season

−2 0 2

C5.Branched.dibasic.acid.metabolism Ascorbate.and.aldarate.metabolism Pyruvate.metabolism Propanoate.metabolism Glycolysis...Gluconeogenesis Inositol.phosphate.metabolism Pentose.phosphate.pathway Butanoate.metabolism Galactose.metabolism Fructose.and.mannose.metabolism Amino.sugar.and.nucleotide.sugar.metabolism Starch.and.sucrose.metabolism Glyoxylate.and.dicarboxylate.metabolism Pentose.and.glucuronate.interconversions Citrate.cycle..TCA.cycle. TZ_42041 TZ_39371 TZ_61717 TZ_10742 TZ_24827 TZ_90808 TZ_76219 TZ_86771 TZ_39227 TZ_40311 TZ_12739 TZ_21618 TZ_70090 TZ_83893 TZ_83788 TZ_42864 TZ_47979 TZ_21460 TZ_86768 TZ_87532 TZ_14139 TZ_99300 TZ_16581 TZ_60579 TZ_65642 TZ_43638 TZ_41014 TZ_67706 TZ_65172 TZ_25806 TZ_27689 TZ_81781 TZ_73414 TZ_83985 TZ_37609 cazyme_name med_dry med_wet mean_dry mean_wet pval fdr Smits_assigned_category class cazy.org_enzyme_family_annotations AA6 2.84 6.18 3.50 6.07 0.00248 0.01055 NA auxiliary-activities 1,4-benzoquinone reductase (EC. 1.6.5.6) Modules of approx. 120 residues. The -binding function has been demonstrated in one case on NA carbohydrate-binding molecule amorphous cellulose and β-1,4-xylan. Some of these modules also bind β-1,3-glucan, β-1,3-1,4-glucan, and β-1,4- CBM6 9.34 2.94 10.68 2.90 0.00001 0.00016 glucan Modules of approx. 150 residues which always appear as a threefold internal repeat. The only apparent exception to this, II of Actinomadura sp. FC7 (GenBank U08894), is in fact not completely sequenced. These modules were first identified in several plant lectins such as or agglutinin of Ricinus communis which bind galactose residues. The three-dimensional structure of a plant lectin has been determined and displays a pseudo-threefold symmetry in accord with the observed sequence threefold repeat. These modules have since NA carbohydrate-binding molecule been found in a number of other proteins of various functions including and . While in the plant lectins this module binds mannose, binding to xylan has been demonstrated in the Streptomyces lividans xylanase A and arabinofuranosidase B. Binding to GalNAc has been shown for the corresponding module of GalNAc 4. For the other proteins, the binding specificity of these modules has not been established. The pseudo three-fold symmetry of the CBM13 module has now been CBM13 11.60 14.83 12.16 16.00 0.00694 0.02728 confirmed in the 3-D structure of the intact, two-domain, xylanase of Streptomyces olivaceoviridis. The granular starch-binding function has been demonstrated in several cases. Interact strongly with CBM20 6.57 2.14 6.50 2.49 0.00001 0.00016 NA carbohydrate-binding molecule cyclodextrins. Often designated as starch-binding domains (SBD). The inulin-binding function has been demonstrated in the case of the cycloinulo-oligosaccharide carbohydrate-binding molecule fructanotransferase from Paenibacillus macerans (Bacillus macerans) by Lee et al. (2004) FEMS Microbiol Lett CBM38 0.00 0.10 0.21 0.46 0.02273 0.07221 NA 234:105-10. (PMID:15109727). Modules of approx. 100 residues, found at the C-terminus of several GH5 . Cellulose-binding function CBM46 0.37 1.07 0.64 1.48 0.00045 0.00259 NA carbohydrate-binding molecule demonstrated in one case. Modules of approx. 100 residues with -binding function, appended to GH13 modules. Also found in the CBM48 26.64 34.96 28.54 32.73 0.02519 0.07858 NA carbohydrate-binding molecule beta subunit (glycogen-binding) of AMP-activated protein kinases (AMPK)

Modules of approx. 50 residues found attached to various enzymes from families GH18, GH19, GH23, GH24, GH25 and GH73, i.e. enzymes cleaving either chitin or peptidoglycan. Binding to chitopentaose demonstrated in carbohydrate-binding molecule the case of Pteris ryukyuensis A [Ohnuma T et al. (2008) J. Biol. Chem. 283:5178-87 (PMID: 18083709)]. CBM50 modules are also found in a multitude of other enzymes targeting the petidoglycan such as peptidases CBM50 71.25 65.48 72.48 62.79 0.05042 0.14102 NA and amidases. These enzymes are not reported in the list below. Created from reading Schallus et al (2008) Mol Biol Cell. 19:3404-3414 [PMID: 18524852] and finding related CBM57 4.31 0.65 4.66 1.20 0.00198 0.00904 NA carbohydrate-binding molecule domains attached to various glycosidases. Modules of approx. 150 residues found appended to GH16, GH30, GH31, GH43, GH53 and GH66 catalytic NA carbohydrate-binding molecule domains. A β-1,4-galactan binding function has been demonstrated for the CBM60 of Thermotoga maritima CBM61 3.58 1.30 3.74 1.81 0.00248 0.01055 GH53 galactanase [PMID: 20826814] The CBM62 module of Clostridium thermocellum Cthe_2193 protein binds galactose moieties found on CBM62 1.95 0.85 2.29 1.18 0.03937 0.11529 NA carbohydrate-binding molecule xyloglucan, arabinogalactan and galactomannan. Fujimoto et al. [PMID : 23486481] disclosed the L-rhamnose binding activity and 3-D structure of the CBM67 of CBM67 12.57 7.47 12.72 9.50 0.05381 0.14708 NA carbohydrate-binding molecule Streptomyces avermitilis α-L-rhamnosidase (SaRha78A); CBM71 0.00 0.34 0.13 0.38 0.04690 0.13460 NA carbohydrate-binding molecule The two CBM71s of S. pneumoniae BgaA bind and LacNAc Pectin binding modules of approx. 110 residues. The Ruminococcus flavefaciens CBM77 was shown to bind CBM77 3.67 0.85 3.96 0.89 0.00001 0.00015 NA carbohydrate-binding molecule various pectins of low degree of esterification acetyl xylan esterase (EC 3.1.1.72); cinnamoyl esterase (EC 3.1.1.-); feruloyl esterase (EC 3.1.1.73); NA carbohydrate-esterase carboxylesterase (EC 3.1.1.1); S-formylglutathione (EC 3.1.2.12); diacylglycerol O-acyltransferase (EC CE1 96.89 67.69 97.21 66.21 0.00000 0.00015 2.3.1.20); 6-O-mycolyltransferase (EC 2.3.1.122) CE2 26.75 15.79 26.70 17.45 0.00034 0.00209 NA carbohydrate-esterase acetyl xylan esterase (EC 3.1.1.72) NA carbohydrate-esterase acetyl xylan esterase (EC 3.1.1.72); chitin deacetylase (EC 3.5.1.41); chitooligosaccharide deacetylase (EC 3.5.1.- CE4 101.73 136.22 104.15 134.91 0.00007 0.00071 ); peptidoglycan GlcNAc deacetylase (EC 3.5.1.-); peptidoglycan N-acetylmuramic acid deacetylase (EC 3.5.1.-). CE6 13.70 4.03 13.78 4.28 0.00000 0.00015 NA carbohydrate-esterase acetyl xylan esterase (EC 3.1.1.72) CE7 41.97 19.05 40.24 19.46 0.00000 0.00000 NA carbohydrate-esterase acetyl xylan esterase (EC 3.1.1.72); cephalosporin-C deacetylase (EC 3.1.1.41). CE8 77.26 35.22 85.72 34.75 0.00000 0.00015 NA carbohydrate-esterase pectin methylesterase (EC 3.1.1.11) N-acetylglucosamine 6-phosphate deacetylase (EC 3.5.1.25); N-acetylglucosamine 6-phosphate deacetylase (EC CE9 163.83 214.83 172.80 216.67 0.00014 0.00110 NA carbohydrate-esterase 3.5.1.80) CE11 73.74 28.86 65.50 27.11 0.00011 0.00092 NA carbohydrate-esterase UDP-3-0-acyl N-acetylglucosamine deacetylase (EC 3.5.1.-) pectin acetylesterase (EC 3.1.1.-); rhamnogalacturonan acetylesterase (EC 3.1.1.-); acetyl xylan esterase (EC CE12 81.18 29.35 79.51 32.82 0.00003 0.00039 NA carbohydrate-esterase 3.1.1.72) CE15 10.07 6.60 12.31 6.82 0.02171 0.07000 NA carbohydrate-esterase 4-O-methyl-glucuronoyl methylesterase (EC 3.1.1.-) β-glucosidase (EC 3.2.1.21); β-galactosidase (EC 3.2.1.23); β- (EC 3.2.1.25); β-glucuronidase (EC 3.2.1.31); β-xylosidase (EC 3.2.1.37); β-D-fucosidase (EC 3.2.1.38); phlorizin hydrolase (EC 3.2.1.62); exo-β-1,4- (EC 3.2.1.74); 6-phospho-β-galactosidase (EC 3.2.1.85); 6-phospho-β-glucosidase (EC 3.2.1.86); strictosidine β-glucosidase (EC 3.2.1.105); (EC 3.2.1.108); amygdalin β-glucosidase (EC 3.2.1.117); PLANT/ANIMAL glycoside-hydrolase prunasin β-glucosidase (EC 3.2.1.118); vicianin hydrolase (EC 3.2.1.119); raucaffricine β-glucosidase (EC 3.2.1.125); thioglucosidase (EC 3.2.1.147); β-primeverosidase (EC 3.2.1.149); isoflavonoid 7-O-β-apiosyl-β- glucosidase (EC 3.2.1.161); ABA-specific β-glucosidase (EC 3.2.1.175); DIMBOA β-glucosidase (EC 3.2.1.182); β- GH1 208.42 442.95 269.50 448.84 0.00018 0.00132 glycosidase (EC 3.2.1.-); hydroxyisourate hydrolase (EC 3.-.-.-) β-galactosidase (EC 3.2.1.23) ; β-mannosidase (EC 3.2.1.25); β-glucuronidase (EC 3.2.1.31); α-L- PLANT/ANIMAL/MUCIN glycoside-hydrolase arabinofuranosidase (EC 3.2.1.55); mannosylglycoprotein endo-β-mannosidase (EC 3.2.1.152); exo-β- GH2 790.15 648.59 811.29 664.32 0.00078 0.00420 glucosaminidase (EC 3.2.1.165) maltose-6-phosphate glucosidase (EC 3.2.1.122); α-glucosidase (EC 3.2.1.20); α-galactosidase (EC 3.2.1.22); 6- PLANT/ANIMAL glycoside-hydrolase phospho-β-glucosidase (EC 3.2.1.86); α-glucuronidase (EC 3.2.1.139); α-galacturonase (EC 3.2.1.67); palatinase GH4 42.49 76.59 53.51 78.89 0.00055 0.00310 (EC 3.2.1.-) endo-β-1,4-glucanase / (EC 3.2.1.4); endo-β-1,4-xylanase (EC 3.2.1.8); β-glucosidase (EC 3.2.1.21); β- mannosidase (EC 3.2.1.25); β- (EC 3.2.1.45); glucan β-1,3-glucosidase (EC 3.2.1.58); licheninase (EC 3.2.1.73); exo-β-1,4-glucanase / cellodextrinase (EC 3.2.1.74); glucan endo-1,6-β-glucosidase (EC bioRxiv preprint doi: https://doi.org/10.1101/284513; this version posted April 25, 2018. The copyright holder for this preprint (which was not 3.2.1.75); mannan endo-β-1,4-mannosidase (EC 3.2.1.78); cellulose β-1,4-cellobiosidase (EC 3.2.1.91); steryl β- certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under glycoside-hydrolase glucosidase (EC 3.2.1.104); endoglycoceramidase (EC 3.2.1.123); (EC 3.2.1.132); β-primeverosidase aCC-BY-NC 4.0 International license. (EC 3.2.1.149); xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151); endo-β-1,6-galactanase (EC 3.2.1.164); hesperidin 6-O-α-L-rhamnosyl-β-glucosidase (EC 3.2.1.168); β-1,3-mannanase (EC 3.2.1.-); arabinoxylan-specific GH5 12.73 7.50 12.79 7.41 0.00441 0.01784 PLANT endo-β-1,4-xylanase (EC 3.2.1.-); mannan transglycosylase (EC 2.4.1.-) chitosanase (EC 3.2.1.132); cellulase (EC 3.2.1.4); licheninase (EC 3.2.1.73); endo-1,4-β-xylanase (EC 3.2.1.8); GH8 30.90 9.46 32.12 10.24 0.00002 0.00036 PLANT glycoside-hydrolase reducing-end-xylose releasing exo-oligoxylanase (EC 3.2.1.156) endo-1,4-β-xylanase (EC 3.2.1.8); endo-1,3-β-xylanase (EC 3.2.1.32); tomatinase (EC 3.2.1.-); xylan GH10 64.36 26.19 63.48 25.62 0.00008 0.00074 NA glycoside-hydrolase endotransglycosylase (EC 2.4.2.-)

xyloglucan:xyloglucosyltransferase (EC 2.4.1.207); keratan-sulfate endo-1,4-β-galactosidase (EC 3.2.1.103); endo- 1,3-β-glucanase (EC 3.2.1.39); endo-1,3(4)-β-glucanase (EC 3.2.1.6); licheninase (EC 3.2.1.73); β- (EC glycoside-hydrolase 3.2.1.81); κ-carrageenase (EC 3.2.1.83); xyloglucanase (EC 3.2.1.151); endo-β-1,3-galactanase (EC 3.2.1.181); β- porphyranase (EC 3.2.1.178); (EC 3.2.1.35); endo-β-1,4-galactosidase (EC 3.2.1.-); chitin β-1,6- GH16 22.76 12.13 21.68 14.14 0.00402 0.01648 PLANT glucanosyltransferase (EC 2.4.1.-); endo-β-1,4-galactosidase (EC 3.2.1.-) β- (EC 3.2.1.52); lacto-N-biosidase (EC 3.2.1.140); β-1,6-N-acetylglucosaminidase) (EC 3.2.1.-); β- GH20 170.87 104.25 170.92 93.88 0.00087 0.00437 ANIMAL/MUCIN glycoside-hydrolase 6-SO3-N-acetylglucosaminidase (EC 3.2.1.-) type G (EC 3.2.1.17); peptidoglycan (EC 4.2.2.n1) also known in the literature as peptidoglycan GH23 41.79 27.38 42.96 27.20 0.00000 0.00015 NA glycoside-hydrolase lytic transglycosylase; chitinase (EC 3.2.1.14) GH25 108.70 82.68 108.22 84.60 0.00087 0.00437 NA glycoside-hydrolase lysozyme (EC 3.2.1.17) β-mannanase (EC 3.2.1.78); exo-β-1,4-mannobiohydrolase (EC 3.2.1.100); β-1,3-xylanase (EC 3.2.1.32); GH26 43.66 19.29 42.88 20.91 0.00016 0.00119 PLANT glycoside-hydrolase / endo-β-1,3-1,4-glucanase (EC 3.2.1.73); mannobiose-producing exo-β-mannanase (EC 3.2.1.-) α-galactosidase (EC 3.2.1.22); α-N-acetylgalactosaminidase (EC 3.2.1.49); isomalto- (EC 3.2.1.94); β-L- GH27 36.91 20.22 35.40 20.77 0.00005 0.00060 PLANT/MUCIN glycoside-hydrolase arabinopyranosidase (EC 3.2.1.88); galactan:galactan (EC 2.4.1.-) (EC 3.2.1.15); α-L-rhamnosidase (EC 3.2.1.40); exo-polygalacturonase (EC 3.2.1.67); exo- PLANT glycoside-hydrolase polygalacturonosidase (EC 3.2.1.82); rhamnogalacturonase (EC 3.2.1.171); rhamnogalacturonan α-1,2- GH28 155.32 72.46 144.09 76.67 0.00003 0.00037 galacturonohydrolase (EC 3.2.1.173); endo-xylogalacturonan hydrolase (EC 3.2.1.-) GH29 151.37 130.61 152.24 133.31 0.01864 0.06222 PLANT/ANIMAL/MUCIN glycoside-hydrolase α-L-fucosidase (EC 3.2.1.51); α-1,3/1,4-L-fucosidase (EC 3.2.1.111) endo-β-1,4-xylanase (EC 3.2.1.8); β-glucosidase (3.2.1.21); β-glucuronidase (EC 3.2.1.31); β-xylosidase (EC 3.2.1.37); β-fucosidase (EC 3.2.1.38); glucosylceramidase (EC 3.2.1.45); β-1,6-glucanase (EC 3.2.1.75); NA glycoside-hydrolase glucuronoarabinoxylan endo-β-1,4-xylanase (EC 3.2.1.136); endo-β-1,6-galactanase (EC:3.2.1.164); [reducing GH30 31.23 18.29 29.82 19.19 0.00274 0.01138 end] β-xylosidase (EC 3.2.1.-) α-glucosidase (EC 3.2.1.20); α-galactosidase (EC 3.2.1.22); α-mannosidase (EC 3.2.1.24); α-1,3-glucosidase (EC 3.2.1.84); -isomaltase (EC 3.2.1.48) (EC 3.2.1.10); α-xylosidase (EC 3.2.1.177); α-glucan lyase (EC glycoside-hydrolase 4.2.2.13); isomaltosyltransferase (EC 2.4.1.-); oligosaccharide α-1,4- (EC 2.4.1.161); GH31 365.82 292.76 352.89 300.26 0.00979 0.03649 NA sulfoquinovosidase (EC 3.2.1.-) β-galactosidase (EC 3.2.1.23); exo-β-glucosaminidase (EC 3.2.1.165); exo-β-1,4-galactanase (EC 3.2.1.-); β-1,3- GH35 63.93 30.33 58.27 34.35 0.00024 0.00156 MUCIN glycoside-hydrolase galactosidase (EC 3.2.1.-) α-mannosidase (EC 3.2.1.24); mannosyl-oligosaccharide α-1,2-mannosidase (EC 3.2.1.113); mannosyl- ANIMAL glycoside-hydrolase oligosaccharide α-1,3-1,6-mannosidase (EC 3.2.1.114); α-2-O-mannosylglycerate hydrolase (EC 3.2.1.170); GH38 31.76 47.35 32.55 50.71 0.00044 0.00255 mannosyl-oligosaccharide α-1,3-mannosidase (EC 3.2.1.-) GH42 93.69 159.83 106.10 157.45 0.00134 0.00663 NA glycoside-hydrolase β-galactosidase (EC 3.2.1.23); α-L-arabinopyranosidase (EC 3.2.1.-)

β-xylosidase (EC 3.2.1.37); α-L-arabinofuranosidase (EC 3.2.1.55); arabinanase (EC 3.2.1.99); xylanase (EC PLANT glycoside-hydrolase 3.2.1.8); galactan 1,3-β-galactosidase (EC 3.2.1.145); α-1,2-L-arabinofuranosidase (EC 3.2.1.-); exo-α-1,5-L- GH43 75.69 40.82 76.64 39.43 0.00001 0.00015 arabinofuranosidase (EC 3.2.1.-); [inverting] exo-α-1,5-L-arabinanase (EC 3.2.1.-); β-1,3-xylosidase (EC 3.2.1.-) endoglucanase (EC 3.2.1.4); endo-β-1,4-xylanase (EC 3.2.1.8); β-xylosidase (EC 3.2.1.37); α-L- GH51 185.10 118.71 186.47 122.23 0.00008 0.00074 PLANT glycoside-hydrolase arabinofuranosidase (EC 3.2.1.55) GH52 0.00 0.00 0.13 0.00 0.05110 0.14102 NA glycoside-hydrolase β-xylosidase (EC 3.2.1.37). GH53 106.24 59.03 110.52 64.21 0.00001 0.00021 PLANT glycoside-hydrolase endo-β-1,4-galactanase (EC 3.2.1.89) α- (EC 3.2.1.1); α-galactosidase (EC 3.2.1.22); amylopullulanase (EC 3.2.1.41); (EC GH57 19.56 10.30 20.64 10.98 0.00016 0.00119 NA glycoside-hydrolase 3.2.1.54); branching (EC 2.4.1.18); 4-α-glucanotransferase (EC 2.4.1.25) processing α-glucosidase (EC 3.2.1.106); α-1,3-glucosidase (EC 3.2.1.84); α-glucosidase (EC 3.2.1.20); GH63 10.35 6.55 14.31 8.94 0.02013 0.06640 NA glycoside-hydrolase mannosylglycerate α-mannosidase / mannosylglycerate hydrolase (EC 3.2.1.170) GH64 0.61 0.00 0.96 0.33 0.00721 0.02795 NA glycoside-hydrolase β-1,3-glucanase (EC 3.2.1.39) α,α- (EC 3.2.1.28); maltose (EC 2.4.1.8); trehalose phosphorylase (EC 2.4.1.64); (EC 2.4.1.230); trehalose-6-phosphate phosphorylase (EC 2.4.1.216); nigerose phosphorylase (EC NA glycoside-hydrolase 2.4.1.279); 3-O-α-glucopyranosyl-L-rhamnose phosphorylase (EC 2.4.1.282); 2-O-α-glucopyranosylglycerol: phosphate β-glucosyltransferase (EC 2.4.1.-); α-glucosyl-1,2-β-galactosyl-L-hydroxylysine α-glucosidase (EC GH65 15.69 31.47 19.62 33.82 0.00006 0.00064 3.2.1.107) GH66 18.14 11.27 19.74 13.31 0.05042 0.14102 NA glycoside-hydrolase cycloisomaltooligosaccharide glucanotransferase (EC 2.4.1.248); dextranase (EC 3.2.1.11). GH67 84.78 28.84 83.06 27.42 0.00001 0.00016 PLANT glycoside-hydrolase α-glucuronidase (EC 3.2.1.139); xylan α-1,2-glucuronidase (EC 3.2.1.131) GH68 0.00 0.41 0.21 0.62 0.01770 0.06048 SUCROSE/FRUCTOSE glycoside-hydrolase (EC 2.4.1.10); β-fructofuranosidase (EC 3.2.1.26); (EC 2.4.1.9). lysozyme (EC 3.2.1.17); mannosyl- endo-β-N-acetylglucosaminidase (EC 3.2.1.96); peptidoglycan GH73 32.24 23.80 35.87 25.18 0.00021 0.00143 NA glycoside-hydrolase hydrolase with endo-β-N-acetylglucosaminidase specificity (EC 3.2.1.-) GH77 385.83 452.88 389.93 457.16 0.00165 0.00777 NA glycoside-hydrolase amylomaltase or 4-α-glucanotransferase (EC 2.4.1.25) β-glucuronidase (EC 3.2.1.31); (EC 3.2.1.36); (EC 3.2.1.166); baicalin β- GH79 0.00 0.42 0.30 0.61 0.02140 0.06979 ANIMAL glycoside-hydrolase glucuronidase (EC 3.2.1.167); β-4-O-methyl-glucuronidase (EC 3.2.1.-) N-acetyl β-glucosaminidase (EC 3.2.1.52); hyaluronidase (EC 3.2.1.35); [protein]-3-O-(GlcNAc)-L-Ser/Thr β-N- GH84 22.37 10.43 28.00 12.04 0.00069 0.00383 ANIMAL glycoside-hydrolase acetylglucosaminidase (EC 3.2.1.169) GH85 4.08 1.82 5.39 3.17 0.02728 0.08330 ANIMAL glycoside-hydrolase endo-β-N-acetylglucosaminidase (EC 3.2.1.96) GH89 85.74 43.34 82.84 47.13 0.00203 0.00909 ANIMAL/MUCIN glycoside-hydrolase α-N-acetylglucosaminidase (EC 3.2.1.50) inulin lyase [DFA-I-forming] (EC 4.2.2.17); inulin lyase [DFA-III-forming] (EC 4.2.2.18); difructofuranose 1,2':2,3' GH91 0.00 1.18 0.28 1.43 0.00024 0.00156 SUC-FRUC glycoside-hydrolase dianhydride hydrolase [DFA-IIIase] (EC 3.2.1.-) mannosyl-oligosaccharide α-1,2-mannosidase (EC 3.2.1.113); mannosyl-oligosaccharide α-1,3-mannosidase (EC 3.2.1.-); mannosyl-oligosaccharide α-1,6-mannosidase (EC 3.2.1.-); α-mannosidase (EC 3.2.1.24); α-1,2- ANIMAL glycoside-hydrolase mannosidase (EC 3.2.1.-); α-1,3-mannosidase (EC 3.2.1.-); α-1,4-mannosidase (EC 3.2.1.-); mannosyl-1- GH92 238.18 107.35 229.24 117.53 0.00087 0.00437 phosphodiester α-1,P-mannosidase (EC 3.2.1.-) GH95 337.30 190.80 345.69 196.63 0.00008 0.00074 PLANT/ANIMAL/MUCIN glycoside-hydrolase α-L-fucosidase (EC 3.2.1.51); α-1,2-L-fucosidase (EC 3.2.1.63); α-L-galactosidase (EC 3.2.1.-) GH97 483.84 185.69 420.55 177.57 0.00006 0.00064 NA glycoside-hydrolase glucoamylase (EC 3.2.1.3); α-glucosidase (EC 3.2.1.20); α-galactosidase (EC 3.2.1.22) blood-group endo-β-1,4-galactosidase (EC 3.2.1.102); blood group A- and B-cleaving endo-β-1,4-galactosidase GH98 6.13 2.53 6.80 3.21 0.00900 0.03442 ANIMAL glycoside-hydrolase (EC 3.2.1.-); endo-β-1,4-xylanase (EC 3.2.1.8); endo-β-1,4-xylanase (EC 3.2.1.8) GH101 0.46 1.26 0.61 4.25 0.00250 0.01055 ANIMAL glycoside-hydrolase endo-α-N-acetylgalactosaminidase (EC 3.2.1.97) GH103 3.86 0.72 4.64 2.28 0.01697 0.05940 NA glycoside-hydrolase peptidoglycan lytic transglycosylase (EC 3.2.1.-)

GH105 132.06 80.94 136.80 84.23 0.00034 0.00209 ANIMAL glycoside-hydrolase unsaturated rhamnogalacturonyl hydrolase (EC 3.2.1.172); d-4,5-unsaturated β-glucuronyl hydrolase (EC 3.2.1.-) GH106 125.41 60.84 134.97 68.08 0.00165 0.00777 NA glycoside-hydrolase α-L-rhamnosidase (EC 3.2.1.40) GH108 1.82 0.83 2.51 1.12 0.03210 0.09597 NA glycoside-hydrolase N-acetylmuramidase (EC 3.2.1.17) GH109 42.10 28.74 38.78 33.16 0.03129 0.09452 ANIMAL glycoside-hydrolase α-N-acetylgalactosaminidase (EC 3.2.1.49) lacto-N-biose phosphorylase or galacto-N-biose phosphorylase (EC 2.4.1.211); D-galactosyl-β-1,4-L-rhamnose GH112 148.09 233.64 150.65 232.40 0.00183 0.00848 NA glycoside-hydrolase phosphorylase (EC 2.4.1.247); galacto-N-biose/lacto-N-biose phosphorylase (EC 2.4.1.-) GH115 126.29 40.47 127.23 46.42 0.00003 0.00039 PLANT glycoside-hydrolase xylan α-1,2-glucuronidase (3.2.1.131); α-(4-O-methyl)-glucuronidase (3.2.1.-) α-1,3-L-neoagarooligosaccharide hydrolase (EC 3.2.1.-); α-1,3-L-neoagarobiase / neoagarobiose hydrolase (EC GH117 4.14 0.84 3.96 1.14 0.00001 0.00024 PLANT glycoside-hydrolase 3.2.1.-) GH120 4.34 7.02 5.45 7.71 0.03358 0.09934 NA glycoside-hydrolase β-xylosidase (EC 3.2.1.37) GH121 3.76 0.88 4.42 1.04 0.00043 0.00255 PLANT glycoside-hydrolase β-L-arabinobiosidase (EC 3.2.1.-)

GH123 26.48 14.88 29.35 20.78 0.04417 0.12805 NA glycoside-hydrolase β-N-acetylgalactosaminidase (EC 3.2.1.53); glycosphingolipid β-N-acetylgalactosaminidase (EC 3.2.1.-) GH125 60.09 44.76 62.99 45.76 0.01156 0.04201 NA glycoside-hydrolase exo-α-1,6-mannosidase (EC 3.2.1.-) GH127 178.13 117.12 183.17 122.81 0.00001 0.00015 NA glycoside-hydrolase β-L-arabinofuranosidase (EC 3.2.1.185); 3-C-carboxy-5-deoxy-L-xylose (aceric acid) hydrolase GH128 2.07 0.65 2.52 0.97 0.00504 0.02009 NA glycoside-hydrolase β-1,3-glucanase (EC 3.2.1.39) GH129 10.29 24.55 12.72 26.15 0.00004 0.00044 MUCIN glycoside-hydrolase α-N-acetylgalactosaminidase (EC 3.2.1.49) β-1,4-mannosylglucose phosphorylase (EC 2.4.1.281); β-1,4-mannooligosaccharide phosphorylase (EC NA glycoside-hydrolase 2.4.1.319); β-1,4-mannosyl-N-acetyl-glucosamine phosphorylase (EC 2.4.1.320); β-1,2-mannobiose GH130 69.09 43.10 66.25 46.11 0.00087 0.00437 phosphorylase (EC 2.4.1.-); β-1,2-oligomannan phosphorylase (EC 2.4.1.-); β-1,2-mannosidase (EC 3.2.1.-) GH133 69.53 53.45 69.69 53.46 0.00224 0.00991 NA glycoside-hydrolase amylo-α-1,6-glucosidase (EC 3.2.1.33) cellulose synthase (EC 2.4.1.12); (EC 2.4.1.16); dolichyl-phosphate β-D- (EC 2.4.1.83); dolichyl-phosphate β-glucosyltransferase (EC 2.4.1.117); N-acetylglucosaminyltransferase (EC 2.4.1.-); N-acetylgalactosaminyltransferase (EC 2.4.1.-); (EC 2.4.1.212); chitin oligosaccharide synthase (EC 2.4.1.-); β-1,3-glucan synthase (EC 2.4.1.34); β-1,4-mannan synthase (EC 2.4.1.-); β- mannosylphosphodecaprenol-mannooligosaccharide α-1,6-mannosyltransferase (EC 2.4.1.199); UDP-Galf: NA glycosyl-transferase rhamnopyranosyl-N-acetylglucosaminyl-PP-decaprenol β-1,4/1,5-galactofuranosyltransferase (EC 2.4.1.287); UDP-Galf: galactofuranosyl-galactofuranosyl-rhamnosyl-N-acetylglucosaminyl-PP-decaprenol β-1,5/1,6- galactofuranosyltransferase (EC 2.4.1.288); dTDP-L-Rha: N-acetylglucosaminyl-PP-decaprenol α-1,3-L- GT2_Cellulose_synt 0.00 0.00 0.02 0.11 0.01625 0.05756 rhamnosyltransferase (EC 2.4.1.289) GT3 113.67 47.04 96.92 41.42 0.00021 0.00143 NA glycosyl-transferase (EC 2.4.1.11) UDP-Glc: glycogen glucosyltransferase (EC 2.4.1.11); ADP-Glc: starch glucosyltransferase (EC 2.4.1.21); NDP-Glc: NA glycosyl-transferase starch glucosyltransferase (EC 2.4.1.242); UDP-Glc: α-1,3-glucan synthase (EC 2.4.1.183) UDP-Glc: α-1,4-glucan GT5 273.44 371.59 286.18 367.55 0.00000 0.00015 synthase (EC 2.4.1.-) lipopolysaccharide α-1,3-galactosyltransferase (EC 2.4.1.44); UDP-Glc: (glucosyl)lipopolysaccharide α-1,2- glucosyltransferase (EC 2.4.1.-); lipopolysaccharide glucosyltransferase 1 (EC 2.4.1.58); glycosyl-transferase glucosyltransferase (EC 2.4.1.186); inositol 1-α-galactosyltransferase (galactinol synthase) (EC 2.4.1.123); homogalacturonan α-1,4-galacturonosyltransferase (EC 2.4.1.43); UDP-GlcA: xylan α-glucuronyltransferase (EC GT8 29.25 24.55 32.50 25.44 0.01255 0.04502 NA 2.4.1.-)

GT9 18.57 6.57 18.43 7.35 0.00005 0.00054 NA glycosyl-transferase lipopolysaccharide N-acetylglucosaminyltransferase (EC 2.4.1.56); heptosyltransferase (EC 2.4.-.-)

GT12 0.00 0.00 0.02 0.20 0.02723 0.08330 NA glycosyl-transferase [N-acetylneuraminyl]-galactosylglucosylceramide N-acetylgalactosaminyltransferase (EC 2.4.1.92). GT19 88.85 38.35 80.64 32.45 0.00003 0.00037 NA glycosyl-transferase lipid-A-disaccharide synthase (EC 2.4.1.182) α,α-trehalose-phosphate synthase [UDP-forming] (EC 2.4.1.15); Glucosylglycerol-phosphate synthase (EC NA glycosyl-transferase 2.4.1.213); trehalose-6-P phosphatase (EC 3.1.3.12); [retaining] GDP-valeniol: validamine 7-phosphate bioRxiv preprint doi: https://doi.org/10.1101/284513GT20 0.00; this version posted0.39 April 25, 2018.0.36 The copyright0.92 holder for this0.01032 preprint (which0.03799 was not valeniolyltransferase (EC 2.-.-.-) certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under GT21 0.70aCC-BY-NC 4.00.10 International license0.84 . 0.26 0.01803 0.06086 NA glycosyl-transferase UDP-Glc: ceramide β-glucosyltransferase (EC 2.4.1.80) GT30 32.51 11.14 29.48 12.29 0.00005 0.00054 NA glycosyl-transferase CMP-β-KDO: α-3-deoxy-D-manno-octulosonic-acid (KDO) transferase (EC 2.4.99.-) GT35 558.12 713.42 588.53 717.73 0.00012 0.00102 NA glycosyl-transferase glycogen or (EC 2.4.1.1) UDP-GlcNAc: peptide β-N-acetylglucosaminyltransferase (EC 2.4.1.255); UDP-Glc: peptide N-β- GT41 2.72 1.54 3.92 1.66 0.02290 0.07221 NA glycosyl-transferase glucosyltransferase (EC 2.4.1.-) GT42 0.00 0.00 0.13 0.00 0.05110 0.14102 NA glycosyl-transferase CMP-NeuAc α-2,3- (EC 2.4.99.-) GT51 265.60 238.56 266.26 240.25 0.00030 0.00194 NA glycosyl-transferase murein polymerase (EC 2.4.1.129). NA glycosyl-transferase undecaprenyl phosphate-α-L-Ara4N: 4-amino-4-deoxy-β-L- (EC 2.4.2.43); dodecaprenyl GT83 25.84 10.50 25.46 10.74 0.00000 0.00015 phosphate-β-galacturonic acid: lipopolysaccharide core α-galacturonosyl transferase (EC 2.4.1.-)

GT94 0.41 0.00 0.43 0.16 0.01770 0.06048 NA glycosyl-transferase GDP-Man: GlcA-β-1,2-Man-α-1,3-Glc-β-1,4-Glc-α-1-PP-undecaprenol β-1,4-mannosyltransferase (2.4.1.251) PL1 29.77 7.25 28.99 8.97 0.00001 0.00015 PLANT polysaccharide-lyase pectate lyase (EC 4.2.2.2); exo-pectate lyase (EC 4.2.2.9); pectin lyase (EC 4.2.2.10) hyaluronate lyase (EC 4.2.2.1); chondroitin AC lyase (EC 4.2.2.5); xanthan lyase (EC 4.2.2.12); chondroitin ABC PL8 5.36 2.72 6.76 3.15 0.00149 0.00724 ANIMAL polysaccharide-lyase lyase (EC 4.2.2.20)

PL9 45.56 18.66 44.30 19.44 0.00000 0.00015 PLANT polysaccharide-lyase pectate lyase (EC 4.2.2.2); exopolygalacturonate lyase (EC 4.2.2.9); thiopeptidoglycan lyase (EC 4.2.2.-) PL10 12.57 3.30 15.59 3.68 0.00009 0.00075 NA polysaccharide-lyase pectate lyase (EC 4.2.2.2) PL11 88.53 34.21 89.62 38.24 0.00014 0.00110 PLANT polysaccharide-lyase rhamnogalacturonan endolyase (EC 4.2.2.23); rhamnogalacturonan exolyase (EC 4.2.2.24) PL22 3.17 1.91 3.54 1.81 0.00979 0.03649 PLANT polysaccharide-lyase oligogalacturonate lyase / oligogalacturonide lyase (EC 4.2.2.6)