bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Bacterial and fungal contributions to delignification and lignocellulose degradation in forest soils with metagenomic and quantitative stable isotope probing

Roland C. Wilhelm, Rahul Singh, Lindsay D. Eltis and William W. Mohn*

Department of Microbiology & Immunology, Life Sciences Institute, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada.

*Corresponding Author: William Mohn Current address: Department of Microbiology & Immunology, Life Sciences Institute, 2350 Health Sciences Mall, University of British Columbia, Vancouver, BC V6T 1Z3

Tel: +1 604-822-4285 Fax: +1 604-822-6041 Email: [email protected]

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Abstract

1 Delignification, or lignin-modification, facilitates the decomposition of lignocellulose in woody 2 plant biomass. The extant diversity of lignin-degrading and fungi is underestimated by 3 culture-dependent methods, limiting our understanding of the functional and ecological traits of 4 decomposers populations. Here, we describe the use of stable isotope probing (SIP) coupled with 5 amplicon and shotgun metagenomics to identify and characterize the functional attributes of 6 lignin-, cellulose- and hemicellulose-degrading fungi and bacteria in coniferous forest soils from 7 across North America. We tested the extent to which catabolic genes partitioned among different 8 decomposer taxa; the relative roles of bacteria and fungi, and whether taxa or catabolic genes 9 correlated with variation in lignocellulolytic activity, measured as the total assimilation of 13C- 10 label into DNA and phospholipid fatty acids. We found high overall bacterial degradation of our 11 model lignin substrate, particularly by gram-negative bacteria (Comamonadaceae and 12 ), while fungi were more prominent in cellulose-degradation. Very few taxa 13 incorporated 13C-label from more than one lignocellulosic polymer, suggesting specialization 14 among decomposers. Collectively, members of Caulobacteraceae could degrade all three 15 lignocellulosic polymers, providing new evidence for their importance in lignocellulose 16 degradation. Variation in lignin-degrading activity was better explained by microbial community 17 properties, such as catabolic gene content and community structure, than cellulose-degrading 18 activity. SIP significantly improved shotgun metagenome assembly resulting in the recovery of 19 several high-quality draft metagenome-assembled genomes and over 7,500 contigs containing 20 unique clusters of carbohydrate-active genes. These results improve understanding of which 21 organisms, conditions and corresponding functional genes contribute to lignocellulose 22 decomposition.

23 Keywords: forest soil, decomposition, stable isotope probing, lignin, cellulose, hemicellulose, 24 metagenomics, PLFA, carbon cycling.

25 Running Title: Lignocellulose Degraders in Forest Soils bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Introduction

26 The incomplete decomposition of woody biomass in coniferous forests is an important

27 global carbon sink, with approximately one third of a gigaton of carbon accruing on an annual

28 basis (Myneni et al., 2001). Lignocellulose decomposition is influenced by the structure and

29 function of microbial communities (Strickland et al., 2009; Cleveland et al., 2014) and terrestrial

30 carbon cycling models increasingly parameterize these properties (Allison et al., 2012; Wieder et

31 al., 2015). However, efforts are constrained by our rudimentary knowledge of the composition

32 and ecology of decomposer populations, stemming from limitations of culture-dependent

33 methods and the complexity of soil communities. The best characterized decomposers inhabit

34 forest soil litter, where conditions favour rapid lignocellulose degradation (Melilo et al., 1989;

35 Cortrufo et al., 2015), yet the decay of lignin-rich plant polymers in soil occurs in a continuum

36 governed by conditions and substrate accessibility, resulting in diversified niches for

37 decomposers (von Lützow et al., 2006; Lehmann and Kleber, 2015; Klotzbücher et al., 2015). To

38 resolve the catabolic and ecological traits of decomposers, we must utilize culture-independent

39 methods, like stable isotope probing (SIP), that better reflect in situ conditions.

40 The nature of microbial lignin-degradation is poorly described beyond the canonical

41 breakdown of lignin in woody biomass by specialized, aerobic, litter-inhabiting wood-rot fungi.

42 Yet, these fungi are largely absent in deeper mineral soil where conditions favour decomposition

43 by bacteria (Ekschmitt et al., 2008), who represent the most likely lignin-degraders when oxygen

44 is limited (Benner et al., 1984; DeAngelisa et al., 2011; Hall et al., 2015). Several soil bacteria

45 can degrade model lignin compounds in pure culture, suggesting a role for them in the

46 catabolism of low-molecular weight, partially-degraded forms of lignin (Crawford, 1978; Masai

47 et al., 2007; DeAngelisb et al., 2011; Brown et al., 2012; Taylor et al., 2012; Davis et al., 2013; bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

48 Rashid et al., 2015). Knowledge of bacterial lignin-degradation in environmental contexts is

49 limited, with all studies of soil degraders originating from tropical forests, which describe active

50 populations of predominantly Alpha- and Gammaproteobacteria (DeAngelisa et al., 2011; Pold et

51 al., 2015). The existing evidence for bacterial lignin-degradation demonstrates the need to

52 characterize both bacterial and fungal populations and contrast their roles in different soil

53 environments to better understand the in situ processes that govern the decomposition of

54 lignocellulose.

55 Delignification rapidly increases the rate of lignocellulose decomposition and may have

56 evolved primarily as a means of accessing more readily degradable plant carbohydrates,

57 exemplified by the strategies of white- and brown-rot Agaricomycota (Eastwood et al., 2011;

58 Ding et al., 2012; Zeng et al., 2014). The capability to co-degrade lignin and other

59 lignocellulosic polymers has not been thoroughly explored in bacteria. There is some evidence

60 for the co-metabolism of lignocellulosic polymers by various Streptomyces spp. (Větrovský et

61 al., 2014), while a comparative genomics study revealed that cellulose-degrading bacteria also

62 possess higher numbers of hemicellulases (Medie et al., 2012). The extent to which catabolic

63 traits are conserved within specific taxa versus communities will influence models of microbial

64 decomposition. In the simplest case, the abundance of highly adapted, multi-substrate degrading

65 taxa may predict rates of decomposition, which is supported by a small number of studies

66 (Strickland et al., 2009; Wilhelma et al., 2017). However, forces of genomic streamlining in

67 bacteria lead to functional diversification among closely related , particularly in extra-

68 cellular processes that produce common goods (Morris et al., 2012), evident in the species-level

69 conservation of bacterial endoglucanases (Berlemont and Martiny, 2013). A multi-substrate SIP

70 experiment provides the means to identify whether forest soil decomposers can assimilate 13C bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

71 from various lignocellulosic polymers and, with shotgun metagenomic sequencing, determine

72 whether their genomes encode the necessary suite of catabolic genes.

73 To address the above knowledge gaps, we utilized SIP microcosm-based experiments to

74 investigate the composition and degradative potential of hemicellulose-, cellulose- and lignin-

75 degrading populations from organic and mineral layer soils in coniferous forests across North

76 America. The identity of degraders and their genomic content were assessed using amplicon and

77 shotgun sequencing of DNA enriched in 13C from labeled substrates. Lignocellulolytic activity

78 was quantified according to the amount of 13C assimilated into total DNA and phospholipid fatty

79 acids (PLFA). The in situ abundances of lignocellulolytic populations were determined on the

80 basis of previously reported pyrotag libraries from the same field samples (Wilhelm et al., 2017b,

81 Wilhelm et al., 2017c). This is currently the most comprehensive cultivation-independent study

82 of lignocellulolytic populations in forest soils and yields new insights to the taxa responsible and

83 the importance of certain catabolic gene families. bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Material and Methods

Overview of Sites and Sample Collection

84 Soil samples were collected from five forest regions in North American chosen to

85 encompass a range of climates, tree cover and soil conditions. Samples were collected from

86 mature coniferous forest corresponding to ‘reference’ plots in the Long-Term Soil Productivity

87 Study in northern Ontario (BSON and JPON), British Columbia (IDFBC), California (PPCA) and

88 Texas (LPTX). Detailed descriptions of geographical locations, soil properties and sampling

89 methods can be found in Wilhelmb et al. (2017) as well as in Table S1. Samples were collected

90 from three sites in each geographical region, except IDFBC where three plots were samples from

91 a single site. The organic layer (O-horizon) and the top 20 cm of mineral soil (A-horizon +

92 occasionally upper B-horizon) were sampled separately (overview in Figure 1a). Samples were

93 kept at 4°C during transport, sieved through 2-mm mesh and stored at -80°C.

Preparation of SIP Soil Microcosms and 13C-labeled Substrates

94 Microcosms were prepared with a minimum of three replicates by adding 13C-labeled

95 hemicellulose, cellulose or lignin to soil from as many of the five regions as possible contingent

96 on the availability of substrate (Figure 1a). Microcosms were comprised of either 1 g (organic) or

97 2 g (mineral) dry wt soil in 30-mL serum vials adjusted to a moisture content of 60% (mineral)

98 and 125% (organic) (w/v). Microcosms were wetted and pre-incubated in the dark at 20° C for

99 one-week prior to the addition of 1% (w/w) 13C-labeled cellulose or 0.8% (w/w) lignin. The

100 hemicellulose-amended incubations were performed as part of a previous study examining the

101 impacts of timber harvesting on decomposition using identical methods as described here for

102 cellulose (Leung et al., 2016). Each microcosm with 13C-labeled substrate was paired with an

103 identical ‘12C-control’ microcosm amended with the corresponding unlabeled substrate (~1.1% bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

104 atom 13C). 12C-control microcosms were used to estimate the natural abundance of 13C in PLFAs

105 and to control for the background presence of GC-rich DNA in higher density CsCl gradient

106 fractions (Youngblut et al., 2014).

107 Bacterial cellulose was produced from Gluconacetobacter xylinus str. KCCM 10100

108 grown with either unlabeled or 13C-labeled glucose (99 atom % 13C, Cambridge Isotope

109 Laboratories, MA, USA) in Yamanaka medium as described in the Supplementary Methods.

110 DHP lignin was synthesized from ring-labeled coniferyl alcohol using horseradish peroxidase as

111 described Kirk and Brunow (1988). Coniferyl alcohol was synthesized from ring-labeled vanillin

112 (75 atom % 13C, Sigma Aldrich, CA) according the reactions in Figure S1, following methods

113 described in the Supplementary Methods. DHP lignin had a weight average molecular weight

-1 114 (Mw) of 2,624 g·mol which equates to ~ 14 polymerized units of coniferyl alcohol. Yields were

115 approximately 12% and 18% (w/w) and were labeled at ~99% and 60% atom % 13C for cellulose

116 and lignin, respectively.

117 Microcosms were incubated at 20° C for 14 days (cellulose) or 60 days (lignin) or 2 days

118 (hemicellulose; Leung et al., 2016). The incubation length for this study was optimized by

119 preliminary time-course experiments described in Wilhelm et al. (2014). To overcome poor 13C-

120 labeling of microbial biomass in organic soils from DHP-lignin, organic soils were mixed 1:3

121 with corresponding double-autoclaved mineral soil to dilute pre-existing organic matter.

122 Following incubation, soil samples were lyophilized and stored at -80° C until processing. All

123 SIP-PLFA, SIP-pyrotag and SIP-metagenomic data were generated from the same microcosm

124 unless otherwise stated. Due to limited quantities of 13C-substrate, SIP-lignin microcosms were

125 prepared with soil from only BSON, PPCA and LPTX, while SIP-hemicellulose data was available

126 for IDFBC and PPCA (see Table S2 for summary of data). To determine the lignolytic activity bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

127 solely by bacteria, eight additional soil microcosms were prepared with 13C-lignin amended

128 mineral soil and incubated with or without weekly additions of two fungicides: 1 mg·g-1 soil

129 cylcoheximide plus 1.2 mg·g-1 soil fungizone (amphotericin B).

SIP-Phospholipid Fatty Acid Analysis

130 PLFAs were extracted from 0.75 g (organic) or 1.0 g (mineral) dry wt soil according to

131 Bligh and Dyer (1959) and 13C-content was analyzed using ion ratio mass spectrometry (UBC

132 Stable Isotope Facility) ported with gas chromatography as detailed in Churchland et al. (2013).

133 Peak identification was based on retention time compared against two reference standards:

134 bacterial acid methyl-ester standard (47080-0; Sigma–Aldrich, St. Louis) and a 37-Component

135 fatty acid methyl-ester mix (47885-U; Sigma–Aldrich, St. Louis). For details on the treatment of

136 unidentifiable, but 13C-labeled, peaks and taxonomic designations consult the Supplementary

137 Methods.

Preparation of Amplicon and Shotgun Metagenome Libraries

138 DNA was extracted from soil (0.5 g) using the FastDNA™ Spin Kit for Soil (MPBio,

139 Santa Ana, CA) following the manufacturer’s protocol. 13C-enriched DNA was separated using

140 cesium chloride density centrifugation per methods outlined by Neufeld et al. (2007) with minor

141 modifications (see Supplementary Methods). DNA from heavy fractions (1.727-1.735 g· mL-1;

142 typically F1-F7) was pooled, desalted and concentrated using Amicon Ultra-0.5 mL filters (EMD

143 Millipore, MA, USA). DNA from 12C-controls was treated identically; however, the total DNA

144 recovered from those heavy fractions of 12C-controls was insufficient for sequencing library

13 145 preparation, resulting in the need to pool additional fractions (F1-F9). The % atom C of soil

146 DNA extracts and DNA from heavy fractions was quantified according to Wilhelm et al. (2014). bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

147 All other DNA quantitation was performed using Pico-Green fluorescent dye (ThermoFisher,

148 MA, USA).

149 PCR amplification was performed on bacterial 16S rRNA gene (V1–V3) using barcoded

150 primers as described in Hartmann et al., (2012). PCR reactions were performed in triplicate and

151 pooled prior to purification. Samples were sequenced using the Roche 454 Titanium platform

152 (GS FLX+) at the McGill University and Genome Québec Innovation Centre, yielding an

153 average of 8,400 bacterial reads per sample with a minimum length of 250-bp following quality

154 processing. All 16S rRNA gene libraries are stored at the European Nucleotide Archive (ENA)

155 under the study accession PRJEB12502. Identically prepared 16S rRNA gene amplicon libraries

156 for SIP-hemicellulose (Leung et al. 2016) and for corresponding unincubated field soil samples

157 (Wilhelmc et al. 2017) were re-analyzed in the present study, following retrieval from the ENA.

158 Whole shotgun metagenomes were prepared from 13C-enriched or 12C-control DNA using

159 either the Nextera DNA Sample Preparation Kit (Illumina Inc., CA, USA) for SIP-cellulose

160 microcosm (40-50 ng template DNA), or the Nextera XT DNA Sample Preparation Kit (Illumina

161 Inc., CA, USA) for SIP-lignin (1 ng), due to lower recovery of 13C-enriched DNA. The two

162 sequencing library preparations produced very similar metagenomic profiles in a direct

163 comparison (Wilhelm, 2016). Four libraries were multiplexed per lane of Illumina HiSeq 2500 (2

164 x 100-bp) and were grouped based on similar average fragment sizes based on fragment profiles

165 from a 2100 Bioanalyzer (Agilent Technologies) All raw and assembled shotgun metagenomic

166 data is available under ENA accession PRJEB12502, with an overview of accessioned data

167 available in Table S3.

Bioinformatic and Statistical Analyses bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

168 16S rRNA gene amplicon libraries for all three SIP-substrates were quality filtered and

169 simultaneously processed using mothur according to the Schloss “454 SOP” (accessed

170 November 2015; Schloss et al. 2009) and clustered into operational taxonomic units (OTUs) at

171 0.01% dissimilarity. Taxonomic classification was performed using the RDP Classifier (Wang et

172 al., 2007) with the GreenGenes database for 16S rRNA genes (database gg_13_8_99; August

173 2013). All OTU counts were normalized to total counts per thousand reads. All related count

174 tables, classifications and sample data is provided as phyloseq objects in the Supplementary

175 Data. Phylogenetic trees were prepared by placing sequences into the pre-aligned SILVA tree

176 (Pruesse et al., 2007; ‘SSURef_NR99_123_SILVA_12_07_15’) using maximum parsimony in

177 ARB (Westram et al., 2011).

178 Shotgun metagenome libraries from SIP-cellulose and SIP-lignin were quality

179 preprocessed with Trimmomatic (Bolger et al., 2014; v. 0.32) and FastX Toolkit (Gordon and

180 Hannon, 2010; v. 0.7). Individual metagenomes were assembled using Ray-meta (Boisvert et al.,

181 2012). Metagenomes were also composited by location and soil layer and assembled with the

182 low-memory assembler MEGAHIT (Li et al., 2015; v. 1.0.2). Contigs from composited

183 metagenomes were binned into metagenome-assembled genomes (MAGs) using MetaBAT

184 (Kang et al., 2014; v. 0.18.6) based on read mapping with Bowtie2 (Langmead and Salzberg,

185 2012). The quality of draft genome bins was assessed using CheckM (Parks et al., 2015) and the

186 lowest common ancestor (LCA) classification of raw, assembled and MAG data was performed

187 with MEGAN (Huson et al., 2007; v. 5.10.1) based on DIAMOND blastx (Buchfink et al., 2015;

188 v. 0.7.9) searches against a local version of the ‘nr’ NCBI database (downloaded October 2014).

189 MAGs were classified to a specific taxon if > 25% of contigs were uniformly classified at the

190 level. ORF prediction was performed by Prodigal (Hyatt et al., 2010; v. 2.6.2) and CAZy bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

191 encoding ORFs were identified based on BLAST searches against a local CAZy database

192 (downloaded August 19th, 2015) and hmmscan searches (Finn et al., 2011; HMMER v.3.1b1)

193 using custom hmms for oxidative, lignolytic gene families (see Supplementary Methods) and

194 hmms provided by dbCAN (Yin et al., 2012). Contigs containing clusters of three or more genes

195 encoding CAZymes (‘CAZy gene clusters’) were recovered using a custom approach described

196 in the Supplementary Methods. CAZyme families attributed xylanase, endoglucanase and

197 ligninase activity can be found in the Supplementary Methods.

198 Statistics were performed using R v. 3.1.0 (R Core Team, 2014) with general dependency

199 on the following packages: reshape2, ggplot2, plyr (Wickham 2007; Wickham 2009; Wickham

200 2011), Hmisc (Harrell and Dupont 2015) and phyloseq (McMurdie and Holmes 2014). Where

201 necessary, P-values were adjusted according to Benjamini & Hochberg (1995). The ‘vegan’ R-

202 package (Oksanen et al., 2015) provided tools to calculate, non-parametric multidimensional

203 scaling (‘metaMDS’) on Bray-Curtis dissimilarities (‘vegdist’). A combination of DESeq

204 (Anders and Huber 2010) and indicator species analysis (De Cáceres and Legendre 2009) were

205 used to compare control and labeled microcosms to identify OTUs and taxonomic groups that

206 incorporated 13C from the labeled substrate. An OTU or taxon had to be selected by at least one

207 of these methods and be, on average, 3-fold more abundant in 13C libraries in at least one

208 geographic region. Random forest classification, implemented in boruta (Kursa and Rudnicki,

209 2010), was used to identify taxa or CAZymes predictive of total 13C incorporation into DNA or

210 PLFA. Correlation between features selected by boruta and total 13C were subsequently

211 performed using ‘rcorr’ from the R package Hmisc. The relative importance of soil layer,

212 geographic region, total carbon, CAZy composition and community structure in linear models

213 for each substrate was assessed using the R package relaimpo (Grömping, 2006) using the bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

214 primary and secondary axes of NMDS as measures of community structure and CAZy

215 composition.

Results 216 We utilized SIP microcosm-based experiments to investigate the taxonomic identity and

217 catabolic potential of lignocellulolytic populations from coniferous forest soils in five regions

218 across North America. Microcosms contained either organic (top 2 - 5 cm) or mineral layer soil

219 (5 – 20 cm) and were amended with either 13C-labeled hemicellulose, cellulose or lignin (Figure

220 1a). Significant enrichment of 13C in soil DNA extracts (Figure 1b) and PLFAs, and the recovery

221 of heavy DNA from higher density gradient fractions (Figure 1c) confirmed the catabolism and

222 assimilation of 13C from substrates. The quantification of total 13C in PLFA and DNA was in

223 close agreement (r = 0.74, p < 0.001). The successful targeting of lignocellulolytic populations

224 was evident in the strong compositional differences in DNA and PLFA profiles from soils

225 amended with 13C-labeled substrates compared to controls. A total of 503 OTUs were enriched

226 in 13C-amplicon libraries, enabling the determination of which taxa had the capacity to degrade

227 multiple lignocellulosic (Figure 2; full list in Table S4). The enrichment of degradative

228 populations was also apparent in shotgun metagenomes, as evidence by the increased proportion

229 of reads comprising the assembly (Figure S2), the assembly quality (N5013C = 4000 vs. N5012C =

230 2100; t = 2.83, p = 0.005), and the recovery of a greater number of MAGs from metagenomes

231 prepared from 13C-enriched DNA compared to controls (Figure 3). The high quality of MAGs

232 enabled the characterization of gene content to test our conclusions about individual capacities

233 for catabolizing lignocellulosic polymers.

234 Identifying Degraders of Multiple Lignocellulosic Polymers bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

235 Members of the family Caulobacteraceae and order Burkholderiales dominated 13C-

236 enriched DNA pools, comprising between 1 - 5% of 13C-metagenomes, and were highly

237 represented in our collection of MAGs (Figure 3). In Burkholderiales, three major families

238 possessed taxa that could degrade one or more substrates: Burkholderiaceae, Comamonadaceae

239 and Oxalobacteraceae (Figure 2; Figure S3). In Caulobacteraceae, members of the genus

240 assimilated 13C from both hemicellulose and cellulose, while members of

241 demonstrated the capacity to assimilate all three substrates (Figure 4a; full

242 breakdown in Figure S4). Notably, several OTUs belonging to unclassified clades of

243 Caulobacteraceae were highly enriched in 13C-DNA pools from lignin (Figure S5) and were

244 significantly more abundant in mineral than organic soils (21.5 vs. 6.0 counts per thousand reads,

245 respectively; t-test; p=0.02). Differences in predisposition for cellulose- versus lignin-catabolism

246 in Asticcacaulis versus Caulobacter, respectively, was evident in the catabolic gene content of

247 metagenome assemblies (Figure 4bc).

248 We anticipated that lignocellulolytic species would commonly degrade more than one of

249 the three lignocellulosic polymers tested, yet most incorporated 13C from only one (Figure 2). No

250 single OTU had increased relative abundance in 13C-DNA pools for all three substrates. Only

251 five OTUs were enriched in 13C-DNA pools for two substrates and were, in all cases, enriched

252 from 13C-lignin and 13C-cellulose. These OTUs were classified as Simplicispira

253 (Burkholderiaceae), Aquincola (Comamonadaceae), unclassified Caulobacteraceae and two

254 Sinobacteraceae (Figure S6). The only genera to possess OTUs that collectively incorporated 13C

255 from all three substrates were Pelomonas and Caulobacter. More taxa possessed OTUs that

256 collectively catabolized a combination of hemicellulose and cellulose (n = 7) than cellulose and

257 lignin (n = 4), or hemicellulose and lignin (n = 3). Genera that utilized both hemicellulose and bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

258 cellulose included Cellvibrio, Janthinobacterium, Cytophaga and Salinibacterium. All fungal

259 taxa enriched in unassembled 13C-metagenomes were enriched from cellulose, with the sole

260 exception of a putative member of Saccharomycetaceae from lignin. SIP tended to enrich

261 relatively rare populations, evidenced by the small proportion of OTUs (41/503) shared between

262 13C-amplicon libraries and unincubated field samples representing in situ relative abundances.

263 Correlating Soil Properties with 13C-Assimilating Populations

264 Variation in the composition of 13C-assimilating bacterial populations was explained

265 most by substrate use (PERMANOVA; R2 = 13%; p < 0.001), followed by the interaction of

266 geographic region and substrate use (10%; p < 0.001), region (8%; p < 0.001) and the interaction

267 of substrate and site (6%; p < 0.001; Table S6). The relative importance of the predictors of 13C-

268 incorporation into DNA or PLFA differed for hemicellulose-, cellulose- and lignin-degrading

269 populations (Figure 5a). However, soil layer, forest region and CAZy content consistently ranked

270 highly followed by taxonomic composition and total soil carbon content. Higher soil organic

271 matter content slowed rates of 13C-assimilation, with total carbon and nitrogen negatively

13 272 correlated with C enrichment of PLFAs in hemicellulose- (ρ = -0.71; padj < 0.01) and lignin-

273 amended mineral layer soils (ρ = -0.85; padj < 0.001; Figure 5c), but not those with cellulose. The

274 highest rates of 13C-assimilation were observed for hemicellulose and cellulose in the organic

275 layer of IDFBC and for both cellulose and lignin in LPTX mineral soil (Figure 6a).

276 Highly active cellulolytic populations in the IDFBC organic layer were uniquely

277 dominated by bacteria in contrast to those from other regions and soil layers (Figure 6b). These

278 active cellulolytic populations were comprised largely of taxa not active elsewhere, namely:

279 MIZ46 (Deltaproteobacteria), Cellvibrio (Gammaproteobacteria), and Planctomyces

280 (vadinHA49) (Figure S7). In contrast, Ascomycota were the major degraders of cellulose in most bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

281 other soils (i.e. BSON, PPCA and LPTX). The active lignolytic populations in BSON soils were also

282 distinct from other forest soils, and were comprised of Cystobacteraceae (Deltaproteobacteria),

283 FAC88 (Elusimicrobia, formerly Termite Group 1), Gaiellaceae and Solirubrobacterales

284 (). Twelve bacterial genera were identified by random forest classification whose

285 relative abundance in 13C amplicon libraries was significantly positively correlated with 13C-

286 enrichment of DNA (Table S7). The strongest correlations occurred for Caulobacter (cellulose; ρ

287 = 0.6; padj < 0.001), unclassified Caulobacteraceae (lignin; Figure 5b), and Janthinobacterium

13 288 (cellulose; ρ = 0.59; padj < 0.001). No fungi were correlated with the C-enrichment of DNA or

289 PLFA.

290 Bacterial versus Fungal Lignocellulolytic Activity

291 In most microcosms, bacteria were predominantly active in catabolizing and assimilating

292 13C from substrates and potentially other breakdown products (Figure 6b; Figure S8). Gram-

293 negative bacteria had the highest δ-13C enrichment of PLFAs in 72% of lignin-amended

294 microcosms, while roughly equal numbers of cellulose- amended microcosms were dominated

295 by gram-negative bacteria or fungi. The total quantity of 13C assimilated into PLFAs (µmol·g-1

296 dry wt soil) did not significantly differ between microcosms with greater bacterial versus fungal

297 assimilation (Wilcoxon test; W = 194, p=0.7). Gram-negative and gram-positive bacterial

298 populations dominated the assimilation of hemicellulose in a comparable number of microcosms,

299 41% and 39%, respectively, though 13C-enrichment was 2-fold higher when gram-negative

300 bacteria dominated (Wilcoxon test; W=1261, p < 0.001).

301 Fungicide-treatment greatly reduced assimilation of 13C from lignin into fungal PLFAs,

302 but also into gram-positive bacteria and, to a lesser extent, gram-negative bacteria (Figure 7a).

303 While overall bacterial activity was decreased by fungicide-treatment, the assimilation of 13C by bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

304 gram-negative bacteria persisted at relatively high levels (20-50% of untreated incubations)

305 when zero enrichment of fungal PLFAs had occurred (in BSON and PPCA). The major taxa

306 incorporating 13C-lignin in fungicide-treated incubations were distinct from those in untreated

307 ones and were predominantly gram-negative bacteria: Burkholderiaceae and

308 Sphingobacteriaceae (Figure 7b). Despite the reduction in 13C-enrichment of gram-positive

309 PLFAs, the relative abundance of a major lignolytic Actinobacterial family (Nocardiaceae) and

310 Actinobacteria overall (not shown) were increased in 13C-metagenomes after fungicide

311 treatment. The impact of fungicide-treatment on the relative abundance of Caulobacteraceae

312 differed by region, with only minor decreases observed in PPCA and LPTX.

313 CAZy Gene Content in Cellulose- and Lignin-degraders

314 Genes encoding lignin-modifying Auxiliary Activity (AA) enzymes were approximately

315 three-fold more abundant in 13C-lignin versus 13C-cellulose metagenomes. Yet, many AA

316 families were also enriched in 13C-cellulose versus control metagenomes, which corresponded to

317 an overall increase in reads classified as fungal (Figure 8). In contrast, the enriched AA gene

318 families in 13C-lignin metagenomes (AA3, AA4, and AA6) were largely bacterial in origin.

319 Overall, there was an 18-fold increase in fungal versus bacterial AA reads in 13C-cellulose

320 metagenomes compared to a 4-fold increase in bacterial versus fungal AA reads in 13C-lignin

321 metagenomes (Mann-Whitney, W = 92, p < 0.001).

322 Genes encoding several oxidative enzymes implicated in the depolymerization of lignin

323 were enriched in 13C-lignin metagenomes and were classified to taxa previously identified as

324 assimilating 13C from lignin in amplicon libraries. More specifically, genes encoding laccases

325 (AA1), dye-decolouring peroxidases (DyP) and alcohol oxidases were the most enriched AA

326 gene families. These were predominantly classified to Commamonadaceae, bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

327 and Caulobacteraceae (Figure S9a). Nine MAGs contained genes predicted to encode the entire

328 β-ketoadipate pathway, potentially involved in catabolizing aromatic lignin depolymerization

329 products to central metabolites, and were classified to Burkholderia, Sphingomonas, Caulobacter

330 and Sorangium. Aryl alcohol oxidase genes were the most significantly correlated (ρ > 0.3 and

13 331 padj < 0.01) with total C-assimilation from lignin into PLFA.

332 13C-cellulose metagenomes were highly enriched in glycosyl hydrolase gene families

333 (GHs). Approximately 37% of CAZy gene clusters (groups of 3 or more GHs) from 13C-

334 cellulose metagenomes contained carbohydrate-binding module (CBM) genes and 4% of these

335 also contained an endoglucanase gene. In comparison, only 22% of clusters in 13C-lignin

336 metagenomes contained a CBM gene, and none of these contained endoglucanase genes. Thirty-

337 three percent of endoglucanase gene-containing clusters contained a CBM, and were mostly

338 classified to Asticcacaulis, Sorangium and Chthoniobacter (Figure S9b). Actinomycetales had

339 the greatest diversity of endoglucanase gene families in CAZy gene clusters, while fungi

340 predominantly had endoglucanase genes belonging to GH7 and GH9 families (Figure S9c). Few

341 endoglucanase or GH gene families were correlated with cellulolytic activity (Table S7). Four

342 MAGs contained CAZy gene clusters encoding enzymes putatively catabolizing all three

343 substrates (i.e. xylanases, endoglucanases and ligninases) and were classified as Chaetomium

344 (fungi), Caulobacter, Chthoniobacter and Sphingobium (Table S8).

Discussion

345 We failed to find support for our initial hypothesis that individual species would degrade

346 multiple components of lignocellulose. Instead, we identified a diverse array of taxa assimilating

347 13C from three major lignocellulosic polymers, demonstrating the specialization of decomposer

348 populations. While approximately one third of taxa enriched in 13C-DNA pools possessed bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

349 members that could collectively degrade more than one substrate, and, though a handful of

350 MAGs encoded a putative suite of catabolic enzymes, none of the individual bacterial OTUs (at a

351 99% similarity threshold) were definitively enriched from all three substrates. Even closely

352 related taxa, like Caulobacter and Asticcacaulis, exhibited differences in their levels of

353 enrichment on cellulose and lignin and in CAZy gene content. In contrast, the multi-substrate

354 degradative capacity of fungi was evident in the increased relative abundances of genes encoding

355 both delignifying (peroxidases) and cellulolytic enzymes in 13C-DNA pools. The recovery of a

356 Myceliophthora MAG from 13C-metagenomes was consistent with its known capability for

357 complete lignocellulose decomposition (Li et al., 1999; Babot et al., 2011). Therefore, our

358 findings suggest that complementation among functional guilds may be necessary for

359 decomposition of lignocellulose by bacteria and not necessarily fungi. One caveat is the

360 possibility that our method was biased towards identifying single substrate utilizing bacteria. The

361 activity of multi-substrate users may have been masked by populations that grew faster on

362 individual substrates, particularly the relatively labile hemicellulose, as was observed in other

363 SIP experiments (Pepe-Ranney and Campbell et al., 2016). At the very least, our study

364 demonstrates the decomposition of lignocellulosic polymers can commonly occur via a division

365 of labour among specialized taxa.

366 The substantial assimilation of carbon from our model lignin substrate by bacteria

367 supports our hypothesis that bacteria contribute significantly to degradation of native forms of

368 lignin in situ. Bacterial activity was particularly evident in deeper mineral layers of forest soil,

369 suggesting that lignin decomposition can occur throughout the soil column. Interestingly, the

370 most novel lignin-degraders from mineral soils belong to uncultured clades of Caulobacteraceae,

371 Acidobacteria, Solirubrobacterales, Elusimicrobia, Nevskiales, and Cystobacteraceae. The low bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

372 activity of fungal lignin degraders in our microcosms was unlikely due to unmet nutritional or

373 environmental conditions, since metabolically competent taxa like Myceliophthora (Li et al.,

374 1999; Babot et al., 2011) assimilated carbon from cellulose under the same conditions. One

375 possible explanation is that fungi were involved in lignin-degradation but did not metabolize

376 degradation products to become sufficiently labeled, perhaps because of bacterial cross-feeding,

377 evident in the inhibitory effect of fungicide on lignin incorporation by bacteria. Indeed, the

378 results from the fungicide-treatments indicated significant interactions occurred between bacteria

379 and fungi in lignin degradation. However, fungi were not essential for bacterial lignin

380 degradation, given the persistent and substantial incorporation of 13C-lignin by bacteria in

381 fungicide-treated soils. This study provides strong evidence for the capacity of bacteria to

382 degrade lignin, highlighting the need for further characterization of their activity and ecology in

383 soil.

384 The substantial number of taxa identified by SIP with previously reported

385 lignocellulolytic activity was testament to the success of previous culturing-based efforts to

386 characterize soil decomposers and a validation of our SIP approach. Approximately 72% (14/19)

387 of hemicellulolytic taxa and 74% (31/42) of cellulolytic taxa were previously reported to degrade

388 the corresponding substrate in vitro. Far fewer lignolytic taxa (~28%, 8/29) had previously

389 reported degradative activity, likely due to less frequent and formalized testing of lignin

390 degradation. However, there was considerable agreement in the bacterial genera we identified

391 and those identified in other culture-independent studies of lignin-degraders in forest soil, such

392 as Caulobacter, Sphingomonas, Sphingobacterium, Nocardia, Telmatospirillum and

393 Azospirillum (DeAngelis et al., 2011a; Woo et al., 2014; Pold et al., 2015). A lower proportion

394 of lignin-degrading taxa associated with mineral layer soil (5/16) had previously reported activity bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

395 compared to those associated with the organic layer (8/12), suggesting the former may be more

396 difficult to culture or were less frequently targeted for study.

397 Many of the novel cellulolytic groups we identified belong to cultivation-resistant phyla,

398 Planctomycetes, Verrucomicrobia, Chloroflexi and Armatimonadetes (formerly OP10), which

399 commonly predominate soil communities (Youssef et al., 2009; Bergmann et al., 2011). Each of

400 these phyla possesses at least one isolate capable of degrading cellulose (Sangwan et al., 2004;

401 Lladó et al., 2015; Dedysh et al., 2013; Lee et al., 2014) and have been designated cellulolytic in

402 other SIP-cellulose experiments (Eichorst and Kuske, 2012; Verastegui et al., 2014; Pepe-

403 Ranney and Campbell et al., 2016; Wilhelma et al., 2017). Notably, all contigs that contained

404 clusters of ten or more CAZymes (27 contigs), from 13C-cellulose metagenomes, were classified

405 to taxa from the aforementioned groups and contained both CBMs (26/27) and endoglucanases

406 (20/27). These findings supported our assertion that the richness of described lignocellulolytic

407 taxa is underestimated in soils, which limits our understanding of the ecology of decomposition

408 and may afford new types of biocatalysts for processing lignocellulosic biomass.

409 The widespread capacity among members of Caulobacteraceae to degrade the three

410 components of lignocellulose was unexpected, yet consistent with reports of their enrichment on

411 decaying wood (Folman et al., 2008; Valaskova et al., 2009) and during early stages of litter

412 decomposition (Smenderovac, 2014). Caulobacter were first isolated from cellulose-amended

413 lake-water (Henrici and Johnson, 1935), but are primarily known as oligotrophic, aquatic

414 organisms (Poindexter et al., 1981). Their role in degradation of plant carbohydrates was first

415 postulated based on the analysis of the C. crescentus genome (Nierman et al., 2001) and

416 subsequently demonstrated by growth on cellulose, xylose and vanillate (Hottes et al., 2004;

417 Thanbichler et al., 2007; Song et al., 2013; Presley et al., 2014). Although we provide the first bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

418 evidence for the role of Caulobacteraceae in degrading all three polymers of lignocellulose,

419 several surveys of forest soils report enrichment of Caulobacter or Asticcacaulis in samples

420 amended with cellulose (Verastegui et al., 2014; Wang et al., 2015; Pepe-Ranney and Campbell

421 et al., 2016) or lignin (DeAngelis et al., 2011; Pold et al., 2015). The role of Caulobacteraceae in

422 decomposition in forest soils may prove significant, given their relatively high abundance (0.5 -

423 2.5% of total libraries) and their capacity to adhere to insoluble polymers, like lignocellulose.

424 The and catabolic capacity of lignocellulose degraders were moderately

425 associated with variation in the total lignocellulolytic activity. The influence of compositional

426 differences between soil layers was, in certain cases, overshadowed by the strong negative

427 correlation between organic matter and 13C enrichment. Yet, in regression modeling, CAZy gene

428 content and community composition had equivalent or greater explanatory power than other

429 predictors. The abundances of several prominent lignocellulolytic taxa were significantly

430 correlated with higher 13C assimilation, demonstrating that certain taxa play more significant

431 roles, at least under conditions we tested. Strongly oxidative bacterial enzymes commonly

432 studied in relation to lignin-degradation, such as DyP-type peroxidases and laccases (Ausec et

433 al., 2012; Colpa et al., 2014; Singh and Eltis, 2015), were less predictive of lignolytic activity

434 than aryl alcohol oxidases, suggesting a greater role of the latter class of enzymes in bacterial

435 lignin-degradation. Overall, fewer endoglucanase genes were predictive of 13C-assimilation

436 compared to AA gene families. This difference suggests either the existence of a greater

437 diversity of endoglucanases, minimizing the explanatory power of any single gene, or a narrower

438 activity range of AA families with a correspondingly higher relevance to lignolytic activity. This

439 result agrees with the finding that the composition of decomposer communities had increased bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

440 explanatory power for the rate of litter decomposition during later stages when greater

441 proportions of lignin remained (Cleveland et al., 2014).

442 The comprehensive study of all three major lignocellulosic polymers enabled us to

443 examine the co-occurrence of lignocellulolytic traits, addressing several knowledge gaps in this

444 area. The results indicated unexpected specialization among bacterial populations for

445 degradation of individual lignocellulosic polymers and revealed several novel lignocellulolytic

446 taxa, highlighting current limits to our knowledge of decomposition. This research supports the

447 view that bacterial decomposition of oligomeric lignin is a ubiquitous soil process, with the

448 potential to occur in deeper soil layers following early stages of litter decomposition. As

449 hypothesized, variation in community composition was found to constrain lignocellulolytic

450 activity across various forest types and soil layers in North America. The relationship between

451 these communities and process rates should receive continuing study to refine our understanding

452 of soil carbon stabilization and terrestrial carbon cycling models. Furthermore, the large number

453 of degradative gene clusters from uncultivated lignocellulolytic taxa represent a trove of

454 potentially novel enzymes for biotechnological applications.

455 Acknowledgements

456 This study was supported by a Large Scale Applied Research Project (162MIC) funded by

457 Genome Canada and Genome BC. R. Wilhelm was supported by a NSERC graduate scholarship

458 and UBC Four Year Fellowship.

459 Conflict of Interest 460 The authors declare no conflict of interest. bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

References

461 Allison SD. (2012). A trait-based approach for modelling microbial litter decomposition. Ecol 462 Lett, 15: 1058-1070.

463 Anders, S., and Huber, W. (2010). Differential expression analysis for sequence count data. 464 Genome Biol. 11, R106.

465 Baldrian P, Kolařik M, Štursová M, Kopecký, Valášková V, Větrovský T et al. (2011). Active 466 and total microbial communities in forest soil are largely different and highly stratified during 467 decomposition. ISME J, 6: 248-258.

468 Benjamini Y, Hochberg Y. (1995). Controlling the false discovery rate: a practical and powerful 469 approach to multiple testing. J R Stat Soc B Met, Series B: 289-300.

470 Berlemont R, Martiny, AC. (2013). Phylogenetic Distribution of Potential Cellulases in Bacteria. 471 Appl Environ Microbiol, 79: 1545-1554.

472 Bligh EG, Dyer WJ. (1959). A rapid method of total lipid extraction and purification. Can J 473 Biochem and Physiol, 37: 911-917.

474 Babot ED, Rico A, Rencoret J, Kalum L, Lund H, Romero J, et al. (2011). Towards industrially- 475 feasible delignification and pitch removal by treating paper pulp with Myceliophthora 476 thermophila laccase and a phenolic mediator. Biores Technol, 102: 6717-6722.

477 Boisvert S, Raymond F, Godzaridis É, Laviolette F, Corbeil J. (2012). Ray Meta: scalable de 478 novo metagenome assembly and profiling. Genome Biol, 13: R122.

479 Bolger, A.M., Lohse, M., Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina 480 Sequence Data. Bioinformatics, 30: 2114-2120.

481 Brown ME, Barros T, Chang MC. (2012). Identification and characterization of a multifunctional 482 dye peroxidase from a lignin-reactive bacterium. ACS Chemical Biol, 7: 2074-2081.

483 Buchfink B, Xie C, Huson DH. (2015). Fast and sensitive protein alignment using DIAMOND. 484 Nature methods, 12: 59-60.

485 Bugg TDH, Ahmad M, Hardiman EM, Singh R. (2011). The emerging role for bacteria in lignin 486 degradation and bio-product formation. Curr Opin Biotechnol, 22: 1-7.

487 Churchland C, Grayston SJ, Bengtson P. (2013). Spatial variability of soil fungal and bacterial 488 abundance: consequences for carbon turnover along a transition from a forested to clear-cut site. 489 Soil Biol Biochem 63: 5-13.

490 Crawford DL. (1978). Lignocellulose decomposition by selected Streptomyces strains. Appl 491 Environ Microbiol, 35: 1041-1045.

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

492 Davis JR, Goodwin L, Teshima H, Detter C, Tapia R, Han C et al. (2013). Genome sequence of 493 Streptomyces viridosporus strain T7A ATCC 39115, a lignin-degrading Actinomycete. Genome 494 Announc, 1: e00416-00413.

495 De Caceres M, Legendre P. (2009). Associations between species and groups of sites: indices 496 and statistical inference. Ecology, 90: 3566-3574.

497 DeAngelisa KM, Allgaier M, Chavarria Y, Fortney JL, Hugenholtz P, Simmons B et al. (2011). 498 Characterization of Trapped Lignin-Degrading Microbes in Tropical Forest Soil. PLoS ONE, 6: 499 e19306.

500 DeAngelisb KM, D'Haeseleer P, Chivian D, Fortney JL, Khudyakov J, Simmons B et al. (2011). 501 Complete genome sequence of Enterobacter lignolyticus SCF1. Stand Genomic Sci, 5: 69-85.

502 Ding SY, Liu YS, Zeng Y, Himmel ME, Baker JO, Bayer EA. (2012). How does plant cell wall 503 nanoscale architecture correlate with enzymatic digestibility? Science, 338: 1055-1060.

504 Eichorst SA, Kuske CR. (2012) Identification of Cellulose-Responsive Bacterial and Fungal 505 Communities in Geographically and Edaphically Different Soils by Using Stable Isotope 506 Probing. Appl Environ Microbiol, 78: 2316–27.

507 Finn RD, Clements J, Eddy S. (2011). HMMER web server: interactive sequence similarity 508 searching. Nucl Acids Res, 39: W29-W37.

509 Floudas D, Held BW, Riley R, Nagy LH, Koehler G, Ransdell AS et al. (2015). Evolution of 510 novel wood decay mechanisms in Agaricales revealed by the genome sequences of Fistulina 511 hepatica and Cylindrobasidium torrendii. Fungal Gen Biol, 76: 78-92.

512 Folman LB, Klein Gunnewiek PJA, Boddy L, de Boer W. (2008). Impact of white-rot fungi on 513 numbers and community composition of bacteria colonizing beech wood from forest soil. FEMS 514 Microbiol Ecol, 63: 181-191.

515 Giovannoni SJ, Thrash JC, Temperton B. (2014). Implications of streamlining theory for 516 microbial ecology. ISME J, 8: 1553.

517 Gordon A, Hannon GJ. (2010). Fastx-toolkit. Computer program distributed by the author, 518 website: http://hannonlab.cshl.edu/fastx_toolkit/index.html [accessed November 2015].

519 Grömping U. (2006). Relative Importance for Linear Regression in R: The Package relaimpo. J 520 Stat Softw, 17: 1-27.

521 Hall SJ, Silver WL, Timokhin VI, Hammel KE. (2015). Lignin decomposition is sustained under 522 fluctuating redox conditions in humid tropical forest soils. Glob Change Biol, 21: 2818-2828.

523 Harrell FE, Dupont C. (2015). Hmisc: Harrell Miscellaneous. R package version 3.17-1. 524 https://CRAN.R-project.org/package=Hmisc bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

525 Hartmann M, Howes CG, VanInsberghe D, Yu H, Bachar D, Christen R et al. (2012). Significant 526 and persistent impact of timber harvesting on soil microbial communities in Northern coniferous 527 forests. ISME J, 6: 2199-2218.

528 Huson DH, Auch AF, Qi J, Schuster SC. (2007). MEGAN analysis of metagenomic data. 529 Genome Res 17, 377-386.

530 Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. (2010). Prodigal: 531 prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 532 11: 119.

533 Kang DD, Froula J, Egan R, Wang Z. (2014). A robust statistical framework for reconstructing 534 genomes from metagenomic data. bioRxiv, 011460.

535 Karhu K, Auffre MD, Dungait J, Hopkins DW, Prosser JI, Singh BK et al. (2014). Temperature 536 sensitivity of soil respiration rates enhanced by microbial community response. Nature, 513: 81- 537 85.

538 Kirk TK, Brunow G. (1988). Synthetic 14C-labeled lignins. In, Methods in enzymology. Eds. 539 Colowick, SP and Kaplan NO, 161: 65-73.

540 Kursa MB, Rudnicki WR. (2010). Feature Selection with the Boruta Package. J Stat Softw, 36: 1- 541 13.

542 Langmead B, Salzberg SL. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9: 543 357-359.

544 Leung HT, Maas KR, Wilhelm RC, Mohn WW. (2016). Long-term effects of timber harvesting 545 on hemicellulolytic microbial populations in coniferous forest soils. ISME J, 10: 363-375.

546 Li D, Liu CM, Luo R, Sadakane K, Lam TW. (2015). MEGAHIT: An ultra-fast single-node 547 solution for large and complex metagenomics assembly via succinct de Bruijn graph. 548 Bioinformatics, 31: 1674-1676.

549 Li K, Xu F, Eriksson KE. (1999). Comparison of Fungal Laccases and Redox Mediators in 550 Oxidation of a Nonphenolic Lignin Model Compound. Appl Environ Microbiol, 65: 2654-2660.

551 Lladó S, Benada O, Cajthaml T, Baldrian P, Garcia-Fraile P. (2015). Silvibacterium bohemicum 552 gen. nov. sp. nov., an acidobacterium isolated from coniferous soil in the Bohemian Forest 553 National Park. Syst Appl Microbiol, 39: 14-19.

554 McMurdie PJ, Holmes S. (2014) phyloseq: an R package for reproducible interactive analysis 555 and graphics of microbiome census data. PLoS One, 8: e61217

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

556 Masai E, Katayama Y, Fukuda M. (2007). Genetic and Biochemical Investigations on Bacterial 557 Catabolic Pathways for Lignin-Derived Aromatic Compounds. Biosci Biotechnol Biochem, 71: 558 1-15.

559 Medie FM, Davies GJ, Drancourt M, Henrissat, B. (2012). Genome analyses highlight the 560 different biological roles of cellulases. Nat Rev Microbiol, 10: 227-234.

561 Morris SA, Radajweski S, Willison TW, Murrel JC. (2002). Identification of the functionally 562 active methanotroph population in a peat soil microcosm by stable-isotope probing. Appl 563 Environ Microbiol, 68: 1446-1453.

564 Morrissey EM, Mau RL, Schwartz E, Caporaso JG, Dijkstra P, van Gestel N et al. (2016). 565 Phylogenetic organization of bacterial activity. ISME J, 10: 2336 – 2340.

566 Myneni RB, Dong J, Tucker CJ, Kaufmann RK, Liski J, Zhou L et al. (2001). A large carbon 567 sink in the woody biomass of Northern forests. PNAS, 98: 14784-14789.

568 Neufeld JD, Vohra J, Dumont MG, Leuders T, Manefield M, Friedrich et al. (2007). DNA 569 stable-isotope probing. Nat Protoc, 2: 860-866.

570 Nierman WC, Feldblyum TV, Laub MT, Paulsen IT, Nelson KE, Eisen J et al. (2001) Complete 571 genome sequence of . PNAS, 98: 4136-4141.

572 Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR, O’Hara RB et al. (2015). vegan: 573 Community Ecology Package. R package version 2.3-1. http://CRAN.R- 574 project.org/package=vegan

575 Parks D, Imelfort M, Skennerton CT, Hugenholtz P, Tyson G. (2015). CheckM: assessing the 576 quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome 577 Res, 25: 1043-1055.

578 Pepe-Ranney C and Campbell AN, Koechli CN, Berthrong S, Buckley DH. (2016). Unearthing 579 the ecology of soil microorganisms using a high-resolution DNA-SIP approach to explore 580 cellulose and xylose metabolism in soil. Front Microbiol, 7: 703.

581 Poindexter JS. (1981). The Caulobacters: Ubiquitous Unusual Bacteria. Microbiol Rev, 45: 123- 582 179.

583 Pold G, Melillo JM, DeAngelis KM. (2015). Two decades of warming increases diversity of a 584 potentially lignolytic bacterial community. Front Microbiol, 6:480.

585 Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J et al. (2007) SILVA: a 586 comprehensive online resource for quality checked and aligned ribosomal RNA sequence data 587 compatible with ARB. Nucl Acids Res, 35: 7188-7196.

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

588 R Core Team. (2015). R: A language and environment for statistical computing. R Foundation 589 for Statistical Computing, Vienna, Austria. https://www.R-project.org/

590 Radajweski S, Ineson P, PArakh NR, Murrel JC. (2000). Stable-isotope probing as a tool in 591 microbial ecology. Nature, 403: 646-649.

592 Rashid GM, Taylor CR, Liu Y, Zhang X, Rea D, Fulop V et al. (2015). Identification of 593 manganese superoxide dismutase from Sphingobacterium sp. T2 as a novel bacterial enzyme for 594 lignin oxidation. ACS Chem Biol, 10: 2286-2294.

595 Romani AM, Fischer H, Mille-Lindblom C, Tranvik LJ. (2006). Interactions of bacteria and 596 fungi on decomposing litter: differential extracellular enzyme activities. Ecol, 87: 2559-2569.

597 Ruka DR, Simon GP, Dean KM. (2012). Altering the growth conditions of Gluconacetobacter 598 xylinus to maximize the yield of bacterial cellulose. Carbohyd Polym, 89: 613-622.

599 Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB et al. (2009). 600 Introducing mothur: open-source, platform-independent, community-supported software for 601 describing and comparing microbial communities. Appl Environ Microbiol 75: 7537-7541.

602 Schmidt, JM. (1981). The genera Caulobacter and Asticcacaulis. In: Starr MP, Stolp H, Trüper 603 HG, Balows A, Schlegel HG (eds). The Prokaryotes, a handbook on the habitats, isolation, and 604 identification of bacteria. Springer Verlag: Berlin, pp 466-476.

605 Song, Z., Vail, A., Sadowsky, M.J., Schilling, J.S. (2015). Influence of Hyphal Inoculum 606 potential on the Competitive Success of Fungi Colonizing Wood. Microb. Ecol. 69, 758-767.

607 Strickland, M.S. and Rousk, J. (2010). Considering fungal: bacterial dominance in soils– 608 methods, controls, and ecosystem implications. Soil Biol Biochem, 42: 1385-1395.

609 Štursová M, Žifčáková L, Leigh MB, Burgess R, Baldrian P. (2012). Cellulose utilization in 610 forest litter and soil: identification of bacterial and fungal decomposers. FEMS Microbiol Ecol, 611 80: 735-746.

612 Taylor CR, Hardiman EM, Ahmad M, Sainsbury PD, Norris PR, Bugg TDH. (2012). Isolation of 613 bacterial strains able to metabolize lignin from screening of environmental samples. J Appl 614 Microbiol, 113: 521-530.

615 Verastegui Y, Cheng J, Engel K, Kolczynski D, Mortimer S, Lavigne J et al. (2014). 616 Multisubstrate isotope labeling and metagenomic analysis of active soil bacterial communities. 617 mBio, 5: e01157-14.

618 Větrovský T, Steffen KT, Baldrian P. (2014). Potential of Cometabolic Transformation of 619 Polysaccharides and Lignin in Lignocellulose by Soil Actinobacteria. PLoS ONE, 9: e89108.

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

620 Wang Q, Garrity GM, Tiedje JM, Cole JR. (2007). Naive Bayesian classifier for rapid 621 assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol, 73: 622 5261-5267.

623 Westram R, Bader K, Prüsse E, Kumar Y, Meier H, Glöckner FO et al. (2011). ARB: a software 624 environment for sequence data. Handbook of molecular microbial ecology I: metagenomics and 625 complementary approaches, 399-406.

626 Wickham H. (2007). Reshaping Data with the reshape Package. J Stat Soft, 21: 1-20.

627 Wickham H. (2011). The Split-Apply-Combine Strategy for Data Analysis. J Stat Soft, 40: 1-29.

628 Wickham H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 629 2009.

630 Wieder WR, Grandy AS, Kallenbach CM, Taylor PG, Bonan GB. (2015). Representing life in 631 the Earth system with soil microbial functional traits in the MIMICS model. Geosci Model Dev, 632 8:1789-1808.

633 Wilhelm RC, Szeitz S, Klassen TL, Mohn WW. (2014). Sensitive, Efficient Quantitation of 13C- 634 Enriched Nucleic Acids via Ultrahigh-Performance Liquid Chromatography–Tandem Mass 635 Spectrometry for Applications in Stable Isotope Probing. Appl Environ Microbiol, 80: 7206- 636 7211.

637 Wilhelm RC. (2016). Deciphering Decomposition and the Effects of Disturbance in Forest Soil 638 Microbial Communities with Metagenomics and Stable Isotope Probing. Ph.D. thesis. University 639 of British Columbia.

640 Wilhelma RC, Cardenas E, Leung H, Szeitz A, Jensen LD, Mohn WW. (2017). Long-term 641 Enrichment of Stress-tolerant Cellulolytic Soil Populations following Timber Harvesting 642 Evidenced by Multi-omic Stable Isotope Probing. Front Microbiol, 8: 537.

643 Wilhelmb RC, Cardenas E, Maas K, Leung H, McNeil L, Berch S et al. (2017). Biogeography 644 and Organic Matter Removal Shape Long-term Effects of Timber Harvesting on Forest Soil 645 Microbial Communities. ISME J, 11: 2552-2568.

646 Wilhelmc RC, Cardenas E, Hahn AS, Leung H, Maas K, Hartmann M, et al. (2017). A Multi- 647 omic Survey of Forest Soil Microbial Communities More Than a Decade After Timber 648 Harvesting. Sci Data, 4:170092.

649 Woo HL, Hazen TC, Simmons BA, DeAngelis KM. (2014). Enzyme activities of aerobic 650 lignocellulolytic bacteria isolated from wet tropical forest soils. Syst Appl Microbiol, 37: 60-67.

651 Yin Y, Mao X, Yan J, Chen X, Mao F, Xu Y. (2012). dbCAN: a web resource for automated 652 carbohydrate-active enzyme annotation. Nucl Acids Res, 40: W445-W451.

bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

653 Youngblut N, Buckley DH. (2015). Intra-genomic variation in G + C content and its implications 654 for DNA stable isotope probing. Env Microbiol, 6: 767-775.

655 Zeng Y, Zhao S, Yang S, Ding SY. (2014). Lignin plays a negative role in the biochemical 656 process for producing lignocellulosic biofuels. Curr Opin Biotech, 27: 38-45. bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure Legends

657 Figure 1. (A) An overview of the samples and methods used in this study, including (B) 658 evidence of the enrichment of 13C in DNA extracted from soil amended with 13C-labeled or 659 unlabeled substrate, and (C) for the separation and recovery of 13C-labeled DNA from heavy 660 fractions of a cesium chloride density gradient. The schematic in A illustrates our use of organic 661 and mineral layer soil from five forest regions incubated with or without one of three model 662 constituents of lignocellulose. Due to limited quantities of 13C-labeled substrate, the complete 663 pairing of all forest regions and substrates was not possible. In B and C, Statistically supported 664 differences between paired labeled and unlabeled treatments (t-test; p < 0.01) are designated with 665 an asterisk (*).

666 Figure 2. Designations of all putatively hemicellulolytic, cellulolytic and lignolytic taxa based 667 on differential abundance between 13C- and 12C-pyrotag (bacterial OTUs) or unassembled 668 metagenomes (fungal taxonomy based on LCA). The lowest possible classifications are given 669 with the prefix corresponding to taxonomic rank. The abundance of each taxon in situ (i.e. in 670 unincubated field soil) is represented by a scaled circle that was coloured (and given a coloured 671 background) if that taxon was attributed lignocellulolytic activity in that forest region. 672 Uncoloured (white) circles indicate SIP data was not available for that forest region (consult 673 Figure 1a for overview) and grey circles indicate SIP data was available, but the taxon was not 674 attributed activity. The averaged ratio of abundance in 13C- versus 12C-libraries for each 675 taxonomic classification (i.e. all OTUs within each taxon) is displayed for each substrate. A 676 lemniscate (∞) indicates cases where that taxon was detected in only 13C-libraries. A full account 677 of all enriched OTUs and citations for all reported catabolic activity is available in 678 Supplementary Table S4.

679 Figure 3. An overview of the classification and quality of metagenome-assembled genomes 680 (MAGs) recovered from organic and mineral soil layers from 13C-cellulose and 13C-lignin 681 shotgun metagenomes. (A) Proportion of MAGs classified to various Orders. Taxonomic 682 designations are given to any MAG possessing > 40% uniformity in the classification of 683 sequence data (normalized to contig length). (B) Information about the top-quality MAGs based 684 on completeness (> 80%) and low contamination (< 40%) based on the completeness or 685 duplication, respectively, in lineage-specific single-copy gene sets. (C) Quality metrics for all 686 MAGs, with top-quality MAGs identified as red points. The complete list and all information on 687 the MAGs is provided in Table S5.

688 Figure 4. Characteristics of 13C assimilation by members of Caulobacteraceae evidenced by (A) 689 their enriched relative abundance in 13C-amplicon libraries for all three substrates; (B) 690 differences in the relative abundance of endoglucanase and AA gene families between closely 691 related genera, Asticcacaulis and Caulobacter, and (C) a heatmap profiling CAZy gene content 692 in cellulose and lignin metagenome assemblies. In C, the x-axis corresponds to individual bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

693 metagenomes and the y-axis abundant CAZy families. Both axes are clustered according to 694 Bray-Curtis dissimilarity for the full dataset, though only CAZy families with greater than 5 read 695 counts per million were displayed. Statistically supported differences are grouped by lettering 696 (TukeyHSD; p < 0.01). Consult Figure S4 for a complete breakdown of differentially abundant 697 genera in Caulobacteraceae.

698 Figure 5. The influence of environmental and microbial community variables on lignolytic 699 activity. (A) Environmental and community compositional variables which explain the highest 700 proportion of variance in total 13C incorporation into PLFA. Compositional predictors 701 ‘taxonomy’ and ‘CAZy’ were based on the separation of samples by the primary and secondary 702 axes of NMDS ordinations. (B) The correlation in relative abundance of Caulobacteraceae reads 703 in 16S rRNA libraries and 13C-enrichment of DNA for lignin. (C) The negative correlation 704 between total 13C incorporation from lignin into PLFA and total soil carbon in mineral soils. δ- 705 13C is a measure of the ratio of stable isotopes of carbon relative to Vienna Pee Dee Belemnite.

706 Figure 6. A comparison of 13C incorporation into microbial biomass in soils of differing forest 707 region based on (A) δ-13C in PLFA, and (B) the relative incorporation into bacterial versus 708 fungal PLFA markers. In A, PLFAs from microcosms incubated with unlabeled substrates had 709 comparable delta-values to soil (-25 ‰), and low variance, and are distinguished by a dashed 710 line. Statistically supported differences are grouped by lettering (TukeyHSD; p < 0.01). In B, the 711 ratio represented by the y-axis was designed to normalize 13C-enrichment in fungi relative to 13 13 712 bacteria ( Cfungi : Cbacteria) to the proportion of pre-existing bacterial and fungal biomass 12 12 713 ( Cfungi : Cbacteria). Differences in incubation length of microcosms among substrates should be 714 considered when comparing between them. Also, δ-13C enrichment is weighted against total 12C, 715 and thus reflects the total lignocellulolytic activity relative to total biomass which is 3- to 20- 716 fold higher in organic layer soils (details in Table S1).

717 Figure 7. Impact of fungicide treatment on (A) the total assimilation of 13C from lignin in to 13 718 PLFA and (B) the relative abundance of taxa in C-metagenomes. In B, BSON was averaged 719 from two replicates. Taxa that commonly occurred in 12C-control libraries but were not 720 designated as major lignolytic taxa were left uncolored (St: Streptomycetaceae, Br: 721 Bradyrhizobiaceae; Ac: Acidobacteriaceae, Ch: Chitinophagaceae and Cy: Cytophagaceae). 722 Taxa comprising fewer than 0.05% of total reads are not displayed.

723 Figure 8. Differences in proportion of fungal and bacterial encoded auxiliary activity (lignin- 724 modifying) genes in 13C- versus 12C-metagenomes. Bubbles are coloured according to domain of 725 life and scaled to the total read counts per million. The top three most enriched taxa, at least 3- 726 fold more abundant in 13C-libraries, are displayed and coloured according to domain of life. The 727 fold-change of enrichment for each taxa is provided in brackets. bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure 1.

No Incubation A Forest Soil British Columbia PLFA Analysis • GC-IRMS Hemicellulose IDFBC

Organic • Quantify 13C in Fungi/Bacteria

H

H H H O

OH

O Layer Mineral O O OH O O

O O O O O O

HO

HO O

OHO OH Layer HO O HO

OH O

(~20 cm) O HO ...

O OH H Sieve & ... PPCA Weigh 2-day incubation maize extract Amplicon Sequencing California • 16S rRNA gene libraries H H O HO OH O HO OH O O O O O O O O • Identify functional taxa (bact.) HO OH O HO OH O H H *all substrates H H O HO OH O HO OH O O O O O O ... O O ... HO OH O HO OH O H H H CelluloseH O HO OH O HO OH O O O O O O 13 12 O O C-Labeled C-Control HO OH O HO OH O H H Ontario 14-day BSON Comparable bacterial cellulose ‘Heavy DNA’ or JPON Fraction

OH Density Gradient Ultracentrifugation OH O Me OH HO O

HO OMe org+min HO OH (1:3) HO O OH Shotgun Sequencing

O O ... Lignin OMe ... OH • Identify functional genes O MeO LPTX O Texas • Identify functional taxa (fungi) OH OH MeO 60-day Soil Microcosms dehydrogenatively polymerized Fungicide + Lignin Treatment 20° C | in dark 13C-Labeled or 12C-Control • Bacteria-speci c activity Cellulose Lignin Cellulose Lignin B Organic Layer Mineral Layer Org. Min. C 5 30 * * * Label

(ng) 13 4 * C * * * * 12C 3 20 * * ** * * 2 * * * C in Soil DNA Extract * * 10 13

1 DNA Total Average % * * 0 0 BC ON ON CA TX BC ON ON CA TX ON CA TX ON CA TX IDF BS JP PP LP IDF BS JP PP LP BS PP LP BS PP LP F1-7 F8 F9 F10 F1-7 F8-9 F10 bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure 2. Abundance Phylum Best Classication Substrate in situ Acidobacteria o__iii1-15 8.1 o__Ellin6513 ∞ o__Solibacterales 5.0 Actinobacteria o__Acidimicrobiales ∞ f__ACK-M1 ∞ f__Frankiaceae 240 f__Nocardiaceae 17 f__Microbacteriaceae 1418.9 g__Salinibacterium 5.5 5.7 f__Micrococcaceae 21 6.7 f__Streptomycetaceae 8.5 8.9 g__Streptomyces 13 o__Gaiellaceae 19 o__Solirubrobacterales 87 f__Conexibacteraceae 66 g__Conexibacter 37 Armatimonadetes d__FBP 36 o__FW68 14 f__Sphingobacteriaceae ∞ 5.0 g__Cytophaga 300 Chloro exi c__C0119 4.6 Elusimicrobia o__FAC88 33 Firmicutes g__Bacillius 6.6 f__Paenibacillaceae 8.7 Planctomycetes o__DH61 30 f__Caulobacteraceae 28 g__Asticcacaulis 3.9 10 g__Caulobacter 416.63.0 g__Bosea 4.8 g__Devosia 6.0 6.7 g__Agrobacterium 3.6 g__Telmatospirillum 11 ∞ f__Sphingomonadaceae 6.1 3.5 g__Azospirillum 42 f__Erythrobacteraceae 14 Betaproteobacteria g__Burkholderia 4.1 10 f__Comamonadaceae 3.4 4.4 g__Pelomonas ∞ 5.2 ∞ g__Leptothrix 4.1 g__Variovorax 28 f__Oxalobacteraceae 34 g__Janthinobacterium ∞ 4.0 IDFBC PPCA BSON JPON LPTX Bacteria Organic-associated Mineral-associated Previously reported activity by members of group bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure 2. Abundance Phylum Best Classi cation Substrate in situ Deltaproteobacteria o__MIZ46 ∞ f__Cystobacteraceae 20 f__Polyangiaceae 61 Gammaproteobacteria g__Cellvibrio 18 6.1 g__Pseudomonas 3.6 f__Piscirickettsiaceae 250 f__Sinobacteraceae 16 TM7 c__ 23 (Candid. Saccharibacteria) TM7-3 Verrucomicrobia p__Verrucomicrobia 2.0 f__Opitutaceae 88 Ascomycota g__Pseudogymnoascus 61 g__Botrytis 11 g__Arthrobotrys 10 f__Saccharomycetaceae 8.4 g__Verticillium 20 g__Trichoderma 15 g__Hypocrea 4.6 g__Fusarium 23 g__Ophiocordyceps 8.9 g__Chaetomium 46 g__Humicola 30 g__Myceliopthora 54 g__Neurospora 25 g__Magnaporthe 37 Basidiomycota g__Pleurotus 5.2 g__Coprinopsis 2.8 g__Schizophyllum 2.3 g__Laccaria 1.6 g__Piriformospora 42 g__Trichosporon 2.4 g__Cryptococcus 2.4 IDFBC PPCA BSON JPON LPTX Bacteria Fungi Organic-associated Mineral-associated Forest Sites Substrate Scale (IDF) British Columbia Hemicellulose 16 32 64 128 California (PP) Cellulose counts per thousand Ontario (BS) Lignin Ontario (JP) Texas (LP) Previously reported activity by members of group No SIP Data Available No Activity in Ecozone bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 3.

A Organic Associated Organic Mineral B Phylum Classication Size (Mb) C Longest Sca old (Kb) Genome Size (Mb) Acidobacteriales Actinobacteria g__Arthrobacter (78%) 4.9 1,000 ● ● ● ● Rhizobiales g__Conexibacter (56%) 2.8 ● ● 30 ● ● ● ● g__Conexibacter (44%) 3.4 ● ● ● 300 ● ● ● ● ● ●● Sphingobacteriales ● ● ● g__Conexibacter (55%) 2.7 ● ●● ● ●●● ● ●● ● 10 ●● ● ● ●● ● ● ●● ●●● ●●● ● ●●● Streptomycetales g__Leifsonia (42%) 4.6 ●● ● ● ● ● ● ●●●● ● ● 100 ● ● ●●●●●● ●●●● ● ●● ●●●● ● ●●●●●●● ● ●●●● ● ● ●●● ●●●●●●● Xanthomonadales ●●●●● ●●● ●● Alphaproteo- g__ (87%) 4.1 ●●● ● Asticcacaulis ●●●● ● ● ●● 3 ●●● ● ● ● ●●●● ●●●● ●●● ●●●●● ●●●● ●●●●●●● ●●●●● ● ●●●● ●●●● ● ●●●●●● ●●●●●● ●●●●● g__Asticcacaulis (94%) 4.7 ●●●●●●●● ●● ●● ●●●● ●●● ● ● ●●●●●●●●● ●●●● ● ●●●● ●●● 30 ●●●●●●●● ● ●●● ●●●●●●●● ● ● ● ●●●●● ●●●● Mineral Associated ●●●●●● ●●● ●●●● g__ (50%) ●●● ●●● ●●●●● Caulobacter 4.5 ● ● ● ●● ●● 1 ●● ● ●●● ●●● ●●●●●●● ●●●●●● ●●●●●● ●●●● ●●●●●●● ● ●● ● ●● ●● ●●●●●● ●● ● ● ●●●●●●●● ● ●●●● ●● ●● ●●●●●●●● g__Magnetospirillum (28%) 1.9 ●●●● ●●● ●●●● ● ●● Chthoniobacterales ● ●●●●●● ●●●●● ●●● ●●●● ●●●●●●●● ●●● ●●●● ●●● ● ●● ● 10 ● ● ● ●●● ●● ●● ●●● ●●●●●● ● ●●●●● ●●●●●●● Betaproteo- g__Herbaspirillum (41%) 4.0 ●● ●●●●● ● ● 0.3 ● ●● Nevskiales ●●●● ●●●●●● ● ●●●● ●● ● ● ●●●●● Deltaproteo- g__Sorangium (67%) 9.2 ●● ●●●●●● Common Taxa Verrucomicrobia g__Chthoniobacter (51%) 9.3 % Contamination % Completeness Sordariales g__Opitutus (65%) 4.1

● ●● ● f__Verrucomicrobiaceae (48%) 8.3 ●●●●●●●●●● ● ● 100 ●●●●●●● ●●●● Burkholderiales ●●●●● ●●●●●● ● ●●●● NA ● ● ● ●● ●●●●● d__Unclassi ed 2.9 ● ● ● ●● ●● ● ●●●● ● ● ●●●●●● ● ● ● ●●●● ● ● ●●●●● ● ●●● ● Ascomycota g__Chaetomium (48%) 38.6 ●● ● 30 ● ● ● ● ●● 100 ● ●●● ● ● ● Cytophagales ● ●●●●●● ● ●● ●● ● ●●●●● ● ●● ● ●●●●●●●● ●●●● ● ●●● Alphaproteo- g__ (58%) ●●●● ●● ● Caulobacter 4.1 ● ● ● ● ● ● ●●● ●● ●●● ●●●●●● ● ● ● ●●● ●● ●● ●● ● ● ● 10 ●●●●● g__Caulobacter (44%) 5.7 ● ● ● ● ● ●●●● ● Myxococcales ●● ●● ● ●● 10 ● ● ●● ● ●●● ●●● Betaproteo- __ (75%) 4.9 ● ●●●●●●●●●●● g Comamonadaceae ●● ● Solirubrobacterales ●● ●● ● ● ●●●●●● ● ● ●●● ● ● ● 3.0 ● g__Collimonas (46%) 5.9 ●●●● ● Sphingomonadales ● ●●●●● ● ●●●●●●●●● ● ● ● ● ● ● ●● ● 0 ●● ● ● ● Org Min Cellulose Lignin ● 0 ●● ● ●● ncellulose 154 128 > 80% Completeness & ● n 20 30 Lignin Lignin ● ●● lignin Cellulose Cellulose < 40% Contamination 0.1 ●●●●●●●●●●●●●●●●●● 0.3 ●●●●●●●●●●●●●●●● bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure 4. A Caulobacteraceae B Asticcacaulis Caulobacter In situ Hemi- Cellulose Lignin Cellulose Lignin Cellulose Lignin

b 300 Label Label b 12C b 6.0 12C 13C 13C

200 b 4.0 b a b b 100 2.0

a per Million Reads Counts Average

Average Counts per Thousand Reads Thousand per Counts Average a a a a a b 0 0 a

BC BC BC ON CA TX CA CA TX CA TX ON ON ON ON

BS JP PP LP IDF LP LP endo endo endo endo IDF PP IDF BS JP PP BS PP GH AA GH AA GH AA GH AA

CBM48 GH106 C GH13 (26) GH18 GH103 GH13 (11) AA6 GH65 CBM50 AA3 GH24 GH102 GH10 GH28 GH16 GH20 GH105 GH115 GH9 GH130 GH97 GH35 GH92 GH5 (7) GH13 (31) GH13 GH74 Label Substrate Label Substrate Read Counts per Million 12C Cellulose 13 Lignin -2 -1 0 1 2+ C z-score bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 5. Forest Region B ● Lignin Taxonomy (Axis 1) 4 r = 0.48; p = 0.006 TOC Horizon ● ● ● 3 ● ● Taxonomy (Axis 2) ● Hemicellulose ● ● ● 2 ● Horizon ● ● ● ● ● ● ● ● ●

C in Soil DNA Extract ● TOC ● ● ● ● 13 ● ● Taxonomy (Axis 1) 1 ● ● % ● Forest Region 1.50 1.75 2.00 2.25 2.50 2.75 CAZy (Axis 2) log10(rel. abund. Caulobacteraceae) CAZy (Axis 1) Taxonomy (Axis 2) Cellulose C 3.8 ● Lignin ● ● ● ● ● CAZy (Axis 1) 3.6 ● ● ● r = -0.85; p < 0.001 Forest Region ● Horizon ● 3.4 TOC C in PLFA) ● 13 Ecozone ● (Axis 2)

Taxonomy (δ-

10 BSON 3.2 ● ● (Axis 2) ● CAZy PPCA ●

log ● Taxonomy (Axis 1) Lignin LPTX 3.0 ● 0 0.1 0.2 0.3 0.0-0.25 0.25 0.50 0.75 1.0 Average Partitioned R2 A log10(Total Soil Carbon) bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 6. Hemicellulose Cellulose Lignin A Org. Min. Organic Layer Mineral Layer Org. Min. 20,000 b

15,000 (‰)

C in PLFA 10,000 a 13 ● a

δ a a 5,000 a a a b ●

b ● b b a b ● b b

0 ●

12 C BC CA 12 C BC CA 12 C BC ON ON CA TX 12 C BC ON ON CA TX 12 C ON CA TX 12 C ON CA TX IDF PP IDF PP IDF BS JP PP LP IDF BS JP PP LP BS PP LP BS PP LP Hemicellulose Cellulose Lignin B Organic Mineral Organic Mineral Organic Mineral

● 2.5

2.0 C-Fungi C-Bacteria 12 12 b : 1.5

More b Fungal b

1.0 a More Bacterial ● C-Fungi

C-Bacteria a ● 13 a

13 a 0.5 ● a

BC CA BC CA BC ON ON CA TX BC ON ON CA TX ON CA TX ON CA TX IDF PP IDF PP IDF BS JP PP LP IDF BS JP PP LP BS PP LP BS PP LP bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 7.

BSON PPCA LPTX BSON PPCA LPTX A B 100 St Ac Ac Family 6,000 Cy Cy Burkholderiaceae = fungicide St Br Caulobacteraceae Ch Br

(%) Br Br Comamonadaceae (‰) Nocardiaceae 4,000 Br Br St Oxalobacteraceae 50 Br Pseudomonadaceae C in PLFA

13 Sphingobacteriaceae St δ- Sphingomonadaceae 2,000 Ac

Relative Abundance Relative Xanthomonadaceae (Fungi) Saccharomycetaceae

0 0

+ + + Fungi Gram+ Gram -

13 13 13 12 C-ControlC-Lignin13 C-Lignin 12 C-ControlC-Lignin13 C-Lignin 12 C-ControlC-Lignin13 C-Lignin bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 8.

Cellulose 12C 13C Lignin 12C 13C Chaetomiaceae Nectriaceae (11x) Comamonadaceae AA1 Hypocreaceae AA1 Sphingomonadaceae laccases Caulobacteraceae AA2 Sordariaceae (7x) peroxidases Glomerellaceae Chaetomiaceae (65x) AA2 Aspergillaceae (6x) (4x) Oxalobacteraceae (5x) Burkholderiaceae Cytophagaceae(3x) AA3 Streptosporangiaceae aryl alcohol oxidases AA3

Chaetomiaceae (22x) AA4 Aspergillaceae (6x) vanillyl-alcohol Ophiostomataceae (6x) oxidases Acetobacteraceae (22x) Cystobacteraceae (8x) AA4 Pleosporaceae (14x) AA5 Micrococcaceae (7x) alcohol oxidases Oxalobacteraceae (8x) Alteromonadaceae (22x) Burkholderiaceae (4x) AA5 Nocardiaceae (6x) AA6 Caulobacteraceae (4x) benzoquinone Nitrosomonadaceae (13x) reductases AA6 Rhodospirillaceae (7x) Alcanivoracaceae Chaetomiaceae (24x) Sordariaceae (5x) AA7 Clostridiaceae glucosaccharide Acidobacteriaceae (4x) oxidase AA7 Shewanellaceae Chaetomiaceae AA8 Sordariaceae iron reductase Lasiosphaeriaceae domain AA8 N.D. Chaetomiaceae (58x) AA9 Sordariaceae (14x) (12x) AA9 lytic Nectriaceae monooxygenase Streptomycetaceae (9x) AA10 Unclassied (5x) AA10 Pseudomonadaceae Aspergillaceae (4x) Chaetomiaceae (41x) AA11 Sordariaceae Pseudeurotiaceae AA11 Methylobacteriaceae (4x) Chaetomiaceae (27x) Sinobacteraceae AA12 Sordariaceae (17x) Sphingomonadaceae quinone-dep. Lasiosphaeriaceae AA12 oxidoreductase Sordariaceae Glomerellaceae AA13 N.D. Chaetomiaceae AA13

Top 3 Most Di . Domain Abundant Families Bacteria Fungi 0.01 100 10 1 0.1 Average Read Counts Per Million bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S1. Reaction scheme for the synthesis of 13C ring-labeled coniferyl alcohol which was subsequently polymerized into lignin with an average of fourteen coniferyl alcohols. Carbon-13 isotopes are denoted in light blue. Consult the Supplementary Methods for a complete description of synthesis reactions.

O OH O OH O OMe OH H O O H H H H O O Pyridine MeOH, HCl LiAlH4 + OH HO OH Aniline, Piperidine re ux @ 60 °C , 5 h dry THF, on ice OCH3 re ux @ 55 °C for 16 h OH OCH3 OCH3 OCH3 OCH3 OH OH OH OH Vanillin Malonic acid Ferulic acid Ferulic acid Coniferyl alcohol (Methyl ester) bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S2. Evidence for the improved assembly of metagenomes derived from 13C-en- riched DNA in soil microcosms amended with 13C-labeled (blue) or unlabeled (pink) cellulose or lignin. Statistically supported differences between paired labeled and unla- beled treatments (t-test; p < 0.01) are designated with an asterix (*). Statistical testing could not be performed for cellulose with single 12C-libraries. When composited, 12C-li- braries cellulose libraries had significantly lower assembly than 13C-libraries in organic (Wilcoxon; p < 0.003) and mineral layer soils (Wilcoxon; p < 0.001).

Cellulose Lignin 80 Org Min Org Min

60

* * 40

* 20 * % of Reads Mapped to Assembly % of Reads Mapped to

0 BC ON ON CA TX BC ON ON CA TX ON CA TX ON CA TX IDF BS JP PP LP IDF BS JP PP LP BS PP LP BS PP LP bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S3. The relative abundance of prominent lignocellulolytic members of Burkholderiales based on differential abundance between 12C- and 13C-libraries. Plots are organized in a hierarchical structure displaying family and genus for all active taxa within each group. Error bars correspond to one standard error of the mean. Significant (Tukey HSD; padj < 0.05) pairwise differences are grouped by lettering.

Burkholderiales (Order)

Substrate Label b 13C-enriched b Natural abundance 400 b b Substrate b a In Situ Hemicellulose Cellulose Lignin 200 a a a b a a Average Counts per Thousand Reads Thousand per Counts Average 0

BS JP BS JP BS IDF Cali Tex IDF Cali IDF Cali Tex Cali Tex

Burkholderiaceae (Family) Comamonadaceae (Family) Oxalobacteraceae (Family) 300 400 b 200 300 200 b

200 100 100 b a b 100 b a b

a a a a b Average Counts per Thousand Reads Thousand per Counts Average 0 b 0 0 a a

BS JP BS JP BS BS JP BS JP BS BS JP BS JP BS IDF Cali Tex IDF Cali IDF Cali Tex Cali Tex IDF Cali Tex IDF Cali IDF Cali Tex Cali Tex IDF Cali Tex IDF Cali IDF Cali Tex Cali Tex Figure S3continued certified bypeerreview)istheauthor/funder,whohasgrantedbioRxivalicensetodisplaypreprintinperpetuity.Itmadeavailableunder

Average Counts per Thousand Reads bioRxiv preprint 100 200 0 IDF BS JP Cali Tex Substrate doi: Burkholderia a Lignin Cellulose Hemicellulose In Situ IDF b https://doi.org/10.1101/387308 ... Cali

IDF

BS

JP

Label Cali (Genus) 13 12 C C Tex a

BS CC-BY-NC-ND 4.0Internationallicense ; this versionpostedAugust8,2018. Cali

Tex Average Counts per Thousand Reads 100 150 50 10 20 30 0 0 IDF IDF BS BS JP JP Uncl. Commamonadaceae Cali Cali Tex Tex

IDF IDF Variovorax Cali Cali The copyrightholderforthispreprint(whichwasnot . IDF IDF

BS BS

JP JP

Cali Cali (Genus)

Tex Tex a

BS BS b

Cali Cali

Tex Tex 100 150 200 250 50 0 IDF BS JP Cali Tex Janthinobacterium a

IDF b

Cali

IDF

BS

JP

Cali

Tex

(Genus) BS

Cali

Tex bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S4. The relative abundance of all OTUs classified to the genera of Caulobacteraceae. Differences in hemicellulolytic, cellulolytic and ligno- lytic can be observed among genera and unclassified Caulobacteraceae sequences based on differential abundance between 12C- and 13C-libraries. Error bars correspond to one standard error of the mean. Significant (Tukey HSD; padj < 0.05) pairwise differences are grouped by lettering. Caulobacter spp. Asticcacaulis spp.

b b 150 150 Caulobacteraceae

100 300 b 100 b a b b b b b 50 50 b b 200 a a

Average Counts per Thousand Reads Thousand per Counts Average b a a a a 0 a 0 a

BC BC BC BC CA TX BC BC CA TX ON CA CA CA TX ON ON CA CA CA TX ON ON TX ON ON ON TX ON ON

b BS JP PP LP PP JP LP PP LP BS JP PP LP PP JP LP PP LP a IDF IDF IDF BS PP BS IDF IDF IDF BS PP BS b 100 Uncl. Caulobacteraceae spp. spp. a 300 a Average Counts per Thousand Reads Thousand per Counts Average b 40 a a 0 a

BC BC BC ON CA TX CA CA TX CA TX ON ON ON ON 200 30

BS JP PP LP IDF LP LP IDF PP IDF BS JP PP BS PP

20 Substrate Label a a b b a In Situ 13C 100 b b Hemicellulose 12 10 C a b b Cellulose Reads Thousand per Counts Average a a b a a b Lignin 0 0

BC BC BC BC CA TX BC BC CA TX ON CA CA CA TX ON ON CA CA CA TX ON ON TX ON ON ON TX ON ON

BS JP PP LP JP LP PP LP BS JP PP LP JP LP PP LP IDF IDF PP IDF BS PP BS IDF IDF PP IDF BS PP BS bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S5. Maximum parsimony tree showing phylogenetic Caulobacteraceae Tree distribution of lignocellulolytic OTUs within the family Caulo- bacteraceae. Clades are named based on SILVA tree and custom names were assigned to putatively functional clades. Branches Asticcacaulis were coloured to indicate membership to broader clades inset at Uncultured 6 the top left. The closest cultured represtatives were included Caulobacter1 where possible. Clades from this study are comprised of multi- Phenylobacterium ple OTUs with the total number provided in parentheses. Caulobacter2 Uncultured5 Uncultured Clade ‘L1’ 4 0.01 Uncultured3 Uncultured2 Uncultured1 6 Amorphus Asticcacaulis C. crescentus (AE005673) 4 C. segnis (CP002008) Clade ‘L1’ 2 (4) 0 N.D. N.D. N.D.

C. DS48-5-2 (KF360052) BS PP LP C. ginsengisoli (AB271055) C. daechungensis (JX861096) Clade ‘L1’ Outgroup Uncultured soil clone (FQ659175) ‘L1’ Outgroup 7.5 (5) 5 C. fusiformis (AJ227759)

‘Clean-room’ clone (DQ532309) 2.5 30 Clade ‘L2’ ‘Clean-room’ clone (FJ192660) Clade ‘L2’ 0 N.D. N.D. JP IDF BS PP LP BS JP PP LP (3) 20 IDF Clade ‘CH’ 75 Clade ‘CH’ 10 (12)

C. henricii (AJ007805) 50 0 N.D. N.D. N.D.

C. henricii (AJ227758) BS PP LP ‘Betaproteo’ JGI Genome (ATTT01000022) 25

C. EMa18 (JQ977571) 0 N.D.

JP LP BS PP PP BS JP PP LP IDF IDF IDF Clade ‘CH’ Outgroup ‘CH’ Outgroup (13) 4

Phenylobacterium 2 Caulobacter2

0

BS JP PP LP PP BS JP PP LP BS PP LP IDF IDF IDF bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Clade ‘L31’ Figure S5 continued... 30 Clade ‘L31’ (8) 20 ‘Clean-room’ clone (DQ532251) 10 Skin clone (GQ094027)

Soil clone (EU202832) 0

JP BS PP LP PP BS JP PP LP BS PP LP IDF IDF

Clade ‘L32’ Clade ‘L33’ Clade ‘L32’ 3 (3)

40 2

Clade ‘L33’ (4) 20 1 0.01 Volcanic deposit clone (FJ466188) 0 N.D. N.D. N.D. 0 N.D. N.D.

BS JP PP LP BS LP BS PP LP IDF PP Clade ‘L3’ Outgroup ‘L3’ Outgroup 2.5 (14) 2.0

1.5 Clade ‘L41’ 1.0 30 0.5

Clade ‘L41’ 20 0

JP BS PP LP PP BS JP LP BS PP LP (9) IDF IDF IDF PP

10 Clade ‘L42’

Soil clone (AB821150) 20 0 N.D. N.D. N.D. Soil clone (GU731323) BS PP LP 15

10 Clade ‘L42’ ... to Rhizobiales (5) 5 Clade ‘L4’ Outgroup N.D. N.D. Soil clone (EU133327) 6 0 BS JP PP LP BS LP IDF PP

‘L4’ Outgroup 4 (5)

2

0 N.D.

JP BS PP LP BS JP PP LP BS PP LP IDF IDF bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S6. Abundances of OTUs enriched in both lignin and cellulose 13C-pyrotag libraries within the same region. Plots are titled with the lowest supported taxonomic classification and include representative read names. Where two OTUs were combined, both independently exhibited the trend in multiple substrate use. The unclassified Caulobacteraceae OTU belongs to the clade ‘LH3_1’ presented in Supplementary Figure 3b.

Simplicispira (JDR68TK01BPQJF) Uncl. Caulobacteraceae (JJB2FHK01EN1RF) 80 30

60

20 40

10 20 Average Counts per Thousand Reads Thousand per Counts Average

0 N.D. N.D. 0 N.D. N.D.

BS JP PP LP PP BS JP PP LP BS PP LP BS JP PP LP PP BS JP PP LP BS PP LP IDF IDF IDF IDF IDF IDF

(JJB2FHK01ET4AQ & Aquincola (JDR68TK02PO9B) Uncl. Sinobacteraceae JDR68TK02IUC1F) 6 20

15 4

10

2 5 Average Counts per Thousand Reads Thousand per Counts Average

0 N.D. N.D. 0 N.D. N.D.

BS JP PP LP PP BS JP PP LP BS PP LP BS JP PP LP PP BS JP PP LP BS PP LP IDF IDF IDF IDF IDF IDF Substrate Label In Situ 13C Hemicellulose 12C Cellulose Lignin bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S7. Relative abundances of cellulolytic taxa unique to IDFBC. The relative abundances of bacteria are based on 16S rRNA gene pyrotag libraries and fung on LCA classification of unassem- bled shotgun metagenomes. Error bars correspond to one standard error of the mean.

MIZ46 (Deltaproteobacteria) Cellvibrio (Gammaproteobacteria) 80 200

60 150

40 100

20 50 Average Counts per Thousand Reads Thousand per Counts Average Reads Thousand per Counts Average 0 0

JP JP LP LP BS JP PP LP PP BS JP PP LP BS PP LP BS PP LP PP BS PP BS PP IDF IDF IDF IDF IDF IDF

DH61 (Planctomycetes) 80

60 Substrate Label In Situ 13C 40 Hemicellulose 12C Cellulose Lignin 20 Average Counts per Thousand Reads Thousand per Counts Average 0

BS JP PP LP PP BS JP PP LP BS PP LP IDF IDF IDF Ascomycota Basidiomycota 1,500 30,000

1,000 20,000

500 Read Count per Million Read Count per Million Read Count 10,000

0 0

JP JP IDF BS PP LP BS PP LP IDF BS PP LP BS PP LP bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure S8. The average delta-13C enrichment of PLFA markers for fungi, gram-positive bacteria and gram-negative bacteria for all three substrate amendments.

Hemicellulose Cellulose Lignin 30,000 Taxonomic Marker Fungi Gram-Positive 20,000 Gram-Negative C in PLFA 13

δ 10,000

0 Organic Mineral Organic Mineral Organic Mineral bioRxiv preprint doi: https://doi.org/10.1101/387308; this version posted August 8, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Figure S9. Taxonomic classification of catabolic genes in CAZy clusters in 13C-metagenome assemblies, including (A) genes encoding lignin-modifying enzymes, (B) clusters containing a CBM and an endogluca- nase, and (C) all endoglucanase-containing gene families. All taxa represented by a single gene were not displayed. Taxonomic classifications were assigned based on LCA and functional annotation based on top blast hit (e-value < 10-5) to CAZy database or positive match to custom hidden Markov models. a Laccase-only

Bacteria Onygenales Xylariales Dye-decolouring Alcohol

Oxidases Bacteria Peroxidases Bacteria Laccase Oxidase-only & DyP AA1 AA4 AA5 Cytophagales (n=35) (n=79) (n=94) Ktedonobacterales Solirubrobacterales Fungi Hypocreales Proportion of CAZy clusters

Fungi Glomerellales

b 100 Caulobacter c Taxa in Multiple Plots Endoglucanase-only Duganella 300 Acidobacteriales Bacteroidales Streptomyces Actinomycetales Burkholderiales Chthoniobacter 200 Caulobacterales 50 Myxococcales Sorangium Pseudomonadales Rhizobiales 100 Rhodospirillales Percentage of Classi ed Reads of Classi ed Percentage Asticcacaulis Sphingomonadales

Normalized Total Number of CAZy Clusters Total Normalized Xanthomonadales 0 Verrucomicrobiales / (n=81) 0 GH5 (5) GH6 GH7 GH8 GH9 GH12 GH16 GH26 GH44 GH51 GH74 GH81 Chthoniobacterales CBM+Endoglucanase Clusters Endoglucanase-containing Families Sordariales