Delft University of Technology

Operation-driven heterogeneity and overlooked feed-associated populations in global anaerobic digester microbiome

Mei, Ran; Nobu, Masaru K.; Narihiro, Takashi; Kuroda, Kyohei; Muñoz Sierra, Julian; Wu, Zhuoying; Ye, Lin; Lee, Patrick K.H.; Lee, Po Heng; van Lier, Jules B. DOI 10.1016/j.watres.2017.07.050 Publication date 2017 Document Version Accepted author manuscript Published in Water Research

Citation (APA) Mei, R., Nobu, M. K., Narihiro, T., Kuroda, K., Muñoz Sierra, J., Wu, Z., Ye, L., Lee, P. K. H., Lee, P. H., van Lier, J. B., McInerney, M. J., Kamagata, Y., & Liu, W. T. (2017). Operation-driven heterogeneity and overlooked feed-associated populations in global anaerobic digester microbiome. Water Research, 124, 77- 84. https://doi.org/10.1016/j.watres.2017.07.050 Important note To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology. For technical reasons the number of authors shown on this cover page is limited to a maximum of 10. © 2017 Manuscript version made available under CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/ Postprint of Water Research Volume 124, 1 November 2017, Pages 77-84 Link to formal publication (Elsevier): https://doi.org/10.1016/j.watres.2017.07.050 1 Operation-driven Heterogeneity and Overlooked Feed-associated Populations in Global 2 Anaerobic Digester Microbiome 3 4 Ran Meia, Masaru K. Nobua,b, Takashi Narihirob, Kyohei Kurodaa,c, Julian Muñoz Sierrad, 5 Zhuoying Wue, Lin Yef, Patrick K. H. Leeg, Po-Heng Leee, Jules B. van Lierd, Michael J. 6 McInerneyh, Yoichi Kamagatab, Wen-Tso Liua* 7 8 a. Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, 9 Urbana, IL, USA 10 b. Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology 11 (AIST), Tsukuba, Ibaraki, Japan 12 c. Department of Environmental Systems Engineering, Nagaoka University of Technology, Kami-tomioka, 13 Niigata, Japan 14 d. Section Sanitary Engineering, Department of Water Management, Delft University of Technology, 15 Delft, The Netherlands 16 e. Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hung 17 Hom, Kowloon, Hong Kong 18 f. School of the Environment, Nanjing University, Nanjing, Jiangsu, China 19 g. School of Energy and Environment, City University of Hong Kong, Kowloon, Hong Kong 20 h. Department of Botany and Microbiology, University of Oklahoma, Norman, OK, USA 21 22 *Corresponding author: Wen-Tso Liu 23 Email: [email protected] 24 25 26 27 28 *Manuscript Click here to view linked References 29 30 31 32

1 33 Abstract

34 Anaerobic digester (AD) microbiomes harbor complex, interacting microbial populations to

35 achieve biomass reduction and biogas production, however how they are influenced by operating

36 conditions and feed sludge microorganisms remain unclear. These were addressed by analyzing

37 the microbial communities of 90 full-scale digesters at 51 municipal wastewater treatment plants

38 from five countries. Heterogeneity detected in community structures suggested that no single AD

39 microbiome could be defined. Instead, the AD microbiomes were classified into eight clusters

40 driven by operating conditions (e.g., pretreatment, temperature range, and salinity), whereas

41 geographic location of the digesters did not have significant impacts. Comparing digesters

42 populations with those present in the corresponding feed sludge led to the identification of a

43 hitherto overlooked feed-associated microbial group (i.e., the residue populations). They

44 accounted for up to 21.4% of total sequences in ADs operated at low temperature, presumably

45 due to ineffective digestion, and as low as 0.8% in ADs with pretreatment. Within each cluster, a

46 core microbiome was defined, including methanogens, syntrophic metabolizers, fermenters, and

47 the newly described residue populations. Our work provides insights into the key factors shaping

48 full-scale AD microbiomes in a global scale, and draws attentions to the overlooked residue

49 populations.

50

51 Keywords

52 Anaerobic digester, microbiome, operation, feed sludge

53

54

2 55 1. Introduction

56 Wastewater treatment processes, including primary treatment for solids separation and

57 secondary treatment for carbon and nutrients removal, produce substantial amount of waste

58 sewage sludge. For example, the amount of waste sludge generated in European Union is

59 estimated to exceed 13 million dry solid tons in 2020 (Kelessidis and Stasinakis 2012).

60 Anaerobic digestion (AD) has been used worldwide to simultaneously degrade waste sludge and

61 produce methane, and is an promising solution to treat the increasing global growth of organic

62 solid waste (Appels et al. 2011). Meanwhile, the microbial community involved in AD is

63 complex (Narihiro et al. 2015) and a better understanding of the AD ecosystem would optimize

64 existing processes and enhance the engineering application (Vanwonterghem et al. 2014).

65 To identify critical populations responsible for the AD process, multiple researches have

66 tried to define the core AD microbiome. Campanaro et al. (Campanaro et al. 2016) and Treu et al.

67 (Treu et al. 2016) analyzed metagenomic sequences of mesophilic and thermophilic lab-scale

68 digesters treating cattle manure, and concluded that 77 out of 265 genome bins could be

69 considered as the core essential microbial groups in biogas production. Our recent study

70 analyzed the microbial communities of three full-scale digesters in the a wastewater treatment

71 plant and observed a core microbiome that accounted for 59% of the total 16S rRNA gene

72 sequences (Mei et al. 2016a). Studies investigating multiple full-scale plants reported that core

73 populations constituted 36.4% of the total 16S rRNA gene sequences in seven digesters from

74 Seoul, South Korea (Lee et al. 2012), and 28% of the total 16S rRNA gene sequences in seven

75 digesters from France, Germany, and Chile (Riviere et al. 2009). De Vrieze et al. (De Vrieze et al.

76 2015) evaluated the microbial communities of 29 AD installations whose locations were not

77 specified, and reported that Clostridiales and Bacteroidales were part of the core microbiome as

3 78 they were shared by each sample with >0.1% abundance. So, if a large number of digesters are

79 sampled and multiple operating parameters are considered, such as temperature, ammonia

80 concentration, and system configuration that are known to influence AD community (De Vrieze

81 et al. 2015, Smith et al. 2017), would it be still possible to define a core AD microbiome?

82 Furthermore, geographical differences in microbiomes have been observed for waste-treating

83 ecosystems like activated sludge (Zhang et al. 2012) and solid waste landfill (Stamps et al. 2016).

84 Would a similar difference be observed with the AD microbiome?

85 A classic categorization of microorganisms in AD consists of fermenting

86 (fermenters), syntrophic metabolizers (syntrophs), and methanogenic archaea (methanogens)

87 (Schink and Stams 2006). However, it has been realized that AD microbiome embraces a large

88 proportion of with unrecognized ecophysiology (Narihiro 2016). For example, our

89 recent study (Mei et al. 2016a) revealed that 25% of the AD populations in one wastewater

90 treatment plant migrated from the upstream activated sludge process and remained as residue

91 populations in AD. The presence of those non-anaerobic residue populations has not been widely

92 examined to test whether it is a common phenomenon in all digesters under different operating

93 conditions from different geographical locations. Furthermore, the microbial populations in

94 activated sludge can vary considerably due to differences in process configuration and

95 geographical locations (Zhang et al. 2012). Thus, it is not clear whether such variations of

96 microbial populations in the feed sludge impacts the AD microbiome.

97 In this study, we used high-throughput sequencing technologies to characterize

98 microbiomes in digesters around the world by sampling 90 full-scale digesters with diverse

99 operating conditions and feed sludge characteristics from 51 municipal wastewater treatment

100 plants. The impacts of operating conditions and geographical locations on AD microbiome were

4 101 examined. Clustering of samples was performed and cluster-specific core populations were

102 identified. Within the AD microbiome, feed-derived populations were investigated and the

103 distribution in different digesters was characterized.

104

105 2. Materials and methods

106 2.1. Sample collection

107 In total, 148 digester sludge samples were collected from 90 full-scale ADs in 51

108 municipal wastewater treatment plants. Feed sludge in 27 plants were collected prior to entering

109 ADs, and feed sludge in the rest plants were not collected due to sampling difficulties. All

110 operation-related information was provided by the plant operators. Besides the volatile solids

111 reduction (VSR) provided by plant operators, we calculated VSR values using the Van Kleeck

112 equation according to the USEPA regulation (Regulations 2003), which were further used in the

113 downstream analyses. Most plants were operated with the conventional primary-secondary

114 (activated sludge) treatment scheme, while three plants were only configured with primary

115 treatment before AD (plant CAII, CALG, and USRA). Seven plants (JPHW, JPMU, JPNA, JPST,

116 JPTB, JPYS, and USDV) used a two-stage anaerobic digestion process with similar sludge

117 retention time (the first digester treating sludge from primary/secondary clarifiers and the second

118 digester treating sludge from the first digester). Seven plants (JPHG, JPNA, JPNG, USST,

119 USUR, NEAV, and USCA) introduced external solid wastes into digesters, such as food waste,

120 green waste, and sludge from other sources. Wastewater to two Hong Kong plants (HKST and

121 HKTP) had approximately 1/4 to 1/5 of seawater of high salinity. Due to its high saline nature

122 with high sulfate content, these two AD digesters dosed ferric chloride (FeCl3) to suppress

123 sulfide production, leading to a chloride concentration of 4,000 to 6,000 mg/L (Koenig and Bari

5 124 2001, Zhang et al. 2012). Wastewater to another Hong Kong plant (HKYL) had effluent from the

125 tannery industry and contained high concentrations of Zn and Cr (Wong et al. 2001). Digester

126 NEAV1 had both high salinity influent (electrical conductivity about 30-35 mS/cm) and external

127 food waste sludge simultaneously. Digesters from Hong Kong and US (except for USWA and

128 USSF) were sampled at multiple time points with at least one-month interval. These multiple

129 tome points samples were considered as different samples. Fifty milliliters of sludge were

130 collected from the recirculation lines of digesters, transported to laboratory in UIUC on ice

131 (including international samples), and stored at -80°C until DNA extraction.

132 2.2. 16S rRNA gene sequencing

133 Genomic DNA was extracted from 2 mL of well-mixed sludge using the FastDNA SPIN

134 Kit for Soil (MP Biomedicals, Carlsbad, CA, USA), and quantified using a Nanodrop 2000c

135 spectrophotometer. For PCR amplification, 60 ng of genomic DNA was added into a total

136 reaction volume of 25 µL as template. With a dual-indexing approach (Kozich et al. 2013), a

137 universal primer set 515F (5’-GTGCCAGCMGCCGCGGTAA-3’)/909R(5’-

138 CCCCGYCAATTCMTTTRAGT-3’) targeting the V4-V5 region of both bacterial and archaeal

139 16S rRNA gene was used for PCR amplification. PCR was performed with the thermal cycling

140 protocol consisting of initial denaturation (94°C, 3 min), 25 cycles of denaturation (94 °C, 30 s),

141 annealing (55 °C, 45 s) and extension (72 °C, 1 min), and a final extension (72 °C, 10 min) (Mei

142 et al. 2016b). The PCR amplicons were purified using the Wizard SV Gel and PCR Clean-Up

143 system (Promega, Fichburg, WI, USA) and quantified by Qubit 2.0 Fluorometer. Library

144 preparation and sequencing on Illumina Miseq Bulk 2 × 300 nt paired-end system was performed

145 at the Roy J. Carver Biotechnology Center at the University of Illinois at Urbana-Champaign, IL,

146 USA.

6 147 2.3. Microbial community analyses

148 Paired-end raw sequences were assembled, screened, and trimmed using Mothur 1.33.3

149 (Schloss et al. 2009) with a maximum sequence length of 400 bp and a quality score of 20. The

150 output data were analyzed using QIIME 1.9.1 (Caporaso et al. 2010b) for OTU (operational

151 taxonomic unit, 97% sequence similarity) picking with the de novo strategy, which included

152 OTU grouping by UCLUST (Edgar 2010), alignment by PyNAST (Caporaso et al. 2010a),

153 chimera identification by ChimeraSlayer (Haas et al. 2011), assignment by BLAST

154 using reference sequences in the GreenGene 2013 database. After removing singletons (OTUs

155 that only had one sequence in the entire dataset), all samples were rarefied to an even depth of

156 20,957 sequences (determined by the sample with fewest sequences). Shannon index (

157 is the relative abundance of an individual population) calculation, UniFrac

158 distance matrix calculation, Bray-Curtis distance matrix calculation, principal coordinate

159 analysis (PCoA), and unweighted pair group method with arithmetic mean (UPGMA) with 100

160 iterations were all performed using QIIME. Relative abundance was calculated from OTU table.

161 Phylogenetic trees was constructed using the methods of neighbor joining and parsimony

162 provided in ARB program (Ludwig et al. 2004).

163 Statistical differences of principal components between samples from different locations

164 were evaluated using Mann Whitney U test with Bonferroni correction with R (Ihaka and

165 Gentleman 1996). A p-value < 0.01 was considered as statistical significance. Correlations

166 between microbial groups, alpha diversity, and VSR were determined using the Spearman's Rank

167 Order Correlation test with R. Evaluation of normality of the data using Shapiro Wilk Normality

168 test, and preparation of box plot and histogram were also performed using R. Distance-based

169 linear model (DistLM) and analysis of similarity (ANOSIM) were performed with Primer 6

7 170 (Clarke 1993). Raw Illumina sequences obtained in this study have been deposited in

171 DDBJ/NCBI/EMBL-EBI under the accession number DRA005150.

172

173 3. Results

174 3.1. Operation-driven heterogeneity of AD microbiome

175 In total, over 7 million quality-filtered, non-chimeric sequences were obtained from 148

176 AD samples in 51 municipal wastewater treatment plants (Fig. S1, Table S1 in the

177 Supplementary material). After removing singletons and subsampling to an even depth (20,957

178 sequences per sample, determined by the sample with fewest sequences), each AD sample on

179 average contained 1,844 OTUs with a high standard deviation of 595 OTUs. The Shannon index

180 that characterized both richness and evenness of a community showed large variations (Fig. S2),

181 with the highest value being 2.5 times higher than the lowest value (9.12 vs. 3.68). Dissimilarity

182 between AD communities was also reflected in the large variations in the relative abundance of

183 major phyla (Fig. S3). For example, the abundance of varied from 5% to 71% in

184 different samples, and the abundance of varied from 0 to 56%.

185 Principal coordinate analysis (PCoA) performed on beta-diversity (weighted UniFrac

186 distance) showed that there were different types of AD communities (Fig. S4). However, the

187 variance could not be explained by geographical locations, as only North America samples

188 significantly differed from Hong Kong samples in PC1 and from Japan samples in PC2. In

189 addition, only small portions of the variance could be explained by single environmental

190 parameters such as temperature (9.63%), pH (3.22%), and sludge retention time (SRT) (1.63%)

191 (Table S2A).

8 192 To identify shaping factors of the heterogeneous AD communities, the dissimilarity based

193 on weighted Unifrac was further analyzed using unweighted pair group method with arithmetic

194 mean (UPGMA), a clustering method that could fully reveal the variance in beta diversity. Eight

195 clusters were observed (Fig. 1). Cluster A contained six samples from saline digesters in two

196 Hong Kong plants due to flushing toilet with sea water. Cluster B contained two samples from

197 digesters (one from the US and one from the Netherlands) that received feed sludge after

198 pretreatment with thermal hydrolysis. Cluster C contained three samples from the digester

199 treating wastewater partially from the tannery industry in a Hong Kong plant. Cluster D

200 contained 14 samples from thermophilic digesters (>50°C) located in Japan, US, Canada, and the

201 Netherlands. Cluster E contained seven samples from two Japanese plants and one USA plant

202 that operated digesters at temperatures < 30°C for at least three months. Cluster F contained six

203 samples from one non-saline Hong Kong plant (HKSW, digester temperature at 36.0°C) and one

204 USA plant (USNO, digester temperature at 30.3°C), but the operating conditions that determined

205 high community similarity of these two plants are still not clear. Cluster G contained 16 samples

206 from seven Japanese plants, with slightly high operation temperatures between 38 and 42°C,

207 except for plant JPSS at 36.5°C. The largest cluster (H) contained 91 samples of from 16 USA

208 plants, six Japanese plants, two Canadian plants, and four Netherlandish plants, which operated

209 digesters mainly under mesophilic conditions. Samples from plant USLA and NEAV were not

210 assigned to any cluster due to lack of clear association with operating conditions. Within each

211 cluster, samples that originated from the same plant generally clustered together, even though

212 they might be collected from different reactors or on different dates. The clustering of the AD

213 microbiomes into eight clusters was confirmed by ANOSIM, which gave global R-values close

214 to 1, showing that the between-cluster distances were significantly larger than the within-cluster

9 215 distances (Table S2B). In contrast, the clustering solely based on the geographical location of the

216 samples generated much smaller global R-values (less than 0.6) (Table S2C). A UPGMA-based

217 clustering on Bray-Curtis distance matrix produced very similar results (Fig. S5C), where only

218 two samples diverged from cluster G and three samples diverged from cluster H compared to the

219 results based on weighted UniFrac.

220 3.2. Characterization of feed-derived residue populations

221 Our previous study revealed that, in a single wastewater treatment plant, AD microbial

222 communities could contain exogenous populations (i.e., residue populations) that migrated from

223 the feed sludge, resisted to digestion, and not actively involved in anaerobic metabolism (Mei et

224 al. 2016a). In the present study with a much broader sampling scale, we identified such residue

225 populations by comparing the upstream feed sludge and the corresponding AD. To be stringent,

226 we first defined an OTU as being more abundant in feed sludge in a plant only when its feed/AD

227 abundance ratio was over 2, and, conversely, an OTU as being more abundant in AD when the

228 feed/AD abundance ratio was below 0.5. Further, we defined OTUs as residue populations if

229 they were frequently more abundant in feed sludge (minimum five plants) and rarely more

230 abundant in AD (maximum five plants) (Fig. 2). Using these criteria, 1,464 OTUs were

231 identified as residue populations. In agreement, only 172 of them were associated with known

232 obligate anaerobic taxa based on family-level phylogeny (TableS3) obtained from literature

233 (Rosenberg et al. 2014, Vos et al. 2011). In total, 704 residue OTUs were associated with

234 and 298 OTUs with Bacteroidetes, accounting for 20.8% and 13.4% of

235 sequences in feed sludge, respectively (Fig. 3A). Abundances of these OTUs in the AD

236 community decreased drastically to 3.6% and 1.6%, respectively. Other phyla including

237 , , and also contained residues populations but were

10 238 presented by a small number of OTUs (<100) and low relative abundance. Detailed phylogenetic

239 analysis of the top 50 abundant residue OTUs indicated that 21 of them were associated with

240 Proteobacteria (excluding ) and 18 OTUs with Bacteroidetes (excluding

241 Bacteroidales) (Fig. S6), which were mostly known as aerobes or facultative anaerobes and were

242 consistent with our previous study (Mei et al. 2016a). On the other hand, known anaerobic

243 populations in AD were not assigned as residue in our analysis, although they were detected in

244 the feed sludge. These populations included for example methanogens (e.g., Methanobacteriales,

245 Methanomicrobiales, and Methanosarcinales), fermenters (e.g., Anaerolineales), and syntrophs

246 (e.g., Syntrophobacterales). Their abundance increased after entering AD, and no residue OTU

247 was related to these taxa (Fig. S7).

248 We further observed that the presence of residue populations was a universal

249 phenomenon in all the digesters sampled (Fig. 3B). The lowest relative abundance of residue

250 populations in a sample was 0.02% in USSF1 that received feed sludge after pretreatment, and

251 the majority (117 out of 148 AD samples) were less than 10%. High residue populations were

252 less common, with 26 samples between 10%-20%, and five samples between 20-30%. The

253 highest abundance was observed with JPYS1 (27.3%) that was operated below 20°C.

254 Furthermore, we observed a clear positive correlation (rho=0.846, p<0.01) between residue

255 populations and alpha diversity (Shannon index) of the AD community (Fig. 3C), indicating the

256 migration of residue populations increased both species richness and evenness of the AD

257 microbial community. In contrast, varying abundance of endogenous populations, such as

258 methanogens or syntrophs, did not correlate with Shannon index of the community (small rho

259 values, Fig. S8). Also a higher residue population abundance was observed to coincide with a

11 260 lower digestion efficiency (volatile solids reduction) (Fig. S9), but the correlation was weak as

261 indicated by a low coefficient (rho=-0.361, p<0.01).

262 The presence of residue populations was also influenced by operating conditions (Fig.

263 3D). The highest residue population abundance was 21.4%, observed in cluster E (low operating

264 temperature), followed by 13.9% in cluster F. Correspondingly, clusters E and F had the highest

265 alpha diversity. The abundance of residue populations in cluster H, which represented most of

266 the digesters studied, was 6.0%. In comparison, clusters B (pretreatment), D (thermophilic), and

267 G (>40°C) contained residue population at relative abundances of 0.8%, 3.3%, and 1.6%,

268 respectively. In addition, residue populations could be more abundant than syntrophs (1.2-7.1%)

269 and methanogens (0.3-2.6%), such as in cluster E and F (Fig. S10). We also tested whether

270 residue populations affected beta-diversity by removing residue OTUs from each community.

271 Based on weighted UniFrac distance, clusters A to G remained intact. Seven samples that were

272 originally in cluster H were separated from the cluster(Fig. S5A and B). Based on Bray-Curtis

273 distance, samples in cluster E were split (Fig. S5C and D).

274 3.3. Identification of cluster-specific core populations

275 The heterogeneity revealed by the occupancy distribution of OTUs among all 148 AD

276 samples precluded the ability to define a universal core AD microbiome (Fig. S11A). No OTU

277 was present in 147 or 148 samples. Only 14 OTUs were detected in more than 136 samples, and

278 they only accounted for 4.8% of total sequences. In contrast, within each cluster, OTUs shared

279 by all the samples accounted for a large portion of the total sequences (>50% in each cluster, Fig.

280 S11B), indicating that samples in the same cluster tended to have highly similar microbiomes.

281 Thus, we defined cluster-specific core populations (Fig. 4) by including OTUs that were both

282 prevalent and abundant (top 15 abundant bacterial and top three abundant archaeal OTUs that

12 283 were detected in all samples of that cluster). The phylogeny of core OTUs was confirmed by

284 building phylogenetic trees (Fig. S12).

285 In the core communities, OTUs related to known syntrophs were limited to Smithella and

286 Syntrophomonas (Fig. S12A), known to syntrophically oxidize propionate. Smithella related

287 OTUs were observed in clusters B, E, F, G, and H, whereas Syntrophomonas related OTUs were

288 observed in clusters with high salinity (cluster A), industrial influent (cluster C), and high

289 operating temperature (cluster D). For the methanogenic core populations, there was a similar

290 trend that an OTU related to Candidatus Methanofastidiosa (hydrogenotrophic methanogen) and

291 an OTU related to Methanosaeta (aceticlastic methanogen) were consistently observed in

292 clusters B, E, F, G, and H. The high-temperature cluster D contained two unique core OTUs

293 related to Methanothermobacter and Methanoculleus. The low-temperature cluster E contained

294 one unique core OTU related to Methanoregula. Cluster C with industrial influent contained two

295 core OTUs related to Methanosarcina, absent in the core communities of other clusters. Cluster

296 A with high salinity contained an OTU related to Methanolinea but at low abundance (<0.05%)

297 compared with other hydrogenotrophic methanogens. The core community of cluster A also

298 contained an OTU related to Methanosaeta, but likely a different species from the one shared by

299 other clusters based on phylogenetic analysis (Fig. S12B).

300 With regards to residue populations, the core communities of cluster B (plants with

301 pretreatment) and cluster G (plants operated at ~ 40°C) did not contain any OTU identified as

302 residue population. For the core communities of other clusters, Proteobacteria were the major

303 taxa, and the core residue populations were generally related to Zoogloea, Decholomonas,

304 Azospira, and Acidovorax (Fig. S12C). Cluster F contained residue populations mainly related to

305 in Bacteroidetes, likely because the feed sludge of cluster F had highest

13 306 abundance of Bacteroidetes and lowest abundance Proteobacteria in comparison to other

307 clusters (Fig. S13).

308 The remaining core populations were classified as fermenters. Bacteroidetes, as the most

309 diverse, abundant, and ubiquitous phylum, contained 30 core OTUs, all related to the order

310 Bacteroidales (Fig. S12D). All clusters contained multiple Bacteroidetes-related core OTUs,

311 except for cluster B (plants operated at thermophilic conditions) with only one Bacteroidetes-

312 related core OTU. Other major phyla were Firmicutes, Candidatus Cloacimonetes (WWE1),

313 , and Thermotogae. The majority of fermenters were only assigned to a taxonomic

314 level at order or phylum, as a few known closely isolates were available including Mesotoga,

315 Defluviitoga, Anaerobaculum, Sedimentibacter, and Coprothermobacter. Last, we observed core

316 populations related to phyla without cultivated representatives, including Candidatus

317 Aminicenantes (OP8), Candidatus Fermentibacteria (Hyd24-12), Candidatus Atribacteria (OP9)

318 and Candidatus Marinimicrobia (SAR406).

319

320 4. Discussion

321 Determining the core microbiome for an ecosystem is an effective approach to delineate

322 how microbes drive biochemical processes (Consortium 2012, Gilbert et al. 2014, Sunagawa et

323 al. 2015). This study demonstrated heterogeneity in AD microbial communities, and rejected the

324 possibility to define a universal core microbiome for all digesters that differed in operational

325 conditions. This was contradictory to studies using a small number of digesters (Campanaro et al.

326 2016, Lee et al. 2012, Mei et al. 2016a, Riviere et al. 2009), but consistent with the previous

327 report that when a relatively large number of digesters were sampled, different types of

328 communities appeared (De Vrieze et al. 2015). Such heterogeneity in AD microbial communities

14 329 was linked to diversity in operating conditions, which further led to the discovery of cluster-

330 specific core microbiomes. For example, in digesters operated at high temperature (those in

331 cluster D), core OTUs related to thermophiles, including Methanoculleus (Cheng et al. 2008),

332 Methanothermobacter (Cheng et al. 2011) Defluviitoga (Hania et al. 2012), Coprothermobacter

333 (Etchebehere et al. 1998), and Anaerobaculum (Rees et al. 1997) were uniquely detected. OTUs

334 related to zinc-tolerant Sedimentibacter (Burkhardt et al. 2011) were detected in digesters

335 (cluster C) receiving tannery industry wastewater that had high Zn concentration. OTUs related

336 to sulfur-utilizing Mesotoga (Nesbø et al. 2012) were detected in digesters (cluster D) receiving

337 sea water. These sulfur-utilizing microorganisms could compete for hydrogen and suppress

338 hydrogenotrophic methanogens in cluster D. An OTU related to Methanoregula that could grow

339 at 10°C was detected in digesters in cluster E operated under 30°C (Yashiro et al. 2011). It could

340 be expected that if more digesters with more diverse operating conditions are included, the

341 heterogeneity and the clustering complexity will keep increasing as niche diversity increases.

342 Although there was no shared population among all the eight clusters, some populations

343 were frequently observed in clusters B, E, F, G, and H. These populations included OTUs related

344 to the novel archaeal clade Candidatus Methanofastidiosa that is predicted to perform

345 hydrogenotrophic methanogenesis through methylated thiol reduction (Nobu et al. 2016), and

346 Smithella that syntrophically oxidize propionate (Liu et al. 1999). Possibly methylated thiol

347 compounds (e.g., methanethiol and dimethylsulfide) and propionate are critical intermediates

348 prevalent in most ADs. We also observed abundant and diverse OTUs affiliated with the phyla

349 Bacteroidetes and Candidatus Cloacimonetes, whose ecological functions in AD are still

350 difficult to discern. For example, isolates of Bacteroidetes from anaerobic reactors could be

351 saccharolytic (Su et al. 2014, Sun et al. 2016) or proteolytic (Abe et al. 2012, Chen and Dong

15 352 2005), but the vast majority of the members in this phylum remain uncultivated and, thus, their

353 metabolism is unknown (Wu et al. 2011). Candidatus Cloacimonetes-related populations have

354 been proposed to perform amino acids fermentation (Pelletier et al. 2008), syntrophic propionate

355 oxidation (Nobu et al. 2015), or extracellular cellulose hydrolysis (Limam et al. 2014). Given

356 that the core OTUs in this phylum were associated with distinct uncultivated phylogenetic clades

357 (e.g., W22, SHA-116, BHB21, and W5), one can only speculate about their metabolisms until

358 more genomics information becomes available or until representatives of these clades are

359 cultured.

360 Previous studies detected core AD populations related to known aerobic and facultative

361 microorganisms including Thauera, Brachymonas, and Rhodobacter (Nelson et al. 2011, Riviere

362 et al. 2009) that were reported as predominant microorganisms in activated sludge (Zhang et al.

363 2012). Their appearance as core populations in AD is likely due to incomplete digestion, in

364 contrast to other core populations such as methanogens, syntrophs, and fermenters. It is known

365 that activated sludge processes sometimes contain anaerobic zones supporting the growth of

366 anaerobic microorganisms in (Kämpfer et al. 1996). Based on the change in abundance before

367 and after entering AD, our analysis could effectively distinguish microorganisms in feed sludge

368 as residue populations (i.e., decreasing abundance) from those contributing to digestion (i.e.,

369 increase in abundance) in AD. Thus, the residue populations we define here were unlikely to

370 involve in the essential functions in AD, i.e., waste degradation and biogas production. Further

371 investigations are necessary to elucidate the exact survival mechanisms of the residue

372 populations in AD. For example, some of them could survive on accumulated carbon reserve like

373 polyhydroxyalkanoates (Liu et al. 2001) or carry out anaerobic metabolism with different

16 374 electron acceptors (e.g., nitrate reduction by Zoogloea (Shao et al. 2009) and chlorate reduction

375 by Dechloromonas (Achenbach et al. 2001)).

376 On the basis of our previous study of AD in a single plant (Mei et al. 2016a), we showed

377 here that the presence of residue populations was a common phenomenon among all the sampled

378 digesters. Residue populations could account for at least 6% of total sequences obtained from

379 digesters under normal conditions (i.e., cluster H community) and were more abundant than

380 methanogens and syntrophs. Higher abundance of residue populations (i.e., 21.4%) was observed

381 with cluster E likely due to low operating temperature at <30°C. In addition, pretreatment such

382 as thermal hydrolysis could successfully reduce residue populations in AD (i.e., 0.8% abundance

383 in cluster B community).

384 We observed that the presence of residue populations only contributed to the increase of

385 alpha diversity of the AD microbiome. By removing residue populations from each community,

386 we observed almost no change on the beta-diversity, and the topology of the clustering remained

387 almost the same based on either weighted UniFrac (only seven samples split from cluster H) or

388 Bray-Curtis distance (only cluster E split). This is likely due to the fact that most residue

389 populations were affiliated with Proteobacteria and Bacteroidetes, which only represented a

390 small fraction of the vast phylogenetic diversity of AD microbiome. Moreover, the abundances

391 of residue populations were generally less than 10% in most digesters, thus their impacts on the

392 beta diversity calculation were marginal. Only when the abundance of residue populations was

393 high (i.e., in cluster E), a major impact was observed. Finally, a very weak correlation was

394 observed between the abundance of residue populations and overall digestion efficiency, likely

395 because the presence of residue populations could only indicate inefficient cell lysis, the first

396 step of AD process (Amani et al. 2010). The digestion efficiency of full-scale systems is

17 397 collectively influenced by other factors including compositions of the feed sludge, mixing

398 condition of the reactor, and monitoring approaches.

399

400 5. Conclusion

401 The analyses of microbial communities of 90 full-scale anaerobic digesters around the

402 world lead to the following conclusions:

403 The differences of microbial community structures were determined by the operating

404 conditions of digesters, whereas geographical location of the digesters did not have a

405 significant impact.

406 Residue populations associated with undigested feed sludge were commonly observed in

407 all the AD samples, with the highest abundance observed in low-temperature digesters

408 and lowest abundance in digesters with pretreatment.

409 There was no population shared by all the sampled digesters due to the operation-driven

410 heterogeneity. The cluster-specific core microbiome contained methanogens, syntrophs,

411 fermenters, and residue populations.

412

413 Acknowledgements

414 We are sincerely grateful to the numerous operation staff in the 51 wastewater treatment

415 plants for their generous help on providing samples and sharing information on operating

416 parameters and efficiencies. This research is part of an ongoing project supported by Joint

417 Genomic Institute.

418

18 419 References

420 Abe, K., Ueki, A., Ohtaki, Y., Kaku, N., Watanabe, K. and Ueki, K. (2012) Anaerocella delicata gen. nov., 421 sp. nov., a strictly anaerobic bacterium in the phylum Bacteroidetes isolated from a methanogenic reactor 422 of cattle farms. The Journal of General and Applied Microbiology 58(6), 405-412.

423 Amani, T., Nosrati, M. and Sreekrishnan, T. (2010) Anaerobic digestion from the viewpoint of 424 microbiological, chemical, and operational aspects-a review. Environmental Reviews 18(NA), 255-278.

425 Appels, L., Lauwers, J., Degrève, J., Helsen, L., Lievens, B., Willems, K., Van Impe, J. and Dewil, R. 426 (2011) Anaerobic digestion in global bio-energy production: potential and research challenges. 427 Renewable and Sustainable Energy Reviews 15(9), 4295-4301.

428 Burkhardt, E.-M., Bischoff, S., Akob, D.M., Büchel, G. and Küsel, K. (2011) Heavy metal tolerance of Fe 429 (III)-reducing microbial communities in contaminated creek bank soils. Applied and environmental 430 microbiology 77(9), 3132-3136.

431 Campanaro, S., Treu, L., Kougias, P.G., Francisci, D., Valle, G. and Angelidaki, I. (2016) Metagenomic 432 analysis and functional characterization of the biogas microbiome using high throughput shotgun 433 sequencing and a novel binning strategy. Biotechnology for biofuels 9(1), 26.

434 Caporaso, J.G., Bittinger, K., Bushman, F.D., DeSantis, T.Z., Andersen, G.L. and Knight, R. (2010a) 435 PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26(2), 266-267.

436 Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., 437 Pena, A.G., Goodrich, J.K. and Gordon, J.I. (2010b) QIIME allows analysis of high-throughput 438 community sequencing data. Nature Methods 7(5), 335-336.

439 Chen, S. and Dong, X. (2005) Proteiniphilum acetatigenes gen. nov., sp. nov., from a UASB reactor 440 treating brewery wastewater. International journal of systematic and evolutionary microbiology 55(6), 441 2257-2261.

442 Cheng, L., Dai, L., Li, X., Zhang, H. and Lu, Y. (2011) Isolation and characterization of 443 Methanothermobacter crinale sp. nov., a novel hydrogenotrophic methanogen from the Shengli oil field. 444 Applied and environmental microbiology 77(15), 5212-5219.

445 Cheng, L., Qiu, T.L., Li, X., Wang, W.D., Deng, Y., Yin, X.B. and Zhang, H. (2008) Isolation and 446 characterization of Methanoculleus receptaculi sp. nov. from Shengli oil field, China. FEMS 447 microbiology letters 285(1), 65-71.

448 Clarke, K.R. (1993) Non‐parametric multivariate analyses of changes in community structure. Australian 449 journal of ecology 18(1), 117-143.

450 Consortium, H.M.P. (2012) Structure, function and diversity of the healthy human microbiome. Nature 451 486(7402), 207-214.

452 De Vrieze, J., Saunders, A.M., He, Y., Fang, J., Nielsen, P.H., Verstraete, W. and Boon, N. (2015) 453 Ammonia and temperature determine potential clustering in the anaerobic digestion microbiome. Water 454 Research 75, 312-323.

19 455 Edgar, R.C. (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26(19), 456 2460-2461.

457 Etchebehere, C., Pavan, M., Zorzopulos, J., Soubes, M. and Muxi, L. (1998) Coprothermobacter platensis 458 sp. nov., a new anaerobic proteolytic thermophilic bacterium isolated from an anaerobic mesophilic 459 sludge. International journal of systematic and evolutionary microbiology 48(4), 1297-1304.

460 Gilbert, J.A., Jansson, J.K. and Knight, R. (2014) The Earth Microbiome project: successes and 461 aspirations. BMC biology 12(1), 1.

462 Haas, B.J., Gevers, D., Earl, A.M., Feldgarden, M., Ward, D.V., Giannoukos, G., Ciulla, D., Tabbaa, D., 463 Highlander, S.K. and Sodergren, E. (2011) Chimeric 16S rRNA sequence formation and detection in 464 Sanger and 454-pyrosequenced PCR amplicons. Genome research 21(3), 494-504.

465 Hania, W.B., Godbane, R., Postec, A., Hamdi, M., Ollivier, B. and Fardeau, M.-L. (2012) Defluviitoga 466 tunisiensis gen. nov., sp. nov., a thermophilic bacterium isolated from a mesothermic and anaerobic whey 467 digester. International journal of systematic and evolutionary microbiology 62(6), 1377-1382.

468 Ihaka, R. and Gentleman, R. (1996) R: a language for data analysis and graphics. Journal of 469 computational and graphical statistics 5(3), 299-314.

470 Kämpfer, P., Erhart, R., Beimfohr, C., Böhringer, J., Wagner, M. and Amann, R. (1996) Characterization 471 of bacterial communities from activated sludge: culture-dependent numerical identification versus in situ 472 identification using group-and genus-specific rRNA-targeted oligonucleotide probes. Microbial ecology 473 32(2), 101-121.

474 Kelessidis, A. and Stasinakis, A.S. (2012) Comparative study of the methods used for treatment and final 475 disposal of sewage sludge in European countries. Waste Management 32(6), 1186-1195.

476 Koenig, A. and Bari, Q. (2001) Vane shear strength of dewatered sludge from Hong Kong. Water science 477 and technology 44(2-3), 389-397.

478 Kozich, J.J., Westcott, S.L., Baxter, N.T., Highlander, S.K. and Schloss, P.D. (2013) Development of a 479 dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq 480 Illumina sequencing platform. Applied and environmental microbiology 79(17), 5112-5120.

481 Lee, S.-H., Kang, H.-J., Lee, Y.H., Lee, T.J., Han, K., Choi, Y. and Park, H.-D. (2012) Monitoring 482 bacterial community structure and variability in time scale in full-scale anaerobic digesters. Journal of 483 Environmental Monitoring 14(7), 1893-1905.

484 Limam, R.D., Chouari, R., Mazéas, L., Wu, T.D., Li, T., Grossin‐Debattista, J., Guerquin‐Kern, J.L., 485 Saidi, M., Landoulsi, A. and Sghir, A. (2014) Members of the uncultured bacterial candidate division 486 WWE1 are implicated in anaerobic digestion of cellulose. MicrobiologyOpen 3(2), 157-167.

487 Liu, W.T., Nielsen, A.T., Wu, J.H., Tsai, C.S., Matsuo, Y. and Molin, S. (2001) In situ identification of 488 polyphosphate‐and polyhydroxyalkanoate‐accumulating traits for microbial populations in a biological 489 phosphorus removal process. Environmental Microbiology 3(2), 110-122.

20 490 Liu, Y., Balkwill, D.L., Aldrich, H.C., Drake, G.R. and Boone, D.R. (1999) Characterization of the 491 anaerobic propionate-degrading syntrophs Smithella propionica gen. nov., sp. nov. and Syntrophobacter 492 wolinii. International journal of systematic bacteriology 49(2), 545-556.

493 Ludwig, W., Strunk, O., Westram, R., Richter, L., Meier, H., Buchner, A., Lai, T., Steppi, S., Jobb, G. 494 and Förster, W. (2004) ARB: a software environment for sequence data. Nucleic acids research 32(4), 495 1363-1371.

496 Mei, R., Narihiro, T., Nobu, M.K., Kuroda, K. and Liu, W.-T. (2016a) Evaluating digestion efficiency in 497 full-scale anaerobic digesters by identifying active microbial populations through the lens of microbial 498 activity. Scientific reports 6, 34090.

499 Mei, R., Narihiro, T., Nobu, M.K. and Liu, W.-T. (2016b) Effects of heat shocks on microbial community 500 structure and microbial activity of a methanogenic enrichment degrading benzoate. Letters in Applied 501 Microbiology 63(5), 356-362.

502 Narihiro, T. (2016) Microbes in the Water Infrastructure: Underpinning Our Society. Microbes and 503 Environments 31(2), 89-92.

504 Narihiro, T., Nobu, M.K., Kim, N.K., Kamagata, Y. and Liu, W.T. (2015) The nexus of syntrophy- 505 associated microbiota in anaerobic digestion revealed by long-term enrichment and community survey. 506 Environmental Microbiology 17(5), 1707-1720.

507 Nelson, M.C., Morrison, M. and Yu, Z. (2011) A meta-analysis of the microbial diversity observed in 508 anaerobic digesters. Bioresource Technology 102(4), 3730-3739.

509 Nesbø, C.L., Bradnan, D.M., Adebusuyi, A., Dlutek, M., Petrus, A.K., Foght, J., Doolittle, W.F. and Noll, 510 K.M. (2012) Mesotoga prima gen. nov., sp. nov., the first described mesophilic species of the 511 Thermotogales. Extremophiles 16(3), 387-393.

512 Nobu, M.K., Narihiro, T., Kuroda, K., Mei, R. and Liu, W.-T. (2016) Chasing the elusive Euryarchaeota 513 class WSA2: genomes reveal a uniquely fastidious methyl-reducing methanogen. ISME J 10(10), 2478- 514 2487.

515 Nobu, M.K., Narihiro, T., Rinke, C., Kamagata, Y., Tringe, S.G., Woyke, T. and Liu, W.T. (2015) 516 Microbial dark matter ecogenomics reveals complex synergistic networks in a methanogenic bioreactor. 517 ISME J 9(8), 1710-1722.

518 Pelletier, E., Kreimeyer, A., Bocs, S., Rouy, Z., Gyapay, G., Chouari, R., Rivière, D., Ganesan, A., 519 Daegelen, P. and Sghir, A. (2008) “Candidatus Cloacamonas acidaminovorans”: genome sequence 520 reconstruction provides a first glimpse of a new bacterial division. Journal of Bacteriology 190(7), 2572- 521 2579.

522 Rees, G.N., Patel, B.K., Grassia, G.S. and Sheehy, A.J. (1997) Anaerobaculum thermoterrenum gen. nov., 523 sp. nov., a novel, thermophilic bacterium which ferments citrate. International journal of systematic and 524 evolutionary microbiology 47(1), 150-154.

525 Regulations, E. (2003) Technology: Control of Pathogens and Vector Attraction in Sewage Sludge. 526 USEPA, Offıce of Research and Development.

21 527 Riviere, D., Desvignes, V., Pelletier, E., Chaussonnerie, S., Guermazi, S., Weissenbach, J., Li, T., 528 Camacho, P. and Sghir, A. (2009) Towards the definition of a core of microorganisms involved in 529 anaerobic digestion of sludge. The ISME Journal 3(6), 700-714.

530 Rosenberg, E., DeLong, E.F., Lory, S., Stackebrandt, E. and Thompson, F. (2014) The Prokaryotes: Other 531 Major Lineages of Bacteria and The Archaea, Springer Berlin Heidelberg.

532 Schink, B. and Stams, A.J. (2006) Syntrophism among prokaryotes, Springer.

533 Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B., Lesniewski, R.A., 534 Oakley, B.B., Parks, D.H. and Robinson, C.J. (2009) Introducing mothur: open-source, platform- 535 independent, community-supported software for describing and comparing microbial communities. 536 Applied and environmental microbiology 75(23), 7537-7541.

537 Smith, A.L., Shimada, T. and Raskin, L. (2017) A comparative evaluation of community structure in full- 538 scale digesters indicates that two-phase digesters exhibit greater microbial diversity than single-phase 539 digesters. Environmental Science: Water Research & Technology.

540 Stamps, B.W., Lyles, C.N., Suflita, J.M., Masoner, J.R., Cozzarelli, I.M., Kolpin, D.W. and Stevenson, 541 B.S. (2016) Municipal solid waste landfills harbor distinct microbiomes. Frontiers in microbiology 7.

542 Su, X.-L., Tian, Q., Zhang, J., Yuan, X.-Z., Shi, X.-S., Guo, R.-B. and Qiu, Y.-L. (2014) 543 Acetobacteroides hydrogenigenes gen. nov., sp. nov., an anaerobic hydrogen-producing bacterium in the 544 family Rikenellaceae isolated from a reed swamp. International journal of systematic and evolutionary 545 microbiology 64(9), 2986-2991.

546 Sun, L., Toyonaga, M., Ohashi, A., Tourlousse, D.M., Matsuura, N., Meng, X.-Y., Tamaki, H., Hanada, 547 S., Cruz, R. and Yamaguchi, T. (2016) Lentimicrobium saccharophilum gen. nov., sp. nov., a strictly 548 anaerobic bacterium representing a new family in the phylum Bacteroidetes, and proposal of 549 Lentimicrobiaceae fam. nov. International journal of systematic and evolutionary microbiology.

550 Sunagawa, S., Coelho, L.P., Chaffron, S., Kultima, J.R., Labadie, K., Salazar, G., Djahanschiri, B., Zeller, 551 G., Mende, D.R. and Alberti, A. (2015) Structure and function of the global ocean microbiome. Science 552 348(6237), 1261359.

553 Treu, L., Kougias, P.G., Campanaro, S., Bassani, I. and Angelidaki, I. (2016) Deeper insight into the 554 structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded 555 with 157 new genomes. Bioresource Technology 216, 260-266.

556 Vanwonterghem, I., Jensen, P.D., Ho, D.P., Batstone, D.J. and Tyson, G.W. (2014) Linking microbial 557 community structure, interactions and function in anaerobic digesters using new molecular techniques. 558 Current opinion in biotechnology 27, 55-64.

559 Vos, P., Garrity, G., Jones, D., Krieg, N.R., Ludwig, W., Rainey, F.A., Schleifer, K.-H. and Whitman, W. 560 (2011) Bergey's Manual of Systematic Bacteriology: Volume 3: The Firmicutes, Springer Science & 561 Business Media.

562 Wong, J., Li, K., Fang, M. and Su, D. (2001) Toxicity evaluation of sewage sludges in Hong Kong. 563 Environment International 27(5), 373-380.

22 564 Wu, G.D., Chen, J., Hoffmann, C., Bittinger, K., Chen, Y.-Y., Keilbaugh, S.A., Bewtra, M., Knights, D., 565 Walters, W.A. and Knight, R. (2011) Linking long-term dietary patterns with gut microbial enterotypes. 566 Science 334(6052), 105-108.

567 Yashiro, Y., Sakai, S., Ehara, M., Miyazaki, M., Yamaguchi, T. and Imachi, H. (2011) Methanoregula 568 formicica sp. nov., a methane-producing archaeon isolated from methanogenic sludge. International 569 journal of systematic and evolutionary microbiology 61(1), 53-59.

570 Zhang, T., Shao, M.-F. and Ye, L. (2012) 454 Pyrosequencing reveals bacterial diversity of activated 571 sludge from 14 sewage treatment plants. The ISME Journal 6(6), 1137-1147. 572

23 Figure 1

F E

<30°C USNO

HKSW

G JPSS JPTO

C SRA 0° YS W 4 JP U ~ H JP P KH* J

USN

USNO1 JP HKSW1 1 HKSW3 HKSW2 3 USLA A JPSS A2 JPTO1 D TR JPTO2 RA J O SR S2 2 >

USNO2 U 3 JPKH4 USR PYS1 W 5 JP PTO3 US J PY JPTR3 1 A2 0 J USOA ° ST JPTR4 JPHW1 C JPH LA1 A3 4 JPTR1 USL JPSB O A JPTR2 US S U 2 J USO PST1 OA NEDR LG JP JPST2 USOA1 JPSB2 US A6 CA AI H USOA5 O 1 G JP CA US DR USWA* JP S G1 B1 NE L JPNG JPHG H N G CA I1 USSF* i 1 AA 1 n C EBR JP C d U 2 i USW H PNG n u S G3 J f s SF lu t NEBR1 USSF2 JPMR r * A2 e i JP n a USSF3 JPMR2 R1 l T M p t B JP r JPT R3 e B M t B JP r USDI JPTB2 1 HKYL e HKYL3 a t U YL1 m JPIK SD HK I1 e USDI2 KYL2 NEVE U H n SWA* t JPIK1 NEVE1 USSF* h USW i USSF1 g A1 A USCA3 ST1 h USCA HK HKST s USC a HKST2 l A1 i T3 n USCA HKS i 2 t USWL y USWL1 HKTP2 HKTP USWL2 HKTP3 HKTP1 USMO1 NEAV USMO NEAV1 USMO3 ge) USMO4 (Feed slud USMO 2 sludge) USMO8 (Feed USUR* USUR1 USMO6 (Feed sludge) USBL2 USBL USBL1 USBL3 0.1 JPAS2 >90% JPAS3 JPAS >70% JPAS1 >50% USPE1 E2 USP USPE USPE3 USST1 USAR2 USST12 1 USST2 R USAR USA EHA1 USS N T7 USST15 CALI1 NEHA US JPKH1 ST10 CALI KH2 USS JP T4 USST14 JPKH3 ! USST JPKH* CAII U 3 U SST8 SS U USPN4 S T S AII USPN3 USST T1 PN1 3 C S 2 U U SS 5 U USPN SST6 T9 PN5 U US SST 4 U S 1 USNP6 U ST 1 SUR SPN S1 16 U USPS P USUR US US 2 SPS3 USDV5 U U 4 5 USDV R USPS6 3 6 US USPS2 US US U DV 1 USPS USDE4 SD DV2 4 USUR U U U S US V R* USUR5 NA2 USDE SD UST D 3 USTU JPNA1 US E JP DE2 USPS E 1 US JPNA3 TU1 5 JPNA4 U4

R* JPMU1 3

JPMU2 3

USTU2 DV NEVL1

NENW1 NEVL2

USDE USU

U U

STU W JPNA

USM USVL

NEN H

Figure 1. Clustering of digester microbial communities. UPGMA dendrogram was built using weighted UniFrac as distance matrix after jackknifed rarefaction to 20,957 sequences per sample with 100 iterations. Three feed sludge samples are used as outgroup to root the tree. Plants that have samples not clustered together are marked. Figure27 2

25

Number of OTUs sharing same coordinate

10 100 >1000

20 OTUs associated with aerobic/facultative anaerobic taxa vs. OTUs associated with

≤ 0.5) obligate anaerobic taxa

≤0.2 0.5 1 2 ≥5 abundance 15 /AD abundance

10

5 Residue populations Number of plants (feed

0 051015202527

Number of plants (feedabundance/ADabundance ≥ 2)

Figure 2. Identification of OTUs related to residue populations. For each OTU, the x value represents the number of plants where the OTU has more than double abundance in feed sludge than in AD. The y value represents the number of plants where the OTU has more than double abundance in AD than in feed sludge. The size of each bubble represents the number of OTUs (in log scale) sharing the same x-y coordinate. The color scale represents at a given coordinate the ratio of the number of OTUs associated with aerobic/facultative anaerobic taxa over the number of OTUs associated with obligate anaerobic taxa. Shadowed region (x ≥ 5, y ≤ 5) represents OTUs defined as residue populations in this study. Figure 3

(A) (B) 30 900 Relative abundance in feed sludge community 50 Relative abundance in AD community 25 OTU number 750 40

20 600 30

15 450

20 10 300 OTU number No. of samples Relative abundance (%) 5 150 10

0 0 0 a 2 eria ria 0 10 20 30 cutes GN0 cteria mycetes eobact Firmi Chloroflexi Relative abundance (%) Prot Bacteroidetes errucomicrobi ParcubacteActinobacteria Acidoba Plancto V Ca.

(C) 30 (D) Relative abundance of residue populations rho=0.846, p<0.01 30 Shannon index 9 25

25 7.5 20 20 6

15 15 4.5

10

10 3 Shannon index Relative abundance (%)

Relative abundance (%) 5 5 1.5

0 0 0 345678910 ABCDEFGH Shannon index Cluster

Figure 3. Distribution of OTUs identified as residue populations. Panel (A) shows the top ten phyla that contain high numbers of residue OTUs. Dots represent numbers of OTUs of this phylum (primary y axis). Solid bars represent abundance of residue populations of this phylum in feed sludge community and open bars represent abundance of residue populations of this phylum in AD community (secondary y axis). Panel (B) shows the distribution of residue abundance in 148 AD samples. Panel (C) shows correlation between Shannon index and residue abundance of each AD sample. Panel (D) shows residue abundance (dots, primary y axis) and Shannon index (bar, secondary y axis) of each cluster determined previously. A B C D E FGH Figure 4 Taxomony OTU ID# (high (pre- (industrial (>50°C) (<30°C) (~40°C) Methanogen salinity) treatment) influent) Methanothermobacter 251952 Ca. Methanofastidiosa 454177 Methanoculleus 91897 Methanolinea 456136 Methanoregula 320695 Methanospirillum 75801 444616 Methanosaeta 170163 119112 Methanosarcina 274978 Syntroph 433935 Firmicutes Syntrophomonas 429809 40322 411727 Proteobacteria Smithella 356423 215050 Residue 424987 Bacteroidetes Flavobacterium 200474 Chitinophagaceae 203311 147804 413531 Proteobacteria Acidovorax 170164 286773 Dechloromonas 234708 260866 Azospira 143565 Zoogloea 453562 249182 59307 Ca. Latescibacteria 35110 Fermenter

Bacteroidetes Bacteroidales 44552 202151 276527 15143 202864 431477 137302 82547 441867 288191 28139 132409 176925 68756 387815 406612 323003 106994 135922 439264 221084 482600 211125 214687 36264 129869 209485 169256 148920 441596 Firmicutes Sedimentibacter 186335 361090 82535 150104 223517 33917 46600 263980 20743 46097 Coprothermobacter 359293 416445 308382 Sphirochaetes 354771 59281 129871 97104 224485 436831 Ca. Cloacimonetes 15147 37214 59278 306593 416952 275687 Thermotogae 386455 Mesotoga 139545 284625 18768 423765 Defluviitoga 287086 Others Ca. Aminicenantes 257239 347826 Caldiserica 336001 Choloroflexi 86947 Ca. Fermentibacteria 347296 Ca. Atribacteria 309783 Ca. Marinimicrobia 348740 355271 27060

Relative abundance in each cluster 20% 10% 5% 1% 0.5%

Figure 4. Core microbial community of each cluster. Electronic Supplementary Material (for online publication only) Click here to download Electronic Supplementary Material (for online publication only): Supplementary.pdf