Genome-centric resolution of microbial diversity, metabolism and interactions in anaerobic digestion

Running title: Genome-centric resolution through deep metagenomics

Inka Vanwonterghem1,2, Paul D Jensen1, Korneel Rabaey1,3 and Gene W Tyson1,2*

1Advanced Water Management Centre (AWMC), The University of Queensland, St Lucia, QLD 4072, Australia; 2Australian Centre for Ecogenomics (ACE), School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072, Australia; 3Laboratory for Microbial Ecology and Technology (LabMET), Ghent University, Coupure Links 653, 9000 Ghent, Belgium

*Corresponding author: Prof. Gene W. Tyson. Mailing address: Australian Centre for Ecogenomics (ACE), School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072, Australia. Phone: +617 3365 3829 Fax: +617 336 54511 Email: [email protected]

Keywords: metagenomics / genome-centric / functional redundancy / metabolic network / novel diversity / anaerobic digestion

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as an ‘Accepted Article’, doi: 10.1111/1462-2920.13382

This article is protected by copyright. All rights reserved.

Abstract

Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck, and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed on lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities.

Introduction

Microorganisms are ubiquitous in the environment and play key roles in global biogeochemical cycles. As the majority of microbial life has eluded cultivation in the laboratory, culture-independent techniques have been developed to study their diversity and functions (Tringe and Rubin, 2005; Albertsen et al., 2013; Vanwonterghem et al., 2014a). Metagenomics, the sequencing of bulk DNA extracted directly from environmental samples, provides direct access to the metabolic potential of a microbial community. Advances in sequence throughput, read length and quality, and bioinformatics tools have contributed to a more widespread application of metagenomics to study natural and engineered systems.

Early metagenomic studies relied largely on gene-centric analyses (Venter et al., 2004; Tringe et al., 2005), with the recovery of individual genomes limited to environments dominated by few distinct populations (Tyson et al., 2004). These gene-centric approaches are biased towards existing databases, thereby overlooking a significant fraction of the novel diversity (Jaenicke et al., 2010; Wong et al., 2013). In addition, as only an overview of the metabolic potential of the community is provided without assigning functions to individual populations, important metabolic interactions may remain undetected. The development of improved sequencing technologies and population genome binning algorithms (Wrighton et al., 2012; Albertsen et al., 2013; Imelfort et al., 2014) has allowed us to move beyond gene-centric approaches and recover population genomes from increasingly complex environments. This has led to the discovery of novel lineages (Brown et al., 2015; Castelle et al., 2015), and insight into the metabolic processes (Raghoebarsing et al., 2006; Haroon et al., 2013) and microbial interactions (Wrighton et al., 2014; Baker et al., 2015) taking place in these environments.

Engineered systems offer a controlled environment in which to study complex microbial communities, test hypotheses and explore the efficacy of new metagenomic approaches. Anaerobic digestion provides an interesting study environment as it consists of a series of metabolic processes carried out by a consortium of interdependent microorganisms. This process is a critical component of the global carbon cycle and is industrially relevant as a waste management strategy and for the production of bioenergy (Amani et al., 2010). Due to the complexity of the communities involved, anaerobic digesters (ADs) remain genomically underexplored and most metagenomic studies have relied on gene-centric approaches (Jaenicke et al., 2010; Hanreich et al., 2013; Wong et al., 2013; Solli et al., 2014; Stolze et al., 2015). The recovery of population genomes from various engineered systems has provided genomic insight into candidate phyla such as TM7 (Albertsen et al., 2013) and KSB3 (Sekiguchi et al., 2015), which is responsible for filamentous bulking in anaerobic wastewater treatment, and microbial interactions such as synergistic networks within terephthalate-degrading bioreactors (Nobu et al., 2014). Genome-centric approaches can thus provide a powerful means of understanding the phylogenetic and metabolic diversity in anaerobic digestion.

Here, a detailed genome-centric exploration of complex microbial communities in ADs was performed to reconstruct the metabolic network by gaining access to the functional potential of the individual populations involved in the conversion of cellulose to methane. ADs were operated in triplicate for a year and supplied with cellulose. Metagenomic sequencing was performed on samples taken at two time points (spanning ~8 months), characterized by differences in performance. Co-assembly of the six generated metagenomes followed by differential coverage-based binning resulted in the recovery of 101 population genomes that constitute the majority of the community. These genomes represent 19 phyla and expand the genomic diversity of several lineages with few sequenced representatives. The metabolic reconstruction of individual populations combined with their relative abundance estimates allowed us to study ecological theories through the identification of a high level of functional redundancy, and to construct an interaction network for the flow of carbon through the community. These results demonstrate the importance of genome-centric analyses when studying complex communities that harbor novel diversity, and provide the foundation for further hypothesis-driven experiments.

Results

Metagenomic sequencing and assembly

The phylogenetic and metabolic diversity of microbial communities involved in anaerobic digestion was studied using a genome-centric metagenomic approach. Three lab-scale ADs (designated AD1, AD2 and AD3) were used as controlled systems in which to study the community dynamics and reconstruct the metabolic network. The ADs were inoculated with a mixture of eight samples taken from anaerobic environmental and engineered systems (Table S1). They were operated for 362 days and supplied with cellulose as the sole carbon and energy source. Samples for metagenomic sequencing were collected from the reactors at two time points (T1: Day 96; T2: Day 362) based on differences in the structure and performance of the microbial communities, which are summarized in Fig. S1 and Table S2, and have been described in detail previously (Vanwonterghem et al., 2014b). Briefly, cellulose hydrolysis was stable at both time points at an average efficiency of 86 ± 4%. Accumulation of predominantly acetate and propionate was observed at T1, with highest volatile fatty acid (VFA) concentrations measured for AD1, which correlated with lower methane production. At T2, VFAs were efficiently converted to methane and only minor differences were observed between the reactors. The six metagenomes (111 Gb total raw reads) from the triplicate ADs at these two time points were co-assembled, generating 494,042 contigs with a combined length of 908 Mb (Table S3). On average, >85% of the metagenomic reads from each dataset mapped onto the contigs (>500 bp) from the combined assembly (Table S4).

Microbial community composition and population genome recovery

The community composition was determined by extracting the 16S rRNA gene sequences from the metagenomes (Fig. 1) and compared to previously reported amplicon-based community profiles (Fig. S1) (Vanwonterghem et al., 2014b). The most abundant populations belonged to the phyla Euryarchaeota, Actinobacteria, Bacteroidetes, Fibrobacteres, Firmicutes, Spirochaetes and Verrucomicrobia, which are commonly found in ADs (Jaenicke et al., 2010). The microbial communities were highly similar to one another, but shifted in structure over time, leading to significantly different communities at the two time points (P < 0.001). Several differences could be observed between the metagenome- and amplicon-based community profiles. Interestingly, a Cellulomonas population was detected at 3-7% relative abundance in the metagenomes, whereas the amplicon sequencing approach failed to detect this population (Fig. S1) because of a mismatch with the forward primer (926F) (Fig. S2). Conversely, the abundance of methanogens was overestimated in the amplicon dataset compared to the metagenome dataset, which is likely due to PCR primer and amplification biases. Amplicon-based studies using the 454 sequencing platform also suffer from lower taxonomic resolution compared to metagenomics and may underestimate the community diversity and dynamics. For example, two Fibrobacter populations were detected in the metagenome dataset, each dominant at a different time point, yet were grouped together as one phylotype in the amplicon dataset (Fig. 1 and Fig. S1). A similar observation was made for the dominant Methanosaeta populations; such grouping influences our perception of the microbial community dynamics.

Population genome binning of the co-assembled metagenomes enabled the recovery of 93 bacterial and 8 archaeal population genomes with ≥50% completeness and ≤10% contamination (Table 1 and Table S5). Of these genomes, 58 were substantially to near complete (≥80%) with low to medium contamination, according to the CheckM classification (Table 1) (Parks et al., 2015). The 101 genomes ranged in size between 1.4 and 6.3 Mb, across a GC content range between 29 and 74% (Table 1 and Table S5). They represent the majority of the community (62 ± 3% and 79 ± 4% at T1 and T2, respectively; based on percentage of reads mapping), with 58% representing relatively high abundance populations (>0.5% in at least one of the samples) and the remaining 42% representing low abundance populations (down to 0.09% maximum relative abundance in at least one of the samples) (Table 1 and Table S5). In addition to recovering genomes for all the abundant populations identified in the 16S rRNA gene profiles (Fig. 1, Fig. 2 and Fig. S3), a large number of low abundance population genomes were recovered, which highlights the strength of the binning approach used in this study. The populations were phylogenetically diverse and belong to 19 different phyla (Fig. 2). Many of these genomes represent novel orders, families and/or genera, and they significantly expand the genomic representation of phyla with relatively few sequenced genomes such as Fibrobacteres (Rahman et al., 2015), Verrucomicrobia, Planctomycetes and Candidate division WWE1 (Fig. 2).
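The quality and abundance thresholds quoted above can be expressed as a simple filter. The following is a minimal sketch, not the pipeline used in this study: the `GenomeBin` record, its field names and the example values are hypothetical, and only the cut-offs (≥50% completeness, ≤10% contamination, ≥80% completeness for substantially to near complete genomes, and >0.5% relative abundance for high abundance populations) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class GenomeBin:
    """Hypothetical record for one population genome bin."""
    name: str
    completeness: float    # estimated completeness, percent
    contamination: float   # estimated contamination, percent
    max_abundance: float   # maximum relative abundance across samples, percent


def quality_tier(b: GenomeBin) -> str:
    """Apply the completeness and contamination thresholds quoted in the text."""
    if b.completeness < 50.0 or b.contamination > 10.0:
        return "rejected"
    if b.completeness >= 80.0:
        return "substantial_to_near_complete"
    return "moderate"


def abundance_class(b: GenomeBin) -> str:
    """Split bins at the 0.5% relative abundance cut-off quoted in the text."""
    return "high" if b.max_abundance > 0.5 else "low"


# Toy values for illustration only; these are not results from the study.
bins = [
    GenomeBin("bin_A", 95.2, 1.1, 6.0),
    GenomeBin("bin_B", 62.0, 4.5, 0.09),
    GenomeBin("bin_C", 45.0, 2.0, 1.2),
]
retained = [b for b in bins if quality_tier(b) != "rejected"]
```

In practice the completeness and contamination estimates would come from a tool such as CheckM; here they are simply supplied as numbers.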

Classification into functional guilds based on metabolic potential

The metabolic potential of the microbial communities in these reactors was determined in order to classify individual populations into functional guilds fulfilling the different steps in anaerobic digestion (hydrolysis, fermentation, syntrophic oxidation and methanogenesis). Based on the potential substrate utilization for the dominant populations and their relative abundance, the flow of carbon from cellulose to methane in each community could be inferred, leading to the construction of a metabolic network.

Hydrolysis. Firstly, a gene-centric approach was applied to examine the hydrolytic capacity of the AD communities over time and relative to other environments. Glycoside hydrolase (GH) profiles were generated for each individually assembled metagenome by calculating the total number of enzymes within each GH family. Comparative analysis of these GH profiles showed no significant differences between reactors and time points (P<0.05). The AD metagenomes were enriched in genes belonging to GH5 (5.3 ± 0.4% of total GH) and GH9 (1.6 ± 0.6%), but also showed high levels of other GH families, including GH2 (4.2 ± 0.3%), GH3 (3 ± 0.4%), GH31 (2.3 ± 0.2%), GH43 (4.2 ± 0.7%), GH94 (2.0 ± 0.2%), GH78 (3.3 ± 0.3%), GH13 (4.9 ± 0.5%) and GH23 (3.1 ± 0.5%) (Table S6). Enzymes belonging to these GH families are predominantly involved in the hydrolysis of cellulose, oligosaccharides, sugar side chains, amylose/maltose and peptidoglycan. A comparison was made between the GH profiles of the ADs and those reported for soil ecosystems (Tveit et al., 2013), switchgrass compost, termite hindgut and rumen (Allgaier et al., 2010) (Table S7 and Fig. S4). Principal component analysis showed distinct clustering of the cellulose-degrading reactor samples together with the wood-feeding termite hindgut community (Allgaier et al., 2010), which were all enriched for cellulases predominantly belonging to GH5, reflecting the cellulosic substrate. The soil environments clustered together despite differences in plant cover (moss versus vascular plants), while the rumen sample was most different and showed a high abundance of oligosaccharide-degrading enzymes belonging to GH2, GH3 and GH51 (Table S7 and Fig. S4), which is likely driven by the dominant grass hemicellulose found in this environment.
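The GH profiles described above report each family as a percentage of all GH enzymes detected in a metagenome. A minimal sketch of that calculation is given below, assuming the annotation step has already produced one family label per predicted GH enzyme; the function name and the toy input are hypothetical and do not reproduce the values in Table S6.

```python
from collections import Counter


def gh_profile(gh_hits):
    """Return each GH family's share (percent) of all GH hits in one metagenome."""
    counts = Counter(gh_hits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: 100.0 * n / total for family, n in counts.items()}


# Toy annotation list (hypothetical); one entry per predicted GH enzyme.
hits = ["GH5"] * 5 + ["GH9"] * 2 + ["GH13"] * 3
profile = gh_profile(hits)
```

Profiles computed this way sum to 100% per sample, which makes them comparable across metagenomes of different sequencing depth before an ordination such as principal component analysis.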

Cellulose hydrolyzers were identified in the ADs by generating GH profiles for the individual population genomes and correlating known activities for GH families with gene annotations to determine the substrate profile (Fig. 3). The potential to degrade cellulose was a common feature, present in 65% of the bacterial populations, including phyla commonly associated with cellulose hydrolysis such as Fibrobacteres (Fibro_01-03), Firmicutes (Firm_03-06, Firm_10-11, Firm_13-14 and Firm_16), Bacteroidetes (Bact_02-03, Bact_08-11, Bact_13 and Bact_24), Spirochaetes (Spiro_07-10 and Spiro_12), and Actinobacteria (Actino_01-02) (Fig. 3, Fig. 4 and Fig. 5) (Lynd et al., 2002; Bayer et al., 2008; Bekele et al., 2011; Suen et al., 2011; Naas et al., 2014). A range of GH enzymes was also detected in the two Verrucomicrobia populations (Verruco_01-02) (Fig. 3), and it has previously been speculated that certain populations within this phylogenetically heterogeneous group can make a substantial contribution to polysaccharide hydrolysis, even when present at low abundance (Martinez-Garcia et al., 2012). Similar to prior studies, one of the Lentisphaerae genomes (Lenti_02) (Fig. 3) encoded a high abundance and variety of GH enzymes (Kaoutari et al., 2013). However, only a very limited number of GH enzymes were detected in the second Lentisphaerae population (Lenti_01), indicating that polysaccharide hydrolysis is not a representative feature of the whole phylum. Although the genome completeness of Lenti_01 is lower than that of Lenti_02, it is unlikely that this large difference in GH abundance and diversity can be bridged by the missing fraction of the genome. The largest number of GH enzymes was observed for a Planctomycetes population (Planc_01) (Fig. 3), which expands our understanding of the metabolic role of Phycisphaerae since only a limited number of genomes within this class have been sequenced thus far. This agrees with the recent finding of a broad range of GH enzymes within Planctomycetes genomes recovered from estuary sediment (Baker et al., 2015). The discovery of hydrolytic potential within novel species highlights the importance of genome-centric approaches as these organisms play a crucial role in carbon cycling.

Microorganisms that could use cellobiose but not cellulose were identified in the reactors among Proteobacteria (Alpha_01, Beta_02, Delta_01 and Epsilon_01), Bacteroidetes (Bact_22-23), Spirochaetes (Spiro_02-03) and Synergistetes (Syner_01). By assigning functions to individual populations, the balance between cellobiose opportunists and cellulose degraders could be examined. In contrast to previous studies that reported a minimum ratio of 2:1 for these functional groups (cellobiose:cellulose) (Berlemont and Martiny, 2013; Wrighton et al., 2014), the number of cellobiose opportunists in the ADs was lower than the number of cellulose degraders. When taking the relative abundance into account, it could be shown that this ratio was dynamic and became more even over time (cellobiose:cellulose of 1:7 at T1 and 1:3 at T2).
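The difference between a count-based and an abundance-weighted guild ratio can be illustrated with a short sketch. The function below is a hypothetical helper, and the input values are invented so that the arithmetic is easy to follow; it is not the calculation script used in this study.

```python
def guild_ratio(populations):
    """Return (count_ratio, abundance_ratio) of cellobiose users to cellulose degraders.

    `populations` is a list of (guild, relative_abundance_percent) tuples,
    where guild is either "cellulose" or "cellobiose".
    """
    n = {"cellulose": 0, "cellobiose": 0}
    abundance = {"cellulose": 0.0, "cellobiose": 0.0}
    for guild, rel_abundance in populations:
        n[guild] += 1
        abundance[guild] += rel_abundance
    if n["cellulose"] == 0 or abundance["cellulose"] == 0.0:
        raise ValueError("at least one cellulose degrader with non-zero abundance is required")
    return (n["cellobiose"] / n["cellulose"],
            abundance["cellobiose"] / abundance["cellulose"])


# Toy community (hypothetical values): two degraders, one opportunist.
toy = [("cellulose", 10.0), ("cellulose", 4.0), ("cellobiose", 2.0)]
count_ratio, abundance_ratio = guild_ratio(toy)
```

In this toy example the count-based ratio is 1:2, whereas the abundance-weighted ratio is 2:14, i.e. 1:7, showing how the two measures can diverge.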

The GH profile for each genome was normalized by its relative abundance at each time point (Fig. S5 and Fig. S6) and this showed a clear shift in the abundant cellulose degraders over time, i.e. from Bacteroidetes (Bact_02-03) and Ruminococcus (Firm_04-06) populations at T1 (Fig. 4 and Fig. S5), to Cellulomonas (Actino_01), Fibrobacter (Fibro_03) and Clostridiales (Firm_11) populations at T2 (Fig. 5 and Fig. S6). Several Spirochaetes (Spiro_07-10 and Spiro_12) and Verrucomicrobia (Verruco_01-02) were initially present at lower abundance (maximum 1.3%) but increased over time (maximum 6.1%). Most of the dominant cellulolytic populations possessed multiple genes with cellulase and cellobiosidase activity (Fig. 3), and it has been hypothesized that higher GH diversity and copy number result in improved cellulose-degrading ability (Berlemont and Martiny, 2013).
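One way to picture the abundance normalization is to scale each genome's GH gene counts by that genome's relative abundance and then sum over genomes. The text does not spell out the exact formula, so the sketch below is an assumption made for illustration; the function names, genome labels and numbers are all hypothetical.

```python
def weight_gh_profile(gh_counts, rel_abundance):
    """Scale one genome's GH gene counts by its relative abundance (fraction, 0-1)."""
    return {family: n * rel_abundance for family, n in gh_counts.items()}


def community_gh_profile(genomes):
    """Sum abundance-weighted GH counts over all genomes at one time point.

    `genomes` maps genome name -> (GH counts dict, relative abundance fraction).
    """
    total = {}
    for gh_counts, rel_abundance in genomes.values():
        for family, value in weight_gh_profile(gh_counts, rel_abundance).items():
            total[family] = total.get(family, 0.0) + value
    return total


# Toy time point (hypothetical genomes and values).
t1 = {
    "genome_A": ({"GH5": 4, "GH9": 2}, 0.10),
    "genome_B": ({"GH5": 1, "GH94": 3}, 0.02),
}
weighted = community_gh_profile(t1)
```

Under this weighting, a genome with many GH genes but very low abundance contributes little to the community profile, which is the effect the normalization is meant to capture.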

The presence of multiple high abundance cellulose degraders at the same time within a community may suggest there is a level of niche specialization. For example, a positive correlation in relative abundance was observed between Fibro_03 and Firm_11 (Table 1 and Fig. S7). These populations may utilize different strategies for attachment to cellulose particles since fibro-slime proteins (fsu) and pili (pil) were identified in Fibro_03, similar to Fibrobacter succinogenes (Suen et al., 2011), while dockerin and cohesin modules were detected in Firm_11, suggesting the presence of an organized cellulosome apparatus similar to Clostridium thermocellum (Lynd et al., 2002; Bayer et al., 2008). Their substrate specificity may also vary as multiple endoglucanases (GH5, GH8, GH9 and GH45) but only one cellobiose phosphorylase (GH94) for cellobiose utilization were found in Fibro_03, while only a few endoglucanases within the GH5 family but multiple cellobiose phosphorylase (GH94) and beta-glucosidase genes (GH1 and GH3) were detected for Firm_11. In addition, these populations potentially use different oligosaccharide, cellobiose and glucose transport mechanisms, such as phosphotransferase systems (pts), non-specific sugar ABC transporters (e.g. msmK, malK, sugC, and gguAB) and specific cellobiose transporters (cebEFG) (Fig. S8). These differences in hydrolytic potential suggest that within the same environment and functional guild, niche specialization may allow seemingly functionally redundant populations to grow simultaneously and potentially work together.
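A correlation in relative abundance between two populations across samples can be quantified with a correlation coefficient. The sketch below uses Pearson's r as an assumption, since the text does not state which statistic was applied; the abundance values are invented and do not correspond to Fibro_03 or Firm_11.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy relative abundances (percent) for two populations in six samples.
pop_a = [0.1, 0.2, 0.1, 3.0, 4.0, 3.5]
pop_b = [0.2, 0.3, 0.2, 2.0, 2.9, 2.4]
r = pearson_r(pop_a, pop_b)
```

With only six samples, such a coefficient is descriptive rather than conclusive, and a rank-based statistic may be preferable when abundances are strongly skewed.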

Fermentation. The majority of the community showed a potential to convert glucose to acetate, with 73% of the bacterial population genomes encoding the acetate kinase (ack) and phosphate acetyltransferase (pta) genes required for acetate production (Fig. 4 and Fig. 5). An additional 16% were missing only one of these genes. This indicates a high level of functional redundancy and confirms acetate as one of the most important intermediates in these types of systems (Amani et al., 2010).
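The presence/absence logic behind these percentages can be sketched as follows. This is an illustrative helper only: the gene names ack and pta come from the text, while the function names, genome labels and gene sets are hypothetical.

```python
ACETATE_GENES = frozenset({"ack", "pta"})


def acetate_potential(gene_set):
    """Classify one genome by how many of the two acetate production genes it encodes."""
    missing = ACETATE_GENES - set(gene_set)
    if not missing:
        return "complete"
    if len(missing) == 1:
        return "one_gene_missing"
    return "absent"


def summarize(genomes):
    """Return the percentage of genomes in each category."""
    tally = {"complete": 0, "one_gene_missing": 0, "absent": 0}
    for genes in genomes.values():
        tally[acetate_potential(genes)] += 1
    n = len(genomes)
    return {k: (100.0 * v / n if n else 0.0) for k, v in tally.items()}


# Toy annotations (hypothetical genomes and gene sets).
toy_genomes = {
    "genome_A": {"ack", "pta", "buk"},
    "genome_B": {"ack", "pta"},
    "genome_C": {"pta"},
    "genome_D": {"buk"},
}
summary = summarize(toy_genomes)
```

Because the recovered genomes are incomplete to varying degrees, a genome classed as missing one gene may still encode the full pathway, which is why that category is reported separately.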

Propionate production within these communities occurred via the methylmalonyl-CoA pathway by populations within the Actinobacteria (Actino_02), Bacteroidetes (Bact_02-03, Bact_09-11, Bact_13, Bact_19 and Bact_22-24), Rhodospirillum (Alpha_01) and Verrucomicrobia (Verruco_01-02), which contained the key enzymes methylmalonyl-CoA mutase, methylmalonyl-CoA epimerase and methylmalonyl-CoA carboxyltransferase. The higher propionate concentrations observed in the reactors at T1 (Table S2) were likely related to the high relative abundance of Bact_03 (10 ± 2%) and Actino_02 (4 ± 1%), a population closely related to Propionibacterium (Fig. 4). The main propionate producers decreased in abundance over time and at T2 the dominant populations of this functional guild shifted to members of the Bacteroidetes (Bact_19 and Bact_22-24; 0-5%) and Verrucomicrobia (Verruco_01-02; 0-6%) (Fig. 5). A full complement of genes for propionate production via the acrylate pathway or propanediol pathway was not detected in the investigated genomes.

Multiple potential butyrate producers were detected within the phylum Bacteroidetes (Bact_08-11, Bact_13, Bact_19 and Bact_22-24) (Fig. 4 and Fig. 5). These populations contained the key gene butyrate kinase (buk) as well as most or all of the remaining genes in the butyrate fermentation pathway. The alternative pathway using butyryl-CoA:acetate CoA-transferase (but) was not detected in the studied population genomes. Although butyrate production genes were expected to be found in the Clostridiales genomes based on what is known from cultured species and genome representatives (Vital et al., 2016), a complete pathway for butyrate production was not detected in any of the Clostridiales genomes from this study. Potential for amino acid fermentation to acetate and butyrate was detected for Synergistetes (Syner_01 and Syner_03) and Treponema (Spiro_12) populations, which has been observed previously for species belonging to these phyla (Tucci and Martin, 2007; Ganesan et al., 2008; Chertkov et al., 2010). These populations may be scavengers utilizing proteins that have been excreted or leaked from dead cells. Potential growth on proteinaceous compounds and sugars with predominantly acetate and lower amounts of butyrate as fermentation products may also be possible for the Thermotogae populations (Thermo_01-02), similar to what has been suggested for Mesotoga prima (Nesbo et al., 2012). Only three mesophilic Thermotogae genomes have been described so far, providing limited knowledge of their metabolism. The populations within the reactors seem phylogenetically more closely related to Mesotoga infera; however, they lack the genes for utilization of sulfur as terminal electron acceptor, a key feature for this species (Hanaia et al., 2013). Instead, they contain a selection of polysaccharide-degrading enzymes, which can be related back to the environment in which they are found.

Syntrophic VFA oxidation. Reduced compounds such as propionate and butyrate can be further oxidized to acetate, CO2, H2 and formate by syntrophic bacteria when H2 partial pressures are low. Two Syntrophobacterales genomes (Delta_01 and Delta_02; 47% amino acid identity (AAI)) contained the majority of genes for the methylmalonyl-CoA pathway, indicating a potential involvement in propionate oxidation. Other members of this family are capable of syntrophic propionate oxidation, i.e. Syntrophobacter fumaroxidans (Harmsen et al., 1998) (64% AAI to Delta_01), and syntrophic oxidation of phenol and other aromatics to acetate, i.e. Syntrophorhabdus aromaticivora (Qiu et al., 2008) (63% AAI to Delta_02). Delta_01 and Delta_02 were present at <0.2% relative abundance at T1 and increased over time to 0.3-0.9% at T2 (Fig. 5). Although these relative abundances are still low, syntrophic propionate oxidizers are capable of high substrate turnover and this likely contributed to the low observed propionate concentrations at T2. It has been suggested that Candidatus ‘Cloacamonas acidaminovorans’ is a hydrogen-producing syntroph capable of oxidizing propionate based on its genome sequence combined with cultivation experiments (Pelletier et al., 2008). Although the WWE1 genome recovered from the reactors appears to have similar genes required for the utilization of amino acids, sugars and carboxylic acids, as well as multiple putative Fe-only hydrogenases, the energy-conservation mechanism required for syntrophic VFA oxidation remains to be elucidated.

Butyrate oxidation was likely performed via the beta-oxidation pathway by another Syntrophobacterales population (Delta_03), which is most closely related to Syntrophus aciditrophicus (60% AAI) (McInerney et al., 2007). The Delta_03 genome had a large number of genes invested in butyrate oxidation and increased in abundance over time from <0.001% at T1 to ~1.4% at T2 (Fig. 5). The Delta_01 and Delta_02 genomes only encode part of the beta-oxidation pathway, i.e. from butyryl-CoA or crotonyl-CoA to acetate, suggesting intermediates from other oxidation pathways may feed into the butyrate oxidation pathway at this step.

Methanogenesis. Methane-producing populations within these communities were related to Methanocorpusculum (Methan_05), Methanospirillum (Methan_06), Methanoculleus (Methan_07) and Methanosaeta (Methan_01-02 and Methan_04) (Fig. 4 and Fig. 5). Over time, there was an overall increase in methanogen abundance associated with a shift from hydrogenotrophic to acetoclastic methanogenesis. The presence of multiple populations capable of fulfilling the same function shows that a level of functional redundancy remained within more specialized functional guilds. Another interesting finding was the presence of a near complete complement of genes for hydrogenotrophic methanogenesis within each of the three Methanosaeta genomes (Methan_01-02 and Methan_04). These genomes showed little to no contamination, and members of this genus are reported to be strictly acetoclastic. Various hypotheses have been developed to explain the potential role of this pathway (Smith and Ingram-Smith, 2007; Rotaru et al., 2014) but functional assays are needed to determine whether this pathway is active in these systems.

While methanogen abundance increased over time, the increase in methane production was not proportional, and this was likely due to a shift in the rate-limiting step. The observed accumulation of VFAs at T1 indicates syntrophic VFA oxidation and/or methanogenesis was rate-limiting within the community at this time point. As all VFAs were efficiently converted to biogas at T2, steps upstream in the metabolic network were more likely rate-limiting. When substrate concentrations are low, methanogens can use internal storage compounds (e.g. glycogen) for growth without methane production (Verhees et al., 2003). Also, enzymes for assimilatory and dissimilatory sulfate reduction were encoded within several populations present at higher abundance at T2 (Delta_01, Chlorobi_01 and Alpha_01-03), indicating potential competition with methanogens for H2 and/or acetate (Oremland and Polcin, 1982).

Discussion

The widespread application of metagenomic sequencing has led to the discovery of novel species and metabolic processes of global importance (Haroon et al., 2013; Wrighton et al., 2014; Baker et al., 2015). Improved metagenome assembly and binning tools (Imelfort et al., 2014) now allow a growing number of population genomes to be recovered from increasingly complex environments (Albertsen et al., 2013; Baker et al., 2015; Brown et al., 2015). Here, a detailed genome-centric analysis of microbial communities involved in the conversion of cellulose to methane led to the recovery of 101 population genomes that could be classified into functional guilds based on their potential substrate utilization. Through the recovery of population genomes for the majority of the community, we were able to combine the metabolic potential of individual populations with their relative abundance, and reconstruct a metabolic network for the dominant players in the communities at two time points (T1: Fig. 4 and T2: Fig. 5). The networks revealed a high level of functional redundancy, particularly among the hydrolyzers and fermenters, as changes in the dominant players were observed over time while the overall functionality was maintained. Potential niche specialization was also observed based on the variety and abundance of GH families. Various microbial interactions could be inferred, including competition for substrates and cellobiose- or glucose-utilizing opportunists that depend on the activity of primary cellulose degraders. Metabolic functions that could not have been predicted from known cultured or sequenced representatives were also identified within each functional guild. By correlating the metabolic network with performance parameters, observations such as the accumulation of propionate could be explained. The genome-resolved network also enabled the proportion of the community represented by each functional guild to be calculated, and this highlighted the importance of a diverse and well-balanced community with functional flexibility to fulfill a complex multi-step process such as anaerobic digestion.

The results presented here demonstrate the valuable insights that can be gained into complex metabolic networks through genome-centric metagenomics. The approach described in this study can be readily applied to other natural and engineered systems, which will undoubtedly reveal novel microbial diversity and metabolic interactions. When genome-centric metagenomics is combined with functional data derived from metatranscriptomics or -proteomics, we will be able to develop a holistic understanding of the complex roles microorganisms play in these environments.

Experimental procedures

Sample collection and DNA extraction

Triplicate ADs (2 L working volume) were seeded with a diverse inoculum consisting of samples taken from various anaerobic digesters, an anaerobic lagoon, rumen fluid and anoxic lake sediment (Vanwonterghem et al., 2014b), and supplied with alpha cellulose (Sigma Aldrich, NSW, Australia) as the sole energy and carbon source. The reactors were designated AD1, AD2 and AD3, and were run for 362 days at a 10 day sludge retention time (SRT), under mesophilic conditions and at neutral pH. The medium contained 3 g L⁻¹ Na2HPO4, 1 g L⁻¹ NH4Cl, 0.5 g L⁻¹ NaCl, 0.2465 g L⁻¹ MgSO4.7H2O, 1.5 g L⁻¹ KH2PO4, 14.7 mg L⁻¹ CaCl2, 2.6 g L⁻¹ NaHCO3, 0.5 g L⁻¹ C3H7NO2S, 0.25 g L⁻¹ Na2S.9H2O, and 1 mL of trace solution containing 1.5 g L⁻¹ FeSO4.7H2O, 0.15 g L⁻¹ H3BO3, 0.03 g L⁻¹ CuSO4.5H2O, 0.18 g L⁻¹ KI, 0.12 g L⁻¹ MnCl2.4H2O, 0.06 g L⁻¹ Na2Mo4.2H2O, 0.12 g L⁻¹ ZnSO4.7H2O, 0.15 g L⁻¹ CoCl2.6H2O, 10 g L⁻¹ EDTA and 23 mg L⁻¹ NiCl2.6H2O. It was sparged with N2 and then autoclaved at 121°C for 60 min for oxygen removal and sterilisation. The reactors were supplied with alpha cellulose at a concentration of 5 g cellulose L⁻¹ medium semi-continuously, i.e. at intervals of six hours, resulting in four feed events per day. Reactor performance parameters and microbial community composition were monitored over time as part of a previous study (Vanwonterghem et al., 2014b). Samples for metagenomic sequencing (2 mL) were collected from the three reactors at two time points (Day 96 and Day 362) based on differences in reactor performance (Table S2). The samples were centrifuged at 14,000 g for 2 min to collect the biomass, and the pellet was snap-frozen in liquid nitrogen and stored at -80°C until further processing. DNA was extracted from these samples using the MP-Bio FastDNA Spin Kit for Soil (MP Biomedicals, Australia) according to the manufacturer’s instructions.
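The feeding regime described above can be turned into a daily cellulose load with a few lines of arithmetic. The sketch below is illustrative only: it assumes the reactors were operated without biomass retention, so that the volume exchanged per day equals the working volume divided by the SRT. That assumption, the function name and the dictionary keys are not taken from the text.

```python
# Worked arithmetic for the semi-continuous feeding regime.
# Assumption (not stated in the text): no biomass retention, so the liquid
# volume exchanged per day equals working volume / SRT.

def feeding_summary(working_volume_l, srt_days, feed_conc_g_per_l, interval_h):
    """Return daily and per-feed volumes and the resulting cellulose load."""
    feeds_per_day = 24 / interval_h
    daily_exchange_l = working_volume_l / srt_days
    cellulose_g_per_day = daily_exchange_l * feed_conc_g_per_l
    return {
        "feeds_per_day": feeds_per_day,
        "daily_exchange_l": daily_exchange_l,
        "volume_per_feed_l": daily_exchange_l / feeds_per_day,
        "cellulose_g_per_day": cellulose_g_per_day,
        "loading_g_per_l_per_day": cellulose_g_per_day / working_volume_l,
    }


# Values from the text: 2 L working volume, 10 day SRT,
# 5 g cellulose per L medium, one feed every six hours.
summary = feeding_summary(2.0, 10.0, 5.0, 6.0)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under this assumption each feed event would deliver about 50 mL of medium, and the reactors would receive about 1 g of cellulose per day.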

Metagenome library preparation and sequencing

DNA libraries for samples from the first time point were prepared using the TruSeq DNA Sample Preparation Kit v2 (Illumina, CA) with 2 µg of DNA from each sample, following the manufacturer’s instructions. The DNA concentration of the libraries was measured using the Quant-iT kit (Molecular Probes, CA). Paired-end sequencing (2 x 150 bp, average fragment size 250 bp) was performed on the Illumina HiSeq2000 using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina). The second set of samples was prepared for sequencing using the Nextera DNA Sample Preparation Kit (Illumina) with 50 ng of DNA from each sample, following the manufacturer’s instructions. Quantification and quality assessment of the libraries were performed using the Agilent 2100 Bioanalyser (Agilent Technologies, CA). Paired-end sequencing (2 x 150 bp, fragment size ranging from 300-800 bp) was performed on the Illumina HiSeq2000 platform using the TruSeq SBS Kit v3 (Illumina). Each sample was sequenced on one third of a flowcell lane, generating a combined total of 111 Gb of raw sequence data. Three additional large-insert (2797 ± 83 bp) mate-pair libraries were generated from the same genomic DNA extracted from the three reactors at Day 362. The libraries were constructed using the Nextera Mate Pair Sample Preparation Kit (Illumina) and sequenced on the Illumina MiSeq system (2 x 250 bp paired-end sequencing) using the MiSeq Reagent Kit v2. Each sample was sequenced on one quarter of a flowcell lane, generating a combined total of 5 Gb of raw sequence data.

Community profiling

16S rRNA gene amplicon sequencing of all samples using the Roche 454 GS-FLX Titanium platform (Roche Diagnostics, Australia) has been previously reported (Vanwonterghem et al., 2014b). The microbial community composition was also determined by identifying and classifying all 16S rRNA reads from the paired-end metagenomic datasets using the software CommunityM v.1.2 with default parameters (https://github.com/dparks1134/CommunityM.git), which uses hidden Markov models (HMMs) to identify the 16S rRNA gene sequences and classifies them using the GreenGenes database (DeSantis et al., 2006) with clustering at 97% sequence similarity.

Metagenome assembly and population genome binning

Paired-end reads were quality trimmed using CLC workbench v.6 (CLC Bio, Taiwan) with a quality score threshold of 0.01 and minimum read length of 100 bp. Illumina sequencing adapters at the ends of reads were trimmed (if found) and reads containing ambiguous nucleotides were removed from the dataset. Trimmed sequences were assembled using the CLC de novo assembly algorithm with a kmer size of 63 and automatic bubble size. All six datasets were assembled individually and also combined in a single large dataset co-assembly for population genome binning. Only contigs larger than 500 bp were used in downstream analyses. The raw paired-end reads from each individual dataset were mapped onto the combined assembly using BWA (Li, 2013) and SAMtools (Li et al., 2009) with default parameters. On average 87 ± 4% of all reads mapped onto the co-assembly. Population genomes were recovered from the sequence data based primarily on differential coverage profiles using GroopM v.0.2 (Imelfort et al., 2014), with initial core formation set at 1500 bp.
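The length and ambiguity filters described in this section can be expressed as two simple predicates. The sketch below is not the CLC workbench implementation and does not reproduce its quality-score trimming. The function names are hypothetical, and it assumes that an "ambiguous nucleotide" is any character other than A, C, G or T.

```python
# Minimal sketch of the read and contig filters described in the text.
# This is NOT the CLC implementation; quality-score trimming is omitted.

UNAMBIGUOUS = set("ACGT")


def keep_read(sequence, min_length=100):
    """Keep a trimmed read if it is long enough and has no ambiguous bases."""
    seq = sequence.upper()
    return len(seq) >= min_length and set(seq) <= UNAMBIGUOUS


def keep_contig(sequence, min_length=500):
    """Keep only contigs strictly larger than min_length bp."""
    return len(sequence) > min_length


reads = ["ACGT" * 25, "ACGT" * 24 + "ACG", "ACGT" * 25 + "N"]
kept_reads = [r for r in reads if keep_read(r)]
print(len(kept_reads))
```

Here only the first read passes: the second is 99 bp long and the third contains an `N`.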

Population genome bin refinement and quality assessment

The completeness and level of contamination of the population genome bins were calculated with CheckM v.0.9.4 (Parks et al., 2015), which uses lineage-specific conserved marker gene sets for each population genome. Manual refinement of the population genome bins was performed using the GroopM refine function based on coverage profiles, kmer signatures and GC content, leading to a significant increase in good quality population genomes (Table S8). The resulting population genome bins were further refined using the mate-pair sequence data. Adapter sequences were removed, trimmed reads shorter than 50 bp were discarded, and only valid mate pairs, i.e. reads oriented in the reverse-forward direction, were retained. Scaffolding of the processed mate-pair reads was performed using SSPACE v.2.0 (Boetzer et al., 2011) with a minimum number of links set at 2. The population genome bins were improved by adding or removing linked contigs based on coverage information, the number of connections between contigs and completeness/contamination estimates (Table S8). The completeness estimates were also used to calculate the expected genome size. The data have been submitted to the NCBI Short Read Archive under BioProject PRJNA284316.
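The text states that completeness estimates were used to calculate the expected genome size, but does not give the formula. One common way to do this, shown here purely as an assumption, is to divide the recovered bin size by the fractional completeness, optionally discounting the estimated contamination first. The function name and the example values are hypothetical.

```python
# Illustrative estimate of expected genome size from a bin's recovered size.
# The exact formula used in the study is not given; this is one common choice.

def expected_genome_size(bin_size_mb, completeness_pct, contamination_pct=0.0):
    """Scale the bin size by completeness, after removing contamination."""
    if not 0 < completeness_pct <= 100:
        raise ValueError("completeness_pct must be in (0, 100]")
    if not 0 <= contamination_pct < 100:
        raise ValueError("contamination_pct must be in [0, 100)")
    clean_size = bin_size_mb * (1 - contamination_pct / 100)
    return clean_size / (completeness_pct / 100)


# Hypothetical bin: 2.0 Mb recovered at 80% completeness.
print(round(expected_genome_size(2.0, 80.0), 3))
# Same bin, assuming 5% of the recovered sequence is contamination.
print(round(expected_genome_size(2.0, 80.0, 5.0), 3))
```

Whether contamination should be discounted depends on how the estimates are interpreted, so the second argument is left optional.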

Genome tree phylogeny

A genome tree was generated using 38 universal conserved marker genes (Darling et al., 2014) from 2015 finished bacterial and archaeal genomes available from the Integrated Microbial Genomes database (IMG) (Markowitz et al., 2012) and the recovered population genomes (Table S9). Marker genes were identified using HMMs and the genome tree was generated with FastTree (Price et al., 2009) using a concatenated alignment of the marker genes. The phylogenetic affiliation of the population genomes was determined relative to the IMG genomes and compared to that of the 16S rRNA sequences identified in the genome bins using CommunityM v.1.2 with default parameters and the GreenGenes database clustered at 97% sequence similarity (https://github.com/dparks1134/CommunityM.git).

Functional annotation of the metagenomes

For each individually assembled metagenome, open reading frames (ORFs) were identified using PROKKA v.1.8 (Seemann, 2014). Genes encoding carbohydrate active enzymes (CAZy) (Lombard et al., 2014) were detected using hmmer v.3.1 (Finn et al., 2011) and the HMM-based database for CAZy annotation (dbCAN v.3) (Yin et al., 2012), which classifies enzymes that degrade glycosidic bonds into families based on structurally-related catalytic and carbohydrate-binding modules. For each metagenome, the total number of hits to each glycoside hydrolase (GH) family was calculated for comparative analysis.
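The per-family tally described above amounts to counting annotated ORFs by GH family. The sketch below assumes the hmmer/dbCAN results have already been parsed into (ORF identifier, family name) pairs. That input structure, the function name and the example hits are hypothetical and do not reflect any particular output file format.

```python
from collections import Counter

# Minimal sketch: tally hits per glycoside hydrolase (GH) family.
# Input is assumed to be pre-parsed (orf_id, family) pairs.


def count_gh_hits(hits):
    """Count hits per family, keeping only families named 'GH...'."""
    return Counter(family for _orf, family in hits if family.startswith("GH"))


# Hypothetical hits for one metagenome; CBM and GT families are ignored.
example_hits = [
    ("orf_0001", "GH5"),
    ("orf_0002", "GH9"),
    ("orf_0003", "GH5"),
    ("orf_0004", "CBM6"),
    ("orf_0005", "GT2"),
]
gh_counts = count_gh_hits(example_hits)
print(sorted(gh_counts.items()))
```

Repeating the tally for each metagenome gives one count vector per sample, which is the form needed for a comparative heatmap.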

Functional annotation of the population genomes and metabolic network reconstruction

Population genomes recovered from the combined metagenome assembly were annotated using PROKKA v.1.8 and validated based on homology search with BLASTP (Altschul et al., 1990) using the IMG protein database (Markowitz et al., 2012) and KEGG Orthology database (Kanehisa and Goto, 2000; Kanehisa et al., 2014). Carbohydrate active enzymes were detected for each population genome using hmmer and dbCAN, as for the individual metagenomes. These results were combined with known activities of GH families (http://www.cazy.org; https://www.cazypedia.org) (Allgaier et al., 2010) and the annotations based on PROKKA and the IMG database, in order to determine the predominant substrate profile for each GH family. A full reconstruction of the metabolic potential for each population genome was based on the consensus of the different annotation methods used and metabolic pathways identified in KEGG and MetaCyc (Caspi et al., 2008). A metabolic pathway comprising multiple genes was considered present if the majority (>75%) of genes involved in this pathway were detected in the genome. The populations could be classified into one or more functional guilds, namely hydrolysis (cellulose/cellobiose), fermentation (acetate/propionate/butyrate), syntrophic VFA oxidation and methanogenesis (hydrogenotrophic/acetoclastic), based primarily on their carbon metabolism. In order to reconstruct the metabolic networks at each time point (Fig. 4 and Fig. 5), only those populations present at >0.1% relative abundance in at least one of the reactors were considered, and their average relative abundance across the reactors at each time point was calculated to determine the contribution of each population to the flow of carbon (represented by the thickness of the lines in Fig. 4 and Fig. 5). The combined (average) relative abundance of all populations within a functional guild was calculated to assess the overall distribution of functions across the community and how this balance shifts over time.
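The three rules in this section (pathway presence at >75% of genes, the >0.1% abundance cut-off, and per-guild totals) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the cut-off is applied per time point, the gene identifiers and guild labels are placeholders, and only the relative abundance values are taken from Table 1.

```python
from statistics import mean


def pathway_present(detected_genes, pathway_genes, threshold=0.75):
    """True if strictly more than `threshold` of the pathway genes are found."""
    pathway = set(pathway_genes)
    if not pathway:
        raise ValueError("pathway_genes must not be empty")
    fraction = len(pathway & set(detected_genes)) / len(pathway)
    return fraction > threshold


def network_members(abundance, min_abundance=0.1):
    """Mean abundance of populations above the cut-off in at least one reactor.

    abundance: {population: [AD1, AD2, AD3]} in % for a single time point.
    """
    return {
        pop: mean(values)
        for pop, values in abundance.items()
        if max(values) > min_abundance
    }


def guild_totals(mean_abundance, guilds):
    """Sum mean abundances per guild; a population may sit in several guilds."""
    totals = {}
    for pop, value in mean_abundance.items():
        for guild in guilds.get(pop, []):
            totals[guild] = totals.get(guild, 0.0) + value
    return totals


# Relative abundances (%) at T1 for three bins, taken from Table 1.
t1 = {
    "Bact_03": [10.23, 12.28, 8.36],
    "Firm_14": [0.00, 0.00, 4.61],
    "Methan_01": [0.00, 0.00, 0.00],
}
# Placeholder guild labels, for illustration only.
example_guilds = {"Bact_03": ["guild_A"], "Firm_14": ["guild_A", "guild_B"]}

members = network_members(t1)
totals = guild_totals(members, example_guilds)
print(sorted(members))
print({guild: round(total, 2) for guild, total in totals.items()})
```

Because the threshold is strict, a pathway with six of eight genes detected (exactly 75%) would not count as present, whereas seven of eight would.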

Statistical analyses

All statistical analyses and construction of heatmaps were carried out in RStudio v.2.15.0 using the R CRAN packages vegan (Oksanen et al., 2013) and RColorBrewer (Neuwirth, 2011). Tukey’s Honestly Significant Difference tests were used to statistically compare the datasets and principal component analysis (PCA) was used to assess the variability between samples. Correlation analyses were performed through linear regression of the relative abundance profiles and assessing the respective R² values.
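For a simple linear regression of one abundance profile on another, R² equals the squared Pearson correlation coefficient. The original analysis was carried out in R; the short Python sketch below only illustrates that calculation, with a hypothetical function name and made-up example vectors.

```python
# R-squared for simple linear regression, computed from sums of squares.
# Illustrative only; the study itself used R for its statistics.

def r_squared(x, y):
    """Coefficient of determination for a least-squares line of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each have non-zero variance")
    return (sxy * sxy) / (sxx * syy)


# Made-up example profiles.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [2.0, 4.0, 5.0, 4.0]
print(round(r_squared(profile_a, profile_b), 4))
```

A value close to 1 indicates that the two profiles rise and fall together almost linearly, while a value near 0 indicates no linear relationship.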

Acknowledgements

This study was supported by the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Flagship Cluster “Biotechnological solutions to Australia’s transport, energy and greenhouse gas challenges”. IV acknowledges support from the University of Queensland International Scholarship, and PJ acknowledges support from the Australian Meat Processor Corporation (2013/4008 Technology Fellowship). KR acknowledges support by the European Research Council (Starter Grant Electrotalk), and GWT was supported by an Australian Research Council Queen Elizabeth fellowship (DP1093175). The authors would like to thank Serene Low at the Australian Centre for Ecogenomics for the metagenome library preparation, Mike Imelfort for assistance with the bioinformatics analysis, Donovan Parks for assistance with the genome quality assessment and Philip Hugenholtz for providing comments on the manuscript.

Competing financial interests

The authors declare no competing financial interests.

References

Albertsen, M., Hugenholtz, P., Skarshewski, A., Nielsen, K.A., Tyson, G.W., and Nielsen, P.H. (2013) Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nature Biotechnology 31: 533-538.

Allgaier, M., Reddy, A., Park, J.I., Ivanova, N.N., D'Haeseleer, P., Lowry, P. et al. (2010) Targeted discovery of glycoside hydrolases from a switchgrass-adapted compost community. PLoS ONE 5: 1-9.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403-410.

Amani, T., Nosrati, M., and Sreekrishnan, T.R. (2010) Anaerobic digestion from the viewpoint of microbiological, chemical, and operational aspects - a review. Environmental Reviews 18: 255-278.

Baker, B.J., Lazar, C.S., Teske, A.P., and Dick, G.J. (2015) Genomic resolution of linkages in carbon, nitrogen, and sulfur cycling among widespread estuary sediment bacteria. Microbiome 3: 1-12.

Bayer, E.A., Lamed, R., White, B.A., and Flint, H.J. (2008) From cellulosomes to cellulosomics. The Chemical Record 8: 364-377.

Bekele, A.Z., Koike, S., and Kobayashi, Y. (2011) Phylogenetic diversity and dietary association of rumen Treponema revealed using group-specific 16S rRNA gene-based analysis. FEMS Microbiology Letters 316: 51-60.

Berlemont, R., and Martiny, A.C. (2013) Phylogenetic distribution of potential cellulases in bacteria. Applied and Environmental Microbiology 79: 1545-1554.

Boetzer, M., Henkel, C.V., Jansen, H.J., Butler, D., and Pirovano, W. (2011) Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27: 578-579.

Brown, C.T., Hug, L.A., Thomas, B.C., Sharon, I., Castelle, C.J., Singh, A. et al. (2015) Unusual biology across a group comprising more than 15% of domain Bacteria. Nature: 1-18.

Caspi, R., Foerster, H., Fulcher, C.A., Kaipa, P., Krummenacker, M., Latendresse, M. et al. (2008) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Research 36: 623-631.

Castelle, C.J., Wrighton, K.C., Thomas, B.C., Hug, L.A., Brown, C.T., Wilkins, M.J. et al. (2015) Genomic expansion of domain Archaea highlights roles for organisms from new phyla in anaerobic carbon cycling. Current Biology 25: 1-12.

Chertkov, O., Sikorski, J., Brambilla, E., Lapidus, A., Copeland, A., Glavina Del Rio, T. et al. (2010) Complete genome sequence of Aminobacterium colombiense type strain (ALA-1T). Standards in Genomic Sciences 2: 280-289.

Darling, A.E., Jospin, G., Lowe, E., Matsen, F.A., IV, Bik, H.M., and Eisen, J.A. (2014) PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ 2: 1-28.

DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller, K. et al. (2006) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology 72: 5069-5072.

Finn, R.D., Clements, J., and Eddy, S.R. (2011) HMMER web server: interactive sequence similarity searching. Nucleic Acids Research 39: 29-37.

Ganesan, A., Chaussonnerie, S., Tarrade, A., Dauga, C., Boucher, T., Pelletier, E. et al. (2008) Cloacibacillus evryensis gen. nov., sp. nov., a novel asaccharolytic, mesophilic, amino-acid-degrading bacterium within the phylum 'Synergistetes', isolated from an anaerobic sludge digester. International Journal of Systematic and Evolutionary Microbiology 58: 2003-2012.

Hania, W.B., Postec, A., Aullo, T., Ranchou-Peyruse, A., Erauso, G., Brochier-Armanet, C. et al. (2013) Mesotoga infera sp. nov., a mesophilic member of the order Thermotogales, isolated from an underground gas storage aquifer. International Journal of Systematic and Evolutionary Microbiology 63: 3003-3008.

Hanreich, A., Schimpf, U., Zakrzewski, M., Schluter, A., Benndorf, D., Heyer, R. et al. (2013) Metagenome and metaproteome analyses of microbial communities in mesophilic biogas-producing anaerobic batch fermentations indicate concerted plant carbohydrate degradation. Systematic and Applied Microbiology 36: 330-338.

Harmsen, H.J.M., Van Kuijk, B.L.M., Plugge, C.M., Akkermans, A.D.L., De Vos, W.M., and Stams, A.J.M. (1998) Syntrophobacter fumaroxidans sp. nov., a syntrophic propionate-degrading sulfate-reducing bacterium. International Journal of Systematic Bacteriology 48: 1383-1387.

Haroon, M.F., Hu, S., Shi, Y., Imelfort, M., Keller, J., Hugenholtz, P. et al. (2013) Anaerobic oxidation of methane coupled to nitrate reduction in a novel archaeal lineage. Nature 500: 567-570.

Imelfort, M., Parks, D.H., Woodcroft, B.J., Dennis, P.D., Hugenholtz, P., and Tyson, G.W. (2014) GroopM: an automated tool for the recovery of population genomes from related metagenomes. PeerJ 2: e603.

Jaenicke, S., Ander, C., Bekel, T., Bisdorf, R., Droge, M., Gartemann, K.-H. et al. (2010) Comparative and joint analysis of two metagenomic datasets from a biogas fermenter obtained by 454-pyrosequencing. PLoS ONE 6: 1-15.

Kanehisa, M., and Goto, S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 28: 27-30.

Kanehisa, M., Goto, S., Sato, Y., Kawashima, M., Furumichi, M., and Tanabe, M. (2014) Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research 42: 199-205.

Kaoutari, A.E., Armougom, F., Gordon, J.I., Raoult, D., and Henrissat, B. (2013) The abundance and variety of carbohydrate-active enzymes in the human gut microbiota. Nature Reviews Microbiology 11: 497-504.

Li, H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN].

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N. et al. (2009) The sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25: 2078-2079.

Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P.M., and Henrissat, B. (2014) The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Research 42: 490-495.

Lynd, L.R., Weimer, P.J., van Zyl, W.H., and Pretorius, I.S. (2002) Microbial cellulose utilization: Fundamentals and biotechnology. Microbiology and Molecular Biology Reviews 66: 506-577.

Markowitz, V.M., Chen, I.-M.A., Palaniappan, K., Chu, K., Szeto, E., Grechkin, Y. et al. (2012) IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Research 40: 115-122.

Martinez-Garcia, M., Brazel, D.M., Swan, B.K., Arnosti, C., Chain, P.S.G., Reitenga, K.G. et al. (2012) Capturing single cell genomes of active polysaccharide degraders: An unexpected contribution of Verrucomicrobia. PLoS ONE 7: 1-11.

McInerney, M.J., Rohlin, L., Mouttaki, H., Kim, U., Krupp, R.S., Rios-Hernandez, L. et al. (2007) The genome of Syntrophus aciditrophicus: Life at the thermodynamic limit of microbial growth. PNAS 104: 7600-7605.

Naas, A.E., Mackenzie, A.K., Mravec, J., Schuckel, J., Willats, W.G.T., Eijsink, V.G.H., and Pope, P.B. (2014) Do rumen Bacteroidetes utilize an alternative mechanism for cellulose degradation? MBio 5: 1-6.

Nesbo, C.L., Bradman, D.M., Adebusuyi, A., Dlutek, M., Petrus, A.K., Foght, J. et al. (2012) Mesotoga prima gen. nov., sp. nov., the first described mesophilic species of the Thermotogales. Extremophiles 16: 387-393.

Neuwirth, E. (2011) RColorBrewer: ColorBrewer palettes.

Nobu, M.K., Narihiro, T., Rinke, C., Kamagata, Y., Tringe, S.G., Woyke, T., and Liu, W.-T. (2014) Microbial dark matter ecogenomics reveals complex synergistic networks in a methanogenic bioreactor. The ISME Journal: 1-13.

Oksanen, J., Blanchet, G., Kindt, R., Legendre, P., Minchin, P.R., O'Hara, R.B. et al. (2013) Vegan: community ecology package.

Oremland, R.S., and Polcin, S. (1982) Methanogenesis and sulfate reduction: competitive and noncompetitive substrates in estuarine sediments. Applied and Environmental Microbiology 44: 1270-1276.

Parks, D.H., Imelfort, M., Skennerton, C.T., Hugenholtz, P., and Tyson, G.W. (2015) CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. PeerJ PrePrints 2.

Pelletier, E., Kreimeyer, A., Bocs, S., Rouy, Z., Gyapay, G., Chouari, R. et al. (2008) "Candidatus Cloacamonas acidaminovorans": genome sequence reconstruction provides a first glimpse of a new bacterial division. Journal of Bacteriology 190: 2572-2579.

Price, M.N., Dehal, P.S., and Arkin, A.P. (2009) FastTree: Computing large minimum-evolution trees with profiles instead of a distance matrix. Molecular Biology and Evolution 26: 1641-1650.

Qiu, Y.-L., Hanada, S., Ohashi, A., Harada, H., Kamagata, Y., and Sekiguchi, Y. (2008) Syntrophorhabdus aromaticivorans gen. nov., sp. nov., the first cultured anaerobe capable of degrading phenol to acetate in obligate syntrophic associations with a hydrogenotrophic methanogen. Applied and Environmental Microbiology 74: 2051-2058.

Raghoebarsing, A.A., Pol, A., van de Pas-Schoonen, K.T., Smolders, A.J.P., Ettwig, K.F., Rijpstra, I.C. et al. (2006) A microbial consortium couples anaerobic methane oxidation to denitrification. Nature 440: 918-921.

Rahman, N.A., Parks, D., Vanwonterghem, I., Morrison, M., Tyson, G.W., and Hugenholtz, P. (2015) A phylogenomic analysis of the bacterial phylum Fibrobacteres. Frontiers in Microbiology.

Rotaru, A.-E., Shrestha, P.M., Liu, F., Shrestha, M., Shrestha, D., Embree, M. et al. (2014) A new model for electron flow during anaerobic digestion: direct interspecies electron transfer to Methanosaeta for the reduction of carbon dioxide to methane. Energy and Environmental Science 7: 408-415.

Seemann, T. (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics 30: 2068-2069.

Sekiguchi, Y., Ohashi, A., Parks, D.H., Yamauchi, T., Tyson, G.W., and Hugenholtz, P. (2015) First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking. PeerJ 3.

Smith, K.S., and Ingram-Smith, C. (2007) Methanosaeta, the forgotten methanogen? Trends in Microbiology 15: 150-155.

Solli, L., Havelsrud, O.E., Horn, S.J., and Rike, A.G. (2014) A metagenomic study of the microbial communities in four parallel biogas reactors. Biotechnology for Biofuels 7: 1-15.

Stolze, Y., Zakrzewski, M., Maus, I., Eikmeyer, F., Jaenicke, S., Rottmann, N. et al. (2015) Comparative metagenomics of biogas-producing microbial communities from production-scale biogas plants operating under wet or dry fermentation conditions. Biotechnology for Biofuels 8: 1-18.

Suen, G., Weimer, P.J., Stevenson, D.M., Aylward, F.O., Boyum, J., Deneke, J. et al. (2011) The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist. PLoS ONE 6: 1-15.

Tringe, S.G., and Rubin, E.M. (2005) Metagenomics: DNA sequencing of environmental samples. Nature Reviews Genetics 6: 805-814.

Tringe, S.G., Von Mering, C., Kobayashi, A., Salamov, A.A., Chen, K., Chang, H.W. et al. (2005) Comparative metagenomics of microbial communities. Science 308: 554-557.

Tucci, S., and Martin, W. (2007) A novel prokaryotic trans-2-enoyl-CoA reductase from the spirochete Treponema denticola. FEBS Letters 581: 1561-1566.

Tveit, A., Schwacke, R., Svenning, M.M., and Urich, T. (2013) Organic carbon transformations in high-Arctic peat soils: key functions and microorganisms. The ISME Journal 7: 299-311.

Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M. et al. (2004) Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428: 37-43.

Vanwonterghem, I., Jensen, P.D., Ho, D.P., Batstone, D.J., and Tyson, G.W. (2014a) Linking microbial community structure, interactions and function in anaerobic digesters using new molecular techniques. Current Opinion in Biotechnology 27: 55-64.

Vanwonterghem, I., Jensen, P.D., Dennis, P.G., Hugenholtz, P., Rabaey, K., and Tyson, G.W. (2014b) Deterministic processes guide long-term synchronised population dynamics in replicate anaerobic digesters. The ISME Journal: 1-14.

Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J.A. et al. (2004) Environmental genome shotgun sequencing of the Sargasso sea. Science 304: 66-74.

Verhees, C.H., Kengen, S.W.M., Tuininga, J.E., Schut, G.J., Adams, M.W.W., De Vos, W.M., and Van der Oost, J. (2003) The unique features of glycolytic pathways in Archaea. Biochemical Journal 375: 231-246.

Vital, M., Howe, A.C., and Tiedje, J.M. (2016) Revealing the bacterial butyrate synthesis pathways by analyzing (meta)genomic data. MBio 5: 1-11.

Wong, M.T., Zhang, D., Li, J., Hui, R.K.H., Tun, H.M., Brar, M.S. et al. (2013) Towards a metagenomic understanding on enhanced biomethane production from waste activated sludge after pH 10 pretreatment. Biotechnology for Biofuels 6: 1-14.

Wrighton, K.C., Thomas, B.C., Sharon, I., Miller, C.S., Castelle, C.J., Verberkmoes, N.C. et al. (2012) Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science 337: 1661-1665.

Wrighton, K.C., Castelle, C.J., Wilkins, M.J., Hug, L.A., Sharon, I., Thomas, B.C. et al. (2014) Metabolic interdependencies between phylogenetically novel fermenters and respiratory organisms in an unconfined aquifer. The ISME Journal 8: 1452-1463.

Yin, Y., Mao, X., Yang, J.C., Chen, X., Mao, F., and Xu, Y. (2012) dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Research 40: 445-451.

Figure legends

Fig. 1. Metagenome-based microbial community composition. The community profiles are shown for AD1, AD2 and AD3 on Days 96 (T1) and 362 (T2) based on the 16S rRNA genes extracted from the metagenomes and clustered at 97% sequence similarity. All populations present at >0.5% relative abundance in at least one of the samples are shown. The taxonomic classification based on the 16S rRNA gene is shown at the phylum level (left-hand side) and lowest level of taxonomic assignment (c: class, o: order, f: family and g: genus; right-hand side).

Fig. 2. Phylogeny of the population genomes. Genome tree based on a concatenated set of marker genes showing the phylogenetic affiliation of the 101 recovered population genomes from the anaerobic digesters relative to 2015 IMG genomes.

Fig. 3. Distribution of glycoside hydrolase (GH) families for 62 population genomes. The number of open reading frames (ORFs) identified within each GH family is shown by the heatmap and GH families are grouped by substrate activity. The phylum-level classification of the population genomes is shown on the left-hand side of the panel.

Fig. 4. Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 96. The color of the edges corresponds to the substrate node and the thickness of the edges is representative of the relative abundance of each population genome (average for the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.

Fig. 5. Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 362. The color of the edges corresponds to the substrate node and the thickness of the edges is representative of the relative abundance of each population genome (average for the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.

Table

Table 1. Summary statistics (Compl.: completeness; Cont.: contamination) of 62 population genomes selected for metabolic analysis, which were most complete and/or abundant in the reactors.

Bin_ID      Size  Scaffolds  Compl.  Cont.  GC    ORFs  Genome tree phylogeny  Relative abundance (%)                          16S rRNA gene
            (Mb)  (#)        (%)     (%)    (%)   (#)                          AD1_T1  AD2_T1  AD3_T1  AD1_T2  AD2_T2  AD3_T2
Methan_01   2.6   1          99.4    0.0    59.1  2557  Euryarchaeota          0.00    0.00    0.00    10.13   19.67   5.79    +
Methan_02   2.5   104        99.0    0.7    60.8  2633  Euryarchaeota          0.00    0.00    0.00    0.01    0.23    6.41    +
Methan_04   2.3   235        85.3    2.9    53.4  2083  Euryarchaeota          0.03    0.04    0.49    0.03    0.11    0.12    -
Methan_05   3.0   387        62.4    5.0    52.5  2034  Euryarchaeota          0.03    0.08    0.03    0.12    0.20    0.14    -
Methan_06   2.7   427        59.0    3.6    48.0  1864  Euryarchaeota          0.15    0.28    0.18    0.07    0.44    0.10    -
Methan_07   2.8   414        67.3    10.8   61.6  2182  Euryarchaeota          0.04    0.14    0.39    0.74    0.98    0.86    +
Cren_01     1.8   151        89.0    0.9    58.0  1863  Crenarchaeota          0.00    0.00    0.00    0.23    0.05    0.28    +
Actino_01   3.6   1          99.4    0.0    73.9  3175  Actinobacteria         0.16    0.16    0.22    23.49   7.59    6.24    +
Actino_02   3.2   27         95.2    0.3    67.6  2823  Actinobacteria         3.19    4.26    5.16    0.31    0.21    0.19    -
Alpha_01    3.9   55         99.0    0.5    68.1  3789                         0.00    0.00    0.00    2.18    0.28    0.26    +
Alpha_02    4.4   182        92.1    6.5    67.6  3768  Alphaproteobacteria    0.00    0.00    0.00    0.65    0.67    0.63    -
Alpha_03    6.4   53         98.9    8.5    60.6  5964  Alphaproteobacteria    0.00    0.00    0.00    1.93    1.01    1.22    +
Alpha_05    3.5   628        80.7    3.9    66.9  3160  Alphaproteobacteria    0.27    0.28    0.37    0.01    0.01    0.01    -
Bact_02     4.5   99         90.9    4.1    41.5  3385  Bacteroidetes          0.76    1.11    0.92    0.02    0.01    0.01    +
Bact_03     3.4   15         100.0   0.5    41.5  2882  Bacteroidetes          10.23   12.28   8.36    0.77    0.61    0.50    +
Bact_08     2.4   59         80.4    0.0    33.6  1628  Bacteroidetes          0.10    0.22    0.03    0.00    0.07    0.03    +
Bact_09     2.1   15         93.3    1.7    46.6  1707  Bacteroidetes          0.00    0.00    0.00    2.01    0.02    0.19    +
Bact_10     3.0   67         93.8    4.8    46.0  2385  Bacteroidetes          0.25    0.35    0.02    0.04    0.08    0.94    +
Bact_11     2.2   102        86.7    1.3    58.0  1670  Bacteroidetes          0.25    0.02    2.21    0.00    0.00    0.10    +
Bact_13     2.5   178        84.4    4.0    58.6  1907  Bacteroidetes          0.38    0.00    0.00    0.00    0.01    0.00    +
Bact_19     3.2   7          99.3    3.3    32.1  2464  Bacteroidetes          0.02    1.01    0.09    0.72    0.74    4.24    +
Bact_22     2.1   2          96.7    0.0    47.0  1711  Bacteroidetes          0.00    0.01    0.00    0.34    4.77    0.10    +
Bact_23     4.3   200        94.9    1.7    48.3  3135  Bacteroidetes          0.00    0.00    0.00    0.00    0.00    2.22    -
Bact_24     3.7   81         80.1    2.5    42.8  2413  Bacteroidetes          0.25    0.94    1.04    0.07    1.13    0.00    +
Beta_02     3.4   139        83.0    2.8    63.7  2793                         0.20    0.17    0.23    0.16    0.12    0.11    -
WWEI_01     2.0   130        91.3    6.0    36.4  1532  WWEI                   0.20    0.06    0.60    0.00    0.00    0.00    -
Clorobi_01  2.3   3          99.5    0.8    56.2  2119  Chlorobi               0.01    0.00    0.00    1.77    2.78    3.01    +
Chloro_02   2.6   237        70.8    0.2    52.5  1804  Chloroflexi            0.03    0.00    0.00    0.05    0.37    0.06    -
Deferri_01  2.9   14         98.2    0.9    44.4  2626  Deferribacterales      1.06    1.27    1.01    0.00    0.00    0.00    +
Delta_01    4.9   267        88.8    4.4    59.7  3909  Deltaproteobacteria    0.00    0.02    0.00    0.37    0.86    0.27    +
Delta_02    4.4   138        92.6    4.9    56.8  3832  Deltaproteobacteria    0.00    0.00    0.00    0.32    0.36    0.90    -
Delta_03    5.1   106        69.0    5.8    61.5  3387  Deltaproteobacteria    0.00    0.00    0.00    1.60    1.34    1.28    +
Epsilon_01  2.7   26         100.0   0.8    43.9  2690  Epsilonproteobacteria  1.42    1.46    1.19    0.01    0.03    0.01    -
Fibro_01    2.9   50         98.9    2.2    37.4  2362  Fibrobacteres          0.87    6.26    0.67    0.00    0.00    0.00    -
Fibro_02    3.5   122        93.1    0.7    51.4  2764  Fibrobacteres          4.00    4.16    2.62    0.05    0.01    0.02    +
Fibro_03    4.1   11         89.4    2.3    50.2  2968  Fibrobacteres          0.13    0.45    0.09    5.83    4.40    12.69   +
Firm_03     2.7   684        89.0    3.4    39.5  2573  Firmicutes             0.00    0.05    0.00    0.85    0.93    0.20    +
Firm_04     4.2   53         83.9    2.7    54.1  2959  Firmicutes             9.18    0.00    3.83    0.00    0.00    0.00    +
Firm_05     3.2   265        92.9    1.8    45.7  2806  Firmicutes             0.00    7.16    3.75    0.00    0.08    0.00    +
Firm_06     4.2   215        99.3    3.4    44.5  3763  Firmicutes             0.00    3.09    0.73    0.00    0.00    0.00    -
Firm_10     3.3   152        84.9    1.6    62.5  2482  Firmicutes             0.26    0.09    0.09    0.36    0.06    0.06    +
Firm_11     3.5   3          99.2    0.3    55.2  2847  Firmicutes             0.00    0.00    0.00    3.14    2.73    14.64   +
Firm_13     2.0   67         85.1    0.4    51.6  1514  Firmicutes             0.28    0.72    0.19    0.00    0.00    0.00    -
Firm_14     3.1   45         100.0   0.7    46.3  2824  Firmicutes             0.00    0.00    4.61    0.00    0.00    0.00    +
Firm_16     3.8   117        98.6    4.6    49.3  3097  Firmicutes             12.81   4.44    0.03    0.03    0.03    0.00    -
Lenti_01    3.9   118        70.3    0.4    60.9  2774  Lentisphaerae          2.43    2.08    2.84    2.99    3.63    1.54    +
Lenti_02    6.0   662        82.7    4.1    67.3  4114  Lentisphaerae          0.11    0.08    0.53    0.22    0.79    0.14    +
Planc_01    5.7   162        100.0   1.1    62.8  4512  Planctomycetes         0.00    0.00    0.00    0.40    1.79    1.14    +
Spiro_02    2.6   74         98.9    2.3    55.0  2311  Spirochaetes           2.93    1.23    0.07    0.46    0.19    0.01    -
Spiro_03    2.6   64         94.3    0.0    56.0  2219  Spirochaetes           0.00    0.00    0.00    0.00    0.08    0.81    -
Spiro_04    1.9   170        92.0    0.0    59.3  1727  Spirochaetes           0.24    0.05    0.09    0.09    0.19    0.01    -
Spiro_07    3.1   59         85.8    2.1    44.7  2326  Spirochaetes           0.33    0.00    0.00    0.10    0.00    0.00    -
Spiro_08    2.9   57         98.6    0.0    52.1  2443  Spirochaetes           0.16    0.13    0.00    0.44    0.25    1.27    -
Spiro_09    3.0   29         90.6    0.7    51.0  2397  Spirochaetes           0.01    0.00    0.00    0.65    1.39    0.16    -
Spiro_10    3.0   11         97.9    0.0    61.8  2527  Spirochaetes           0.33    0.36    1.29    6.10    2.61    1.55    +
Spiro_12    2.4   139        94.9    0.0    57.2  2129  Spirochaetes           0.95    0.86    0.75    2.64    0.50    0.79    +
Syner_01    3.7   683        83.8    5.7    58.9  3280  Synergistetes          0.01    0.01    0.00    0.44    1.26    0.86    +
Syner_03    1.9   218        100.0   2.4    52.0  1862  Synergistetes          0.48    0.79    1.44    0.32    1.35    1.85    +
Thermo_01   2.8   76         94.4    0.3    48.6  2472  Thermotogae            0.06    0.37    0.31    0.22    1.81    1.17    +
Thermo_02   3.5   643        93.8    1.9    47.0  3159  Thermotogae            0.00    0.00    0.00    0.33    0.27    0.01    -
Verruco_01  2.7   7          95.4    1.3    63.0  2261  Verrucomicrobia        0.00    0.01    0.19    5.88    0.51    0.00    +
Verruco_02  2.9   33         94.6    2.0    58.8  2263  Verrucomicrobia        0.00    0.00    0.06    0.01    0.65    0.00    +


Metagenome-based microbial community composition. Community profiles are shown for AD1, AD2 and AD3 on Days 96 (T1) and 362 (T2), based on the 16S rRNA genes extracted from the metagenomes and clustered at 97% sequence similarity. All populations present at >0.5% relative abundance in at least one sample are shown. The taxonomic classification based on the 16S rRNA gene is shown at the phylum level (left-hand side) and at the lowest level of taxonomic assignment (c: class, o: order, f: family and g: genus; right-hand side).
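The inclusion rule in this caption (present at >0.5% relative abundance in at least one sample) can be illustrated with a minimal Python sketch. The function name `filter_by_abundance`, the population labels and all abundance values are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
from typing import Dict, List


def filter_by_abundance(
    profiles: Dict[str, List[float]], threshold: float = 0.5
) -> Dict[str, List[float]]:
    """Keep populations whose relative abundance (%) is strictly greater
    than `threshold` in at least one sample."""
    return {
        name: values
        for name, values in profiles.items()
        if any(v > threshold for v in values)
    }


# Hypothetical relative abundances (%) for six samples
# (three reactors at two time points); not data from the study.
profiles = {
    "otu_a": [0.10, 0.20, 0.05, 0.00, 0.00, 0.00],
    "otu_b": [0.00, 0.00, 0.00, 0.40, 0.90, 0.30],
    "otu_c": [0.50, 0.50, 0.50, 0.50, 0.50, 0.50],
}

kept = filter_by_abundance(profiles)
print(sorted(kept))
```

In this sketch the comparison is strict, so a population sitting at exactly 0.5% in every sample is excluded; whether the original analysis treated the boundary this way is an assumption.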

Wiley-Blackwell and Society for Applied Microbiology


[Figure: tree of the population genomes (for example Methan_01, Firm_01 and Bact_01) and reference taxa, with clades labelled by phylum (including Euryarchaeota, Crenarchaeota, Firmicutes, Bacteroidetes, Spirochaetes, Chloroflexi and Synergistetes). Scale bar: 0.001.]


Distribution of glycoside hydrolase (GH) families for 62 population genomes. The heatmap shows the number of open reading frames (ORFs) identified within each GH family; GH families are grouped by substrate activity. The phylum-level classification of the population genomes is shown on the left-hand side of the panel.
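The counts behind a heatmap of this kind can be sketched as a genome-by-family matrix built from per-ORF annotations. The following Python sketch assumes each annotated ORF is available as a (genome, GH family) pair; the function name `gh_count_matrix` and the example assignments of GH families to genomes are hypothetical and do not reproduce results from the study.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def gh_count_matrix(
    annotations: Iterable[Tuple[str, str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Count ORFs per (genome, GH family) pair and return the row labels,
    the column labels and the count matrix."""
    counts = Counter(annotations)
    genomes = sorted({genome for genome, _ in counts})
    families = sorted({family for _, family in counts})
    # Counter returns 0 for pairs that were never observed.
    matrix = [[counts[(g, f)] for f in families] for g in genomes]
    return genomes, families, matrix


# Hypothetical annotations: one (genome, GH family) pair per ORF.
annotations = [
    ("Fibro_01", "GH5"),
    ("Fibro_01", "GH5"),
    ("Fibro_01", "GH9"),
    ("Firm_04", "GH9"),
    ("Firm_04", "GH43"),
]

genomes, families, matrix = gh_count_matrix(annotations)
for genome, row in zip(genomes, matrix):
    print(genome, dict(zip(families, row)))
```

Family labels are sorted as plain strings here; grouping columns by substrate activity, as in the figure, would need an additional mapping that is not given in the caption.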


Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 96. The color of each edge corresponds to its substrate node, and the thickness of each edge represents the relative abundance of the population genome (averaged over the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.
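The quantities described in this caption (an inclusion threshold, an edge weight averaged over the three reactors, and a per-guild share of the community) can be sketched in Python as follows. The function name `network_summary`, the population and guild labels, the abundance values, and the choice to sum reactor-averaged abundances per guild are all assumptions made for illustration; the caption does not specify how the guild percentages were aggregated.

```python
from statistics import mean
from typing import Dict, List, Tuple


def network_summary(
    abundance: Dict[str, List[float]],
    guilds: Dict[str, List[str]],
    threshold: float = 0.1,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (edge_weight, guild_total).

    edge_weight: mean relative abundance (%) across reactors for each
        population above `threshold` in at least one reactor.
    guild_total: sum of those means over the populations assigned to
        each guild (an assumed aggregation).
    """
    kept = {
        pop: values
        for pop, values in abundance.items()
        if any(v > threshold for v in values)
    }
    edge_weight = {pop: mean(values) for pop, values in kept.items()}
    guild_total: Dict[str, float] = {}
    for pop, weight in edge_weight.items():
        for guild in guilds.get(pop, []):
            guild_total[guild] = guild_total.get(guild, 0.0) + weight
    return edge_weight, guild_total


# Hypothetical relative abundances (%) in AD1, AD2 and AD3 at one time point.
abundance = {
    "pop_a": [1.0, 2.0, 3.0],
    "pop_b": [0.0, 0.05, 0.1],  # never above 0.1%, so excluded
    "pop_c": [4.0, 4.0, 1.0],
}
# Hypothetical guild assignments; a population may belong to several guilds.
guilds = {
    "pop_a": ["fermenter"],
    "pop_b": ["fermenter"],
    "pop_c": ["fermenter", "hydrolyser"],
}

edge_weight, guild_total = network_summary(abundance, guilds)
print(edge_weight)
print(guild_total)
```

The same sketch applies to either sampling day by passing the abundances for that time point.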


Metabolic network based on the functional classification of all populations present at >0.1% relative abundance in at least one of the anaerobic digesters (AD1, AD2 and AD3) at Day 362. The color of each edge corresponds to its substrate node, and the thickness of each edge represents the relative abundance of the population genome (averaged over the three reactors). The percentages on the right-hand side of the panel show the fraction of the community (total relative abundance) classified within each functional guild.
