www.nature.com/scientificreports

OPEN

Occurrence and expression of novel methyl-coenzyme M reductase gene (mcrA) variants in hot spring sediments

Luke J. McKay1,2, Roland Hatzenpichler1,3, William P. Inskeep2 & Matthew W. Fields1,4

Received: 6 April 2017
Accepted: 27 June 2017
Published: xx xx xxxx

Recent discoveries have shown that the marker gene for anaerobic methane cycling (mcrA) is more widespread in the Archaea than previously thought. However, it remains unclear whether novel mcrA genes associated with the Bathyarchaeota and Verstraetearchaeota are distributed across diverse environments. We examined two geochemically divergent but putatively methanogenic regions of Yellowstone National Park to investigate whether deeply-rooted archaea possess and express novel mcrA genes in situ. Small-subunit (SSU) rRNA gene analyses indicated that Bathyarchaeota were predominant in seven of ten sediment layers, while the Verstraetearchaeota and Euryarchaeota occurred in lower relative abundance. Targeted amplification of novel mcrA genes suggested that diverse taxa contribute to alkane cycling in geothermal environments. Two deeply-branching mcrA clades related to Bathyarchaeota were identified, while highly abundant verstraetearchaeotal mcrA sequences were also recovered. In addition, detection of SSU rRNA and mcrA transcripts from one hot spring suggested that predominant Bathyarchaeota were also active, and that methane cycling genes are expressed by the Euryarchaeota, Verstraetearchaeota, and an unknown lineage basal to the Bathyarchaeota. These findings greatly expand the diversity of the key marker gene for anaerobic alkane cycling and outline the need for greater understanding of the functional capacity and phylogenetic affiliation of novel mcrA variants.

Archaea are the primary drivers of CH4 cycling on our planet. Each year, methanogens produce approximately 1 Gt of CH4, of which 80% is subsequently oxidized anaerobically by methanotrophic archaea in the same habitats1. Because of its importance as an intermediate in the biological carbon pump2, 3 and its role as a potent greenhouse gas, it is crucial to understand microbial sources and sinks of CH4. The discovery of near-complete methanogenesis pathways in the genomes from the recently described Bathyarchaeota4 and Verstraetearchaeota5 has drawn into question our view, held for over four decades, that the ability to generate or oxidize CH4 anaerobically is limited to a single archaeal phylum, the Euryarchaeota. Both Bathyarchaeota and Verstraetearchaeota lack cultured representatives, but are widespread across diverse environments5–7. Moreover, recent analysis of Candidatus Syntrophoarchaeum implicates the involvement of the mcr complex in butane metabolism8, underscoring the need for further investigations into coenzyme M reductase functions and archaeal carbon cycling. Although geothermal systems are hypothesized as life’s first habitat/s9 and methanogenesis is thought to be an ancient metabolism10, 11, very little evidence of methanogenesis has been reported in Yellowstone National Park (YNP, Wyoming, USA), the planet’s largest and most diverse terrestrial geothermal ecosystem. While investigations of methane cycling in other continental geothermal ecosystems have recovered mcrA genes and enrichments of methanogens12, 13, only one methanogen, Methanothermobacter thermautotrophicus, has been isolated from the YNP ecosystem14. Further, metagenome studies in YNP have not yet identified habitats containing diverse mcrA (α-subunit of methyl coenzyme M reductase) genes15. Here, we investigated the possibility that

1Center for Biofilm Engineering, Montana State University, Bozeman, MT, 59717, USA. 2Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, MT, 59717, USA. 3Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT, 59717, USA. 4Department of Microbiology and Immunology, Montana State University, Bozeman, MT, 59717, USA. Correspondence and requests for materials should be addressed to L.J.M. (email: [email protected]) or M.W.F. (email: matthew.fields@biofilm.montana.edu)

Scientific Reports | 7:7252 | DOI:10.1038/s41598-017-07354-x

Site    Temperature (°C)    pH     Source gases (%)
                                   CO2              H2              CH4
WS0     64                  6.5    93.81 ± 2.93     2.42 ± 0.09     6.00 ± 0.19
WS1     77                  5.8    85.21 ± 5.40     2.94 ± 0.05     4.82 ± 0.08
WS3     91                  7.0    nd               nd              nd
HL9     61                  8.6    3.76 ± 0.19      0.01 ± 0.00     1.18 ± 0.02
HL10    34                  8.2    31.66 ± nd       0.02 ± nd       0.75 ± nd
HL11    73                  8.5    nd               nd              nd

Table 1. Physicochemical attributes of Washburn (WS) and Heart Lake Geyser Basin (HL) hot springs. Temperatures and pH values are listed for each hot spring sampled at WS and HL. Geothermal source gas bubbles were captured and analyzed for the presence of CO2, H2, and CH4 (±values are equal to the standard deviation of the mean percentage value based on quadruplicate resampling). Time and weather constraints prevented replicate sampling at HL10. Collection of source gases was not possible at WS3, which is not fully submerged in spring fluid, and HL11, which did not produce observable bubbles (nd = not determined).

CH4 cycling may be a more dominant feature in thermal sediment communities of YNP than previously thought. We examined two chemically and hydrothermally distinct regions of YNP where methanogenesis has previously been inferred, either by successful isolation at Washburn Hot Springs (WS)14 or by recovery of taxonomic marker genes of methanogens in the Heart Lake Geyser Basin (HL)16. We characterized the archaeal communities present in several selected sites by determining the occurrence of SSU rRNA genes and transcripts. Noting that CH4 cycling organisms are commonly observed in subsurface environments, we investigated surface as well as subsurface thermal sediments for euryarchaeotal- and bathyarchaeotal-like mcrA genes using newly developed PCR primers that were designed to amplify a greater diversity of known mcrA sequences based on recently reported metagenomes4.

Results and Discussion

Physicochemical attributes of thermal features. Two geographically and geochemically distinct thermally active regions of YNP (Washburn Hot Springs (WS) near Mt. Washburn and Heart Lake Geyser Basin (HL) at the base of Mt. Sheridan) were evaluated for archaeal community structure and the occurrence and expression of novel mcrA sequences. WS and HL lie along the margin of the most recent (0.6 Ma) caldera, with WS on the north-eastern border and HL on the south-central border17. WS is a vapor-dominated acid-sulfate system where subsurface carbonate-rich steam heats and reacts with spring fluid to yield anoxic pools replete with sodium bicarbonate, sulfide and ammonium. The high concentration of NH4+ at WS is due to hydrothermal fluid contact with marine sediments during migration and is among the highest measured in YNP18, 19. Discharge pools at WS span a temperature range of 64–91 °C, and the major gases of the two thermal pools sampled include H2, CH4, and up to 94% CO2 (Table 1).
By contrast, HL contains high-temperature, water-dominated systems that discharge alkaline fluids17; the three hot springs sampled at HL span a pH range of ~8.2–8.6 and a temperature range of 34–73 °C. Average CH4 concentrations at HL and WS are significantly different from one another based on one-tailed equal and unequal variance t-tests (p = 1.95 × 10^−10 and p = 2.17 × 10^−6, respectively), but they fall within the same order of magnitude, between 1 and 6% of source gas concentrations, and are thus proportionally similar. Oxygen availability can be inferred from measurements of dissolved sulfide (DS), which indicate that HL and WS are geochemically divergent: DS values reported previously for the springs sampled at HL were below detection16 (supplementary text) whereas values from WS0 range from 160 μM15 to 231 μM DS (Supplementary Table S3). Gas discharge at WS contains 85–94% CO2, which contrasts with HL where lower levels of CO2 (Table 1) are likely replaced by N2 (not measured). Elevated CO2 at WS contributes to carbonate-buffering and a lower pH range of 5.8–7.0 as compared to the alkaline (pH 8.2–8.6) springs of HL. Previous15, 16 as well as the present geochemical measurements from WS and HL detected the necessary substrates for hydrogenotrophic methanogenesis, which is consistent with the isolation of the CO2-reducing methanogen, M. thermautotrophicus, from WS14. These observations, together with the previous detection of sequences related to hydrogenotrophic, acetoclastic, and methylotrophic methanogens at HL16, indicate that both geothermal systems are appropriate sites for a targeted investigation of archaea with metabolic potential for CH4 cycling.
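The equal- and unequal-variance t-tests mentioned above can be illustrated with a minimal sketch that works from summary statistics. This is not the authors' computation: the grouping of replicates behind the reported p-values is not given in this section, so the example simply compares the WS0 and HL9 CH4 values from Table 1, and it assumes the ± values can be treated as sample standard deviations of n = 4 replicates. The function names are hypothetical. Only the t statistics and degrees of freedom are computed; a p-value additionally requires the t distribution (for example via a statistics library).

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance (Welch) t statistic and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (Student) t statistic and its df."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# CH4 as % of source gas (Table 1): mean, SD (assumed), n replicates
ws0 = (6.00, 0.19, 4)
hl9 = (1.18, 0.02, 4)

t_welch, df_welch = welch_t(*ws0, *hl9)
t_pooled, df_pooled = pooled_t(*ws0, *hl9)
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.2f}")
print(f"Pooled: t = {t_pooled:.2f}, df = {df_pooled}")
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, which is one reason the two tests can return different p-values for the same data.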

SSU rRNA gene survey of Archaea. DNA was extracted from 10 sediment samples (Supplementary Table S1) and used for amplification of the V4-V6 region of SSU rRNA genes with archaeal-specific primers, which was subsequently sequenced on the MiSeq platform. The most abundant sequences recovered were related to microorganisms within the Bathyarchaeota, Aigarchaeota, and Thermoproteales (phylum Crenarchaeota) (Fig. 1). Members of the Thermoproteales were highly abundant in all springs at WS, but were more enriched (>90% of total sequences) at higher temperatures (WS1 and WS3 versus WS0). Thermoproteales-like sequences were not detected in any of the HL hot springs. Aigarchaeotal OTUs were recovered from every hot spring in both regions except WS1 and were the dominant phylum recovered from HL11 (>40%). In contrast to previous reports of traditional methanogenic groups inhabiting WS and HL14, 16, the only traditional methanogenic group detected in our SSU rRNA gene library was the Methanomassiliicoccales family, found in a single hot spring (HL10). High-temperature (73–77 °C) springs from WS and HL were dominated by different archaeal sequences: HL11 was characterized by sequences from the Aigarchaeota, Thaumarchaeota, and Verstraetearchaeota, while

Figure 1. 16S rRNA gene diversity of Archaea in physicochemically and regionally distinct Yellowstone hot spring sediments. Taxonomic groupings of archaeal OTUs representing >0.1% of the sequences recovered from Washburn Hot Springs (WS) and Heart Lake Geyser Basin (HL). At one spring from each region (WS0 and HL9), a sediment push core was sectioned in three layers for separate microbial analysis, and the sediment depth in cm is labeled on each respective section of the circle chart with the deepest section closest to the center. Photographs of each hot spring are adjacent to diversity charts, and white circles indicate the location of sampling. Taxonomy is grouped by phylum and the next highest resolution taxonomic group possible is listed beneath the phylum as a separate color (c = class; o = order; f = family; g = genus). For Bathyarchaeota, MCG subgroups are listed according to classifications by Kubo and Lloyd et al. (2012) and Lazar et al. (2015), except for MCG-20, which is proposed in this study (Supplementary Figure S1). The map in this figure was obtained from the public national park maps available from the US National Park Service (https://www.nps.gov/hfc/carto/PDF/YELLmap2.pdf; licensing: https://www.nps.gov/hfc/carto/data-sources.cfm) and simplified with Adobe Illustrator CC 2015 (Adobe Systems; www.adobe.com/illustrator).

WS1 contained high amounts of different members within the (phylum Crenarchaeota). Differences in the relative abundance of specific archaea between thermal regions are consistent with the marked differences in geochemistry and subsurface hydrothermal sources. Although the occurrence of Thermoproteales, Thaumarchaeota, , and Euryarchaeota varied considerably across the lower-temperature hot springs sampled (i.e., <70 °C), 50–90% of recovered sequences from both thermal regions (i.e., WS0, HL9, and HL10) were members of the Bathyarchaeota (Figs 1 and 2). These hot springs represent a temperature range of 34–65 °C, a pH range of 6.5–8.6, diverse source gas compositions (Table 1), and distinct hydrothermal and geochemical conditions15, 16 including DS content and O2 availability (Supplementary Table S3). Despite these diverse physicochemical attributes, seven samples were dominated by bathyarchaeotal subgroups MCG-2, -6, -15 (refs 20, 21), and -20 (Supplementary Figure S1; Fig. 1). However, closer examination reveals that distinct bathyarchaeotal phylotypes characterize WS versus HL (Fig. 2). Moreover, two subgroup-level lineages (MCG-10 and MCG-14; refs 20, 21) were recovered in greater abundance from the HL springs relative to WS0 (Fig. 1). Only a small fraction (<0.1%) of Bathyarchaeota-like sequences was recovered from hot springs above 70 °C (i.e., WS1, WS3, and HL11). Considering the broad global and ecological distribution of members within the Bathyarchaeota7, 20, 22, 23, it is not surprising that different members of this phylum are distributed across distinct hot spring environments, a distribution that is supported by their potential to metabolize a broad range of substrates6, 24. Previous studies have concluded that many carbon sources may sustain growth of different Bathyarchaeota, including acetate24, 25, protocatechuate26, glycine, urea, lipids24, extracellular plant-derived carbohydrates, and proteins6, 27.
Further, Bathyarchaeota have consistently been affiliated with the degradation of complex organic matter in sedimentary environments20, 26, 28, 29. The hot springs sampled in this study are within topographic depressions and are surrounded by vegetated soils (Fig. 1); grasses were observed in the cores of both HL9 and WS0. Ecophysiological relationships between certain bathyarchaeotal subgroups and plant-derived complex organics27 have been noted, which suggests that these archaea may be metabolically linked to photosynthetically-derived organic matter. Further, previous measurements of dissolved organic carbon (DOC) at WS0 indicated concentrations up to 330 μM30 and in HL9 DOC was detected at concentrations as high as 450 μM at the time of sampling (Supplementary Table S3). In part, these observations may explain the high abundance of Bathyarchaeota in these specific springs. Canonical Correspondence Analysis (CCA) was used to examine the distribution of phylotypes among sites as related to major physicochemical characteristics (temperature, pH, and DS) (Fig. 3). A strong division between

Figure 2. Phylogenetically resolved heat map of OTU occurrence and beta diversity across sampling sites. The relative abundance of OTUs representing >0.5% of the archaeal community is shown as a gradient from black (high abundance) to white (low abundance). A comparison of phylogenetic diversity for each sampling site is represented as a weighted UniFrac UPGMA dendrogram of jackknifed phylogenetic resampling of the rarefied OTU table, and demonstrates the major clustering patterns among hot springs from varying hydrothermal regions. OTU relationships are presented as a neighbor-joining tree with justified branch lengths in which branches from the same phylum are the same color corresponding to the legend. For clarity, the first letter of each phylum is placed at its respective node (B = Bathyarchaeota; A = Aigarchaeota; T = Thaumarchaeota; C = Crenarchaeota; V = Verstraetearchaeota; K/E = Korarchaeota/Euryarchaeota). Following the sample IDs, the designations “TC”, “MC”, and “BC” stand for “top core”, “middle core”, and “bottom core”, respectively. NB: The paraphyletic groupings of Crenarchaeota and Verstraetearchaeota are resolved in the full phylogenetic tree that includes long fragment reference sequences (Supplementary Figure S1).

the HL and WS geothermal regions was apparent whether only surface samples or only core samples were considered, which demonstrates their widely divergent physicochemical structure. This structure is represented clearly by contrasting directional assignments of temperature, pH, and DS along the major differential distribution pattern of OTUs from HL versus WS. Site WS1 falls outside of this pattern and contains OTUs belonging to the family Thermoproteaceae, which appear to be controlled by unmeasured variable(s). HL11 is surrounded by a tight cluster of OTUs related to the Aigarchaeota, Thaumarchaeota, and Verstraetearchaeota that also fits within the major physicochemical distribution pattern. Interestingly, while other phyla seem controlled by regional differences, the Aigarchaeota and Bathyarchaeota include OTUs that span the range of thermal and chemical variables. The ability of Bathyarchaeota to span such a wide range of conditions is consistent with the wide range in environmental distribution that has been repeatedly demonstrated for this phylum7, 20, 21, 23. However, while phylum-level occurrence spans a wide physicochemical range, OTU-level occurrence indicates that different groups within this

Figure 3. Canonical Correspondence Analyses of phylogenetic distribution and major geochemical parameters. Surface samples are compared to temperature, pH, dissolved sulfide (DS) and plotted with OTU distribution (A). Top, middle, and bottom sediment layers from core samples are compared to temperature, pH, DS, and porewater CH4 concentrations and plotted with OTU distribution (B). OTUs representing >0.5% sequence abundance are indicated by circles. Colors correspond to those in Fig. 1. White diamonds indicate sample sites. Black arrows indicate physicochemical parameters.

Figure 4. Phylogeny and occurrence of Bathyarchaeota-like (orange and yellow), verstraetearchaeotal (pink), and euryarchaeotal (other colors) McrA proteins. (A) Bayesian analysis of McrA phylogeny based on 150 amino acid positions. Heavy weighted branches indicate sequences recovered in the present study. Asterisks denote mcrA OTUs that were also recovered as RNA transcripts. Posterior probabilities are indicated by black, grey, or white circles at each node corresponding to ≥90%, ≥80%, and ≥70%, respectively. (B) Site-specific relative abundance of mcrA OTUs produced by traditional and newly designed primer sets. Following the sample IDs, the designations “TC”, “MC”, and “BC” stand for “top core”, “middle core”, and “bottom core”, respectively. Total sequence number for each sample is given for each pie chart. No mcrA genes were amplified from HL10 with newly designed primers (ND = not determined). DpBrnchGuay = deeply-branching Guaymas group. The scale bar corresponds to 0.1 amino acid substitutions per site.

phylum are distributed along gradients of temperature, pH, and DS. This observation underscores the importance of not generalizing the physiology or ecological preference at the phylum level, especially for a phylum as widespread and diverse as the Bathyarchaeota. CCA was also performed on core samples to understand the distribution patterns as a function of depth into hot spring sediments. The strict delineation between HL and WS regions is again apparent, which suggests that the underlying hydrothermal and geochemical regimes are influencing microbial diversity. Detailed inspection reveals that the WS site contains a higher abundance of different archaeal phyla, whereas the sediment layers at HL are dominated by different OTUs from the Bathyarchaeota. While Bathyarchaeota is the predominant phylum at both sites, this predominance is represented by only a few OTUs in WS0 sediments versus several OTUs in HL9. CH4 concentrations in sediment porewaters are higher in WS0 than HL9 (Fig. 3B, Supplementary Figure S3), which is correlated with the differential regional distribution patterns. This may reflect more microbial methanogenesis in WS0 sediments, but additional data are needed to assess the proportions of geologically and biologically generated CH4 at both sites.

SSU sequence recovery from reverse transcribed RNA. Total RNA was recovered from the surface sediments of HL9, and the V4-V6 region of SSU rRNA was reverse transcribed, amplified, and sequenced for comparison with the DNA-derived archaeal community (Supplementary Figure S4). The total proportion of the archaeal community captured by OTUs representing >0.1% was only 72% for the DNA survey and 97% for RNA. Shannon-Wiener diversity estimates for OTU libraries from RNA versus DNA (rarefied to equal sequence depth) were 5.6 and 7.8, respectively. This is consistent with the fact that RNA is more representative of active cells31, while DNA may remain after cell death32. Alternatively, these discrepancies might be explained by variations in extraction efficiencies of RNA versus DNA, or different RNA:DNA ratios across taxa. While some archaeal groups were underrepresented in the DNA pool (e.g., Thaumarchaeota, Aenigmarchaeota; supplementary text), both RNA and DNA amplicon libraries at HL9 were dominated by Bathyarchaeota (82% and 52%, respectively) belonging to subgroups MCG-2, -6, -10, -14, -15 (refs 20, 33), and -20 (present study; Supplementary Figure S1). Consequently, results from the RNA library of HL9 support the conclusion that Bathyarchaeota are not only present, but also predominate among the active microbial populations.
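The diversity comparison above rests on three simple calculations: rarefying each library to a common depth, computing a Shannon index, and summing the share of reads held by OTUs above an abundance cutoff. The sketch below shows one way to do this. It is an illustration only: the function names and the OTU counts are hypothetical, not the study's data, and the logarithm base (2 here) is an assumption because the section does not state which base was used.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample reads without replacement down to a fixed depth."""
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads")
    picked = random.Random(seed).sample(pool, depth)
    out = [0] * len(counts)
    for otu in picked:
        out[otu] += 1
    return out


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with reads."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def fraction_captured(counts, threshold=0.001):
    """Share of all reads belonging to OTUs above a relative-abundance cutoff."""
    total = sum(counts)
    return sum(c for c in counts if c / total > threshold) / total


# Hypothetical OTU read counts for two libraries of unequal depth
dna_counts = [500, 300, 150, 40, 10, 5, 5]
rna_counts = [800, 150, 50]

depth = min(sum(dna_counts), sum(rna_counts))
for name, counts in (("DNA", dna_counts), ("RNA", rna_counts)):
    sub = rarefy(counts, depth)
    print(name, round(shannon_index(sub), 2), round(fraction_captured(sub, 0.01), 3))
```

A library dominated by a few OTUs yields a lower index than an even one of the same richness, which is the pattern reported for the RNA versus DNA libraries.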

Occurrence of mcrA genes and transcripts. We used previously published mcrA gene primers that targeted euryarchaeotal mcrA gene sequences34, and we designed new mcrA primers based on recently discovered

bathyarchaeotal mcrA genes4 to examine the occurrence of mcrA genes across different geothermal springs (Fig. 4). Positive mcrA amplification was achieved from the top layer of WS0, the top, middle, and bottom layers of HL9, and the surface sediments of HL10. Amplified mcrA sequences were not recovered from HL11, WS1, and WS3, which is consistent with the lack of SSU rRNA gene sequences of known methane cycling taxa from these samples (Fig. 1). These results may also suggest that mcrA genes are limited to environments less than approximately 70 °C, or that lower abundance of this gene in high temperature samples precluded recovery via amplification. We recovered 169,736 and 128,949 mcrA sequences using traditional and newly-designed primers, respectively, as identified by comparison to our collection of currently published mcrA sequences (e-value < 0.01). In contrast to highly divergent bathyarchaeotal mcrA genes, mcrA sequences associated with the recently proposed Verstraetearchaeota35 are relatively similar to those of the Euryarchaeota. This causes verstraetearchaeotal McrA proteins to appear as a euryarchaeotal subgroup35 (Fig. 4), and explains why mcrA sequences from both phyla were amplified by the same primer set. We grouped mcrA OTUs at 90% nucleotide identity for sequences amplified by traditional primers, and at 97% for novel mcrA sequences to populate the disproportionately sparse side of the McrA tree (Fig. 4). This definition yielded 35 mcrA OTUs related to Euryarchaeota and/or Verstraetearchaeota and 61 mcrA OTUs which were basal to bathyarchaeotal sequences published by Evans et al. (2015). Previous reports of euryarchaeotal mcrA recovery from terrestrial geothermal systems include two clones from the order from Tunisian hot springs12 and mcrA sequences related to several methanogenic genera from hot springs in Kamchatka, Russia and São Miguel Island in the Azores36.
The latter were related to the Methanothermobacter, Methanothermus, Methanothrix, Methanomassiliicoccales, Methanocellales, Methanomethylovorans, and the uncharacterized group, MCR-2A36, 37. In our study, mcrA sequences were recovered from euryarchaeotal orders Methanocellales, , Methanobacteriales, and . Within the Methanosarcinales, we detected an mcrA OTU with 92% sequence identity to that of Methanomethylovorans hollandica (DSM15978). This, together with our recovery of mcrA sequences from the Methanocellales, supports the recent conclusion that these methanogens can be found in terrestrial geothermal springs36. Additionally, we observed high relative abundance of mcrA sequences related to the uncharacterized deeply-branching Guaymas (DBG) group III, and detected group II in lower relative abundance. DBG mcrA clusters, which were discovered in a hydrothermal seep environment at a depth of 2000 m in the Gulf of California38, are functionally uncharacterized and have not been recovered previously from terrestrial systems. We did not detect sequences belonging to the MCR-2A group36, 37 (nucleotide identity <77%), which indicates that considerable differences exist in the ecological distribution of methanogens across globally separated terrestrial geothermal environments. The most abundant mcrA OTU amplified using traditional primers was most similar to mcrA sequences belonging to the recently proposed Verstraetearchaeota35; this OTU was observed in every sample where mcrA sequences were successfully amplified. However, no SSU rRNA gene sequences affiliated with the Verstraetearchaeota were detected in these hot springs (i.e., WS0 and HL9). In contrast, the SSU rRNA gene library from HL11 contained a large proportion of SSU rRNA gene sequences belonging to the Verstraetearchaeota, yet no verstraetearchaeotal mcrA sequences were recovered.
These results suggest that (i) additional studies are necessary to determine whether some members of Verstraetearchaeota encode mcrA gene variants that are phylogenetically distinct from those recently described35, (ii) not all members of the Verstraetearchaeota harbor mcrA genes, and/or (iii) most SSU rRNA gene primers currently used in next generation taxonomic diversity studies severely underestimate archaeal diversity39. Previously identified Verstraetearchaeota were inferred to be methylotrophic methanogens capable of utilizing methanol, methanethiol, and methylated amines35, and we hypothesize these could be important metabolic modes in YNP hot springs as well. In surface sediments of HL9, 43% of mcrA sequences belonged to an OTU most similar to the Methanosaeta, one of two known groups capable of acetoclastic methanogenesis40. The next most abundant OTU, which contained 49% of the mcrA sequences from WS0 surface sediments, was related to the DBG group III, a functionally uncharacterized mcrA clade originally discovered in the hydrothermal sediments of Guaymas Basin38, 41. Another Verstraetearchaeota-like sequence was recovered in high abundance in the HL sediments, and the most abundant mcrA OTU at the cooler hot spring (HL10) grouped within the Methanomicrobiales, which harbors hydrogenotrophic, formate-utilizing methanogens42. Despite the cultivation of Methanothermobacter thermautotrophicus from WS0 sediments14, we observed low relative abundance of mcrA sequences belonging to the Methanobacteriales order. Instead, traditional mcrA primers suggest that WS0 is dominated almost entirely by members of the Verstraetearchaeota and the DBG III group38. We recently cultivated M. thermautotrophicus from WS0 using the same sediment samples (unpublished data), which indicates that this organism may be easily captured by classical cultivation methods for hydrogenotrophic methanogens, but does not represent the ecologically predominant population.
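The OTU definitions used earlier in this section (90% nucleotide identity for sequences from traditional primers, 97% for the novel sequences) can be illustrated with a minimal greedy centroid-clustering sketch. The clustering software used in the study is not named in this section, so this is a generic stand-in rather than the authors' method. The function names and toy sequences are hypothetical, and the sequences are assumed to be pre-aligned to equal length and sorted by decreasing abundance.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns, ignoring gap-gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def greedy_otus(seqs, threshold):
    """Assign each sequence to the first centroid it matches at >= threshold.

    seqs: list of (name, aligned_sequence), most abundant first.
    Returns a dict mapping centroid name -> list of member names.
    """
    centroids = []
    clusters = {}
    for name, seq in seqs:
        for cname, cseq in centroids:
            if percent_identity(seq, cseq) >= threshold:
                clusters[cname].append(name)
                break
        else:
            # No centroid is close enough, so this sequence seeds a new OTU
            centroids.append((name, seq))
            clusters[name] = [name]
    return clusters


# Hypothetical 20-nt toy reads: b is 95% and c is 85% identical to a
toy = [
    ("a", "ACGTACGTACGTACGTACGT"),
    ("b", "ACGTACGTACGTACGTACGA"),
    ("c", "TCGTACGTACTTACGTACGA"),
]
for cutoff in (90.0, 97.0):
    print(cutoff, greedy_otus(toy, cutoff))
```

A stricter cutoff splits the same reads into more OTUs, which is why applying 97% to the novel sequences populates the sparse side of the tree with more, finer-grained clusters.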
Using newly developed primers targeting recently reported bathyarchaeotal mcrA genes4, we identified two novel, deeply-branching McrA clades (YNP1 and YNP2, Fig. 4; Supplementary Figure S2). A single mcrA OTU from the novel YNP1 clade was dominant at all depths of the HL9 sediment core, whereas a single OTU from the YNP2 clade was dominant in the surface sediments of WS0. These results are consistent with the major geochemical differences between HL and WS (e.g., pH and sulfide content) that likely result in the enrichment of distinct CH4 cycling archaea. However, lower abundance of YNP1 and YNP2 mcrA sequences was observed in both regions, precluding regional isolation of these novel mcrA clades. Considering the closest phylogenetic relatives to mcrA OTUs recovered from the present study, it appears that all modes of methanogenesis (i.e., methylotrophic, hydrogenotrophic, and acetoclastic) may be possible by multiple archaeal phyla in YNP hot spring sediments. Notably, only one of three newly designed mcrA primer sets was successful in amplification (Supplementary Table S2), and this primer set only targeted half of the recently reported mcrA genes. Hence, it is probable that

more than three major McrA clades related to Bathyarchaeota (Fig. 4) have yet to be discovered. Further, while mcrA genes within the YNP1 and YNP2 clades were amplified by primers designed from sequences associated with bathyarchaeotal genomic assemblies4, the amino acid identities of translated amplicons to the bathyarchaeotal McrAs are only 57% and 68%, respectively. This indicates that YNP1 and YNP2 McrA clades may not belong to the Bathyarchaeota. Illustrating this point, McrAs that are established as belonging to distinct phyla (e.g., sequences from Verstraetearchaeota V4 (ref. 5) and Methanomassiliicoccales (phylum Euryarchaeota)) share an amino acid identity of 68%. Thus, attributing YNP1 and YNP2 mcrA sequences to the Bathyarchaeota is not prudent until further investigation can test their taxonomic relationship. Moreover, given the relatively large phylogenetic distance between McrA clades, as well as the recently elucidated butane-processing capacity of Ca. Syntrophoarchaeum8, physiological function cannot be assumed based on phylogenetic position alone. Lastly, three of the 96 total mcrA OTUs were also recovered in the reverse-transcribed RNA library from HL9. A single mcrA transcript grouped within the novel YNP2 clade, while the most abundant mcrA transcript from traditional primers was related to the Verstraetearchaeota (asterisks, Fig. 4). Transcripts were also recovered for another OTU too distantly related to the anaerobic methanotroph group 1 (ANME-1) for functional inference of methanotrophy, underscoring the need for further studies of methane cycling diversity in hot springs. The Verstraetearchaeota-like transcripts were represented by the most abundant OTU from the DNA library while the deeply-branching ANME-1 and YNP2 transcripts were recovered in very low abundance in the DNA library. These data indicate that novel and deeply-rooted mcrA sequences are present and actively expressed in hot spring sediments from distinct geothermal regions of YNP.
Future investigations of physicochemical influences, metagenome assemblies, and carbon substrate utilization are needed to determine the capacity for methane cycling in novel archaea from high-temperature environments.

Conclusion

We investigated physicochemically diverse thermal sites in YNP and observed an abundance of archaeal SSU rRNA gene sequences related to the Bathyarchaeota, Aigarchaeota, Crenarchaeota, Verstraetearchaeota, and Thaumarchaeota. Representatives of the Bathyarchaeota were highly abundant in both DNA and RNA libraries, indicating that members of this lineage are predominant and metabolically active in at least three geochemically and hydrothermally distinct hot springs. mcrA gene sequences from the Verstraetearchaeota and Euryarchaeota, and two novel clades basal to the Bathyarchaeota4, were observed across five distinct hot spring sediment samples using PCR primers designed to target previously unknown mcrA genes as well as well-established primers. In addition, mcrA transcripts from diverse archaeal phyla were amplified from cDNA and for the first time demonstrate that novel, putative methane cycling genes are expressed in the environment. These data corroborate the hypotheses that recently discovered, widely distributed lineages4, 23, 35 might have been overlooked for their role in environmental methane cycling. The presented data greatly expand the diversity of a key biomarker for methanogenesis, anaerobic methane oxidation, and higher alkane utilization, and confirm that at least two archaeal phyla likely contribute to global methane fluxes.

Methods

Site selection and sample collection. Washburn Hot Springs (WS) and Heart Lake Geyser Basin (HL) contain thermal features that span a wide range of physicochemical variables15, including temperatures from 34–91 °C, pH values from 5.8–8.6, and dissolved sulfide concentrations from below detection (HL) to 231 µM (WS).
While the geochemical parameters that characterize WS and HL are widely divergent (Table 1; Supplementary Table S3), these geothermal regions were selected because they are, to our knowledge, two of three sites in YNP where methanogenesis has been suggested14, 15. Surface sediments (to 1 cm) were collected from two hot springs at WS (WS1 and WS3) and two springs at HL (HL10 and HL11) in 50 ml Falcon tubes. Sediments from a third hot spring from each region that exhibited similar temperature near 60 °C (WS0 and HL9) were sampled by PVC push coring. Three sections from each core (top, middle, and bottom) were subsampled for DNA and porewater gas analysis (Supplementary Figure S3). For DNA analyses, 50 ml of sediment was immediately placed on dry ice and subsequently stored at −80 °C in the lab until extraction, and for RNA, sediments were preserved in RNAlater solution (1:1) at the time of sampling and stored at 4 °C until extraction. Geothermal source gases were sampled from hot spring bubblers in quadruplicate at WS0, WS1, HL9 and, due to weather and time constraints, only once at HL10. This sampling method was not possible at WS3 or HL11, because the water depth was not sufficient to immerse serum bottles. Gas samples of 240 ml were stored in pre-evacuated 160 ml serum vials with 1 ml 2.5 M sodium azide and stored upside down until analysis. Sediments from core sections were sampled for porewater CH4 concentrations by injecting 5 ml sediment into a 30 ml serum vial containing 2 ml of 1 M NaOH; vials were then sealed with blue butyl stoppers, shaken vigorously, frozen on dry ice, and stored upside down at −20 °C until laboratory analysis43. In an effort to make all analyses (geochemical and molecular biological) on identical portions of sediment, core sediments were slurried in 5-cm sections for a total of 10 sediment samples per core (Supplementary Table 1).
Prior to sediment coring, a thermal profile was retrieved adjacent to the coring location. The thermocouple wire ends from a Fluke model 52–2 (60 Hz) dual input digital thermometer were attached to the end of a wooden rod marked with 5-cm hashes, and the rod was pushed into the sediments, stopping for 1 min at each 5-cm interval to record temperatures.

Gas collection and analyses. A Varian micro gas chromatograph (CP2900) equipped with molecular sieve and chemical separation columns (thermal conductivity detection) was used to determine concentrations of H2, CH4, carbon monoxide, and CO2 in gas samples obtained from geothermal source pools, and of CH4 only in porewater gases.

Scientific RepOrtS | 7:7252 | DOI:10.1038/s41598-017-07354-x 8 www.nature.com/scientificreports/

DNA and RNA extraction and amplification. DNA was extracted from approximately 10 g of sediment using the PowerMax Soil DNA kit from MoBio (Carlsbad, California, USA). Samples from which DNA yield was low were re-extracted with the FastDNA Spin Kit for Soil from MP Biomedicals (Santa Ana, California, USA). The SSU rRNA gene was amplified using a previously published PCR protocol15 with the Bullseye Taq DNA polymerase from MidSci (St. Louis, Missouri, USA) and archaeal primers 751F and 1204R44 with Illumina MiSeq adapter sequences on the 5’ ends. Euryarchaeotal mcrA genes were targeted using previously tested primers developed from a database of 5200 mcrA sequences and amplified by touchdown PCR34. In the present study, these primers were found to amplify newly discovered verstraetearchaeotal mcrA genes as well as those from the Euryarchaeota, and thus have a broader coverage than previously reported34. For detection of novel mcrA genes related to the Bathyarchaeota we manually designed primers in ARB v6.0.646 based on the 19 currently available sequences from the supplementary information of recent work by Evans et al. (2015; Supplementary Text). We targeted three distinct phylogenetic subgroups related to bathyarchaeotal mcrA genes (Supplementary Table 2; Supplementary Text).
Consistent with reports from the literature that successful mcrA amplification often necessitates cumulatively high cycle numbers following touchdown PCR (e.g., 35–40 cycles46, 40 cycles38, 45 cycles47), we used Bullseye Taq DNA polymerase in amplification of Bathy-mcrA group 2 genes with the following touchdown PCR protocol: initial denaturation and enzymatic activation at 95 °C for 15 min; 2 cycles of 20 sec at 95 °C, 30 sec at 52 °C, and 45 sec at 72 °C; 2 cycles of 20 sec at 95 °C, 30 sec at 50 °C, and 45 sec at 72 °C; 2 cycles of 20 sec at 95 °C, 30 sec at 48 °C, and 45 sec at 72 °C; 2 cycles of 20 sec at 95 °C, 30 sec at 46 °C, and 45 sec at 72 °C; 32 cycles of 20 sec at 95 °C, 30 sec at 45 °C, and 45 sec at 72 °C; and a final extension of 5 min at 72 °C. The only successful amplification was achieved with the group 2 primers (Bathy_mcrA_2/3 F and Bathy_mcrA_2 R; Supplementary Table 2). Due to the high sequence divergence between McrAs from Ca. Syntrophoarchaeum (Laso-Pérez et al., 2016) and Bathyarchaeota (Evans et al., 2015), exhibiting a closest possible amino acid identity of 48%, the Syntrophoarchaea were not targeted by our primer sets. PCR products of the correct length (ca. 450 bp for SSU rRNA genes and ca. 500 bp for mcrA genes) were excised from a 1% tris-acetate-ethylenediaminetetraacetic acid gel, purified using the Wizard Gel/PCR cleanup kit from Promega (Madison, WI, USA), and stored at −80 °C prior to sequencing. In some cases, non-specific primer binding yielded an additional band at 750 bp; when sequenced, this product was identified as a member of the tRNA-guanine-transglycosylase (TGT) complex. Interestingly, this particular TGT protein was related to previously sequenced bathyarchaeotal TGTs, which suggests proximal association between TGT and mcrA loci. Prior to RNA extraction, two parts of RNAlater-preserved sediment solution were diluted in one part 1X PBS and centrifuged for 5 minutes at 6,000 × g.
The supernatant was discarded and total nucleic acids were isolated per previously published methods48. From the total RNA/DNA pool, RNA was subsequently purified starting from Step 6 in the manufacturer’s protocol of the FastRNA Pro Soil Direct kit from MP Biomedicals (Santa Ana, CA, USA). To eliminate DNA contamination, samples were then treated with DNase using the Turbo DNA-free kit from Thermo Fisher Scientific with the suggested protocol for “rigorous” DNase treatment. This RNA isolation method was attempted on multiple replicates of every sediment sample but only yielded quantifiable RNA from the surface sediments of HL10 and the top of the HL9 sediment core. On RNA from the top of HL9, the absence of DNA was confirmed by performing PCR amplification of SSU rRNA genes on an aliquot of RNA using the previously mentioned protocol and positive controls with DNA. The remaining RNA was reverse-transcribed by first strand synthesis with archaeal SSU rRNA primers mentioned above and the SuperScript III kit from Thermo Fisher Scientific according to the manufacturer’s protocol, with the exception that RNasin (Promega, Madison, WI, USA) was used instead of RNaseOUT but at the same suggested concentration. PCR amplification of reverse-transcribed cDNA was then performed as it was for DNA extractions (above) with the exception that MgSO4 was added to a final concentration of 1.5 mM to offset the removal of Mg2+ ions by EDTA during DNase treatment. For detection of mcrA RNA transcripts, we performed primer-targeted first strand synthesis using the SuperScript III reverse transcriptase kit mentioned above. For euryarchaeotal mcrA transcripts we used the manufacturer’s protocol for “gene specific reverse transcription” together with the previously mentioned euryarchaeotal mcrA primers.
For bathyarchaeotal mcrA transcripts, a “touchdown” first strand synthesis with bathyarchaeotal primers (Supplementary Table 2) and SuperScript III reverse transcriptase was used in which we followed the gene specific protocol but, instead of incubating at 55 °C for 60 min, we incubated at 55 °C for 20 min, 50 °C for 20 min, and 45 °C for 20 min, followed by a 15 min enzyme inactivation at 70 °C. PCR amplification of mcrA cDNA was then performed as previously described for mcrA DNA.
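
The touchdown PCR program described above can be written out as data to check its cumulative cycle count against the cycle numbers cited from the literature. The following is a minimal sketch: the `Stage` class, constant names, and function names are hypothetical, and only the temperatures, step durations, and cycle counts are taken from the text. Instrument ramp times are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int    # repeats of the denature/anneal/extend cycle
    anneal_c: int  # annealing temperature in degrees C


# Step durations in seconds, as stated in the protocol text.
DENATURE_S, ANNEAL_S, EXTEND_S = 20, 30, 45
INITIAL_DENATURE_S = 15 * 60  # activation at 95 degrees C
FINAL_EXTENSION_S = 5 * 60    # final extension at 72 degrees C

# Annealing temperature steps down from 52 to 45 degrees C.
TOUCHDOWN = [
    Stage(2, 52),
    Stage(2, 50),
    Stage(2, 48),
    Stage(2, 46),
    Stage(32, 45),
]


def total_cycles(stages):
    """Cumulative number of amplification cycles in the program."""
    return sum(stage.cycles for stage in stages)


def programmed_seconds(stages):
    """Programmed hold time only; ramping between temperatures is not modeled."""
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return INITIAL_DENATURE_S + total_cycles(stages) * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    print("cycles:", total_cycles(TOUCHDOWN))
    print("programmed minutes:", programmed_seconds(TOUCHDOWN) / 60)
```

Under these assumptions the program sums to 40 cycles, which falls within the 35 to 45 cycles reported in the cited studies.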

Sequencing and analysis. Purified SSU rRNA and mcrA gene amplicons (450 bp to 500 bp) were sequenced on an Illumina MiSeq platform (San Diego, CA, USA) following preparation according to the “16S Metagenomic Sequencing” protocol for 300-read paired end sequencing. Following original PCR, amplicons with MiSeq adapter sequences were purified with Ampure XP beads and then index PCR was performed to ensure identification of samples post-pooling. Indexed amplicons were purified with an additional bead cleanup and concentrations were determined for DNA aliquots stained with PicoGreen (Quant-IT, Invitrogen, Carlsbad, CA, USA). Sample concentrations were normalized and all samples were pooled and mixed with the PhiX control library at 10% of the sample DNA concentration. The mixed sample-PhiX DNA was then loaded on the Illumina MiSeq and sequenced. Forward and reverse reads for SSU rRNA genes were assembled using QIIME49 (MacQIIME v1.9.1) with the split_libraries_fastq.py command based on previously tested default parameters50, which resulted in 1,942,987 QC-passed reads. The lowest sequence count for any sample was 52,130 for WS0_TC. OTUs were picked using the open reference picking method in QIIME49, 51 and initial taxonomic assignments were made with UCLUST51 and the Greengenes reference database (gg_13_5)52. This produced an OTU table without singletons with a total sequence count of 1,900,511. Next, sequences that did not align via PyNAST53 methods were thrown out, resulting in a total of 604,842 sequences with a smallest sample size of 9,042 for WS0BC. OTUs generated from sequenced extraction blanks and negative PCR controls were pooled and used to filter out contamination-based OTUs from the total dataset, decreasing sequence counts to 582,343 and 8,087, respectively. Finally, OTUs were further filtered for chimeric sequences with UCHIME54 in Mothur55, which reduced the total archaeal sequence count from 582,342 to 543,797 sequences and resulted in a minimum of 8,080 sequences for WS0_BC. This quality-filtered dataset was then rarefied to 8,080 sequences per sample by random sampling with the single_rarefaction.py command in QIIME49. Rarefaction curves are provided to demonstrate sampling depth, OTU observations, and Shannon-Wiener56 diversity estimates for the total, non-rarefied OTU table as well as the rarefied OTU table (Supplementary Figure S6). All downstream analyses of relative abundance, phylogeny, and beta diversity were performed on the OTU dataset rarefied to 8,080 sequences per sample. PCoA plots of weighted UniFrac57 metrics were created in QIIME49 with the beta_diversity.py and make_2D_plots.py commands to demonstrate that similar patterns of beta diversity are revealed by each OTU cutoff value used in the present study. Rarefaction curves were calculated and display the number of new OTUs observed per sequence for the unrarefied and rarefied datasets, and an additional rarefaction curve is presented to demonstrate Shannon-Wiener diversity estimates per sequence for the rarefied dataset (Supplementary Figure S6). Shannon-Wiener diversity estimates were calculated in QIIME49 with the alpha_diversity.py function according to default parameters described at scikit-bio.org. Diversity estimates were calculated the same way for DNA and RNA generated libraries from HL9TC after being normalized to a maximum common sequence count of 3,609.
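
To illustrate what rarefying to a common depth and estimating Shannon diversity involve, here is a minimal, standard-library sketch. It is not the QIIME implementation: the function names and the toy OTU counts are hypothetical, subsampling without replacement is assumed, and a base-2 logarithm is assumed for the Shannon index.

```python
import math
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from {otu: count}."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return Counter(rng.sample(pool, depth))


def shannon(otu_counts, base=2.0):
    """Shannon index H = -sum(p_i * log(p_i)) over OTUs with nonzero counts."""
    total = sum(otu_counts.values())
    return -sum(
        (n / total) * math.log(n / total, base)
        for n in otu_counts.values()
        if n > 0
    )


if __name__ == "__main__":
    # Hypothetical counts for one sample; not data from this study.
    sample = {"otu_a": 50, "otu_b": 30, "otu_c": 20}
    subsample = rarefy(sample, depth=40)
    print(dict(subsample), shannon(subsample))
```

Because every sample is reduced to the same number of reads, diversity estimates computed afterwards are not confounded by differences in sequencing depth, at the cost of discarding reads from deeper samples.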
For detailed taxonomic assignments and phylogenetic relationships, reference sequences were selected based on closest BLAST identities to recovered OTUs as well as a priori knowledge of relevant type strains or well-characterized long-fragment SSU genes. OTUs representing 0.5% or more of all sequences were aligned with reference sequences using the SINA online alignment platform and the v1.2.11 database58 with default parameters. SINA-based alignments were manually inspected and edited in ARB v6.0.646. A maximum-likelihood phylogenetic tree of near full length SSU reference sequences (>1100 bp) representing as many recovered groups as possible was generated in ARB45 with the rapid bootstrap RAxML algorithm with 1000 iterations. GTRGAMMA was selected as the rate distribution model because it resulted in strong bootstrap support of previously understood phylum level lineage relationships. OTUs from this study were added to the maximum likelihood reference tree with the ARB parsimony methods45. The colored dendrogram in Fig. 2 was generated only for OTUs > 0.5% sequence recovery (no reference sequences) by ARB neighbor-joining methods45 with default parameters followed by branch transforming. UniFrac57 jackknifed beta diversity for all samples was performed for the rarefied OTU table (8,080 sequence depth) and is displayed as a weighted UPGMA tree. Weighted UniFrac57 sample diversity was calculated with the beta_diversity_through_plots.py function in QIIME49 on four different OTU tables representing the different cutoff values presented in this work (Supplementary Figure S5). CCA plots were created in Microsoft Excel v15.33 (Redmond, WA, USA) with the XLSTAT add-in v19.03 (New York, NY, USA) and default parameters for the rarefied OTU table and associated physicochemical parameters for each sample location. Each CCA was run at 1000 permutations and a significance level of 5%.
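
The 0.5% abundance threshold used above to select OTUs for alignment can be expressed as a simple filter on total sequence counts. This is an illustrative sketch only: the function name and the example counts are hypothetical, and the threshold is applied to counts summed across all samples, which is one reading of "0.5% or more of all sequences".

```python
def abundant_otus(otu_counts, min_fraction=0.005):
    """Keep OTUs whose share of all sequences is at least `min_fraction`."""
    total = sum(otu_counts.values())
    if total == 0:
        return {}
    return {
        otu: n
        for otu, n in otu_counts.items()
        if n / total >= min_fraction
    }


if __name__ == "__main__":
    # Hypothetical counts summed over all samples; not data from this study.
    counts = {"otu_a": 990, "otu_b": 6, "otu_c": 3, "otu_d": 1}
    print(sorted(abundant_otus(counts)))
```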
Forward and reverse mcrA sequences were assembled and quality-filtered in QIIME49 using the split_libraries_fastq.py command with default parameters as described above. Next, sequences that failed a BLAST v2.2.22 (National Center for Biotechnology Information, USA) search against a reference collection of euryarchaeotal, verstraetearchaeotal, and bathyarchaeotal sequences at a cutoff e value of 1e-10 were discarded. For the reference collection, we used all available mcrA sequences from Bathyarchaeota, Verstraetearchaeota, Ca. Syntrophoarchaeum, and two to three representatives from each order of known methanogens within the Euryarchaeota as well as sequences belonging to uncultured anaerobic methanotrophs. We selected a liberal e value to capture as much novel mcrA diversity as possible, but manually inspected alignments in ARB45 and discarded non-mcrA sequences following OTU picking. Euryarchaeotal mcrA OTUs were picked with UCLUST51 at a sequence similarity of 90%, and bathyarchaeotal mcrA OTUs were picked at a sequence similarity of 97% to add more phylogenetic information to the comparatively sparse bathyarchaeotal side of the phylogenetic tree. All mcrA gene sequences were translated to amino acid sequences with ORF-Predictor (http://bioinformatics.ysu.edu/tools/) and aligned with MAFFT59 to our reference collection, and chimeric assemblies were identified visually in ARB45 and discarded. Initial neighbor-joining trees were constructed in ARB to identify sequences with no close relatives, and a web-based BLAST search (National Center for Biotechnology Information, USA) was performed to retrieve closer relatives where possible. Bayesian phylogenetic analyses of amino acid McrA sequences were performed using MrBayes v3.260. To guarantee similar lengths of protein sequences, a termini filter was applied, and all characters were considered during tree reconstruction. Model-jumping between fixed rate models was allowed during protein tree calculations.
The WAG model, a protein evolution model that uses unequal but fixed stationary state frequencies and substitution rates, was found to best describe the McrA dataset. A total of 41 million trees were calculated. At the end of the analyses, the standard deviations of split frequencies between the two chains had reached a value of 0.035. Parameter values and trees were summarized using a burn-in of 25%. The final potential scale reduction factor value was 1.00. Taxonomic associations of sequenced mcrA genes were determined by neighboring reference sequences in the Bayesian tree.
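
The e-value screening step described above (discarding reads with no reference hit at or below 1e-10) can be sketched as a filter over tabular BLAST output. This is an assumption-laden illustration rather than the authors' script: it assumes the standard 12-column tab-delimited hit table with the e-value in the 11th field, and the function name and example rows are hypothetical.

```python
import csv
import io

EVALUE_COLUMN = 10  # 11th field of the standard 12-column tabular hit table


def queries_passing(handle, max_evalue=1e-10):
    """Return query IDs with at least one hit at or below `max_evalue`."""
    keep = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[EVALUE_COLUMN]) <= max_evalue:
            keep.add(row[0])
    return keep


if __name__ == "__main__":
    # Two made-up hits: read1 passes the cutoff, read2 does not.
    example = io.StringIO(
        "read1\trefA\t68.0\t150\t48\t0\t1\t150\t10\t159\t3e-40\t160\n"
        "read2\trefB\t35.0\t60\t39\t0\t1\t60\t5\t64\t0.002\t30.1\n"
    )
    print(queries_passing(example))
```

Reads absent from the returned set would be discarded; a liberal cutoff such as this keeps divergent sequences at the cost of admitting false positives, which is why the text pairs it with manual inspection of alignments.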

Data availability. The sequence datasets generated from the current study are available from the NCBI repository, under BioProject ID PRJNA393290 for SSU rRNA amplicons from DNA and RNA, and BioProject ID PRJNA393291 for mcrA amplicons from DNA and RNA.

References
1. Reeburgh, W. S. Oceanic methane biogeochemistry. Chem. Rev. 107, 486–513 (2007).
2. Conrad, R. The global methane cycle: recent advances in understanding the microbial processes involved. Environmental Microbiology Reports 1, 285–292 (2009).

3. Thauer, R. K. Anaerobic oxidation of methane with sulfate: on the reversibility of the reactions that are catalyzed by enzymes also involved in methanogenesis from CO2. Curr. Opin. Microbiol. 14, 292–299 (2011).
4. Evans, P. N. et al. Methane metabolism in the archaeal phylum Bathyarchaeota revealed by genome-centric metagenomics. Science 350, 434–438 (2015).
5. Vanwonterghem, I. et al. Methylotrophic methanogenesis discovered in the archaeal phylum Verstraetearchaeota. Nature Microbiology 1–9, doi:10.1038/nmicrobiol.2016.170 (2016).
6. Lloyd, K. G. et al. Predominant archaea in marine sediments degrade detrital proteins. Nature 496, 215–218 (2013).
7. Xiang, X. et al. Distribution of Bathyarchaeota Communities Across Different Terrestrial Settings and Their Potential Ecological Functions. Scientific Reports 7, 45028 (2017).
8. Laso-Pérez, R. et al. Thermophilic archaea activate butane via alkyl-coenzyme M formation. Nature 1–25 (2016). doi:10.1038/nature20152
9. Schwartzman, D. W. & Lineweaver, C. H. The hyperthermophilic origin of life revisited. Biochemical Society Transactions 32, 168–171 (2004).
10. Martin, W. & Russell, M. J. On the origin of biochemistry at an alkaline hydrothermal vent. Philosophical Transactions of the Royal Society B: Biological Sciences 362, 1887–1926 (2007).
11. Weiss, M. C. et al. The physiology and habitat of the last universal common ancestor. Nature Microbiology 1, 16116 (2016).
12. Sayeh, R. et al. Microbial diversity in Tunisian geothermal springs as detected by molecular and culture-based approaches. Extremophiles 14, 501–514 (2010).
13. Merkel, A. Y., Huber, J. A., Chernyh, N. A., Bonch-Osmolovskaya, E. A. & Lebedinsky, A. V. Detection of Putatively Thermophilic Anaerobic Methanotrophs in Diffuse Hydrothermal Vent Fluids. Appl. Environ. Microbiol. 79, 915–923 (2013).
14. Zeikus, J. G., Ben-Bassat, A. & Hegge, P. W. Microbiology of Methanogenesis in Thermal, Volcanic Environments. Journal of Bacteriology 143, 432–440 (1980).
15. Inskeep, W. P. et al. Phylogenetic and Functional Analysis of Metagenome Sequence from High-Temperature Archaeal Habitats Demonstrate Linkages between Metabolic Potential and Geochemistry. Front. Microbiol. 4 (2013).
16. De León, K. B., Gerlach, R., Peyton, B. M. & Fields, M. W. Archaeal and bacterial communities in three alkaline hot springs in Heart Lake Geyser Basin, Yellowstone National Park. Front. Microbiol. 4, 1–10 (2013).
17. Fournier, R. O. Geochemistry and dynamics of the Yellowstone National Park hydrothermal system. Annual Review of Earth and Planetary Sciences 17, 13–53 (1989).
18. Holloway, J. M., Nordstrom, D. K., Böhlke, J. K., McCleskey, R. B. & Ball, J. W. Ammonium in thermal waters of Yellowstone National Park: Processes affecting speciation and isotope fractionation. Geochimica et Cosmochimica Acta 75, 4611–4636 (2011).
19. Allen, E. T. & Day, A. L. Hot Springs of the Yellowstone National Park. (Washington Carnegie Institution of Washington, 1935).
20. Kubo, K. et al. Archaea of the Miscellaneous Crenarchaeotal Group are abundant, diverse and widespread in marine sediments. The ISME Journal 6, 1949–1965 (2012).
21. Lazar, C. S. et al. Environmental controls on intragroup diversity of the uncultured benthic archaea of the miscellaneous Crenarchaeotal group lineage naturally enriched in anoxic sediments of the White Oak River estuary (North Carolina, USA). Environmental Microbiology 17, 2228–2238 (2015).
22. Auguet, J.-C., Barberan, A. & Casamayor, E. O. Global ecological patterns in uncultured Archaea. The ISME Journal 4, 182–190 (2009).
23. Fillol, M., Auguet, J.-C., Casamayor, E. O. & Borrego, C. M. Insights in the ecology and evolutionary history of the Miscellaneous Crenarchaeotic Group lineage. The ISME Journal 10, 665–677 (2015).
24. Seyler, L. M., McGuinness, L. M. & Kerkhof, L. J. Crenarchaeal heterotrophy in salt marsh sediments. 8, 1534–1543 (2014).
25. Webster, G. et al. Prokaryotic functional diversity in different biogeochemical depth zones in tidal sediments of the Severn Estuary, UK, revealed by stable-isotope probing. FEMS Microbiology Ecology 72, 179–197 (2010).
26. Meng, J. et al. Genetic and functional properties of uncultivated MCG archaea assessed by metagenome and gene expression analyses. ISME J. 8, 650–659 (2013).
27. Lazar, C. S. et al. Genomic evidence for distinct carbon substrate preferences and ecological niches of Bathyarchaeota in estuarine sediments. Environmental Microbiology 18, 1200–1211 (2016).
28. Biddle, J. F. et al. Heterotrophic Archaea dominate sedimentary subsurface ecosystems of Peru. Proc Natl Acad Sci USA 103, 3846–3851 (2006).
29. Teske, A. & Sørensen, K. B. Uncultured archaea in deep marine subsurface sediments: have we caught them all? The ISME Journal 2, 3–18 (2007).
30. Jennings, R. de M. et al. Integration of Metagenomic and Stable Carbon Isotope Evidence Reveals the Extent and Mechanisms of Carbon Dioxide Fixation in High-Temperature Microbial Communities. Front. Microbiol. 8, 88 (2017).
31. Moran, M. A. et al. Sizing up metatranscriptomics. The ISME Journal 7, 237–243 (2013).
32. Dell’Anno, A. & Danovaro, R. Extracellular DNA plays a key role in deep-sea ecosystem functioning. Science 309, 2179–2179 (2005).
33. Lazar, C. S. et al. Environmental controls on intragroup diversity of the uncultured benthic archaea of the miscellaneous Crenarchaeotal group lineage naturally enriched in anoxic sediments of the White Oak River estuary (North Carolina, USA). Environmental Microbiology 17, 2228–2238 (2015).
34. Angel, R., Claus, P. & Conrad, R. Methanogenic archaea are globally ubiquitous in aerated soils and become active under wet anoxic conditions. The ISME Journal 6, 847–862 (2012).
35. Vanwonterghem, I. et al. Methylotrophic methanogenesis discovered in the archaeal phylum Verstraetearchaeota. Nature Microbiology 1–9, doi:10.1038/nmicrobiol.2016.170 (2016).
36. Merkel, A. Y., Podosokorskaya, O. A., Chernyh, N. A. & Bonch-Osmolovskaya, E. A. Occurrence, diversity, and abundance of methanogenic archaea in terrestrial hot springs of Kamchatka and São Miguel Island. Microbiology 84, 577–583 (2015).
37. Steinberg, L. M. & Regan, J. M. mcrA-targeted real-time quantitative PCR method to examine methanogen communities. Appl. Environ. Microbiol. 75, 4435–4442 (2009).
38. Lever, M. A. & Teske, A. P. Diversity of Methane-Cycling Archaea in Hydrothermal Sediment Investigated by General and Group-Specific PCR Primers. Appl. Environ. Microbiol. 81, 1426–1441 (2015).
39. Eloe-Fadrosh, E. A., Ivanova, N. N., Woyke, T. & Kyrpides, N. C. Metagenomics uncovers gaps in amplicon-based detection of microbial diversity. Nature Microbiology 1, 15032 (2016).
40. Smith, K. S. & Ingram-Smith, C. Methanosaeta, the forgotten methanogen? Trends in Microbiology 15, 150–155 (2007).
41. Lever, M. A. A New Era of Methanogenesis Research. Trends in Microbiology 24, 84–86 (2016).
42. Garcia, J.-L., Ollivier, B. & Whitman, W. B. In The Prokaryotes: Volume 3: Archaea. Bacteria: Firmicutes, Actinomycetes (eds. Dworkin, M., Falkow, S., Rosenberg, E., Schleifer, K.-H. & Stackebrandt, E.) 208–230 (Springer New York, 2006). doi:10.1007/0-387-30743-5_10
43. McKay, L. J. et al. Spatial heterogeneity and underlying geochemistry of phylogenetically diverse orange and white Beggiatoa mats in Guaymas Basin hydrothermal sediments. Deep-Sea Research Part I 67, 21–31 (2012).
44. Baker, G. C., Smith, J. J. & Cowan, D. A. Review and re-analysis of domain-specific 16S primers. Journal of Microbiological Methods 55, 541–555 (2003).
45. Ludwig, W. et al. ARB: a software environment for sequence data. Nucleic Acids Res. 32, 1363–1371 (2004).

46. Juottonen, H., Galand, P. E. & Yrjälä, K. Detection of methanogenic Archaea in peat: comparison of PCR primers targeting the mcrA gene. Research in Microbiology 157, 914–921 (2006).
47. Biddle, J. F. et al. Anaerobic oxidation of methane at different temperature regimes in Guaymas Basin hydrothermal sediments. 6, 1018–1031 (2011).
48. Yang, T. et al. Distinct Bacterial Communities in Surficial Seafloor Sediments Following the 2010 Deepwater Horizon Blowout. Front. Microbiol. 7, 77–18 (2016).
49. Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
50. Bokulich, N. A. et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat. Methods 10, 57–59 (2013).
51. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
52. DeSantis, T. Z. et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072 (2006).
53. Caporaso, J. G. et al. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26, 266–267 (2010).
54. Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C. & Knight, R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27, 2194–2200 (2011).
55. Schloss, P. D. et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75, 7537–7541 (2009).
56. Margalef, R. Information theory in ecology. General systems 3, 36–71 (1958).
57. Lozupone, C. & Knight, R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71, 8228–8235 (2005).
58. Pruesse, E., Peplies, J. & Glöckner, F. O. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 28, 1823–1829 (2012).
59. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30, 772–780 (2013).
60. Ronquist, F. et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology 61, 539–542 (2012).

Acknowledgements
The authors are grateful to the YNP Center for Resources for permits necessary to conduct research at Washburn Springs (YELL-2015-SCI-5068) and Heart Lake Geyser Basin (YELL-2015-SCI-5840). In addition, we thank Dr. Dana Skorupa for help with sampling efforts and geochemical analyses, Prof. Brent Peyton, Sara Altenburg, Hannah Schweitzer and Dr. Margaux Mesle for help with sampling efforts, Dr. Cassandre Lazar and Dr. Mark Lever for phylogenetic insights, Dr. Chiachi Hwang for sequencing assistance, and Dr. Kara De León for helpful conversation regarding the HL region. L. McKay was funded by the NASA Postdoctoral Program through the NASA Astrobiology Institute. W. Inskeep appreciates support from the Montana Agricultural Experiment Station (Project 911300). M. Fields and the development of high throughput sequencing techniques for novel environmental microorganisms were supported as a component of ENIGMA, a scientific focus area program supported by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Genomics: GTL Foundational Science through contract DE-AC02-05CH11231 between Lawrence Berkeley National Laboratory and the U.S. Department of Energy.

Author Contributions
L.J.M., W.P.I., and M.W.F. conducted field sampling. L.J.M. processed samples in the laboratory, made Figures 1, 2, 3, and refined Figure 4. R.H. generated the base tree for Figure 4. All authors contributed to writing the manuscript.

Additional Information
Supplementary information accompanies this paper at doi:10.1038/s41598-017-07354-x
Competing Interests: The authors declare that they have no competing interests.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2017

Supplementary Information

Title: Occurrence and expression of novel methyl-coenzyme M reductase gene (mcrA) variants in hot spring sediments

Authors: Luke J. McKay1,2*, Roland Hatzenpichler1,3, William P. Inskeep2, and Matthew W. Fields1,4*

Affiliations: 1Center for Biofilm Engineering, Montana State University, Bozeman, MT 59717
2Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, MT 59717
3Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT 59717
4Department of Microbiology and Immunology, Montana State University, Bozeman, MT 59717

Key words: methane cycling, Bathyarchaeota, Verstraetearchaeota, geothermal

Corresponding authors: L.J. McKay, PhD, 313 Barnard Hall, Center for Biofilm Engineering, Montana State University, Bozeman, MT 59717, [email protected]

M.W. Fields, Professor, 366 Barnard Hall, Center for Biofilm Engineering, Montana State University, Bozeman, MT 59717, [email protected]

NB: The springs at HL were previously referred to by their temperatures at the time of sampling for the investigation by De León et al. (2013): 44°C, 63°C, and 75°C, which correspond to HL10, HL9, and HL11, respectively, in this study. Notably, at 34°C HL10 was 10°C cooler when sampled for the present study than in the previous study by De León et al. (2013). At WS, the spring designated “WS0” in this investigation is the same spring sampled by Inskeep et al. (2013), where it was called “WS_18”. To our knowledge, the microbial ecosystems of the other two springs, WS1 and WS3, have not been previously investigated.

Results & Discussion

Push cores retrieved from WS0 and HL9 and cut into three sections (Supplementary Table 1) revealed remarkable archaeal community similarity downcore. UniFrac metrics of β diversity, including weighted principal coordinate analyses and jackknifed resampling of overall phylogenetic diversity within samples, confirmed that separate sections of the sediment cores more resemble environmental replicates than distinct samples (Figure 2; Supplementary Figure S4). At HL9, even when all OTUs comprised of n > 1 sequences are considered, every layer of the HL9 core as well as the sediments from HL10 cluster very tightly (Supplementary Figure S4); this indicates that the background, low-abundance microbial community is consistent throughout the cooler Heart Lake hot springs. At Washburn, the top layer of the WS0 core is tightly clustered with a replicate non-core sample from the surface sediments of WS0 while the middle and deep sediment layers do not cluster closely in principal coordinate space. In other words, differential communities exist among separate layers of the WS0 core when the larger, background community is considered. All sediment communities from WS0, HL9, and HL10 are more tightly clustered with one another than they are to WS1, WS3, and HL11 (Supplementary Figure S4), presumably resulting from marked differences in temperature. On the other hand, jackknifed phylogenetic resampling of the high-abundance community (n > 0.5%) indicates that the HL9 and HL10 samples are more similar to HL11 than they are to any Washburn samples (Figure 2). This is surprising given the predominance of Bathyarchaeota in both regions but not in HL11, but can be explained by differences between the dominant bathyarchaeotal OTUs at WS versus HL as well as intraregional similarities in non-bathyarchaeotal groups (e.g., Thermoprotei are primarily associated with WS samples).

The bathyarchaeotal majority in the two push core samples, WS0 and HL9, extends from the surface layer to the deepest layer sampled, which may reflect a wide range of electron acceptor possibilities for bathyarchaeotal metabolisms. However, the “surface layer” of the push cores represents a range of 0 – 5 cm, and thus the predominance of Bathyarchaeota may be more representative of the sediments just below the surface than of the actual phototrophic mat community at the sediment-water interface. Indeed, while the archaeal community is consistent downcore, the bacterial community at the surface was representative of a phototrophic mat community (Roseiflexus, Gloeobacter, etc.; unpublished data). These deeper layers may also serve as buffers to sources of oxidative stress (Lazar et al., 2016) and/or thermal activity where temperatures are slightly lower than values from surficial fluids (Supplementary Figure S3). The UniFrac β diversity metrics (Figure 2; Supplementary Figure S4) also indicate a clear distinction between thermal regions in which the phylogenetic diversity is more similar among all HL springs than between any HL and WS hot spring. However, the same is not true for WS, where WS0 is more similar to HL springs than to either WS1 or WS3, likely due to the predominance of the Bathyarchaeota.

In addition to sampling the top of the sediment core from WS0, we sampled surface sediments directly from the hot spring (Supplementary Table S1) to assess how different sampling techniques of the same site capture phylogenetic diversity. While the rarefaction curve for WS0TC is much steeper than for WS0 (Supplementary Figure S6), CCA (Figure 3), PCoA (Supplementary Figure S5), and UPGMA (Figure 2) analyses each separately demonstrate a tight clustering of these replicates. In addition, tightly clustered groups for top, middle, and bottom layers of sediment cores from the same figures also indicate the reproducibility of SSU rRNA recovery. Since every analysis of beta diversity demonstrates that all other samples are more distantly related than either WS0 and WS0TC or distinct layers of the same sediment core, we maintain that our sampling design was appropriate for distinguishing between phylogenetic signals from different sample sites.
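The rarefaction comparison mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes the standard analytical (hypergeometric) expectation for the number of OTUs seen in a random subsample, and the function names and OTU count vectors are hypothetical.

```python
from math import comb


def expected_otus(counts, depth):
    """Expected number of OTUs observed in a random subsample of `depth` reads."""
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denom = comb(total, depth)
    # An OTU with n reads is missed with probability C(total - n, depth) / C(total, depth).
    return sum(1 - comb(total - n, depth) / denom for n in counts if n > 0)


def rarefaction_curve(counts, step):
    """Return (depth, expected OTUs) pairs from 0 up to the full library size."""
    total = sum(counts)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)
    return [(d, expected_otus(counts, d)) for d in depths]


# Hypothetical OTU count vectors (not data from this study).
even_sample = [10] * 10            # ten equally abundant OTUs
skewed_sample = [91] + [1] * 9     # one dominant OTU plus nine singletons

for depth, richness in rarefaction_curve(even_sample, 25):
    print(depth, round(richness, 2))
```

At equal sequencing depth, the even library accumulates OTUs faster than the skewed one, which is the kind of difference in curve steepness that a rarefaction plot makes visible.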

Differential recovery of SSU rRNA groups in DNA versus RNA.

Several lineages represented > 0.1% of sequences in the RNA pool but not the DNA pool, including sequences related to the Aenigmarchaeota (formerly Deep Sea Euryarchaeotal Group), Nitrosocaldus spp. (phylum Thaumarchaeota), unclassified sequences (Crenarchaeota), and Archaeoglobus spp. (Euryarchaeota) (Supplementary Figure S4). The largest of these groups were composed of sequences within the Thaumarchaeota and Aenigmarchaeota, which represented 14.9 % and 10.7 %, respectively, of the total RNA library from the surface sediments of HL9. This indicates that these groups are more active in the community than is suggested by the relative abundance of SSU rRNA genes from DNA.
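The DNA-versus-RNA comparison described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the 0.1% threshold comes from the text, while the function names, group labels, and read counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw read counts per group into percentages of the library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {group: 100.0 * n / total for group, n in counts.items()}


def enriched_in_rna(dna_counts, rna_counts, threshold_pct=0.1):
    """Groups above the threshold in the RNA library but not in the DNA library."""
    dna = relative_abundance(dna_counts)
    rna = relative_abundance(rna_counts)
    return sorted(
        group
        for group, pct in rna.items()
        if pct > threshold_pct and dna.get(group, 0.0) <= threshold_pct
    )


# Hypothetical read counts (not data from this study).
dna_library = {"group_A": 9990, "group_B": 5, "group_C": 5}
rna_library = {"group_A": 7440, "group_B": 1490, "group_C": 1070}

print(enriched_in_rna(dna_library, rna_library))
```

Groups returned by `enriched_in_rna` are candidates for being disproportionately active relative to their gene abundance, subject to the usual caveats about rRNA copy number and extraction bias.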

Primer design for detection of novel mcrA genes

We restricted the forward and reverse loci to the close vicinity of traditional primer sites to yield a similar amplicon region for downstream alignments, and we ensured that each primer set had similar melting temperatures (Supplementary Table S4). Our primers range in length from 18 to 20 bp, and each has an even distribution of G/C bases, while none has > 3 G/C bases in the 5 bp termini. No consensus primer sequence was possible for all 19 bathyarchaeotal sequences, which resulted in three primer sets targeted at distinct bathyarchaeotal subgroups. For group 1, Bathy_mcrA_1F and Bathy_mcrA_1R primers were designed to target mcrA gene sequences KT387817, KT387819, KT387808, and KT387809. A single forward primer, Bathy_mcrA_2-3F, was designed to target bathyarchaeotal groups 2 and 3, which consist of KT387813, KT387818, KT387814, KT387816, and KT387812 (group 2), and KT387806 and KT387810 (group 3). The group 2 reverse primer, Bathy_mcrA_2R, is a perfect match to KT387813 and KT387814, and the group 3 reverse primer, Bathy_mcrA_3R, matches KT387806 and KT387810.
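The primer criteria above (18 to 20 bp, no more than 3 G/C bases in the terminal 5 bp, comparable melting temperatures) can be checked with a short sketch. The text does not state how melting temperatures were calculated, so the Wallace rule used here is an assumption, as are the function names and the example sequences, which are hypothetical and not the primers from this study. Degenerate bases are not handled.

```python
def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def wallace_tm(primer):
    """Wallace-rule melting temperature estimate: 2 * (A+T) + 4 * (G+C), in degrees C."""
    gc = gc_count(primer)
    return 2 * (len(primer) - gc) + 4 * gc


def check_primer(primer, min_len=18, max_len=20, window=5, max_terminal_gc=3):
    """Evaluate a primer against the length and terminal G/C criteria."""
    primer = primer.upper()
    if set(primer) - set("ACGT"):
        raise ValueError("only unambiguous A/C/G/T bases are supported in this sketch")
    terminal_gc = max(gc_count(primer[:window]), gc_count(primer[-window:]))
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "terminal_gc_ok": terminal_gc <= max_terminal_gc,
        "gc_fraction": gc_count(primer) / len(primer),
        "tm_wallace": wallace_tm(primer),
    }


# Hypothetical sequences (not primers from this study).
balanced = "ATGCATGCATGCATGCAT"
gc_clamped = "ATATATATATATATGCGCGC"

print(check_primer(balanced))
print(check_primer(gc_clamped))
```

The Wallace rule is only a rough estimate for short oligonucleotides; a nearest-neighbor method would be the usual choice when matching melting temperatures across primer pairs.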

         WS0          WS1        WS3        HL9          HL10       HL11
Top      0 – 5 cm     0 – 3 cm   0 – 3 cm   0 – 5 cm     0 – 3 cm   0 – 1 cm
Middle   23 – 28 cm   -          -          12 – 17 cm   -          -
Bottom   45 – 50 cm   -          -          25 – 30 cm   -          -

Supplementary Table S1 | Sediment depths sampled at each hot spring site. Sediments were sampled for DNA extractions from three hot springs at Washburn (WS) and three at Heart Lake (HL). At WS0 and HL9 a push core was retrieved and sectioned in three layers. The depth range of all sediment samples from every site is listed in cm. RNA was extracted from the top layer (0 – 5 cm) of HL9.

Supplementary Table S2 | Information for published and newly designed mcrA primers. Target organisms, amplification success, sequence, length, position, % GC, melting temperature (Tm), amplicon size, and reference for each primer set attempted are provided. Accession numbers of previously published genes that were targeted by each new primer set are listed, as well as which primer sets were successful. At the time of primer design, the only available sequences of mcrA genes belonging to Bathyarchaeota were from the publication by Evans et al. (2015). (a) Nucleotide position was based on the mcrA gene sequence of M. thermautotrophicus (GenBank accession U10036), following Steinberg & Regan (2009) and Angel et al. (2012).

Supplementary Table S3 | Physicochemical parameters for WS0, HL9, HL10, and HL11. DS = dissolved sulfide, DIC = dissolved inorganic carbon, DOC = dissolved organic carbon. Values in black were determined at the time of sediment sampling for DNA. Values in red are added from historical data (from De Leon et al., 2013, Inskeep et al., 2013, or Jennings et al., 2017).

[Supplementary Figure S1: phylogenetic tree image; individual tip labels and bootstrap values are not recoverable from the text extraction. Groups labeled to the right of the tree: Bathyarchaeota (MCG-2, MCG-6, MCG-10, MCG-14, MCG-15, MCG-20), Aigarchaeota, Unknown Deep Group, Thaumarchaeota, Crenarchaeota (Thermoprotei), Verstraetearchaeota, Korarchaeota, and Euryarchaeota. Scale bar: 0.10.]

Supplementary Figure S1 | Maximum likelihood phylogenetic relationships of OTUs based on 16S rRNA genes. A RAxML tree was constructed in ARB with a GTRGAMMA rate distribution model and bootstrapped at 1000 iterations. The base tree consisted only of reference sequences with > 1100bp, and OTUs from the present study (ca. 500bp) representing > 0.5 % sequence abundance were added with ARB parsimony methods. Phylum-level lineages are indicated in bold to the right of phylogenetic groups. MCG-subgroups for the Bathyarchaeota are named according to investigations by Kubo et al. (2012) and Lazar et al. (2015), except for MCG-20, which was identified in the present study.

[Supplementary Figure S2: dendrogram image; individual tip labels and bootstrap values are not recoverable from the text extraction. Sequences shown include euryarchaeotal references (e.g., ANME-1, Methanosarcina/ANME-2, Methanosaeta, Methanothermobacter), Verstraetearchaeota V3 and V4 mcrA, published bathyarchaeotal clones (KT387805 to KT387819), and the YNP1 and YNP2 clusters from this study. Scale bar: 0.10.]

Supplementary Figure S2 | Neighbor-joining clustered dendrogram of mcrA gene sequences. A neighbor-joining algorithm was used to cluster mcrA gene sequences for comparison to other mcrA amplicon studies, which commonly publish neighbor-joining trees. A Jukes-Cantor correction was applied and bootstrap values were calculated after 1000 tree iterations. Two newly identified bathyarchaeotal subgroups from YNP are identified as YNP1 and YNP2.
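The Jukes-Cantor correction named in the caption converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). A minimal sketch follows, assuming aligned nucleotide sequences (the four-state form of the model); the function names and the toy sequences are hypothetical.

```python
from math import log


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for nucleotide data."""
    if p >= 0.75:
        raise ValueError("p >= 0.75 is saturated; the correction is undefined")
    return -0.75 * log(1 - 4 * p / 3)


# Toy aligned sequences (hypothetical, not from this study).
seq_1 = "ACGTACGT"
seq_2 = "ACGTACGA"

p = p_distance(seq_1, seq_2)
print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as p, because the model accounts for multiple substitutions at the same site; the resulting pairwise distances are what a neighbor-joining algorithm would take as input.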

[Figure: site photos and depth profiles for the WS0 core and the HL9 core. Each profile plots temperature (°C) and methane (ppm) against sediment depth (cm). Grey bars mark layers that were also sampled for DNA.]

Supplementary Figure S3 | Sediment core sample sites and adjacent profiles of subsurface temperature and methane concentration. Photos of WS0 and HL9 hot springs indicate precise locations of sediment coring (yellow X). Thermal profiles were measured adjacent to each coring site at increments of ca. 5 cm down to 50 cm (WS0) and 30 cm (HL9). Porewater methane concentrations were measured in core sediments and demonstrate subsurface peaks in methane at both sites. Grey bars indicate sediment layers that were sampled for DNA analyses.

[Figure: pie charts of archaeal relative abundance in HL9 surface sediments. (A) SSU rRNA gene; (B) SSU rRNA transcript. Slices are labeled by taxon, including Bathyarchaeota (MCG subgroups), Euryarchaeota, Crenarchaeota, Thaumarchaeota, Aigarchaeota, Aenigmarchaeota, and Parvarchaeota.]

Supplementary Figure S4 | SSU rRNA genes versus SSU rRNA transcripts at hot spring HL9. (A) Relative abundance of the archaeal community based on SSU rRNA gene recovery in HL9 surface sediments for groups that were also recovered from RNA. (B) Relative abundance of the archaeal community based on SSU rRNA transcript recovery in HL9 surface sediments. Group colors are consistent with Figures 1 and 2; taxa that were recovered from RNA but did not exceed 0.1% in DNA (as shown in Figure 1) are shown in black and white patterns. Archaeal groups observed at low relative abundance are labeled on the pie charts for ease of viewing. MCG = Miscellaneous Crenarchaeotal Group (see Kubo and Lloyd et al., 2012; Lazar et al., 2015), DSEG = Deep Sea Euryarchaeotal Group (Takai et al., 2001), TMEG = Terrestrial Miscellaneous Euryarchaeotal Group (Sørensen et al., 2004).

[Figure: four principal coordinates plots (panels A, B, C, and D). Points are labeled by sample: HL9TC, HL9MC, HL9BC, HL10, HL11, WS0, WS0TC, WS0MC, WS0BC, WS1, and WS3.]

Supplementary Figure S5 | Principal coordinates analysis of weighted UniFrac beta diversity for different OTU cutoff values. The weighted UniFrac metric of phylogenetic diversity between hot spring sediment samples is plotted in principal coordinates space for the quality-filtered OTU table (A), the OTU table rarefied to a depth of 8,080 sequences (B), the rarefied OTU table filtered at a > 0.1% cutoff consistent with Figure 1 (C), and the rarefied OTU table filtered at a > 0.5% sequence cutoff consistent with Figure 2 (D). Axis labels indicate the percent variation explained by each axis. TC, MC, and BC stand for “top core”, “middle core”, and “bottom core”, respectively.
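The relative-abundance cutoffs in panels C and D can be illustrated with a short sketch. The caption does not state how the cutoff was applied, so the example assumes that an OTU is retained when its relative abundance exceeds the cutoff in at least one sample. The function name and the table layout (OTU mapped to per-sample counts) are hypothetical.

```python
def filter_by_relative_abundance(
    table: dict[str, dict[str, int]], cutoff: float
) -> dict[str, dict[str, int]]:
    """Keep OTUs whose relative abundance exceeds `cutoff` in any sample.

    `table` maps OTU id -> {sample id: read count}; `cutoff` is a fraction
    (0.001 for 0.1%). Assumption: the cutoff is evaluated per sample.
    """
    # Total reads per sample, used as the denominator for relative abundance.
    totals: dict[str, int] = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n

    kept = {}
    for otu, counts in table.items():
        # Samples with zero total reads are skipped to avoid division by zero.
        if any(
            totals[sample] > 0 and n / totals[sample] > cutoff
            for sample, n in counts.items()
        ):
            kept[otu] = dict(counts)
    return kept
```

Filtering after rarefaction, as in panels C and D, means every sample shares the same denominator, so a single cutoff corresponds to the same read count in each sample.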

[Figure: three rarefaction panels, each plotted against sequences per sample. (A) Total: observed OTUs; (B) Rare8080: observed OTUs; (C) Rare8080: Shannon. Curves are labeled by sample: HL9_TC, HL9_MC, HL9_BC, HL10, HL11, WS0, WS0_TC, WS0_MC, WS0_BC, WS1, and WS3.]

Supplementary Figure S6 | Rarefaction curves of SSU rRNA reads. Distinct OTUs are plotted against total sequence recovery per sample for the total quality-filtered sequence library (A) and for the sequence library rarefied to 8,080 sequences (B); Shannon-Wiener diversity estimates are plotted against the number of sequences recovered from the rarefied dataset (C). Colored lines correspond to separate samples and overlapping lines are indicated by black arrows and text.
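The two quantities plotted in Supplementary Figure S6 can be sketched as follows: rarefaction draws a fixed number of reads from a sample without replacement, and the Shannon-Wiener index is computed from the resulting OTU proportions as H = -Σ p_i log p_i. This is a minimal illustration rather than the authors' pipeline. The function names, the random seed, and the logarithm base (natural log by default here; some tools use base 2) are assumptions.

```python
import math
import random
from collections import Counter


def rarefy(counts: dict[str, int], depth: int, seed: int = 0) -> dict[str, int]:
    """Subsample `depth` reads without replacement from per-OTU read counts."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    # Expand counts into one entry per read, then draw without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts: dict[str, int], base: float = math.e) -> float:
    """Shannon-Wiener diversity index H = -sum(p_i * log(p_i))."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p, base)
    return h


def observed_otus(counts: dict[str, int]) -> int:
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)
```

Repeating `rarefy` over a series of increasing depths and recording `observed_otus` or `shannon` at each depth yields one rarefaction curve per sample.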