<<

1 Biogeography and ecology of the rare and abundant microbial lineages in deep-sea 2 hydrothermal vents 3 4 5 Rika E. Anderson1,2, Mitchell L. Sogin3, and John A. Baross1 6 7 8 9 1School of Oceanography and Astrobiology Program, University of Washington, Seattle, WA 10 2Institute for Genomic Biology, University of Illinois at Urbana-Champaign, IL 11 3Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, MA 12 13 14 Corresponding author: 15 Rika Anderson 16 Institute for Genomic Biology 17 University of Illinois at Urbana-Champaign 18 1206 West Gregory Drive 19 MC-195 20 Urbana, IL 61801 21 (206) 353-2019 22 [email protected] 23 24 Running title: Rare and abundant lineages in deep-sea hydrothermal vents 25 26 27

28 Abstract

29 Environmental gradients generate countless ecological niches in deep-sea

30 systems, which foster diverse microbial communities. The majority of

31 distinct microbial lineages in these communities occur in very low abundance. However,

32 the ecological role and distribution of rare and abundant lineages, particularly in deep,

33 hot subsurface environments, remains unclear. Here, we use 16S rRNA tag sequencing to

34 describe biogeographic patterning and microbial community structure of both rare and

35 abundant and in hydrothermal vent systems. We show that while rare

36 archaeal lineages and almost all bacterial lineages displayed geographically restricted

37 community structuring patterns, the abundant lineages of archaeal communities displayed

38 a much more cosmopolitan distribution. Finally, analysis of one high-volume, high-

39 temperature fluid sample representative of the deep hot biosphere described a unique

40 microbial community that differed from microbial populations in diffuse flow fluid or

41 sulfide samples, yet the rare thermophilic archaeal groups showed similarities to those

42 that occur in sulfides. These results suggest that while most archaeal and bacterial

43 lineages in vents are rare and display a highly regional distribution, a small percentage of

44 lineages, particularly within the archaeal domain, are successful at widespread dispersal

45 and colonization.

46

47 Introduction

48 In almost all ecosystems investigated to date, a minority of microbial lineages

49 dominates the community, while many low-abundance lineages account for most of the

50 total community diversity. These rare lineages make up the archaeal and bacterial “rare

1

51 biosphere,” a term first coined by Sogin et al. (2006). Despite the ubiquity of rare

52 lineages, the ecological role and distribution of the rare biosphere in a given community

53 remains unclear.

54 Analysis of the spatial distribution of rare lineages can shed light on their

55 ecological role. Generally speaking, microbial spatial distribution can be controlled by

56 environmental selection, historical events such as dispersal limitation, or both (Martiny et

57 al. 2006). A commonly discussed hypothesis is that high population densities of

58 microorganisms facilitate rapid and widespread dispersal, but that contemporary

59 environments select for particular microbial assemblages—essentially, that “everything is

60 everywhere, but the environment selects” (Baas Becking 1934). It has been argued that if

61 microorganisms are freely dispersed, rare strains should be fairly cosmopolitan (Pedrós-

62 Alió 2006). In a study illustrating the enormous dispersal capabilities of microbes,

63 Gibbons et al. (2013) found that most microbial lineages identified in the International

64 Census of Marine Microbes dataset could be found at a single deeply-sequenced site in

65 the Western English Channel. However, a global analysis of bacterial distribution

66 revealed that microbial lineages had a bipolar distribution, being confined by hemisphere

67 more than expected under a null model (Sul et al. 2013), suggesting that marine

68 microorganisms are subject to a certain degree of dispersal limitation.

69 Moreover, studies examining the spatial distributions of rare versus abundant

70 strains have suggested that rare strains are not cosmopolitan, but are subject to

71 environmental and ecological pressures similar to those of the abundant strains. A study

72 of rare and abundant strains in the Arctic Ocean found that rare operational taxonomic

73 units (OTUs) exhibited similar geographic patterns to those of the abundant OTUs

2

74 (Galand et al. 2009), demonstrating that rare strains in the Arctic Ocean are not widely

75 dispersed. Similarly, Vergin et al. (2013) showed that both rare and abundant strains are

76 subject to spatiotemporal patterns such as seasonality and stratification.

77 With diverse but isolated habitats and a global distribution, hydrothermal vent

78 systems present a compelling test case for these questions regarding the distributions of

79 rare and abundant strains. In these systems, a wide variety of ecological niches develop

80 by mixing of high-temperature, reduced, metal-enriched hydrothermal fluid with cooler,

81 oxidized seawater both above and below the seafloor. The high-temperature centers of

82 structures are inhabited by microbial communities that tend to be much more archaeal-

83 dominated than the diffuse flow communities that result from the mixing of high-

84 temperature fluid and background seawater (Schrenk et al. 2003; Takai & Horikoshi

85 1999; Takai et al. 2001; Slobodkin et al. 2001; Kelley et al. 2002). Samples of fluids

86 from diffuse flow vents reveal richly diverse microbial communities with members that

87 range from deep subsurface to mesophiles and

88 entrained from deep seawater (Huber et al. 2003, 2002, 2007; Deming & Baross 1993).

89 Moreover, sampling of diffuse flow fluids suggests that a portion of the microbial

90 community found in these fluids draws from a deep subsurface habitat hosting

91 thermophilic, anaerobic archaeal and bacterial communities; effectively, hydrothermal

92 systems provide a “window” to the deep biosphere (Summit & Baross 2001; Deming &

93 Baross 1993). Deep sequencing of archaea and bacteria from seamount diffuse fluids

94 revealed thousands of bacterial and archaeal lineages, the majority of which occur in very

95 low abundances (Huber et al. 2007; Sogin et al. 2006). Some of these rare lineages may

96 be abundant in other niches of the vent system.

3

97 The question of geographic isolation versus geochemical selection in

98 hydrothermal systems has been investigated previously, but without consistent trends.

99 While previous work at hydrothermal systems has indicated that geographically distinct

100 vents generally host phylogenetically distinct populations (Holden et al. 2001; Huber et

101 al. 2006; Flores et al. 2012; Opatkiewicz et al. 2009), only some studies have seen clear

102 correlations between chemistry and community composition (Huber et al. 2006; Huber et

103 al. 2003). Other studies have found evidence of geographic isolation, but with little

104 correlation to fluid chemistry (Opatkiewicz et al. 2009; Huber et al. 2010). Here, we seek

105 to address two primary questions regarding the distribution of rare and abundant lineages

106 in hydrothermal systems. First, across large spatial scales, do rare and abundant strains

107 exhibit the same patterns of spatial structuring, or do we see differences that might point

108 to different ecological roles? While some have suggested that the rare biosphere

109 encompasses a persistent, cosmopolitan seed bank throughout the ocean (Pedrós-Alió

110 2006), longstanding theories of spatial ecology suggest that rare species should be fairly

111 restricted, while more abundant species have a more widespread distribution (Brown

112 1984). We aim to determine which pattern prevails in hydrothermal systems. Second,

113 since hydrothermal systems encompass many different habitat types linked by fluid flux,

114 we seek to determine whether strains that are rare in certain habitats of hydrothermal

115 systems are abundant in others.

116 We investigated all publicly available DNA sequences of the v6 region of 16S

117 rRNA gene (pyrotags) from hydrothermal vent samples in the VAMPS database, which

118 includes samples from diffuse flow fluids and vent sulfides worldwide, and combined

119 this evaluation with analysis of a single, large-volume high-temperature sample, sampled

4

120 from Hulk vent on the Juan de Fuca Ridge, that provides the best representation currently

121 available of the deep hot subsurface biosphere in vent systems. In this way, we sampled

122 multiple hydrothermal vent niches across a wide spatial sample area. We show that while

123 most archaeal and bacterial strains were observed only in particular hydrothermal regions

124 due to either restricted dispersal or ecological selection, certain archaeal strains were

125 widespread and therefore appear to be able to disperse and colonize new niches

126 efficiently. Moreover, analysis of the high-temperature fluid sample in combination with

127 other sample types indicates that strains that are rare in some niches are abundant in

128 others.

129 Materials and Methods

130 Sample site description and sampling procedures

131 We obtained all 16S v6 pyrotag datasets used in this study, with the exception of

132 those newly acquired from a single high-temperature sample, from the publicly available

133 VAMPS database (www.vamps.mbl.edu). Table S1 presents a list of all samples and

134 associated metadata; Figure S1 depicts a map of the sample sites.

135 We collected the high-temperature fluid sample at Hulk vent in August 2009

136 aboard the R/V Atlantis. Hulk is a large sulfide chimney located at 47˚ 57.00’ N, 129˚

137 5.81’ W on the Main Endeavour Field on the Juan de Fuca Ridge, a spreading center

138 located about 200 miles from the coast of Washington and Oregon (Figure S1). We

139 deployed a custom-built barrel sampler using DSV Alvin to collect 170 L of high-

140 temperature diffuse flow fluid from the base of the sulfide structure. The average

141 temperature of the sample, as determined from its silica and magnesium concentrations,

142 was approximately 125˚C (Anderson et al. 2011). This sample most likely represents a

5

143 wide range of niches resulting from mixtures of fluid at various temperatures, since the

144 sample funnel was placed on top of a colony of tube worms at approximately 20˚C, and

145 most likely sampled fluid from a nearby conduit measured to be about 300˚C. On deck,

146 we put samples on ice prior to filtering through three 0.02 μm Steripaks (Millipore,

147 USA). DNA extraction procedures are described in detail in Anderson et al. (2013).

148 Briefly, we freeze-thawed one of the Steripaks three times, then added DNA extraction

149 buffer (0.1M Tris-HCl, 0.2M Na-EDTA, 0.1M NaH2PO4, 1.5M NaCl, and 1%

150 cetyltrimethylammonium bromide), 50 mg ml–1 lysozyme, 1% proteinase K, and 20%

151 SDS solution to the filter. We extracted DNA using phenol:chloroform:isoamyl alcohol

152 and chloroform:isoamyl alcohol.

153 Other diffuse flow samples obtained from the VAMPS database were collected by

154 J. Huber from eight different seamounts at Axial Seamount, the Mariana Arc, and Loihi

155 Seamount, depicted in Figure S1. Axial Seamount is an active volcano located on the

156 Juan de Fuca Ridge about 300 miles off the coast of Oregon. The caldera is about 700 m

157 above the level of the ridge, and is bordered on three sides by a boundary fault. Several

158 areas of active venting are located within the caldera. Other diffuse flow fluid samples

159 derive from several seamounts along the Mariana Arc, located in the Western Pacific

160 Ocean from about 12 to 24˚N. These seamounts, including NW Eifuku, Daikoku, Nikko,

161 NW Rota, and E Diamante, were located along the active front of the Mariana Arc, with

162 the exception of Forecast, which may have greater influence from the backarc spreading

163 axis (Huber et al. 2010; Embley et al. 2004). Analysis of Epsilonproteobacteria in these

164 samples was described in Huber et al. 2010. Loihi Seamount is an active submarine

6

165 volcano located above the Hawaiian hotspot. Sampling and DNA extraction methods for

166 these diffuse flow samples were discussed in Huber et al. (2010).

167 All sulfide samples collected from the VAMPS database were collected by A.-L.

168 Reysenbach from the Lau Basin, which is a back-arc basin formed by the subduction of

169 the Pacific plate below the Australian plate. Sulfide samples were collected and

170 processed as described in Flores et al. 2011 and 2012. Samples were sequenced as part of

171 the International Census of Marine Microbes (ICoMM) initiative.

172

173 Sequencing

174 We constructed and sequenced v4–v6 and v6 amplicon libraries for the Hulk

175 sample at the Josephine Bay Paul Center at the Marine Biological Laboratory on a Roche

176 454 GSFLX Titanium platform using the techniques described in Huber et al. (2007) and

177 Sogin et al. (2006). All sequences are publicly available on the VAMPS website

178 (http://vamps.mbl.edu) under dataset names REA_HDF_Av6v4, REA_HDF_Bv6v4 and

179 REA_HDF_Bv6 for the Hulk sample. We used the v6 bacterial dataset for this study to

180 enable direct comparison between v6 datasets, and trimmed the archaeal v6v4 datasets

181 when comparing against other v6 samples. We compared this sample with sequences

182 from projects ICM_ALR_Av6, ICM_ALR_Bv6, KCK_SMT_Av6, and KCK_SMT_Bv6,

183 which we obtained from the VAMPS database.

184

185 Bioinformatic analysis

186 We trimmed and quality filtered all reads through the VAMPS pipeline using the

187 quality control parameters outlined in Huse et al. (2007). We performed taxonomic

7

188 analyses of each sample using the GAST process in VAMPS (Huse et al. 2008). We

189 further screened, filtered and trimmed all sequences as a batch set using mothur (Schloss

190 et al. 2009). Trimming of sequences removed the v4–v5 region of the Hulk sequences,

191 leaving behind only the v6 region to facilitate direct comparison. We aligned both

192 archaeal and bacterial sequences against the SILVA database (Quast et al. 2013) through

193 the mothur pipeline. We clustered sequences into operational taxonomic units (OTUs)

194 using average-neighbor hierarchical clustering to the 0.03 level using mothur, again

195 treating all sequences from all samples as a batch set for OTU clustering. We tested

196 different clustering cutoffs, including unique, 0.2, and 0.4 distance cutoffs, and observed

197 no qualitative differences in our results. We calculated diversity indices (rarefaction

198 curves and Shannon and Simpson evenness) using mothur. For comparison between

199 samples, we constructed distance matrices in mothur using the Bray-Curtis calculator of

200 community membership and structure. We also assessed community structure with

201 distance matrices calculated using the Unifrac method after creating a phylogenetic tree

202 for all sequences in each sample using clearcut within the mothur package (Schloss et al.

203 2009). To normalize between different sample sizes, we randomly subsampled the data

204 1000 times (to the size of the smallest sample) when comparing between datasets. We

205 generated cluster dendrograms from these distance matrices using PRIMERv6 (Clarke &

206 Gorley 2006).

207 For the rare vs. abundant OTU analysis, we separated sequences from OTUs that

208 were considered to be abundant in each sample (representing equal to or greater than 1%

209 of all sequences in the sample) from those considered to be rare (representing equal to or

210 less than 0.1% of all sequences in the sample). We carried out analysis of similarity

8

211 (ANOSIM) tests using PRIMERv6 to determine whether there were assemblage

212 differences between groups of samples specified according to geographic location. We

213 conducted nine hundred ninety nine permutations of the test for each ANOSIM analysis,

214 using a resemblance matrix of Bray-Curtis dissimilarity as determined in mothur. To test

215 for community similarity distance decay, we conducted Mantel tests of mothur-generated

216 Bray-Curtis community dissimilarity matrices against distance matrices of geographic

217 distance using the “fossil” library with the statistical software package R (R Core Team,

218 2013). We used R to generate heatmaps with OTU relative abundance data generated in

219 mothur.

220 To create phylogenetic trees of sequences from the and

221 Methanococcales, we identified all OTUs belonging to either of these groups according

222 to the SILVA taxonomic classification conducted in mothur, and selected a reference

223 sequence from each OTU. We created reference data sets with full-length 16S sequences

224 from the SILVA database (Quast et al. 2013); we aligned both the reference sequences

225 and sample sequences in the SILVA aligner (Pruesse et al. 2012). We created a base tree

226 from the reference sequences in RAxML (Stamatakis 2006) using a rapid bootstrap

227 analysis to search for the best maximum likelihood tree with 100 alternative runs on

228 distinct starting trees. We used EPA (Berger et al. 2011) within the RAxML package to

229 insert the short v6 sample sequences into the base tree. For tree construction, we used the

230 GTR+ optimization of substitution rates and the GAMMA model of rate heterogeneity.

231 We conducted comparisons between v4–v6 sequences in the Hulk sample and

232 uncultured crenarchaeal sequences with USEARCH v6 (Edgar 2010) using the

233 usearch_global command.

9

234

235 Results

236 Comparative community structure of diffuse flow and sulfide structures

237 Taxonomic classification of bacterial v6 sequences for all samples revealed high

238 abundances of both the Epsilonproteobacteria and Gammaproteobacteria groups in most

239 samples for both diffuse flow and sulfide samples (Figure 1A). Among the archaea,

240 sulfide samples exhibited high abundances of Archaeoglobi and Halobacteria, whereas

241 Marine Group I and Thermoplasmata dominated most diffuse flow samples (Figure 1B).

242 The high-temperature Hulk sample, in contrast, was unique in its high abundance of

243 Thermococcales.

244 OTUs at a 3% distance produced a total of 3711 OTUs in the archaea and 22029

245 OTUs in the bacteria for all samples. In almost all samples, bacterial communities had

246 higher richness than the archaeal communities, as depicted in the rarefaction curves for

247 samples from both domains (Figure S2). None of these rarefaction curves reached an

248 asymptote, indicating that none of the datasets captured the total diversity of the sample.

249 Simpson and Shannon evenness indices showed that bacterial communities had higher

250 evenness than archaeal communities (Table S2), which was significant for the Simpson

251 evenness test (t-test, p=7.8E-18).

252

253 Clustering of samples according to community similarity

254 Cluster dendrograms indicated the degree of similarity between samples based on

255 the relative abundance of OTUs. For the bacteria, some clustering according to seamount

256 was apparent among the diffuse flow samples (Figure 2A). For the archaea, there was

10

257 much higher similarity between diffuse flow samples and between sulfide samples than

258 for the bacteria (Figure 3A). Archaeal communities in diffuse flow samples were less

259 likely to cluster by location. In most cases, the sulfide structures clustered separately

260 from the diffuse flow samples. The high-temperature Hulk sample grouped with the other

261 diffuse flow samples, though at very low similarity.

262 We used ANOSIM analyses to test the hypothesis that archaeal and bacterial

263 samples clustered according to geographic location. We grouped diffuse flow samples

264 according to seamount, while sulfide samples, which were all collected in the Lau Basin,

265 were grouped together. We grouped the high temperature Hulk sample separately from

266 other samples. Location designations are shown in the legends of Figures 2 and 3. To test

267 whether clustering of OTUs by location was significant, we conducted ANOSIM

268 analyses for bacteria and archaea. ANOSIM analysis indicated that clustering according

269 to geographic location was significant for the bacteria (p ≤ 0.1%), but not for the archaea

270 (Table 1). Comparison of samples using Unifrac, a phylogenetic rather than taxonomic-

271 based metric of beta diversity, provided similar results (Figures S3, S4; Table S3). To test

272 the hypothesis that OTUs are dispersal-limited, we conducted Mantel tests of the

273 correlation between OTU dissimilarity and geographic distance. Our analyses found a

274 significant positive correlation (Table 1).

275

276 Biogeography and distribution of rare and abundant OTUs in hydrothermal systems

277 Analyses of all OTUs together cannot identify differences in ecological patterning

278 between the rare and abundant OTUs. Therefore, we separated the rare and abundant

279 OTUs within each sample to determine whether they exhibit different biogeographic and

11

280 community structuring patterns. For this analysis, we considered rare OTUs to be those

281 OTUs representing less than or equal to 0.1% of the sequences in the sample; abundant

282 OTUs were considered to be those OTUs representing greater than or equal to 1% of all

283 the sequences in the sample. This scoring follows definitions of rare and abundant groups

284 previously established by Pedros-Alió (2006) and Fuhrman (2009). This scoring leaves

285 out ambiguous sequences between 0.1-1% abundance, which cannot be not easily defined

286 as either “rare” or “abundant.” We also tested different percentage cutoffs for the

287 definition of “rare” and “abundant,” testing a range between 0.5 to 5% for abundant

288 strains, and 0.05 to 0.3% for rare strains. We repeated the analyses described below and

289 the patterns were consistent, indicating that the trends described here are not sensitive to

290 strict cutoffs.

291 The taxonomic identification of rare and abundant OTUs did not differ drastically

292 from each other, though certain taxonomic groups had a greater tendency to contain

293 abundant OTUs, specifically Gammaproteobacteria and Epsilonproteobacteria for

294 bacteria and Thermoplasmata and Marine Group I for archaea (Figures S5 and S6).

295 We identified the rare and abundant OTUs in each sample and clustered the

296 samples according to community similarity as before to determine whether the rare and

297 abundant OTUs exhibited similar patterns. In the bacteria, community structuring among

298 locations was similar, though not identical, between the rare and the abundant OTUs

299 (Figure 2B, C). Overall, rare OTUs showed more dissimilarity from sample to sample

300 (averaging about 80% dissimilarity) compared to abundant OTUs (averaging 60–70%

301 dissimilarity). ANOSIM tests of bacterial clustering indicated that clustering according to

302 seamount was significant in all cases (Table 1). Mantel tests indicated a significant

12

303 positive correlation between community dissimilarity and geographic distance for all

304 cases (Table 1). Clustering with the Unifrac method again yielded similar results;

305 however, ANOSIM analysis of geographic clustering for abundant bacterial strains

306 without sulfides was not significant (Table S3) and abundant bacterial strains showed

307 greater similarity among samples than among rare bacterial strains (Figure S3B, C).

308 For archaeal lineages, different patterns appeared. As with bacteria, abundant

309 OTUs showed much higher similarity between samples than rare OTUs. Separating the

310 rare and abundant OTUs revealed that the lack of biogeographic patterning for all archaea

311 was driven almost entirely by the abundant OTUs (Figure 3B, C). In contrast to the

312 bacterial case, the only archaeal OTUs to cluster significantly by geographic location

313 were the rare OTUs (Table 1). An extremely low R statistic and high p-value from the

314 ANOSIM analyses indicated that abundant archaeal OTUs showed almost no tendency to

315 group according to geographic location, especially when considering only diffuse flow

316 samples. Mantel tests of community dissimilarity versus geographic distance showed the

317 same pattern (Table 1). Unifrac analyses indicated similar trends, though all archaea

318 together demonstrated statistically significant biogeographic clustering (Table S3, Figure

319 S4).

320 Our analyses indicated that 69% of the bacterial OTUs and 66% of archaeal

321 OTUs were found only in one sample, most of which were rare lineages. Nevertheless,

322 removal of OTU singletons yielded the same statistical results. While a more

323 comprehensive sampling effort might have detected these rare lineages in a greater

324 number of samples, the community structure of rare OTUs was significantly consistent

325 within each region (Table 1). Random sampling from the environment in this way would

13

326 not yield a statistically significant biogeographic pattern of detection if the rare biosphere

327 were truly cosmopolitan (Pedrós-Alió 2006; Galand et al. 2009), and indicates different

328 patterns of spatial distributions between rare and abundant OTUs among the archaea, but

329 not among the bacteria.

330

331 Persistence of OTUs across samples

332 As suggested in Figure 3, the widely dispersed archaeal OTUs tended to be those

333 that were more abundant within samples. To determine whether archaeal strains shifted in

334 relative abundance from site to site, we visualized OTU abundance in the heatmap

335 depicted in in Figure 4, depicting the relative abundance of archaeal OTUs from the high-

336 temperature Hulk sample across all other samples analyzed here. Generally, abundant

337 OTUs in Hulk were more likely to be abundant or at least present in other samples, while

338 rare OTUs were more likely to be rare or undetected across samples. OTU 3147, for

339 example, a member of the Marine Group I crenarchaea, was abundant in almost all

340 diffuse flow samples. OTUs that were abundant in diffuse flow were absent or rare in

341 sulfide samples, and vice versa. An exception to this trend was the most abundant

342 archaeal OTU in the Hulk sample, OTU 3645, a Thermococcus sequence that comprised

343 63% of the sample. This OTU was rare in most other samples examined. However, this

344 OTU reached a maximum of 9% in one sulfide sample (sulf_20), from an entirely

345 different ecological niche.

346

347 Phylogenetics of the Thermococcales and Methanococcales

14

348 Clustering samples into OTUs does not give an indication of phylogenetic

349 relatedness between sequences and across samples, yet understanding phylogenetic

350 relatedness can provide insight into similarity and gene flow between samples. To

351 determine the extent to which rare strains bloomed or dispersed between niches within

352 hydrothermal systems, we focused on the phylogenetics of specific strains.

353 Thermococcales are a general indicator of high-temperature fluids and were dominant in

354 the Hulk sample. Thus we created a phylogenetic tree of sequences falling within the

355 order Thermococcales to investigate relationships between samples that might be based

356 on temperature, especially the high-temperature Hulk sample and the sulfide samples

357 (Figure 5). Both the sulfide samples and the Hulk sample had high diversity within the

358 Thermococcales order, with several OTUs falling into many different clades in the tree. It

359 was much more common for sulfide and Hulk sequences to group together into the same

360 OTU or branch (at 3% distance) than it was for diffuse flow fluid sequences to group

361 with sequences from sulfides or Hulk. Fewer Thermococcales OTUs were found in

362 diffuse flow samples overall; those that were present tended to cluster into a few clades

363 on the tree, particularly within the genus, or to group on branches with no

364 cultured representatives.

365 Similar results were found in a phylogenetic tree of OTUs falling in the

366 Methanococcales order, though these OTUs were found with greater frequency in diffuse

367 flow samples (Figure S7). The separation between sulfide and diffuse flow OTUs was

368 more distinct in this tree. The OTU from the Hulk sample fell within a clade shared with

369 other sulfide OTUs, despite being geographically closer to Axial Volcano, where most of

370 the Methanococcales OTUs were found.

15

371 Given the phylogenetic similarities between sulfide samples and the high-

372 temperature sample, we also sought to determine whether abundant sequences in sulfides

373 matched rare sequences in the Hulk high-temperature sample. Because previous work has

374 indicated that uncultured crenarchaea dominate the interior of sulfide structures (Schrenk

375 et al. 2003), we conducted global sequence comparisons of the Hulk high-temperature

376 v4–v6 region against a database of uncultured crenarchaea identified from sulfide clone

377 libraries. Two sequences were found that matched previously identified crenarchaea in

378 sulfides at 99–100% identity: a Desulfurococcales lineage from a white smoker spire on

379 the East Pacific Rise (Kormas et al. 2006), and a Pyrodictium lineage identified in an in-

380 situ growth chamber deployed within a sulfide structure (Nercessian et al. 2003).

381

382 Discussion

383 Domain differences in the ecology of rare and abundant lineages of the bacteria and

384 archaea

385 Null ecological models predict a positive correlation between abundance and

386 distribution: that is, the more highly abundant species will be found in more locations

387 (Brown 1984). With abundance aggregated among many OTUs, this leaves many rare

388 and few abundant. The abundance of a given OTU is distributed across space, leaving

389 many sites without rare taxa, and many sites where abundant taxa are present. Global

390 studies of microbial biogeography have confirmed this trend (Nemergut et al. 2011). On

391 the large scale, our results here confirm that abundant strains tend to be more widespread

392 in vent systems. Indeed, for both the archaeal and bacterial domains, there was a greater

393 tendency for community structuring to be similar across sites for the abundant strains.

16

394 However, in contrast to previous studies, our results indicate a striking difference in

395 ecological patterning between the archaeal and bacterial domains, and point to a

396 widespread, global range for specific abundant archaeal phylotypes.

397 For bacteria, microbial community structuring according to geographic location

398 applied to both the rare and abundant lineages. In all cases, bacterial community structure

399 was significantly similar within seamounts but distinct from vents farther afield. The rare

400 archaeal strains displayed a similar pattern. While we cannot distinctively determine

401 whether this biogeographic pattern can be attributed to environmental selection or

402 historical causes, the distance effect observed through the Mantel tests suggests that

403 dispersal limitation plays a strong role in forming the biogeographic patterns observed for

404 bacterial and rare archaeal strains. Moreover, since location was found to be a stronger

405 indicator of community similarity than chemistry in previous work (Opatkiewicz et al.

406 2009; Huber et al. 2010), dispersal limitation or other historical causes may be a stronger

407 driver of community structuring than environmental selection in hydrothermal systems.

408 However, a key difference from previous studies indicating community

409 structuring by vent field is the finding that many abundant archaeal strains were observed

410 at high abundance at sites spread across the globe. The results suggest an ecological

411 pattern in which the majority of archaeal OTUs are rare and biogeographically restricted,

412 but a few abundant archaeal OTUs dominate and are widespread. Many of the most

413 abundant and widespread archaeal OTUs in diffuse fluids belonged to Marine Groups I

414 and II, which are native to deep seawater and therefore might more easily travel in ocean

415 currents from one vent system to the next. These particular lineages of Marine Groups I

17

416 and II may have been ecotypes that gained a fitness advantage through some means, such

417 as horizontal gene transfer, that allowed them to proliferate rapidly in vent habitats.

418 However, it is unclear why certain abundant archaeal OTUs appear to be so

419 widely dispersed, while abundant bacterial OTUs are more biogeographically restricted.

420 The perceived ubiquity of certain archaeal lineages may be linked to the observation that

421 archaea generally exhibit lower richness compared to bacteria in various environments

422 globally (Aller & Kemp 2008). Low archaeal diversity may result from slow evolution of

423 the 16S gene in archaea relative to the bacterial 16S gene. It is also possible that current

424 primer designs do not adequately detect the true extent of archaeal diversity. In this case,

425 the taxonomic resolution used here cannot detect certain biogeographic patterns, a

426 potential problem that has been discussed previously (Hanson et al. 2012) and may be

427 rectified with new primers that target different regions of the 16S gene. Moreover, a

428 likely contributing factor is that 16S-based studies do not reflect the diversity encoded on

429 the rest of the archaeal genome. Previous work by Holden et al. (2001) has indicated that

430 Thermococcales isolates exhibit genetic and phenotypic diversity that correlates with

431 geographic location and habitat, yet is not reflected in the highly conserved 16S gene.

432 While a certain 16S sequence may be ubiquitous, individual phenotypes within that OTU

433 may be confined to particular regions. Extensive intra-species genetic diversity through

434 gene transfer and recombination has been observed in natural archaeal populations (Allen

435 et al. 2007), and physiological variation within the genome has been observed in Marine

436 Group II, for example, which encodes proteorhodopsins in the photic zone, but not deeper

437 in the water column (Frigaard et al. 2006). The archaeal pangenome can therefore be

438 quite extensive. This distinction may pertain especially to thermophilic archaea, given

18

439 that high rates of horizontal gene transfer have been observed among

440 (Koonin et al. 2001; Beiko et al. 2005). Thus, while the same OTUs were observed

441 across multiple vent sites, it is possible that there was a range of physiological variation

442 within those OTUs from one site to the next that was not reflected by the 16S v6

443 sequence. Further research involving comparisons of full genome sequences will provide

444 insight into this possibility.

445

446 Community structuring within hydrothermal niches and the deep subsurface

447 The gradients within deep-sea hydrothermal systems, created by the mixing of

448 hydrothermal fluid and deep seawater, establish multiple ecological niches that foster

449 high microbial diversity. Specific examination of the Hulk high-temperature sample

450 provides further insight into the dynamics of rare and abundant strains within vent

451 environments, as well as the extent of their dispersal between ecological niches. While

452 the community structure of the Hulk high-temperature sample was generally more similar

453 to diffuse fluid samples than to sulfide samples, specific OTUs were shared among the

454 sulfide samples and the Hulk sample, particularly among the rare archaeal communities.

455 This included the thermophilic archaeal groups Thermococcales and Methanococcales, as

456 well as uncultured crenarchaea. Microbial communities in sulfides are exposed to more

457 focused hydrothermal fluid flow, and thus higher temperatures, than microbial

458 communities in diffuse flow. Therefore the similarities in archaeal groups between the

459 high-temperature fluid sample and the sulfides most likely reflect selection according to

460 temperature, and may also point to a common source for these archaeal lineages, in which

461 thermophilic archaea in the hot subsurface are flushed into sulfide structures or fluids

19

462 emerging at the seafloor. This scenario depicts the microbial community in the deep hot

463 subsurface as occupying a unique niche, distinct from that of the diffuse flow and sulfide

464 microbial communities found downstream in the fluid flow, but one which we can

465 glimpse through the rare biosphere.

466 Taken together, these results provide a picture of microbial colonization and

467 dispersal both within and between vent systems. Within vents, we observe the abundant

468 strains of the deep, hot biosphere emerging as rare strains in sulfide structures, possibly

469 pointing to a deep, hot habitat for specific rare strains in sulfide structures. However, the

470 strains that were most successful at widespread dispersal and colonization appeared to be

471 specific archaeal strains that were not native to the deep biosphere but found mostly in

472 diffuse flow fluids, and appeared to be more capable of traveling deep-sea currents to

473 colonize new vent systems. Future work can reveal the extent of genome heterogeneity

474 within OTU groupings, and may reveal why certain archaeal strains appeared to be such

475 successful dispersers.

476

477

478 Acknowledgements

479 The authors gratefully acknowledge Chief Scientist Jim Holden, David Butterfield, and

480 the captain and crew of the R/V Atlantis for help with sample collection at sea. We would

481 like to thank W. Brazelton for stimulating discussions that informed our analyses, and we

482 thank J. Deming for edits that greatly improved the manuscript. J. Huber and A.-L.

483 Reysenbach kindly provided background information regarding samples used in this

20

484 analysis. Funding for this work was provided by a grant from the NASA Astrobiology

485 Institute to the Carnegie Institution of Washington.

21

References Allen EE, Tyson GW, Whitaker RJ, Detter JC, Richardson PM & Banfield JF (2007) Genome dynamics in a natural archaeal population. Proc. Natl. Acad. Sci. U. S. A. 104: 1883–1888. Aller JY & Kemp PF (2008) Are Archaea inherently less diverse than Bacteria in the same environments? FEMS Microbiol. Ecol. 65: 74–87. Anderson RE, Beltrán MT, Hallam SJ & Baross JA (2013) Microbial community structure across fluid gradients in the Juan de Fuca Ridge hydrothermal system. FEMS Microbiol. Ecol. 83: 324–339. Anderson RE, Brazelton WJ & Baross JA (2011) Using CRISPRs as a metagenomic tool to identify microbial hosts of a diffuse flow hydrothermal vent viral assemblage. FEMS Microbiol. Ecol. 77: 120–133. Baas Becking L (1934) Geobiologie of Inleiding Tot de Milieukunde [Geobiology or Introduction to the Science of the Environment]. W.P. Van Stockum & Zoon, The Hague, Netherlands. Beiko RG, Harlow TJ & Ragan M a (2005) Highways of gene sharing in prokaryotes. Proc. Natl. Acad. Sci. U. S. A. 102: 14332–14337. Berger SA, Krompass D & Stamatakis A (2011) Performance, accuracy, and Web server for evolutionary placement of short sequence reads under maximum likelihood. Syst. Biol. 60: 291–302. Brown JH (1984) On the Relationship between Abundance and Distribution of Species. Am. Nat. 124: 255–279. Clarke KR & Gorley RN (2006) PRIMER v6: User Manual/tutorial. Primer-E Ltd Plymouth. text. Deming J & Baross J (1993) Deep-sea smokers: Windows to a subsurface biosphere? Geochim. Cosmochim. Acta 57: 3219–3230. Edgar RC (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26: 2460–2461. Embley RW, Baker ET, Chadwick WW, Lupton JE, Resing JA, Massoth GJ & Nakamura K (2004) Explorations of Mariana Arc volcanoes reveal new hydrothermal systems. Eos, Trans. Am. Geophys. Union 85: 37. Flores GE et al. (2011) Microbial community structure of hydrothermal deposits from geochemically different vent fields along the Mid-Atlantic Ridge. Environ. Microbiol. 13: 2158–2171. Flores GE, Shakya M, Meneghin J, Yang ZK, Seewald JS, Wheat GC, Podar M & Reysenbach A-L (2012) Inter-field variability in the microbial communities of hydrothermal vent deposits from a back-arc basin. Geobiology 10: 333–346. Frigaard N-U, Martinez A, Mincer TJ & DeLong EF (2006) Proteorhodopsin lateral gene transfer between marine planktonic Bacteria and Archaea. Nature 439: 847–850.

22

Fuhrman JA (2009) Microbial community structure and its functional implications. Nature 459: 193–199. Galand PE, Casamayor EO, Kirchman DL & Lovejoy C (2009) Ecology of the rare microbial biosphere of the Arctic Ocean. Proc. Natl. Acad. Sci. U. S. A. 106: 22427– 22432. Gibbons SM, Caporaso JG, Pirrung M, Field D, Knight R & Gilbert JA (2013) Evidence for a persistent microbial seed bank throughout the global ocean. Proc. Natl. Acad. Sci. U. S. A. 110: 4651–4655. Hanson CA, Fuhrman JA, Horner-Devine MC & Martiny JBH (2012) Beyond biogeographic patterns: processes shaping the microbial landscape. Nat. Rev. Microbiol. 10: 497–506. Holden JF, Takai K, Summit M, Bolton S, Zyskowski J & Baross JA (2001) Diversity among three novel groups of hyperthermophilic deep-sea Thermococcus species from three sites in the northeastern Pacific Ocean. FEMS Microbiol. Ecol. 36: 51– 60. Huber J a, Johnson HP, Butterfield D a & Baross J a (2006) Microbial life in ridge flank crustal fluids. Environ. Microbiol. 8: 88–99. Huber JA, Butterfield DA & Baross JA (2003) Bacterial diversity in a subseafloor habitat following a deep-sea volcanic eruption. FEMS Microbiol. Ecol. 43: 393–409. Huber JA, Butterfield DA & Baross JA (2006) Diversity and distribution of subseafloor Thermococcales populations in diffuse hydrothermal vents at an active deep-sea volcano in the northeast Pacific Ocean. J. Geophys. Res. 111: G04016. Huber JA, Butterfield DA & Baross JA (2002) Temporal changes in archaeal diversity and chemistry in a mid-ocean ridge subseafloor habitat. Appl. Environ. Microbiol. 68: 1585–1594. Huber JA, Cantin H V, Huse SM, Welch DBM, Sogin ML & Butterfield DA (2010) Isolated communities of Epsilonproteobacteria in hydrothermal vent fluids of the Mariana Arc seamounts. FEMS Microbiol. Ecol. 73: 538–549. Huber JA, Mark Welch DB, Morrison HG, Huse SM, Neal PR, Butterfield DA & Sogin ML (2007) Microbial population structures in the deep marine biosphere. Science (80-. ). 318: 97–100. Huse SM, Dethlefsen L, Huber JA, Mark Welch D, Welch DM, Relman DA & Sogin ML (2008) Exploring microbial diversity and using SSU rRNA hypervariable tag sequencing. PLoS Genet. 4: e1000255. Huse SM, Huber JA, Morrison HG, Sogin ML & Mark Welch DB (2007) Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol. 8: R143. Kelley DS, Baross JA & Delaney JR (2002) Volcanoes, fluids, and life at mid-ocean ridge spreading centers. Annu. Rev. Earth Planet. Sci. 30: 385–491. Koonin E V., Makarova KS & Aravind L (2001) Horizontal gene transfer in prokaryotes: Quantification and classification. Annu. Rev. Microbiol. 55: 709–742.

23

Kormas KA, Tivey MK, Von Damm K & Teske A (2006) Bacterial and archaeal phylotypes associated with distinct mineralogical layers of a white smoker spire from a deep-sea hydrothermal vent site (9˚N, East Pacific Rise). Environ. Microbiol. 8: 909–920. Martiny JBH et al. (2006) Microbial biogeography: putting microorganisms on the map. Nat. Rev. Microbiol. 4: 102–112. Nemergut DR et al. (2011) Global patterns in the biogeography of bacterial taxa. Environ. Microbiol. 13: 135–144. Nercessian O, Reysenbach A-L, Prieur D & Jeanthon C (2003) Archaeal diversity associated with in situ samplers deployed on hydrothermal vents on the East Pacific Rise (13˚N). Environ. Microbiol. 5: 492–502. Opatkiewicz AD, Butterfield DA & Baross JA (2009) Individual hydrothermal vents at Axial Seamount harbor distinct subseafloor microbial communities. FEMS Microbiol. Ecol. 70: 81–92. Pedrós-Alió C (2006) Marine microbial diversity: can it be determined? Trends Microbiol. 14: 257–263. Pruesse E, Peplies J & Glöckner FO (2012) SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 28: 1823–1829. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J & Glöckner FO (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41: D590–6. R Core T (2013) R: A Language and Environment for Statistical Computing. Schloss PD et al. (2009) Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75: 7537–7541. Schrenk MO, Kelley DS, Delaney JR & Baross JA (2003) Incidence and diversity of microorganisms within the walls of an active deep-sea sulfide chimney. Appl. Environ. Microbiol. 69: 3580–3592. Slobodkin A, Campbell B, Cary SC, Bonch-Osmolovskaya E & Jeanthon C (2001) Evidence for the presence of thermophilic Fe(III)-reducing microorganisms in deep- sea hydrothermal vents at 13 degrees N (East Pacific Rise). FEMS Microbiol. Ecol. 36: 235–243. Sogin ML, Morrison HG, Huber JA, Mark Welch D, Huse SM, Neal PR, Arrieta JM & Herndl GJ (2006) Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc. Natl. Acad. Sci. U. S. A. 103: 12115–12120. Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690. Sul WJ, Oliver TA, Ducklow HW, Amaral-Zettler LA & Sogin ML (2013) Marine bacteria exhibit a bipolar distribution. Proc. Natl. Acad. Sci. U. S. A. 110: 2342– 2347.

24

Summit M & Baross JA (2001) A novel microbial habitat in the mid-ocean ridge subseafloor. Proc. Natl. Acad. Sci. U. S. A. 98: 2158–2163. Takai K & Horikoshi K (1999) Genetic diversity of archaea in deep-sea hydrothermal vent environments. Genetics 152: 1285–1297. Takai K, Komatsu T, Inagaki F & Horikoshi K (2001) Distribution of archaea in a black smoker chimney structure. Appl. Environ. Microbiol. 67: 3618–3629. Vergin K, Done B, Carlson C & Giovannoni S (2013) Spatiotemporal distributions of rare bacterioplankton populations indicate adaptive strategies in the oligotrophic ocean. Aquat. Microb. Ecol. 71: 1–13.

Figure Legends

Figure 1. Bar charts of bacterial (A) and archaeal (B) taxonomy for all samples. Taxonomy was assigned in VAMPS by the GAST process (Huse et al. 2008). Hulk archaeal sample is classified based on v4–v6 sequence; all others are classified based on v6 sequence.

Figure 2. Cluster dendrograms of diffuse flow and sulfide bacterial samples. Cluster dendrograms were created with group average method using the Bray-Curtis dissimilarity index. Operational taxonomic units are defined at the 3% distance for these analyses: A) analysis including all OTUs in each sample; B) analysis including only abundant OTUs (representing 1% or more of all sequences in each sample); and C) analysis including only rare OTUs (representing 0.1% or less of all sequences in each sample). Background samples are marked by asterisks.

Figure 3. Cluster dendrograms of diffuse flow and sulfide archaeal samples. Cluster dendrograms were created with group average method using the Bray-Curtis dissimilarity index. Operational taxonomic units are defined at the 3% distance for these analyses: A) analysis including all OTUs in each sample; B) analysis including only abundant OTUs (representing 1% or more of all sequences in each sample); and C) analysis including only rare OTUs (representing 0.1% or less of all sequences in each sample). Background samples are marked by asterisks.

Figure 4. Heatmap depicting the relative abundance of archaeal OTUs found in Hulk vent compared to other samples. OTUs are ordered according to their abundance in Hulk. OTUs falling roughly at or below the “Rare in Hulk” marker on the heatmap were present at 0.1% abundance or lower in the Hulk sample. White colors indicate that the OTU was not found in a given sample. Background samples are marked with asterisks.

Figure 5. Phylogenetic tree of Thermococcales based on 16S rRNA gene sequences, with pyrotag sequences added to the reference tree. Red dots indicate a sequence found in the high-temperature Hulk sample; blue dots, sequences found in diffuse flow samples; and green dots, sequences found in sulfide samples. Collapsed wedges are annotated with the

25

number of sequences in each cluster that was found in each respective environment. Evolutionary history was inferred using a rapid bootstrap, maximum likelihood method with 100 alternative runs on distinct starting trees, using the GTR+ optimization of substitution rates and the GAMMA model of heterogeneity in RAxML (Stamatakis 2006). The Evolutionary Placement Algorithm (Berger et al. 2011) was used to insert short reads into the reference tree. Bootstrap values for the reference tree are labeled where they are over 50.

26 100 100 10 20 30 40 50 60 70 80 90 10 20 30 40 50 60 70 80 90 0 0 classi by theGASTprocess (Huseetal. 2008).Hulkarchaeal sampleisclassi Figure (A) ofbacterial andarchaeal (B)taxonomy 1.Barcharts for allsamples. Taxonomy was assigned in VAMPS sulf09 sulf1 sulf10 sulf2 sulf3 f sulf12 ed basedonv6sequence. sulf16 sulf4 sulf5 sulf18 sulf6 sulf20 sulf7 Ax-FS317 sulf8 Ax-FS389 sulf17 Ax-FS392 sulf19 PS-FS430 Ax-FS317 PS-FS431 Ax-FS389 PS-FS432 Ax-FS392 PS-FS433 PS-FS430 PS-FS434 PS-FS431 PS-FS435 PS-FS432 PS-FS445 PS-FS433 PS-FS446 PS-FS434 PS-FS447 PS-FS435 PS-FS448 PS-FS445 PS-FS449 PS-FS446 PS-FS462 PS-FS447 NP-FS467 PS-FS448 NP-FS468 PS-FS449 NP-FS469 PS-FS462 NP-FS467 NP-FS473 NP-FS468 NP-FS475 NP-FS469 NP-FS479 NP-FS473 NP-FS481 NP-FS475 NP-FS482 NP-FS479 Ax-FS502 NP-FS480 Ax-FS503 NP-FS481 Ax-FS505 NP-FS482 Ax-FS509 Ax-FS501 Ax-FS510 Ax-FS502 Ax-FS511 Ax-FS503 Ax-FS518 Ax-FS505 Ax-FS519 Ax-FS509 Ax-FS520 Ax-FS510 Ax-FS521 Ax-FS511 Lo-PP1 Ax-FS518 Lo-PP2 Ax-FS519 Lo-PP4 Ax-FS520 Lo-PP5 Ax-FS521 f Lo-PP1

ed basedonv4–v6 sequence; allothersare Lo-PP6 Lo-PP2 Hulk Lo-PP4 Lo-PP5 Lo-PP6 Hulk Unknown archaea Archaea;;Marine_Group_I Archaea;Crenarchaeota;Miscellaneous_Crenarchaea Archaea;Crenarchaeota;Thermoprotei Archaea;;Archaeoglobi Archaea;Euryarchaeota;Halobacteria Archaea;Euryarchaeota;Methanobacteria Archaea;Euryarchaeota;Methanococci Archaea;Euryarchaeota;Methanomicrobia Archaea;Euryarchaeota;Methanopyri Archaea;Euryarchaeota; Archaea;Euryarchaeota;Thermoplasmata Archaea;Euryarchaeota;Miscellaneous_Euryarchaea Archaea;Korarchaeota Bacteria;Bacteroidetes Bacteria;Cyanobacteria Bacteria;Deferribacteres Bacteria;Deinococcus-Thermus Bacteria;Proteobacteria;Alphaproteobacteria Bacteria;Proteobacteria;Betaproteobacteria Bacteria;Proteobacteria;Deltaproteobacteria Bacteria;Proteobacteria;Epsilonproteobacteria Bacteria;Proteobacteria;Gammaproteobacteria Bacteria;Other C) Rare bacterial OTUs bacterial Rare C) B) Abundant bacterial OTUsB) Abundant bacterial A) OTUs All bacterial Dissimilarity Dissimilarity Dissimilarity * * * KEY: * * Di Sul Di Di Di f f f f f use use use use de * f f f f ow: Axial ow: ow: Nikko ow: Daikoku ow: NWEifuku ow: * * * * * * Di Di Di Di Di * f f f f f use use use use use * * f f f f * f ow: Loihi ow: ow: EDiamante ow: NWRota ow: Forecastow: ow: Hulk (high ow: T) * asterisks. Background samples are marked by sequences in each sample). (representing 0.1%orlessofall analysis including only rare OTUs sequences in each sample); and C) (representing 1%ormore ofall including only abundant OTUs in eachOTUs sample; B)analysis analyses: analysisA) including all defined at the 3%distance forthese index. Operational taxonomic units are using the Bray-Curtis dissimilarity created with groupaverage method samples. Cluster dendrograms were diffuse flowand sulfide bacterial Figure 2. Cluster dendrograms of C) Rare archaeal Rare OTUsC) B) Abundant archaeal OTUs A) All archaeal OTUs

Dissimilarity Dissimilarity Dissimilarity KEY: Di Sul Di Di Di f f f f f use use use use de f f f f ow: Axial ow: ow: Nikko ow: Daikoku ow: NWEifuku ow: * * Di Di Di Di Di f f f f f use use use use use * f f f f f * ow: Loihi ow: * ow: EDiamante ow: NWRota ow: Forecastow: ow: Hulk(high ow: T) * * * * samples are marked byasterisks. sequences in each sample). Background (representing 0.1%orlessofall analysis including only rare OTUs sequences in each sample); and C) (representing 1%ormore ofall analysis including only abundant OTUs including all in each OTUs sample; B) distance forthese analyses: analysisA) taxonomic units are defined at the 3% Curtis dissimilarity index. Operational group average method usingthe Bray- Cluster dendrograms were created with flow and sulfide archaeal samples. Figure 3. Cluster dendrograms ofdiffuse (10.6% ofHulk) O (63% ofHulk) O O O TU3147 TU3645 TU2977 TU3550 Rare in Rare Hulk indicate that the not was foundin OTU a given sample. Background samples are marked with asterisks. in Hulk” marker onthe samples. are OTUs ordered according to their abundance falling in Hulk.OTUs roughly at orbelow the “Rare Figure 4.

REA_Hulk DF_Ax_FS317

Heatmap DF_Ax_FS389 DF_Ax_FS392 DF_Ax_FS502

depicting the relative abundance ofarchaeal foundin Hulkvent OTUs compared to other DF_Ax_FS503 DF_Ax_FS505

heatmap DF_Ax_FS509 DF_Ax_FS510 DF_Ax_FS511

were present at 0.1%abundance orlower in the Hulksample. White colors DF_Ax_FS518 DF_Ax_FS519 DF_Ax_FS520 DF_Ax_FS521 DF_NP_FS467 0% DF_NP_FS468 C

DF_NP_FS469 olor s DF_NP_FS473 DF_NP_FS475 p

DF_NP_FS479 e c

DF_NP_FS481 t r DF_NP_FS482* um

DF_PP1 k

DF_PP2 e y DF_PP5 DF_PP6 DF_PS_FS430 100% DF_PS_FS431 DF_PS_FS432 DF_PS_FS433 DF_PS_FS434* DF_PS_FS435* DF_PS_FS445 DF_PS_FS446 DF_PS_FS447 DF_PS_FS448 DF_PS_FS449 DF_PS_FS462 sulf_09 sulf_10 sulf_12 sulf_16 sulf_20 Figure 4. Heatmap depicting the relative abundance of archaeal OTUs found in Hulk vent compared to other samples. OTUs are ordered according to their abundance in Hulk. OTUs falling roughly at or below the “Rare in Hulk” marker on the heatmap were present at 0.1% abundance or lower in the Hulk sample. White colors indicate that the OTU was not found in a given sample. Background samples are marked with asterisks. Palaeococcus sp. Ax00-3 (AY559124) 8 4 1 OTU115 OTU3171 Palaeococcus helgesonii (AY134472.1.1438) Palaeococcus ferrophilus Palaeococcus sp. DY20341 (HQ696462.1.1446) (AB019239) OTU89 1 10 1 OTU93 OTU99 OTU103 OTU107 99 OTU108 OTU118 OTU120 OTU125 OTU139 OTU143 OTU145 91 OTU1215 OTU1530 OTU1531 OTU1537 OTU1646 OTU2022 OTU2309 OTU2939 OTU3033 OTU3218 OTU3236 OTU3261 OTU3350 OTU3378 OTU3380 OTU3453 OTU3603 OTU3706 Thermococcus mexicalis (AY099181) OTU105 100 Thermococcus aggregans (Y08384) OTU104 OTU1199 OTU2957 OTU1567 Thermococcus barophilus (AY099172) OTU98 OTU1526 OTU1911 OTU2462 Thermococcus barossii (AY099173) OTU130 Thermococcus thioreducens (AF394925) Thermococcus tolerans (AY206705) Thermococcus marinus (AF479012) 53 Thermococcus kodakarensis (AP006878) Thermococcus fumicolans (AB055122) Thermococcus stetteri (AY099186) Thermococcus hydrothermalis (AY099179) Thermococcus coalescens (AB107767) 26 2 9 90 (AY099174) Thermococcus profundus (AY099184) 11 1 1 Thermococcus onnurineus (CP000855) Thermococcus acidaminovorans (AB055120) 59 OTU97 OTU144 OTU3327 furiosus, Pyrococcus horikoshii, Pyrococcus woesei (AE009950, 100 Pyrococcus abyssi (AJ248283) 3 1 5 AY519654) 5 2 9 0.06

Figure 5. Phylogenetic tree of Thermococcales based on 16S rRNA gene sequences, with pyrotag sequences added to the reference tree. Red dots indicate a sequence found in the high-temperature Hulk sample; blue dots, sequences found in difuse fow samples; and green dots, sequences found in sulfde samples. Collapsed wedges are annotated with the number of sequences in each cluster that was found in each respective environment. Evolutionary history was inferred using a rapid bootstrap, maximum likelihood method with 100 alternative runs on distinct starting trees, using the GTR+ optimization of substitution rates and the GAMMA model of heterogene- ity in RAxML (Stamatakis 2006). The Evolutionary Placement Algorithm (Berger et al. 2011) was used to insert short reads into the reference tree. Bootstrap values for the reference tree are labeled where they are over 50. !"##$%&%'()*+,-.$%/, Figure S1. Approximate locations of sampling sites at hydrothermal vents worldwide. Red star: Main Endeavour Field and Axial Seamount, Juan de Fuca Ridge. Blue star: Eifuku, Daikoku, Nikko, Forecast, NW Rota, and E Diamante seamounts, Mariana Arc. Green star: Loihi Seamount, Hawaii. Purple star: Lau Basin. Map was generated using GeoMapApp (http://www.geomapapp.org/). Table S1. Metadata for all samples used in this study. Metadata for publicly available samples were obtained from the VAMPS database.

Sample Depth Sampling) Temp name! Sample type! (m)! Date) Location! Latitude! Longitude! Domain! (˚C)! 4/9/05! Lau Basin, active sulfide ! South Pacific sulf_01! chimney! 2707! Ocean! –20.316686! –176.1363! Bacteria! 2.712!

4/11/05! Lau Basin, active sulfide ! South Pacific sulf_02! chimney! 2714! Ocean! –20.317851! –176.13737! Bacteria! 2.712!

4/13/05! Lau Basin, active sulfide ! South Pacific sulf_03! chimney! 2139! Ocean! –20.761027! –176.19081! Bacteria! 2.712!

4/16/05! Lau Basin, active sulfide ! South Pacific sulf_04! chimney! 1908! Ocean! –22.180673! –176.60124! Bacteria! 2.736!

4/22/05! Lau Basin, ! South Pacific sulf_05! microbial mat! 1918! Ocean! –22.180185! –176.60081! Bacteria! 2.736!

4/22/05! Lau Basin, active sulfide ! South Pacific sulf_06! flange! 1918! Ocean! –22.180185! –176.60081! Bacteria! 2.736!

4/24/05! Lau Basin, active sulfide ! South Pacific sulf_07! flange! 1875! Ocean! –21.989609! –176.56809! Bacteria! 2.731!

5/3/05! Lau Basin, active sulfide ! South Pacific sulf_08! chimney! 2619! Ocean! –20.053045! –176.13374! Bacteria! 2.706!

4/9/05! Lau Basin, active sulfide ! South Pacific sulf_09! chimney! 2707! Ocean! –20.316686! –176.1363! Archaea! 2.712!

4/11/05! Lau Basin, active sulfide ! South Pacific sulf_10! chimney! 2714! Ocean! –20.317851! –176.13737! Archaea! 2.712!

4/16/05! Lau Basin, active sulfide ! South Pacific sulf_12! chimney! 1908! Ocean! –22.180673! –176.60124! Archaea! 2.736!

5/3/05! Lau Basin, active sulfide ! South Pacific sulf_16! chimney! 2619! Ocean! –20.053045! –176.13374! Archaea! 2.706!

active sulfide 4/9/05! Lau Basin, chimney- ! South Pacific sulf_17! bottom! 2707! Ocean! –20.316686! –176.1363! Bacteria! 2.712!

active sulfide 4/9/05! Lau Basin, chimney- ! South Pacific sulf_18! bottom! 2707! Ocean! –20.316686! –176.1363! Archaea! 2.712!

active sulfide 4/9/05! Lau Basin, sulf_19! chimney-top! 2707! ! South Pacific –20.316686! –176.1363! Bacteria! 2.712! Ocean!

4/9/05! Lau Basin, active sulfide ! South Pacific sulf_20! chimney-top! 2707! Ocean! –20.316686! –176.1363! Archaea! 2.712!

9/4/03! Axial Volcano, Hydrothermal ! North Pacific – FS317! fluids! 1526! Ocean! 45.9227283! 129.9882383! Both! 26.6!

9/21/04! Axial Volcano, Hydrothermal ! North Pacific FS389! fluids! 1546! Ocean! 45.933583! –130.013583! Both! 32.5!

9/21/04! Axial Volcano, Hydrothermal ! North Pacific –– FS392! fluids! 1546! Ocean! 45.9337! 130.013617! Both! 68.2!

Hydrothermal 4/21/06! Forecast, FS430! fluids! 1449! ! Philippine Sea! 13.394633! 143.920096! Both! 71!

Hydrothermal 4/21/06! Forecast, FS431! fluids! 1448! ! Philippine Sea! 13.394632! 143.920083! Both! 6!

Hydrothermal 4/21/06! Forecast, FS432! fluids! 1451! ! Philippine Sea! 13.39532! 143.919902! Both! 6.5!

Hydrothermal 4/21/06! Forecast, FS433! fluids! 1447! ! Philippine Sea! 13.395265! 143.919873! Both! 40!

Background 4/21/06! Forecast, FS434! seawater! 195! ! Philippine Sea! 13.3811! 143.9021! Both! !

Background 4/21/06! Forecast, FS435! seawater! 1342.5! ! Philippine Sea! 13.3811! 143.9021! Both! !

Hydrothermal 4/23/06! NW Rota, FS445! fluids! 560! ! Philippine Sea! 14.600912! 144.775483! Both! 19.7!

Hydrothermal 4/23/06! NW Rota, FS446! fluids! 534! ! Philippine Sea! 14.60085! 144.77632! Both! 48!

Hydrothermal 4/23/06! NW Rota, FS447! fluids! 521! ! Philippine Sea! 14.601177! 144.775618! Both! 29!

Hydrothermal 4/23/06! NW Rota, FS448! fluids! 584! ! Philippine Sea! 14.60084! 144.7773! Both! 25!

Hydrothermal 4/23/06! NW Rota, FS449! fluids! 568! ! Philippine Sea! 14.60081! 144.77751! Both! 15.1!

4/30/06! E Diamante, Hydrothermal ! North Pacific FS462! fluids! 353! Ocean! 15.94277! 145.68141! Both! 22.5!

Hydrothermal 5/4/06! Eifuku, North FS467! fluids! 1612! ! Pacific Ocean! 21.48742! 144.04163! Both! 42.9!

Hydrothermal 5/4/06! Eifuku, North FS468! fluids! 1578! ! Pacific Ocean! 21.487248! 144.042123! Both! 45.1!

Hydrothermal 5/4/06! Eifuku, North FS469! fluids! 1578! ! Pacific Ocean! 21.487248! 144.042123! Both! 33.7! Hydrothermal 5/5/06! Daikoku, North FS473! fluids! 438! ! Pacific Ocean! 21.324536! 144.19293! Both! 15.3!

Hydrothermal 5/5/06! Daikoku, North FS475! fluids! 414! ! Pacific Ocean! 21.324962! 144.19139! Both! 45.5!

Hydrothermal 5/8/06! Nikko, North FS479! fluids! 458! ! Pacific Ocean! 23.081017! 142.325483! Both! 80.2!

Hydrothermal 5/8/06! Nikko, North FS480! fluids! 445! ! Pacific Ocean! 23.07913! 142.326433! Bacteria! 24.1!

Hydrothermal 5/8/06! Nikko, North FS481! fluids! 413! ! Pacific Ocean! 23.07977! 142.32687! Archaea! 32.6!

Background 5/8/06! Nikko, North FS482! seawater! 344! ! Pacific Ocean! 23.077802! 142.325151! Both! 14.7!

9/1/06! Axial Volcano, Background ! North Pacific FS501! seawater! 1526! Ocean! 45.94667! –129.98439! Bacteria! 2.4!

9/1/06! Axial Volcano, Hydrothermal ! North Pacific FS502! fluids! 1529! Ocean! 45.94632! –129.98398! Archaea! 83.4!

9/1/06! Axial Volcano, Hydrothermal ! North Pacific FS503! fluids! 1530! Ocean! 45.94364! –29.98519! Both! !

9/1/06! Axial Volcano, Hydrothermal ! North Pacific FS505! fluids! 1524! Ocean! 45.93319! –129.98223! Both! !

9/3/06! Axial Volcano, Hydrothermal ! North Pacific FS509! fluids! 1546! Ocean! 45.93357! –130.01329! Both! 24.6!

9/3/06! Axial Volcano, Hydrothermal ! North Pacific FS510! fluids! 1546! Ocean! 45.93331! –130.01334! Both! 49!

9/3/06! Axial Volcano, Hydrothermal ! North Pacific FS511! fluids! 1546! Ocean! 45.93364! –130.01329! Both! 96.8!

9/3/06! Axial Volcano, Hydrothermal ! North Pacific FS518! fluids! 1546! Ocean! 45.93357! –130.01329! Both! !

9/4/06! Axial Volcano, Hydrothermal ! North Pacific FS519! fluids! 1538! Ocean! 45.91724! –129.99299! Both! 29.6!

9/4/06! Axial Volcano, Hydrothermal ! North Pacific FS520! fluids! 1536! Ocean! 45.91631! –129.98916! Both! 14.9!

9/4/06! Axial Volcano, Hydrothermal ! North Pacific FS521! fluids! 1524! Ocean! 45.92279! –129.98838! Both! 27.5!

LOIHI- Hydrothermal 10/27/06! Loihi Seamount, PP1! fluids! 1272! ! North Pacific 18.900833! –155.261389! Both! ! Ocean!

10/27/06! Loihi Seamount, LOIHI- Hydrothermal ! North Pacific PP2! fluids! 1302! Ocean! 18.910278! –155.25111! Both! !

11/3/06! Loihi Seamount, LOIHI- Hydrothermal ! North Pacific PP4! fluids! 4983! Ocean! 18.703056! –155.180833! Bacteria!

11/4/06! Loihi Seamount, LOIHI- Hydrothermal ! North Pacific PP5! fluids! 1308! Ocean! 18.31222! –155.26111! Both! !

11/6/06! Loihi Seamount, LOIHI- Hydrothermal ! North Pacific PP6! fluids! 4988! Ocean! 18.703056! –155.180833! Both! !

10/30/06! Loihi Seamount, LOIHI- Background ! North Pacific CTD03! seawater! 1100! Ocean! 18.911667! –155.26194! Bacteria!

Background 4/24/06! NW Rota, CTDBtl12! seawater! ! Philippine Sea! 14.644167! –144.56667! Bacteria! 6.3!

6/17/2009! Juan de Fuca Hydrothermal Ridge, North Hulk! fluids! 2178! Pacific Ocean! 47.9500! –129.0968! Both! 125!

A) 3500

3000

2500

2000

1500

1000

Number of unique OTUs (0.03) 500

0 0 10000 20000 30000 40000 50000 60000 Number of sequences sampled

B) 500 450

400 03) . 350

300

250

200

150

100 Number of unique OTUs (0 50

0 0 5000 10000 15000 20000 25000 30000 35000 Number of sequences sampled

Figure S2. Rarefaction curves of A) bacterial and b) archaeal samples. Di use ow samples are noted in grey, sul de samples are noted in blue, and Hulk sample is noted in red. Rarefaction curves were generated in mothur (Schloss et al., 2009). Table S2. Evenness indices for each of the samples used in this study. Both the Shannon and Simpson indices are reported for both domains. All indices were calculated in mothur (Schloss et al., 2009). The overall difference in evenness between archaea and bacteria is significant for the Shannon test (t-test, Shannon p-value = 7.81E-18, Simpson p-value = 0.061).

Bacteria! Archaea!

Group! Simpson! Shannon! Simpson! Shannon!

DF_Ax_FS317! 0.037! 0.73! 0.0096! 0.31!

DF_Ax_FS389! 0.012! 0.66! 0.011! 0.38!

DF_Ax_FS392! 0.012! 0.51! 0.017! 0.21!

DF_Ax_FS501! 0.010! 0.20! ! !

DF_Ax_FS502! 0.0060! 0.16! 0.010! 0.36!

DF_Ax_FS503! 0.0041! 0.40! 0.0092! 0.29!

DF_Ax_FS505! 0.011! 0.59! 0.010! 0.36!

DF_Ax_FS509! 0.017! 0.62! 0.035! 0.54!

DF_Ax_FS510! 0.025! 0.63! 0.010! 0.31!

DF_Ax_FS511! 0.021! 0.55! 0.020! 0.29!

DF_Ax_FS518! 0.061! 0.73! 0.027! 0.50!

DF_Ax_FS519! 0.035! 0.72! 0.010! 0.26!

DF_Ax_FS520! 0.031! 0.73! 0.019! 0.16!

DF_Ax_FS521! 0.037! 0.71! 0.015! 0.22!

DF_NP_FS467! 0.028! 0.66! 0.019! 0.50!

DF_NP_FS468! 0.011! 0.64! 0.010! 0.35!

DF_NP_FS469! 0.037! 0.74! 0.018! 0.48!

DF_NP_FS473! 0.0037! 0.48! 0.013! 0.25!

DF_NP_FS475! 0.012! 0.49! 0.019! 0.28!

DF_NP_FS479! 0.023! 0.63! 0.012! 0.21!

DF_NP_FS480! 0.031! 0.60! ! !

DF_NP_FS481! 0.026! 0.66! 0.010! 0.36!

DF_NP_FS482! 0.022! 0.71! 0.014! 0.39!

DF_Lo_PP1! 0.026! 0.63! 0.018! 0.47!

DF_Lo_PP2! 0.031! 0.69! 0.016! 0.49! DF_Lo_PP4! 0.015! 0.62! ! !

DF_Lo_PP5! 0.020! 0.65! 0.0036! 0.24!

DF_Lo_PP6! 0.023! 0.66! 0.024! 0.43!

DF_PS_FS430! 0.079! 0.74! 0.022! 0.26!

DF_PS_FS431! 0.0085! 0.66! 0.024! 0.43!

DF_PS_FS432! 0.024! 0.75! 0.019! 0.31!

DF_PS_FS433! 0.039! 0.70! 0.019! 0.13!

DF_PS_FS434! 0.011! 0.60! 0.044! 0.49!

DF_PS_FS435! 0.0098! 0.60! 0.018! 0.40!

DF_PS_FS445! 0.045! 0.69! 0.048! 0.45!

DF_PS_FS446! 0.012! 0.47! 0.023! 0.36!

DF_PS_FS447! 0.024! 0.68! 0.015! 0.36!

DF_PS_FS448! 0.034! 0.72! 0.016! 0.50!

DF_PS_FS449! 0.027! 0.66! 0.029! 0.52!

DF_PS_FS462! 0.0060! 0.48! 0.0079! 0.24!

Hulk high temp! 0.020! 0.60! 0.011! 0.33! sulf_01! 0.0065! 0.44! ! ! sulf_02! 0.023! 0.57! ! ! sulf_03! 0.037! 0.71! ! ! sulf_04! 0.019! 0.63! ! ! sulf_05! 0.012! 0.55! ! ! sulf_06! 0.011! 0.60! ! ! sulf_07! 0.0071! 0.46! ! ! sulf_08! 0.037! 0.68! ! ! sulf_09! 0.039! 0.52! sulf_10! 0.015! 0.27! sulf_12! 0.022! 0.33! sulf_16! 0.016! 0.18! sulf_17! 0.0073! 0.42! ! ! sulf_19! 0.021! 0.57! ! ! sulf_20! 0.029! 0.47! CTDBTL12! 0.048! 0.72! ! !

LOIHI_CTD03! 0.016! 0.58! ! !

Average! 0.023! 0.61! 0.019! 0.35!

A) All Bacterial OTUs Figure S3. Cluster dendrograms of di use ow and sul de bacterial samples. Cluster dendrograms were created with group average method using distance matrices calculated using the Unifrac method after creating a phylogenetic tree for all sequences in each sample using clearcut within the mothur package (Schloss et al., 2009). For comparison between samples, we randomly subsampled the dataset 1000 times to the number of sequences contained within the smallest sample. We used the same previously-de ned sequences within the abundant and rare groupings for this analysis. A) analysis including all OTUs in each sample; B) analysis including only abundant OTUs B) Abundant Bacterial OTUs (representing 1% or more of all sequences in each sample); and C) analysis including only rare OTUs (representing 0.1% or less of all sequences in each sample). Background samples are marked by asterisks, and were collected either with a Niskin rosette on a CTD (labeled “CTD__“) or were collected by the hydrothermal uid sampler on the bottom during travel between or away from vents (labeled “FS__”). Samples are labeled according to uid sample number, seamount, and region: NP = North Paci c, Ax = Axial Seamount, PS = Philippine Sea, Lo = Loihi Seamount.

C) Rare Bacterial OTUs A) All Archaeal OTUs Figure S4. Cluster dendrograms of di use ow and sul de archaeal samples. Cluster dendrograms were created with group average method using distance matrices calculated using the Unifrac method after creating a phylogenetic tree for all sequences in each sample using clearcut within the mothur package (Schloss et al., 2009). For comparison between samples, we randomly subsampled the dataset 1000 times to the number of sequences contained within the smallest sample. We used the same previously-de ned sequences within the abundant and rare groupings for this analysis. A) analysis including all OTUs in each sample; B) analy- sis including only abundant OTUs B) Abundant Archaeal OTUs (representing 1% or more of all sequences in each sample); and C) analysis including only rare OTUs (representing 0.1% or less of all sequences in each sample). Background samples are marked by asterisks, and were collected either with a Niskin rosette on a CTD (labeled “CTD__“) or were collected by the hydrothermal uid sampler on the bottom during travel between or away from vents (labeled “FS__”). Samples are labeled according to uid sample number, seamount, and region: NP = North Paci c, Ax = Axial Seamount, PS = Philippine Sea, Lo = Loihi Seamount.

C) Rare Archaeal OTUs Table S3. ANOSIM results for bacterial and archaeal datasets, grouped according to environment as listed in Figures 1 and 2. ANOSIM was conducted as a one-way analysis on a distance matrix calculated with the Unifrac method among samples. Trees of all sequences in each sample were constructed using clearcut in the mothur package (Schloss et al., 2009), and distance matrices were calculated according to the Unifrac method (Lozupone & Knight, 2005) in the mothur package. ANOSIM analysis was conducted using the PRIMERv6 software package (Clarke & Gorley, 2006). We conducted nine hundred ninety nine permutations of the test for each ANOSIM analysis. A test is considered significant if p ≤ 0.001. Clustering was significant even after removing sulfides from the analysis, indicating that differentiation occurred not only due to sample type but also due to sample location. Statistical analyses remained the same even after OTU singletons (only appearing once in a sample) were removed (data not shown).

Domain Grouping With sulfides Without sulfides R p- Significant? R p- Significant? statistic value statistic value Bacteria All 0.534 <0.001 Yes 0.443 <0.001 Yes Abundant 0.307 <0.001 Yes 0.253 0.002 No Rare 0.588 <0.001 Yes 0.498 <0.001 Yes Archaea All 0.587 <0.001 Yes 0.468 <0.001 Yes Abundant 0.307 0.003 No 0.092 0.18 No Rare 0.577 <0.001 Yes 0.462 <0.001 Yes Bacteria unclassified 100% Chlorobi Synergistetes 90% Lentisphaerae

80% Fusobacteria Spirochaetes Gemmatimonadetes 70% Planctomycetes Other Nitrospirae 60% Thermodesulfobacteria Fusobacteria 50% Deinococcus-Thermus Aquificae Actinobacteria 40%

Percent of Community Cyanobacteria Deferribacteres

30% Chloroflexi Verrucomicrobia Firmicutes 20% Betaproteobacteria Bacteroidetes Deltaproteobacteria 10% Alphaproteobacteria Epsilonproteobacteria 0% Gammaproteobacteria Abundant OTUs Rare OTUs

Figure S5. Bar charts of bacterial taxonomy for abundant and rare OTUs. Each category includes OTUs that were abundant or rare in at least one sample. There were 434 OTUs that were abundant in at least one sample, and 21733 OTUs that were rare in at least one sample. Taxonomy was assigned in mothur (Schloss et al., 2009) according to alignment to the SILVA database (Quast et al., 2013). Archaea 100%

90%

80% Unclassified 70% Candidatus_Korarchaeum Methanobacteria 60% Halobacteria Methanomicrobia 50% Methanococci

40% Other Thermoprotei Archaeoglobi Percent of community 30% Marine Group I 20% Thermoplasmata

10%

0% Abundant OTUs Rare OTUs

Figure S6. Bar charts of archaeal taxonomy for abundant and rare OTUs. Each category includes OTUs that were abundant or rare in at least one sample. There were 263 OTUs that were abun- dant in at least one sample, and 3435 OTUs that were rare in at least one sample. Taxonomy was assigned in mothur (Schloss et al., 2009) according to alignment to the SILVA database (Quast et al., 2013) Thermococcus litoralis (AB603515.1.1360) QUERY___2490|1|DF_Ax_FS317 QUERY___2512|221|DF_Ax_FS317-DF_Ax_FS5 QUERY___2513|6|DF_Ax_FS317 QUERY___2526|1|DF_Ax_FS521 QUERY___3210|1|sulf_20 QUERY___3358|1|sulf_20 QUERY___3362|1|sulf_20 QUERY___3365|1|sulf_20 QUERY___3662|1|09 100 Methanothermococcus thermolithotrophicus (M59128.1.1452) Methanothermococcus sp. Mc70 (AB260042.1.1370) QUERY___2487|2|DF_Ax_FS317 QUERY___2476|4|DF_Ax_FS317-DF_Ax_FS389 QUERY___2479|1|DF_Ax_FS317 QUERY___2480|1|DF_Ax_FS317 QUERY___2481|1|DF_Ax_FS317 QUERY___2482|1|DF_Ax_FS317 QUERY___2483|7|DF_Ax_FS317-DF_Ax_FS521 QUERY___2488|4|DF_Ax_FS317-DF_PS_FS448 QUERY___2489|1|DF_Ax_FS317 QUERY___2491|2|DF_Ax_FS317 QUERY___2493|1|DF_Ax_FS317 QUERY___2494|5|DF_Ax_FS317-DF_Ax_FS521 QUERY___2495|1|DF_Ax_FS317 QUERY___2496|1|DF_Ax_FS317 QUERY___2497|1|DF_Ax_FS317 QUERY___2498|5|DF_Ax_FS317-DF_Ax_FS521 QUERY___2499|1|DF_Ax_FS317 QUERY___2500|2|DF_Ax_FS317 QUERY___2501|1|DF_Ax_FS317 QUERY___2502|1|DF_Ax_FS317 QUERY___2503|6|DF_Ax_FS317-DF_Ax_FS521 QUERY___2504|1|DF_Ax_FS317 QUERY___2505|2|DF_Ax_FS317 QUERY___2508|5|DF_Ax_FS317-DF_Ax_FS503 QUERY___2509|1|DF_Ax_FS317 QUERY___2510|2|DF_Ax_FS317 QUERY___2511|1|DF_Ax_FS317 QUERY___2527|1|DF_Ax_FS521 QUERY___2528|1|DF_Ax_FS521 QUERY___2529|1|DF_Ax_FS521 QUERY___2530|3|DF_Ax_FS317-DF_Ax_FS521 100 QUERY___2531|1|DF_Ax_FS521 QUERY___2532|1|DF_Ax_FS521 QUERY___2533|1|DF_Ax_FS521 QUERY___2534|1|DF_Ax_FS521 QUERY___2535|1|DF_Ax_FS521 QUERY___2536|2|DF_Ax_FS521 QUERY___2537|5|DF_Ax_FS521-DF_NP_FS467 QUERY___2538|2|DF_Ax_FS521 QUERY___2539|1|DF_Ax_FS521 QUERY___2540|1|DF_Ax_FS521 QUERY___2541|3|DF_Ax_FS317-DF_Ax_FS521 52 QUERY___2542|1|DF_Ax_FS521 100 QUERY___2543|1|DF_Ax_FS521 QUERY___2544|22|DF_Ax_FS317-DF_Ax_FS52 QUERY___2545|1|DF_Ax_FS521 QUERY___2546|1|DF_Ax_FS521 QUERY___2547|3|DF_Ax_FS317-DF_Ax_FS521 QUERY___2548|1|DF_Ax_FS521 QUERY___2549|3|DF_Ax_FS389-DF_Ax_FS521 QUERY___2550|4|DF_Ax_FS317-DF_Ax_FS521 QUERY___2552|3|DF_Ax_FS317-DF_Ax_FS521 QUERY___2554|1|DF_Ax_FS521 QUERY___2555|1|DF_Ax_FS521 99 QUERY___2977|12878|DF_Ax_FS317-DF_Ax_F QUERY___3040|2|DF_Ax_FS503-DF_Ax_FS521 Methanococcus aeolicus (AB546256.1.1391) QUERY___1978|1|DF_PS_FS448 QUERY___1980|1|DF_PS_FS448 QUERY___2485|1|DF_Ax_FS317 QUERY___2492|1|DF_Ax_FS317 QUERY___2506|1|DF_Ax_FS317 88 QUERY___2507|1|DF_Ax_FS317 QUERY___2551|2|DF_Ax_FS521-DF_PS_FS448 QUERY___2553|1|DF_Ax_FS521 QUERY___3041|1646|09-DF_Ax_FS317-DF_Ax Methanothermococcus okinawensis IH1 (CP002792.762570.764038)

Methanococcus voltae A3 (CP002057.1574464.1575930) 99 Methanococcus maripaludis (AB264796.1.1402) QUERY___60|1|REA_Hulk QUERY___61|2|REA_Hulk QUERY___3298|1|sulf_20 QUERY___3357|2|sulf_20 QUERY___3359|1|sulf_20 QUERY___3360|1|sulf_20 QUERY___3363|1|sulf_20 QUERY___3364|1|sulf_20 QUERY___3665|1|09 QUERY___3700|1|09 QUERY___3703|1|09 QUERY___3704|1|09 QUERY___3708|1|09 QUERY___3711|4218|09-DF_Ax_FS317-DF_Ax Methanotorris formicicus Mc-S-70 (AB095167.1.1171)

0.05

Figure S7. Phylogenetic tree of Methanococcales based on 16S rRNA gene sequences, with pyrotag sequences added into reference tree. Red dots indicate a sequence found in the high-temperature Hulk sample, blue dots indicate sequences found in di use ow samples, and green dots indicate sequences found in sul de samples. Collapsed wedges are annotated with the number of sequences in each cluster that was found in each respective environment. Evolutionary history was inferred using a rapid boot- strap, maximum likelihood method with 100 alternative runs on distinct starting trees, using the GTR+ optimization of substitution rates and the GAMMA model of heterogeneity in RAxML (Stamatakis 2006). The Evolutionary Placement Algorithm (EPA) (Berger et al, 2011) was used to insert short reads into the reference tree. Bootstrap values for the reference tree are labeled.