bioRxiv preprint doi: https://doi.org/10.1101/2021.08.21.457207; this version posted August 22, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Active anaerobic methane oxidation and sulfur disproportionation in the deep terrestrial subsurface

Emma Bell1,5*, Tiina Lamminmäki2, Johannes Alneberg3, Chen Qian4, Weili Xiong4, Robert L. Hettich4, Manon Frutschi1 and Rizlan Bernier-Latmani1*

1 Environmental Microbiology Laboratory, Environmental Engineering Institute, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, 1015, Switzerland

2 Posiva Oy, Eurajoki, 27160, Finland

3 Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, SE-17121, Sweden

4 Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States

5 Present address: Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada

*Emma Bell and Rizlan Bernier-Latmani

Email: [email protected] and [email protected]

Author Contributions: Conceptualization, E.B., R.B-L. and T.L.; Methodology, E.B., R.B-L., T.L., M.F., J.A., R.L.H.; Formal analysis, E.B.; Investigation, E.B., M.F., C.Q., W.X.; Resources, R.B-L., T.L., and R.L.H.; Writing – Original Draft, E.B. and R.B-L.; Writing – Review and Editing, E.B., R.B-L. and T.L.; Supervision, R.B-L. and T.L.; Funding acquisition, R.B-L. and T.L.

Competing Interest Statement: The authors declare no competing interests.

Keywords: sulfur disproportionation, anaerobic oxidation of methane, deep subsurface, metagenomics, metaproteomics

Abstract

Microbial life is widespread in the terrestrial subsurface and present down to several kilometers depth, but the energy sources that fuel metabolism in deep oligotrophic and anoxic environments remain unclear. In the deep crystalline bedrock of the Fennoscandian Shield at Olkiluoto, Finland, opposing gradients of abiotic methane and ancient seawater-derived sulfate create a terrestrial sulfate-methane transition zone (SMTZ). We used chemical and isotopic data coupled to genome-resolved metaproteogenomics to demonstrate active life and, for the first time, provide direct evidence of active anaerobic oxidation of methane (AOM) in a deep terrestrial bedrock. Proteins from Methanoperedens (formerly ANME-2d) are readily identifiable despite the low abundance (≤1%) of this genus and confirm the occurrence of AOM. This finding is supported by 13C-depleted dissolved inorganic carbon. Proteins from Desulfocapsaceae and Desulfurivibrionaceae, in addition to 34S-enriched sulfate, suggest that these organisms use inorganic sulfur compounds as both electron donor and acceptor. Zerovalent sulfur in the groundwater may derive from abiotic rock interactions, or from a non-obligate syntrophy with Methanoperedens, potentially linking methane and sulfur cycles in Olkiluoto groundwater. Finally, putative episymbionts from the candidate phyla radiation (CPR) and DPANN archaea represented a significant diversity in the groundwater (26/84 genomes) with roles in sulfur and carbon cycling. Our results highlight AOM and sulfur disproportionation as active metabolisms and show that methane and sulfur fuel microbial activity in the deep terrestrial subsurface.

Significance Statement

The deep terrestrial subsurface remains an environment in which there is limited understanding of the extant microbial metabolisms, despite its reported large contribution to the overall biomass on Earth. It is much less well studied than deep marine sediments. We show that microorganisms in the subsurface are active, and that methane and sulfur provide fuel in the oligotrophic and anoxic subsurface. We also uncover taxonomically and metabolically diverse ultra-small organisms that interact with larger host cells through surface attachment (episymbiosis). Methane and sulfur are commonly reported in terrestrial crystalline bedrock environments worldwide, and these environments cover a significant proportion of the Earth’s surface. Thus, methane- and sulfur-dependent microbial metabolisms have the potential to be widespread in the terrestrial deep biosphere.

Main Text

Introduction

Microbial cells in the subsurface comprise a significant proportion of the global prokaryotic biomass (1). They are typically slow-growing, with long generation times ranging from months to years (2–5). Despite their slow growth, they are active (6–9) and can respond to changes in subsurface conditions (10, 11). Understanding the drivers of microbial metabolism in the subsurface is therefore an important consideration for subsurface activities related to energy production, storage, and waste disposal.

The engineering of stable bedrock formations for industrial purposes (spent nuclear fuel disposal, underground storage of gases, geothermal energy production, oil and gas recovery) alters the subsurface ecosystem, causing biogeochemical changes that impact microbial community activity and function (12, 13). Microbial activity can be beneficial, either through enhanced recovery (14), removal of unwanted products (15), or competitive exclusion (16). It is, however, often associated with detrimental impacts such as microbially-influenced corrosion (17), petroleum reservoir souring (18), or reduced permeability and clogging resulting from the formation of biofilms (13).

The crystalline bedrock of Olkiluoto, Finland, will host a repository for the final disposal of spent nuclear fuel. Copper canisters containing spent nuclear fuel will be one of several engineered barriers for safe final disposal, the final barrier being the Olkiluoto bedrock (400–500 m depth). Understanding the energy sources that fuel microbial activity at this site is important as slowly moving groundwater present in cracks and fractures in the bedrock has the potential to reach and interact with the engineered barriers. The microbial transformation of sulfate to sulfide in the fracture water is a significant concern as it could accelerate corrosion of copper canisters and compromise their longevity. The groundwater at Olkiluoto is geochemically stratified with depth, with a salinity gradient that extends from fresh near the surface (<100 m) to saline (>1000 m). Sulfate is found in fractures at ~100–300 m depth. Below 300 m, the concentration of sulfate declines in an opposing trend to methane, which increases with depth. This creates a sulfate-methane transition zone (SMTZ) at ~250–350 m depth.

Sulfide is detected in relatively few drillholes at Olkiluoto, and its presence tends to be in groundwater from fractures in the SMTZ (19). Exceptions in deeper fractures exist where drilling has resulted in the introduction of sulfate-rich groundwater to deep methane-rich groundwater, providing an electron acceptor to a system otherwise limited in terminal electron acceptors (11). In marine and coastal sediments, SMTZs are host to anaerobic methanotrophic archaea (ANME) and sulfate-reducing bacteria that syntrophically cycle methane and sulfate (20), and quantitatively consume methane diffusing up from deep methanogenic sources through anaerobic oxidation of methane (AOM) (21, 22). In contrast, the biogeochemical significance of syntrophic interactions is poorly understood in terrestrial SMTZs. Previous work has proposed, but not demonstrated, widespread AOM in crystalline bedrock environments. This hypothesis was based on the detection of ANME from the archaeal family Methanoperedenaceae (formerly ANME-2d) in groundwater from granitic bedrock in both Finland (23, 24) and Japan (25–27). Additionally, past AOM activity has been evidenced by the precipitation of extremely 13C-depleted calcites in the granitic bedrock of Sweden (28).

While many ANME rely on a syntrophic partner to couple methane oxidation to the reduction of sulfate, Methanoperedenaceae can independently conduct methane oxidation with diverse terminal electron acceptors (29, 30). Methanoperedenaceae were originally enriched in a bioreactor shown to couple AOM to the reduction of nitrate (31). Methanoperedenaceae lacking nitrate reductase genes have since been enriched in bioreactors demonstrated to couple AOM to the reduction of iron (Fe(III)) and manganese (Mn(IV)) (32). Studies from freshwater lake sediments have proposed that Methanoperedenaceae also participate in sulfate-dependent AOM (33, 34). Similarly, enrichment experiments using deep groundwater suggest Methanoperedenaceae conduct sulfate-dependent AOM in the terrestrial subsurface (26, 35). However, AOM activity has not been demonstrated in situ in any deep crystalline bedrock terrestrial environment to date.

Here, we use genome-resolved metaproteogenomics to uncover active sulfur and methane cycling microorganisms in groundwater from a fracture in the SMTZ at Olkiluoto, Finland. Our goal was to constrain the electron donor(s) fueling sulfidogenesis and determine whether sulfur and methane cycles in the terrestrial SMTZ are linked. Stable isotopes and metaproteomics show that active AOM occurs in this environment and that sulfate-reducing bacteria (SRB) and sulfur-disproportionating bacteria (SDB) are active. Our data suggest that a syntrophic relationship may exist between ANME archaea and Desulfobacterota with dissimilatory sulfur metabolism, either through extracellular electron transfer or diffusible sulfur species. Additionally, we propose that ultra-small bacteria and archaea, abundant in the groundwater, live as episymbionts on the primary microbiome.

Results

Stable isotopes indicate occurrence of dissimilatory sulfur metabolism and anaerobic oxidation of methane

The groundwater was temperate (9.6°C), reducing (oxidation-reduction potential = -420 mV), brackish (total dissolved solids (TDS) 7.4 g/L; conductivity 13 mS/cm) and at circumneutral pH (pH value 7.6). DAPI (4′,6-diamidino-2-phenylindole) cell counts indicated an abundance of 2.3 × 10^5 cells/mL. Sulfate in the groundwater (~0.8 mM; Fig. 1A) originates from the Littorina Sea, which preceded the Baltic Sea 2,500–8,500 years ago (19). In the sampled fracture, sulfate is 32S-depleted (44.6 ‰ VCDT; Fig. 1A) relative to groundwater with the same sulfate source at Olkiluoto (~25 ‰; (19)), resulting from the preferential use of 32S by microorganisms with dissimilatory sulfur metabolism. Reduced sulfur compounds, thiosulfate (~0.1 mM) and sulfide (~0.4 mM), are also present (Fig. 1A), indicating ongoing metabolism of sulfur compounds. The isotope composition of methane (-42.9 ‰ VPDB; Fig. 1B) reflects deep abiotic methane mixed with an increasing fraction of microbially-produced methane with decreasing depth at Olkiluoto (19). During AOM, the light carbon isotope value of methane is transferred to CO2. The light dissolved inorganic carbon (DIC) isotope value (-28.0 ‰ VPDB; Fig. 1B) therefore indicates that methane could have contributed to the DIC pool, and likely reflects the mixing of distinct carbon sources including a minor AOM contribution. Hydrogen was relatively abundant (~30 µM) throughout the sampling period, except in the final sampling month (November) where hydrogen was not detected (Dataset S1). The concentration of dissolved gases in situ may be greater than the measured values as the pressure decrease during pumping can cause degassing. Acetone (6 µM) and ethanol (3 µM) were detected but account for only ~5% of the measured dissolved organic carbon (0.5 mM), meaning that unidentified organic compounds are present in the groundwater.
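The ~5% figure can be checked by converting each detected compound to carbon equivalents. The sketch below assumes that the dissolved organic carbon value (0.5 mM) is expressed as moles of carbon, which the text does not state explicitly; the function and variable names are illustrative only.

```python
# Carbon atoms per molecule, from the chemical formulas
# (acetone C3H6O, ethanol C2H6O).
CARBONS_PER_MOLECULE = {"acetone": 3, "ethanol": 2}


def carbon_fraction(conc_um, doc_um_c):
    """Fraction of DOC (µM C) explained by the listed compounds (µM)."""
    identified_c = sum(
        CARBONS_PER_MOLECULE[name] * conc for name, conc in conc_um.items()
    )
    return identified_c / doc_um_c


measured = {"acetone": 6.0, "ethanol": 3.0}  # µM, values reported in the text
doc_total = 500.0  # 0.5 mM DOC, ASSUMED to be in units of carbon

fraction = carbon_fraction(measured, doc_total)
print(f"{fraction:.1%}")  # prints "4.8%"
```

Under that assumption the two compounds contribute 24 µM C of 500 µM C, so roughly 95% of the dissolved organic carbon remains unidentified.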

Metaproteomics reveal active sulfur disproportionation among sulfur cycling bacteria

Analysis of the microbial community composition by 16S rRNA gene amplicon profiling (Fig. 2) and phylogenomics (Fig. 3) revealed diverse microorganisms from both bacterial and archaeal lineages. Dereplication of metagenome-assembled genomes (MAGs) from seven metagenomic datasets augmented with single cell genomes (SAGs) resulted in the recovery of 84 genomes from 25 phyla (Fig. 3; genome completeness is provided in Dataset S2). Desulfobacterota, a phylum known to harbor sulfur-cycling bacteria, were most abundant in the 0.2 µm size fraction (Fig. 2), consistent with chemical and isotope data that support dissimilatory sulfur metabolism in the groundwater (Fig. 1). Genes for dissimilatory sulfate reduction, namely sulfate adenylyltransferase (sat), adenylylsulfate reductase alpha and beta subunits (aprAB), and dissimilatory sulfite reductase alpha, beta and delta subunits (dsrABD), were detected in MAGs from the phyla Desulfobacterota and Nitrospirota (Fig. 4). Genomes from these phyla also contained electron transport complexes qmoABC and dsrMKJOP and the sulfur relay protein dsrC (Fig. 4 and Dataset S3). One MAG from the Gammaproteobacteria (genus Hydrogenophaga) contained genes for dissimilatory sulfate reduction (Dataset S3) but was inferred to be a sulfur oxidizer based on the presence of subunits dsrEFH (essential for reverse function) and absence of dsrD (needed for forward function) (36, 37).
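The rule used to infer the direction of the Dsr pathway (dsrEFH present and dsrD absent implies oxidation; dsrD present implies reduction) can be written as a small decision function. This is a minimal sketch of the logic as stated in the text, not the annotation pipeline used in the study; the function name, the input format, and the example gene inventories are assumptions made for illustration.

```python
def infer_dsr_direction(genes):
    """Classify the likely direction of the Dsr pathway from gene names.

    `genes` is any iterable of gene names (case-insensitive).
    """
    present = {g.lower() for g in genes}
    # Without the catalytic subunits there is no Dsr pathway to classify.
    if not {"dsra", "dsrb"} <= present:
        return "no dsrAB"
    has_dsr_efh = {"dsre", "dsrf", "dsrh"} <= present  # reverse function
    has_dsr_d = "dsrd" in present                      # forward function
    if has_dsr_efh and not has_dsr_d:
        return "oxidative"
    if has_dsr_d and not has_dsr_efh:
        return "reductive"
    return "ambiguous"


# Hypothetical gene inventories, for illustration only.
oxidizer_like = ["sat", "aprA", "aprB", "dsrA", "dsrB", "dsrE", "dsrF", "dsrH"]
reducer_like = ["sat", "aprA", "aprB", "dsrA", "dsrB", "dsrD"]

print(infer_dsr_direction(oxidizer_like))  # oxidative
print(infer_dsr_direction(reducer_like))   # reductive
```

Genomes carrying both or neither marker are returned as "ambiguous" rather than forced into a category, since gene content alone does not settle the question in those cases.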

Metaproteomes from two sampling months (Fig. 4) were explored to uncover the activity of microorganisms in the groundwater at the time of sampling. The greatest number of proteins among sulfur-cycling bacteria were recovered from Desulfocapsaceae, Desulfurivibrionaceae, and ETH-SRB1 (Dataset S4). Sulfur cycling enzymes from these three MAGs had high spectral counts (the sum of MS/MS spectra for all peptides identified to a particular protein), indicating that these proteins were abundant (Fig. 4). Desulfocapsaceae, Desulfurivibrionaceae, ETH-SRB1 and many other Desulfobacterota harbored the metabolic potential for H2 oxidation, but peptides from respiratory NiFe hydrogenase were only detected from Pseudodesulfovibrio aespoeensis (Fig. 4). P. aespoeensis was originally isolated from another subsurface location in the Fennoscandian Shield (Äspö, Sweden) (38), and belongs to the core microbiome of the aquifer in this crystalline rock (9). Peptides from the beta subunit of adenylylsulfate reductase (AprB) were also detected in the P. aespoeensis proteome (Fig. 4); thus, H2-dependent sulfate reduction is suggested as the metabolism for this MAG.
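Spectral counting as defined above amounts to tallying peptide-spectrum matches (PSMs) per protein. The sketch below illustrates that definition only; it assumes each spectrum has already been assigned to a single protein (shared peptides are not handled), and the tuple layout, scan identifiers, and peptide strings are hypothetical.

```python
from collections import Counter


def spectral_counts(psms):
    """Sum MS/MS spectra over all peptides assigned to each protein.

    `psms` is an iterable of (spectrum_id, peptide, protein) tuples.
    """
    counts = Counter()
    for _spectrum_id, _peptide, protein in psms:
        counts[protein] += 1  # one matched spectrum = one count
    return counts


# Hypothetical peptide-spectrum matches, for illustration only.
psms = [
    ("scan_001", "PEPTIDEA", "AprB"),
    ("scan_002", "PEPTIDEA", "AprB"),  # same peptide, second spectrum
    ("scan_003", "PEPTIDEB", "AprB"),
    ("scan_004", "PEPTIDEC", "DsrA"),
]

counts = spectral_counts(psms)
print(counts.most_common())  # [('AprB', 3), ('DsrA', 1)]
```

Because the count grows with both the number of distinct peptides and the number of times each is sampled, it serves as a rough proxy for protein abundance rather than an absolute quantity.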

In addition to hydrogen, some Desulfobacterota had the potential to oxidize organic carbon (lactate and acetate; Dataset S3) but peptides from lactate dehydrogenase (Ldh) and acetyl-CoA synthetase (Acs) were not detected in the metaproteome. We therefore searched the metaproteome for enzymes catalyzing the oxidation of an alternate electron donor. We found that some proteins from the ‘YTD’ gene cluster were expressed in both Desulfocapsaceae and Desulfurivibrionaceae (Fig. 4). This gene cluster is composed of five genes: a sulfur transport yedE-like gene, a sulfurtransferase tusA gene, a dsrE-like gene, and two conserved hypothetical proteins (chp1 and chp2) and is a potential genome marker for SDB (39). While it is difficult to discriminate SDB from SRB by genomic features alone (39–43), the expression of proteins from the YTD cluster suggests that Desulfocapsaceae and Desulfurivibrionaceae (and possibly other Desulfobacterota with the YTD cluster; Fig. 4) disproportionate sulfur in this groundwater. Desulfocapsaceae and Desulfurivibrionaceae also expressed a polysulfide reductase subunit B (psrB) adjacent to a thiosulfate reductase/polysulfide reductase (phsA/psrA) alpha subunit (Fig. 4). Polysulfide reductase catalyzes the respiratory conversion of polysulfide to hydrogen sulfide (44), while thiosulfate reductase catalyzes the initial step in disproportionation of thiosulfate (40). These enzymes are difficult to distinguish by their protein sequence as both are molybdopterin enzymes with a close phylogenetic relationship (45), but their expression suggests that Desulfocapsaceae and Desulfurivibrionaceae can use polysulfide and/or thiosulfate.

Metaproteomics reveals active AOM in deep groundwater

Methanoperedens (phylum Halobacteriota) accounted for just ~1% of 16S rRNA gene amplicon reads (Fig. 2) and the metagenomic community but had one of the greatest numbers of proteins detected in the metaproteome of all the recovered MAGs (Dataset S2). Proteins for AOM were abundant (Fig. 4), suggesting active AOM by Methanoperedens in this groundwater. Peptides from alpha and gamma subunits of methyl coenzyme M reductase (MCR) were also detected from a second, less abundant Methanoperedens MAG (0.17% of the metagenomic community). Furthermore, a contribution of methane-derived CO2 from Methanoperedens archaea is supported by the light δ13C-DIC signature (Fig. 1).
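How much methane-derived carbon a given DIC isotope value implies depends on what it is mixed with. The sketch below applies a simple two-end-member isotope mass balance to the reported values (-42.9 ‰ for methane, -28.0 ‰ for DIC). It is illustrative only: the background DIC values are hypothetical, isotope fractionation during AOM is ignored, and other 13C-depleted sources such as oxidized organic carbon are not represented, so the output should not be read as an estimate for this site.

```python
def methane_derived_fraction(delta_dic, delta_ch4, delta_background):
    """Two-end-member mass balance (all values in permil):

    delta_dic = f * delta_ch4 + (1 - f) * delta_background
    """
    if delta_ch4 == delta_background:
        raise ValueError("end members must differ")
    return (delta_dic - delta_background) / (delta_ch4 - delta_background)


DELTA_DIC = -28.0  # permil VPDB, reported in the text
DELTA_CH4 = -42.9  # permil VPDB, reported in the text

# HYPOTHETICAL background DIC values; not taken from the study.
for delta_bg in (-15.0, -20.0, -25.0):
    f = methane_derived_fraction(DELTA_DIC, DELTA_CH4, delta_bg)
    print(f"background {delta_bg:+.1f} permil -> fraction {f:.2f}")
```

The calculated fraction is highly sensitive to the assumed background value, which is why the isotope data are treated here as supporting evidence alongside the proteomic detection of the AOM pathway rather than as a quantification.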

Methanoperedens had the metabolic potential for extracellular electron transfer, which could be either to an external electron acceptor or to a syntrophic partner. The potential for metal-dependent AOM was predicted by the presence of gene clusters encoding a NrfD-like transmembrane protein, a 4Fe-4S ferredoxin iron-sulfur protein, and multiheme cytochromes (MHC) (Dataset S5), hypothesized to mediate electron transfer from the cytoplasm to the periplasm in Methanoperedenaceae during AOM coupled to Fe(III) and Mn(IV) oxides (29, 32). The more abundant Methanoperedens MAG also contained two MHC annotated as OmcX, an outer membrane protein necessary for electron transport out of the cell and growth on extracellular electron acceptors in Geobacter species (46). One copy of OmcX was located near an MHC with 28 heme-binding motifs predicted to be located extracellularly. Extracellular MHC could facilitate the transfer of electrons to metal oxides or to a syntrophic partner (47, 48). Archaeal flagellins have also been implicated in extracellular electron transfer by ANME archaea (49). Peptides from the major subunit of archaeal flagellin (FlaB) were detected in the metaproteome and may facilitate extracellular electron transfer to a syntrophic partner. Nitrate-dependent AOM was ruled out in this groundwater as nitrate reductase genes from Methanoperedens were absent in the two MAGs (Dataset S3). There was also no evidence of genes for the reduction of arsenate, selenate or elemental sulfur, found in some organisms from the Methanoperedenaceae family (29).

Methanoperedens encoded a respiratory NiFe group 1a hydrogenase, and peptides from both the large and small subunit were detected in the metaproteome (Fig. 4). The gene arrangement was the same as a Methanoperedenaceae MAG recovered from groundwater in Japan (26), where the NiFe 1a catalytic subunit is adjacent to a b-type cytochrome and a hydrogenase maturation protease. In addition to the group 1 respiratory hydrogenase, Methanoperedens encoded a NiFe group 3b hydrogenase (Dataset S5), which is predicted to oxidize NADPH and evolve hydrogen (50). Some complexes have also been shown to have sulfhydrogenase activity, whereby the HydDA subunits encode a hydrogenase and the HydBG subunits encode the sulfur reductase component (51, 52).


Candidate phyla radiation (CPR) bacteria and DPANN archaea contribute to carbon and sulfur transformations

Organisms from the superphyla Patescibacteria (also known as the CPR) and DPANN, which are known to have ultra-small cell sizes (53, 54), dominated the 0.1 µm size fraction (Fig. 2). Accordingly, genomes recovered from CPR and DPANN organisms ranged from 0.6 to 1.7 Mbp (Dataset S2). Limited metabolic capabilities were evident (Fig. 5), consistent with the predicted symbiotic lifestyle of these lineages with other, larger microorganisms (55–57). In total, 87 proteins were detected in the metaproteome from CPR and DPANN organisms (Dataset S6). Most (65/87) were hypothetical or uncharacterized proteins, so putative function was predicted based on conserved domains in the protein sequence and/or sequence similarity (Dataset S6). The most abundant CPR proteins shared sequence similarity to protein sequences from other CPR organisms annotated as cell wall surface anchor family proteins. These abundant but uncharacterized proteins may therefore reflect the putatively episymbiotic lifestyle of CPR organisms and facilitate attachment to the host. Other abundant proteins had peptidoglycan-binding protein domains, which are found in enzymes involved in bacterial cell wall degradation and phage endolysins (58). Type 4 pilus subunits (PilA and PilO) were also detected, which can enable attachment to surfaces, as well as potential host cells (59). Proteins recovered from CPR and DPANN organisms were also involved in cell core machinery and genetic information processing, such as those encoding ribosomal proteins, RNA-binding proteins, molecular chaperones, and elongation factors (Dataset S6).

Candidatus Shapirobacterales BG1-G2, a member of the CPR supergroup Microgenomates, was one of the more abundant CPR detected, representing 1.8% of the microbial community (Dataset S2). Cand. Shapirobacterales expressed proteins with Type I and II cohesin domains in addition to a Type III dockerin repeat domain (Dataset S6), required for the formation of a cellulosome (a cellulose-degrading complex), indicating that this organism is actively involved in the metabolism of organic carbon. Organic carbon metabolism by this organism is further supported by the presence of glycoside hydrolases and polysaccharide lyases in the Cand. Shapirobacterales genome (Dataset S7). Cand. Shapirobacterales encoded lactate dehydrogenase (Fig. 5), indicating organic carbon could be fermented to lactate, as reported for other members of the CPR (60).

Other CPR and DPANN organisms also had the metabolic potential to supply fermentation products to their host cells. Fermentative metabolism (pyruvate oxidation and acetogenesis) was most common among the GWE2-39-37 family of Patescibacteria and was also detected in DPANN genomes (Fig. 5). DPANN archaea had genes for carbon fixation, including RuBisCo Form III (Fig. 5), which is predicted to function in the adenosine monophosphate pathway allowing DPANN to derive energy from ribose produced by other community members (61). Patescibacteria and DPANN organisms encoded sulfate adenylyltransferase (sat), dissimilatory sulfite reductase delta subunit (dsrD), phosphoadenosine phosphosulfate reductase (cysH), cysteine synthase (cysK), and S-sulfo-L-cysteine synthase (cysM), suggesting a potential role in sulfur cycling and the metabolism of sulfur amino acids (cysteine and methionine) as well as other organic sulfur compounds (Fig. 5). The metabolic capacity of CPR and DPANN symbionts has been shown to vary according to the metabolism of their host organisms (62). Thus, the potential for sulfur transformations by some CPR and DPANN could suggest a Desulfobacterota or Nitrospirota host in this groundwater.

Discussion

Metaproteogenomic data reveal active sulfur and methane cycling microorganisms in deep groundwater in a terrestrial SMTZ. Canonical sulfate reducers coupling the oxidation of hydrogen to the reduction of sulfate are active, but metaproteomic data indicate that the most abundant Desulfobacterota in the groundwater gain energy from the disproportionation of inorganic sulfur (elemental sulfur or polysulfide). These sulfur compounds could be supplied by abiotic rock interactions (Fig. 6). For instance, exposure of iron silicate minerals from Olkiluoto bedrock to sulfide has been shown to result in the production of elemental sulfur and Fe2+ (which precipitates with excess sulfide), due to the abiotic reduction of Fe(III) in the major iron-bearing bedrock minerals (biotite, garnet and chlorite) (63). Inorganic sulfur could also be supplied through a syntrophic relationship with Methanoperedens in the form of diffusible sulfur species (Fig. 6), as proposed for other ANME archaea (64). In that case, both methane oxidation and sulfate reduction to zerovalent sulfur are carried out by Methanoperedens. The produced zerovalent sulfur (elemental sulfur and polysulfide) then provides both electron donor and acceptor to SDB, producing sulfate and sulfide (Fig. 6). Close physical association would not be required for syntrophy via diffusible sulfur species, suggesting that the bacterial partner may be non-specific. Alternatively, syntrophy between Methanoperedens and Desulfobacterota may be through extracellular electron transfer to SRB. Three MAGs belonging to the lineage ETH-SRB1 were recovered (Fig. 3), and sulfate reducers from this family have been suggested to form a syntrophic relationship with ethane-oxidizing archaea also from ANME-2d (65). ETH-SRB1 expressed enzymes for sulfate reduction but no enzymes for oxidation of an electron donor were detected in the metaproteome. Thus, ETH-SRB1 could be possible partners for Methanoperedens in this groundwater (Fig. 6).
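For reference, the diffusible-sulfur scenario can be written as a pair of balanced reactions. The stoichiometry below is a sketch that assumes disulfide (HS2-) as the zerovalent sulfur intermediate, as has been proposed for marine ANME consortia; the text does not establish which sulfur species would be exchanged in Olkiluoto groundwater.

```latex
\begin{align*}
\text{AOM to zerovalent sulfur:}\quad
  & 7\,\mathrm{CH_4} + 8\,\mathrm{SO_4^{2-}} + 5\,\mathrm{H^+}
    \rightarrow 4\,\mathrm{HS_2^-} + 7\,\mathrm{HCO_3^-} + 11\,\mathrm{H_2O} \\
\text{Disulfide disproportionation:}\quad
  & 4\,\mathrm{HS_2^-} + 4\,\mathrm{H_2O}
    \rightarrow \mathrm{SO_4^{2-}} + 7\,\mathrm{HS^-} + 5\,\mathrm{H^+} \\
\text{Net:}\quad
  & \mathrm{CH_4} + \mathrm{SO_4^{2-}}
    \rightarrow \mathrm{HCO_3^-} + \mathrm{HS^-} + \mathrm{H_2O}
\end{align*}
```

Summing the first two reactions and dividing by seven gives the net reaction, so the overall 1:1 methane-to-sulfate stoichiometry is the same whether the two steps occur in one organism or are split between an archaeon and a disproportionating bacterium.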

The genomic potential to transfer electrons extracellularly suggests that metal-dependent AOM could represent another metabolic strategy for Methanoperedens in Olkiluoto groundwater. Iron-bearing minerals in the bedrock have been shown to support culturable metal-reducing bacteria from Olkiluoto (63), and other metal-reducing bacteria have been shown to be active in situ (7).

Methanoperedens expressed a respiratory hydrogenase, suggesting active hydrogen oxidation. The potential for hydrogen oxidation by Methanoperedenaceae has been detected in other MAGs recovered from the deep terrestrial subsurface (26, 27), but this metabolic trait is not common amongst all genome-sequenced Methanoperedenaceae (29). The presence of a respiratory hydrogenase in some Methanoperedenaceae led to the suggestion that this metabolism may be advantageous in environments with temporally variable availability of methane (29). In this groundwater, however, Methanoperedens expresses both the core AOM pathway and the respiratory hydrogenase when both methane and hydrogen are available. This co-expression raises the possibility that this Methanoperedens performs hydrogenotrophic methanogenesis, although the organism may simply gain energy from both methane and hydrogen. To our knowledge, Methanoperedenaceae have not been shown to produce methane; however, it has recently been suggested that ANME-1 archaea could alternate between AOM and methanogenesis, based on their presence in methane-producing sediments where hydrogen accumulates sufficiently to make hydrogenotrophic methanogenesis exergonic (66). A small shift in $\delta^{13}\mathrm{C}_{\mathrm{CH_4}}$ toward lighter values was observed over the 12-month sampling period (Fig. 1), which indicates an increased proportion of $^{12}\mathrm{C}$, associated with a biogenic methane contribution. The only organisms in the metagenome containing McrA, and therefore the capacity to produce methane, were Methanoperedens. This methane isotope trend could, however, also be the result of long-term pumping of the fracture, which also resulted in a shift in the groundwater chemistry over time (Dataset S1). Furthermore, the light isotope ratio of dissolved inorganic carbon supports the contribution of methane-derived carbon. In ANME-1 archaea that are predicted to produce methane, it is not known whether one cell alternates between AOM and methanogenesis, or if different subclades selectively perform AOM or methanogenesis (66). In this groundwater, two populations of Methanoperedens were identified, which could enable alternate roles in methane cycling.
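The energetic argument about hydrogen accumulation can be made explicit. As a general thermodynamic sketch (textbook relations, not a calculation based on this study's data), hydrogenotrophic methanogenesis and its Gibbs energy under in situ conditions can be written as:

```latex
4\,\mathrm{H_2} + \mathrm{CO_2} \rightarrow \mathrm{CH_4} + 2\,\mathrm{H_2O}

\Delta G = \Delta G^{\circ} + RT \ln\!\left( \frac{a_{\mathrm{CH_4}}}{a_{\mathrm{H_2}}^{4}\; a_{\mathrm{CO_2}}} \right)
```

Here $a_i$ denotes the activity of species $i$, and the activity of water is assumed to be 1. Because the hydrogen activity enters to the fourth power, $\Delta G$ is strongly sensitive to hydrogen: the reaction is exergonic ($\Delta G < 0$) only when hydrogen accumulates sufficiently, and the sign of $\Delta G$ reverses at low hydrogen activity.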

Proteomic data supported episymbiosis of CPR bacteria with a host cell, as observed in cocultures of other CPR organisms with their respective hosts (62, 67, 68). Genomic data suggest a role for some CPR bacteria and DPANN archaea in sulfur cycling, possibly reflecting the metabolism of host cells (37, 62). In this groundwater, sulfur-cycling organisms belong to Desulfobacterota and Nitrospirota, but it is not yet known if DPANN archaea can have bacterial hosts (69). Proteins required for the formation of a cellulosome detected in Cand. Shapirobacterales indicate that CPR bacteria also contribute to carbon cycling in this groundwater. Cellulosomes have been detected in Cand. Roizmanbacterium (70), a CPR organism also from the Microgenomates supergroup. Cand. Roizmanbacterium was similarly recovered from groundwater and encoded lactate dehydrogenase, suggesting a common role for groundwater Microgenomates in the production of labile organic carbon.

Our results demonstrate that methane and sulfur fuel anaerobic microbial metabolism in fracture water from the deep terrestrial subsurface. Sulfate and methane are abundant in many deep crystalline bedrock environments (19, 25, 71), which represent a significant fraction (~20%) of the Earth's surface (72). Syntrophy between methane- and sulfur-cycling organisms therefore has the potential to be widespread in SMTZs throughout the terrestrial biosphere, though the quantitative significance of the process remains to be determined. Carbon isotope values of DIC and fracture calcites at Olkiluoto indicate that methane oxidation is limited, with only one other sample showing evidence of a contribution from AOM (19). Thus, active methane oxidation may be a phenomenon that occurs locally in distinct locations throughout terrestrial SMTZs. The results further demonstrate the importance of activity measurements such as metaproteomics in genome-based studies that aim to uncover the roles of microorganisms in biogeochemical cycling. From genome-based potential alone, our data would suggest that hydrogenotrophic sulfate reduction, which is common in other sulfidic Olkiluoto groundwaters (11), is the prominent metabolism, with a limited contribution from ANME archaea, which were present at relatively low abundance in both the 16S rRNA gene amplicon and metagenomic datasets. With insight from metaproteomics, we propose that both anaerobic methane oxidation and sulfur disproportionation are important metabolisms in this deep oligotrophic groundwater.

Materials and Methods

Sample site and chemistry

Groundwater was collected from drillhole OL-KR13 on the island of Olkiluoto (61°14'30.7''N, 21°28'49.3''E) from a fracture at 330.52–337.94 meters below sea level at ten specific time points over 12 months. The bedrock at Olkiluoto is predominantly granite and high-grade metamorphic rock (19). Hydrological models have shown that drillhole OL-KR13 intersects a fracture zone that connects to ONKALO, the underground nuclear waste repository that is currently under construction and that extends to ~455 m depth (73). This connection can result in slow groundwater flow caused by drawdown towards open tunnels via the fracture zone.

Methods of groundwater chemical analysis have been described in detail previously (11). Briefly, anions (sulfate, thiosulfate, nitrate, nitrite) were measured by ion chromatography using a Dionex Integrion HPIC system with an IonPac AS11-HC analytical column. Total and dissolved organic carbon were measured on a Shimadzu TOC-V. Organic acids (acetate, lactate, propionate, butyrate) and glucose were measured with an Agilent 1290 Infinity Liquid Chromatography System fitted with an Agilent Hi-Plex H column with RI detection. Alcohols (ethanol, methanol, propanol, and 2-butanol) and acetone were measured on a Varian CP-3800 gas chromatograph with 1-butanol as an internal standard. Dissolved gases (CH4 and H2) were measured on a Varian 450-GC with flame ionization (FID) and thermionic specific (TSD) detectors. The $^{13}\mathrm{C}/^{12}\mathrm{C}$ ratio of methane ($\delta^{13}\mathrm{C}_{\mathrm{CH_4}}$, ‰ VPDB) was measured with a Picarro G2201-i Analyser. The $^{13}\mathrm{C}/^{12}\mathrm{C}$ ratio of dissolved inorganic carbon (DIC) ($\delta^{13}\mathrm{C}_{\mathrm{DIC}}$, ‰ VPDB) and the $^{34}\mathrm{S}/^{32}\mathrm{S}$ isotope ratio of sulfate ($\delta^{34}\mathrm{S}_{\mathrm{SO_4}}$, ‰ VCDT) were measured at the Stable Isotope Laboratory, Faculty of Geosciences and Environment, Université de Lausanne, Switzerland. The C isotope composition was measured with a Thermo Finnigan Delta Plus XL IRMS equipped with a GasBench II for analyses of carbonates. The S isotope composition was measured with a He carrier gas and a Carlo Erba (CE 1100) elemental analyzer linked to a Thermo Fisher Delta mass spectrometer. Additional chemical measurements were conducted by Teollisuuden Voima Oyj (TVO), Finland, and methods are described in (74).
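For readers less familiar with delta notation, the isotope values reported here follow the conventional definition relative to an international standard. The expressions below are that standard definition, given as a reminder rather than as a description of the instrument software; values are expressed in per mil (‰):

```latex
\delta^{13}\mathrm{C} = \left( \frac{R_{\text{sample}}}{R_{\text{VPDB}}} - 1 \right) \times 1000,
\qquad R = \frac{^{13}\mathrm{C}}{^{12}\mathrm{C}}

\delta^{34}\mathrm{S} = \left( \frac{R_{\text{sample}}}{R_{\text{VCDT}}} - 1 \right) \times 1000,
\qquad R = \frac{^{34}\mathrm{S}}{^{32}\mathrm{S}}
```

A more negative ("lighter") delta value therefore corresponds to a smaller proportion of the heavy isotope in the sample relative to the standard.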

Microbial biomass collection, DNA and protein extraction

To collect biomass for DNA and protein extraction, groundwater was pumped directly into a sterile chilled Nalgene filtration unit fitted with a 0.22 µm pore size Isopore polycarbonate membrane (Merck Millipore, Darmstadt, Germany). Approximately 10 L of groundwater was filtered for each metagenomic and metaproteomic sample, and 1 L was filtered when samples were taken for 16S rRNA gene amplicon analysis only. Ultra-small cells in the 0.22 µm filtrate were collected by subsequent filtration through a 0.1 µm pore size polycarbonate filter. Filters collected for DNA were preserved in 750 µL LifeGuard Soil Preservation Solution (MoBio, Carlsbad, CA, United States) and stored at -20°C. Filters collected for protein were flash-frozen in a dry ice and ethanol mixture and stored at -80°C. DNA was extracted using a modified phenol-chloroform method (11, 75) and quantified with the dsDNA High Sensitivity Assay kit (Thermo Fisher Scientific) on a Qubit 3.0 Fluorometer. Protein was extracted at the Oak Ridge National Laboratory (Oak Ridge, TN, United States) using previously described methods (11, 76).

DNA sequencing

Extracted DNA was used as a template for PCR amplification using primers 515F/806R that target the V4 region of the 16S rRNA gene (77). Amplicon libraries were sequenced on an Illumina MiSeq at either RTL Genomics, Lubbock, TX, USA, or the Lausanne Genomics Technologies Facility at Université de Lausanne (UNIL), Lausanne, Switzerland, with a 2 × 250 bp read configuration. Raw sequence reads were merged and quality-filtered with USEARCH v11 (78). Zero-radius operational taxonomic units (ZOTUs) were generated with UNOISE3 (79). Taxonomy was predicted with SINTAX (80) using a USEARCH-compatible (81) SILVA 138 release (82). The resulting ZOTU table was analyzed and visualized with the R package ampvis2 (83).

Seven metagenomes were sequenced at three facilities: the Joint Genome Institute (JGI), Walnut Creek, CA, USA; the Marine Biological Laboratory (MBL), Woods Hole, MA, USA; and RTL Genomics, Lubbock, TX, USA. DNA was sequenced on an Illumina HiSeq 2500 (RTL and JGI) or an Illumina NextSeq (MBL) with paired-end 2 × 150 bp reads; further details are given in (11). All samples were pre-processed following the same procedure: low-quality bases and adaptors were removed with Cutadapt (84), and artificial PCR duplicates were screened using FastUniq (85) with default parameters.

Metagenomic assembly, genome binning and single amplified genomes

After preprocessing, samples were individually assembled using MEGAHIT (86) with the meta-sensitive preset (--presets meta-sensitive). Assembled contigs were binned using two methods: (1) assembled contigs were quantified across all samples using Kallisto (87) and binned using CONCOCT (88); (2) assembled contigs were quantified across all samples using BBMap (89) and binned using MetaBAT 2 (90). A non-redundant set of optimized bins was then selected with DAS Tool (91), with the default score threshold of 0.5. This resulted in a total of 238 bins from seven metagenomic datasets (June, July, September (×3), November (×2)). Taxonomy was assigned to bins with GTDB-Tk v1.3.0 using reference data version r95 (92). The 238 bins were dereplicated with dRep (93) with parameters: minimum completeness 70%, maximum contamination 10%, primary cluster ANI 90%, secondary cluster ANI 99% (strain level). Bins belonging to Patescibacteria (50/238) were checked for completeness and contamination with the Candidate Phyla Radiation (CPR; Patescibacteria) custom 43-gene marker set in CheckM (94) and clustered into groups (ANI 99%) with FastANI (95). In cases where the completeness of Patescibacteria bins improved past the threshold used for dereplication (70%), the bin with the greatest completeness and least contamination from each cluster was added to the collection of dereplicated bins from dRep. This resulted in a total of 76 dereplicated genome bins, hereafter referred to as metagenome-assembled genomes (MAGs).
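The rule used to rescue Patescibacteria bins (one representative per ANI cluster, chosen by completeness and contamination) can be illustrated with a minimal Python sketch. This is not the authors' script: the dictionary keys, the function name, the use of ">=" at the 70% threshold, and the tie-breaking order (completeness first, then contamination) are assumptions made for illustration.

```python
from collections import defaultdict


def pick_cluster_representatives(bins, min_completeness=70.0):
    """Return the name of one representative bin per ANI cluster.

    `bins` is an iterable of dicts with the hypothetical keys
    'name', 'cluster', 'completeness' and 'contamination' (percentages).
    """
    clusters = defaultdict(list)
    for b in bins:
        # Keep only bins that reach the dereplication completeness threshold.
        if b["completeness"] >= min_completeness:
            clusters[b["cluster"]].append(b)

    representatives = {}
    for cluster_id, members in clusters.items():
        # Highest completeness wins; ties are broken by lowest contamination.
        best = max(members, key=lambda b: (b["completeness"], -b["contamination"]))
        representatives[cluster_id] = best["name"]
    return representatives


# Made-up example values, for illustration only.
example_bins = [
    {"name": "bin_A", "cluster": 1, "completeness": 82.0, "contamination": 3.1},
    {"name": "bin_B", "cluster": 1, "completeness": 82.0, "contamination": 1.2},
    {"name": "bin_C", "cluster": 2, "completeness": 65.0, "contamination": 0.5},
    {"name": "bin_D", "cluster": 3, "completeness": 74.5, "contamination": 4.0},
]
print(pick_cluster_representatives(example_bins))
```

In this toy input, cluster 2 contributes no representative because its only bin falls below the completeness threshold.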

Single amplified genomes (SAGs) were sequenced on an Illumina NextSeq. Raw reads were quality-filtered using BBTools (89) and assembled with SPAdes v3.13.0 (96). Contig ends were trimmed, and contigs were discarded if their length was <2 kb or their read coverage was less than 2. SAGs were clustered into groups (ANI 99%) with FastANI (95), and completeness and contamination were estimated with CheckM (94). Eight dereplicated SAGs were retained, resulting in a total of 84 genomes (MAGs and SAGs) in our dataset. A phylogenomic tree was constructed from the 84 genomes from this study with GToTree (97), using Hidden Markov Models (HMMs) for 16 universal single-copy genes (98). GTDB representative species from the same families as the groundwater MAGs and SAGs were recovered using gtt-get-accessions-from-GTDB and included in the tree. The tree was visualized with iTOL (99).
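The SAG contig filter described above (discard contigs shorter than 2 kb or with read coverage below 2) amounts to a simple predicate. The sketch below assumes that contig lengths and coverages have already been computed and are supplied as tuples; the function name and input layout are hypothetical and do not reflect the pipeline actually used.

```python
def filter_contigs(contigs, min_length=2000, min_coverage=2.0):
    """Keep contigs that are at least `min_length` bp long and have
    read coverage of at least `min_coverage`.

    `contigs` is an iterable of (name, length_bp, coverage) tuples.
    """
    return [
        (name, length_bp, coverage)
        for name, length_bp, coverage in contigs
        if length_bp >= min_length and coverage >= min_coverage
    ]


# Made-up example values, for illustration only.
example_contigs = [
    ("contig_1", 15000, 12.4),  # kept
    ("contig_2", 1500, 30.0),   # too short
    ("contig_3", 5000, 1.3),    # coverage too low
    ("contig_4", 2000, 2.0),    # exactly at both thresholds, kept
]
kept = filter_contigs(example_contigs)
print([name for name, _, _ in kept])
```

A contig is discarded if it fails either criterion, which is why the retained set is defined by the conjunction of the two minimum thresholds.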

Metabolic predictions

Protein-coding genes were predicted with Prodigal (100). METABOLIC v4.0 (101) was used to search amino acid sequences against a curated set of KOfam (102), TIGRfam (103), Pfam (104) and sulfur-related HMM profiles (37) corresponding to key marker genes for biogeochemical cycling. Genes putatively involved in sulfur disproportionation (the YTD gene cluster, consisting of a yedE-like gene, tusA, a dsrE-like gene, and two conserved hypothetical proteins (39)) were identified with custom HMMs (https://github.com/emma-bell/metabolism). For the generation of custom HMM profiles, reference gene sequences identified in (39) were aligned using MUSCLE (105) with default parameters. HMMs were built with hmmbuild in HMMER v3.3.2 (106). The tusA HMM was checked for the conserved CPxP residues that stabilize the first helix (107). HMM profiles were searched against genomes using the search-custom-markers workflow in metabolisHMM (108). Complexes putatively involved in metal oxide reduction by Methanoperedenaceae (32) were identified by searching for CXXCH heme-binding motifs. Cellular location was predicted with PSORTb v3.0 (109) and transmembrane helices were identified with TMHMM v2.0 (110). The putative function of CPR and DPANN protein sequences was predicted by searching for conserved domains within protein-coding sequences using CD-Search (111) and by performing similarity searches using blastp against the non-redundant protein sequences (nr) database (112).
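The search for CXXCH heme-binding motifs mentioned above can be sketched with a regular expression. The following Python example is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the `min_motifs` cut-off for flagging a candidate multiheme cytochrome is an arbitrary value chosen for the example (the text does not specify one).

```python
import re

# CXXCH: Cys, any two residues, Cys, His. The lookahead makes the match
# zero-width so that overlapping motifs are each counted.
HEME_MOTIF = re.compile(r"(?=C[A-Z]{2}CH)")


def count_heme_motifs(protein_seq):
    """Count CXXCH motifs in a one-letter amino acid sequence."""
    return sum(1 for _ in HEME_MOTIF.finditer(protein_seq.upper()))


def multiheme_candidates(proteins, min_motifs=2):
    """Return {protein_id: motif_count} for proteins with at least
    `min_motifs` CXXCH motifs. `proteins` maps IDs to sequences."""
    hits = {}
    for protein_id, seq in proteins.items():
        n = count_heme_motifs(seq)
        if n >= min_motifs:
            hits[protein_id] = n
    return hits


# Made-up sequences, for illustration only.
example_proteins = {
    "protein_1": "AACAACHKKCGGCHL",  # two motifs: CAACH and CGGCH
    "protein_2": "MKTAYIAKQR",       # no motif
}
print(multiheme_candidates(example_proteins))
```

Motif counts alone do not establish function, which is why the text combines them with predictions of cellular location and transmembrane helices.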

Metaproteomics

Digested peptides were analyzed by online two-dimensional liquid chromatography tandem mass spectrometry on an Orbitrap Elite mass spectrometer (Thermo Fisher Scientific). Mass spectra were acquired in data-dependent mode, and the ten most abundant parent ions were selected for further fragmentation by collision-induced dissociation at 35% energy level. Three technical replicates were analyzed per sample.

A protein database was constructed from predicted protein sequences from the seven metagenomic datasets. All collected MS/MS spectra were searched against the protein database with MyriMatch v2.2 (113). Peptides were identified and assembled into proteins using IDPicker v3.1 (114) with a minimum of two distinct peptides per protein and a false discovery rate (FDR) of <1% at the peptide level. Proteins were further clustered into protein groups after database searching if all proteins shared the same set of identified peptides.
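The two protein-level rules stated above (at least two distinct peptides per protein, and grouping of proteins that share an identical set of identified peptides) can be sketched as follows. This is a simplified illustration, not the IDPicker implementation; the function name and the input mapping are hypothetical, and FDR filtering is assumed to have been applied upstream.

```python
from collections import defaultdict


def group_proteins(protein_peptides, min_distinct_peptides=2):
    """Group proteins that are supported by exactly the same peptide set.

    `protein_peptides` maps a protein ID to an iterable of identified
    peptide sequences. Proteins with fewer than `min_distinct_peptides`
    distinct peptides are dropped. Returns a sorted list of sorted groups.
    """
    groups = defaultdict(list)
    for protein_id, peptides in protein_peptides.items():
        peptide_set = frozenset(peptides)
        if len(peptide_set) < min_distinct_peptides:
            continue  # insufficient peptide evidence
        groups[peptide_set].append(protein_id)
    return sorted(sorted(members) for members in groups.values())


# Made-up identifications, for illustration only.
example_ids = {
    "prot1": ["PEPTIDEA", "PEPTIDEB"],
    "prot2": ["PEPTIDEB", "PEPTIDEA"],  # same set as prot1, so same group
    "prot3": ["PEPTIDEA", "PEPTIDEC"],
    "prot4": ["PEPTIDED", "PEPTIDED"],  # only one distinct peptide, dropped
}
print(group_proteins(example_ids))
```

Grouping by identical peptide sets avoids counting indistinguishable proteins, such as near-identical homologs from different metagenomes, as separate identifications.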

Data Availability

Supporting datasets are provided in the Supplementary Information (Datasets S1–S7). Sequence data are deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). 16S rRNA gene amplicon data are deposited under the NCBI BioProject PRJNA472445. BioSample accessions for MAGs and SAGs are provided in Dataset S2.

Acknowledgments

We thank Posiva Oy for providing the funding and the logistical support to carry out the work, Maarit Yli-Kaila and Raila Viitala for support and assistance conducting fieldwork, Louise Balmer and Guillaume Sommer for assistance collecting samples, and Petteri Pitkänen for valuable discussions. Support for metagenomic sequencing was provided by the Census of Deep Life within the Deep Carbon Observatory and the U.S. Department of Energy Joint Genome Institute. The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported under Contract No. DE-AC02-05CH11231. The metagenomic computations were performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX).

References

1. C. Magnabosco, et al., The biomass and biodiversity of the continental subsurface. Nat. Geosci. 11, 707–717 (2018).

2. T. M. Hoehler, B. B. Jørgensen, Microbial life under extreme energy limitation. Nat. Rev. Microbiol. 11, 83–94 (2013).

3. S. Braun, et al., Microbial turnover times in the deep seabed studied by amino acid racemization modelling. Sci. Rep. 7, 5680 (2017).

4. B. B. Jørgensen, Deep subseafloor microbial cells on physiological standby. Proc. Natl. Acad. Sci. 108, 18193–18194 (2011).

5. E. Trembath-Reichert, et al., Methyl-compound use and slow growth characterize microbial life in 2-km-deep subseafloor coal and shale beds. Proc. Natl. Acad. Sci. 114, E9206–E9215 (2017).

6. M. C. Y. Lau, et al., An oligotrophic deep-subsurface community dependent on syntrophy is dominated by sulfur-driven autotrophic denitrifiers. Proc. Natl. Acad. Sci. 113, E7927–E7936 (2016).

7. E. Bell, et al., Active sulfur cycling in the terrestrial deep subsurface. ISME J. (2020) https://doi.org/10.1038/s41396-020-0602-x.

8. M. Lopez-Fernandez, et al., Metatranscriptomes Reveal That All Three Domains of Life Are Active but Are Dominated by Bacteria in the Fennoscandian Crystalline Granitic Continental Deep Biosphere. mBio 9, e01792-18 (2018).

9. M. Mehrshad, et al., Energy efficiency and biological interactions define the core microbiome of deep oligotrophic groundwater. bioRxiv, 2020.05.24.111179 (2020).

10. P. Rajala, et al., Rapid Reactivation of Deep Subsurface Microbes in the Presence of C-1 Compounds. Microorganisms 3, 17–33 (2015).

11. E. Bell, et al., Biogeochemical cycling by a low-diversity microbial community in deep groundwater. Front. Microbiol. 9, 2129 (2018).

12. A. Vigneron, et al., Succession in the petroleum reservoir microbiome through an oil field production lifecycle. ISME J. 11, 2141–2154 (2017).


13. R. A. Daly, et al., Microbial metabolisms in a 2.5-km-deep ecosystem created by hydraulic fracturing in shales. Nat. Microbiol. 1, 16146 (2016).

14. S. Y. Park, Y. Liang, Biogenic methane production from coal: A review on recent research and development on microbially enhanced coalbed methane (MECBM). Fuel 166, 258–267 (2016).

15. A. Bagnoud, et al., Reconstructing a hydrogen-driven microbial metabolic network in Opalinus Clay rock. Nat. Commun. 7, 12770 (2016).

16. C. Hubert, “Microbial Ecology of Oil Reservoir Souring and its Control by Nitrate Injection” in Handbook of Hydrocarbon and Lipid Microbiology, (Springer Berlin Heidelberg, 2010), pp. 2753–2766.

17. S. Lahme, J. Mand, J. Longwell, R. Smith, D. Enning, Severe Corrosion of Carbon Steel in Oil Field Produced Water Can Be Linked to Methanogenic Archaea Containing a Special Type of [NiFe] Hydrogenase. Appl. Environ. Microbiol. 87, e01819-20 (2021).

18. L. M. Gieg, T. R. Jack, J. M. Foght, Biological souring and mitigation in oil reservoirs. Appl. Microbiol. Biotechnol. 92, 263–282 (2011).

19. Posiva, Olkiluoto Site description, Posiva report 2011-02 (2013).

20. K. Knittel, A. Boetius, Anaerobic Oxidation of Methane: Progress with an Unknown Process. Annu. Rev. Microbiol. 63, 311–334 (2009).

21. M. Egger, N. Riedinger, J. M. Mogollón, B. B. Jørgensen, Global diffusive fluxes of methane in marine sediments. Nat. Geosci. 11, 421–425 (2018).

22. A. J. Wallenius, P. Dalcin Martins, C. P. Slomp, M. S. M. Jetten, Anthropogenic and Environmental Constraints on the Microbial Methane Cycle in Coastal Sediments. Front. Microbiol. 12, 293 (2021).

23. M. Bomberg, M. Nyyssönen, P. Pitkänen, A. Lehtinen, M. Itävaara, Active Microbial Communities Inhabit Sulphate-Methane Interphase in Deep Bedrock Fracture Fluids in Olkiluoto, Finland. Biomed Res. Int. 2015, 1–17 (2015).

24. M. Bomberg, T. Lamminmäki, M. Itävaara, Microbial communities and their predicted metabolic characteristics in deep fracture groundwaters of the crystalline bedrock at Olkiluoto, Finland. Biogeosciences 13 (2016).

25. K. Ino, et al., Deep microbial life in high-quality granitic groundwater from geochemically and geographically distinct underground boreholes. Environ. Microbiol. Rep. 8 (2016).

26. K. Ino, et al., Ecological and genomic profiling of anaerobic methane-oxidizing archaea in a deep granitic environment. ISME J. 12, 31–47 (2018).

27. A. W. Hernsdorf, et al., Potential for microbial H2 and metal transformations associated with novel bacteria and archaea in deep terrestrial subsurface sediments. ISME J. 11, 1915–1929 (2017).

28. H. Drake, et al., Extreme 13C depletion of carbonates formed during oxidation of biogenic methane in fractured granite. Nat. Commun. 6, 7020 (2015).

29. A. O. Leu, et al., Lateral Gene Transfer Drives Metabolic Flexibility in the Anaerobic Methane-Oxidizing Archaeal Family Methanoperedenaceae. mBio 11, e01325-20 (2020).


30. K. F. Ettwig, et al., Archaea catalyze iron-dependent anaerobic oxidation of methane. Proc. Natl. Acad. Sci. 113, 12792–12796 (2016).

31. M. F. Haroon, et al., Anaerobic oxidation of methane coupled to nitrate reduction in a novel archaeal lineage. Nature 500, 567–570 (2013).

32. A. O. Leu, et al., Anaerobic methane oxidation coupled to manganese reduction by members of the Methanoperedenaceae. ISME J. (2020) https://doi.org/10.1038/s41396-020-0590-x.

33. H. S. Weber, K. S. Habicht, B. Thamdrup, Anaerobic methanotrophic archaea of the ANME-2d cluster are active in a low-sulfate, iron-rich freshwater sediment. Front. Microbiol. 8, 619 (2017).

34. G. Su, et al., Manganese/iron-supported sulfate-dependent anaerobic oxidation of methane by archaea in lake sediments. Limnol. Oceanogr. 65, 863–875 (2020).

35. K. Pedersen, Metabolic activity of subterranean microbial communities in deep granitic groundwater supplemented with methane and H2. ISME J. 7, 839–849 (2013).

36. C. Dahl, C. Friedrich, A. Kletzin, “Sulfur Oxidation in Prokaryotes” in Encyclopedia of Life Sciences, (John Wiley & Sons, Ltd, 2008) https://doi.org/10.1002/9780470015902.a0021155 (January 21, 2019).

37. K. Anantharaman, et al., Expanded diversity of microbial groups that shape the dissimilatory sulfur cycle. ISME J. 12, 1715–1728 (2018).

38. M. Motamedi, K. Pedersen, Note: Desulfovibrio aespoeensis sp. nov., a mesophilic sulfate-reducing bacterium from deep groundwater at Äspö hard rock laboratory, Sweden. Int. J. Syst. Bacteriol. 48, 311–315 (1998).

39. K. Umezawa, H. Kojima, Y. Kato, M. Fukui, Disproportionation of inorganic sulfur compounds by a novel autotrophic bacterium belonging to Nitrospirota. Syst. Appl. Microbiol. 43, 126110 (2020).

40. K. Finster, Microbiological disproportionation of inorganic sulfur compounds. J. Sulfur Chem. 29, 281–292 (2008).

41. C. Thorup, A. Schramm, A. J. Findlay, K. W. Finster, L. Schreiber, Disguised as a Sulfate Reducer: Growth of the Deltaproteobacterium Desulfurivibrio alkaliphilus by Sulfide Oxidation with Nitrate. mBio 8, e00671-17 (2017).

42. A. I. Slobodkin, G. B. Slobodkina, Diversity of Sulfur-Disproportionating Microorganisms. Microbiology (Russian Federation) 88, 509–522 (2019).

43. K. W. Finster, et al., Complete genome sequence of Desulfocapsa sulfexigens, a marine deltaproteobacterium specialized in disproportionating inorganic sulfur compounds. Stand. Genomic Sci. 8, 58–68 (2013).

44. M. Jormakka, et al., Molecular mechanism of energy conservation in polysulfide respiration. Nat. Struct. Mol. Biol. 15, 730–737 (2008).

45. S. Duval, A.-L. Ducluzeau, W. Nitschke, B. Schoepp-Cothenet, Enzyme phylogenies as markers for the oxidation state of the environment: the case of respiratory arsenate reductase and related enzymes. BMC Evol. Biol. 8, 206 (2008).

46. J. E. Butler, N. D. Young, D. R. Lovley, Evolution of electron transfer out of the cell: comparative genomics of six Geobacter genomes. BMC Genomics 11, 40 (2010).

47. C. Cai, et al., A methanotrophic archaeon couples anaerobic oxidation of methane to Fe(III) reduction. ISME J. 12, 1929–1939 (2018).

48. S. E. McGlynn, G. L. Chadwick, C. P. Kempes, V. J. Orphan, Single cell activity reveals direct electron transfer in methanotrophic consortia. Nature 526, 531–535 (2015).

49. V. Krukenberg, et al., Gene expression and ultrastructure of meso- and thermophilic methanotrophic consortia. Environ. Microbiol. 20, 1651–1666 (2018).

50. D. Søndergaard, C. N. S. Pedersen, C. Greening, HydDB: A web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212 (2016).

51. P. Pedroni, et al., Characterization of the locus encoding the [Ni-Fe] sulfhydrogenase from the archaeon Pyrococcus furiosus: evidence for a relationship to bacterial sulfite reductases. Microbiology 141, 449–458 (1995).

52. K. Ma, R. N. Schicho, R. M. Kelly, M. W. Adams, Hydrogenase of the hyperthermophile Pyrococcus furiosus is an elemental sulfur reductase or sulfhydrogenase: evidence for a sulfur-reducing hydrogenase ancestor. Proc. Natl. Acad. Sci. 90, 5341–5344 (1993).

53. B. Luef, et al., Diverse uncultivated ultra-small bacterial cells in groundwater. Nat. Commun. 6, 6372 (2015).

54. C. J. Castelle, et al., Biosynthetic capacity, metabolic variety and unusual biology in the CPR and DPANN radiations. Nat. Rev. Microbiol. 16, 629–645 (2018).

55. K. C. Wrighton, et al., Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science 337, 1661–1665 (2012).

56. C. J. Castelle, et al., Genomic Expansion of Domain Archaea Highlights Roles for Organisms from New Phyla in Anaerobic Carbon Cycling. Curr. Biol. 25, 690–701 (2015).

57. R. E. Danczak, et al., Members of the Candidate Phyla Radiation are functionally differentiated by carbon- and nitrogen-cycling capabilities. Microbiome 5, 112 (2017).

58. L. Tišáková, B. Vidová, J. Farkašovská, A. Godány, Bacteriophage endolysin Lyt μ1/6: characterization of the C-terminal binding domain. FEMS Microbiol. Lett. 350, 199–208 (2014).

59. B. Maier, G. C. L. Wong, How Bacteria Use Type IV Pili Machinery on Surfaces. Trends Microbiol. 23, 775–788 (2015).

60. K. C. Wrighton, et al., Metabolic interdependencies between phylogenetically novel fermenters and respiratory organisms in an unconfined aquifer. ISME J. 8, 1452–1463 (2014).

61. K. C. Wrighton, et al., RubisCO of a nucleoside pathway known from Archaea is found in diverse uncultivated phyla in bacteria. ISME J. 10, 2702–2714 (2016).

62. C. He, et al., Genome-resolved metagenomics reveals site-specific diversity of episymbiotic CPR bacteria and DPANN archaea in groundwater ecosystems. Nat. Microbiol. (2021) https://doi.org/10.1038/s41564-020-00840-5.

63. L. Johansson, J. Stahlén, T. Taborowski, K. Pedersen, “Microbial Release of Iron from Olkiluoto Rock Minerals” (Posiva Oy, 2019).


64. J. Milucka, et al., Zero-valent sulphur is a key intermediate in marine methane oxidation. Nature 491, 541–546 (2012).

65. S.-C. Chen, et al., Anaerobic oxidation of ethane by archaea from a marine hydrocarbon seep. Nature 568, 108–111 (2019).

66. R. Kevorkian, S. Callahan, R. Winstead, K. G. Lloyd, ANME-1 archaea drive methane accumulation and removal in estuarine sediments. bioRxiv, 2020.02.24.963215 (2020).

67. X. He, et al., Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle. Proc. Natl. Acad. Sci. 112, 244–249 (2015).

68. K. L. Cross, et al., Targeted isolation and cultivation of uncultivated bacteria by reverse genomics. Nat. Biotechnol. 37, 1314–1321 (2019).

69. C. J. Castelle, et al., Protein family content uncovers lineage relationships and bacterial pathway maintenance mechanisms in DPANN archaea. bioRxiv, 2021.01.12.426361 (2021).

70. P. Geesink, et al., Genome-inferred spatio-temporal resolution of an uncultivated Roizmanbacterium reveals its ecological preferences in groundwater. Environ. Microbiol. 22, 726–737 (2020).

71. R. Kietäväinen, L. Purkamo, The origin, source, and cycling of methane in deep crystalline rock biosphere. Front. Microbiol. 6, 725 (2015).

72. S. Chandra, E. Auken, P. K. Maurya, S. Ahmed, S. K. Verma, Large Scale Mapping of Fractures and Groundwater Pathways in Crystalline Hardrock By AEM. Sci. Rep. 9, 398 (2019).

73. T. Vaittinen, H. Ahokas, J. Nummela, S. Paulamäki, “Hydrogeological Structure Model of the Olkiluoto Site – Update in 2010, Posiva Report 2011-65” (2011).

74. P. Tuomi, et al., “Conceptual Model of Microbial Effects on Hydrogeochemical Conditions at the Olkiluoto Site” in Posiva Report 2020-03, (Posiva Oy, 2020), p. 278.

75. A. Bagnoud, O. Leupin, B. Schwyn, R. Bernier-Latmani, Rates of microbial hydrogen oxidation and sulfate reduction in Opalinus Clay rock. Appl. Geochemistry 72, 42–50 (2016).

76. K. Chourey, et al., Environmental proteomics reveals early microbial community responses to biostimulation at a uranium- and nitrate-contaminated site. Proteomics 13 (2013).

77. J. G. Caporaso, et al., Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. 108, 4516–4522 (2011).

78. R. C. Edgar, Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).

79. R. C. Edgar, “UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing” (Cold Spring Harbor Laboratory, 2016) https://doi.org/10.1101/081257 (October 3, 2017).

80. R. Edgar, SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences. bioRxiv, 074161 (2016).

81. M. Lee, making-usearch-compatible-tax-file-from-dada2-silva-v138-format.sh (2020) https://doi.org/10.6084/m9.figshare.12226943.v1.


82. C. Quast, et al., The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013).

83. K. S. Andersen, R. H. Kirkegaard, S. M. Karst, M. Albertsen, ampvis2: an R package to analyse and visualise 16S rRNA amplicon data. bioRxiv, 299537 (2018).

84. M. Martin, Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal; Vol 17, No 1 Next Gener. Seq. Data Anal. (2011) https://doi.org/10.14806/ej.17.1.200.

85. H. Xu, et al., FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads. PLoS One 7, e52249 (2012).

86. D. Li, C. M. Liu, R. Luo, K. Sadakane, T. W. Lam, MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).

87. N. L. Bray, H. Pimentel, P. Melsted, L. Pachter, Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

88. J. Alneberg, et al., Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146 (2014).

89. B. Bushnell, J. Rood, E. Singer, BBTools Software Package. PLoS One 12, e0185056 (2017).

90. D. D. Kang, et al., MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 7, e7359 (2019).

91. C. M. K. Sieber, et al., Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat. Microbiol. 3, 836–843 (2018).

92. P.-A. Chaumeil, A. J. Mussig, P. Hugenholtz, D. H. Parks, GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics (2019) https://doi.org/10.1093/bioinformatics/btz848 (January 9, 2020).

93. M. R. Olm, C. T. Brown, B. Brooks, J. F. Banfield, dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 11, 2864–2868 (2017).

94. D. H. Parks, M. Imelfort, C. T. Skennerton, P. Hugenholtz, G. W. Tyson, CheckM: Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).

95. C. Jain, L. M. Rodriguez-R, A. M. Phillippy, K. T. Konstantinidis, S. Aluru, High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114 (2018).

96. A. Bankevich, et al., SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).

97. M. D. Lee, GToTree: a user-friendly workflow for phylogenomics. Bioinformatics (2019) https://doi.org/10.1093/bioinformatics/btz188 (July 18, 2019).

98. L. A. Hug, et al., Critical biogeochemical functions in the subsurface are associated with bacteria from new phyla and little studied lineages. Environ. Microbiol. 18, 159–173 (2016).


99. I. Letunic, P. Bork, Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

100. D. Hyatt, et al., Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).

101. Z. Zhou, et al., METABOLIC: High-throughput profiling of microbial genomes for functional traits, biogeochemistry, and community-scale metabolic networks. bioRxiv, 761643 (2020).

102. T. Aramaki, et al., KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics 36, 2251–2252 (2020).

103. J. D. Selengut, et al., TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 35, D260–D264 (2007).

104. R. D. Finn, et al., Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).

105. R. C. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

106. S. R. Eddy, Accelerated Profile HMM Searches. PLoS Comput. Biol. 7, e1002195 (2011).

107. E. Katoh, et al., High precision NMR structure of YhhP, a novel Escherichia coli protein implicated in cell division. J. Mol. Biol. 304, 219–229 (2000).

108. E. A. McDaniel, K. Anantharaman, K. D. McMahon, metabolisHMM: Phylogenomic analysis for exploration of microbial phylogenies and metabolic pathways. bioRxiv, 2019.12.20.884627 (2019).

109. N. Y. Yu, et al., PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics 26, 1608–1615 (2010).

110. A. Krogh, B. Larsson, G. von Heijne, E. L. L. Sonnhammer, Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580 (2001).

111. S. Lu, et al., CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268 (2020).

112. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. Lipman, Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

113. D. L. Tabb, C. G. Fernando, M. C. Chambers, MyriMatch: Highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis. J. Proteome Res. 6, 654–661 (2007).

114. Z. Q. Ma, et al., IDPicker 2.0: Improved protein assembly with high discrimination peptide identification filtering. J. Proteome Res. 8, 3872–3881 (2009).


Figures

[Figure 1 image not reproduced. Labels recoverable from the extraction: panels A and B, both plotted against sampling month (03 to 12); panel A shows SO4 2−, HS− and δ³⁴S (‰ VCDT); panel B shows DIC (mM), CH4 (mM) and δ¹³C (‰ VPDB).]

Figure 1. Groundwater chemistry (drillhole OL-KR13). (A) Sulfur species and sulfur isotope values (³⁴S/³²S expressed as δ³⁴S_SO4); (B) dissolved inorganic carbon (DIC), total organic carbon (DOC) and methane. Carbon isotope values (¹³C/¹²C) are expressed as δ¹³C_DIC and δ¹³C_CH4. Data used in this figure is provided in Dataset S1.
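As a reading aid for the isotope values reported for Figure 1, the conventional delta notation can be sketched as below. This is the standard textbook definition and is assumed here rather than quoted from the manuscript; the symbol R (the heavy-to-light isotope ratio) is introduced only for illustration, and the reference standards are those named on the figure axes (VCDT for sulfur, VPDB for carbon). Values are reported in per mil (‰).

```latex
\[
\delta^{34}\mathrm{S} = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{VCDT}}} - 1 \right) \times 1000,
\qquad R = {}^{34}\mathrm{S} / {}^{32}\mathrm{S}
\]
\[
\delta^{13}\mathrm{C} = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{VPDB}}} - 1 \right) \times 1000,
\qquad R = {}^{13}\mathrm{C} / {}^{12}\mathrm{C}
\]
```

Under this convention a negative value, such as the δ¹³C range of 0 to −50 ‰ on the axis of panel B, means the sample is depleted in the heavy isotope relative to the standard.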


[Figure 2 image not reproduced. Labels recoverable from the extraction: phyla Desulfobacterota, Caldatribacteriota, Nanoarchaeota, Verrucomicrobiota, Actinobacteriota, Patescibacteria, Thermoplasmatota, Acidobacteriota, Chloroflexota, Nitrospirota, Halobacteriota, Bacteroidota, Archaea (k), Firmicutes, Elusimicrobiota, Campilobacterota, Planctomycetota, Dependentiae, Marinimicrobia and Remaining taxa; samples Mar, May, June, July, Aug, Sept 1, Sept 2, Sept 3, Nov 1, Nov 2 and Nov (0.1 µm); markers for Metagenome, Metaproteome, MAGs and SAGs; legend: 16S rRNA gene amplicon read abundance (%), 5 to 45.]

Figure 2. Phyla detected by 16S rRNA gene amplicon sequencing of groundwater from drillhole OL-KR13. DNA was extracted from biomass collected on a 0.2 µm filter for all samples except one (Nov (0.1 µm)). If the phylum could not be assigned, the kingdom is given, denoted by a ‘k’ in parentheses. Sampling months (March – November) with a corresponding metagenome (filled star), metaproteome (filled pentagon) and SAGs (filled circle) are indicated. The number of MAGs and SAGs is given for each phylum following dereplication. Verrucomicrobiota includes MAGs from the class Omnitrophia in the Silva 138 release that are included as a separate phylum (Omnitrophota) in the GTDB release 95. Genomes assigned to remaining taxa are: Altiarchaeota (MAG + SAG); Bacteria UBP18 (2x MAGs); Delongbacteria (MAG); Bipolaricaulota (SAG); Cloacimonadota (SAG).


[Figure 3 image not reproduced: phylogenomic tree with Bacteria and Archaea labelled and a scale bar of 0.1. Phylum-level labels recoverable from the extraction: Altiarchaeota, Patescibacteria (CPR), UBP18, Marinisomatota, Bacteroidota, EX4484-52, Acidobacteriota, Firmicutes, Nanoarchaeota, Nitrospirota, Chloroflexota, Halobacteriota, Actinobacteriota, Cloacimonadota, Thermoplasmatota, Delongbacteria, Bipolaricaulota, Desulfobacterota, Dependentiae, Planctomycetota, Elusimicrobiota, Omnitrophota, Verrucomicrobiota, Caldatribacteriota and Proteobacteria. Individual tip labels are omitted.]

Figure 3. Diversity of recovered genomes (dereplicated MAGs and SAGs) based on 16 single copy genes. MAGs are indicated by a blue star and SAGs are indicated by a pink circle. The colors represent different phylum-level lineages. Letters in parentheses indicate the taxonomic rank assignment: o, order; c, class; f, family; g, genus; s, species. Full taxonomic assignments are provided in Dataset S2. MAGs/SAGs marked with an asterisk (*) were excluded from the tree as they contained too few of the single copy genes used for alignment. The scale bar corresponds to per cent average amino acid substitution over the alignment.


[Figure 4 image not reproduced. Labels recoverable from the extraction: MAGs grouped by phylum (Desulfobacterota, Nitrospirota, Halobacteriota); proteins grouped as SO4 2− reduction, S reduction & disproportionation, H2 oxidation and CH4 oxidation; legend: MAG (Present) and Metaproteome (spectral counts: 5, 25, 50, 95).]

Figure 4. Abundance of proteins involved in sulfur, hydrogen and methane metabolism in Desulfobacterota, Nitrospirota and Halobacteriota. Filled circles indicate gene presence in the MAG. The size of the circle shows the abundance of the protein (spectral counts) in the metaproteome.


[Figure 5 image not reproduced. Labels recoverable from the extraction: genomes grouped as Patescibacteria (CPR), DPANN and Candidate phyla; gene category labels include Sulfur, Selenate reduction, Nitrogen, Methane, C1 metabolism, Carbon fixation, Fermentation, Hydrogenase, Oxidative phosphorylation, Complex carbon, Fatty acids, Aromatics and Amino acids; legend: Present, Absent.]

Figure 5. Metabolic profile of ultra-small CPR bacteria, DPANN archaea and other candidate phyla bacteria. Heatmap (presence/absence) of metabolic genes in MAGs and SAGs in Olkiluoto groundwater. Full gene names and enzyme reactions are provided in Dataset S3.


[Figure 6 image not reproduced. Labels recoverable from the extraction: NOM, OC, CPR, DPANN, SO4 2−, ETH-SRB1 and HS−.]

Figure 6. Proposed sulfur and methane cycling in Olkiluoto groundwater. Solid lines indicate metaproteomic evidence for the processes depicted while dashed lines indicate putative processes consistent with the data. Bold text indicates substrates present in the groundwater. Abbreviations shown in the figure are as follows: SRB, sulfate-reducing bacteria; SDB, sulfur-disproportionating bacteria; EET, extracellular electron transfer; NOM, natural organic matter; OC, organic carbon.
