bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Carbon Assimilation Strategies in Ultrabasic Groundwater: Clues from the Integrated

2 Study of a Serpentinization-Influenced Aquifer

3

4 L.M. Seyler, W.J. Brazelton, C. McLean, L.I. Putman, A. Hyer, M.D.Y. Kubo, T. Hoehler, D.

5 Cardace, M.O. Schrenk

6

7 Abstract

8 Serpentinization is a low-temperature metamorphic process by which ultramafic rock

9 chemically reacts with water. These reactions provide energy and materials that may be

10 harnessed by chemosynthetic microbial communities at hydrothermal springs and in the

11 subsurface. However, the biogeochemistry of microbial populations that inhabit these

12 environments are understudied and are complicated by overlapping biotic and abiotic processes.

13 We applied metagenomics, metatranscriptomics, and untargeted metabolomics techniques to

14 environmental samples taken from the Coast Range Ophiolite Microbial Observatory (CROMO),

15 a subsurface observatory consisting of twelve wells drilled into the ultramafic and serpentinite

16 mélange of the Coast Range Ophiolite in California. Using a combination of DNA and RNA

17 sequence data and mass spectrometry data, we determined that several carbon assimilation

18 strategies, including the Calvin-Benson-Bassham cycle, the reverse tricarboxylic acid cycle, the

19 reductive acetyl-CoA pathway, and methylotrophy are used by the microbial communities

20 inhabiting the serpentinite-hosted aquifer. Our data also suggests that the microbial inhabitants of

21 CROMO use products of the serpentinization process, including methane and formate, as carbon

22 sources in a hyperalkaline environment where dissolved inorganic carbon is unavailable.

23 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

24 Importance

25 This study describes the metabolic pathways by which microbial communities in a

26 serpentinite-influenced aquifer may produce biomass from the products of serpentinization.

27 Serpentinization is a widespread geochemical process, taking place over large regions of the

28 seafloor, particularly in slow-spreading mid ocean ridge and subduction zone environments. The

29 serpentinization process is implicated in the origin of life on Earth and as a possible environment

30 for the discovery of life on other worlds in our solar system. Because of the difficulty in

31 delineating abiotic and biotic processes in these environments, major questions remain related to

32 microbial contributions to the carbon cycle and physiological adaptation to serpentinite habitats.

33 This research explores multiple mechanisms of carbon assimilation in serpentinite-hosted

34 microbial communities.

35

36 Introduction

37 Serpentinization is the process by which ultramafic rock in the lower crust and upper

38 mantle of the Earth is hydrated, leading to the oxidation of ferrous iron in the minerals olivine

- 39 and pyroxene, and the release of hydrogen (H2) and hydroxyl ions (OH ). The reducing power

40 supplied by serpentinization, when mixed with oxidants from surface and subsurface fluids,

41 creates chemical disequilibria that microbial populations can harness (McCollom, 2007; Amend

42 et al., 2011; Cardace et al., 2015). However, due to the release of cations during rock weathering

43 much of the dissolved inorganic carbon (DIC) in serpentinite systems is precipitated as calcite

44 and aragonite (Barnes et al., 1978). Consequently, small carbon-bearing compounds including

45 methane, carbon monoxide, and formate can take on important roles in sustaining microbial

46 ecosystems in serpentinites. The high pH and limited availability of both DIC and electron bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

47 acceptors creates a challenging environment for microbial communities hosted in

48 serpentinization-influenced environments (Twing et al. 2017).

49 Overlapping biogenic and abiogenic processes in serpentinizing rocks make it difficult to

50 delineate carbon cycling pathways used in these environments (Figure 1). Under highly reducing

51 conditions, in the presence of specific mineral catalysts, hydrogen reacts with carbon dioxide

52 (CO2) or carbon monoxide (CO) to form methane (CH4) and small-chain hydrocarbons via

53 Fischer-Tropsch type reactions (McCollom and Seewald, 2001; Charlou et al., 2002;

54 Proskurowski et al., 2008; McCollom, 2013). These same compounds may also be formed by

55 biological activity, or diagenesis of organic matter (Sephton and Hazen, 2013). For example,

56 high concentrations of dissolved organic material, produced by past microbiological activity that

57 may have been supported by the by-products of serpentinization, have been detected in close

58 association with serpentine-hosted hydrogarnets recovered from the Mid-Atlantic Ridge (Ménez

59 et al., 2012). Acetate in endmember fluids at Lost City is also likely produced by biological

60 activity (Lang et al., 2010), but other hydrocarbons in the rock-hosted fluids do not have a clear

+ 61 biotic or abiotic source (Delacour et al., 2008), though formate and C2 alkanes appear to be

62 produced abiotically (Proskurowski et al., 2008; Lang et al., 2010). Biotic versus abiotic sources

63 of methane in serpentinizing systems are likewise difficult to resolve and vary from site to site

64 (Bradley and Summons, 2010; Etiope et al., 2013; Wang et al., 2015; Kohl et al., 2016).

65 In situ microbial communities may utilize the products of serpentinization - including

66 methane, carbon monoxide, formate, and acetate - as sources of carbon (Schrenk et al., 2004;

67 Lang et al., 2010; Brazelton et al., 2011; Brazelton et al., 2012; Lang et al., 2012; Brazelton et

68 al., 2013; Crespo-Medina et al., 2014). Microbial communities in serpentinite springs in the

69 Voltri Massif, Italy, appear to perform both aerobic methanotrophy and methanogenesis using bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

70 CO2 liberated from acetate (Brazelton et al., 2017). Methanogens in The Cedars, CA, are capable

71 of producing methane from both inorganic carbon and a variety of organic carbon substrates:

72 acetate may be produced through fermentation (Kohl et al., 2016), and communities that reside

73 deeper in the springs also possess a reductive acetyl-CoA pathway for autotrophic acetogenesis

74 (Suzuki et al., 2017). Carbon monoxide may also be used as a substrate by microbial

75 communities in the Tablelands, Canada, but it appears to be primarily used as a source of energy

76 rather than carbon (Morrill et al., 2014). The carbonate chimneys of the Lost City Hydrothermal

77 Field are dominated by a single of archaea that may be capable of both production and

78 consumption of methane in tandem (Brazelton et al., 2011). Genes encoding NAD(+)-dependent

79 formate dehydrogenase in metagenomes recovered from the Samail Ophiolite aquifer suggest

80 that formate is an important source of carbon in these DIC-limited hyperalkaline waters (Fones et

81 al., 2019). In short, carbon assimilation in these environments is both enigmatic to researchers

82 and challenging for in situ microbial communities, but pathways and strategies utilizing alternate

83 sources of carbon supplied by serpentinization appear to be prevalent.

84 In this study, we examined the microbial metabolic potential and biogeochemistry of a

85 serpentinization-influenced aquifer in the Coast Range Ophiolite Microbial Observatory

86 (CROMO), California. Wells have been drilled at various depths and locations at CROMO such

87 that microbial processes can be explored across a wide range of physical and chemical

88 conditions. Carbon assimilation pathways in the microbial communities at CROMO were

89 investigated using a combination of metagenomic, metatranscriptomic, and metabolomic

90 approaches. Metagenomic data sets from nine of the twelve CROMO wells, and

91 metatranscriptomic data sets from four wells, obtained over multiple sampling trips, were

92 queried for complete pathways and individual genes associated with carbon assimilation. We bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

93 then applied environmental metabolomics, an emerging approach used to directly assess the

94 products of ecosystem activity, on intra- and extracellular extracts from the two most alkaline

95 wells (CSW1.1 and QV1.1) to describe the dissolved organic carbon pools in these wells. This

96 study links the genetic potential of serpentinite-hosted communities to the observed

97 biogeochemistry of the groundwater in CROMO.

98

99 Materials and Methods

100 Site Description

101 The Coast Range Ophiolite in northern California is a 155–170 million-year-old portion

102 of oceanic crust tectonically emplaced on the North American continental margin. Trapped

103 Cretaceous seawater and circulating meteoric water reacts with ultramafic rock (Peters, 1993;

104 Shervais et al., 2005) as evidenced by numerous calcium hydroxide rich springs throughout the

105 formation (Barnes and O’Neil, 1971). Previous studies of continental serpentinite environments

106 have sampled similar springs in the Tablelands, Newfoundland, Canada (Brazelton et al., 2012;

107 Brazelton et al., 2013) and The Cedars, California (Suzuki et al., 2013), providing information on

108 the surface expression of subsurface biogeochemical processes. In order to directly access the

109 serpentinite subsurface environment and monitor groundwater biogeochemistry with minimal

110 contamination, a microbial observatory was established in the Coast Range Ophiolite at the UC-

111 Davis McLaughlin Natural Reserve (Lower Lake, CA) (Cardace et al., 2013). The Coast Range

112 Ophiolite Microbial Observatory (CROMO) consists of two sets of wells 1.4 km apart—the

113 Quarry Valley (QV) and Core Shed Wells (CSW)—each with an uncased main borehole

114 (designated “1.1”) surrounded by two or four monitoring wells, respectively. In addition to the

115 eight newly drilled wells, monitoring wells (N08A, N08B, and N08C at Quarry Valley, and bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

116 CSWold) were previously drilled by the Homestake Mining Company, Inc. more than 30 years

117 prior, in conjunction with mining operations conducted in the area. All of the wells are cased

118 with PVC, with the exception of the 1.1 wells which are only cased to bedrock, leaving the

119 bottom of the hole uncased.

120

121 Extraction of DNA and RNA

122 For each environmental sample from the CROMO wells, four liters of fluid were pumped

123 via slow flow bladder pump (Cardace et al., 2013) and immediately filtered through Sterivex 0.2

124 µm filter cartridges (Millipore, Billerica, MA) using a portable peristaltic pump (Masterflex®,

125 Cole Parmer, Vernon Hills, IL). Cartridges were kept on ice during filtration, immediately stored

126 in liquid nitrogen upon completion, shipped to the home laboratory, and stored at -80℃ until

127 processing. Total genomic DNA extractions were completed as previously described by

128 Brazelton et al., (2017), Crespo-Medina et al., (2017) and Twing et al., (2017) and briefly

129 described here. Freeze/thaw cycles and lysozyme/Proteinase K treatment were performed to lyse

130 cells, followed by purification with phenol-chloroform, precipitation using ethanol, and

131 purification using QiaAmp (Qiagen, Hilden, Germany) columns according to the manufacturer's

132 instructions. A Qubit 2.0 fluorometer (ThermoFisher, Waltham, MA) was used to quantify

133 extracted DNA using a Qubit dsDNA High Sensitivity Assay kit.

134 Extractions for RNA from CROMO wells were performed as described previously with

135 slight modifications (Lin and Stahl, 1995; MacGregor et al., 1997). Briefly, frozen 0.2 µm

136 Sterivex filter cartridges were broken open, cut into four equal pieces, and divided into two

137 screw-cap Eppendorf tubes containing phenol, 20% sodium dodecyl sulfate, 5× low-pH buffer,

138 and 0.2 to 0.5 g baked zirconium beads. Samples were bead-beaten for 3 minutes, heated in a bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

139 60°C water bath for 10 minutes, bead beaten again for 3 minutes, and centrifuged at 4°C and

140 18,407 × g to separate phases. Supernatant was transferred to a fresh Eppendorf tube and chilled.

141 1× low-pH buffer was added to the remaining sample, and bead-beating was repeated.

142 Supernatants were combined, and phenol, 1:1 phenol:chloroform, and chloroform were added in

143 series with vortexing and centrifugation in between. Between steps, aqueous phases were

144 transferred to clean Eppendorf tubes. The final aqueous phase was transferred to a clean

145 Eppendorf tube with additions of ammonium acetate, isopropanol, and magnesium chloride

146 before vortexing and incubation at -20°C overnight. Samples were centrifuged for 30 minutes at

147 4°C, washed with ethanol, and dried under vacuum before suspension in RNase-free water and

148 storage at -80°C until analyzed.

149

150 Metagenome and Metatranscriptome Analyses

151 This study reports a new analysis of the metagenome and metatranscriptome sequences

152 reported in Sabuda et al. (in review). Our metagenomic assembly and mapping of

153 metatranscriptome reads was identical to that of Sabuda et al., and all methods are reported again

154 here for completeness. Metagenome and metatranscriptome sequencing was conducted at the

155 Joint Genome Institute (JGI) on an Illumina HiSeq2000 instrument. A Covaris LE220

156 ultrasonicator was used to shear DNA samples into 270 bp fragments, and size selection was

157 performed using solid phase reversible immobilization (SPRI) magnetic beads (Rohland and

158 Reich, 2012). DNA fragments were end-repaired, A-tailed, and ligated with Illumina-compatible

159 adapters with barcodes unique for each library. KAPA Biosystem’s next-generation sequencing

160 library qPCR kit and Roche LightCycler 280 RT PCR instrument were used to quantify libraries.

161 10-library pools were assembled and prepared for Illumina sequencing in one lane each. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

162 Clustered flowcells were produced using a TruSeq paired-end cluster kit (v3) and Illumina’s

163 cBot instrument. The Illumina HiSeq2000 instrument was utilized with a TruSeq SBS

164 sequencing kit (v3) and a 2 × 150 indexed run recipe to sequence the samples. The raw sequence

165 reads were trimmed by the JGI with a minimum quality score cutoff of 10 and to remove

166 adapters. These trimmed reads from CROMO wells were previously reported by Twing et al.,

167 2017, but additional quality-filtering and a new assembly, distinct from the JGI assembly

168 reported by Twing et al., was performed for this study.

169 The trimmed reads from the JGI were subjected to an additional quality screen to trim 3'

170 adapters with cutadapt v. 1.15 (Martin, 2011), to remove replicate sequences, and to trim

171 sequences again with a threshold of 20 along a sliding window of 6 bases with qtrim (part of the

172 seq-qc package: https://github.com/Brazelton-Lab/seq-qc). All CROMO metagenomes and

173 metatranscriptomes were pooled together for a master CROMO assembly computed with Ray

174 Meta v.2.3.1 (Boisvert et al., 2012). Phylogenetic affiliation of contigs was assigned using

175 PhyloPythiaS+ (Gregor et al., 2016). Metagenome and metatranscriptome short reads were

176 mapped to the pooled assembly using Bowtie2 v.2.2.6 (Langmead and Salzberg, 2013). The

177 Prokka pipeline (Seemann, 2014) was used for gene prediction and functional annotation of

178 contigs. The arguments –metagenome and –proteins were used with Prokka v.1.12 to indicate

179 that genes should be predicted with the implementation of Prodigal v.2.6.2 (Hyatt et al., 2010)

180 optimized for metagenomes as described by Twing et al., 2017. Metagenome-assembled genome

181 (MAG) bins were constructed with ABAWACA (https://github.com/CK7/abawaca) using

182 tetranucleotide frequencies and differential abundance as measured by Bowtie2 mapped read

183 abundances. Bin quality was computed with CheckM (Parks et al., 2015), and only "high

184 quality" MAG bins are reported here (>50% completeness, <10% contamination, as defined by bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

185 Bowers et al., 2017). The completeness and contamination of some bins were improved by

186 refinement with Binsanity (Graham et al., 2017). Taxonomic classifications of MAGs were

187 annotated using GTDB-Tk (https://github.com/Ecogenomics/GTDBTk; Parks et al., 2018).

188 Predicted protein-coding sequences were annotated by searching the Kyoto Encyclopedia

189 of Genes and Genomes (KEGG; Ogata et al., 1999; Kanehisa and Goto, 2000) release v. 83.2

190 within Prokka. HTSeq v.0.6.1 (Anders et al., 2015) and were used to calculate predicted protein

191 abundances. The abundances of predicted protein functions in all CROMO metagenomes and

192 metatranscriptomes were normalized to predicted protein size and metagenome size. Data

193 reported here are in units of metagenome fragments per kilobase of predicted protein sequence

194 per million mapped reads (FPKM). Abundances of metabolic pathways were obtained by

195 mapping KEGG protein IDs and their normalized counts onto the FOAM ontology (Prestat et al.,

196 2014) with MinPath (Ye & Doak, 2009) as implemented in HUMAnN2 v.0.6.0 (Abubucker et

197 al., 2012).

198 The complete translated genome assemblies of Serpentinomonas isolates from the Cedars

199 (Comamonadaceae bacterium A1, B1, and H1) were obtained as amino acid FASTA files from

200 RefSeq (accession numbers: GCF_000828895.1, GCF_000828915.1, GCF_000696225.1). ORFs

201 were annotated as KEGG orthologs (KOs) using BlastKOALA (Kanehisa and Goto 2000).

202 KEGG modules were annotated within genomes using the KEGG module reconstruction tool.

203 MAGs were translated to protein sequences using Prodigal (Hyatt et al., 2010) and annotated in

204 the same manner. Modules within the three cultured representative genomes and the MAGs were

205 then directly compared in order to compare pathways identified in available cultivars from the

206 Cedars to pathways in the uncultured communities hosted in CROMO’s hyperalkaline bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

207 groundwater. A phylogenetic tree comparing MAGs to the Serpentinomonas isolates was

208 generated using SpeciesTree in KBase (Arkin et al., 2018).

209

210 Metabolite Extraction and Analysis

211 Three 4-L samples of groundwater from QV1.1 and CSW1.1 were taken in June 2016

212 and filtered using 0.22 µm 47 mm polytetrafluoroethylene (PTFE) filters on a glass vacuum

213 filtration tower. Filters were frozen in liquid nitrogen and stored at -80° C until extraction (Fiore

214 et al., 2015; Kido Soule et al., 2015). Approximately 30 mL filtrate was set aside and frozen at -

215 20 °C for total organic carbon analysis on a Shimadzu Total Organic Carbon Analyzer.

216 Formaldehyde was measured using a CHEMetricsTM Vacu-vialsTM colorimetric kit (detection

217 range 0.4-8.0 ppm). The remaining filtrate was acidified to pH 2-3 using concentrated HCl and

218 stored at 4°C for 5 days until further analysis by solid phase extraction.

219 Frozen filters were cut into small pieces over a muffled piece of aluminum foil, using

220 methanol-washed scissors, then transferred into muffled amber glass vials. 2 mL cold extraction

221 solvent (40:40:20 acetonitrile:methanol:0.1 M formic acid) was added to each vial, and the

222 samples were sonicated for 10 minutes. Extracts were transferred to a centrifuge tube via pipette

223 and centrifuged at 20,000 × g for 5 min at 4 °C. The supernatant was then transferred to a new

224 amber glass vial, neutralized with 6 M ammonium hydroxide, and vacuum centrifuged to

225 dryness. Samples were dissolved in 495 µl 95:5 water:acetonitrile with 5 µl of 5 µg/ml of biotin-

226 (ring-6,6-d2) added as an injection standard (Rabinowitz and Kimball, 2007; Fiore et al., 2015).

227 Dissolved organics were captured from the acidified filtrate on solid phase extraction

228 (SPE) Bond Elut-PPL cartridges (Agilent Technologies) and eluted with 100% methanol into

229 muffled amber glass vials (Dittmar et al., 2008; Fiore et al., 2015). SPE-PPL cartridges were bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

230 rinsed with methanol immediately before use. The supernatant was passed through the cartridge

231 using 1/8" × 1/4" PTFE tubing to pull the supernatant into the cartridge to minimize the

232 possibility of contamination from plastic leaching, while Viton tubing was used to remove the

233 discarded flow-through via peristaltic pump (flow rate not exceeding 40 ml/min). The cartridges

234 were then rinsed with at least 2 cartridge volumes of 0.01 M HCl to remove salt, and the sorbent

235 dried with air for 5 minutes. Dissolved Organic Matter (DOM) was eluted from the cartridge

236 with 2 ml methanol via gravity flow into a muffled amber glass vial. The eluate was then vacuum

237 centrifuged to dryness. Precipitated samples were dissolved in 495 µl 95:5 water:acetonitrile,

238 with 5 µl of 5 µg/ml of biotin-(ring-6,6-d2).

239 The above protocol was repeated for 4 L milliQ water, and organics extracted from the

240 filter and filtrate as above as extraction blanks. 495 µl 95:5 water:acetonitrile plus 5 µl 5 µg/mL

241 biotin-(ring-6,6-d2) was also run as a blank.

242 Samples were analyzed using tandem liquid chromatography mass spectrometry

243 (LC/MS/MS) at the Michigan State University Metabolomics Core facility. Triplicate samples

244 from each well were separated chromatographically in a Acquity UPLC BEH C18 column (1.7

245 µm, 2.1 mm × 50 mm) using a polar/non-polar gradient made up of 10 mM TBA and 15 mM

246 acetic acid in 97:3 water:methanol (solvent A), and 100% methanol (solvent B). The gradient

247 was run at 99.9/0.1 (% A/B) to 1.0/99.0 (% A/B) over 9 min, held an additional 3 min at 1.0/99.0

248 (% A/B), then reversed to 99.9/0.1 (% A/B) and held another 3 min. At a rate of 400µL/min, the

249 sample was fed into a quadrupole time-of-flight mass spectrometer using a Waters Xevo G2-XS

250 MS/MS in negative ion mode using a data independent collection method (m/z acquisition range

251 50 to 1500 Da). OM levels were too low to allow distinction between samples and blanks from

252 the first run, so triplicate samples were pooled and run as before. After data was converted to bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

253 centroid mode, raw data files were converted from proprietary Waters RAW format into XML-

254 based mzML format using MSConvert (Chambers et al, 2012). Data was processed using XCMS

255 (Smith et al., 2006) parametrized by AutoTuner (McLean and Kujawinski, in review). Isotopes

256 and adducts of each feature were identified using CAMERA (Kuhl et al., 2012). This approach

257 yielded a table of peak areas that are identified by a unique mass to charge (m/z) and retention

258 time (rt) pair referred to as features. The table of features was subject to quality assurance

259 through blank correction by removing any feature from a sample that also appeared within the

260 blank control (described above) (Broadhurst et al., 2018). Features were first putatively

261 annotated as KEGG compounds using mummichog (Li et al., 2013). Chemical standards of

262 putative annotations were purchased whenever available to check the retention time to increase

263 confidence of annotation from level 1 to 2 (Sumner et al. 2007). Due to the great complexity of

264 the secondary mass spectral data and insufficient coverage of mass spectral databases for natural

265 organic matter, MS2 information could not be readily used to increase confidence of annotation.

266

267 Data Availability

268 CROMO metagenome and metatranscriptome sequences are publicly available in the NCBI SRA

269 with the BioProject IDs PRJNA410019, PRJNA410020, PRJNA410022-410033,

270 PRJNA410035-410037, PRJNA410054, PRJNA410057, PRJNA410286, PRJNA410403,

271 PRJNA410404, PRJNA410553-410555, PRJNA410557. Raw spectrometry data are available in

272 the MetaboLights database under study identifier MTBLS1260.

273

274 Results

275 Presence and Expression of Carbon Assimilation Pathways bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

276 The metagenomic data collected from nine of the twelve CROMO wells contains

277 evidence for a range of carbon fixation/assimilation pathways (Figure 2). Functional profiling of

278 the metagenomes using HUMAnN2 identified multiple C1 compound utilization and

279 assimilation pathways, including the reductive acetyl-CoA (Wood-Ljungdahl) pathway, the

280 reverse tricarboxylic acid (TCA) cycle, formaldehyde oxidation to formate, and formaldehyde

281 fixation to biomass. Whenever these complete pathways were identified in the metagenomic

282 data, they were also detected in the corresponding metatranscriptomic data (Figure 2). A

283 complete Calvin-Benson-Bassham cycle was also detected in all nine metagenomes and all four

284 metatranscriptomes, at a similar coverage across wells (Supplemental Figure 1). Carbon fixation

285 pathways found exclusively in archaea (i.e., hydroxypropionate-hydroxybutylate cycle, 3-

286 hydroxypropionate bicycle) were not detected. MAG bins contained the reductive pentose

287 phosphate/Calvin-Benson-Bassham cycle, reductive TCA cycle, and reductive acetyl-CoA

288 pathways (Figure 3).

289 Several pathways associated with methanotrophy and methylotrophy were detected in the

290 metagenomes and metatranscriptomes. The pathway for aerobic methane oxidation to

291 formaldehyde was present in most of the metagenomes (Figure 4) and the complete pathway was

292 actively transcribed in wells QV1.2 and N08B. Formaldehyde uptake and assimilation pathways

293 were prevalent in all of the sequenced wells (Figure 2). Complete glutathione-dependent and

294 tetrahydrofolate (THF) pathways for formaldehyde oxidation to formate were present in all of

295 the metagenomes, expressed in all four metatranscriptomes (Figure 2, Figure 4), and present in

296 multiple MAGs (Figure 3). (The gfa gene, which codes for an enzyme that catalyzes a

297 thermodynamically spontaneous step in the glutathione-dependent pathway [Goenrich et al.,

298 2002], is only found in some formaldehyde-oxidizing and was absent from most of the bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

299 metagenomes.) Formaldehyde fixation via the tetrahydromethanopterin (H4MPT) pathway was

300 poorly represented across the metagenomes, but was expressed in the metatranscriptomes of

301 QV1.2 and N08B. The capability for formaldehyde fixation to biomass by the ribulose

302 monophosphate (RuMP) and serine pathways was likewise present in all nine metagenomes and

303 all four metatranscriptomes (Figure 2), and the RuMP pathway was partially complete in 18

304 MAGs and complete in one MAG (Figure 3).

305 Pathway identification using the KEGG database and FOAM pathways suggested that

306 homoacetogenesis via the reductive acetyl-CoA pathway (also known as the Wood-Ljungdahl

307 pathway) was represented in the metagenomes of only half of the wells sequenced (N08A,

308 N08B, QV1.2, CSW1.1, and CSWold) (Figure 2). However, a gene-by-gene search indicated

309 that the pathway was also present and nearly complete in the QV1.1 and CSW1.3 wells (Figure

310 5; Supplemental Tables 1 and 2). In the three least alkaline wells (CSW1.4, N08C, and QV1.2),

311 most of the genes in the pathway were poorly represented (< 2 FPKM) in the metagenomic data.

312 Metatranscriptomic data showed that the pathway was actively transcribed in all of the four wells

313 that were sampled for mRNA (Figure 2; Figure 5). However, one of the initial steps of the

314 reductive acetyl-CoA pathway, the conversion of CO2 to formate by NADP-dependent formate

315 dehydrogenase (FDH), appeared to be poorly expressed or altogether missing in both the

316 metagenomes and the metatranscriptomes (Figure 5). Seven of the 66 MAG bins also contained a

317 nearly complete reductive acetyl-CoA pathway that was missing the genes encoding FDH

318 (Figure 3, inset). The type I pathway for the formation of acetate from acetyl-CoA via acetyl-

319 phosphate (present in Methanosarcina and Clostridiales) is relatively abundant across all

320 metagenomes, while the type II pathway (which does not utilize an acetyl-phosphate

321 intermediate) is sparsely represented in the metagenomic data (Figure 2). bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

322

323

324 Taxonomic assignments of the metagenomic contigs using PhyloPythiaS+ were

325 consistent with previous 16S rRNA sequencing data from the CROMO wells (Twing et al.,

326 2017) and showed that the microbial communities in QV1.1 and CSW1.1 (the two most alkaline

327 wells) were highly similar (Supplemental Figure 3). Taxonomic classification of MAG bins,

328 including bin completeness, contamination, and strain heterogeneity, is available in Table 1; a

329 phylogenetic tree comparing the MAGs to the three Serpentinomonas isolates is included in

330 Figure 3.

331 In the metagenomes, pmoA and mxaF genes from the methane oxidation pathway were

332 detected on contigs classified as Methylophilales (Methylophilaceae) and Methylococcales

333 (Methylococcaceae); two MAG bins containing a complete methane oxidation pathway were

334 classified as Methylococcaceae (Figure 3; Table 1). Genes from the THF pathway for

335 formaldehyde oxidation (folD and purU) were associated with Burkholderiales

336 (Comamonadaceae), Deinococcales (Trueperaceae), Hydrogenophilales (Hydrogenophilaceae),

337 Methylophilales (Methylophilaceae), Rhizobiales, and Rhodocyclales (Rhodocyclaceae); the

338 frmA gene from the glutathione pathway for formaldehyde oxidation was detected on contigs

339 identified as Burkholderiales (Comamonadaceae), Rhizobiales, Rhodobacterales, and

340 Rhodocyclales (Rhodocyclaceae). The folD gene is also present in the reductive acetyl-CoA

341 pathway for acetogenesis; this gene, as well as acsB and cooS were associated with Clostridiales

342 (Syntrophomodaceae) and Desulfuromodales. All of the MAG bins containing the reductive

343 acetyl-CoA pathway were classified as Clostridiales (Figure 3; Table 1). The large and small

344 subunits of RuBisCO (rbcL and rbcS) were detected on contigs classified as Burkholderiales bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

345 (Comamonadaceae), Hydrogenophilales (Hydrogenophilaceae), Rhodocyclales

346 (Rhodocyclaceae), and Methylophilales (Methylophilaceae). A complete CBB cycle was only

347 identified on one MAG, however, classified as Rhodocyclales (Rhodocyclaceae). The aclA gene

348 from the reverse TCA cycle was associated with Clostridiales (Syntrophomodaceae) in the

349 metagenomes.

350

351 Metabolomics

352 The metabolomes of QV1.1 and CSW1.1 displayed many unique observed features

353 across both sample types (intracellular metabolites and extracellular metabolites/dissolved

354 organic carbon [DOC]). In particular, the DOC pools of QV1.1 and CSW1.1 were highly distinct

355 from one another and from intracellular extracts, and these metabolomes contained the greatest

356 number of features unique to the sample (Figure 6). Metabolomes also had more features in

357 common between samples of the same type (e.g., intracellular vs. extracellular) than samples

358 from the same well (Figure 6). Dissolved inorganic carbon concentration in the wells was

359 inversely proportional to pH (Pearson correlation coefficient, r = -0.798), while non-purgeable

360 organic carbon was positively correlated to dissolved oxygen concentration (r = 0.587) (Figure

361 7). All wells were below the detection limits of the colorimetric formaldehyde test kit (<0.4

362 ppm).

363

364 Discussion

365 Previous studies of microbial communities at CROMO have used marker gene analysis,

366 shotgun metagenomics, microcosm experiments, and geochemical measurements to elucidate

367 pathways of biogeochemical cycling in a serpentinite-hosted aquifer (Crespo-Medina et al., bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

368 2014; Twing et al., 2017). We expanded upon these efforts by combining metagenomics and

369 metatranscriptomics with an untargeted metabolomics technique to verify genomic expression

370 and metabolic activity. Using this integrated –omics approach, we determined that multiple

371 carbon assimilation pathways are used by the microbial communities in the serpentinite-hosted

372 groundwater at CROMO, including the Calvin-Benson-Bassham cycle, the reverse TCA cycle,

373 acetogenesis via the reductive acetyl-CoA pathway, and methanotrophy.

374

375 Methane Uptake and Formaldehyde Cycling

376 The addition of methane to microcosms from CROMO has been shown to stimulate

377 microbial growth (Crespo-Medina et al., 2014). In the previous metagenomic study of 4 samples

378 by Twing et al. (2017), no particulate methane monooxygenase (pmoA) genes were detected;

379 however, a methanol dehydrogenase (mxaF) gene was identified in the CSW1.3 metagenome.

380 Our expanded search through nine metagenomes and four metatranscriptomes succeeding in

381 finding the genes for particulate methane monooxygenase (PMO) and methanol dehydrogenase

382 (MXA) in the CROMO microbial communities, though at low abundance in the higher pH wells

383 (Figure 4), suggesting that the prior perceived absence of these genes could have been the result

384 of a smaller dataset. Two MAG bins contain the complete methane oxidation pathway, and

385 another two contain a partial pathway (Figure 3). No other known methane monooxygenases or

386 methanol dehydrogenases were identified in the metagenomes or metatranscriptomes.

387 The prevalence of formaldehyde- and formate-assimilation pathways in CROMO

388 reinforces the importance of methanotrophy as a carbon assimilation strategy in these microbial

389 communities. There is a wealth of metagenomic and metatranscriptomic evidence that

390 formaldehyde oxidation to formate and formaldehyde fixation to biomass can be performed by bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

391 the microbial communities in CROMO (Figure 2). Formaldehyde is an important intermediate in

392 methylotrophic metabolic pathways, and though it is highly toxic, methylotrophic

393 microorganisms have developed a variety of functionally redundant modules for rapidly

394 converting formaldehyde to less toxic compounds (Marx et al., 2005). Formaldehyde can also act

395 as a carbon and electron shuttle between microbial subpopulations, in which it is produced by

396 autotrophic primary producers as a by-product, then taken up by methylotrophic microbes and

397 incorporated into biomass (Moran et al., 2016).

398 In methanotrophs, formate oxidation to CO2 is carried out by the NADP-dependent

399 formate dehydrogenase that converts CO2 to formate in the Wood-Ljungdahl pathway, operating

400 in reverse. As previously stated, the genes for this enzyme were not well-represented (and in

401 some cases missing) in the metagenomic data (Figure 5). Formate can also be oxidized to CO2 by

402 an NAD-dependent formate dehydrogenase (FDH) as part of the degradation of oxalate;

403 however, the complete suite of genes required to produce this enzyme was likewise poorly-

404 represented or partially missing, particularly in the most alkaline wells (Supplemental Figure 2).

405 This contrasts with subsurface wells in the Samail Ophiolite, where FDH encoding genes were

406 enriched in metagenomes recovered from alkaline and hyperalkaline groundwater (Fones et al.,

407 2019). Therefore, while there is substantial metagenomic and metatranscriptomic evidence for

408 the production of formate from formaldehyde by the microbial communities in CROMO,

409 DNA/RNA evidence for the further oxidation of formate to CO2 is lacking.

410

411 Bicarbonate Fixation Strategies in the Metagenomes and Metatranscriptomes

412 Despite the DIC concentrations being challengingly low for CO2 fixation at the high pH

413 of the deepest wells, we found complete Calvin-Benson-Bassham (CBB; Supplemental Figure 1) bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

414 and reverse tricarboxylic acid (rTCA; Figure 2) cycles in all nine of our metagenomes and all

415 four metatranscriptomes. Sequences encoding the rbcL gene of the RuBisCo enzyme were

416 previously detected in metagenomes from four wells (CSW1.1, CSW1.3, QV1.1, and QV1.2) in

417 the earlier metagenomic study by Twing et al. (2017). QV1.1 displays the highest coverage of

418 the rTCA pathway in both its metagenome (607.71±55.83 FPKM) and metatranscriptome

419 (446.86±149.59 FPKM) (Figure 2), even though the concentration of dissolved inorganic carbon

420 (DIC) in QV1.1 is lower than almost any other well (11.48) (conversely, presence/expression of

421 the CBB cycle was roughly equal throughout all the wells [Supplemental Figure 1]). Above pH

422 11, carbonate is the dominant species of inorganic carbon. Serpentinomonas spp., which are

423 among the most abundant taxa in the highest pH wells in CROMO (Twing et al., 2017) require

424 the addition of calcium carbonate (even when bicarbonate is also provided) for growth in culture,

425 and form aggregates on carbonate precipitates (Suzuki et al., 2014). Bicarbonate-fixing microbes

426 may create a localized microenvironment of decreased pH where carbonate may be converted to

427 bicarbonate, or they may have bicarbonate transporters that solubilize carbonate. We did not

428 detect any ATP-binding cassette transporters for bicarbonate in our metagenomic data

429 (Supplemental Table 1). Alternatively, formate may be used as a carbon source in the CBB cycle

430 (Bar-Even, 2016) or in the rTCA cycle (Cotton et al., 2018). Formate could thus be produced

431 from formaldehyde via formaldehyde oxidation and then used in assimilatory/anabolic pathways.

432 Evidence for the reductive acetyl-CoA pathway was also previously detected in several

433 wells (Twing et al., 2017). In this study, we found a nearly complete homoacetogenic reductive

434 acetyl-CoA pathway in the metagenomic and metatranscriptomic data from all but the two least

435 alkaline wells, except for one gene. The gene that encodes the beta subunit of formate

436 dehydrogenase (fdhB), which converts CO2 to formate as the first step of the reductive acetyl- bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

437 CoA pathway, represents <1 FPKM in every metagenome but two (N08A and N08B) (Figure 5),

438 and is only present in the metatranscriptomes of QV1.2 and N08B. Likewise, the genes for FDH

439 are missing from every MAG bin that possesses the reductive acetyl-CoA pathway (Figure 3,

440 inset). Yet, this step of the pathway may be unnecessary for the pathway to proceed if formate is

441 being produced by other means, either through the oxidation of formaldehyde via the

442 glutathione-dependent or tetrahydrofolate pathways, or by abiotic production via Fischer-

443 Tropsch-like reactions. Abiotically-produced formate has been implicated as a possible carbon

444 source in serpentinizing environments such as the Lost City (Lang et al., 2010; Lang et al., 2012;

445 Lang et al., 2018) and the Samail Ophiolite (Fones et al. 2019). While formate-fixing reactions

446 are not common, as formate has a relatively low reactivity compared to CO2 or other carboxylic

447 acids (Bar-Even, 2016), formate can be used as the sole carbon source by some microorganisms.

448 In the Samail Ophiolite, rates of assimilation or oxidation of formate were higher than rates of

449 assimilation or dissimilation of bicarbonate or CO regardless of pH (Fones et al., 2019),

450 suggesting that formate may be a preferred source of carbon and/or electrons. Formate

451 concentrations in CROMO are similar to DIC concentrations or lower (Supplemental Table 1;

452 see also Crespo-Medina et al., 2014; Twing et al., 2017), but the addition of formate stimulates

453 microbial growth (Crespo-Medina et al., 2014). We detected two formate transporters in the

454 metagenomic data: fdhC, which has been detected in putative formate-metabolizing bacteria in

455 Lost City chimneys (Lang et al., 2018), was detected in two MAG bins (126 and 192, both

456 ), and focA was detected in one MAG bin (222_8, Actinobacter) and across multiple

457 metagenomes and all four metatranscriptomes (Supplemental Table 1).

458 As DIC becomes more limiting due to pH, bypassing the conversion of CO2 to formate

459 may be an important strategy for carbon assimilation. Provided that sufficient reductant is bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

460 available (for example, from the oxidation of H2), assimilation of CO2, CO, or formate can occur

461 via the reductive acetyl-CoA pathway (Takami et al., 2012). The assimilation of formate via the

462 acetogenic reductive acetyl-CoA pathway avoids ATP consumption, supports energy

463 conservation through the use of multiple electron bifurcation mechanisms, and removes the need

464 for an external electron acceptor apart from co-assimilated CO2 (Bar-Even, 2016; Cotton et al.,

465 2018). While the cooS gene for the conversion of CO2 to CO is present (Figure 5), this step could

466 likely be bypassed as well, in wells where the CO concentration is sufficiently high

467 (Supplemental Table 3). Evidence for carbon monoxide uptake (Morrill et al., 2014) and the

468 reductive acetyl-CoA pathway (Suzuki et al., 2017) as important carbon assimilation pathways

469 was previously found in the Tablelands, Canada, and The Cedars, CA, respectively. In the

470 Samail Ophiolite, Oman, it appears that carbon limitation outweighs energy limitation for the

471 autotrophic members of microbial communities at high pH (Fones et al., 2019), making carbon

472 compounds important as both a feedstock for biosynthesis and a source of electrons.

473 In summary, a putative metabolic pathway for the assimilation of methane into biomass is

474 outlined in Figure 8. Methane oxidation to formaldehyde is carried out by the Methylococcales

475 and Methylophilales, and then either fixed to biomass via the RuMP and serine pathways, or

476 oxidized to formate by multiple clades via the THF and glutathione-dependent pathways.

477 Formate can then be assimilated into biomass via the reductive acetyl-CoA pathway in the

478 Clostridiales or Desulfuromodales, as shown, or through the CBB or rTCA cycles (Bar-Even et

479 al., 2016; Cotton et al., 2018).

480

481 Methanogenesis bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

482 While methanogenic archaea have not been detected in 16S rRNA data obtained from

483 CROMO (Twing et al., 2017) and methane isotopologue data suggest a thermogenic source

484 (Wang, 2015), metagenomic and metatranscriptomic evidence suggest the possibility of

485 methanogenesis by multiple pathways in at least some of the wells. However, these pathways do

486 not include the essential backbone of the methanogenesis pathway, including the crucial mcrA

487 gene for the production of coenzymeM, required for making methane. The mcrA gene was

488 expressed in the metatranscriptomes but other key genes for methane production were absent

489 (Supplemental Tables 1 and 2). The apparent presence of pathways for methanogenesis from

490 organic molecules likely indicate the ability to metabolize these organics (e.g. acetate, methanol,

491 etc.), but whether these compounds are converted to methane by methanogens remains unclear.

492 Biological methanogenesis appears to be occurring in other continental serpentinizing sites, such

493 as the Voltri Massif (Brazelton et al., 2017) and the Cedars (Kohl et al., 2016). In the Tablelands,

494 however, no evidence of biological methane production has been detected (Morrill et al., 2014),

495 and methane production genes in metagenomes recovered from the Samail Ophiolite in Oman

496 are likewise sparse (Fones et al., 2019). Methane cycling in ophiolite-hosted microbial

497 communities therefore seems to vary from site to site, and further work will be required to

498 uncover the geochemical conditions that drive its production and uptake.

499

500 Searching the Metabolome for Intermediates

501 Intermediates of formaldehyde oxidation and assimilation pathways did not match the

502 masses and retention times of any of the features in our metabolomics data. In negative ion

503 mode, glutathione derivatives should have appeared in the data as deprotonated or double-

504 negatively charged molecules, but we did not detect masses that matched either of those bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

505 possibilities. This may be due to the relatively low concentrations of DOM (3-13 mg/L) (Fiore et

506 al., 2015), as well as bias introduced by our extraction method (Johnson et al., 2017).

507 Tetrahydrofolate compounds would likely have been more easily detected in positive ion mode;

508 however, we ran our samples in negative mode in the hopes of capturing the broadest array of

509 metabolites possible with no preconceived notions of what compounds would be present in the

510 data. Metabolites may also fragment or form non-covalent interactions with other compounds

511 upon entering the mass spectrometer, making identification of particular intermediates

512 challenging (Zamboni et al., 2015). Additionally, low cell counts in alkaline groundwater make it

513 difficult to obtain enough biomass during filtration to accurately depict the entire intracellular

514 metabolome of the microbial community. An ideal target of total carbon would be 0.3 mg (Kido

515 Soule et al., 2015). Assuming each cell contains 30 fg of carbon (Fukuda et al., 1998), and with

516 an average cell count of 2 × 105 cells/ml (Twing et al., 2017; unpublished data), 50 liters of

517 water would need to be filtered. Although CSW1.1 and QV1.1 have high fluid outputs compared

518 to most of the other CROMO wells, filtration limitations such as clogging due to carbonate

519 precipitates, keeping the fluid cold to prevent further chemical alteration of the metabolites, and

520 prevention of contamination all precluded our ability to filter such large amounts of water. We

521 were also unable to detect formaldehyde using colorimetric methods in any of the wells. If

522 formaldehyde and its intermediates are cycled rapidly within the cell, it may be difficult to

523 identify and quantify these compounds with the techniques used in this study. Interestingly,

524 while the two wells with the highest pH (QV1.1 and CSW1.1) share highly similar microbial

525 communities (Supplemental Figure 3) with similar metabolic capabilities, the DOC pools

526 observed in these wells appear significantly different in character (Figure 6). Putative

527 annotations of unique metabolite features in the DOC pools include dipeptides, fatty acids, bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

528 vitamins, phosphorylated metabolites, and secondary metabolites. This observation underscores

529 the need for further characterization of the available carbon in serpentinite-hosted aquifers like

530 CROMO through metabolomics techniques.

531

532 Conclusion

533 Through an in-depth analysis of metagenomic and metatranscriptomic data taken over

534 several time points, we have determined that methanotrophy, the reductive acetyl-CoA pathway,

535 the reverse TCA cycle, and the Calvin-Benson-Bassham cycle may all be important means of

536 carbon assimilation for microbes living in the Coast Range Ophiolite groundwater. We also

537 identified several mechanisms by which microbes in the hyperalkaline fluids of deep

538 serpentinizing rock may overcome limiting concentrations of DIC, including the use of formate

539 and methane/formaldehyde as carbon sources for homoacetogenesis and the production of

540 biomass. Further optimization of metabolomics techniques for this environment may be of use to

541 track the fate of carbon and delineate between abiotic and biotic processes in serpentinizing

542 systems such as CROMO.

543

544 Acknowledgements

545 The authors wish to thank McLaughlin Reserve, in particular Paul Aigner and Cathy

546 Koehler, for hosting sampling at CROMO and providing access to the wells, A. Daniel Jones and

547 Anthony Schilmiller for their advice regarding metabolite extraction and mass spectrometry,

548 Elizabeth Kujawinski for her guidance in metabolomics data analysis and interpretation, and

549 Julia McGonigle, Christopher Thornton, and Katrina Twing for assistance with metagenomic and

550 computational analyses. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

551

552

553 References

554 Amend JP, McCollom TM, Hentscher M, Bach W. Catabolic and anabolic energy for

555 chemolithoautotrophs in deep-sea hydrothermal systems hosted by different rock types. Geochim

556 Cosmochim Acta 2011; 75(19): 5736-5748.

557

558 Anders S, Pyl PT, Huber W. HTSeq-A Python framework to work with high-throughput

559 sequencing data. Bioinformatics 2015; 31: 166–169.

560

561 Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The

562 United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology.

563 2018;36: 566. doi: 10.1038/nbt.4163

564

565 Bar-Even A. Formate assimilation: the metabolic architecture of natural and synthetic pathways.

566 Biochemistry 2016; 55: 3851-3863.

567

568 Barnes I, O’Neill JR, Trescases JJ. Present da serpentinization in New Caledonia, Oman and

569 Yugoslavia. Geochim Cosmochim Acta 1978; 42: 144-145.

570

571 Barnes I, O’Neil JR. The relationship between fluids in some fresh Alpine-type ultramafics and

572 possible modern serpentinization, Western United States. Bull Geo Soc Am 1971; 80: 1947-1960.

573 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

574 Boisvert S, Raymond F, Godzaridis E, Laviolette F, Corbeil J. Ray Meta: scalable de novo

575 metagenome assembly and profiling. Genome Bio. 2012; 13: R122

576

577 Bradley A, Summons RE. Multiple origins of methane at the Lost City Hydrothermal Field.

578 Earth Planet Sci Lett 2010; 297(1): 34-41.

579

580 Brazelton WJ, Mehta MP, Kelley DS, Baross JA. Physiological differentiation within a single-

581 species biofilm fueled by serpentinization. mBio 2011; doi: 10.1128/mBio.00127-11.

582

583 Brazelton WJ, Nelson B, Schrenk MO. Metagenomic evidence for H2 oxidation and H2

584 production by serpentinite-hosted subsurface microbial communities. Front Microbiol 2012; doi:

585 10.3389/fmicb.2011.00268.

586

587 Brazelton WJ, Morrill PL, Szponar N, Schrenk MO. Bacterial communities associated with

588 subsurface geochemical processes in continental serpentinite springs. Appl Environ Microbiol

589 2013; 79(13): 3906-3916.

590

591 Brazelton WJ, Thornton CN, Hyer A, Twing KI, Longino AA, Lang SQ et al. Metagenomic

592 identification of active methanogens and methanotrophs in serpentinite springs of the Voltri

593 Massif, Italy. Peer J 2017; 5: e2945.

594

595 Broadhurst D, Goodacre R, Reinke SN, Kuligowski J, Wilson ID, Lewis MR, Dunn WB.

596 Guidelines and considerations for the use of system suitability and quality control samples in bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

597 mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics

598 2018; 14: 72.

599

600 Cardace D, Hoehler T, McCollom T, Schrenk M, Carnevale D, Kubo MD et al. Establishment of

601 the Coast Range Ophiolite Microbial Observatory (CROMO): drilling objectives and preliminary

602 outcomes. Sci Dril 2013; 16(16): 45-55.

603

604 Cardace D, Meyer Dombard DR, Woycheese KM, Arcilla CA. Feasible metabolisms in high pH

605 springs of the Philippines. Front Microbiol 2015; doi: 10.3389/fmicb.2015.00010.

606

607 Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S et al. A cross-

608 platform toolkit for mass spectrometry and proteomics. Nature Biotechnol 2012; 30(10): 918–

609 920.

610

611 Charlou JL, Donval JP, Fouquet Y, Jean-Baptiste P, Holm N. Geochemistry of high H2 and CH4

612 vent fluids issuing from ultramafic rocks at the Rainbow hydrothermal field (36 Deg14’N,

613 MAR). Chem Geol 2002; 191(4): 345-359.

614

615 Cotton CAR, Edlich-Muth C, Bar-Even A. Reinforcing carbon fixation: CO2 reduction and

616 supporting carboxylation. Curr Op Biotechnol 2018; 49: 49-56.

617 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

618 Crespo-Medina M, Twing KI, Kubo MD, Hoehler TM, Cardace D, McCollom TM et al. Insights

619 into environmental controls on microbial communities in a continental serpentinite aquifer using

620 a microcosm-based approach. Front Microbiol 2014; 5: 604.

621

622 Delacour A, Früh-Green GL, Bernasconi SM, Schaeffer P, Kelley DS. Carbon geochemistry of

623 serpentinites in the Lost City Hydrothermal System (30°N, MAR). Geochim Cosmochim Acta

624 2008; 72: 3681-3702.

625

626 Dittmar T, Koch B, Hertkorn N, Kattner G. A simple and efficient method for the solid-phase

627 extraction of dissolved organic matter (SPE-DOM) from seawater. Limnol Oceanogr Lett 2008;

628 6(6): 230-235.

629

630 Fiore CL, Longnecker K, Kido Soule MC, Kujawinski E. Release of ecologically relevant

631 metabolites by the cyanobacterium Synechococcus elongatus CCMP 1631. Env Microbiol 2015;

632 17(10): 3949-3963.

633

634 Fukuda R, Ogawa H, Nagata T, Koike I. Direct determination of carbon and nitrogen contents of

635 natural bacterial assemblages in marine environments. Appl Environ Microbiol 1998; 64(9):

636 3352-3358.

637

638 Fones EM, Colman DR, Kraus EA, Nothaft DB, Poudel S, Rempfert KR et al. Physiological

639 adaptations to serpentinization in the Semail Ophiolite, Oman. ISME J 2019; 13: 1750-1762.

640 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

641 Goenrich M, Bartoschek S, Hagemeier CH, Griesinger C, Vorholt JA. A glutathione-dependent

642 formaldehyde-activating enzyme (Gfa) from Paracoccus denitrifcans detected and purified via

643 two-dimensional proton exchange NMR spectroscopy. J Biol Chem 2002; 277(5): 3069-3072.

644

645 Graham ED, Heidelberg JF, Tully BJ. BinSanity: unsupervised clustering of environmental

646 microbial assemblies using coverage and affinity propagation. PeerJ 2017; 5: e3035.

647

648 Gregor I, Dröge J, Schirmer M, Quince C, McHardy AC. PhyloPythiaS+: a self-training method

649 for the rapid reconstruction of low-ranking taxonomic bins from metagenomes. PeerJ 2016; 4:

650 e1603.

651

652 Huber H, Gallenberger M, Jahn U, Eylert E, Berg IA, Kockelkorn D et al. A dicarboxylate/4-

653 hydroxybutyrate autotrophic carbon assimilation cycle in the hyperthermophilic

654 Archaeum Ignicoccus hospitalis. PNAS 2008; 105(22): 7851-7856.

655

656 Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene

657 recognition and translation initiation site identification. BMC Bioinform 2010; 11: 11.

658

659 Johnson WM, Kido Soule MC, Kujawinski EB. Interpreting the impact of matrix on extraction

660 efficiency and instrument response in a targeted metabolomics method. Limnol Oceanogr-Meth

661 2017; 15(4): 417-428.

662 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

663 Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res

664 2000; 28(1): 4.

665

666 Kanehisa M, Sato Y, Morishima K. BlastKOALA and GhostKOALA: KEGG Tools for

667 Functional Characterization of Genome and Metagenome Sequences. J Mol Biol 2016; 428(4):

668 726-731. doi: 10.1016/j.jmb.2015.11.006.

669

670 Kido Soule MC, Longnecker K, Johnson WM, Kujawinksi EB. Environmental metabolomics:

671 Analytical strategies. Mar Chem 2015; 177(2): 274-387.

672

673 Kittichotirat W, Good NM, Hall R, Bringel F, Lajus A, Médigue C et al. Genome Sequence

674 of Methyloversatilis universalis FAM5T, a Methylotrophic Representative of the

675 Order Rhodocyclales. J Bacteriol 2011; 193(17): 4541-4542.

676

677 Kohl L, Cumming E, Cox A, Rietze A, Morrisey L, Lang SQ et al. Exploring the metabolic

678 potential of microbial communities in ultra‐basic, reducing springs at The Cedars, CA, USA:

679 Experimental evidence of microbial methanogenesis and heterotrophic acetogenesis. JGR

680 Biogeosci 2016; 121(4) :1203-1220.

681

682 Kuhl C, Tautenhahn R, Böttcher C, Larson TR, Neumann S. CAMERA: An Integrated Strategy

683 for Compound Spectra Extraction and Annotation of Liquid Chromatography/Mass

684 Spectrometry Data Sets. Anal Chem 2012; 84(1): 283–289.

685 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

686 Lang SQ, Butterfield DA, Schulte M, Kelley DS, Lilley MD. Elevated concentrations of formate,

687 acetate and dissolved organic carbon found at the Lost City hydrothermal field. Geochim

688 Cosmochim Acta 2010; 74: 941-952.

689

690 Lang SQ, Früh-Green GL, Bernasconi SM, Lilley MD, Proskurowski G, Méhay S et al.

691 Microbial utilization of abiogenic carbon and hydrogen in a serpentinite-hosted system. Geochim

692 Cosmochim Acta 2012; 92: 82-99.

693

694 Lang SQ, Früh-Green GL, Bernasconi SM, Brazelton WJ, Schrenk MO, McGonigle JM. Deeply-

695 sourced formate fuels sulfate reducers but not methanogens at Lost City hydrothermal field. Sci

696 Rep 2018; 8: 755.

697

698 Langmead B, Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 2013; 9:

699 357–359.

700

701 Li S, Park Y, Duraisingham S, Strobel FH, Khan N, Soltow QA et al. Predicting Network

702 Activity from High Throughput Metabolomics. PLoS Comp Biol 2013; 9(7): e1003123.

703

704 Luo W, Brouwer C. Pathview: an R/Bioconductor package for pathway-based data integration

705 and visualization. Bioinformatics 2013; 29(14): 1830-1831.

706

707 Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads.

708 EMBnet.journal 2011; 17(1): https://doi.org/10.14806/ej.17.1.200. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

709

710 Marx CJ, Van Dien SJ, Lidstrom ME. Flux analysis uncovers key role of functional redundancy

711 in formaldehyde metabolism. PLoS Biol 2005; 3(2): e16.

712

713 McCollom TM, Seewald JS. A reassessment of the potential for reduction of dissolved CO2 to

714 hydrocarbons during serpentinization of olivine. Geochim Cosmochim Acta 2001; 65: 3769-

715 3778.

716

717 McCollom TM. Geochemical constraints on sources of metabolic energy for

718 chemolithoautotrophy in ultramafic-hosted deep-sea hydrothermal systems. Astrobiol 2007; 7(6):

719 933-950.

720

721 Ménez B, Pasini V, Brunelli D. Life in the hydrated suboceanic mantle. Nat Geosci 2012; 5:

722 133–137.

723

724 Moran JJ, Whitmore LM, Isern NG, Romine MF, Riha KM, Inskeep WP et al. Formaldehyde as

725 a carbon and electron shuttle between autotrophic and heterotrophic populations in acidic

726 hydrothermal vents of Norris Geyser Basin, Yellowstone National Park. Extremophiles 2016;

727 20(3): 291-299.

728

729 Morrill PL, Brazelton WJ, Kohl L, Rietze1 A, Miles SM, Kavanagh H et al. Investigations of

730 potential microbial methanogenic and carbon monoxide utilization pathways in ultra-basic bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

731 reducing springs associated with present-day continental serpentinization: the Tablelands, NL,

732 CAN. Front Microbiol 2014; doi: 10.3389/fmicb.2014.00613

733

734 Neubeck A, Sun L, Müller B, Ivarsson M, Hosgörmez H, Özcan D et al. Microbial community

735 structure of a serpentine-hosted abiotic gas seepage at the Chimaera ophiolite, Turkey. Appl

736 Environ Microbiol 2017; 83: e03430-16.

737

738 Ogata H, Goto S, Sato K, Fujibuchi W, Bono H. KEGG: Kyoto Encyclopedia of Genes and

739 Genomes. Nucleic Acids Res 1999; 27, 29–34.

740

741 Parks DH, Chuvochina M, Waite DW, Rinke C, Sharskewski A, Chaumeil P-A, Hugenholtz P. A

742 standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of

743 life. Nature Biotechnol 36: 996-1004.

744

745 Perner M, Kuever J, Seifert R, Pape T, Koschinsky A, Schmidt K et al. The influence of

746 ultramafic rocks on microbial communities at the Logatchev hydrothermal field, located 15°N on

747 the Mid-Atlantic Ridge, FEMS Microbiol Ecol 2007; 61(1): 97–109.

748

749 Peters EK. D-18O enriched waters of the Coast Range Mountains, northern California: connate

750 and ore-forming fluids. Geochim Cosmochim Acta 1993; 57: 1093-1104.

751 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

752 Preiner M, Xavier JC, Sousa FL, Zimorski V, Neubeck A, Lang SQ, Greenwell HS,

753 Kleinermanns K et al. Serpentinization: connecting geochemistry, ancient metabolism and

754 industrial hydrogenation. Life 2018; 8: 41.

755

756 Proskurowski G, Lilley MD, Seewald JS, Früh-Green GL, Olson EJ, Lupton JE et al. Abiotic

757 hydrocarbon production at Lost City Hydrothermal Field. Science 2008; 319: 604-607.

758

759 Rabinowitz JD, Kimball E. Acidic acetonitrile for cellular metabolome extraction from

760 Escherichia coli. Anal Chem 2007; 79(16): 6167-6173.

761

762 Rohland N, Reich D. Cost-effective, high-throughput DNA sequencing libraries for multiplexed

763 target capture. Genome Res 2012; 22:939–946. https://doi.org/10.1101/gr.128124.111

764

765 Salek RM, Steinbeck C, Viant MR, Goodacre R, Dunn WB. The role of reporting standards for

766 metabolite annotation and identification in metabolomic studies. GigaScience 2013; 2: 13.

767 doi:10.1186/2047-217X-2-13

768

769 Schrenk MO, Kelley DS, Murdock S, Baross J. Low archaeal diversity linked to subseafloor

770 geochemical processes at the Lost City Hydrothermal Field, Mid-Atlantic Ridge. Environ

771 Microbiol 2004; 6(10): 1086-1095.

772

773 Seemann T. Prokka: Rapid prokaryotic genome annotation. Bioinformatics 2014; 30: 2068–

774 2069. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

775

776 Sephton MA, Hazen RM. On the origins of deep hydrocarbons. Rev Mineral Geochem 2013;

777 75:449-465.

778

779 Shervais JW, Murchey BL, Kimbrough DL, Renne PR, Hanan B. Radioisotopic and

780 biostratigraphic age relations in the Coast Range Ophiolite, northern California: Implications for

781 the tectonic evolution of the Western Cordillera. Geol Soc Am Bull 2005; 177: 633-653.

782

783 Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G. XCMS: Processing Mass Spectrometry

784 Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification.

785 Anal Chem 2006; 78(3): 779–787.

786

787 Sumner LW, Amberg A, Barrett D, Beale MH, Beger R, Daykin CA et al. Proposed minimum

788 reporting standards for chemical analysis Chemical Analysis Working Group (CAWG)

789 Metabolomics Standards Initiative (MSI). Metabolomics 2007; 3(3): 211–221.

790

791 Suzuki S, Ishii S, Wu A, Cheung A, Tenney A, Warner G et al. Microbial diversity in The

792 Cedars, an ultrabasic, ultrareducing, and low salinity serpentinizing ecosystem. PNAS 2013;

793 110(38): 15336-15341.

794

795 Suzuki S, Kuenen JG, Schipper K, van der Velde S, Ishii S, Wu A, et al. Physiological and

796 genomic features of highly alkaliphilic hydrogen-utilizing Betaproteobacteria from a continental

797 serpentinizing site. Nature Comm 2014; 5: 3900. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

798

799 Suzuki S, Ishii S, Hoshino T, Rietze A, Tenney A, Morrill PL, et al. Unusual metabolic diversity

800 of hyperalkaliphilic microbial communities associated with subterranean serpentinization at The

801 Cedars. ISME J 2017; 11: 2584-2598.

802

803 Takami H, Noguchi H, Takaki Y, Uchiyama I, Toyoda A, Nishi S et al. A deeply branching

804 thermophilic bacterium with an ancient acetyl-CoA pathway dominates a subsurface ecosystem.

805 PLoS One. 2012; 7: e30559.

806

807 Tenenbaum D. KEGGREST: Client-side REST access to KEGG. R package version 1.22.0;

808 2018.

809

810 Twing KI, Brazelton WJ, Kubo MD, Hyer AJ, Cardace D, Hoehler TM et al. Serpentinization-

811 influenced groundwater harbors extremely low diversity microbial communities adapted to high

812 pH. Front Microbiol 2017; 8: 308.

813

814

815 Figure Legends

816

817 Figure 1. A diagram depicting abiotic (white) and biotic (black) sources of carbon compounds in

818 serpentinizing environments. Adapted from Preiner et al., 2018.

819 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

820 Figure 2. Metagenome coverage of Functional Ontology Assignments for Metagenomes

821 (FOAM) pathways assigned using HUMAnN2. If a well was sampled for metagenomics more

822 than once, the average percent coverage and standard error of the mean for that well are

823 provided. Metatranscriptome coverage for four wells sampled for RNA is indicated using bars

824 with dotted outlines.

825

826 Figure 3. Selection of carbon cycling-associated KEGG modules detected in MAG bins, and

827 Serpentinomonas isolate genomes (Suzuki et al., 2014), using BLASTKoala. Dark blue boxes

828 indicate the complete module is present; light blue boxes indicate one step of the pathway is

829 missing. The phylogenetic tree comparing the MAGs to the three Serpentimonas isolates was

830 generated using SpeciesTree in KBase.

831 Inset: Bins containing reductive acetyl-CoA pathway; gene presence in the genome is indicated

832 in yellow.

833

834 Figure 4. Fragments per kilobase of sequences per million mapped reads (FPKM) of genes in the

835 homoacetogenic reductive acetyl-CoA pathway. Metagenomes are represented in black;

836 metatranscriptomes are represented in gray. The genes for the two subunits of formate

837 dehydrogenase (fdhA and fdhB) are missing or comparatively low in most wells. This enzyme is

838 responsible for converting CO2 to formate.

839

840 Figure 5. Fragments per kilobase of sequences per million mapped reads (FPKM) of genes in the

841 methane oxidation, tetrahydrofolate (THF), and glutathione-dependent formaldehyde oxidation

842 pathways. Metagenomes are represented in black; metatranscriptomes are represented in gray. bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

843 The gene for S-(hydroxymethyl)glutathione synthase (gfa) is poorly represented or missing in all

844 wells; this enzyme catalyzes a spontaneous reaction in the formaldehyde oxidation pathway.

845

846 Figure 6. Left: Heatmap of untargeted metabolomics features across all four metabolome

847 samples. The scale represents the ln(1 + peak intensity) (1 is added to avoid cases of ln(0)).

848 Right: Venn diagram of compounds unique to each sample and shared between samples.

849 Observed compounds appear to be generally distinct to each sample. Filtered fluid/dissolved

850 organic carbon samples had the greatest number of unique samples, and CSW1.1 had more

851 unique compounds than QV1.1.

852

853 Figure 7. Scatter plots depicting dissolved inorganic carbon (DIC) vs pH (left) and nonpurgeable

854 organic carbon (NPOC) vs dissolved oxygen (DO). The Pearson correlation coefficient (r) is

855 indicated on each graph, as well as the R2 of the trendlines.

856

857 Figure 8. Putative pathway for carbon assimilation in the CROMO wells. Formaldehyde is either

858 converted directly to biomass via the RuMP or serine pathways, or oxidized to formate, which is

859 then fed into the homoacetogenic reductive acetyl-CoA pathway. The pathway for methane

860 conversion to formaldehyde is shown, but appears to be poorly expressed in most of the wells.

861

862 Table 1. Completeness, contamination, strain heterogeneity, and classification of MAG bins

863 from metagenomes. Bin quality scores assessed using CheckM; taxonomic annotations obtained

864 using GTDB-Tk.

865 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

866 Supplemental Figure 1. Fragments per kilobase of sequences per million mapped reads (FPKM)

867 of genes in the Calvin-Benson-Bassham cycle. Metagenomes are represented in black;

868 metatranscriptomes are represented in gray.

869

870 Supplemental Figure 2. Fragments per kilobase of sequences per million mapped reads (FPKM)

871 of genes in the NAD-dependent pathway for formate oxidation to CO2. Metagenomes are

872 represented in black; metatranscriptomes are represented in gray.

873

874 Supplemental Figure 3. Microbial community composition of QV1.1 and CSW1.1 wells as

875 determined by assigning taxonomy to metagenomic contigs using PhyloPython.

876

877

878

879

880

881

882 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Conta- Strain Complete mina- hetero- MAG -ness tion geneity Phylum Class Order Family Genus Species

Candidatus 83_0_0 73.25 3.56 20 Archaea Thaumarchaeota incertae sedis Nitrosopumilales Nitrosopumilaceae Nitrosoarchaeum Nitrosoarchaeum koreensis

Candidatus 83_3_0 91.75 4.85 100 Archaea Thaumarchaeota incertae sedis Nitrosopumilales Nitrosopumilaceae Nitrosoarchaeum Nitrosoarchaeum koreensis

109_2_0 75.96 0.85 100 Bacteria Actinobacteria

222_8 97.01 2.56 0 Bacteria Actinobacteria Actinobacteria

152 94.17 1.11 0 Bacteria Actinobacteria Actinobacteria Algoriphagus 58 98.5 3.33 47.37 Bacteria Sphingobacteriales Cyclobacteriaceae Algoriphagus machipongonensis 11 94.62 5.41 29.41 Bacteria Bacteroidetes Bacteroidia Bacteroidales

82 94.83 4.15 69.23 Bacteria Bacteroidetes Chitinophagia Chitinophagales Chitinophagaceae

49 78.79 7.76 9.09 Bacteria Bacteroidetes Chitinophagia Chitinophagales Chitinophagaceae

74 97.7 6.12 54.17 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales

85_0 96.21 4.08 0 Bacteria Bacteroidetes

64 89.73 4.12 38.46 Bacteria Bacteroidetes

120 85.67 0.26 0 Bacteria Bacteroidetes

117 97.31 5.38 14.29 Bacteria Bacteroidetes

87_3_0 84.08 2.7 20 Bacteria Chlamydiae Chlamydiales Simkaniaceae Simkania Simkania negevensis 100 98.88 8.45 0 Bacteria Chlorobi Ignavibacteria Ignavibacteriales Ignavibacteriaceae Ignavibacterium Ignavibacterium album

200 84.69 1.12 0 Bacteria Chlorobi Ignavibacteria Ignavibacteriales Melioribacteraceae Melioribacter Melioribacter roseus

84 71.17 1.09 0 Bacteria Chlorobi Ignavibacteria Ignavibacteriales Deinococcus- 191 88.56 5.59 50 Bacteria Thermus Deinococci Deinococcales Trueperaceae Truepera Truepera radiovictrix Deinococcus- 207 59.14 2.33 90 Bacteria Thermus Deinococci Deinococcales Trueperaceae Truepera Truepera radiovictrix 126 51.67 1.4 100 Bacteria Clostridia Clostridiales Clostridiaceae Alkaliphilus

128 94.07 3.11 68.75 Bacteria Firmicutes Clostridia Clostridiales Clostridiaceae Clostridium Clostridiales Family XVII. 61 93.27 6.73 75 Bacteria Firmicutes Clostridia Clostridiales Incertae Sedis Thermaerobacter 245 94.23 0.24 0 Bacteria Firmicutes Clostridia Clostridiales Peptococcaceae

bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

236 80.27 0.32 0 Bacteria Firmicutes Clostridia Clostridiales Peptococcaceae

244 54 3.25 100 Bacteria Firmicutes Clostridia Clostridiales Dethiobacter Dethiobacter alkaliphilus

105 94.35 4.31 20 Bacteria Firmicutes Clostridia Clostridiales Syntrophomonadaceae Dethiobacter Dethiobacter alkaliphilus

148 81.81 1.55 40 Bacteria Firmicutes Clostridia Clostridiales Syntrophomonadaceae Dethiobacter Dethiobacter alkaliphilus

151 89.75 3.95 50 Bacteria Firmicutes Clostridia Clostridiales Syntrophomonadaceae Dethiobacter Dethiobacter alkaliphilus 243 80.51 3.39 0 Bacteria Firmicutes Clostridia Clostridiales Syntrophomonadaceae Dethiobacter Dethiobacter alkaliphilus

165 96.41 1.28 33.33 Bacteria Firmicutes Clostridia Clostridiales Syntrophomonadaceae

104 95.76 1.41 0 Bacteria Firmicutes Clostridia Clostridiales

182 94.83 3.33 0 Bacteria Firmicutes Clostridia Clostridiales

192 78.72 2.07 50 Bacteria Firmicutes Clostridia

107 79.91 3.49 0 Bacteria Firmicutes Clostridia

142 89.74 0.94 0 Bacteria Firmicutes Erysipelotrichia Erysipelotrichales Erysipelotrichaceae Erysipelothrix

134 87.57 8.49 81.82 Bacteria Firmicutes Erysipelotrichia Erysipelotrichales Erysipelotrichaceae Erysipelothrix 155 100 5.66 37.5 Bacteria Firmicutes Erysipelotrichia Erysipelotrichales Erysipelotrichaceae Erysipelothrix

147 95.19 0.74 75 Bacteria Firmicutes Negativicutes Selenomonadales

102 91.11 1.72 0 Bacteria Latescibacteria

145 92.78 3.7 83.33 Bacteria Rhizobiales Bradyrhizobiaceae Afipia broomeae

2 88.68 0.95 0 Bacteria Proteobacteria Alphaproteobacteria Rhizobiales Roseovarius 118 52.2 1.18 0 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Roseovarius sp. TM1035 187 94.42 0.53 25 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae

87_2_1 69.96 6.9 0 Bacteria Proteobacteria Alphaproteobacteria Rickettsiales Anaplasmataceae Sandarakinorhabdus 188 88.47 7.97 65.22 Bacteria Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae Sandarakinorhabdus sp. AAP62

53_5 51.72 0 0 Bacteria Proteobacteria Alphaproteobacteria Sphingomonadales Sphingomonadaceae

53_1 89.43 0.39 50 Bacteria Proteobacteria Alphaproteobacteria Sphingomonadales

67 97.33 0 0 Bacteria Proteobacteria Alphaproteobacteria

66 66.03 3.31 100 Bacteria Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Hydrogenophaga Hydrogenophaga sp. PBC

119 59.64 0 0 Bacteria Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Hydrogenophaga Hydrogenophaga sp. PBC

123_1 54.83 0.23 100 Bacteria Proteobacteria Betaproteobacteria Burkholderiales Comamonadaceae Hydrogenophaga Hydrogenophaga sp. PBC bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

6 93.81 6.35 5 Bacteria Proteobacteria Betaproteobacteria Nitrosomonadales Gallionellaceae Sulfuricella Sulfuricella denitrificans Sulfuritalea 22_9 57.52 0 0 Bacteria Proteobacteria Betaproteobacteria Nitrosomonadales Sterolibacteriaceae Sulfuritalea hydrogenivorans Sulfuritalea 203 77.21 0.36 66.67 Bacteria Proteobacteria Betaproteobacteria Nitrosomonadales Sterolibacteriaceae Sulfuritalea hydrogenivorans

93 95.73 6.56 10 Bacteria Proteobacteria Betaproteobacteria Rhodocyclales Rhodocyclaceae Dechloromonas Dechloromonas aromatica

116 96.34 5.19 18.18 Bacteria Proteobacteria Deltaproteobacteria Desulfobacterales Desulfobulbaceae

212_2_0 76.16 1.29 0 Bacteria Proteobacteria Deltaproteobacteria Syntrophobacterales Syntrophaceae

46_3 90.84 3.23 0 Bacteria Proteobacteria Deltaproteobacteria

10 99.59 2.78 0 Bacteria Proteobacteria Epsilonproteobacteria Campylobacterales Helicobacteraceae Sulfurimonas Sulfurimonas denitrificans Methylomicrobium 115_2 98.56 3.12 0 Bacteria Proteobacteria Gammaproteobacteria Methylococcales Methylococcaceae Methylomicrobium buryatense

70 74.66 2.24 0 Bacteria Proteobacteria Gammaproteobacteria Methylococcales Methylococcaceae Methylomonas Methylomonas methanica 173 87.07 6.58 0 Bacteria Proteobacteria Gammaproteobacteria Methylococcales Methylococcaceae Methylomonas

196 89.46 3.62 31.58 Bacteria Proteobacteria Gammaproteobacteria Xanthomonadales Xanthomonadaceae

221_12 53.45 0 0 Bacteria Proteobacteria Gammaproteobacteria Xanthomonadales Xanthomonadaceae

91 64.08 5.39 0 Eukaryota Loukozoa Malawimonadea Malawimonadida Malawimonadidae Malawimonas Malawimonas jakobiformis bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Water ac CHetate4 formate

H 2O CO2

CO CH4 CO2 2

CO Fe2+ + H O Fe O + H 2 2 à 3 4 2 CO

- CO3 formate Rock bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

pH 7.77 2690 7.89 8.66 9.84 10.12 10.54 10.94 11.48 12.34 2621 Dashed bars represent metatranscriptomes 1240

FPKM FPKM bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Thaumarchaea

Loukozoa Chlorobi Bacteroidetes Proteobacteria Actinobacter Clostridia Deinococcus Delta Firmicutes

Gamma Beta Alpha Reductive acetyl-CoA pathway

Pathway complete Pathway missing one step bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

pH 7.77 CSW1.4

7.89 N08C FPKM 10 8.66 QV1.2 100 500 9.84 CSWold 1000

10.12 CSW1.3 Metagenome Metatranscriptome

10.54 N08A

10.94 N08B

11.48 QV1.1

12.34 CSW1.1 gfa folD frmB mxaI purU frmA (spontaneous) mxaF pmoC pmoB pmoA THF pathway glutathione-dependent CH4 -> CH3OH -> formaldehyde Methane Oxidation Formaldehyde Oxidation to Formate 11.48 12.34 10.54 10.94 10.12 9.84 7.77 7. 8.66 89 bioRxiv preprint not certifiedbypeerreview)istheauthor/funder,whohasgrantedbioRxivalicensetodisplaypreprintinperpetuity.Itmadeavailable pH CSW1.3 CSW1.1 CSWold CSW1.4 QV1.2 QV1.1 N08A N08C N08B doi: https://doi.org/10.1101/776849 CO) (CO 2 -

> cooS, acsA rmate ) ( CO 2 under a

- fdhA > ; fo this versionpostedSeptember20,2019. CC-BY-NC-ND 4.0Internationallicense fdhB Reductive Acetyl fhs

folD The copyrightholderforthispreprint(whichwas .

metF - CoA

acsE

(CO acsB - > acetyl cdhE,

- acsC CoA)

cdhD, acsD Metatranscriptome Me FPKM tagenome 2500 1000 5 100 10 00 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Heatmap of Observed Features Across Samples Venn Diagram of Observed Features Across Samples

CSW1.1 QV1.1 DOM Intracellular CSW1.1 QV1.1 Intracellular DOM

QV 1,1 Intracellular CSW 1,1 Intracellular QV 1,1 DOM CSW 1,1 DOM

ln(1 + peak intensity) bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

DIC vs pH NPOC vs O2 r = -0.798 r = 0.587 bioRxiv preprint doi: https://doi.org/10.1101/776849; this version posted September 20, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

NAD(P)+ NAD(P)H

a 5,10-methenyl- H2O a 10-formyl- tetrahydrofolate folD tetrahydrofolate

aF TH pmoA, pmoB, pmoC H2O Reductive Acetyl-CoA methane methanol CO2 H+ • Clostridiales (Syntrophomodaceae) 2 oxidized THF Pathway O 2, quinol H O, quinone a 5,10-methenyl- • Desulfuromodales 2 cytochrome ⍺ • Burkholderiales (Comamonadaceae) mxaF, tetrahydrofolate fdhA, mxaI • Deinococcales (Trueperaceae) 2H+ , 2 reduced • Hydrogenophilales (Hydrogenophilaceae) X fdhB cytochrome ⍺ • Methylophilales (Methylophilaceae) Methane Oxidation • Rhizobiales Rhodocyclales (Rhodocyclaceae) • Methylococcales (Methylococcaceae) • an N10-formyl- formaldehyde formate fhs • Methylophilales (Methylophilaceae) Formaldehyde Oxidation tetrahydrofolate G lutathione Pathway folD • Burkholderiales (Comamonadaceae) • Rhizobiales a 5,10-methenyl- H+ • Rhodobacterales tetrahydrofolate glutathione • Rhodocyclales (Rhodocyclaceae) folD CO2 a 5,10-methylene- H2O tetrahydrofolate frmA S-(hydroxymethyl)- S- formylglutathione glutathione metF H+ cooS, + NAD(P) NAD(P)H acsA a 5-methyl- tetrahydrofolate Formaldehyde Fixation to Biomass acsE (RuMP or Serine Pathway) a methyl-Co(III) carbon corrinoid Fe-S protein monoxide

Acetate Formation acsB , acsC/chdE, acsD/cdhD From Acetyl-CoA (Type I Pathway) acetyl-CoA