
bioRxiv preprint doi: https://doi.org/10.1101/2020.06.03.132969; this version posted June 4, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Mercury methylation by metabolically versatile and cosmopolitan marine

Heyu Lin1, David B. Ascher2,3, Yoochan Myung2,3, Carl H. Lamborg4, Steven J. Hallam5,6, Caitlin M. Gionfriddo7, Kathryn E. Holt8,9 and John W. Moreau1,10,*

1School of Earth Sciences, The University of Melbourne, Parkville, VIC 3010, AUS
2Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, VIC 3010, AUS
3Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, PO Box 6492, Melbourne, VIC 3004, AUS
4Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA
5Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC V6T 1Z1, CA
6Genome Science and Technology Program, University of British Columbia, Vancouver, BC V6T 1Z4, CA
7Biosciences Division, Oak Ridge National Laboratory, PO Box 2008, Oak Ridge, TN 37831, USA
8Department of Infectious Diseases, Central Clinical School, Monash University, VIC 3800, AUS
9Department of Infection Biology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
10Currently at School of Geographical & Earth Sciences, University of Glasgow, Glasgow G12 8QQ, UK

*Corresponding author: [email protected]
The authors declare no competing interests.

Running title: Mercury methylation by Marinimicrobia


Abstract

Microbes transform aqueous mercury (Hg) into methylmercury (MeHg), a potent neurotoxin in terrestrial and marine food webs. This process requires the gene pair hgcAB, which encodes proteins that actuate Hg methylation, and has been well described for anoxic environments. However, recent studies report potential MeHg formation in suboxic seawater, although the microorganisms involved remain poorly understood. In this study, we conducted large-scale multi-omic analyses to search for putative microbial Hg methylators along defined redox gradients in Saanich Inlet (SI), British Columbia, a model natural ecosystem with previously measured Hg and MeHg concentration profiles. Analysis of gene expression profiles along the redoxcline identified several putative Hg-methylating microbial groups, including Calditrichaeota, SAR324 and Marinimicrobia, with the last by far the most active based on hgc transcription levels. Marinimicrobia hgc genes were identified from multiple publicly available marine metagenomes, consistent with a potential key role in marine Hg methylation. Computational homology modelling predicted that Marinimicrobia HgcAB proteins contain the highly conserved structures required for functional Hg methylation. Furthermore, a number of terminal oxidases from aerobic respiratory chains were associated with several SI putative novel Hg methylators. Our findings thus reveal potential novel marine Hg-methylating microorganisms with a greater oxygen tolerance and broader habitat range than previously recognised.


Introduction

Mercury (Hg), a highly toxic metal, is widespread in the environment, primarily from anthropogenic sources, and has drawn increasing public concern over the past few decades (e.g. Fitzgerald and Clarkson 1991, Hsu-Kim et al 2018, Selin 2009). Methylmercury (MeHg) is recognized as a potent neurotoxin that bioaccumulates through both marine and terrestrial food webs (Lee and Fisher 2017, Selin 2009). With implementation of the Minamata Convention on Mercury (Hsu-Kim et al 2018), a better understanding of potential MeHg sources is expected, in the context of global biogeochemical cycles and factors influencing Hg speciation. For example, oxygen gradients in seawater have been observed to expand in response to climate change (Stramma et al 2008), and the resulting impacts on biogeochemical cycles, e.g., carbon, nitrogen, and sulfur, have been studied (e.g., Wright et al 2012). Effects on Hg cycling, however, are rarely considered.

The environmental transformation of Hg(II) to MeHg is a microbially mediated process, for which the proteins are encoded by the two-gene cluster hgcA and hgcB (Parks et al 2013). Possession of the hgcAB gene pair is a predictor for Hg methylation capability (Gilmour et al 2013), and the discovery of hgc has stimulated a search for potential Hg-methylating microbes in diverse environments (Podar et al 2015). To date, all experimentally confirmed Hg methylators are anaerobes (Grégoire et al 2018, Podar et al 2015) from three Deltaproteobacteria clades (sulfate-reducing bacteria, SRB; Fe-reducing bacteria, FeRB; and syntrophic bacteria, Syntrophobacterales), one clade belonging to fermentative Firmicutes, and an archaeal clade, Methanomicrobia. These microorganisms are ubiquitous in soils, sediments, seawater, freshwater, and extreme environments, as well as the digestive tracts of some animals (Podar et al 2015). Recently, other hgc carriers, not only from anaerobic but also microaerobic habitats, were discovered using culture-independent approaches, including , Chrysiogenetes, (Podar et al 2015), Nitrospina (Gionfriddo et al 2016), and (Jones et al 2019).

A global Hg survey (Lamborg et al 2014) found MeHg to be prevalent in suboxic waters, especially in regionally widespread oxygen gradients at upper and intermediate ocean depths. Oxygen concentrations are lower there due to a combination of physical and biological forcing effects (Paulmier and Ruiz-Pino 2009, Wright et al 2012). These gradients can exist as permanent features in the water column, impinge on coastal margins, or manifest more transiently (e.g., induced by phytoplankton blooms). As oxygen levels decrease, metabolic energy is increasingly diverted to alternative electron acceptors, resulting in coupling of other biogeochemical cycles, e.g., C, N, S, Fe, and Mn (Bertagnolli and Stewart 2018, Moore and Doney 2007, Wright et al 2012). Recent findings of microaerophilic microbial Hg methylation potential in sea ice and seawater (Gionfriddo et al 2016, Villar et al 2020) raise the possibility that this process contributes significantly to ocean MeHg biomagnification.

In this study, we used existing total Hg and MeHg concentration data from Saanich Inlet (SI), a seasonally anoxic fjord on the coast of Vancouver Island (British Columbia, Canada), to guide targeted metagenomic and metatranscriptomic analyses of SI seawater from varying depths. Saanich Inlet, as a model natural ecosystem for studying microbial activity along defined oxygen gradients (Hawley et al 2017b), provides an ideal site to search for novel putative Hg methylators in low-oxygen environments. Computational homology modelling was performed to predict the functionality of HgcAB proteins encoded by putative novel Hg methylators. Finally, we scanned global metagenomic datasets for recognizable hgcA genes, to reassess the environmental distribution of microbial mercury methylation potential.

Results and Discussion

Hg and MeHg concentrations along redox gradients in SI

Concentrations of total dissolved Hg (HgT) and monomethylmercury (MeHg) in filtered seawater samples from SI station “S3” were profiled vertically at eight depths (10 m to 200 m) below the sea surface. The concentration of HgT at the sea surface was ~0.70 pM and remained nearly constant in seawater above 120 m depth, increasing to 1.35 pM and ~10.56 pM at 135 m and 200 m depths, respectively (Figure 1A). MeHg was below the detection limit (<0.1 pM) in seawater above 100 m depth, but increased to 0.50 pM (17.2% of total Hg) at 150 m depth. However, MeHg then decreased to 0.1 pM at 165 m depth, becoming undetectable at 200 m (Figure 1B and 1C).

MAG (metagenome-assembled genome) reconstruction and putative Hg methylator identification

A total of 2088 MAGs with completeness >70% and contamination <5% were recovered from all SI metagenomic datasets (generated across all sampling depths, 96% being from station S3). From these, 56 MAGs, belonging to seven phyla (Proteobacteria, Marinimicrobia, Verrucomicrobia, Firmicutes, Calditrichaeota, Spirochaetes, and Nitrospinae; Table S1), were identified as having genes homologous with hgcA (Figure S1). Among these, Marinimicrobia and Calditrichaeota have not previously been implicated in Hg methylation. No hgcAB genes were recovered from archaeal MAGs, suggesting that archaeal Hg methylators were rare or absent.
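The MAG quality criteria stated above (completeness >70%, contamination <5%) amount to a simple two-condition filter. The following Python sketch illustrates that filter only; the bin names and their values are hypothetical, and the text does not say which software produced the completeness and contamination estimates.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Minimal record for one metagenome-assembled genome (illustrative only)."""
    name: str
    completeness: float   # percent
    contamination: float  # percent


def passes_quality(mag: Mag,
                   min_completeness: float = 70.0,
                   max_contamination: float = 5.0) -> bool:
    """Apply the thresholds quoted in the text: completeness >70%, contamination <5%."""
    return mag.completeness > min_completeness and mag.contamination < max_contamination


# Hypothetical bins, not taken from Table S1.
example_mags = [
    Mag("example_bin_a", 97.8, 0.0),
    Mag("example_bin_b", 65.0, 1.2),   # fails: completeness too low
    Mag("example_bin_c", 88.4, 7.9),   # fails: contamination too high
]

kept = [m.name for m in example_mags if passes_quality(m)]
```

Both inequalities are strict, matching the ">" and "<" used in the text, so a bin at exactly 70% completeness or 5% contamination would be excluded under this reading.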

Novel potential Hg methylators

Fifteen hgcA-carrying MAGs represented members of the phylum Marinimicrobia, a widespread but uncultured marine phylum that couples C, N, and S cycling (Hawley et al 2017a). These Marinimicrobia MAGs contained the same hgcA gene sequence (100% nucleotide identity), and uniformly possessed hgcB genes downstream of hgcA. Marinimicrobia genome association with hgcAB was supported strongly by emergent self-organizing maps (ESOMs; Figure S2A). One Marinimicrobia-associated MAG, SI037_bin139, exhibited the highest binning quality score (Table S1), with a completeness of 97.8% and contamination ~0%, having 75 contigs in total. Many Marinimicrobia MAGs also carried 16S rRNA genes, further supporting taxonomic classification (Figure S3). Notably, Marinimicrobia-HgcA sequences fell within the Euryarchaeota in the HgcA phylogenetic tree (Figure 2), suggesting an evolutionary pathway different from that of other bacterial hgc sequences.

Three MAGs containing hgcAB genes were associated with Calditrichaeota, another cosmopolitan marine phylum (Marshall et al 2017). HgcA sequences from Calditrichaeota formed a cluster with the Planctomycetes, Verrucomicrobia and Chlamydiae (PVC) group and Chloroflexi, which also have close phylogenetic relationships to Deltaproteobacteria, as shown in Figure 2. ESOMs (Figure S2B) supported the presence of hgcAB genes in Calditrichaeota MAGs.

HgcA sequences were found in eight Proteobacteria MAGs, all belonging to the Deltaproteobacteria group SAR324. Deltaproteobacteria are the most diverse Hg-methylating clade currently known (Podar et al 2015). However, the SAR324 hgcA sequences clustered separately from those of three previously described Deltaproteobacteria methylating clades (Desulfovibrio-like, Geobacter-like, and Syntrophus-like; Gilmour et al 2013, Podar et al 2015), forming a potentially novel clade of Hg methylators (Figure 2). SAR324 is affiliated with a group of Deltaproteobacteria abundant in the deep sea and in low-oxygen settings, with the ability to metabolize sulfur, organic carbon, and C1 compounds (Sheik et al 2014). An adjacent hgcB gene was also recognised downstream of hgcA in SAR324 MAGs SI047_bin2 and SI048_bin69. Six other SAR324 MAGs did not contain recognizable hgcAB genes, but several HgcB-like candidates with tandem [CX2CX2CX3C] motifs were recognised. The presence of hgc genes in SAR324 MAGs was well supported by ESOM (Figure S2A).

Relative abundance of hgcA DNA and RNA in SI

The 56 hgcA sequences were clustered into 15 groups using a 99% sequence identity threshold. One hgcA sequence was selected arbitrarily from each group, with its corresponding MAG (Table 1; also indicated by stars in Figure 2), for further analysis. These sequences were used to recruit reads from SI metagenomic and metatranscriptomic datasets, in order to calculate relative abundance and expression. Results showed that hgcA genes were widely distributed in SI water samples, and the abundance of hgcA gene copies increased with depth, peaking at ~200 m depth (Figure 3A). Since hgcA is usually present as a single copy per genome, this finding suggests that Hg methylators were most abundant at this depth. Notably, Marinimicrobia was the most abundant putative Hg methylator detected throughout the water column, accounting for >2% of the microbial community for many samples from 200 m depth (Figure 3B). Deltaproteobacteria with hgcA also showed higher abundance in 200 m depth samples, suggesting an overlapping habitat range with Marinimicrobia. Although hgcA-carrying Nitrospina was not a dominant phylum in SI datasets, the relative abundance of Nitrospina-hosted hgcA remained nearly constant from sea surface to bottom water, implying the adaptability of this phylum across the redoxcline.

Reads per kilobase per million mapped (RPKM) values were calculated for hgcA transcripts in each sample to assess hgc gene expression in a subset of 36 samples from site S3 (Figure 3C). Consistent with relative DNA abundance, hgcA transcripts also increased with depth to a maximum at 200 m (mean RPKM = 5.1); the most abundant hgcA transcripts were associated with Marinimicrobia (Figure 3D). Intriguingly, no hgcA transcripts were recovered from 10 m depth samples, although a number of hgc DNA sequences were recovered. Additionally, no hgcA transcript associated with Verrucomicrobia (Lentisphaeria and Kiritimatiellae) was detected for any dataset, which may reflect the relatively lower abundance of Verrucomicrobia-hgcA genes, as well as a low transcription level. Predicted amino acid sequences from SI metaproteomic datasets were also scanned for expression of putative hgcA genes (Figure S4; also indicated by stars in Figure 3D). Except for those from the phyla Calditrichaeota, Verrucomicrobia, Spirochaetes, and Firmicutes, representative HgcA sequences were detected in matched metaproteomic datasets.
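The grouping of hgcA sequences at a 99% identity threshold, described at the start of this section, can be illustrated with a greedy representative-based clustering. This is a sketch under stated assumptions rather than the procedure actually used: the text does not name the clustering tool, the identity function below assumes equal-length, pre-aligned sequences, and the sequence names and contents are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: dict, threshold: float = 0.99) -> list:
    """Assign each sequence to the first cluster whose representative it matches
    at or above the threshold; otherwise start a new cluster.

    The first member of each cluster acts as its representative.
    """
    clusters = []
    for name, seq in seqs.items():
        for cluster in clusters:
            if identity(seq, seqs[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical 200-position sequences.
example_seqs = {
    "seq_a": "A" * 200,
    "seq_b": "A" * 199 + "C",        # 99.5% identical to seq_a
    "seq_c": "A" * 190 + "C" * 10,   # 95% identical to seq_a
}
clusters = greedy_cluster(example_seqs)
```

Greedy clustering depends on input order, so the choice of representative here is arbitrary in the same sense as the arbitrary selection of one sequence per group mentioned in the text.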

Abundance of merB genes

As MeHg concentrations in the SI water column may reflect net Hg methylation and demethylation, the distribution with depth of merB, a gene encoding an organomercury lyase, was also assessed (Figure 1B). Results showed that the relative abundance of merB genes peaked at 200 m depth, at ~0.6% of total microbial community genes (Figure S5A). RPKM values of merB transcripts in each metatranscriptomic sample were calculated and compared with those of hgcA transcripts. Consistent with metagenomic results, merB transcripts peaked at 200 m depth (mean RPKM = 5.2), whereas no merB transcripts were detected at 10 m depth (Figure S5B). Average RPKM values of merB were higher than those of hgcA at depths < 120 m, but were exceeded by hgcA RPKM values at 120-200 m depth. Interestingly, the average RPKM of hgcA was again exceeded by that of merB at 200 m depth, although transcription of both genes increased to maxima at that same depth (Figure S5C).
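RPKM, as used above for both hgcA and merB transcripts, normalises a read count by gene length (in kilobases) and by sequencing depth (in millions of mapped reads). The sketch below shows the standard formula; the read counts, gene length, and library size in the example are hypothetical and are not taken from the SI datasets.

```python
def rpkm(reads_mapped_to_gene: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads.

    RPKM = reads * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return reads_mapped_to_gene * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 50 reads on a 1,000 bp gene in a library of 10 million mapped reads.
example_value = rpkm(50, 1_000, 10_000_000)
```

Because the value is normalised per gene, RPKM values for hgcA and merB can be compared within a sample even when the two genes differ in length.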

Phylogenetic analysis of hgcA-carrying Marinimicrobia

All 15 hgcA-carrying Marinimicrobia and 409 non-hgcA-carrying Marinimicrobia spp., derived from this study and public databases (NCBI and IMG), were used to build a phylogenetic tree based on a concatenated alignment of housekeeping genes (Figure 4A, Figure S6). All 15 hgcA-carrying Marinimicrobia clustered into a monophyletic clade (Figure 4A) and carried identical hgcA gene sequences (Figure 2), consistent with a single horizontal transfer of hgcA into a sublineage of Marinimicrobia. Furthermore, while all Marinimicrobia-associated hgcA sequences were identical, four different 16S rRNA genes were represented by these MAGs, with a minimum identity of 96.0% (Figure S3). This finding suggests that hgcA-carrying Marinimicrobia represent a novel candidate species or genus, according to Yarza et al (2014).
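To make the 16S rRNA reasoning above concrete, the sketch below maps a pairwise identity value to the lowest taxonomic rank that two sequences would be expected to share. The cutoffs (98.7% for species, 94.5% for genus, 86.5% for family) are an assumption here: they are the values commonly attributed to Yarza et al (2014) and are not stated in this text. The function name is hypothetical.

```python
# Assumed 16S rRNA identity cutoffs (percent), commonly attributed to Yarza et al (2014);
# they are not quoted in the manuscript text itself.
RANK_CUTOFFS = [
    (98.7, "species"),
    (94.5, "genus"),
    (86.5, "family"),
]


def lowest_shared_rank(identity_pct: float) -> str:
    """Return the most specific rank whose cutoff the identity meets or exceeds."""
    for cutoff, rank in RANK_CUTOFFS:
        if identity_pct >= cutoff:
            return rank
    return "above family"


# The minimum pairwise identity reported for the hgcA-carrying Marinimicrobia MAGs.
shared = lowest_shared_rank(96.0)
```

Under these assumed cutoffs, 96.0% falls between the genus and species thresholds, which is one way to read the statement that the hgcA-carrying MAGs represent a candidate species or genus.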

Marinimicrobia-hgcA distribution in public databases

In order to explore the global distribution of hgcA-carrying Marinimicrobia, a wider search was performed against NCBI SRA metagenomic datasets. In total, 58 BioProjects from a range of environments, including seawater, soil, sediment, freshwater, industrial wastewater, and plant rhizomes, contained Marinimicrobia-hgcA associated reads (Figure 4B, Table S2), suggesting a wide ecological distribution.

Homology models of Marinimicrobia-HgcA and -HgcB proteins

As no cultivated representative of Marinimicrobia is currently available to assay for functional Hg methylation, we constructed homology models to test the hypothesis that HgcA and HgcB proteins encoded by Marinimicrobia possess the three-dimensional structural features required for methylating Hg.

Analysis of the putative Marinimicrobia-HgcA sequence revealed a domain structure (Figure 5A) comparable to that of the previously characterized HgcA from the functionally validated Hg-methylating strain Desulfovibrio desulfuricans ND132 (Parks et al 2013), including the presence of a globular domain at the N-terminus and five transmembrane-spanning helices at the C-terminus. The cap-helix structure and coordinated Cys residues required for Hg(II) methylation by ND132 were conserved in the globular domain of our putative Marinimicrobia-HgcA (Figure S7A and Figure S7B), consistent with a conserved catalytic mechanism. The structure of ND132-HgcA was solved bound to the co-factor cobalamin required for catalytic methylation activity. The cobalamin binding pocket for ND132 was comparable to that of the Marinimicrobia-HgcA homology model (726.5 Å³ and 678.4 Å³, respectively), with 69% conservation of residues interacting with cobalamin. Modelling of cobalamin in the binding site of the Marinimicrobia-HgcA homology model revealed a similar H-bonding network for the two structures (Table S3). Modelling of the Marinimicrobia-HgcB protein (Figure 5C) demonstrated binding with two [4Fe-4S] clusters at the N-terminus, similar to the structure of ND132-HgcB (Figure S7C and Figure S7D). The two conserved ferredoxin motifs (CX2CX2CX3C), and the C-terminal Cys tail in ND132-HgcB required for potentially transferring electrons and binding the Hg(II) substrate (Parks et al 2013, Rush 2018), were also observed for Marinimicrobia-HgcB. The structural qualities of Marinimicrobia-HgcA and -HgcB were further compared using Ramachandran plots (Figure 5B and 5D), with most residues located in the favored or allowed regions in both structures. Overall, this analysis strongly supports the conservation of Hg(II) methylation activity for the putative Marinimicrobia-HgcA and -HgcB proteins. Additionally, similar HgcAB homology models constructed for the novel putative Hg methylators Calditrichaeota and SAR324 also support functionality (Figure S8).
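The ferredoxin-type motif CX2CX2CX3C mentioned above (a Cys, two residues of any kind, a Cys, two residues, a Cys, three residues, a Cys) can be expressed as a regular expression and used to screen candidate HgcB-like sequences. The sketch below is illustrative: the example sequence is hypothetical rather than a real HgcB sequence, and the helper name is an assumption, not part of any published pipeline.

```python
import re

# CX2CX2CX3C written as a regular expression; the lookahead lets
# overlapping occurrences be reported as well.
FERREDOXIN_MOTIF = re.compile(r"(?=(C.{2}C.{2}C.{3}C))")


def find_ferredoxin_motifs(protein_seq: str) -> list:
    """Return 0-based start positions of every CX2CX2CX3C occurrence."""
    return [m.start() for m in FERREDOXIN_MOTIF.finditer(protein_seq.upper())]


# Hypothetical sequence containing two motif copies, for illustration only.
example_seq = "MK" + "CAACGGCVEVC" + "PAAIKLD" + "CIGCGACVRAC" + "PF"
positions = find_ferredoxin_motifs(example_seq)
has_tandem_motifs = len(positions) >= 2
```

A pattern match of this kind only flags candidates; it says nothing about whether the Cys residues are positioned to coordinate [4Fe-4S] clusters, which is what the homology models address.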

Genome composition of hgcA-carrying MAGs

Metabolic pathways were inferred for MAGs by reference to Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations (Figure 6). Several genes encoding different terminal oxidases were detected, including cytochrome c oxidase aa3-type (coxABCD), cytochrome c oxidase cbb3-type (ccoPQNO), cytochrome o oxidase (cyoABCD), cytochrome aa3-600 menaquinol oxidase (qoxABCD), and cytochrome bd-type quinol oxidase (cydAB), demonstrating the oxygen utilization/tolerance capacities of these MAGs. Specifically, all hgcA-carrying Marinimicrobia contained cytochrome c oxidase aa3-type and cytochrome aa3-600 oxidase, and Marinimicrobia SI039_bin42 contained another cbb3-type oxidase. All hgcA-carrying SAR324, and the majority of other hgcA-carrying Deltaproteobacteria, contained cytochrome bd complex; and most of these also contained cytochrome c oxidase, and aa3-type and o ubiquinol oxidases. A few Deltaproteobacteria carried cytochrome aa3-600 oxidase. The two Nitrospina MAGs both used cbb3-type oxidase to terminate respiratory chains, and one also carried genes encoding cytochrome c oxidase. In contrast, no genes involved in O2 respiration were observed in hgcA-carrying Calditrichaeota, Verrucomicrobia, Spirochaetota, and Firmicutes genomes identified from databases.

Genes involved in carbohydrate degradation pathways were found in many hgcA-carrying MAGs, except for Syntrophus-like Deltaproteobacteria and Calditrichaeota. These MAGs represent versatile metabolic capabilities in nitrogen and sulfur cycling. Notably, Marinimicrobia MAGs contained complete gene operons for utilising nitrogen species in electron transfers, e.g., hydroxylamine oxidoreductase (hao), nitrite oxidase (nxrAB), nitrate reductase (narGHI), and hydrazine dehydrogenase (hdh). Deltaproteobacteria MAGs presented various capabilities for sulfur utilisation, among which SAR324 MAGs were the only group possessing the gene ddhA, encoding a subunit of dimethyl sulfide (DMS) dehydrogenase. Nearly all MAGs contained some genes involved in methanogenesis and arsenic reduction, as well as transporters for Cu and Fe. Furthermore, MAGs associated with Marinimicrobia, Calditrichaeota, SAR324, Spirochaetes, and Nitrospina were found to carry genes encoding proteins for flagellar assembly (Figure 6).

Novel putative marine Hg methylators in SI

In this study, SI provided a model natural ecosystem in which to study Hg methylation under oxygen-depleted conditions (Hawley et al 2017b). Our results indicate that the SI water column contained zones of MeHg production, where ~17% of total Hg was present as MeHg. Similar MeHg peaks under suboxic conditions have been observed in other seawater depth profiles: the Pacific Ocean (Hammerschmidt and Bowman 2012), Arctic Ocean (Wang et al 2012), Southern Ocean (Cossa et al 2011), Arabian Sea (Chakraborty et al 2016), and Mediterranean Sea (Cossa et al 2009). In contrast to the relative abundance of potential methylators observed in the benthic zone, the observed increased abundance of merB genes and transcripts may provide a viable hypothesis for why MeHg levels decreased over certain depth ranges (Figure S5).

A number of known and putative Hg methylators were found, among which the phyla Marinimicrobia and Calditrichaeota, as well as the clade SAR324, are reported for the first time as carrying hgcAB genes (Table 1 and Figure 2). As most known Hg-methylating microorganisms derive from a limited number of phylogenetic groups (Gilmour et al 2013, Gionfriddo et al 2016, Jones et al 2019), our findings expand the database of putative microbial Hg methylators and their microenvironments. Unfortunately, no cultivated representatives are currently available to confirm Hg methylation capability experimentally. However, protein homology modelling here provided theoretical HgcAB structures (Figure 5) that demonstrate intermolecular forces consistent with those of the experimentally validated methylator D. desulfuricans ND132 (Figure S7). The same methodology has been adapted to infer functionality for HgcAB proteins from Nitrospina (Gionfriddo et al 2016), another uncultivated potential Hg methylator (Gionfriddo et al 2016, Villar et al 2020). This approach represents a promising method for predicting the functionality of proteins encoded by uncultivated microorganisms.

Surprisingly, Marinimicrobia dominated Hg-methylation capability across all water depths, and comprised a novel phylogenetic clade (Figure 4A and Figure S6). This finding supports the O2-adaptive capability of this clade, and both broadens and distinguishes the evolutionary history of hgcA. Notably, of the many hgcA DNA sequences from the upper depths of SI seawater, only a small number were actively expressed (Figure 3C and 3D), suggesting a strong influence from oxygen level or other environmental factors. We therefore caution that abundance of hgcA genes, as measured by metagenomic or amplicon sequencing, cannot effectively predict the actual extent of MeHg production.

Marinimicrobia is a widespread potential Hg methylator

The phylum Marinimicrobia, an uncultivated but apparently cosmopolitan group, is thought to play an important role in global biogeochemical cycling, especially across redox gradients (Bertagnolli et al 2017, Hawley et al 2017a). However, the Hg-methylation potential of Marinimicrobia has not previously been recognised. A few HgcA sequences from SI seawater columns were discovered by Podar et al (2015) from metagenomic data; these unidentified HgcA sequences phylogenetically co-locate with Marinimicrobia-HgcA identified in this study with genome resolution. Furthermore, we found hgcA-carrying Marinimicrobia exhibit a global distribution pattern (Figure 4B). Interestingly, our findings of hgcA-carrying Marinimicrobia from metagenomic datasets from Gulf of Mexico (PRJNA288120) and Canadian Arctic (PRJNA266338) waters contaminated by oil spills (Yergeau et al 2015, Yergeau et al 2017) may provide one explanation for observed in situ MeHg production (Liu et al 2015).

Adaptation strategies of Hg methylators in oxygen-deficient environments

Hg methylators living in oxygen gradients may require the capability to tolerate intermittent lower levels of oxygen exposure. In this study, multiple terminal oxidases of aerobic respiratory chains were associated with several putative novel Hg methylators (Figure 6). Cytochrome c aa3-type oxidase, a canonical heme-copper-containing terminal oxidase (Sousa et al 2012), was the most common enzyme found in hgcA-carrying MAGs, occurring in SAR324, most Deltaproteobacteria, all Marinimicrobia, and one Nitrospina. Other terminal oxidase systems were also recognised; for example, genes encoding cytochrome c cbb3-type oxidase were found in several members of Deltaproteobacteria, Marinimicrobia, and Nitrospina. This oxidase has been shown to exhibit high affinity for oxygen, enabling bacteria to respire O2 under both oxic and suboxic conditions (Colburn-Clifford and Allen 2010, Visser et al 2006). Cytochrome bd oxidase was found to be used by most Deltaproteobacteria; this type of oxidase has also been demonstrated to support adaptation to low-oxygen environments, because of its relatively high affinity for O2 (Borisov et al 2011). In contrast, cytochrome o oxidase, carried by many Deltaproteobacteria, is an enzyme proposed to function under O2-rich conditions (Cotter et al 1990). As shown here, many SI Hg methylator genomes contained more than one type of terminal oxidase. Such a combination of multiple terminal oxidases has been observed for other microorganisms, facilitating electron transfer to O2 under variable redox potentials (Cotter et al 1990). In addition, genes encoding flagellar proteins were identified in most SI putative Hg methylators, enabling these bacteria to reposition optimally within redox gradients to suitable O2 and nutrient levels. We acknowledge, however, that cultivation experiments with hgcAB-bearing Marinimicrobia isolates are needed to elucidate their lifestyle and confirm functionality for Hg methylation.

Materials and methods

Mercury analysis

Seawater samples for mercury analysis were collected from SI S3 station (48°35.500 N, 123°30.300 W) in April 2010. Total and methylmercury (sum of mono- and dimethylmercury forms) determinations were made using USEPA standard methods 1631-E and 1630, respectively (USEPA 1998, USEPA 2002). In brief, total Hg was determined on filtered water samples following wet chemical oxidation by BrCl, followed by reduction by NH2OH and SnCl2, rendering all Hg species as volatile Hg(0). Hg was purged from solution by N2 and concentrated on a gold-coated sand cartridge, which was then heated, releasing Hg for re-concentration on a second gold cartridge for final quantification by cold-vapor atomic fluorescence spectrometry (CVAFS; Tekran 2600). Methylated Hg was first separated from the seawater matrix by KCl/H2SO4/CuSO4 extraction and steam distillation. Removal from seawater allowed for derivatization of MeHg, through the use of sodium tetraethylborate, into methylethyl-Hg, which is volatile and can be purged from solution and pre-concentrated on Tenax. The MeHg derivative was then separated from other Hg forms on a packed gas chromatography column of 15% OV-3 on Chromasorb-W at 110°C, and then rendered into Hg(0) through pyrolysis for quantification by CVAFS. Detection limits for total and methylated Hg (3σ of reagent blanks) were 0.5 and 0.1 pM, respectively.

Multi-omics data description

The multi-omic (metagenomic, metatranscriptomic, and metaproteomic) time-series samples from SI were taken from 2009 to 2012. A total of 84 genome-resolved metagenomic samples (Table S4) were employed in this study, including 78 samples from SI midpoint station “S3” (48°35.500 N, 123°30.300 W), 6 samples from the inlet mouth station “S4” (48°6 N 123°5 W) and the inlet end station “S2” (48°33.148 N, 123°31.969 W). A set of shotgun metatranscriptomic raw data corresponding to 36 metagenomic samples (Table S4) was used to investigate the transcriptional activity of genes of interest. Protein sequences predicted from the SI metaproteomic dataset (Hawley et al 2014) were also used to confirm the expression of target proteins in the environment. Sample collection, DNA/RNA/protein extraction, and sequencing methods for the datasets were described previously (Hawley et al 2017b). Briefly, seawater samples spanning six major depths (10, 100, 120, 135, 150, and 200 m) were collected and filtered onto 0.22 μm Sterivex (Millipore) filters. 1.8 ml of RNAlater (Ambion) was added to metatranscriptomic sample filters, and 1.8 ml of sucrose lysis buffer was added to metaproteomic sample filters. Filters were stored at -80 °C until processing.
For metagenomic samples, Sterivex filters were thawed on ice and incubated at 37 °C for 1 h with lysozyme (Sigma). Proteinase K (Sigma) and 20% SDS were then added, and the filters were incubated at 55 °C for 2 h with rotation. Filters were rinsed with sucrose lysis buffer after the lysate was removed. The combined lysate was extracted with phenol-chloroform followed by chloroform. The aqueous layer was washed three times with TE buffer (pH 8.0) and concentrated to 150–400 μl. The metagenomic samples were sequenced at the DOE Joint Genome Institute (JGI) on the Illumina HiSeq 2000 platform.

For metatranscriptomic samples, total RNA was extracted using the mirVana miRNA Isolation kit (Ambion) modified for Sterivex filters. RNAlater was removed from thawed filters by extrusion, and the filters were rinsed with Ringer’s solution (Sigma). After incubation at 25 °C for 20 min with rotation, Ringer’s solution was removed by extrusion. Lysozyme was added, followed by incubation at 37 °C for 30 min with rotation. The lysate was transferred to a 15 ml tube and subjected to organic extraction. The TURBO DNA-free kit (Thermo Fisher) was used to remove DNA, and the RNeasy MinElute Cleanup kit (Qiagen) was used for purification. Metatranscriptomic shotgun libraries were generated at the JGI and sequenced on the HiSeq 2000 platform.

For metaproteomic samples, BugBuster (Novagen) was added to thawed filters, which were incubated at 25 °C for 20–30 min with rotation. Filters were rinsed with 1 ml lysis buffer after the lysate was removed. The combined lysate was then subjected to buffer exchange three times using Amicon Ultra 10 K (Millipore) with 100 mM NH4HCO3. Urea was added to a final concentration of 8 M and dithiothreitol to a final concentration of 5 mM. Samples were incubated at 60 °C for 30 min and diluted tenfold with 100 mM NH4HCO3, followed by digestion with trypsin at 37 °C for 6 h. C18 solid phase extraction and strong cation exchange were carried out subsequently. Tandem mass spectrometry (MS/MS) was used to sequence the protein samples, and MS-GFDB (Kim et al 2010) was used to identify peptides based on the matched SI metagenomic sequences.

Metagenomic assembly and binning
The raw Illumina reads from metagenomic samples (2×150 bp paired-end, median 13.75 Gbp per sample) were first filtered and trimmed with Trimmomatic v0.36 (Bolger et al 2014). Samples from the same station and time point were co-assembled with MEGAHIT v1.2.8. MAGs were recovered using MetaBAT v2.14 (Kang et al 2015) and MaxBin v2.2.7 (Wu et al 2014), and the resulting bins were merged and refined with MetaWRAP v1.2 (Uritskiy et al 2018). To further improve the quality of the MAGs, metagenomic reads were remapped to each MAG and then reassembled with SPAdes (Bankevich et al 2012) in careful mode. The quality of the resulting MAGs was assessed using CheckM (Parks et al 2015). 16S rRNA genes were predicted with RNAmmer (Lagesen et al 2007) and then used to assign taxonomic classifications to the MAGs. For MAGs that lacked 16S rRNA genes, GTDB-Tk v0.3.2 (Parks et al 2018) was used to predict taxonomy from genome content. Genome annotation was conducted with Prokka v1.14 (Seemann 2014). The presence of hgcAB genes in MAGs was further confirmed by tetranucleotide frequency signatures based on ESOM mapping (Ultsch and Mörchen 2005).
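The tetranucleotide frequency signatures mentioned above can be illustrated with a minimal sketch. The function name and the handling of ambiguous bases are assumptions made for illustration and are not taken from this study; the sketch only computes the 256-dimensional frequency vector of a contig, the kind of feature on which an emergent self-organizing map is trained. As a simplification, reverse-complementary tetranucleotides are not merged.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 tetranucleotides in a sequence.

    Hypothetical helper for illustration; windows that contain
    characters other than A, C, G, T (e.g. N) are skipped.
    """
    seq = seq.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    # Return zeros for every k-mer if no valid window exists.
    return {k: (counts[k] / total if total else 0.0) for k in all_kmers}


if __name__ == "__main__":
    freqs = tetranucleotide_frequencies("ACGTACGT")
    print(freqs["ACGT"])
```

Contigs from the same genome tend to have similar vectors, which is why such signatures can support the assignment of an hgcAB-carrying contig to a bin.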

Relative abundance calculation
CD-HIT v4.6 (Godzik and Li 2006) was used to choose representative hgcA sequences from all the MAGs under a 99% sequence identity threshold. Paired-end reads were remapped to these hgcA sequences with the BWA-MEM algorithm (Li 2013). BBMap (http://sourceforge.net/projects/bbmap/) was used to calculate the average coverage. MicrobeCensus (Nayfach and Pollard 2015) was used to estimate the genome equivalents of every sample. The relative abundance of every representative hgcA in every sample was calculated by the formula “hgcA coverage / genome equivalent”; the relative abundance of hgcA genes was assumed to represent that of the corresponding MAGs, since hgcA is a single-copy gene in all reported methylators. To evaluate and compare expression levels of genes of interest, reads per kilobase per million mapped reads (RPKM) values were calculated for each gene, normalizing for both gene length and sequencing depth.
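The two normalizations described above can be written out as a short sketch. The function and parameter names are hypothetical, and the choice of total mapped reads as the depth term in RPKM is an assumption, since the text states only that values were normalized for gene length and sequencing depth.

```python
def relative_abundance(hgca_coverage: float, genome_equivalents: float) -> float:
    """hgcA coverage divided by the genome equivalents of the sample,
    following the formula quoted in the text."""
    if genome_equivalents <= 0:
        raise ValueError("genome_equivalents must be positive")
    return hgca_coverage / genome_equivalents


def rpkm(reads_on_gene: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads.

    Assumption: sequencing depth is represented by total mapped reads.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and read total must be positive")
    return reads_on_gene * 1e9 / (gene_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Illustrative inputs only, not values from this study.
    print(relative_abundance(hgca_coverage=12.0, genome_equivalents=3000.0))
    print(rpkm(reads_on_gene=100, gene_length_bp=1000, total_mapped_reads=1_000_000))
```

Dividing coverage by genome equivalents expresses gene abundance per genome in the community, which makes samples of different sequencing depth and average genome size comparable.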

Search for hgcAB genes in MAGs
An in-house database of experimentally validated functional HgcAB sequences (see Table S5) was used to search for HgcAB-encoding genes in MAGs by BLASTp, and BLAST results were further confirmed by checking for the conserved motifs [N(V/I)WCA(A/G)(A/G)K] in HgcA and [CX2CX2CX3C] in HgcB.
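The motif check described above can be expressed with regular expressions. This is a minimal sketch, not the in-house procedure used in the study: the function names and the toy peptide strings are hypothetical, and X is interpreted as any single amino acid letter.

```python
import re

# N(V/I)WCA(A/G)(A/G)K in HgcA
HGCA_MOTIF = re.compile(r"N[VI]WCA[AG][AG]K")
# CX2CX2CX3C in HgcB, with X taken to be any one-letter residue code
HGCB_MOTIF = re.compile(r"C[A-Z]{2}C[A-Z]{2}C[A-Z]{3}C")


def has_hgca_motif(protein: str) -> bool:
    """True if the protein sequence contains the conserved HgcA motif."""
    return HGCA_MOTIF.search(protein.upper()) is not None


def has_hgcb_motif(protein: str) -> bool:
    """True if the protein sequence contains the conserved HgcB motif."""
    return HGCB_MOTIF.search(protein.upper()) is not None


if __name__ == "__main__":
    # Toy peptides for illustration only, not real HgcA or HgcB sequences.
    print(has_hgca_motif("MKNVWCAAGKLL"))
    print(has_hgcb_motif("AACKKCGGCLLLCAA"))
```

A BLASTp hit that lacks the motif would be treated as a likely paralogue rather than a functional HgcA or HgcB under this kind of filter.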

Phylogenetic analysis
HgcA sequences found in MAGs were aligned with other experimentally validated and predicted sequences (Gilmour et al 2013) using MAFFT v7 with the high-sensitivity (L-INS-i) algorithm (Standley and Katoh 2013). The alignment was trimmed with trimAl v1.2 (Silla-Martínez et al 2009), and a maximum-likelihood (ML) tree was reconstructed with IQ-TREE v1.6 (Schmidt et al 2014) under the LG+F+R9 protein substitution model, chosen according to the Bayesian information criterion (BIC). Publicly available Marinimicrobia 16S rRNA genes (see Table S6) were retrieved from the SILVA database (release 138; https://www.arb-silva.de/) and aligned with the 16S rRNA genes of hgcA-carrying Marinimicrobia to build an ML tree using IQ-TREE v1.6. The ML tree of all the Marinimicrobia-affiliated MAGs found in this study, as well as in public databases (see Table S7), was built by PhyloPhlAn2 (Segata et al 2013) using 400 universal proteins without duplication.
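Choosing a substitution model by BIC, as described in this section, amounts to picking the candidate with the lowest value of k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of alignment sites, and L the maximized likelihood. The sketch below uses that standard definition; the function names, the candidate labels, and the numbers are hypothetical and do not reproduce IQ-TREE output from this study.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model_by_bic(candidates: dict[str, tuple[float, int]], n_sites: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    candidates maps a model name to (log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))


if __name__ == "__main__":
    # Hypothetical candidates for illustration only.
    candidates = {"model_A": (-1000.0, 10), "model_B": (-990.0, 20)}
    print(best_model_by_bic(candidates, n_sites=500))
```

The penalty term grows with both parameter count and alignment length, so a richer model is selected only when its gain in likelihood outweighs the added parameters.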

Searching the SRA database
To investigate the distribution of Marinimicrobia-hgcA, the recovered Marinimicrobia-hgcA gene nucleotide sequence was first used as a query against the NCBI SRA database with the SRA search tool (Levi et al 2018), which maps ~1% of the reads from more than 100,000 public whole-genome shotgun (WGS) metagenomic samples to the query sequences using Bowtie2 (Langmead and Salzberg 2012). Samples that contained reads associated with the Marinimicrobia-hgcA gene were then checked by downloading the corresponding read sets from the SRA database and mapping these reads to the hgcA nucleotide sequence.

Protein homology modeling
The three-dimensional structures of the putative HgcA and HgcB sequences found in this study were built and refined using Schrödinger Suite v2012 (Schrödinger LLC, New York, USA). The transmembrane regions were predicted by CCTOP (Dobson et al 2015). HgcAB sequences were searched against the Protein Data Bank (PDB) using BLASTp with a word size of three. The structures of the corrinoid iron-sulfur protein (CFeSP) AcsC from Carboxydothermus hydrogenoformans (PDB ID: 2H9A; 1.9 Å; 28% identity and 55% similarity to Marinimicrobia-HgcA) and the CFeSP AcsC from Moorella thermoacetica (PDB ID: 4DJD; 2.38 Å; 31% identity and 59% similarity to Marinimicrobia-HgcA) were identified as appropriate templates for homology modeling of the N-terminal globular domain of HgcA, according to their similarities and the resolution at which they were solved. The structure of the transmembrane bacteriorhodopsin Bop (PDB ID: 2ZFE; 2.5 Å) was used to model the transmembrane region of HgcA, due to their comparable secondary structures. The homology model of HgcB proteins was built using the experimental structures of iron hydrogenase HydA from D. desulfuricans (PDB ID: 1HFE; 1.6 Å; 36% identity and 53% similarity to Marinimicrobia-HgcB) and the flavin-based caffeyl-CoA reductase CarE (PDB ID: 6FAH_A; 3.133 Å; 39% identity and 45% similarity to Marinimicrobia-HgcB) following a similar procedure. The stereochemical quality and accuracy of the reconstructed homology models were assessed and compared by generating Ramachandran plots with the Rampage server (Lovell et al 2003), and the best model was selected. Hydrogen bonds between the HgcA protein and the ligand cobalamin were predicted by Arpeggio (Jubb et al 2017), PLIP (Salentin et al 2015), Chimera (Pettersen et al 2004), and NGL Viewer (Rose and Hildebrand 2015), and hydrogen bonds predicted consistently by all four programs were considered robust. Homology models for HgcAB from the Hg-methylating model strain D. desulfuricans ND132 were also built by the methods described above.

Acknowledgements
H.L. was supported by a postgraduate fellowship from The University of Melbourne Environmental Microbiology Research Initiative, awarded to J.W.M. D.B.A. was supported by an Investigator Grant from the National Health and Medical Research Council (NHMRC) of Australia [GNT1174405], and by the Victorian Government OIS Program. C.H.L. thanks the Woods Hole Oceanographic Institution for support and Tracy Mincer for help and inspiration. K.E.H. is supported by a Senior Medical Research Fellowship from the Viertel Foundation of Australia. H.L., D.B.H., Y.M., R.W., K.E.H. and J.W.M. gratefully acknowledge the use of data generated under the auspices of the US Department of Energy (DOE) Joint Genome Institute and Office of Science User Facility, supported by the Office of Science of the U.S. Department of Energy under Contract DE-AC02-05CH11231, the G. Unger Vetlesen and Ambrose Monell Foundations, and the Natural Sciences and Engineering Research Council of Canada through grants awarded to S.J.H.

Data availability
Data from this project have been deposited at DDBJ/ENA/GenBank under Project ID PRJNA630981. The 2088 MAGs generated by this study are available with accession numbers JABGOO000000000 to JABJQV000000000.

Code availability
Customised scripts for conducting metagenomic and metatranscriptomic analyses are available from GitHub (https://github.com/SilentGene/MultiOmicAnalysis).

References
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS et al (2012). SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19: 455-477.

Bertagnolli AD, Padilla CC, Glass JB, Thamdrup B, Stewart FJ (2017). Metabolic potential and in situ activity of marine Marinimicrobia bacteria in an anoxic water column. Environ Microbiol 19: 4392-4416.

Bertagnolli AD, Stewart FJ (2018). Microbial niches in marine oxygen minimum zones. Nat Rev Microbiol.

Bolger AM, Usadel B, Lohse M (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.

Borisov VB, Gennis RB, Hemp J, Verkhovsky MI (2011). The cytochrome bd respiratory oxygen reductases. Biochim Biophys Acta 1807: 1398-1413.

Chakraborty P, Mason RP, Jayachandran S, Vudamala K, Armoury K, Sarkar A et al (2016). Effects of bottom water oxygen concentrations on mercury distribution and speciation in sediments below the oxygen minimum zone of the Arabian Sea. Mar Chem 186: 24-32.

Colburn-Clifford J, Allen C (2010). A cbb(3)-type cytochrome C oxidase contributes to Ralstonia solanacearum R3bv2 growth in microaerobic environments and to bacterial wilt disease development in tomato. Mol Plant Microbe Interact 23: 1042-1052.

Cossa D, Averty B, Pirrone N (2009). The origin of methylmercury in open Mediterranean waters. Limnol Oceanogr 54: 837-844.

Cossa D, Heimbürger L-E, Lannuzel D, Rintoul SR, Butler ECV, Bowie AR et al (2011). Mercury in the Southern Ocean. Geochim Cosmochim Acta 75: 4037-4052.

Cotter PA, Chepuri V, Gennis RB, Gunsalus RP (1990). Cytochrome o (cyoABCDE) and d (cydAB) oxidase gene expression in Escherichia coli is regulated by oxygen, pH, and the fnr gene product. J Bacteriol 172: 6333-6338.

Dobson L, Reményi I, Tusnády GE (2015). CCTOP: a Consensus Constrained TOPology prediction web server. Nucleic Acids Res 43: W408-W412.

Fitzgerald WF, Clarkson TW (1991). Mercury and monomethylmercury: present and future concerns. Environ Health Perspect 96: 159-166.

Gilmour CC, Podar M, Bullock AL, Graham AM, Brown SD, Somenahally AC et al (2013). Mercury methylation by novel microorganisms from new environments. Environ Sci Technol 47: 11810-11820.

Gionfriddo CM, Tate MT, Wick RR, Schultz MB, Zemla A, Thelen MP et al (2016). Microbial mercury methylation in Antarctic sea ice. Nat Microbiol 1: 16127.

Godzik A, Li W (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22: 1658-1659.

Grégoire DS, Poulain AJ, Ivanova EP (2018). Shining light on recent advances in microbial mercury cycling. Facets 3: 858-879.

Hammerschmidt CR, Bowman KL (2012). Vertical methylmercury distribution in the subtropical North Pacific Ocean. Mar Chem 132-133: 77-82.

Hawley AK, Brewer HM, Norbeck AD, Pasa-Tolic L, Hallam SJ (2014). Metaproteomics reveals differential modes of metabolic coupling among ubiquitous oxygen minimum zone microbes. Proc Natl Acad Sci U S A 111: 11395-11400.

Hawley AK, Nobu MK, Wright JJ, Durno WE, Morgan-Lang C, Sage B et al (2017a). Diverse Marinimicrobia bacteria may mediate coupled biogeochemical cycles along eco-thermodynamic gradients. Nat Commun 8: 1507.

Hawley AK, Torres-Beltran M, Zaikova E, Walsh DA, Mueller A, Scofield M et al (2017b). A compendium of multi-omic sequence information from the Saanich Inlet water column. Sci Data 4: 170160.

Hsu-Kim H, Eckley CS, Selin NE (2018). Modern science of a legacy problem: mercury biogeochemical research after the Minamata Convention. Environ Sci-Process Impacts 20: 582-583.

Jones DS, Walker GM, Johnson NW, Mitchell CPJ, Coleman Wasik JK, Bailey JV (2019). Molecular evidence for novel mercury methylating microorganisms in sulfate-impacted lakes. ISME J.

Jubb HC, Higueruelo AP, Ochoa-Montaño B, Pitt WR, Ascher DB, Blundell TL (2017). Arpeggio: a web server for calculating and visualising interatomic interactions in protein structures. J Mol Biol 429: 365-371.

Kang DD, Froula J, Egan R, Wang Z (2015). MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3: e1165.

Kessler JD, Valentine DL, Redmond MC, Du M, Chan EW, Mendes SD et al (2011). A persistent oxygen anomaly reveals the fate of spilled methane in the deep Gulf of Mexico. Science (New York, NY) 331: 312-315.

Kim S, Mischerikow N, Bandeira N, Navarro JD, Wich L, Mohammed S et al (2010). The generating function of CID, ETD, and CID/ETD pairs of tandem mass spectra: applications to database search. Mol Cell Proteomics 9: 2840-2852.

Lagesen K, Hallin P, Rødland EA, Staerfeldt H-H, Rognes T, Ussery DW (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35: 3100-3108.

Lamborg CH, Hammerschmidt CR, Bowman KL, Swarr GJ, Munson KM, Ohnemus DC et al (2014). A global ocean inventory of anthropogenic mercury based on water column measurements. Nature 512: 65-68.

Langmead B, Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nat Methods 9: 357-359.

Lee C-S, Fisher NS (2017). Bioaccumulation of methylmercury in a marine copepod. Environ Toxicol Chem 36: 1287-1293.

Levi K, Rynge M, Abeysinghe E, Edwards RA (2018). Searching the sequence read archive using Jetstream and Wrangler. Proc Pract Exp Adv Res Comput; 22-26 July; Pittsburgh, PA, USA. ACM.

Li H (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:13033997.

Liu B, Schaider LA, Mason RP, Shine JP, Rabalais NN, Senn DB (2015). Controls on methylmercury accumulation in northern Gulf of Mexico sediments. Estuar Coast Shelf S 159: 50-59.

Lovell SC, Davis IW, Arendall III WB, de Bakker PIW, Word JM, Prisant MG et al (2003). Structure validation by Cα geometry: ϕ,ψ and Cβ deviation. Proteins 50: 437-450.

Lu Z, Deng Y, Van Nostrand JD, He Z, Voordeckers J, Zhou A et al (2012). Microbial gene functions enriched in the Deepwater Horizon deep-sea oil plume. ISME J 6: 451-460.

Marshall IPG, Starnawski P, Cupit C, Fernandez Caceres E, Ettema TJG, Schramm A et al (2017). The novel bacterial phylum Calditrichaeota is diverse, widespread and abundant in marine sediments and has the capacity to degrade detrital proteins. Environ Microbiol Rep 9: 397-403.

Moore JK, Doney SC (2007). Iron availability limits the ocean nitrogen inventory stabilizing feedbacks between marine denitrification and nitrogen fixation. Glob Biogeochem Cycle 21: 2.

Nayfach S, Pollard KS (2015). Average genome size estimation improves comparative metagenomics and sheds light on the functional ecology of the human microbiome. Genome Biology 16: 51.

Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW (2015). CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res 25: 1043-1055.

Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A et al (2018). A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol 36: 996.

Parks JM, Johs A, Podar M, Bridou R, Hurt RA, Jr., Smith SD et al (2013). The genetic basis for bacterial mercury methylation. Science (New York, NY) 339: 1332-1335.

Paulmier A, Ruiz-Pino D (2009). Oxygen minimum zones (OMZs) in the modern ocean. Prog Oceanogr 80: 113-128.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC et al (2004). UCSF Chimera—A visualization system for exploratory research and analysis. J Comput Chem 25: 1605-1612.

Podar M, Gilmour CC, Brandt CC, Soren A, Brown SD, Crable BR et al (2015). Global prevalence and distribution of genes and microorganisms involved in mercury methylation. Sci Adv 1: e1500675.

Rose AS, Hildebrand PW (2015). NGL Viewer: a web application for molecular visualization. Nucleic Acids Res 43: W576-W579.

Rush KW (2018). Expression and characterization of HgcA and HgcB, two proteins involved in methylmercury biosynthesis. Ph.D. thesis, University of Michigan, Ann Arbor, MI.

Salentin S, Schreiber S, Haupt VJ, Adasme MF, Schroeder M (2015). PLIP: fully automated protein–ligand interaction profiler. Nucleic Acids Res 43: W443-W447.

Schmidt HA, Minh BQ, von Haeseler A, Nguyen L-T (2014). IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32: 268-274.

Seemann T (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics 30: 2068-2069.

Segata N, Börnigen D, Morgan XC, Huttenhower C (2013). PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes. Nat Commun 4: 2304-2304.

Selin NE (2009). Global biogeochemical cycling of mercury: A review. Annu Rev Environ Resour 34: 43-63.

Sheik CS, Jain S, Dick GJ (2014). Metabolic flexibility of enigmatic SAR324 revealed through metagenomics and metatranscriptomics. Environ Microbiol 16: 304-317.

Silla-Martínez JM, Capella-Gutiérrez S, Gabaldón T (2009). trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972-1973.

Sousa FL, Alves RJ, Ribeiro MA, Pereira-Leal JB, Teixeira M, Pereira MM (2012). The superfamily of heme-copper oxygen reductases: types and evolutionary considerations. Biochim Biophys Acta 1817: 629-637.

Standley DM, Katoh K (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30: 772-780.

Stramma L, Johnson GC, Sprintall J, Mohrholz V (2008). Expanding oxygen-minimum zones in the tropical oceans. Science (New York, NY) 320: 655-658.

Ultsch A, Mörchen F (2005). ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM. Technical Report Dept. of Mathematics and Computer Science. University of Marburg: Germany.

Uritskiy GV, DiRuggiero J, Taylor J (2018). MetaWRAP—a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6: 158.

USEPA (1998). Method 1630: Methylmercury in water by distillation, aqueous ethylation, purge and trap, and cold vapor atomic fluorescence: Washington, D.C.

USEPA (2002). Method 1631, Revision E: Mercury in water by oxidation, purge and trap, and cold vapor atomic fluorescence spectrometry: Washington, D.C.

Villar E, Cabrol L, Heimburger-Boavida LE (2020). Widespread microbial mercury methylation genes in the global ocean. Environ Microbiol Rep.

Visser JM, Jong GAH, Vries S, Robertson LA, Kuenen JG (2006). cbb3-type cytochrome oxidase in the obligately chemolithoautotrophic Thiobacillus sp. W5. FEMS Microbiol Lett 147: 127-132.

Wang F, Macdonald RW, Armstrong DA, Stern GA (2012). Total and methylated mercury in the Beaufort Sea: The role of local and recent organic remineralization. Environ Sci Technol 46: 11821-11828.

Wright JJ, Konwar KM, Hallam SJ (2012). Microbial ecology of expanding oxygen minimum zones. Nat Rev Microbiol 10: 381-394.

Wu Y-W, Tang Y-H, Tringe SG, Simmons BA, Singer SW (2014). MaxBin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome 2: 26.

Yarza P, Yilmaz P, Pruesse E, Glockner FO, Ludwig W, Schleifer KH et al (2014). Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences. Nat Rev Microbiol 12: 635-645.

Yergeau E, Maynard C, Sanschagrin S, Champagne J, Juck D, Lee K et al (2015). Microbial community composition, functions, and activities in the Gulf of Mexico 1 year after the deepwater horizon accident. Appl Environ Microbiol 81: 5855-5866.

Yergeau E, Michel C, Tremblay J, Niemi A, King TL, Wyglinski J et al (2017). Metagenomic survey of the taxonomic and functional microbial communities of seawater and sea ice from the Canadian Arctic. Sci Rep 7: 42242.


Figures:

Figure 1. Hg and MeHg profiles of Saanich Inlet station “S3”. (A) Concentration of total dissolved Hg. (B) Concentration of dissolved MeHg. (C) MeHg as a percentage of total dissolved Hg.


Figure 2. Maximum-likelihood phylogenetic tree of HgcA amino acid sequences (1,000 ultrafast bootstrap replicates; values >90% are shown by black dots at the nodes). HgcA sequences recovered in this study are highlighted in blue. HgcA sequences retrieved from public databases with corresponding IMG gene IDs are shown in black. HgcA paralogues from non-methylating bacteria were used as outgroups and are shown in grey. The 15 representative HgcA sequences used in this study are indicated by stars. Taxonomic classifications of the hgcA-carrying genomes are labeled in the outer circle by different colors.


Figure 3. Relative abundance of representative hgcA genes and transcripts in different samples from Saanich Inlet. Gene abundance is normalized by gene length and genome equivalent for each sample, and transcript abundances are represented as RPKM values. (A) Relative abundance of the sum of representative hgcA genes at different depths. (B) Relative abundance of different representative hgcA genes from different metagenomic datasets. Larger circles indicate a higher percentage of the whole microbial community. Different shades of blue indicate different depths from which the samples were taken. (C) RPKM values of the sum of representative hgcA transcripts at different depths. (D) RPKM values of different representative hgcA transcripts in different metatranscriptomic samples. HgcA sequences from metaproteomic samples are indicated with stars.


Figure 4. Phylogeny and global distribution of phylum Marinimicrobia. (A) Maximum-likelihood phylogenetic tree of phylum Marinimicrobia based on concatenated housekeeping genes (1,000 ultrafast bootstrap replicates; values >90% are shown at the nodes). A total of 424 Marinimicrobia genomes were used for the tree; Marinimicrobia lineages without hgcA genes were collapsed to simplify presentation (see Figure S6 for a more detailed tree). (B) Distribution of hgcA-carrying Marinimicrobia in various environments globally, shown as different colors; total numbers of BioProjects from each environment are shown in brackets.


Figure 5. Three-dimensional homology models of Marinimicrobia-HgcA and -HgcB proteins. (A) Model of full-length Marinimicrobia-HgcA, parallel to the membrane. An enlarged view of the functional domain binding to cobalamin is shown, with the interaction between cysteine and cobalt represented by a dotted line. The electrostatic surface potential of the cap-helix structure is shown. (B) Ramachandran plot of Marinimicrobia-HgcA model; 93.2%, 5.9%, and 0.8% of residues plotted in favoured, allowed, and outlier regions, respectively. (C) Model of full-length Marinimicrobia-HgcB binding with two [4Fe4S] clusters. Interactions between the protein and iron are shown by dotted lines. (D) Ramachandran plot of Marinimicrobia-HgcB model; 94.1%, 5.9%, and 0% of residues plotted in favoured, allowed, and outlier regions, respectively.


Figure 6. KEGG pathways of the hgcA-carrying MAGs. Taxonomic classifications of MAGs are represented at the bottom of the heatmap by different colors. Categories of pathways are represented at the left side of the heatmap by different colors. The color of each cell refers to the completeness of enzymes involved in each pathway.


Tables:
Table 1. Summary of 15 representative hgcA-carrying MAGsᵃ

MAGs           %Compl.ᵇ  %Conta.ᵇ  Size (Mbp)  # Contigs  %GC   16S  Taxonomy
SI034_bin137   97.58     2.992     4.94        304        40.0       Desulfobacula sp.
SI037_bin96    71.02     3.87      2.76        633        39.5       Desulfobacula sp.
SI073_bin145   87.48     5.322     4.26        657        39.2       Desulfobacula sp.
SI075_bin31    88.61     2.293     4.89        394        40.1       Desulfobacula sp.
co234_bin4     81.15     5.545     5.91        1000       50.1       Order Desulfatiglandales
SI037_bin147   89.67     1.935     3.28        289        38.6       Order
SI037_bin154   89.02     3.361     7.85        512        45.5       SAR324 group
SI047_bin2     74.15     2.941     3.63        142        36.2       SAR324 group
SI047_bin25    96.7      0         3.17        92         37.4  +    Phylum Marinimicrobia
co234_bin30    96.08     1.648     4.54        293        42.9       Phylum Calditrichaeota
SI037_bin60    83.67     3.914     8.81        2073       63.9       Class Lentisphaeria
co234_bin56    95.6      2.027     4.85        162        60.6  +    Class Kiritimatiellae
co234_bin61    98.25     0         2.67        97         47.7       Order Christensenellales
co234_bin41    72.27     1.253     4.68        1025       48.9       Order Spirochaetales
SI073_bin113   74.89     1.709     2.92        321        43.1       Nitrospina sp.

ᵃ Representative MAGs selected by 99% hgcA identity threshold. See Table S1 for all 56 hgcA-carrying MAGs.
ᵇ Completeness above 95% and contamination below 5% are shown in bold; quality of MAGs was determined by CheckM with the “lineage_wf” pipeline.
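The thresholds in footnote b of Table 1 can be applied programmatically. The sketch below copies the completeness and contamination values from Table 1; the function name is hypothetical, and requiring both thresholds at once is an assumption made for illustration, since the footnote states the two criteria separately.

```python
# (MAG, % completeness, % contamination), values as listed in Table 1
TABLE1 = [
    ("SI034_bin137", 97.58, 2.992),
    ("SI037_bin96", 71.02, 3.87),
    ("SI073_bin145", 87.48, 5.322),
    ("SI075_bin31", 88.61, 2.293),
    ("co234_bin4", 81.15, 5.545),
    ("SI037_bin147", 89.67, 1.935),
    ("SI037_bin154", 89.02, 3.361),
    ("SI047_bin2", 74.15, 2.941),
    ("SI047_bin25", 96.7, 0.0),
    ("co234_bin30", 96.08, 1.648),
    ("SI037_bin60", 83.67, 3.914),
    ("co234_bin56", 95.6, 2.027),
    ("co234_bin61", 98.25, 0.0),
    ("co234_bin41", 72.27, 1.253),
    ("SI073_bin113", 74.89, 1.709),
]


def mags_meeting_both_thresholds(rows, min_completeness=95.0, max_contamination=5.0):
    """Names of MAGs with completeness above and contamination below the cut-offs."""
    return [
        name
        for name, completeness, contamination in rows
        if completeness > min_completeness and contamination < max_contamination
    ]


if __name__ == "__main__":
    for name in mags_meeting_both_thresholds(TABLE1):
        print(name)
```

The default cut-offs mirror the footnote; both are keyword arguments so that other quality tiers can be tested against the same rows.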
