The ISME Journal (2016) 10, 1696–1705 © 2016 International Society for Microbial Ecology All rights reserved 1751-7362/16 www.nature.com/ismej ORIGINAL ARTICLE Genomic reconstruction of a novel, deeply branched sediment archaeal with pathways for acetogenesis and sulfur reduction

Kiley W Seitz1, Cassandre S Lazar2,3,4, Kai-Uwe Hinrichs3, Andreas P Teske2 and Brett J Baker1 1Department of Marine Science, University of Texas Austin, Marine Science Institute, Port Aransas, TX, USA; 2Department of Marine Sciences, University of North Carolina, Chapel Hill, NC, USA; 3MARUM Center for Marine Environmental Sciences, University of Bremen, Bremen, Germany and 4Department of Aquatic Geomicrobiology, Institute of Ecology, Friedrich Schiller University Jena, Dornburger Straße 159, Jena, Germany

Marine and estuary sediments contain a variety of uncultured whose metabolic and ecological roles are unknown. De novo assembly and binning of high-throughput metagenomic sequences from the sulfate–methane transition zone in estuary sediments resulted in the reconstruction of three partial to near-complete (2.4–3.9 Mb) genomes belonging to a previously unrecognized archaeal group. Phylogenetic analyses of ribosomal RNA genes and ribosomal proteins revealed that this group is distinct from any previously characterized archaea. For this group, found in the White Oak River estuary, and previously registered in sedimentary samples, we propose the name ‘Thorarchaeota’. The Thorarchaeota appear to be capable of acetate production from the degradation of proteins. Interestingly, they also have elemental sulfur and thiosulfate reduction genes suggesting they have an important role in intermediate sulfur cycling. The reconstruction of these genomes from a deeply branched, widespread group expands our understanding of sediment biogeochemistry and the evolutionary history of Archaea. The ISME Journal (2016) 10, 1696–1705; doi:10.1038/ismej.2015.233; published online 29 January 2016

Introduction uncovered, new roles for archaea in carbon, nitrogen and sulfur cycling may also be discovered. Although a few cultured archaeal phyla provide Estuaries are dynamic and productive environ- most of our knowledge on archaeal metabolism, the ments that contain microbial communities essential uncultured majority of the archaeal remains to global nutrient cycling (Bauer et al., 2013). essentially unaccounted for (Baker and Dick, 2013). Microorganisms in estuary sediments facilitate the Single-gene (for example, 16S ribosomal RNA turnover of carbon and the anaerobic respiration of (rRNA)) sequencing surveys have shown that uncul- sulfur and nitrogen (Oremland and Polcin, 1982). tured archaea are extremely diverse in marine and Understanding the role of microorganisms in carbon estuary sediments (Durbin and Teske, 2011; Kubo cycling in sedimentary estuarine environments is et al., 2012). Stable carbon isotopic analyses of globally important because estuaries provide a archaeal lipids and archaeal cells indicated that significant sink of atmospheric CO (Cai, 2011). archaea assimilate buried organic carbon of photo- 2 In this study, we are focusing on the estuarine synthetic origin in deep-sea sediments, suggesting a sediments and microbial communities of the White heterotrophic lifestyle (Biddle et al., 2006). Recently, Oak River (WOR) in North Carolina, a typical black- partial genomes of two widespread sediment archaea water river traversing coastal woodlands and swamps. groups (Marine Benthic Group D and Miscellaneous The tidally influenced WOR estuary has served as a Crenarchaeaotal Group) revealed they are able to model system for geochemical and microbiological degrade detrital proteins as a source of carbon (Lloyd studies of anaerobic carbon cycling, especially metha- et al., 2013; Meng et al., 2014; Seyler et al., 2014). nogenesis and methane oxidation (Martens and However with new uncultured archaeal groups being Goldhaber, 1978; Kelley et al., 1990, 1995; Lloyd et al., 2011). Surveys of archaeal diversity of the WOR Correspondence: BJ Baker, Department of Marine Science, Uni- estuary sediment have revealed numerous uncultured versity of Texas Austin, Marine Science Institute, 750 Channel archaea commonly seen in sediments throughout the View Dr, Port Aransas, TX 78373, USA. E-mail: [email protected] world including Miscellaneous Crenarchaeotal Group, Received 20 April 2015; revised 4 November 2015; accepted and Marine Benthic Groups B and D (Lloyd et al., 10 November 2015; published online 29 January 2016 2011; Kubo et al., 2012; Lazar et al., 2014). The Pathways for acetogenesis and sulfur reduction KW Seitz et al 1697 distinct redox zonation and high archaeal diversity of CA, USA) using KAPA-Illumina library creation kit the WOR sediments make this site ideal for the (KAPA Biosystems, Wilmington, MA, USA). environmentally contextualized reconstruction of novel genomes that will lend to better understanding of the metabolic and environmental potential of Genomic assembly, binning and annotation uncultured archaea. Illumina (HiSeq) shotgun genomic reads were De novo assembly and binning of random shotgun screened against Illumina artifacts (adapters and genomic libraries has proven to be an extremely DNA spike-ins) with a sliding window with a kmer powerful tool in reconstructing genomes of uncul- size of 28 and a step size of 1. Reads with 3 or more tured groups from the environment (Baker et al., N’s or with average quality score of less than Q20 2010; Castelle et al., 2015) and determining their and a length o50 bps were removed. Screened reads physiologies (Wrighton et al., 2012; Castelle et al., were trimmed from both ends using a minimum 2013, 2015; Hug et al., 2013). Advances in metage- quality cutoff of 5 using Sickle (https://github.com/ nomic assembly and binning techniques have najoshi/sickle). Trimmed, screened, paired-end enabled us to obtain genomes directly from the Illumina reads were assembled using IDBA-UD environment (Baker and Dick, 2013). In this study, (Peng et al., 2012) with the following parameters we used this approach to reconstruct three unique, (–pre_correction –mink 55 –maxk 95 –step 10 partial and near-complete genomes belonging to a –seed_kmer 55). To maximize assembly reads from deeply branched widespread and newly identified different sites were coassembled. phylum of archaea, for which we propose the name The SMTZ assembly was generated from a ‘Thorarchaeota’; the reconstructed genomes provide combination of reads (698,574,240, average read a basis for inferring the evolutionary history and length 143 bp and average insert 274 bp) from site 2 physiological potential of these sedimentary archaea. (30–32 cm) and 3 (24–28 cm). The methane- producing zone assembly was generated from high- quality reads (378,027,948, average read length Materials and methods 124 bp and average insert 284 bp) of site 1 (52–54 cm). As we were not able to coassemble all Sample collection and processing three of the SMTZ samples after running into Six 1-m plunger cores were collected from ~ 1.5 m computational limits at 1 TB RAM memory, one of water depth in three mid-estuary locations (two the samples (site 1, 26–30 cm) was assembled cores per site) of the White Oak River, North Carolina separately from 345,710,832 reads (average length in October 2010 (site 1 at 34° 44.592ʹ N, 77° 07.435 ʹ W; of 129 bp and average insert 281 bp). The contigs site 2 at 34° 44.482ʹ N, 77° 07.404ʹ W, and site 3 at from this SMTZ sample were co-binned with the 34° 44.141ʹ N, 77° 07298ʹ W). Cores were stored at assembly of the other two SMTZ samples from sites 2 4 °C overnight and processed 24 h after sampling and 3, Contigs of genes of particular interest were (Lazar et al., 2014). Each core was sectioned into checked for chimeras by looking for dips in coverage 2-cm intervals. From each site one core was within read mappings. subsampled for geochemical analyses and the other Initial binning of the assembled fragments was core was subsampled for DNA extractions. A geo- done using tetra-nucleotide frequency signatures chemical analysis of sulfate, sulfide and methane using 5 kb fragments of the contigs (Dick et al., profiles of these cores indicated a distinct sulfur 2009). The Emergent Self-organizing Maps were and methane-cycling zonation at each site. The manually delineated and curated based on clusters sulfate-reducing zone started in sediment layers within the map (as shown in Supplementary Figure where sulfate was abundant and sulfide just started S1). This binning was enhanced by incorporating to accumulate (8–12 cm). The sulfate–methane transi- coverage signatures for all of the assembled contigs tion zone (SMTZ) where porewater sulfate and into the Emergent Self-organizing Maps maps methane overlapped, was located and sampled at (Dick et al., 2009). Coverage was determined by around 26, 24 and 26 cm in sites 1, 2 and 3, recruiting reads (from each individual library/sam- respectively. The methane-producing zone was char- ple) to scaffolds by BLASTN (bitscore 475), which acterized by accumulating methane; a single sample was then normalized to the number of reads from from site 1 of this zone was taken at 52–54 cm (Lazar each library. Contigs from both assemblies were et al., 2014). DNA was extracted from these sediment binned together, which resulted in the co-binning of depths using the UltraClean Mega Soil DNA Isolation SMTZ1-45 and SMTZ-45. These genomes were then Kit (MoBio, Carlsbad, CA, USA), using 6 g of sediment, separated based on their site of origin. The accuracy andstoredat− 80 °C until use. A total of 100 ng of of the binning was then assessed by checking the DNA was sheared to 270 bp using a focused ultra- genomic bins to which each of the 5 kb sub-portions sonicator (Covaris, Woburn, MA, USA). The sheared of the contigs were assigned. If the contig was DNA fragments were size selected using SPRI beads 415 kb then the contig was assigned to the bin that (Beckman Coulter, Brea, CA, USA). The selected contained the majority of its 5 kb sub-portions. Some fragments were then end-repaired, A-tailed and ligated of the fragments were identified as contaminants of Illumina-compatible adapters (IDT, Inc., San Jose, based on differential coverage, GC content,

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1698 phylogenetic placement and the presence of dupli- Results and Discussion cate genes and were removed if necessary. Closely related variants of a lineage were retained in a bin. Genomic reconstruction of a novel, deeply branched Binning was also manually curated based on GC archaeal lineage content, top blast hits and mate pairings. The Metagenomic assembly and binning of 262 Gb of completeness, contamination and strain heterogene- sequence data generated over 150 genomic bins from ity of the genomes within bins was then estimated by three sites, and distinct redox regimes, in the WOR using CheckM (Parks et al., 2015). estuary sediment (Baker et al., 2015). Large-scale Genes were called and putative function was microbial community analysis and single-cell geno- assigned using the JGI IMG/MER system (Markowitz mic reconstruction was preformed on the data et al., 2012). The CAZy server (Marseille, France) (e- generated from these sites (Lazar et al., 2014; Baker value cutoff of o1e − 5) was used to identify all et al., 2015). To begin to resolve the metabolic carbohydrate-active genes (Lombard et al.,2014);a capabilities of uncultured and genomically subset of these have been shown to be involved in unsampled phyla we generated a phylogenetic tree hydrolysis of extracellular carbohydrates (Wrighton of 16S rRNA genes from all of the genomic bins et al., 2014). The full metagenomic assemblies pre- (Figure 1). This analysis revealed a 3.31 Mb bin from sented in this study are available in IMG with the site 1 (SMTZ1-83) belonging to a deeply branched, following IMG Taxon IDs: 3300002052 for the SMTZ previously undescribed archaeal group (Raes et al., site1assembly(‘SM1’ genomes) and 3300001753 for 2007). The genome is estimated to be 90.3% SMTZ sites 2 and 3 assembly (‘SM23’ genomes). The complete (Parks et al., 2015), with single copies of genomic bins supporting the results of this article are 137 of the 150 markers, multiple copies of eight other being made available in NCBI Genbank under markers and minimal contamination (Table 1). Com- the BioProjectID PRJNA270657. The genomes parison between genomes showed that 58% of the described in this study have been deposited in NCBI genes are shared between all three bins and 84% are under BioProject PRJNA270657—SAMN03998758, shared among two of them. SAMN03998759 and SAMN03998760 for SMTZ1- Due to the high conservation of 16S rRNA genes, 83, SMTZ1-45 and SMTZ-45 respectively. these genes are often fragmented in short-read genomic libraries, and absent from bins (Miller Phylogenetic analyses et al., 2011). Therefore, in order to identify addi- The concatenated ribosomal protein tree was gener- tional bins belonging to this group we generated ated using 16 syntenic genes that have been shown to concatenated ribosomal protein trees from all of the undergo limited lateral gene transfer (rpL2, 3, 4, 5, 6, genomic bins. This revealed an additional bin, which 14, 15, 16, 18, 22, 24 and rpS3, 8, 10, 17, 19; Sorek contained fragments from two assemblies. These et al., 2007). The reference data sets were derived contigs from the two assemblies were separated into from the Phylosift database (Darling et al., 2014), two bins, SMTZ1-45 with 3.87 Mb (from site 1) and with additional sets from the Joint Genomic Institute SMTZ-45 with 2.37 Mb (from sites 2 and 3). IMG database (Castelle et al., 2013). The presence of The 16S rRNA gene from bin SMTZ1-83 clustered all 16 genes was not seen in the majority of with sequences recovered from a variety of environ- sequences. The number of genes missing varied; ments including mangroves (Pires et al., 2012), however, the majority of loci present were consistent freshwater Lake Pontchartrain (Amaral-Zettler throughout the bins. Scaffolds containing o50% of et al., 2008) and hydrothermal marine sediments the selected syntenic ribosomal protein genes were (Yanagawa et al., 2013a), suggesting that this group not included in the analyses. We searched NCBI to of organisms is broadly distributed in aquatic include reference amino-acid sequences for phylo- sediments. Samples collected at these sites revealed genetic analyses. Amino-acid alignments of the that sulfate reduction and methanogenic archaea individual genes were generated using MUSCLE dominate the sediments profiles similar to the WOR (Edgar, 2004). Alignments were trimmed for poorly esturary. They have also been identified in hydro- aligned regions and gaps using the BMGE tool thermal vents in the Okinawa Trough (Pires et al., (Criscuolo and Gribaldo, 2010; with the following 2012; Moyer et al., 2013; Yanagawa et al., 2013b). To settings: -m BLOSUM30 –g 0.5) and concatenated our knowledge, 16S rRNA sequences for these before inferences methods. The curated alignments archaea were not identified from other estuaries. were then concatenated for phylogenetic analyses. Rank abundance plots of the SMTZ communities The ribosomal protein tree included 71 taxa and revealed that these organisms are rare members of 2295 unambiguously aligned positions. The trees the WOR sediment community (Supplementary shown in all the figures were generated using Figure S2). maximum likelihood using RAxML (GTRGAMMA Before this study this phylogenetic lineage had not for 16S rRNA gene tree and PROTGAMMA for even been given a candidate phylum designation r-protein rate distribution models; Stamatakis, (Yanagawa et al., 2013a), although both 16S rRNA 2014). Bootstrap values were generated using IQ- gene phylogenies and the ribosomal protein trees Tree with the C60+LG mixture model for 10 000 ultra revealed that these genomes are distinct from all rapid bootstraps (Nguyen et al., 2015). previously described archaeal phyla. The lineage

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1699

Crenarchaeota

Korarchaeota

Terrestrial Hot Spring Gp (THSCG) Bathyarchaeota Caldiarchaeum subterraneum, AP011786 Aigarchaeota

Euryarchaeota

Nanoarchaeota Woesearchaeota Pacearchaeota Parvarchaeum acidiphilum Parvarchaeota 3 Diapherotrites Micrarchaeota MSP41 DUSEL2_SAG_AAA011_O165 pMC2A209 mangrove sediment clone LAGU_DUEOP, JN875527 Okinawa Trough sediment clone Exp331_INH_23U_82 , AB825742 SMTZ-1 bin 83 sinkhole clone Z17M89U, FJ484633 Lake Pontchartrain sediment clone 060329_T2S2, FJ351268 Lake Charles rawsewage clone Gctb_ML_097, 867, FJ353523 Thorarchaeota Lake Pontchartrain sediment clone 060329_T2S3, FJ352077 Lake Pontchartrain sediment clone 060329_T2S1, FJ351238 Okinawa Trough sediment clone T_35, KC471281 Okinana Trough sediment clone T_36, KC471282 Okinawa Trough sediment clone Exp331_INH_66U_01, AB825877 Okinawa Tropugh sediment clone Exp331_INH_50A_49, AB825828 Ancient Archaeal Group(AAG) Marine Hydrothermal Vent Group 2 (MHVG−2) GoC−Arc−109−D0−C1−M0 0.1

Figure 1 Phylogenetic position of the full-length 16S rRNA gene from the SMTZ1-83 genomic bin within the Archaea. The other two bins contain 16S rRNA gene o900 bp, thus were not included in this analysis. Bin SMTZ1-45 contains 682 bp of a 16S rRNA gene. The tree was generated using RAxML in the ARB software package (Ludwig et al., 2004). Closed and open circles represent bootstraps 480% and 460% (respectively) generated using IQtree (Nguyen et al., 2015). Closed circles represent bootstrap values of 470%. Giardia lamblia was used as the outgroup.

Table 1 General characteristics of the genomes reconstructed in distribution of taxonomic top hits to their genes this study. Genome completeness was evaluated using CheckM (Parks et al., 2015) between Archaea and Bacteria (Figure 3). Thus we propose that this group of archaea from the WOR Bin ID SMTZ1-83 SMTZ1-45 SMTZ-45 sediments be given a distinct name that reflects its antiquity and sibling group relationship with the Number of markers evaluated 150 150 150 Lokiarchaeota, ‘Thorarchaeota’. Numbers of markers identified 137 131 97

Copies of individual markers 0131953 1 129 112 91 Carbon metabolism 2 8 18 6 Sediments receive a variety of forms of detrital 3010organic matter from the overlying water column, Completeness 90.28 87.37 69.66 which provide carbon and nitrogen to the benthic Contamination 6.55 5.24 4.70 microbial communities. Consistent with recently Strain heterogeneity 12.5 11.11 0.00 GC (%) 49 42 43 obtained benthic archaeal genomes (Lloyd et al., Total length of bin (bp) 3,318,734 3,555,063 2,323,851 2013), Thorarchaeota have the genomic potential Longest scaffold (bp) 71,522 179,128 96,443 to degrade peptides (Figure 4; Supplementary Number of scaffolds 269 121 201 Table S1). Genes for protein degradation and Average scaffold length 12,337 29,381 11,561 assimilation, including clostripain (cloSI) and gingi- Abbreviation: SMTZ, sulfate–methane transition zone. pain (rgpA), were identified along with several other extracellular peptidases. Genes encoding a complete branched amino-acid importer (livKHMGF) were branched deeply within the TACK superphylum found in SMTZ1-83 and SMTZ1-45. Partial dipep- (Figure 1; Guy and Ettema, 2011) and shared a root tide (dppABCF) and oligopeptide (oppABCDF) with the newly described Lokiarchaeota (Spang importers were detected in all Thorarchaeota gen- et al., 2015) in the concatenated ribosomal protein omes, indicating that transport of peptides into the gene tree (Figure 2). Furthermore, the novelty and cell could be a common capability of these archaea. deep branching is reflected in a fairly even Numerous endopeptidase genes including pepT,

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1700 Aigarchaeota

Thaumarchaeota

TACK Korarchaeum cryptofilum OPF8 Geoarchaeota

Crenarchaeota

Euryarchaeota

Woesearchaeota

Pacearchaeota

Nanoarchaeum equitans_Kin4_M Parvarchaeum acidophilus ARMAN5 Diapherotrites Aenigmarchaeota_AR5_gwa2_scaffold_521 DPANN Nanohaloarchaeota Micrarchaeum acidiphilum ARMAN2 Altiarchaeota

Lokiarchaeaota Loki WOR bin SMTZ1-83 WOR bin SMTZ-45 Thoarchaeota 0.10

Figure 2 Phylogenetic analysis of 16 concatenated ribosomal proteins (rpL2, 3, 4, 5, 6, 14, 15, 16, 18, 22, 24 and rpS3, 8, 10, 17, 19) generated using RAxML in the ARB software package. Closed and open circles represent bootstraps 480% and 460% (respectively) generated using IQtree (Nguyen et al., 2015).

Figure 3 Domain-level taxonomic distribution of top hits to genes in the three Thorarchaeota genomes, based on nucleotide comparisons (BLASTn) of the genes with those present in Genbank (NCBI, nt database).

pepA and pepB as well as multiple aminotransferase genes were identified in SMTZ1-83, SMTZ-45 and genes from families I through V (aspB, argD, glmS, SMTZ1-45 (Supplementary Tables S2 and S3). These puuE and serC) known to assist with the degradation enzymes require as electron acceptor oxidized of imported peptides to keto-acids were identified in ferredoxin, which can be supplied by Ni–Fe hydro- all three bins of the Thorarchaeota. The abundance genases (Sapra et al., 2003). Genes coding for several and variety of these peptidases in the Archaea Ni–Fe hydrogenases were found in the thorarchaeal suggest that proteins and peptides function as a genomes. A Phylip PROML phylogenetic tree of likely carbon source. known Ni–Fe hydrogenases showed these genes The Thorarchaeota genomic bins also encode grouped closely with several subunits of the putative oxidoreductases including pyruvate:ferre- methanogenesis-related F420-reducing and non- doxin oxidoreductase (porABDG), indolepyruvate reducing dehydrogenases (Supplementary Figure S3). ferredoxin oxidoreductase (iorAB) and aldehyde The lack of other genes required for methanogenesis oxidoreductase (aor) that transform the keto-acids suggests that these Ni–Fe hydrogenases are likely into acetyl-CoA and organic acids such as acetate performing non-methanogenic functions in the (Lloyd et al., 2013). Multiple copies of all these Thorarchaeota.

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1701

Figure 4 Metabolic reconstruction of the Thorarchaeaota genomic bin (SMTZ1-83) based on genes identified using the KEGG database. Genes for various metabolisms, importers and hydrogenases are represented. White boxes represent the genes that are present in the genome and gray boxes represent genes that are absent. Details about the genes numbered are provided in Supplementary Table S2.

Carbohydrates are another possible carbon source All three genomes contain the genes necessary to for heterotrophic microorganisms. Genomes were convert pyruvate to acetyl-CoA including both the E1 searched for pathways involved in carbohydrate (PDHA) and E2 (PDHB) units of pyruvate dehydro- degradation using the CAZy server (Lombard et al., genase and all the subunits of pyruvate ferredoxin 2014). A total of 81, 72 and 72 carbohydrate-active oxioreductase (porABDG). Although the majority of enzymes were found in SMTZ-45, SMTZ1-45 and enzymes that are necessary for the citrate acid cycle SMTZ1-83 respectively. However only a small were not present, a near-complete succinate hydro- percentage (between 8.3% and 12.3%) of these genase complex was present in all three genomes. enzymes could be assigned specific roles in The genomic evidence suggests that terminal oxida- carbohydrate degradation, mostly as cellulases and tion via the citric acid cycle is not a major energy alpha-mannosidases (Supplementary Table S1). source for the Thorarchaeota. The limited quantity and constrained roles of The genomes all possessed the ADP-forming subunit carbohydrate degradation genes suggest that these of acetyl-CoA synthetase that has been seen to catalyze archaea use carbohydrates selectively as carbon acetate production in Pyrococcus furiosus and other sources. The identification of an apparent complete archaea (Glasemacher et al., 1997). The presence of peptide degradation pathway leads us to suggest that this enzyme suggests that acetyl-CoA could be con- Thorarchaeota may prefer the heterotrophic degra- verted to acetate; however genes for ethanol fermenta- dation of proteins over carbohydrates. tion did not appear to be present. Therefore, we believe The assembled genomes show the capability for that Thorarchaeota are likely involved in the produc- with all the required genes except tion of acetate through the hydrolytic cleavage of , present in at least one of the three acetyl-CoA. As acetate is a key substrate for sulfate genomes. Thorarchaeota appear to have the ability to reduction and methanogenesis in estuary sediments convert glucose-6-phosphate partially or completely (Oremland and Polcin, 1982) and appears to be an to phosphoenol pyruvate (Figure 4). important link between fermentation and respiratory was not identified in any of the Thorarchaeota metabolisms (Wrighton et al., 2014), the ability of and instead pyruvate appears to be generated using Thorarchaea to generate acetate could greatly influ- phosphoenolpyruvate synthase (Bräsen et al., 2014). ence the terminal anaerobic degradation cascade.

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1702 The Wood–Ljungdahl pathway enables auto- monophyletic with those sequences from P. furiosus trophic carbon fixation by converting extracellular (Supplementary Figure S3). Sequences that code for

two CO2 molecules first to acetyl-CoA and then to the reactive gamma subunit grouped closest to acetate. The first CO2 is converted to a methyl hypothetical proteins from the bacterial phyla group through sequential transfer of six electrons. Aminicenantes and Cloacimonetes (Rinke et al., 2013) The methyl group is then transferred to convert a and formed a monophyletic group with genes second CO2 to acetyl-CoA (Ragsdale and Pierce, isolated P. furiosus. The beta sulfhydrogenases 2008). Although SMTZ-45 did not show any genes grouped with a beta subunit of the same gene from for enzymes in this pathway, near-complete gene P. furiosus and a hydrogenase/sulfur reductase sets were found in the SMTZ1-45 and SMTZ1-83 isolated from Pelodictyon phaeoclathratiforme, bins. The CO2-reducing enzymes lacked formate a member of the bacterial phylum Chlorobia dehydrogenase for the initial CO2 reduction step, (Supplementary Figure S3; Lucas et al., 2008). Various and methylene-tetrahydrofolate reductase (metF), other Ni–Fe hydrogenases were also identified, which converts 5,10-methylene-tetrahydrofolate to however only the delta subunit of the F420-reducing 5-methyl-tetrahydrofolate. Genes for acetyl-CoA hydrogenase is seen in all three genomes. As the synthesis were nearly complete except for the acsE Ni–Fe hydrogenases seen in these three archaeal gene, which codes for the methyltransferase that genomes show close affiliations to the sulfhydrogen- transfers the methyl group from methyl- ase subunits of P. furiosus, the Thorarchaeaota are tetrahydroformate to a corrinoid iron/sulfur protein, most likely capable of sulfur reduction. an intermediate methyl carrier; from here, the These genomes were isolated from the SMTZ methyl group is accepted by the key enzyme of the where sulfate is almost completely depleted by pathway, CO-dehydrogenase/acetyl-CoA synthase, sulfate reduction and sulfide concentrations are to produce acetyl-CoA and acetate (Ragsdale and highest (Lazar et al., 2014). Of these genomic bins Pierce, 2008). If the missing genes are accounted for SMTZ-45 has the most well-defined sulfur meta- by the incomplete sequence coverage of bins bolism with genes for a sulfate/thiosulfate importer, SMTZ1-45 and SMTZ1-83, these Thorarchaeaota a thiosulfate reductase electron transporter (phsB) may be able to use the Wood–Ljungdahl pathway and the sulfhydrogenase. SMTZ1-83 only appears to of carbon fixation and support themselves by have the sulfhydrogenase, whereas SMTZ1-45 has autotrophic acetogenesis, depending on the need. both the sulfhydrogenase and a thiosulfate reductase, However, it is possible that the pathway is in fact suggesting that these archaea might also be mediat- incomplete in these archaea. The presence of this ing the transformation of intermediate sulfur

pathway is not direct evidence of autotrophy, as CO2 compounds (Supplementary Table S3). For both could serve primarily as an acceptor for electrons SMTZ-45 and SMTZ1-45 the closest NCBI hits to derived from a wide range of fermentative reactions the thiosulfate reductase included a 4Fe–4S ferre- within the organism, rather than an autotrophic doxin isolated from Desulfobacterium autotrophi- carbon source that sustains the biosynthetic needs of cum (Brysch et al., 1987), an Enterobacteriacea the cell (Ragsdale and Pierce, 2008). thiosulfate reductase electron transporter (phsB) (Brenner, 1983) and the polysulfide reductase subunit B of Citrobacter freundii (Sakazaki, 1984). The possible role of Thorarchaeota in sulfur cycling The phs operon codes for a molybdopterin cofactor The microbial remineralization of organic matter in that uses cysteine residues to bind and break the sediments is often coupled to anaerobic respiration di-sulfur bonds (Hinsley and Berks, 2002). Homologs of sulfate and nitrate/nitrite reduction in sediments for these cysteine residues and for the iron (Jørgensen, 1982; Kostka et al., 2002). As archaea sulfur clusters responsible for electron transfer have participate extensively in the cycling of nitrogen and been identified on the phsB gene (Heinzinger et al., sulfur (Canfield and Raiswell, 1999; Cabello et al., 1995). Although this gene has been seen in a few 2004), and previous 16S rRNA detection of the non-thiosulfate-reducing organisms, all BLAST hits Thorarchaoeta in anaerobic aquatic sediments is to our sequences had functional enzymes (Park et al., broadly consistent with these geochemical roles, 2012). Thiosulfate reductase preferentially utilizes we searched the Thorachaeota genomic bins for the thiosulfate as an electron acceptor. Thiosulfate is a presence of relevant genes. All three of these dominant sulfur oxidation product in marine sedi- archaeal bins have homologs to sulfohydrogenase, ments (Jørgensen, 1990) and can be used as a which has been shown to be involved in the substrate for both oxidation and reduction; micro- reduction of elemental sulfur in the archaeon organisms that reduce this important intermediate in P. furiosus (Ma et al., 1993). We found that the the would be imperative to sulfur Thorarchaeaota bins contain a variety of hydroge- transformation in anoxic environments (Stoffels nase genes (9–15 per bin). In an attempt to resolve et al., 2012). The presence of genes involved in both the functions of these proteins we compared them elemental sulfur and thiosulfate reduction suggests with other phylogenetically related proteins. All that these archaea may be key players in the three genomic bins have a single copy of the alpha, reduction of intermediate sulfur species at the SMTZ beta, gamma and delta subunits (hydABGD) that are redox layer.

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1703 Conclusions Baker BJ, Dick GJ. (2013). Omic approaches in microbial ecology: charting the unknown. Microbe 8: 353–359. Unlike many other methods for environmental Baker BJ, Lazar CS, Teske AP, Dick GJ. (2015). Genomic community analysis, the ability to reconstruct resolution of linkages in carbon, nitrogen, and sulfur unique genomes allows for insight into the ecological cycling among widespread estuary sediment bacteria. roles microbes have in the environment. Genomic Microbiome 3: 14. reconstruction of Thorarchaeota, a widespread sedi- Bauer JE, Cai W-J, Raymond PA, Bianchi TS, Hopkinson CS, Regnier PAG. (2013). The changing ment group from the WOR estuary, has resolved the – metabolic potential and previously unknown niches of the coastal ocean. Nature 504:61 70. for these archaea, including their apparent ability to Biddle JF, Lipp JS, Lever MA, Lloyd KG, Sørensen KB, Anderson R et al. (2006). Heterotrophic Archaea degrade organic matter, fix inorganic carbon, and dominate sedimentary subsurface ecosystems off Peru. reduce sulfur. They may be having an important role Proc Natl Acad Sci USA 103: 3846–3851. in the reduction of intermediate sulfur compounds Brasen C, Esser D, Rauch B, Siebers B. (2014). generated by the oxidation of sulfide in the SMTZ Carbohydrate metabolism in archaea: current insights (Lazar et al., 2014). Furthermore, these first genomic into unusual enzymes and pathways and their regula- glimpses at the Thorarchaeota provide strong evi- tion. Microbiol Mol Biol Rev 78:89–175. dence that they are capable of acetogenesis. Based on Brenner DJ. (1983). Opposition to the proposal to replace the family name Enterobacteriaceae. Int J Syst Bacteriol energetic considerations it has been hypothesized – that acetogens generally cannot outcompete metha- 33: 892 895. nogens, except in oligotrophic marine settings and Brysch K, Schneider C, Fuchs G, Widdel F. (1987). Lithoautotrophic growth of sulfate-reducing bacteria, the deep subsurface (Lever, 2012; Oren, 2012); and description of Desulfobacterium autotrophicum targeted habitat studies should reveal whether gen. nov., sp. nov. Arch Microbiol 148: 264–274. Thorarchaeota match this prediction. Incorporation Cabello P, Roldan MD, Moreno-Vivián C. (2004). of this new knowledge into future models will result Nitrate reduction and the nitrogen cycle in archaea. in a more accurate representation of benthic biogeo- Microbiology 150: 3527–3546. chemical cycling. The acquisition of additional Cai W-J. (2011). Estuarine and coastal ocean carbon thorarchaeotal genomes, and matching gene expres- paradox: CO2 sinks or sites of terrestrial carbon – sion studies, from other sediment environments will incineration? Annu Rev Mar Sci 3: 123 145. Canfield DE, Raiswell R. (1999). The evolution of the provide a better understanding of this previously – overlooked archaeal phylum. Given the deep branch- sulfur cycle. American J Sci 299: 697 723. Castelle CJ, Hug LA, Wrighton KC, Thomas BC, ing of the Thorarchaeota within the archaeal domain Williams KH, Dongying W et al. (2013). Extraordinary and its relationship to Lokiarchaeota, future evolu- phylogenetic diversity and metabolic versatility in tionary investigations of this phylum will provide aquifer sediment. Nat Commun 4: 2120. insights into the ancestral relationships between the Castelle CJ, Wrighton KC, Thomas BC, Hug LA, Brown CT, archaea and eukaryotes. Wilkins MJ et al. (2015). Genomic expansion of domain Archaea highlights roles for organisms from new phyla Conflict of Interest in anaerobic carbon cycling. Current Biol 25: 690–701. Criscuolo A, Gribaldo S. (2010). BMGE (Block Mapping The authors declare no conflict of interest. and Gathering with Entropy): a new software for selection of phylogenetic informative regions from Acknowledgements multiple sequence alignments. BMC Evolutionary Biol 10: 210. The sequencing at JGI was funded by the U.S. Department Darling AE, Jospin G, Lowe E, Matsen FA, Bik HM, of Energy Joint Genome Institute is supported by the Office Eisen JA. (2014). Phylosift phylogenetic analysis of of Science of the U.S. Department of Energy under Contract genomes and metagenomes. PeerJ 2: e243. No. DE-AC02-05CH11231 to BJB. Cassandre Lazar as well Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC, as sampling in the White Oak River estuary have been Yelton AP et al. (2009). Community-wide analysis of supported by the European Research Council ‘DARCLIFE’ genome sequence signatures. Gen Biol 10: R85. grant 247153 to K-U Hinrichs. AT was in part supported Durbin AM, Teske A. (2011). Microbial diversity and by the Center for Dark Energy Biosphere Investigations stratification of South Pacific abyssal marine sedi- (C-DEBI). We thank Dr Thijs JG Ettema for the naming of ments. Environ Microbiol 12: 3219–3234. Thorarchaeota. Edgar R. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1791. References Glasemacher J, Bock AK, Schmid R, Schönheit P. (1997). Purification and properties of acetyl-coa synthetase Amaral-Zettler LA, Rocca JD, Lamontagne MG, (adp-forming), an archaeal enzyme of acetate formation Dennett MR, Gast RJ. (2008). Changes in micro- and ATP synthesis, from the hyperthermophile bial community structure in the wake of Hurricanes Pyrococcus furiosus. Eur J Biochem 244: 561–567. Katrina and Rita. Environ Sci Technol 42: Guy L, Ettema TJ. (2011). The archaeal ‘TACK’ super- 9072–9078. phylum and the origins of eukaryotes. Trends Baker BJ, Comolli LR, Dick GJ, Hauser LJ, Hyatt D, Dill BD Microbiol 19: 580–587. et al. (2010). Enigmatic, ultrasmall, uncultivated Heinzinger NK, Fujimoto SY, Clark MA, Moreno MS, Archaea. Proc Natl Acad Sci USA 107: 8806–8811. Barrett EL. (1995). Sequence analysis of the phs operon

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1704 in Salmonella typhimurium and the contribution of Markowitz VM, Chen IM, Palaniappan K, Chu K, Szeto E, thiosulfate reduction to anaerobic energy metabolism. Grechkin Y et al. (2012). IMG, the integrated microbial J Bacteriol 177: 2813–2820. genomes database and comparative analysis system. Hinsley AP, Berks BC. (2002). Specificity of respiratory Nucleic Acids Res 40: D115–D122. pathways involved in the reduction of sulfur com- Martens CS, Goldhaber MB. (1978). Early diagenesis in pounds by Salmonella enterica. Microbiology 148: transitional sedimentary environments of the White 3631–3638. Oak River Estuary, North Carolina. Limnol Oceanogr Hug LA, Castelle CJ, Wrighton KC, Thomas BC, Sharon I, 23: 428–441. Frischkorn KR et al. (2013). Community genomic Meng J, Xu J, Qin D, He Y, Xiao X, Wang F. (2014). Genetic analyses constrain the distribution of metabolic traits and functional properties of uncultivated MCG archaea across the Chloroflexi phylum and indicate roles in assessed by metagenome and gene expression ana- sediment carbon cycling. Microbiome 1: 22. lyses. ISME J 8: 650–659. Jørgensen BB. (1982). Mineralization of organic matter in Miller CS, Baker BJ, Thomas BC, Singer S, Banfield JF. the sea bed – the role of sulphate reduction. Nature (2011). EMIRGE: reconstruction of full-length ribosomal 296: 643–645. genes from microbial community short read sequen- Jørgensen BB. (1990). A thiosulfate shunt in the cing data. Gen Biol 12: R44. sulfur cycle of marine sediments. Science 249: Moyer CL, Birrien JL, Aoike K, Sunamura M, Urabe T, – 152 154. Mottl MJ et al. (2013). The first microbiological Kelley CA, Martens CS, Chanton JP. (1990). Variations in contamination assessment by deep-sea drilling and sedimentary carbon remineralization rates in the White coring by the D/V Chikyu at the Iheya North hydro- Oak River Estuary, North Carolina. Limnol Oceanogr thermal field in the Mid-Okinawa Trough (IODP – 35: 372 383. Expedition 331). Front Microbiol 4: 327. Kelley CA, Martens CS, Ussler W III. (1995). Methane Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. dynamics across a tidally flooded riverbank margin. – (2015). IQ-TREE: a fast and effective stochastic algo- Limnol Oceanogr 40: 1112 1129. rithm for estimating maximum-likelihood phylogenies. Kostka JE, Roychoudhury A, Van Cappellen P. (2002). Mol Biol Evol 32: 268–274. Rates and controls of anaerobic respiration across Oremland RS, Polcin S. (1982). Methanogenesis and spatial and temporal gradients in saltmarch sediments. – sulfate reduction: competitive and noncompetitive Biogeochemistry 60:49 76. substrates in estuarine sediments. Appl Environ Kubo K, Lloyd KG, Biddle JF, Teske A, Amann R, Knittel K. Microbiol 44: 1270–1276. (2012). Archaea of the Miscellaneous Crenarchaeotal Oren A. (2012). There must be an acetogen somewhere. Group are abundant, diverse and widespread in marine Front Microbiol 3: 22. sediments. ISME J 6: 1949–1965. Park C-I, Choib SY, Kimb WS, Ohb MJ, Kimb EH, Kimb HY Lazar CS, Biddle JF, Meador TB, Blair N, Hinrichs KU, et al. (2012). Characterization of the unusual non- Teske AP. (2014). Environmental controls on thiosulfate-reducing Edwardsiella tarda isolated from intragroup diversity of the uncultured benthic archaea eel (Anguilla japonica) farms. Aquaculture 356–357: of the miscellaneous Crenarchaeotal group lineage – naturally enriched in anoxic sediments of the White 415 419. Oak River estuary (North Carolina, USA). Environ Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Microbiol 17: 2228–2238. Tyson GW. (2015). CheckM: assessing the quality Lever MA. (2012). Acetogenesis in the energy-starved deep of microbial genomes recovered from isolates, – single cells, and metagenomes. Gen Res 25: biosphere a paradox? Front Microbiol 2: 284. – Lloyd KG, Alperin MJ, Teske A. (2011). Environmental 1043 1055. evidence for net methane production and oxidation in Peng Y, Leung HCM, Yiu SM, Chin FYL. (2012). IDBA-UD: putative ANaerobic MEthanotrophic (ANME) archaea. a de novo assembler for single-cell and meta- Environ Microbiol 13: 2548–2564. genomic sequencing data with highly uneven depth. – Lloyd KG, Schreiber L, Petersen DG, Kjeldsen KU, Bioinformatics 28: 1420 1428. Lever MA, Steen AD et al. (2013). Predominant archaea Pires AC, Cleary DF, Almeida A, Cunha A, Dealtry S, in marine sediments degrade detrital proteins. Nature Mendonca-Hagler LC et al. (2012). Denaturing gradient 496: 215–218. gel electrophoresis and barcoded pyrosequencing Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, reveal unprecedented archaeal diversity in mangrove Henrissat B. (2014). The carbohydrate-active enzymes sediment and rhizosphere samples. Appl Environ database (CAZy) in 2013. Nucleic Acids Res 42: Microbiol 78: 5520–5528. 490–495. Raes J, Korbel JO, Lercher MJ, von Mering C, Bork P. (2007). Lucas S, Copeland A, Lapidus A, Glavina del Rio T, Prediction of effective genome size in metagenomic Dalin E, Tice H et al. (2008). Complete sequence of samples. Genome Biol 8: R10. Pelodictyon phaeoclathratiforme BU-1. EMBL/Gen- Ragsdale SW, Pierce E. (2008). Acetogenesis and the Bank/DDBJ databases. Wood-Ljungdahl pathway of CO2 fixation. Biochim Ludwig W, Strunk O, Westram R, Richter L, Biophys Acta 1784: 1873–1898. Meier H, Yadhukumar et al. (2004). ARB: a software Rinke C, Schwientek P, Sczyrba A, Ivanova NN, environment for sequence data. Nucleic Acids Res 32: Anderson IJ, Cheng JF et al. (2013). Insights into the 1363–1371. phylogeny and coding potential of microbial Ma K, Schicho RN, Kelly RM, Adams MWW. (1993). dark matter. Nature 499: 431–437. Hydrogenase of the hyperthermophile Pyrococcus Sakazaki R. (1984). Genus IV. Citrobacter Werkman and furiosus is an elemental sulfur reductase or sulfhydro- Gillen 1932, 173AL. In: Krieg NR, Holt JG (eds). genase: evidence for a sulfur-reducing hydrogenase Bergey's Manual of Systematic Bacteriology, vol. 2. ancestor. Proc Natl Acad Sci USA 90: 5341–5344. The Williams & Wilkins Co: Baltimore, pp 458–461.

The ISME Journal Pathways for acetogenesis and sulfur reduction KW Seitz et al 1705 Sapra R, Bagramyan K, Adams M. (2003). A simple Wrighton KC, Castelle CJ, Wilkins MJ, Hug LA, Sharon I, energy-conserving system: proton reduction coupled Thomas BC et al. (2014). Metabolic interdependencies to proton translocation. Proc Natl Acad Sci USA 100: between phylogenetically novel fermenters and – 7545 7550. respiratory organisms in an unconfined aquifer. ISME Seyler LM, McGuinness LM, Kerkhof LJ. (2014). Crenarch- J 7: 1452–1463. aeal heterotrophy in salt marsh sediments. ISME J 8: Wrighton KC, Thomas BC, Sharon I, Miller CS, Castelle CJ, 1534–1543. Sorek R, Zhu Y, Creevey CJ, Francino MP, Bork P, VerBerkmoes NC et al. (2012). Fermentation, hydrogen, Rubin EM. (2007). Genome-wide experimental deter- and sulfur metabolism in multiple uncultivated mination of barriers to horizontal gene transfer. bacterial phyla. Science 337: 1661–1665. Science 318: 1449–1452. Yanagawa K, Morono Y, Beer DD, Haeckel M, Sunamura M, Spang A, Saw JH, Jorgensen SL, Zaremba-Niedzwiedzka K, Futagami T et al. (2013a). Metabolically active microbial

Martijn J, Lind AE et al. (2015). Complex archaea that communities in marine sediment under high-CO2 and bridge the gap between and eukaryotes. – – low-pH extremes. ISME J 7: 555 567. Nature 521: 173 179. Yanagawa K, Nunoura T, McAllister SM, Hirai M, Stamatakis A. (2014). RAxML Version 8: a tool for Breuker A, Brandt L et al. (2013b). The first micro- phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30: 1312–1313. biological contamination assessment by deep-sea drilling Stoffels L, Krehenbrink M, Berks BC, Unden G. (2012). and coring by the D/V Chikyu at the Iheya North Thiosulfate reduction in Salmonella enterica Is driven hydrothermal field in the Mid-Okinawa Trough (IODP by the proton motive force. J Bacteriol 194: 475–485. Expedition 331). Front Microbiol 4: 327.

Supplementary Information accompanies this paper on The ISME Journal website (http://www.nature.com/ismej)

The ISME Journal