<<

Insights into the lifestyle of uncultured bacterial PNAS PLUS factories associated with marine sponges

Gerald Lacknera,b, Eike Edzard Petersa, Eric J. N. Helfricha, and Jörn Piela,1

aInstitute of Microbiology, Eidgenössische Technische Hochschule (ETH) Zurich, 8093 Zurich, Switzerland; and bJunior Research Group Synthetic Microbiology, Friedrich Schiller University at the Leibniz Institute for Natural Product Research and Infection Biology–Hans Knöll Institute, 07745 Jena, Germany

Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved December 7, 2016 (received for review September 29, 2016) The as-yet uncultured filamentous bacteria “Candidatus Entotheonella anticancer drug candidate discodermolide (9, 10), but their chemical factor” and “Candidatus Entotheonella gemina” live associated with role remains unclear (11). Despite repeated efforts (12, 13), the vast themarinespongeTheonella swinhoei Y, the source of numerous majority of T. swinhoei symbionts, including Ca. Entotheonella, re- unusual bioactive natural products. Belonging to the proposed candi- main uncultured to date, except for a report on the detection of date phylum “Tectomicrobia,” Candidatus Entotheonella members are Ca. Entotheonella in a mixed culture (7). There is some pros- only distantly related to any cultivated . The Ca.E.factorhas pect, however, of overcoming this challenge by genomics-based been identified as the source of almost all and modified targeted cultivation (14). peptides families reported from the sponge host, and both Ca. Ento- By metagenomic, single-particle genomic, and functional studies, we theonella phylotypes contain numerous additional genes for as-yet and collaborators recently provided evidence that almost all known unknown metabolites. Here, we provide insights into the biology of bioactive and modified peptides previously reported from these remarkable bacteria using genomic, (meta)proteomic, and chem- a Japanese chemotype of T. swinhoei (termed “chemotype Y”)are ical methods. The data suggest a metabolic model of Ca. Entotheonella produced by a single member of the complex microbiome named as facultative anaerobic, organotrophic with the ability to “Candidatus Entotheonella factor” TSY1 (15–18). In this sponge, the use methanol as an energy source. The symbionts appear to be auxo-

bacterium co-occurs with a second Ca. Entotheonella symbiont (15) MICROBIOLOGY trophic for some , but have the potential to produce most that, based on average identity values, is a distinct candi- amino acids as well as rare cofactors like .Thelatter date species and was termed “Candidatus Entotheonella gemina” likely accounts for the strong autofluorescence of Ca. Entotheonella TSY2 (21). Disentangling the metagenomic sequence data by binning filaments. A large expansion of proteinfamiliesinvolvedinregulation analysis revealed a striking number of natural product gene clusters in and conversion of organic molecules indicates roles in host–bacterial both phylotypes, but assigned all clusters for attributable T. swinhoei interaction. In addition, a massive overrepresentation of members of Y compounds (onnamides, polytheonamides, keramamides, pseudo- the luciferase-like monooxygenase superfamily points toward an im- theonamides, cyclotheonamides, and nazumamides) to Ca.E.factor portant role of these proteins in Ca. Entotheonella. Furthermore, we (15). In addition, multiple clusters for as-yet cryptic natural products performed mass spectrometric imaging combined with fluorescence in were identified in both phylotypes,suggestinganevenhigherbio- situ hybridization to localize Ca. Entotheonella and some of the bio- synthetic capacity. Both phylotypes remain as-yet uncultivated; are active natural products in the sponge tissue. These metabolic insights only distantly related to any cultivated bacterium; and, on the basis of into a new candidate phylum offer hints on the targeted cultivation of phylogenomic data, belong to a novel candidate phylum that was the chemically most prolific microorganisms known from microbial termed “Tectomicrobia” (15). dark matter. Recently, we presented evidence that another Ca. Entotheonella phylotype, “Candidatus Entotheonella serta,” present in the chemically uncultivated bacteria | natural products | genomics | proteomics | symbiosis Significance

arine sponges are prolific sources of bioactive natural products The candidate genus “Candidatus Entotheonella” belongs to a Mand of great interest for drug development (1). Besides their recently proposed bacterial candidate phylum with largely un- pharmacological potential, sponges are among the oldest metazoans known properties due to the lack of cultivated members. Among – and have attracted attention as ancient models of animal bacterial the few known biological properties is an association of Ca. symbioses. Many sponges harbor highly abundant bacterial commu- Entotheonella with marine sponges and an extraordinarily rich nities that exhibit a similar biological complexity as the human genomic potential for bioactive natural products with unique microbiome (2, 3), but the ecological roles of these mostly un- structures and unprecedented biosynthetic enzymology. In- cultivated microbes remain largely elusive. One of the functions, for creasing evidence suggests that Ca. Entotheonella are wide- which evidence is accumulating, is the production of toxic natural spread key producers of sponge natural products with a chemical products that might contribute to host defense (4, 5). Significant ef- richness comparable to soil actinomycetes. Given the unusual forts have been made to connect the chemistry of sponges to possible biology and exceptional pharmacological potential of Ca.Ento- bacterial producers, largely motivated by the prospect of developing theonella, the bioinformatic and functional insights into their sustainable production systems for drug development. One of the lifestyle presented here provide diverse avenues for marine important sponge models that has emerged in these studies is The- natural product research, biotechnology, and microbial ecology. onella swinhoei, a chemically exceptionally rich complex of distinct chemotypes. Pioneering work on a variant from Palau revealed the Author contributions: J.P. designed research; G.L., E.E.P., and E.J.N.H. performed research; presence of filamentous, multicellular bacteria that could be G.L. and J.P. analyzed data; and G.L. and J.P. wrote the paper. mechanically enriched and contained elevated amounts of theopa- The authors declare no conflict of interest. lauamide-type antifungal peptides (6). The symbiont was assigned to This article is a PNAS Direct Submission. a new candidate genus and named “Candidatus Entotheonella pal- 1To whom correspondence should be addressed. Email: [email protected]. ” auensis (7). Related Candidatus Entotheonella bacteria were also This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. detected in the sponge Discodermia dissoluta (8), the source of the 1073/pnas.1616234114/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1616234114 PNAS Early Edition | 1of10 Downloaded by guest on September 25, 2021 distinct Japanese sponge T. swinhoei WA, is the producer of (at a false discovery rate of 1%). Considering the challenging the actin-binding polyketide misakinolide (19). Additional Ca. sample and large size of ca. 9Mb,thisproteomecoverage Entotheonella variants were also detected by PCR in a wide range of is respectable. The lower number of identifications for Ca.E. other sponge species (15) and were connected to natural product gemina is most likely due to the lower abundance of Ca.E.gemina biosynthesis in the sponge Discodermia calyx (20, 21). These data and in the sample as determined by genome sequencing (15). the previous Ca. Entotheonella studies mentioned above (6–8) suggest a more general relevance for the production of sponge- Central Carbon and Energy Metabolism. One important aspect of derived compounds. Biosynthetic pathways from Ca. Entotheonella genomic research on uncultured bacteria is the deduction of meta- exhibit a high frequency of unusual enzymatic features, and their bolic pathways that might deliver valuable information for targeted chemistry shows little similarity to the chemistry of cultivated cultivation experiments. Important clues for successful cultivation bacteria (17, 22–24). Being the first example of a chemically pro- are, for instance, potential energy sources, carbon sources, and tol- lific taxon among uncultivated bacteria, these producers offer ex- erance to oxygen. Concerning oxygen, the DNA-based data suggest citing opportunities for pharmaceutical applications, for systematic that Ca.E.factorandCa. E. gemina perform aerobic metabolism studies on the ecology of sponge-associated bacteria, and for in- and are likely facultative anaerobes. Both strains harbor genes vestigating functional properties of elusive candidate phyla. encoding a respiratory chain, including cytochrome c oxidase Since our first release of the Ca. Entotheonella from (expressed) as well as catalase (expressed) and superoxide dismutase T. swinhoei Y (15), we have made several attempts to close gaps in (Dataset S1). In addition, anaerobic growth might be promoted, as the metagenome by rounds of PacBio and Illumina sequencing. suggested by putative fermentation pathways to D-lactate, to acetate, Unfortunately, these attempts did not significantly reduce the andto(R,R)-butanediol from pyruvate (not detected by proteo- number of contigs. Generally, the dataset posed several challenges mics). Further support for facultative anaerobic growth is provided to genome analysis. First, many of the sequencing gaps disrupted by the presence of the oxygen-independent biosynthesis pathway for ORFs, which regularly prevented the identification of gene models cobalamin and deoxynucleotides, or by the high number of CoA- by automatic annotation pipelines. Furthermore, due to the remote III family encoding genes (discussed below), for instance. phylogenetic placement of Tectomicrobia (the candidate phylum Concerning central carbon metabolism (Fig. 1), we found harboring Ca. Entotheonella), functional predictions were gener- pathways for the utilization of glycerol, acetate, and L-lactate, (the ally more challenging than for strains with a closer relationship to latter two with proteomics support). Accordingly, we also identi- model organisms. Therefore, we performed an extensive rean- fied putative genes for glycerol and glycerol-3-phosphate trans- notation of the genomes and used metaproteomic, metabolic, and porters (both expressed), and we readily identified gene candidates imaging methods to provide insights into the intricacies of host– for a complete TCA cycle and the gluconeogenesis pathway. Ad- symbiont interactions, metabolic capabilities, nutritional require- ditional genes suggested a group of ATP-binding cassette (ABC) ments, and genome expansions of Ca. Entotheonella symbionts of transporters that highly likely import monosaccharides according T. swinhoei Y. The focus of this study rests outside of classical to classification of the ABC transporter database AbcDB (26) and secondary metabolism. As an ultimate goal, this study is intended that are expressed based on our proteomics data. Genes for the to guide the development of cultivation strategies for these high- breakdown of polysaccharides (e.g., for glycogen-debranching en- potential natural product producers. zyme, for glycogen phosphorylase) are also present in both ge- nomes, but their corresponding proteins could not be detected by Results and Discussion proteomics. Even if these findings strongly suggest that sugars can Reannotation of Ca. Entotheonella Genomes and Shotgun Metaproteomics. be consumed by Ca. Entotheonella, we were not able to identify Because of the above-mentioned difficulties connected with Ca. any complete pathway for the of monosaccharides. The Entotheonella genomes, we performed extensive automated and (Embden–Meyerhof–Parnas) pathway is not complete in manual reanalyses of genes. All reannotated pathways are listed in the Ca. E. factor dataset, because a 6-phosphofructokinase (pfkA or Dataset S1. To gain insights into the actual activity of Ca.Ento- pfkB) gene candidate is missing. There is, however, a partially se- theonella genes at the time of sponge collection, we aimed to quenced gene in Ca. E. gemina (GenBank accession no. ETX08415) investigate their expression status. Ideally, expression data should that encodes a member of the PfkB family. Thus, this locus might be generated from the same organisms that were used for the have been missed due to assembly errors or binning problems. sequencing project. These specimens had been stored in ethanol Interestingly, no convincing evidence was obtained for the oxi- at 4 °C for 4 y, and the amount of sample was severely limited. dative portion of the pentose phosphate pathway, including the Facing these challenges, we used a peptide fingerprinting ap- key glucose-6-phosphate dehydrogenase (G6PDH). Al- proach using nanoflow liquid chromatography (nano-LC) coupled though several genes encode F420-dependent dehydrogenases, with electrospray ionization tandem mass spectrometry (MS/MS). none of them belongs to the trusted family TIGR03554. No com- For sample preparation, we ground 1 cm3 of sponge sample in plete variant of the Entner–Doudoroff–related pathways (phos- artificial seawater, filtered off sponge debris, and enriched for the phorylative, hemiphosphorylative, or nonphosphorylative) was filamentous Ca. Entotheonella bacteria by density gradient cen- identified either. There are, however, several genes and their trifugation, yielding a barely visible bacterial pellet. Microscopic expressed proteins with moderate similarity to D-glucono-1,5- inspection of this preparation revealed the abundant presence of lactonase, a component of the nonphosphorylative or semi- large filaments with the typical Ca. Entotheonella morphology, phosphorylative Entner–Doudoroff pathways. Although it can- but almost no detectable unicellular contamination (Fig. S1). The not be excluded that the irregularities observed might be partly pellet was heat-lysed, fractionated by SDS/PAGE, and subjected due to sequencing or assembly errors, the sugar metabolism of to nano-LC–MS/MS. Using a Mascot search against the whole Entotheonella is a very attractive target for further studies, be- National Center for Biotechnology Information (NCBI) database, cause it might unveil some as-yet unknown . we identified 557 proteins with over 95% probability. In a more sensitive search against the two Ca. Entotheonella genomes using Ca. Entotheonella Are Likely Fueled by Methanol and Oxalate. More MaxQuant (25), we detected over 1,661 proteins (1,443 from conclusive suggestions on the putative energy metabolism of Ca. Ca. E. factor and 886 from Ca. E. gemina, with 668 hits matching Entotheonella were provided by the proteomics experiment: A PQQ- to both genomes; Dataset S2). With 8,338 and 8,989 unique an- dependent methanol dehydrogenase (MDH; GenBank accession no. notated proteins encoded in the available Ca.E.factorandCa.E. ETW97015) ranks among the top 30 most abundant proteins (in- gemina genomic data, respectively, these values translate into a tensity as to shotgun proteomics; Dataset S2). This finding is typical proteome coverage of 17.3% and 9.9% of all predicted proteins for methylotrophic organisms, such as Methylobacterium extorquens

2of10 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al. Downloaded by guest on September 25, 2021 monosaccharides oligopeptides dipeptides branched-chain AS Gly/Pro PNAS PLUS

glucose-DH? glucose glucono-1,5-lactone tungstate PENTOSE-PHOSPHATE? ENTNER- G6PDH? gluconate Fe3+ 6-P-gluconate? glucose-6-P DOUDOROFF? ? 2+ fructose-6-P Zi KDG KDG-6-P Pfk? ??KDG-(P)- fructose-1,6-bisphosphate aldolase? HMP HMP glycerol-3-P GAP DHAP GLUCONEOGENESIS sulfite taurine taurine 3- phosphoglycerate glycerol sulfide B12 2-phosphoglycerate glycerate homocysteine phosphate PEP CO2 methionine oxaloacetate hydroxypyruvate phosphonate

lactate pyruvate malate SERINE METHANOL PATHWAY serine OXIDATION urea acetate malyl-CoA methylene-THF glycine NH MICROBIOLOGY acetyl-CoA formaldehyde 4 ADP methanol DH H+ OA glyoxylate + H citrate methanol methanol malate ATP + H OH- aconitate formate fumarate formate TCA CYCLE formyl-CoA isocitrate DH NADH/H+ CO + O succinate 2 2 oxalyl-CoA RESPIRATORY 2-ketoglutarate succinyl-CoA CHAIN oxalate

dicarboxylate importer dicarboxylate importer

Fig. 1. Metabolic pathway reconstruction of Ca. Entotheonella. Ca. Entotheonella possesses various transporters for inorganic and organic substrates. Given the presence of appropriate transporters, the utilization of sugars is likely. The exact pathway of hexose and pentose catabolism, however, is elusive. Lactate, acetate, and glycerol are likely used as a carbon sources. MDH renders methanol available for energy metabolism. The assimilation of C1 compounds via the serine pathway is plausible; however, some uncertain annotations require experimental confirmation. DH, dehydrogenase; DHAP, dihydroxyacetone phos- phate; GAP, glyceraldehyde-6-phosphate; KDG, 2-keto-3-deoxy gluconate; OA, oxaloacetate; PEP, phosphoenolpyruvate; Pfk, phosphofructokinase.

(27). Intrigued by this result, we constructed a phylogenetic tree of tabolism. Notably, methanol is an abundant molecule in the oceans the candidate MDH proteins together with known MDHs classified (32). To also feed on C1 compounds as a sole carbon source, C1 by Keltjens et al. (28). Both Ca. Entotheonella cluster well fixation pathways have to be present. Examining such pathways with MDHs (Fig. S2A) and, more specifically, fall into the XoxF2 revealed a large portion of the serine pathway (33), including serine- (lanthanide-dependent methanol dehydrogenase) clade. In contrast hydroxymethyl transferase (expressed) and D-glycerate-2-kinase + to the long-known MxaF-type MDHs that use Ca2 as , (Fig. 1). However, dedicated candidate genes of the pathway-specific XoxF-type MDHs have only recently been shown to depend on ions malyl-CoA synthetase (EC 6.2.1.9, distinct from succinyl-CoA syn- + + of rare earth elements like La3 or Ce3 (29). In addition, they use thetase) were not detected. We are therefore cautious with a final the rare coenzyme (PQQ). The xoxF gene conclusion about C1 compound assimilation in Ca. Entotheonella, is clustered with genes for the periplasmic protein XoxJ and the cy- but we suggest that the addition of methanol and rare earth ele- tochrome c homolog XoxG, a genome context also known from other ments (e.g., LaCl3) to artificial growth media could be a highly xoxF homologs (28). These data strongly suggest that Ca. Ento- promising strategy for cultivation experiments. Interestingly, the theonella can use MeOH for energy metabolism. Furthermore, the addition of rare earth metals to growth media was a key to culti- genomes encode two putative enzymes for further oxidation of vating methanotrophic Verrucomicrobia successfully from volcanic methanol to carbon dioxide, both depending on rare cofactors (Fig. mud pods (29). S2B): mycothiol-dependent formaldehyde dehydrogenase (30) and In addition to methylotrophy, there is evidence for oxalotrophy tungsten-dependent formate dehydrogenase (31). Like MDH, both of Ca. Entotheonella, because we found a putative formyl-CoA/ coding genes are expressed in situ. The data strongly suggest that oxalate CoA-transferase gene expressed in both Ca. Entotheonella Ca.E.factorandCa.E.geminaoxidizemethanolforenergyme- phylotypes. Together with the oxalyl-CoA decarboxylase (ETW96684

Lackner et al. PNAS Early Edition | 3of10 Downloaded by guest on September 25, 2021 and ETW96684, both expressed) and the tungsten-dependent for- predicted by the AbcDB). Although we cannot rule out that some mate dehydrogenase, Ca. Entotheonella could use oxalate as an missing genes might simply reflect sequencing gaps, we suggest that energy source. Additionally, a putative oxalate/formate anti- artificial growth media for Ca. Entotheonella should be supple- porter (ETW98700 and ETX08188) could directly generate a mented with cofactors or vitamins, particularly with thiamine, proton motive force similar as described for Oxalobacter for- pantothenate, and . migenes (34). Notably, the formation of calcium oxalate has Although automatic annotations predicted both Ca. Entotheonella been reported in the marine sponge Chondrosia reniformis (35). phylotypes to be auxotrophic for several amino acids, we were able to identify complete pathways for the biosynthesis of all Vitamins for Productive Microbes. The putative dependence of Ca. proteinogenic amino acids by manual searches (Dataset S1). Entotheonella on rare cofactors directed us to the investigation of The presence of various and oligopeptide importers coenzyme biosynthesis pathways. Auxotrophy for cofactors should is, of course, not a contradiction to that finding. We therefore hy- provide helpful hints for the design of appropriate growth media. pothesize that the addition of amino acids to growth media could In agreement with the presence of enzymes dependent on co- promote growth of Ca. Entotheonella, but might not be necessary. enzyme F420, PQQ, and mycothiol (Fig. 2A), we indeed identified biosynthesis pathways for these rare cofactors encoded in the ge- Bioactive Metabolites. The extraordinary genetic potential of Ca. nome. Key enzymes for F420 and PQQ biosynthesis are expressed Entotheonella to produce bioactive metabolites has been discussed according to our proteomics data. Because both Ca. Entotheonella in detail before (15, 21). Unfortunately, we could not identify any of candidate species harbor a F420/glutamyl transferase (EC 6.4.2.31), the assembly line-like (PKS or NRPS) natural product biosynthesis we assume that F420 is present in a polyglutamylated form. The enzymes by shotgun proteomics. We could, however, identify PoyA deazaflavin moiety is known to exhibit strong autofluorescence, (ETX03681), the precursor peptide of polytheonamides (17). In which is, intriguingly, also a property of a proportion of Ca. Ento- addition to these analyses, we have discovered genes for carotenoid theonella cells in our samples. To test whether this fluorescence biosynthesis (Fig. 3A), which are in a different context involved in might be connected to the presence of F420, we recorded fluores- pathogenicity. The encoded enzymes are highly similar to the ho- cence emission spectra of autofluorescing Ca. Entotheonella fila- mologs of Staphylococcus aureus that generate staphyloxanthin ments and compared them with control spectra obtained from (37). This eponymous yellowish-golden (Latin aureus = golden) Methanobrevibacter smithii DSM 861, a methanogenic archaeon pigmentwasshowntoprotectthepathogenfromoxidativestress known to produce large amounts of the coenzyme F420. Gratifyingly, (38) and from killing by neutrophil cells (39). For Ca. Entotheonella, the spectra were nearly identical (Fig. 2B), supporting the hypothesis a possible role of the carotenoid could be UV-induced stress de- that Ca. Entotheonella produces F420. fense, because the biosynthesis cluster is flanked by a B12-dependent Concerning more common cofactors, we identified complete light-sensing transcription factor (40). Thus, Ca.E.factorislikely pathways for nicotine amide, flavin, folate, , able to perceive UV light and react to it by pigment formation. , lipoate, and cofactor (Dataset S1). Addi- The gene loci and a proposed pigment biosynthetic pathway tionally, the oxygen-independent (early cobalt insertion) pathway areshowninFig.3B. Due to the similarities to staphyloxanthin, for cobalamin (B12) is represented and expressed. Consistently, we propose the name theoxanthin for the as-yet unknown Ca. Ca. Entotheonella uses a B12-dependent ribonucleotide reductase Entotheonella carotenoid. The terpenoid precursors are, as usual (ETW93474 and ETX06210), which is expected to be functional in bacteria, produced via the nonmevalonate pathway (MEP/DOXP) under both aerobic and anaerobic conditions (36). Curiously, we starting from 1-deoxy-xylulose-5-phosphate (41). Experiments in our were not able to identify an obvious route for pantothenate pro- laboratory to identify theoxanthin by spectroscopy and MS have duction. This finding is surprising in light of the high number of not been successful so far. polyketide synthase (PKS) and synthetase Because more than 40 bioactive natural products were pre- (NRPS) proteins in both Ca. Entotheonella phylotypes. PKS viously isolated from T. swinhoei Y(42–44), we also investigated and NRPS megaenzymes contain multiple carrier domains that exemplarily the localization of two PKS-NRPS hybrid products, use a covalently bound arm as a prosthetic onnamide A and cyclotheonamide A, by matrix-assisted laser group. Accordingly, the Ca. Entotheonella genomes contain sev- desorption-ionization imaging mass spectrometry (MALDI-IMS). eral genes coding for phosphopantetheinyl (PPTases; This MS method was recently used to localize misakinolide A pfam01648). Deduced proteins ETX01839 and ETX09163 from spatially in the chemotype T. swinhoei WA from Hachijo-Jima, Ca. E. factor and Ca. E. gemina, respectively, represent holo-acyl Japan (21). Subsequent catalyzed reporter deposition fluorescence carrier protein synthases (ACPS) activating synthases. in situ-hybridization (CARD-FISH) experiments with Ca.Ento- ETW94202 and ETW96748 are Sfp-type PPTases, which are theonella-specific probes revealed the same spatial distribution for typically responsible for the activation of assembly line-like these bacteria as for misakinolide. Interestingly, we observed here enzymes. Another enzyme (ETX03667), which is phylogeneti- the same highly localized distribution for Ca. Entotheonella, cally distinct from the other candidates, is encoded by a gene on cyclotheonamide A, and onnamide A within the pores, chambers, plasmid pTSY1. This plasmid is particularly rich in secondary and exterior part of the yellow chemotype of the sponge (Fig. 4). metabolite gene clusters. None of the PPTase proteins, however, Collectively, these findings further support the hypothesis that could be identified by proteomics. Ca. Entotheonella locally produces these cytotoxic compounds Also, no evidence for biotin biosynthesis was obtained. Again, this in surface-accessible areas, possibly to protect the host sponge from finding is unusual for polyketide-producing organisms, because predators, to assist in killing of eukaryotic prey in chambers, and acetyl-CoA carboxylase depends on biotin for the activation of bi- perhaps to protect the sponge tissue by a compartimentalization-like carbonate. Concerning classical quinone cofactors, we could not strategy. The CARD-FISH–based localization to regions facing sea detect genes for ubiquinone biosynthesis, but parts of the classical water also support our in silico data that suggest aerobic metabolism. menaquinone biosynthesis route. Surprisingly, genes encoding key enzymes of menaquinone biosynthesis, such as chorismate isomer- Symbiosis Factors. Members of the candidate genus Ca. Entotheonella ase, were likewise absent in our datasets. Furthermore, thiamine are associated with various marine sponges and are likely to pro- biosynthesis appeared to be incomplete. This pathway most likely duce mixtures of defensive compounds for the benefit of their hosts starts from hydroxymethyl pyrimidine (HMP), because phosphohy- (7, 15, 21). To identify further genetic factors involved in mutual- droxymethyl pyrimidine synthase (EC 4.1.99.17) is missing. In per- ism, we searched the genome for classical virulence factors that are fect agreement with this hypothesis, we discovered a putative HMP also commonly found in symbiotic bacteria. However, we did not importer belonging to the ABC transporter family (specificity was find any typical virulence-related secretion systems, such as type

4of10 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al. Downloaded by guest on September 25, 2021 A III secretion systems (except for the general secretion pathway), or PNAS PLUS O any other virulence factors listed in the TIGRFAMs database of protein families (TIGRFAM) under TIGRFAM role “pathogen- N SH esis.” The lack of classical symbiosis factors sometimes points to- H ward more loose associations with hosts rather than highly intimate O OH symbioses. However, the fact that Ca. Entotheonella bacteria are regularly associated with sponges and have thus far not been iso- N OH O H lated from any other source is an indication for true mutualism OH HO O and suggests as-yet unknown factors involved in recognition, O O HN H colonization, and survival within the host. Some of the orphan HO OH OH biosynthetic gene clusters of Ca. Entotheonella might produce HO compounds that act as signaling molecules or host manipula- HO OH N O tion factors. OH O O Boosted Enzyme Families for Complex Metabolism. Both Ca.E. mycothiol pyrroloquinoline factor and Ca. E. gemina possess very large genomes of ca. 9Mb. quinone (PQQ) However, the reported high numbers of repetitive elements and O specialized metabolite biosynthesis clusters alone do not account for this exceptional genome size. We therefore wished to obtain in- NH sights as to whether the large genome size is caused by a high HO N N O number of distinct encoded protein families, an accumulation of OH known protein families, or the presence of unknown protein fami- lies. In the genome of Ca.E.factor(Ca.E.gemina),6,318(6,473) OH of 8,440 (8,990) protein coding sequences (CDSs) were assigned to HO known families in the protein family (PFAM) database (45). This O finding means that roughly 25% (28%) of the CDSs have no hit in O P OH the PFAM database. The PFAM database was chosen because the O compiled PFAMs are supposed to be nonoverlapping. On the other MICROBIOLOGY COOH H hand, it should be noted that domains of one protein usually belong N COOH to different PFAMs. The 6,318 (6,473) proteins with PFAM hits can O N be assigned to 2,225 (2,211) distinct PFAMs, indicating that mul- H O COOH tiplication of a subset of families, rather than the accumulation of coenzyme F many distinct families, is the mechanism behind genome expansion 420 in Ca. Entotheonella. This trend is general, because gene duplica- B tion and functional differentiation are more likely than horizontal transfer or de novo evolution of genes. To identify PFAMs that are particularly enriched within Ca. Entotheonella, we compiled a matrix (PFAMs versus genomes) containing the number of all 100 'Entotheonella' M. smithii PFAM hits per genome analyzed. To put genome statistics into relation, we chose a set of 100 genomes covering a broad range of diverse phylogenetic groups, lifestyles (e.g., pathogenic, free-living), 50 morphologies, metabolic features, and varying genome sizes (Dataset S3). Thereby, we obtained an abundance matrix of 8,301 PFAMs in 100 genomes (Dataset S4). We ordered the table by

emmission [%] abundance in Ca. E. factor to retrieve the 50 most abundant 400 471 500 600 PFAMs (top 50) present in its genome (a short version of the wavelength [nm] table is shown in Fig. 5). Notably, three PFAMs within the top 50 Ca. E. factors are not enriched in Ca. E. gemina. These PFAMs comprise parts of the NRPS assembly lines: pfam00550 C (phosphopantetheine attachment site), pfam00668 (condensation domain), and pfam13745 (HxxPF-repeated domain, unknown function). This finding is in agreement with the previous finding that Ca. E. factor harbors far more NRPS genes than Ca.E. gemina. Inspecting the 50 most abundant PFAMs of Ca.E. factor, we observe a general tendency: The genome size re- flects the scope of physiological responses to varying envi- ronmental conditions. Consequently transporters like ABC transporter (pfam00005) or major facilitator (pfam07690) and 5 µm 5 µm signal transduction systems [e.g., response regulator receiver

Fig. 2. Rare cofactors of Ca. Entotheonella. (A) Chemical structure of se- lected rare cofactors encoded in the Ca. Entotheonella genomes. PQQ is a cofactor necessary for MDHs. Mycothiol functions as a -like

cofactor in Actinobacteria like Mycobacterium tuberculosis. Coenzyme F420 excitation wavelength is 405 nm. The fluorescence emission spectra are (a deazariboflavine) is a fluorescent two-electron transfer cofactor mainly nearly identical. (C) Autofluorescence of Ca. Entotheonella filaments: bright- found in Actinobacteria and methanogenic archaea. (B) Normalized fluo- field image of a representative Ca. Entotheonella filament (Left) and fluo- rescence emission spectra of a representative Ca. Entotheonella filament and rescence image of an Ca. Entotheonella filament recorded at 471 nm using

M. smithii DSM 861, a methanogenic archaeon known to produce F420. The an excitation of 405 nm (Right). (Scale bars: 5 μm.)

Lackner et al. PNAS Early Edition | 5of10 Downloaded by guest on September 25, 2021 A ETW95197 ETW95196 phytoene sorbitol DH 1 kb synthase

ETW98207 ETW98206 ETW98205 ETW98203 ETW98202 desaturase aldehyde DH oxidase AT GT B12-bind. regulator

B phytoene synthase 2x FPP 15-cis-4,4'-diapophytoene

3x FAD desaturase 3x FADH2

all-trans-4,4'-diaponeurosporene

oxidase

aldehyde DH

O

O- 4,4'-diaponeurosporenoate

DH sorbitol sorbose NDP- sorbose GT OH HO OH O OH OO sorbosyl-4,4'-diaponeurosporenoate

AT OH HO OH O O R OO O R=acyl 'theoxanthin'

Fig. 3. Proposed staphyloxanthin-like carotenoid (theoxanthin) biosynthesis pathway in Ca. Entotheonella. (A) Symbolic representation of two gene loci encoding the postulated theoxanthin biosynthesis pathway. Arrows represent coding regions indicating the direction of transcription and are drawn to scale. (Scale bar: 1 kb.) Accession numbers of the deduced proteins are specified within arrows. (B) Hypothetical biosynthetic route leading to theoxanthin. The product theoxanthin is as yet unknown, but all biosynthetic steps resemble corresponding steps in the biosynthesis of staphyloxanthin, the yellow-golden pigment of S. aureus. AT, acyltransferase; FAD, flavin dinucleotide; FPP, farnesyl pyrophosphate; GT, glycosyl transferase; NDP, nucleoside triphosphate.

domain (pfam0007), histidine kinase A (phosphoacceptor) do- β-oxidation, or racemization (46). It is therefore tempting to main (pfam00512), sigma factor 70 (pfam04542)] are enriched speculate that Ca. Entotheonella might possess an arsenal of en- in Ca. Entotheonella, but also in other large genomes (e.g., zymes to break down a wide range of organic acids. Notably, one Anabaena, Streptomyces, Myxococcus). To pinpoint protein member of pfam0251 is the previously mentioned putative formyl- families that are particularly enriched in Ca. Entotheonella, we CoA/oxalate CoA-transferase presumably involved in oxalotrophy. adjusted our data matrix by subtraction of the median abundance Hitherto, the record holder in accumulating CoA-transferase of each PFAM over the 100 genomes (Dataset S5). Indeed, sev- family III enzymes is Frankia sp. EuI1c, with 49 sequences (of all eral PFAMs were particularly abundant in Ca. Entotheonella. 873 organisms listed in the PFAM species distribution section). One of these PFAMs is pfam02515 (CoA-transferase family III), Thus, both Ca. Entotheonella species significantly surpass this with around 90 members in each Ca. Entotheonella phylotype, number, with 87 members in Ca. E. factor and 93 members in Ca. followed by Bordetella pertussis and Streptomyces rapamycinicus, E. gemina. Another expanded PFAM with 22 instances in Ca.E. with only 22 and 17 representatives, respectively. CoA-transferase factor and 30 in Ca. E. gemina is the -binding do- III was originally discovered in anaerobic pathways and is involved main of aldehyde dehydrogenase (pfam02738), which represents in the activation of various organic acids like oxalate, bile acids, or the large subunit of xanthin dehydrogenase, CO dehydroge- benzylsuccinate for subsequent reactions, such as decarboxylation, nase, or hydroxybenzoyl-CoA dehydrogenase (47). Because the

6of10 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al. Downloaded by guest on September 25, 2021 A PNAS PLUS

B C0 5e^5 D

cyclotheonamide A (m/z 732.314)

E F0 5e^5 G MICROBIOLOGY

onnamide A (m/z 794.455)

Fig. 4. Localization of Ca. Entotheonella and the bioactive metabolites cyclotheonamide A and onnamide A. (A) Localization of Ca. Entotheonella by CARD- FISH. Overlay of a bright-field image of representative pores of T. swinhoei Y and a fluorescent image obtained from CARD-FISH labeling of Ca. Entotheonella (Scale bars: 100 μm.) Localization of cyclotheonamide A and onnamide A in the sponge tissue. (B and E) Representative optical image of a thin slice of T. swinhoei Y used for MALDI-IMS before MALDI matrix application. False-color heat map representation of spatial distribution of cyclotheonamide A (C) and onnamide A (F). Overlay of the optical image and the spatial distribution of cyclotheonamide A (D) and onnamide A (G) in the sponge tissue. (Scale bars: 2 mm.)

copper-binding motif of CO dehydrogenase is missing in all of the tor, ETX03783 (OnnC) and ETX03434 (NazB) are proposed hy- deduced Ca. Entotheonella enzymes, we suspect a role in the droxylases and belong to the onnamide and nazumamide clusters, breakdown of aromatic compounds. This hypothesis is further respectively. The other two proteins, ETW93540 and ETX02882, supported by the high number of taurine catabolism dioxygenase are not part of any identified natural product gene cluster. Another TauD family proteins (pfam02668). Although TauD itself is involved interestingfeatureofLLMsisthefact that several subfamilies in utilization of taurine as a source, other family members are (TIGRFAMs) depend on coenzyme F420 instead of a flavin co- involved in the breakdown of the pesticide 2,4-dichlorophenoxyacetic factor (49). In light of the presence of this cofactor in Ca.Ento- acid, for instance (48). In conclusion, these data suggest that theonella, we assigned all members of pfam00296 to TIGRFAM Ca. Entotheonella is a prolific source of complex and untapped subfamilies (Dataset S6). However, only 59 of 171 LLM proteins biochemistry involved in the transformation of small organic mol- from Ca. E. factor and 42 of 168 from Ca.E.geminacouldbe ecules waiting to be characterized. assigned to any TIGRFAM. Consequently, one might expect a considerable proportion of LLMs belonging to novel subfam- Extraordinary Abundance of Luciferase-Like Monooxygenases. Another ilies. Furthermore, we identified multiple members of assigned intriguing PFAM that is extraordinarily abundant in Ca. Ento- F420-dependent subfamilies, such as the poorly characterized theonella is pfam00296, the luciferase-like monooxygenase (LLM). TIGR03619 (33 members in Ca.E.factorand29inCa.E. It is the most abundant PFAM in both Ca. E. factor (171 gemina). Because the identity of G6PDH is enigmatic in Ca. members) and Ca. E. gemina (168 members). Due to the great Entotheonella, the idea is tempting that enzymes with G6PDH expansion of this family and the fact that bioluminescence of Ca. or related activities might be hidden among the large diversity Entotheonella filaments has not been observed, it is reasonable to of LLM proteins. However, as discussed above, a bona fide assume that these proteins do not act as true luciferases. Rank 2 in member of TIGR03557 (F420-dependent G6PDH) is not pre- our 100-genome set with 63 LLM genes holds S. rapamycinicus sent in Ca. Entotheonella. Thus, the role of the multifarious (producer of the immunosuppressant rapamycin), another highly LLMs of Ca. Entotheonella remains obscure, and future work is talented producer of specialized metabolites (Fig. 5). This finding highly warranted to shed more light on this PFAM. might suggest a role of LLMs in secondary metabolism. Notably, members of the PFAM TIGR04020 (natural product biosynthesis Conclusion LLM domain, a subfamily of pfam00296) are typically found Our analyses suggest that the as-yet uncultured bacteria Ca.E. encoded in specialized metabolite gene clusters. However, only factor and Ca. E. gemina are organotrophic, aerobic (or at least four proteins of Ca. E. factor, none of Ca.E.gemina,andtwoof microaerophilic), facultative anaerobic organisms that are likely S. rapamycinicus belong to this particular subfamily. In Ca.E.fac- able to use a broad range of organic carbon sources like organic

Lackner et al. PNAS Early Edition | 7of10 Downloaded by guest on September 25, 2021 Fig. 5. Heat map of the 16 most abundant PFAMs encoded in the Ca. E. factor genome. Numbers in the table represent absolute PFAM counts. A color code was applied to the fields to create a heat map (red, high abundance; green, low abundance). To put values in a meaningful relationship, the abundance of these PFAMS in nine other genomes, as well as maximum and median values, is provided. Notably, Ca. Entotheonella are extremely rich in LLMs and CoA- transferase family III enzymes. The full table (all PFAMS in 100 genomes) is shown in Dataset S4. A description of the genomes is provided in Dataset S3.

acids, alcohols, polyols, polysaccharides, or even complex aromatic but able to react to varying environmental conditions. Our studies compounds. The exact pathways of pentose and hexose degrada- demonstrate that Ca. Entotheonella contain a huge arsenal of tion, however, remain enigmatic. Ca. Entotheonella members ap- presumably novel biochemical pathways most likely not only for pear to be auxotrophic for biotin, pantothenate, and thiamine, but the production of bioactive natural products but also for the are able to produce rare cofactors like PQQ, F420, and a mycothiol- breakdown of complex organic molecules. These unusual bac- like cofactor. They harbor the genetic potential to synthesize most, teria thus represent a rich resource for research on microbial if not all, amino acids. Proteomics data suggest that a lanthanide- physiology, bioorganic chemistry, biotechnology, and natural prod- dependent MDH is a highly abundant protein in the cell, thus ucts. Just recently, a novel antibiotic with promising properties, supporting methanol utilization. Hence, the addition of rare earth teixobactin, has been discovered from previously uncultured bac- elements and methanol to the growth media could be a promising teria (50, 51). The study presented here might suggest cultivation approach for targeted cultivation. As shown for the related E. serta, strategies that could provide access to natural product sources from the producer of misakinolide, Ca. Entotheonella filaments from a wide range of sponges. Finally, we believe that our approach T. swinhoei Y are located in external and internal sponge tissue should be valuable for studies on uncultured organisms in general. regions that face sea-water, and their location is perfectly correlated While we were finalizing this paper, an analysis by Liu et al. (52) with metabolites whose biosynthetic pathways were identified in the of two metagenome bins from a T. swinhoei Y sponge collected in Ca. E. factor genome. Both Ca. Entotheonella strains possess an the South China Sea was published. Both binned genomes are extremely broad repertoire of transporters and regulators, suggest- closely related to those genomes of the Ca. Entotheonella strains ing that they might not be restricted to a defined ecological niche, described here (99.9% average nucleotide identity), and should

8of10 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al. Downloaded by guest on September 25, 2021 thus belong to the same candidate species. In contrast to our suitable for MALDI-IMS experiments or onto microscopy slides for subsequent PNAS PLUS analysis, Liu et al. (52) claim that they identified the Calvin– CARD-FISH analysis. A Sunchrome SunCollect MALDI Spotter was used to cover Benson cycle for CO fixation (via ribulose-1,5-bisphosphate the tissue slice uniformly with five to six layers of conventional MALDI matrix 2 α carboxylase), candidates for crassulacean acid metabolism (a (1:1 mixture of -cyano-4-hydroxycinnamic acid and 2,5-dihydroxybenzoic acid; 30 mg/mL in methanol; Sigma-Aldrich). The samples were then measured on a phenomenon exclusively known in plants), a type VI secretion MALDI LTQ-Orbitrap mass spectrometer (Thermo Scientific) in positive mode system, and a complete system for endospore formation. Despite with a laser energy of 45 μJ, two microscans per step, and a resolution of extensive manual and automated reinvestigation of our dataset as 60,000×. Data were acquired for a mass range between 500 and 1600 m/z. well as the sequence data provided for the Chinese sponge, we Using ImageQuest, masses of interest were assigned false colors, and the cannot find any evidence for these genomic features. resulting image was then superimposed onto an optical image of the tissue slice taken before MALDI matrix application. Materials and Methods For CARD-FISH experiments, tissue slices were immobilized on microscopy Enrichment of Ca. Entotheonella Cells. An enriched pellet of filamentous bac- slides and subsequently fixed in PBS-buffered 4% (wt/vol) paraformaldehyde teria was prepared as described previously (15). Ca. Entotheonella was then (4 °C, overnight). After washing with PBS [at room temperature (r.t.)], slices further purified on a 20:60:100 (vol/vol) stepwise Percoll (GE Healthcare) gra- were dehydrated [99% (vol/vol) ethanol, 3 min at r.t.], air-dried (2 h at r.t.), and dient in calcium- and magnesium-free artificial seawater (CMF-ASW). Ten permeabilized with 10 mg/mL lysozyme (30 min at r.t.). After inactivation of milliliters of each Percoll dilution was layered in a 50-mL Falcon tube on ice. A endogenous peroxidases with 0.01 M HCl for 15 min at r.t., slices were washed

0.5-mL cell suspension of the filamentous bacterial fraction was carefully lay- with H2O, dehydrated (99% ethanol, 5 min at r.t.), and air-dried. The hybrid- ered on top of the gradient and centrifuged at 250 × g for 20 min. Cell frac- ization reaction was performed in hybridization solution [0.9 M NaCl, 20 mM tions were carefully collected with a syringe and analyzed by phase-contrast Tris·HCl (pH 7.6), 10% (wt/vol) dextran sulfate, 0.05% SDS, 1% nucleic acid microscopy based on cell morphology. The filamentous Ca. Entotheonella frac- blocking reagent (Roche), 0.5 mg/mL herring sperm DNA (Sigma)] con- tion was collected at the 60%:100% Percoll interface and stored in CMF-ASW at taining 55% (vol/vol) formamide and 0.5 ng/μL Ca. Entotheonella-specific, 4 °C. For phase-contrast microscopy, cells were spotted on a cover slide, dried, horseradish peroxidase (HRP)-coupled probe ESP219 (5′-CCG CAA GCY CAT covered with mountant (Citifluor Ltd.), and subsequently observed under a Zeiss CTC AGA CC-3′; BioMers) for 2 h at 35 °C. The tissues slices were then Axioskop 2 epifluorescence microscope equipped with a 75-W xenon arc lamp washed once with prewarmed washing buffer [3 mM NaCl, 5 mM EDTA (XBO 75) and a 40× Plan Apochromat objective. (pH 8.0), 20 mM Tris·HCl (pH 7.6), 0.05% SDS] for 30 min at 37 °C and once with (PBS-T) (0.01% Triton X-100). After equilibration of the probe-delivered Shotgun Proteomics. The Ca. Entotheonella cell pellet was resuspended in HRP in PBS for 15 min at r.t., signal amplification took place in amplifica- 30 μL of sample buffer, and 10 and 20 μL were loaded onto an acrylamide gel tion buffer [1× PBS (pH 7.6), 2 M NaCl, 20% (wt/vol) dextran sulfate, 0.1% with gradient of 4–12% (wt/vol) acrylamide in 3-(N-morpholino)propane- nucleic acid blocking reagent (Roche), 0.0015% H2O2] containing Alexa

sulfonic acid buffer. The 20-μL lane was cut in eight sections. Sections were cut Fluor 633-labeled tyramide (Life Science) for 2 h at 37 °C in the dark. MICROBIOLOGY in small pieces and washed twice with 100 μL of 50% (vol/vol) acetonitrile in Subsequently, slices were washed three times in PBS-T for 15 min at r.t.,

100 mM NH4HCO3 and then washed with 50 μL of acetonitrile (100%). All rinsed with H2O, dehydrated, and air-dried. For microscopic analysis, slices three supernatants were discarded. Proteins were digested by addition of were covered with mountant (Citifluor Ltd.) and observed under a Zeiss 20 μL(10μL for section 1) of sequencing grade trypsin [5 ng/μLin10mMTris, Axioskop 2 epifluorescence microscope equipped with a 75-W xenon arc

2mMCaCl2 (pH 8.2)], and 30 μL of buffer (10 mM Tris, 2 mM CaCl2 (pH 8.2)] lamp (XBO 75), an appropriate filter set for Alexa Fluor 633, and a 10× Plan and incubated overnight at 37 °C. The supernatant was removed, and gel Apochromat objective.

pieces were extracted with 150 μL of 50% (vol/vol) acetonitrile in H2Ocon- taining 0.1% TFA. All supernatants were combined and dried. Samples were Genome Reannotation and Pathway Prediction. The Ca. Entotheonella genome dissolved in 20 μL of 0.1% formic acid and transferred to autosampler vials for analysis presented in this study is based on the previously reported dataset LC-MS/MS. Five microliters was injected into a NanoAcquity UPLC instrument available at the NCBI, as well as a reannotated version of Ca.E.factorTSY1 (Waters, Inc.) connected to a Q Exactive mass spectrometer (Thermo Scientific) [synonym: Ca. Entotheonella sp. TSY1; NCBI accession no. AZHW00000000.1, equipped with a Digital PicoView source (New Objective). Peptides were Integrated Microbial Genomes (IMG) submission ID 35878] and Ca.E.gemina trapped on a Symmetry C18 trap column (5 μm, 180 μm × 20 mm; Waters, Inc.) TSY2 (synonym:Ca.Entotheonella sp. TSY2; NCBI accession no. AZHX00000000.1, and separated on a BEH300 C18 column (1.7 μm, 75 μm × 150 m; Waters, Inc.) IMG Submission ID 35877) created by the IMG Expert Review (IMG-ER) system −1 at a flow rate of 250 nL·min using a gradient from 1% of solvent B (0.1% (53). We performed extensive manual reanalyses of genes. For automated gene formic acid in acetonitrile)/99% of solvent A (0.1% formic acid in water) to function and pathway predictions, we used the automatic annotations provided 40% (vol/vol) of solvent B/60% of solvent A within 90 min. Mass spectrometer by the IMG-ER platform (53). For manual predictions, we used a combination of settings were set for a data-dependent analysis, with a precursor scan range of searches in the UniProt database (54), as well as a hidden Markov model search 350–1500 m/z, resolution of 70,000, maximum injection time of 100 ms, and in the NCBI Conserved Domains (55), PFAM (45), and TIGRFAMs (56) databases. 6 threshold of 3 × 10 and with a fragment ion scan range of 200–2000 m/z, Reference metabolic pathways were retrieved from Metacyc (57) and the Kyoto 5 resolution of 35,000, maximum injection time of 120 ms, and threshold 1 × 10 . Encyclopedia of Genes and Genomes (58). For the prediction of ABC trans- Database searches were performed using Mascot (database: NCBI_nr, all spe- porter families, the blast function of the ABC transporter database ABCdb cies) and MaxQuant (25) for searching against a database containing only the (26) was used. deduced protein sequences of Ca. Entotheonella. PFAM Abundance Analysis. A set of 100 genomes representing a wide range of Single-Cell Fluorescence Emission Spectra Analysis. For confocal microscopy, phyla and ecological lifestyles was chosen for a function profile analysis in IMG-ER cells were dried and mounted in aqueous mounting media (Vectashield; Vector (Dataset S3). All available functions from the PFAM database (16,230 entries) Laboratories, Inc.). Images were acquired using a Zeiss LSM 710 confocal system were selected. All noninformative rows (i.e., PFAMS not present in the dataset) mounted on a Zeiss AxioObserver.Z1 microscope with an oil immersion Plan- were removed. To obtain a median-centered matrix, the median abundance × Apochromat 63 /1.4-N.A. differential interference contrast (DIC) objective lens. over 100 genomes of each row was computed and subtracted from each cell. Excitation was performed using the 405-nm diode laser line, and emission was collected at various emission wavelengths with a photo multiplier tube. Molecular Phylogenetic Analyses. Selected protein sequences of known MDHs Lambda stacks were collected with a step size of 5 nm for Ca.E.factorand,due described by Keltjens et al. (28) were obtained from the NCBI. Primary se- to photo bleaching, with a step size of 10 nm for M. smithii DSM 861 (American quences were aligned using the MUSCLE algorithm (59) implemented in Type Culture Collection 35061) cells. All confocal images were acquired with a MEGA6 (60). Evolutionary analyses were also conducted in MEGA6 using the pinhole of 600 μm, a frame size of 664 × 664 pixels, and a scan zoom of 3. maximum likelihood method based on the model of Le and Gascuel (61). A Image analysis was performed with Zen2012 (www.zeiss.de/corporate/de_de/ discrete gamma distribution, including invariable sites, was used to model home.html)andImageJ(https://imagej.nih.gov/ij/) software. evolutionary rate differences among sites.

Localization of Cyclotheonamide A, Onnamide A, and Ca. Entotheonella. MALDI- ACKNOWLEDGMENTS. We thank Shigeki Matsunaga, Kentaro Takada (Uni- IMS experiments and CARD-FISH experiments were performed as described versity of Tokyo), and Toshiyuki Wakimoto (Hokkaido University) for sponge previously (21). Briefly, cryopreserved T. swinhoei Y sponge was cut into 50-μm samples; Micheal Wilson [Eidgenössische Technische Hochschule Zurich (ETH slices using a microtome (Microm; Thermo Fisher). Representative subsequent Zurich)] for isolation of Ca. Entotheonella cells from sponge samples; Bidong sponge tissue slices were then transferred either to stainless-steel target plates Nguyen (ETH Zurich) for providing a culture of M. smithii; Tobias Schwarz

Lackner et al. PNAS Early Edition | 9of10 Downloaded by guest on September 25, 2021 (ScopeM, ETH Zurich) for technical support in confocal microscopy; Peter discussions on methanol metabolism. This work was supported by grants from Hunziker (Functional Genomics Center Zurich) for proteomics experiments; and the Swiss National Foundation (205321_165695 to J.P.), European Union Tobias Erb (Max Planck Institute) and Julia Vorholt (ETH Zurich) for valuable (BluePharmTrain to J.P.), and Alexander von Humboldt Foundation (to G.L.).

1. Blunt JW, et al. (2009) Marine natural products. Nat Prod Rep 26(2):170–244. 30. Misset-Smits M, van Ophem PW, Sakuda S, Duine JA (1997) Mycothiol, 1-O-(2′-[N-acetyl-L- 2. Hentschel U, Piel J, Degnan SM, Taylor MW (2012) Genomic insights into the marine cysteinyl]amido-2′-deoxy-alpha-D-glucopyranosyl)-D- myo-inositol, is the factor of NAD/ sponge microbiome. Nat Rev Microbiol 10(9):641–654. factor-dependent formaldehyde dehydrogenase. FEBS Lett 409(2):221–222. 3. Taylor MW, Radax R, Steger D, Wagner M (2007) Sponge-associated microorganisms: 31. Maia LB, Moura JJ, Moura I (2015) Molybdenum and tungsten-dependent formate Evolution, ecology, and biotechnological potential. Microbiol Mol Biol Rev 71(2):295–347. dehydrogenases. J Biol Inorg Chem 20(2):287–309. 4. Bewley CA, Faulkner DJ (1998) Lithistid sponges: Star performers or hosts to the stars. 32. Yang M, et al. (2013) Atmospheric deposition of methanol over the Atlantic Ocean. Angew Chem Int Ed 37(16):2163–2178. Proc Natl Acad Sci USA 110(50):20034–20039. 5. Gurgui C, Piel J (2010) Metagenomic approaches to identify and isolate bioactive 33. Chistoserdova L, Chen SW, Lapidus A, Lidstrom ME (2003) Methylotrophy in Methyl- natural products from microbiota of marine sponges. Methods Mol Biol 668:247–264. obacterium extorquens AM1 from a genomic point of view. J Bacteriol 185(10):2980–2987. 6. Bewley CA, Holland ND, Faulkner DJ (1996) Two classes of metabolites from Theonella 34. Sidhu H, et al. (1997) DNA sequencing and expression of the formyl swinhoei are localized in distinct populations of bacterial symbionts. Experientia transferase gene, frc, from Oxalobacter formigenes. J Bacteriol 179(10):3378–3381. 52(7):716–722. 35. Cerrano C, et al. (1999) Calcium oxalate production in the marine sponge Chondrosia 7. Schmidt EW, Obraztsova AY, Davidson SK, Faulkner DJ, Haygood MG (2000) Identifi- reniformis. Mar Ecol Prog Ser 179:297–300. cation of the antifungal peptide-containing symbiont of the marine sponge Theonella 36. Poole AM, Logan DT, Sjöberg BM (2002) The evolution of the ribonucleotide reduc- swinhoei as a novel delta-proteobacterium, “Candidatus Entotheonella palauensis”. Mar tases: Much ado about oxygen. J Mol Evol 55(2):180–196. Biol 136(6):969–977. 37. Pelz A, et al. (2005) Structure and biosynthesis of staphyloxanthin from Staphylo- 8. Brück WM, Sennett SH, Pomponi SA, Willenz P, McCarthy PJ (2008) Identification of coccus aureus. J Biol Chem 280(37):32493–32498. the bacterial symbiont Entotheonella sp. in the mesohyl of the marine sponge Dis- 38. Clauditz A, Resch A, Wieland KP, Peschel A, Götz F (2006) Staphyloxanthin plays a role codermia sp. ISME J 2(3):335–339. in the fitness of Staphylococcus aureus and its ability to cope with oxidative stress. 9. Kalesse M (2000) The chemistry and biology of discodermolide. ChemBioChem 1(3): Infect Immun 74(8):4950–4953. 171–175. 39. Liu GY, et al. (2005) Staphylococcus aureus golden pigment impairs neutrophil killing 10. Gunasekera SP, Gunasekera M, Longley RE, Schulte GK (1990) Discodermolide—a new and promotes virulence through its antioxidant activity. J Exp Med 202(2):209–215. bioactive polyhydroxylated lactone from the marine sponge Discodermia dissoluta. 40. Kutta RJ, et al. (2015) The photochemical mechanism of a B12-dependent photore- J Org Chem 55(16):4912–4915. ceptor protein. Nat Commun 6:7907. 11. Schirmer A, et al. (2005) Metagenomic analysis reveals diverse polyketide synthase 41. Rohmer M (1999) The discovery of a mevalonate-independent pathway for isoprenoid gene clusters in microorganisms associated with the marine sponge Discodermia biosynthesis in bacteria, algae and higher plants. Nat Prod Rep 16(5):565–574. dissoluta. Appl Environ Microbiol 71(8):4840–4849. 42. Hamada T, Sugawara T, Matsunaga S, Fusetani N (1994) Bioactive marine metabolism. 12. Keren R, Lavy A, Ilan M (2016) Increasing the richness of culturable -tolerant 56. Polytheonamides, unprecedented highly cytotoxic polypeptides from the marine bacteria from Theonella swinhoei by addition of sponge skeleton to the growth sponge Theonella swinhoei. 2. Structure elucidation. Tetrahedron Lett 35(4):609–612. medium. Microb Ecol 71(4):873–886. 43. Tsukamoto S, Matsunaga S, Fusetani N, Toh-e A (1999) Theopederins F-J: Five new 13. Keren R, Lavy A, Mayzel B, Ilan M (2015) Culturable associated-bacteria of the sponge antifungal and cytotoxic metabolites from the marine sponge, Theonella swinhoei. Theonella swinhoei show tolerance to high arsenic concentrations. Front Microbiol 6:154. Tetrahedron 55(48):13697–13702. 14. Lavy A, Keren R, Haber M, Schwartz I, Ilan M (2014) Implementing sponge physio- 44. Fusetani N, Matsunaga S (1993) Bioactive sponge peptides. Chem Rev 93(5): logical and genomic information to enhance the diversity of its culturable associated 1793–1806. bacteria. FEMS Microbiol Ecol 87(2):486–502. 45. Finn RD, et al. (2016) The Pfam protein families database: Towards a more sustainable 15. Wilson MC, et al. (2014) An environmental bacterial taxon with a large and distinct future. Nucleic Acids Res 44(D1):D279–D285. metabolic repertoire. Nature 506(7486):58–62. 46. Heider J (2001) A new family of CoA-transferases. FEBS Lett 509(3):345–349. 16. Piel J, et al. (2004) Antitumor polyketide biosynthesis by an uncultivated bacterial symbiont 47. Boll M, et al. (2001) Redox centers of 4-hydroxybenzoyl-CoA reductase, a member of the marine sponge Theonella swinhoei. Proc Natl Acad Sci USA 101(46):16222–16227. of the xanthine oxidase family of molybdenum-containing enzymes. JBiolChem 17. Freeman MF, et al. (2012) Metagenome mining reveals polytheonamides as post- 276(51):47853–47862. translationally modified ribosomal peptides. Science 338(6105):387–390. 48. Suwa Y, et al. (1996) Characterization of a chromosomally encoded 2,4-dichlorophenoxyacetic 18. Freeman MF, Helf MJ, Bhushan A, Morinaka BI, Piel J (November 28, 2016) Seven acid/alpha-ketoglutarate dioxygenase from Burkholderia sp. strain RASC. Appl Environ enzymes create extraordinary molecular complexity in an uncultivated bacterium. Nat Microbiol 62(7):2464–2469. Chem, 10.1038/nchem.2666. 49. Selengut JD, Haft DH (2010) Unexpected abundance of coenzyme F(420)-dependent 19. Ueoka R, et al. (2015) Metabolic and evolutionary origin of actin-binding polyketides enzymes in Mycobacterium tuberculosis and other actinobacteria. J Bacteriol 192(21): from diverse organisms. Nat Chem Biol 11(9):705–712. 5788–5798. 20. Wakimoto T, et al. (2014) Calyculin biogenesis from a pyrophosphate protoxin produced 50. Ling LL, et al. (2015) A new antibiotic kills pathogens without detectable resistance. by a sponge symbiont. Nat Chem Biol 10(8):648–655. Nature 517(7535):455–459. 21. Nakashima Y, Egami Y, Kimura M, Wakimoto T, Abe I (2016) Metagenomic analysis of 51. Piddock LJ (2015) Teixobactin, the first of a new class of antibiotics discovered by the sponge Discodermia reveals the production of the cyanobacterial natural product iChip technology? J Antimicrob Chemother 70(10):2679–2680. kasumigamide by ‘Entotheonella’. PLoS One 11(10):e0164468. 52. Liu F, Li J, Feng G, Li Z (2016) New genomic insights into “Entotheonella” symbionts in 22. Freeman MF, Vagstad AL, Piel J (2016) Polytheonamide biosynthesis showcasing the Theonella swinhoei: Mixotrophy, anaerobic adaptation, resilience, and interaction. metabolic potential of sponge-associated uncultivated ‘Entotheonella’ bacteria. Curr Front Microbiol 7(1333):1333. Opin Chem Biol 31:8–14. 53. Markowitz VM, et al. (2012) IMG: The Integrated Microbial Genomes database and 23. Morinaka BI, et al. (2014) Radical S-adenosyl methionine epimerases: Regioselective comparative analysis system. Nucleic Acids Res 40(Database issue):D115–D122. introduction of diverse D-amino acid patterns into peptide natural products. Angew 54. UniProt Consortium (2015) UniProt: A hub for protein information. Nucleic Acids Res Chem Int Ed Engl 53(32):8503–8507. 43(Database issue):D204–D212. 24. Wilson MC, Piel J (2013) Metagenomic approaches for exploiting uncultivated bac- 55. Marchler-Bauer A, et al. (2011) CDD: A Conserved Domain Database for the functional teria as a resource for novel biosynthetic enzymology. Chem Biol 20(5):636–647. annotation of proteins. Nucleic Acids Res 39(Database issue):D225–D229. 25. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, in- 56. Haft DH, et al. (2013) TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res dividualized p.p.b.-range mass accuracies and proteome-wide protein quantification. 41(Database issue):D387–D395. Nat Biotechnol 26(12):1367–1372. 57. Caspi R, et al. (2014) The MetaCyc database of metabolic pathways and enzymes and 26. Fichant G, Basse MJ, Quentin Y (2006) ABCdb: An online resource for ABC transporter the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 42(Database repertories from sequenced archaeal and bacterial genomes. FEMS Microbiol Lett issue):D459–D471. 256(2):333–339. 58. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M (2016) KEGG as a reference 27. Delmotte N, et al. (2009) Community proteogenomics reveals insights into the resource for gene and protein annotation. Nucleic Acids Res 44(D1):D457–D462. physiology of phyllosphere bacteria. Proc Natl Acad Sci USA 106(38):16428–16433. 59. Edgar RC (2004) MUSCLE: Multiple sequence alignment with high accuracy and high 28. Keltjens JT, Pol A, Reimann J, Op den Camp HJ (2014) PQQ-dependent methanol throughput. Nucleic Acids Res 32(5):1792–1797. dehydrogenases: Rare-earth elements make a difference. Appl Microbiol Biotechnol 60. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evo- 98(14):6163–6183. lutionary Genetics Analysis version 6.0. Mol Biol Evol 30(12):2725–2729. 29. Pol A, et al. (2014) Rare earth metals are essential for methanotrophic life in volcanic 61. Le SQ, Gascuel O (2008) An improved general amino acid replacement matrix. Mol mudpots. Environ Microbiol 16(1):255–264. Biol Evol 25(7):1307–1320.

10 of 10 | www.pnas.org/cgi/doi/10.1073/pnas.1616234114 Lackner et al. Downloaded by guest on September 25, 2021