The ISME Journal (2021) 15:1641–1654 https://doi.org/10.1038/s41396-020-00876-9

ARTICLE

A genomic view of the microbiome of coral reef demosponges

S. J. Robbins¹ ● W. Song² ● J. P. Engelberts¹ ● B. Glasl³ ● B. M. Slaby⁴ ● J. Boyd¹ ● E. Marangon³,⁵ ● E. S. Botté³ ● P. Laffy³ ● T. Thomas² ● N. S. Webster¹,³

Received: 21 August 2020 / Revised: 23 November 2020 / Accepted: 7 December 2020 / Published online: 19 January 2021 © The Author(s) 2021. This article is published with open access

Abstract

Sponges underpin the productivity of coral reefs, yet few of their microbial symbionts have been functionally characterised. Here we present an analysis of ~1200 metagenome-assembled genomes (MAGs) spanning seven sponge species and 25 microbial phyla. Compared to MAGs derived from reef seawater, sponge-associated MAGs were enriched in glycosyl hydrolases targeting components of sponge tissue, coral mucus and macroalgae, revealing a critical role for sponge symbionts in cycling reef organic matter. Further, visualisation of the distribution of these genes amongst symbiont taxa uncovered functional guilds for reef organic matter degradation. Genes for the utilisation of sialic acids and glycosaminoglycans present in sponge tissue were found in specific microbial lineages that also encoded genes for attachment to sponge-derived fibronectins and cadherins, suggesting these lineages can utilise specific structural elements of sponge tissue. Further, genes encoding CRISPR and restriction-modification systems used in defence against mobile genetic elements were enriched in sponge symbionts, along with eukaryote-like motifs thought to be involved in maintaining host association. Finally, we provide evidence that many of these sponge-enriched genes are laterally transferred between microbial taxa, suggesting they confer a selective advantage within the sponge niche and therefore play a critical role in host ecology and evolution.

Supplementary information The online version of this article (https://doi.org/10.1038/s41396-020-00876-9) contains supplementary material, which is available to authorized users.

* N. S. Webster [email protected]

1 Australian Centre for Ecogenomics, University of Queensland, Brisbane, QLD 4072, Australia
2 Centre for Marine Science & Innovation, University of New South Wales, Kensington, NSW 2052, Australia
3 Australian Institute of Marine Science, Townsville, QLD 4810, Australia
4 GEOMAR Helmholtz Centre for Ocean Research Kiel, Düsternbrooker Weg 20, 24105 Kiel, Germany
5 College of Science and Engineering, James Cook University, Townsville, QLD 4810, Australia

Introduction

Coral reefs are among the most productive ecosystems in the world and are frequently referred to as ‘rainforests of the sea’ due to their immense biodiversity [1]. However, despite their exceptionally high primary productivity, nutrient levels on tropical reefs are typically low, necessitating efficient mechanisms for nutrient retention. Marine sponges are therefore an essential component of reef ecosystems because they filter large volumes of seawater (up to thousands of litres per day [2]) from which they retain the organic matter and transform it into biomass that can be consumed by detritivores, ultimately recycling it back into the reef system in a process known as the “sponge-loop” [3]. Reef-dwelling sponges harbour stable and diverse microbial communities that can account for up to 35% of sponge biomass and are hypothesised to carry out functions that support their host’s health and ecology, such as the transformation of carbon (e.g. polysaccharides) [4–6], nitrogen (e.g. archaeal ammonia oxidation) [7] and sulfur [8], as well as providing essential vitamins and amino acids to the host [9]. Sponges also play host to a diverse array of mobile genetic elements (MGE), such as viruses [10, 11], that may necessitate a diverse array of defence mechanisms like clustered regularly interspaced short palindromic repeat (CRISPR) and restriction-modification (RM) systems. The presence of such mobile elements raises the possibility of lateral gene transfer (LGT) between symbionts, though evidence for this is currently lacking.

Despite their importance for host health, few community-level functional investigations have been undertaken to capture the broad range of microbial taxa found in sponges [8, 12].

Instead, gene-centric studies have identified a number of interesting sponge symbiont traits but have been unable to link these features to specific microbial taxa. In addition, nearly all genome-centric characterisations of sponge-associated microbes have been restricted to a few lineages of interest [13–15] or have focussed on low-abundance microorganisms amenable to cultivation [16], with the majority of lineages remaining undescribed. This skew likely biases our understanding of the roles that each symbiont lineage plays within the microbiome and hinders our ability to identify features that underpin sponge-microbe symbiosis. To address this, we undertook an integrated analysis of 1188 metagenome-assembled genomes (MAGs) derived from seven marine sponge species, spanning 25 microbial phyla and the vast majority of microbial taxa commonly found in marine sponges.

Materials and methods

Sample collection and enrichment of microorganisms from sponges

This study comprised seven host species belonging to the class Demospongiae (hereafter referred to as sponges for simplicity), representing the sub-classes Heteroscleromorpha (Cliona orientalis and Stylissa flabelliformis), Verongimorpha (Aplysina aerophoba) and Dictyoceratida (Rhopaloeides odorabile, Coscinoderma matthewsi, Ircinia ramosa and Carteriospongia foliascens). Two individuals of R. odorabile were collected from Esk Island (18° 45.830′S; 146° 31.159′E) and one from Falcon Island (18° 46.116′S; 146° 32.201′E) on the 25th of October, 2018. Four individuals of C. matthewsi, C. foliascens, S. flabelliformis, I. ramosa and C. orientalis were collected from Davies Reef (18° 49.948′S; 147° 37.995′E) between the 22nd and the 23rd of December, 2015. All sponges were rinsed in filter-sterilised seawater before being snap-frozen in liquid nitrogen and stored at −80 °C. Cell fractionation was performed according to methods described in Botté et al. [17]. Briefly, sponges were rinsed twice in calcium-magnesium free seawater before being cut into 1 cm³ pieces and homogenised. Due to its bioeroding lifestyle, C. orientalis was crushed in liquid nitrogen with a mortar and pestle to separate sponge tissue from the coral skeleton. Sterile collagenase (Sigma-Aldrich) was added at a concentration of 0.5 g L⁻¹ and samples were shaken at 150 rpm for 30 min. Samples were filtered through 100 μm sterile cell strainers and centrifuged at 100 × g for 1 min before recovering the supernatant and centrifuging at 300 × g for 15 min. This last step was repeated and the resulting supernatant was filtered sequentially through 8 and 5 μm filters. Microbial pellets were recovered by centrifuging at 8000 × g for 20 min, resuspended in Tris-HCl NaCl at pH 8.0 and kept at −20 °C. To minimize carry over of eukaryotic DNA from the sponge, the microbial cell pellet was treated with DNase (DNase I; New England Biolabs) prior to cell lysis for DNA extraction. In addition, 10 L of inshore coastal seawater was collected from the Sea Simulator (SeaSim) at the Australian Institute of Marine Science in October 2016 for metagenomic sequencing. Seawater was filtered first through a 5 μm prefilter and then collected onto a 0.2 μm sterivex filter.

Metagenomic sequencing

Sponge samples were extracted using the Qiagen MagAttract PowerSoil DNA kit as described by Marotz et al. [18] or the DNeasy PowerBiofilm kit (QIAGEN) for Rhopaloeides. Metagenomic DNA from R. odorabile was sequenced at the Ramaciotti Centre for Genomics (University of New South Wales, Sydney, Australia), while other sponge species were sequenced at the University of California San Diego as part of the Earth Microbiome Project (EMP). Metagenomic sequencing was performed over several iterations as part of the EMP using the Kapa HyperPlus and Nextera XT library prep kits, with sequencing performed on Illumina HiSeq 4000 and NovaSeq 6000 machines (2 × 151 bp). For R. odorabile, library prep was performed using the Nextera DNA Flex library prep kit and sequenced on a NextSeq 500 machine.

Seawater from the SeaSim was extracted by adding 100 mg/mL lysozyme directly to the sterivex cartridge and incubating for 1 h at 37 °C with gentle rotation, followed by the addition of 20 mg/mL proteinase K and incubation for 1 h at 55 °C. DNA clean-up was performed by adding an equal volume of Phenol:Chloroform:Isoamyl alcohol (25:24:1) to the lysate in a separate tube, spinning at 16,000 × g for 10 min, and recovering the aqueous layer, then repeating using Chloroform:Isoamyl alcohol (24:1). DNA was precipitated by adding 0.8 volumes of isopropanol, mixing gently, incubating for 15 min at room temperature, then spinning at 20,000 × g at 4 °C for 20–30 min to pellet DNA. Isopropanol was removed and 500 µl 70% ethanol was added to the pellet, then spun at 20,000 × g for 10 min and removed. The remaining pellet was allowed to dry for 30 min, then resuspended in 30 µl PCR grade water and stored at −80 °C until metagenomic sequencing at the Australian Centre for Ecogenomics. Sequencing was carried out using the Nextera Flex library prep kit on the NextSeq500 platform (2 × 150 bp).

Metagenome assembly, binning and taxonomic assignment of bacterial and archaeal MAGs

Sponge-associated reads belonging to the same sample and sequenced across multiple runs were concatenated into a single set of forward and reverse fastq files. Both sponge and SeaSim seawater samples were processed as follows: Adapter clipping of the reads was performed using seqpurge v0.1-852-g5a7f2d2 [19] and read sets from each sample were assembled using metaSPAdes v3.9.0 [20]. Reads from each sponge species, or seawater source, were mapped in an all-versus-all manner using BamM v1.7.3 (https://github.com/Ecogenomics/BamM), a wrapper script leveraging the BWA mapping algorithm [21], to obtain BAM files for differential coverage estimation. Binning was performed using UniteM v0.0.16 (https://github.com/dparks1134/UniteM). In brief, UniteM uses the binning algorithms maxbin v2 v2.2.4, metabat v1 v0.32.4 using all parameter sets (e.g. very sensitive, etc.), metabat v2 v2.12.1, and groopM2 v2 v2.0.0-1, and selects the best MAGs by their CheckM [22] quality scores. For A. aerophoba, six previously published [12] datasets (PRJNA366444–PRJNA366449 and PRJNA326328) were co-assembled using Megahit v1.1.3 [23] and binning was performed with metaWRAP v1.0.1, which uses metabat, metabat v2 and maxbin v2. Taxonomy was assigned to each MAG based on the Genome Taxonomy Database (GTDB) [24] (http://gtdb.ecogenomic.org) using GTDB-Tk [25], which classifies MAGs based on placement in a reference tree inferred using a set of 120 bacterial and 122 archaeal concatenated gene markers using a combination of FastANI and pplacer [26, 27]. The relative abundance of each MAG from separate individuals was calculated with CoverM v0.2.0 (https://github.com/wwood/CoverM), after dereplicating all MAGs at 95% identity with dRep v2.2.4 [28] to avoid arbitrary mapping between representatives of highly similar genomes. MAGs with an overall abundance >5% in at least one sample were included in the heat map visualised with R v3.5.1 [29] (Fig. S1). Additional MAGs from coral reef seawater were obtained from other studies [30, 31]. Information regarding the MAGs (taxonomy, MIMAG standard statistics, etc.) presented in this study can be found in Table S1.

Identification and analysis of CuMMO family genes

GraftM [32] was used to recover copper-dependent membrane-bound monooxygenase (CuMMO) family genes (GraftM package ID: 7.22) from the sponge-associated MAGs and a published metagenomic dataset from sponges where bacterial amo genes had previously been reported [33]. Briefly, GraftM uses gene-specific (e.g. ammonia monooxygenases (amoA)) hidden Markov models (HMMs) to identify and extract sequences from metagenomic reads or assemblies and inserts them into a reference tree to assign them to pre-defined functional clades. A phylogenetic tree was then inferred from these sequences with IQ-TREE v1.6.12 [34], including previously published CuMMO sequences [35, 36] after alignment with MAFFT v7.221 [37]. The tree was rooted between the domains Archaea and Bacteria, grouped into functional clades based on the grouping in Alves et al. (Fig. S2) [35], and refined in iTOL v4.2.3 [38].

Gene annotation and statistics

To ensure that the observed presence/absence of genes was not influenced by MAG completeness, only a set of highly complete MAGs (>85% completeness) were annotated with the “annotate” function of EnrichM v0.2.1 (https://github.com/geronimp/enrichM) using the Pfam database to identify eukaryote-like repeat (ELR) proteins and the Kyoto Encyclopaedia of Genes and Genomes (KEGG) Orthologies (KOs) to reconstruct metabolic pathways [39, 40], and the HMMs from dbCAN [41] to identify carbohydrate-active enzymes (CAZY) like glycosyl hydrolases (GHs) and carbohydrate esterases [42] (Files S1–S3). EnrichM’s “annotate” function was also used to identify orthologous clusters between highly complete MAGs (Fig. S3 and File S4). For amino acid synthesis pathways, EnrichM’s “classify” function was used to calculate the completeness of KEGG modules, which are groups of genes organised by steps in a metabolic pathway as defined by KEGG. To identify genes and pathways enriched in sponge-associated MAGs (which may therefore represent functions important for sponge-microbe symbiosis), EnrichM’s “enrichment” function (https://github.com/geronimp/enrichM) was used to perform statistical tests for the enrichment of KEGG modules, CAZY genes and Pfams between highly complete sponge and seawater-associated MAGs (Tables S2, S3 and S4). Enrichment of KOs within each module was calculated individually using a two-sided Mann–Whitney U-test and only modules for which >70% of KOs were enriched were further considered. For Pfam and CAZY enrichment calculations, EnrichM directly compares the number of proteins per MAG matching a given Pfam or CAZY family using a two-sided t-test. For both KEGG and Pfam comparisons, a Benjamini–Hochberg correction was applied to control the false discovery rate across multiple comparisons. The ggplot package was used through EnrichM to make PCA plots based on the gene content of each MAG (KEGG, Pfam, orthologous clusters of genes).

Tree building and visualisation

To visualise the phylogenetic distribution of the recovered MAGs, phylogenetic trees were constructed by supplying the bacterial and archaeal concatenated marker gene alignments produced by GTDB-Tk to IQ-TREE using the LG + G model [34]. Bacterial and archaeal trees were initially constructed from all sponge-associated MAGs with >50%
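As an illustration of the enrichment screen described under “Gene annotation and statistics”, the following is a minimal Python sketch of the two post-test steps: a Benjamini–Hochberg adjustment of per-KO p-values, followed by the rule that a KEGG module is retained only when >70% of its KOs are enriched. It is not EnrichM's code. The function names, the input dictionaries and the 0.05 significance threshold are assumptions made for the example, and the per-KO p-values are taken as given (for instance from a two-sided Mann–Whitney U-test). For simplicity the sketch treats any significant KO as enriched and does not check the direction of the difference.

```python
from typing import Dict, List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_modules(
    ko_p_values: Dict[str, float],   # hypothetical input: KO id -> raw p-value
    modules: Dict[str, List[str]],   # hypothetical input: module id -> KO ids
    alpha: float = 0.05,             # assumed threshold, not stated in the text
    min_fraction: float = 0.70,      # the ">70% of KOs" rule from the text
) -> List[str]:
    """List modules in which more than min_fraction of KOs are significant."""
    kos = list(ko_p_values)
    adj = dict(zip(kos, benjamini_hochberg([ko_p_values[k] for k in kos])))
    kept = []
    for module, members in modules.items():
        # KOs without a p-value are counted as not significant.
        hits = sum(1 for ko in members if adj.get(ko, 1.0) < alpha)
        if members and hits / len(members) > min_fraction:
            kept.append(module)
    return kept
```

With per-KO p-values already computed, a call such as `enriched_modules(p_by_ko, kos_by_module)` would return the module identifiers that pass both steps. The real pipeline may differ in how it pools tests before correction.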

[Figure: tree of MAGs with tip labels such as “COS36386 bin 4” and “IRC PAM SB0661 bin 21”; individual tip labels are not recoverable from the extracted text.
Legend, sponge species (outer strip): Agelas tubulata, Amphilectus fucorum, Amphimedon queenslandica, Aplysina aerophoba, Arenosclera brasiliensis, Carteriospongia foliascens, Cliona orientalis, Coscinoderma matthewsi, Crambe crambe, Cymbastela concentrica, Dysidea avara, Halichondria oshoro, Haliclona cymaeformis, Ircinia ramosa, Ircinia variabilis, Lophophysema eversa, Melophlus sarasinorum, Mycale laxissima, Ophlitaspongia papilla, Petrosia ficiformis, Pseudoceratina sp., Rhopaloeides odorabile, Spongia officinalis, Stylissa flabelliformis.
Legend, phyla (clades): Poribacteria, Gemmatimonadota, Bacteroidota, Latescibacterota, Spirochaetota, Verrucomicrobiota, Planctomycetota, Firmicutes, Cyanobacteriota, Deinococcota, Actinobacteriota, Patescibacteria, Chloroflexota, Acidobacteriota, Myxococcota, Entotheonellota, Nitrospinota, Nitrospirota, Bdellovibrionota, Dadabacteria; domain labels: Archaea, Bacteria.]
bin M A S O2 R b B bin 15 H in 0 RH 5 O1 5 66 40 RHO2 8 8 6 b b 36 1 in in in 8 14 OS b A bi 1 C 6388 bin 12 PA 3 n I b 7 OS 1 R in 7 C COS4 2 C 2 P 8 in IR A b M A 4 C P P S A A AM B bin P 0 A 6 263171v1 genomic 3 b S 6 TY3 in i 4 C n B S 4 O 1 0 b ASM in S 0 6 in b C 1 2 7 TY2 b 0 4 5.1 S 5 O b b 0 in S in in b CO 2 1 STY1 b 6 36 63171 S in TY4 C 4 2 S O 3 002 7 b A 4 S in IR 4 2 n bi 7 C n P IR A 2 C M 0 AM SB0664 HO2bin 41 bi lavyIlan GC P R I P SB S in 20 R AM C b C 0 R 6 I RHO1 bin 52 C P S 6 THE AM S B 5 O 0 S2 6 b 104 6 in Suberites sp. R B 1 3 b 9 H in 0 b COS36386 bin 5 O 66 i A n C 2 2 n OS36388 bin 21 4 1 P OS b 4 C A 2 bi 31 in b 6 n C 3 in OS 6 8 06 bi 6 3 3 3 B 4 3 8 AP 6 6 8 4 bin 7 A 04 b AM S COS A CO in P P b A in b 1 S 2 in 2 IRC R 3 3 1 in 24 H 6 8 CAR2 bin 15 4 I O 0 in R 3 4 C b b AR1 b b i i C 52 I P n n R AM S 1 1 Gammaproteobacteria C 0 9 bin 2 CAR4 2 IR P 8 A B 4 bin 9 C M 0 C 6 in 1 P S 6 AR b 2 O AM B 1 C 2 R S 0 b H 4 SB 6 i AM SB066 7 b 6 n P O i 3 3 CAR 3 bin 1 1 n 0 R 4 6 b 8 bin 4 H b 9 7 in IRC n O3 in 5 1 CAR 67 C b 4 0 OS 1 b 2 in I i 1 SB R 4 n C b 1 9 M COS4 bi 4 in 6 A IR bi P 56 23 C n I genomic R P 1 IRC in OS36386 bin 3 b C A C v1 10 C PA M S n O 2 M bi n S B RHO2 5 R 3 0 7 6 S 66 H 3 B 77 bi CO O 8 0 1 1 6 6 b ASM154300 06 41 b 6 i B R S b 6 n 3 in in .1 H b 3 AM SB06 bin 6 3 2 i 8 Tedania sp. C O 4 3 n 005 P IRC4 bin 34 in 1 3 06 4 2 AM S O 4 P S b n 2 A b IRC 2 b i 3 in i RHO3 b P 6 n RC A 4 1 I 4 2 001543 STY R b 04 2 A H in STY1 O b R 2 i in 8 1 9 n GC b H b n TY3 bin 7 4 O 2 S RH i 0 ia n 2 n

TY4 O3 bi 4 2 S AP n 1 b bin 10 A 2 STY4 bi 1 R in 7 HE tianQ bin 6 H b Y 3 in 1 R O 3 2 ST in 2 H 1 LOP b 1 STY R O b 2 HO 3 in R2 A b 1 in A 3 i 9 C 1 P n A b 2 b C i 3 STY1 bin 1 b n 6 bin A in 3 STY R R 7 HO 3 4 TY4 7 S 35 R b 3 in 2 H b 2 1 R O1 i 5 n 1 H 1 O b 40 n I RC 2 in b O2 bin I 1 R 0 AM SB0678 SB0668 bin 24 bin c P in P RH C A 386 bi IR M 3 Thaumarchaeota 6 P 1 C AM S PAM IR S IRC enomi P B C CAR1 bin 19 A 0 IRC COS3 IR 3 M 66 v1 g bi B ic S C 0 1 n S m T 4 B0 6 b Y b 1 6 STY 2 in 3 no 1 i 6 6 n b c b 6 S 28 4 in 0 3 in mi T b 3 o ic b 8 S Y2 in 9 Theonella Swinhoei ASM175062 m i c T n spAU67 2606217183 31 Alg231 54 genomic Y b 5 5.1 a gen C i 4 O 4 n 2 0 6 C S b 5 M74370v1 ge 24 i 0 n O 3 n A 30 geno bi S 6 6 AS 231 PA 1 3 4 0175 lg C 6 0 ASM15813v1A genomic S0 O 5 0 uegeria sp. b 4 A R C S3 in 06 b Alg231 i um CAR3 O n s2016 Ruegeri 3705.1 i 6 9 C 6 b 2 m Alg231 49 genomi CAR1 bin 18 1 S3 3 0 i u . O n 0 i in 86 A S 6 1 0074 er b t 5 genomic P 3 4 9 0 b gnan GC Thoma 3 in 18 A 6 0 00158135.1 R 3 6 in b H b ae bacter 2 8 bi rDe rimiCosta FZL31 3 RHO in 6 2 F 0 e bac maria sp 2 CAR4 O 7 ie eves 900143525.1 R 7 n t a in 1 1 7 bi h race AP 3 1 b n e ey Alg CA b 6 3 8 es l GC acea i il r AR2 bin 10 1 R A n 2 D GCF at b 8 gaut T C 7 HO3 7 E POO ka b in rvaLami GCF 1 R 4 T S la sp. 
1 bin in e anH l CAR AR1 bin H 6 z in C 5 bacte C b 59 L in 4 O O b 6 i dob ane CAR R S 1 n odo b AMPHQ 1 b H 3 6 h T MYC AR3 R 18 O 6 in 4 Lokt C H 4 RAC in 2 8 .1 R T E 0 C CA b 5 3 5 900143535.1 H S b 5.1 T in b karimiCosta2019 43635.1 Rhodobact 4 AR2 bin 1 E w CF 5 H i 1 C T S i 7 n G 3 AR4 E ls 0 genomic HE 2 4 3 S l o 9 1 C A iu 3 900 n 0 P S l L SPOO 0014361 3 bin 8 n 3 iu P 0 i R A i2 9 665 bin 10 L iel CF 9 6 w 01 a201 H b i L il 2 G B0 IRC O i G bin 26 O n s 016 6 9 n 7 2 o S R P 5 e C 6 n iCost bi H b 1 A 9 GCF C4 C H Pi e n B0676 b 1 i t 1 AM O E n o 0 a201 S R bin 14 n 6 O e n t P I t 0 s A 1 t 7 l to heo 20 3 S i G 0 ASM148320v1 M P b an 4 C I 4 th 5 RHO A Co R A bin 8 R b i C 2 9 n Q e I P 4 n 1 CO C A n 2 mi bi in o e R 7 i 4 7 a 0 ne C P n 9 9 ll 2 RHO2 bi A 1 bi bin S 2 n 0 a unknown sponge miCosta SPOO karim RC in A 5 kari i 1 O 3 I C RH 3 G 05 l F . r R 1 M 1 la 1 R3 b 6 S 6 K C F AM SB0673 2 n C 3 3 2 v OO CA A i n O1 S I karimiCosta2019 GCF P 63 86 A 2 K O0 3 C Y B 4 b S I O 001483205.1 M 0 0 45 g SP 6 88 P 1 A 5 bi 4 S TY3 b b 0 e RC CAR C 66 0 I bin 4 in 1 no PO 40 T in . 1 SPOO ka 9 5 1 S 6 n 2 S Y1 m bi 2 3 9 3 4 m TY v bi A b o 1 n 5 b 2 404 n 3 i i P i i 2 i 99 c S3640 17 b n t n 3 I 4 inh g OS3 b n R A i 0 O i n 3 5 5 e ndler GC C b 1 b n 42 I C b 1 . no C i R i 3 oT 1 O 8 i n OS36 7 IR C n RHO P 2 A m C H 7 1 3 I AM S 2 h C P S i R in R 4 omas c b I C AM SB0 M R PA 1 OS4 bin 7 R C P 5 SB06 76 bin 1 H M B0 C A 4 6 6 P R O 5 M 6 13 8 M S 29 0 in 24 A 662 94 A 6 4 R H 3 M SB0661 b B b n M S B P 0 H O 66 9 RC3 bin 5 bi b 0 3 in C SB B v1 I 2 O b HES britsteinStei C n 4 1 i 6 7 C1 C O n 0 6 T b 75 . 
n 2 b b in as g IR 4 5 O S 2 66 AM S IR bi C i 0 en b n 0 i P 3 C S 1 6 n 2 sembl IRC PA O i 2 7 bin AM SB 2 bi n 6 COS2 O 2 b 7 o IRC 1 C S 2 C P in 2 4 7 b m AR R 7 i 1 O S 3 b n i 2 7 C 1 n IR C b i 40 3 i c 1 OS S b n 7 3 CA C 6 i 9 e n IRC n 3 i AR4 bin R1 b bin i 2 C 3 n 3 d O 6 1 C A A b OS 4 8 C S 3 8 5 P 6 C O b 8 A bin C 3 A bin 1 P 3 i 7 b 1 A O S 6 n 4 i A 1 6 4 b n A P S 3 5 R bin 23 1 4 0 i A 6 2 AR2 bin 1 R P 36 n 1 n 2 0 4 3 i b 3 C 4 H A 6 1 CA b n A 8 b i 4 i 4 O b 8 bi 0 A P n i AR 2 b 05 n n i CAR 7 A P A 2 n b n 1 C 1 b 5 P A 2 i n C 4 b n 2 bi i b 6 A R i COS4 bin 3 n 10 A bi i 4 i 3 b n n n n RHO HO i 0 R H R b 1 12 n 2 bi b H 6 R 1 4 i R O 1 n 1 PA 4 9 5 1 R O in n H RHO A bin i 2 b 46 2 I O n H 3 b b i 3 R O i HO3 2 I b n R b n O b 2 i C 3 i R C n n C 2 RH 3 C 3 i b 3 n b 1 LI4 2 L 3 bi 3 C L OS 3 P b i LI1 n I b 4 4 COS4 C I C n i 1 I1 4 2 L A in C C n in 6 Axinella mexicana b i C A I 1 4 bin n 0 b M 3 CL I b i A b 33 7 1 I R i 2 I R CLI I3 b i n 1 R R b n 7 1 2 S L I n 0 0 I C i CL 2 2 R C 4 n I 1 B I bin 9 b C ic A R Q 5 1 bi c P 0 L C C b i CL i P L n P C n 66 i A i C O c A A n m 85 C P b C M 2 P M TY1 o 1 S A C O b 2 0 2 TY4 FZ A O A ic I 4 S R i M S 1 S a 14 ic I OS P S n b en R M S t genom 17 d I S nomi n R C b B i I 3 S s A g 2 i m e B n R 3 8 e om C 2 IR i c 6 S 0 o 6 b ae c C3 3 n i 6 0 1 g n P B 5 R C1 6 e c 4 0 R P 6 B C i 3 6 R C A 2 5 m 0 6 c s H 0 1 0 6 1 eno mbl 4 1 A b A 8 0 7 87 1 H 4 66 v o omi m e H 4 M 5 2 3 e A mi 2 g l 0 O b M 6 6 5 5 P n n I 3 1 ge s ria n o O i ri 2 n O 6 a i 0 R P b n 7 b b 130v 63 i I i b 3 6 4 A 2 5 s e 9 n S bi e e n 5 R d b R 1 t 5 A i 0 i 3 b C 3 S i 4 3 C b n g 2 n 2 n i 62 8v g e C B n b n n C b n b n 4 n I HO 3 i 7 C B Alg S3 3 i b 4 i b 6 g n i 1 9 R O 1 1 5 1 b b n 1 O i i 3 0 i 1 4.a ac 0 o i 5 R A . 
b 1 3 I 4 n n n b i 2 i 35 O ka b n O 0 O n R i i O R C S n 66 n 9 b 5 b n i n 6 1 4 O i 2 1 8 1 R b n S 1 m i H P 2 6 AU2 2 7 b 8 3 2 23 9 C S 1 43 o 4 b A C sp M1 C 1 H b 2 4 2 4 i C H 6 l H 2 1 1 o 4 b R A 941 1 n 7 6 1 n n 2 O 1 SM1 p 2 C R i 5 7 2 1 9 6 R i i 1 2 PO O n R O n P b 3 2 3 g R 6 5 1 n i 01 O O 6 3 1 6 n R b b 8 5 s yl R O RH 5 9 C 2 R H P A 66 5 8 A io i A 1 6 i 2 b b b b O H 1 A

R R 5 H A S R i b Alg n H 2 6 AS H r 3 0 b S i h 2 n 5 b i 2 7 n 4 i R 9 A 0 0 H S 3 H 0 b P n 7 b 9 M R i 7 H i n i b 8 1 O 2 n P n 2 2 1 n P O i 3 n

H n 38 3 H S I b io 2 4 4 8 i b g H 5 O i R i 5 2 n i O 3 1 P lg2 h L 2 b 4 n 1 l b n i 2 B 3 A M r 3 S B n O 8 A p. 2 O . i O as 5 A b n 2 4 1 1 n 1 0 2 n g 6 8 b i b i b 3

b O 1 6 p n O 5. i 6 n i 1 l Z i n O 3 n A b n 6 3 A n s s S i 5 5 b i n n i A S C i 36 n 7 i 1 5 9 4 3 n i i 3 b 1 S b 1 6 S i 2 4 0 n n i 1 m b 5 3 b 2 b i b n S F b S . 2 4 i 2 a A 3 3 b b b A 1 vi b 4 2 6 b b 6 6 b S R b i 41 b 0 dovi S 5 1 b 7 i

b i p . O i S O 1 n b 0

I b b .1 M b B 235 n i 5 o s M n P R n 2 1 6 8 3 6 a R i

b i n m b 0 u 13 6 4 5 6 b s 1 n p 4 t n R i 5 A O 1 i 6 5 i 9 d a C 3 A C 2 0 O 0 O n A A A A n 6

i n o A 0 4 i 5 3 S3 S R 8 8 O i 8 nzia Tho s s 3

n n R 6 S P n b u a sp. s CO A C 5 6 P C 8 0 P P b se B 4 4 O C H 3 4 o m C 8 e o A r C R 9

5 O O H 7 0 b e u i A 8 A 6

2 0 C 5 i n 1 6 9 6 P R A o S e 0 n

H 9 R 5 s 168 A i 015 C C C d I C iell t C C R 2 i 7 n 0 C 3 uTh 2 3 i CO 0 4 inh b 6 c R abr R C 1 0 P t R M S b l S I m 2 a a I L 1 5.1 i uTh i F h A r C b i O n O 6 16 rsen r 019 P a C .1 o moi C 5 M C 0 e o r 7 0 k

g Y h C GCF 0 55 C t 43 s2 n O C i s G 5 y R 1 I And r e h YMC l gt O 0 YM n ma p E o C P GCF .1 u o V 143 C S 90 1 n 5 S .

1 F 5 . Ant sTh 8 C 5 x arev 900 969 e 6 0 e d F l mpso 9 7

o 4 9 G 9 a 1 stev 4 P 0 01 bon 1 L e 90014 sTh 9 GC 0 2 L 0

e 1 9 a 0 C 0 PO 9 st MC F ro CF 2 o f F C MY a G

C CY i C G

19 G m 9 REB 0 Cost 1 9 A 0

2 kari 01

a 2 O t

s

O karimi o

O osta C i SP miCosta2 i C i m PO i ar r

S k a

k

O

karim O

PO O O

S O P

P S

S

Fig. 1 Phylogenetic tree showing all publicly available bacterial and archaeal MAGs and genomes (N = 1253 genomes) with >50% completeness and <10% contamination, recovered from 30 sponge species (Table S1). Inner clade colour denotes phylum affiliation except for the Proteobacteria, which is split by class into alpha- and gamma-proteobacteria. Outer tree colour strip identifies the sponge species from which the genome or MAG originated. Red stars indicate MAGs produced in this study.

completeness and <10% contamination, including those from previous studies (Fig. 1 and Table S1). To determine how much additional phylogenetic diversity the MAGs from this study added to the trees, phylogenetic distance, defined as the “total branch length spanned by a set of taxa”, and phylogenetic gain, defined as the “additional branch length contributed by a set of taxa” [24], were calculated for both the bacterial and archaeal trees using GenomeTreeTk v0.0.41 (https://github.com/dparks1134/GenomeTreeTk). A second tree constructed from all MAGs with >85% completeness, including those from seawater, was used as a basis for visualising the distribution of genes-of-interest (e.g. GHs, ELR, etc.) across lineages using iTOL [38] (Files S5 and S6).

Identification of lateral gene transfers

To identify laterally transferred genes within the sponge-associated microbial communities, MAGs associated with the same sponge species, as well as all seawater-derived MAGs, were first dereplicated with dRep v2.3.2 [28] at 99% average nucleotide identity. LGT analysis was performed using MetaCHIP v1.7.5 [43] on all dereplicated MAGs at phylum, class, order, family and genus levels. In brief, MetaCHIP first clusters query genomes according to their phylogenies, then performs an all-against-all BLASTN for all predicted genes. The BLASTN matches for each gene are then compared among taxa and a gene is considered to be a putative LGT if the best match comes from a non-self taxon. A phylogenetic approach is then applied to putative LGTs for further validation using Ranger-DTL [44] and the direction of gene flow is identified. False-positive LGTs, as could be introduced through MAG contamination, were filtered out by removing LGTs marked by MetaCHIP as “full-length match” (LGT makes up a large proportion of the contig) or “end match” (LGT falls at the end of the contig). For the current dataset, LGTs on average made up only 12 ± 14% of the total length of their respective contigs, ensuring that binning was based on compositional information (e.g. k-mers as used by MetaBat) from the non-LGT portion of the contig. LGTs detected at all taxonomic levels were combined and dereplicated. The genetic divergence of the dereplicated LGTs, as well as the frequency of LGT transfer (the number of LGTs per Mbp sequence data) per MAG, were summarised on the basis of taxon or host of MAGs and visualised using the Circlize package in R.
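The filtering and per-MAG frequency steps described above can be illustrated with a minimal Python sketch. It is not MetaCHIP's code or output format: the `LgtCall` record, its field names, the match-type labels and the choice to attribute each LGT to a single recipient MAG are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LgtCall:
    """Hypothetical record for one putative LGT (not MetaCHIP's real schema)."""
    gene_id: str
    recipient_mag: str
    level: str        # taxonomic level at which the LGT was detected
    match_type: str   # assumed labels: "normal", "full_length_match", "end_match"


def filter_lgts(calls: List[LgtCall]) -> List[LgtCall]:
    """Drop calls flagged as likely contamination artefacts."""
    flagged = {"full_length_match", "end_match"}
    return [c for c in calls if c.match_type not in flagged]


def dereplicate(calls: List[LgtCall]) -> List[LgtCall]:
    """Keep one call per gene when it was detected at several taxonomic levels."""
    seen: Dict[str, LgtCall] = {}
    for call in calls:
        seen.setdefault(call.gene_id, call)  # first occurrence wins
    return list(seen.values())


def lgts_per_mbp(calls: List[LgtCall], mag_sizes_bp: Dict[str, int]) -> Dict[str, float]:
    """Number of retained LGTs per Mbp of sequence for every MAG."""
    counts = {mag: 0 for mag in mag_sizes_bp}
    for call in calls:
        counts[call.recipient_mag] += 1
    return {mag: counts[mag] / (size / 1e6) for mag, size in mag_sizes_bp.items()}


example_calls = [
    LgtCall("g1", "MAG_A", "phylum", "normal"),
    LgtCall("g1", "MAG_A", "class", "normal"),       # same gene, second level
    LgtCall("g2", "MAG_A", "phylum", "end_match"),   # filtered out
    LgtCall("g3", "MAG_A", "genus", "normal"),
    LgtCall("g4", "MAG_B", "order", "full_length_match"),  # filtered out
]
example_sizes = {"MAG_A": 4_000_000, "MAG_B": 2_000_000}

kept = dereplicate(filter_lgts(example_calls))
frequencies = lgts_per_mbp(kept, example_sizes)
```

In this toy input, two of the five calls are removed by the filter and one is a duplicate across levels, leaving two LGTs in `MAG_A` (0.5 per Mbp) and none in `MAG_B`. Normalising by genome size keeps large and small MAGs comparable.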

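The two tree metrics quoted in the methods, phylogenetic distance (total branch length spanned by a set of taxa) and phylogenetic gain (additional branch length contributed by a set of taxa), can be made concrete with a small worked example. The Python sketch below is not GenomeTreeTk's implementation. The toy tree, the function names and the choice to express gain as a percentage of the combined tree's total branch length are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Toy rooted tree (hypothetical): node -> (parent, length of branch to parent).
Tree = Dict[str, Tuple[Optional[str], float]]

TOY_TREE: Tree = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "n2": ("root", 2.0),
    "ref_A": ("n1", 1.0),
    "ref_B": ("n1", 1.5),
    "ref_D": ("n2", 1.0),
    "new_C": ("n2", 0.5),   # genome added by a new study
    "new_E": ("root", 3.0),  # genome added by a new study
}


def phylogenetic_distance(tree: Tree, taxa: Iterable[str]) -> float:
    """Total length of all branches on the root-to-tip paths of `taxa`."""
    spanned: Set[str] = set()
    for taxon in taxa:
        node: Optional[str] = taxon
        # Walk towards the root, stopping once a branch is already counted.
        while node is not None and node not in spanned:
            spanned.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in spanned)


def phylogenetic_gain(tree: Tree, reference: Iterable[str], added: Iterable[str]) -> float:
    """Branch length contributed by `added` beyond what `reference` spans."""
    reference = set(reference)
    combined = reference | set(added)
    return phylogenetic_distance(tree, combined) - phylogenetic_distance(tree, reference)


def percent_gain(tree: Tree, reference: Iterable[str], added: Iterable[str]) -> float:
    """Gain as a percentage of the combined tree's phylogenetic distance (assumed normalisation)."""
    reference = set(reference)
    added = set(added)
    total = phylogenetic_distance(tree, reference | added)
    return 100.0 * phylogenetic_gain(tree, reference, added) / total


reference_taxa = ["ref_A", "ref_B", "ref_D"]
new_taxa = ["new_C", "new_E"]
pd_reference = phylogenetic_distance(TOY_TREE, reference_taxa)   # 6.5
gain = phylogenetic_gain(TOY_TREE, reference_taxa, new_taxa)      # 3.5
gain_pct = percent_gain(TOY_TREE, reference_taxa, new_taxa)       # 35.0
```

Here the reference genomes span 6.5 units of branch length, and the two added genomes contribute 3.5 more, which is 35% of the combined tree under the assumed normalisation. Branches shared with existing genomes (such as `n2`) add nothing to the gain, which is why a new genome close to a known one contributes little.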
Results and discussion

Six sponge species, R. odorabile, C. matthewsi, C. foliascens, S. flabelliformis, I. ramosa and C. orientalis (a bioeroding sponge), were selected for metagenomic sequencing (7 ± 0.5 Gbp) as these species represent dominant habitat-forming taxa on tropical and temperate Australian reefs and exhibit high intraspecies similarity in their microbiomes. In addition, previously published microbial MAGs from I. ramosa and Aplysina aerophoba were analysed [8, 12], including 62 additional unpublished MAGs from A. aerophoba. The recovered MAGs, averaging 86 ± 12% completeness and 2 ± 2% contamination, made up 72 ± 21% relative abundance of their respective communities (by read mapping) on average and spanned the vast majority of microbial lineages typically seen in marine sponges [45] (Fig. S1 and Table S1), including the bacterial phyla Proteobacteria (331 MAGs), Chloroflexota (242), Actinobacteriota (155), Acidobacteriota (97), Gemmatimonadota (60), Latescibacterota (44; including lineages Anck6, PAUC34 and SAUL), Cyanobacteria (43), Bacteroidota (38), Poribacteria (35), Dadabacteria (22; including SBR1093), Nitrospirota (22), Planctomycetota (15), UBP10 (14), Bdellovibrionota (13), Patescibacteria (9; includes Candidate Phylum Radiation), Spirochaetota (8), Nitrospinota (7), Myxococcota (4), Entotheonella (2) and the archaeal class Nitrososphaeria (21; phylum Crenarchaeota), hereafter referred to by their historical name “Thaumarchaeota” for name recognition. Mapping of the metagenomic reads to the recovered MAGs showed that the communities had high intraspecies similarity across replicates, consistent with previous 16S rRNA gene-based analyses (Fig. S1). In general, taxa present in A. aerophoba, C. foliascens, C. orientalis and S. flabelliformis appeared unique to those sponge species, with only one dominant lineage present in C. orientalis (order Parvibaculales). In contrast, several Actinobacteriota, Acidobacteriota and Cyanobacteria populations were shared across C. matthewsi, R. odorabile and I. ramosa. Further, members of the Thaumarchaeota were detected in all sponge species and were particularly abundant in S. flabelliformis at 12 ± 4% relative abundance (Fig. S1). Addition of these sponge MAGs to genome trees comprising all publicly available sponge symbionts (N = 1188 MAGs) resulted in a phylogenetic gain of 44% and 75% for Bacteria and Archaea, respectively, reflecting substantial novel genomic diversity (Fig. 1).

Comparative genomic analysis of the sponge-derived MAGs provided unique insights into the distribution of metabolic pathways across sponge symbiont taxa. For example, microbial oxidation of ammonia benefits the sponge host by preventing ammonia from accumulating to toxic levels [46], a process thought to be mediated by both symbiotic Bacteria and Archaea (i.e. Thaumarchaeota) [33]. Prior identification of ammonia oxidisers has been based on functional inference from phylogeny (16S rRNA gene amplicon surveys) [47] or homology to specific Pfams (metagenomes) [33]. However, the CuMMO gene family is diverse, encompassing functionally distinct relatives that include amoA, particulate methane monooxygenases and hydrocarbon monooxygenases that cannot be distinguished by homology alone [35]. We used GraftM [32] to recover CuMMO genes from the sponge MAGs and their metagenomic assemblies, as well as previously sequenced metagenomic assemblies from six additional sponge microbiomes where bacterial amoA gene sequences had been identified [33]. Phylogenetic analysis of the recovered CuMMO genes showed that all archaeal homologues came from Thaumarchaeota and fell within the archaeal amoA clade. In contrast, bacterial CuMMO sequences were identified exclusively in MAGs from the phylum UBP10 (formerly unclassified Deltaproteobacteria) and from an unknown taxonomic group in the previous metagenomic assemblies [33]. All recovered bacterial and taxonomically unidentified CuMMO sequences placed within the Deltaproteobacteria/Actinobacteria hmo clade, indicating these genes are specific for hydrocarbons rather than ammonia (Fig. S2). The finding that Thaumarchaeota are the only microbes within any of the surveyed sponge species capable of oxidising ammonia, and their ubiquity across sponges, suggests they are a keystone species for this process.

To further investigate the distribution of functions within the sponge microbiome, a set of highly complete (>85%) sponge symbiont MAGs was grouped by principal components analysis based on their KEGG and Pfam annotations, as well as orthologous clusters that reflected all gene content. Similar analysis conducted on 37 MAGs from the sponge Aplysina aerophoba suggested the presence of functional guilds, with MAGs from disparate microbial phyla carrying out similar metabolic processes [12] (e.g. carnitine catabolism). Here, we find that MAGs clustered predominantly by microbial taxonomy (phylum) rather than function in all three analyses (Fig. S3). While functional guilds could not be identified based on analysis of total genome content, this does not preclude the existence of such guilds based on more specific metabolic pathways.

To identify pathways enriched within the sponge microbiome, sponge-associated MAGs with >85% completeness (N = 798) were compared with a set of coral reef and coastal seawater MAGs (N = 86), 31 derived from published datasets [31] and 55 from this study (Table S1). Seawater MAGs with >85% genome completeness (93 ± 4% completeness and 2 ± 2% contamination; Table S1) spanned the bacterial phyla Proteobacteria (48 MAGs), Bacteroidota (13), Planctomycetota (5), Myxococcota (5), Gemmatimonadota (3), Marinisomatota (3), Actinobacteriota (3), Verrucomicrobiota (2), Cyanobacteriota (2),

[Figure graphic omitted: what appears to be a circular tree of sponge and seawater MAGs with per-genome bin labels. Recoverable legend text: clade colours by phylum and class; gene copy number for GH51, GH127, GH29, GH95, GH88, GH77, GH33, CE7 and CE15; enzyme groups labelled esterases, amylomaltase, sialidases, fucosidases, glucuronyl hydrolase and arabinosidases.]
easim SB9156 S5 bin P 5 12 C in 6 P 6 2 066 seawat RHO1 bin 31 A 13 1 bin TY4 bin 3 b 1 RHO1 OS4 A 40 3 b P n 1 i b er s S 1 i R M n 4 A 4 4 i STY2 bi n 1 n 9 b i I HO M at 1 R S b bi TY bin 2 1 69 S B i w I C n n S RC 1 b 3 b 0 3 AR3 bin 0 4 B 19 ea R RHO1 i 661 b n 4 06 s C 3 bi bin A i 9 n 2 P n 1 A R P n 7 CA A 2 P A 1 5 b H M 1 CAR2 R 6 in A 6 n 14 O b b RH HO2 b 2 i S in RHO 1 n i 3 n RHO1 bin 52 9 B RHO3 6386 bi b 29 1 O 3 0 3 i 9 4 66 RH 3 b n bin R 1 18 i HO3 bi bin 17 b n 42 R HO3 1 4 n SB0662 bin 5 R i O3 b i COS RH H n 2 bi COS4 bin 16 n 12 b i M n 7 b O1 COS4 bin I i 1 n A RC3 n I b 1 2 O P CAR RC b 38 COS1 3 I i 9 n 3 R 2 b i AR2 I n 6 R in C i 3 0 A b n C P bi 40 b C 7 IRC R i P A n CAR3 bi AR4 bin 15 n 23 P n 1 2 6 RH HO3 A P A M 0 C CAR2 bin 8 se 3 8 bi 1 A bin 1 n 41 s b M 1 2 6 S 8 se e a O i M 63 bi n COS4 bin 4 se n a w S B i 3 se b in a w 1 S B b 0664 1 se a S3 a i b w b n 2 B in 4 at 0661 n S a t AM SB0678 bin 7 RHO2 1 i w i Y2 bin 1 1 CO e at n 0 1 A T a w 17 P e T a r CO b C 662 w RHO3 I P Y 3 at e r S RC t s S C 12 5 OS A S e 3 bin 1 at r se STY S e 1 e STY4 S T r b TY4 b 4 A s a IR b b 5 T e r bin i AM SB0675 15 I T Y s a S n I 3 e s n 7 R P P i bin i i 3 R Y r 3 STY4 I n n se P e s AM SB0677 bin I RC Y 1 6 a i 1 5 3 R A A 1 b 3 R m 30 STY2 bin 2 C 3 bi 91 se n C a im P R 3 si STY1 bin TY3 bin 4 R 4 8 i C bin M 3 Cyanobacteriia UBA1144 n H a

C C s n C 6 i b 8 8 bin b i 2 2 HO1 2 S C C H P m 9 O P a s S 2 b i bin 23 0 C i 8 12 bin 8 b O S 1 O P 6 O 3 b 1 A m C n IRC C in 1 C A C R n S i C P s HO2 bin 1 O A i n O O2 1 1 2 2 B9 CAR2 bin 26 2 S i 0 n m 9 O A n 0 3 7 3 8 3 IRC 1 b P S b S i n i 5 O S A M B 4 n O O 24 B O H M n O R i b 9 S 3 m O i n 1 S 1 2 n i 2 S 7 M 6 n 1 66 bin b i A S 2 i 6 i 2 B i 3 b M 0 9 b n S 1 C S A S S

b O S 3 b S H 3 HO 6 b B 4 b n b n b b S n b 3 n i S n n 662 153 i 9 5 i S363 5 i n S RHO3 P b i 3 6 b n i i 3 4 B R 6 b S 3 3 3 3 i 0 i 0 3 i i R S 6 R I 1 n S B 9 er 22 IRC1 bin I b n B 15 2 6 2 b n b 4 i

6 i b 3 2 b A I b 6 3 i B 6 5

O 6 b n B 9 n B 7 I 2 4 b n B06 1

L 6 B b 06 O 6 3 8 0 4 14 L 8 2 4 S 3 1 R A 6 7 9 152 5 6 L 0 6 0 0 5 C i 6 A 5

3 5 0 i 3 n

8 7 8 0 6 C 0 S 6 1 4 b S2 0 C n 8 0 1 0 1 0 C 6 4 A 0 8 4 C P

06 9 9 6

B 6 i 67 7 6 i S O 5 8 4 5 4 5 b n 4 3 6 M B AP b

6 C 2 b B A 5 S b

b S 3 6 4 6

H b 6 i 7 eawat 6 b i b 4 A S b S1 AM S n 7 5 i i n 3 b 3 3 n n b 3 i s 3 i S i i S R b i

P n n b 1 n M n i n 1 b i S 2 S n b M S S n i 5 2 i 1

M 2 n b n A 2

1 3 i n 1 O 9 A O O O 2 n 2

A 8 P b i 9 3 RC n 1 P 6 C C 15 3 I P C C IRC P i 3 2 n C 2 C C

R 4

R I

R I

I Thermoleophilia Gammaproteobacteria 16 Actinobacteria Alphaproteobacteria 18 Acidimicrobiia Woesearchaeia 20 UBA2235 Nitrososphaeria Dehalococcoidia

Fig. 2 Phylogenetic tree showing the distribution of glycosyl strip is coloured by class. Both are listed clockwise in the order in hydrolases and esterases across MAGs with >85% completeness which they appear. Seawater MAGs are denoted by grey labels with (N = 884). Values represent the copy number of each gene per MAG. red text. Internal branches of the tree are coloured by phylum, while the outer
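As an illustration of how the values summarised in Fig. 2 could be tabulated, the following minimal sketch filters MAGs by the >85% completeness cutoff and averages the copy number of each gene family per MAG for sponge and seawater sources. The records, MAG labels and the function name `mean_copy_number` are hypothetical placeholders and are not taken from the study; the sketch also does not reproduce the statistical enrichment analysis, which is not described in this section.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (not data from the study):
# (MAG label, source, completeness in %, {gene family: copy number}).
MAGS = [
    ("sponge_bin_1", "sponge", 95.0, {"GH77": 2, "GH33": 1}),
    ("sponge_bin_2", "sponge", 88.0, {"GH77": 4}),
    ("sponge_bin_3", "sponge", 70.0, {"GH77": 9}),  # below the cutoff, excluded
    ("seawater_bin_1", "seawater", 92.0, {"GH77": 1}),
    ("seawater_bin_2", "seawater", 90.0, {}),
]


def mean_copy_number(mags, min_completeness=85.0):
    """Return the mean copy number per MAG for each (source, family) pair.

    Only MAGs whose completeness exceeds `min_completeness` are used, and a
    family that is missing from a MAG counts as zero copies.
    """
    kept = [m for m in mags if m[2] > min_completeness]
    families = sorted({fam for m in kept for fam in m[3]})
    by_source = defaultdict(list)
    for _, source, _, counts in kept:
        by_source[source].append(counts)
    return {
        (source, fam): mean(c.get(fam, 0) for c in count_list)
        for source, count_list in by_source.items()
        for fam in families
    }


summary = mean_copy_number(MAGS)
for (source, fam), value in sorted(summary.items()):
    print(f"{source}\t{fam}\t{value}")
```

Counting absent families as zero keeps the averages comparable across MAGs; a real analysis would additionally need a statistical test and correction for multiple comparisons before any family is called enriched.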

Bdellovibrionota (1) and the archaeal phylum Nanoarchaeota (1). Comparative analysis revealed that sponge symbionts were enriched in metabolic pathways for carbohydrate metabolism, defence against infection by MGEs, amino acid synthesis, eukaryote-like gene repeat proteins (ELRs) and cell–cell attachment (Tables S2–S4). Genes belonging to GH and carbohydrate esterase (CE) families (Table S2) acting on starch (GH77), arabinose (CAZy families GH127 and GH51), fucose (GH95 and GH29) and xylan polymers (CE7 and CE15) were enriched in sponge-associated lineages, likely reflecting the host's critical role in catabolising dissolved organic matter (DOM) present in reef seawater (Fig. 2). Microbial GHs from the GH77 family target starch, the main sugar storage compound in marine algae [48], whereas GHs from families 51 and 127 are known to act on plant arabinosaccharides, such as the hydroxyproline-linked arabinosaccharides found in algal extensin glycoproteins [49, 50]. GH127 enzymes are also required for microbial degradation of carrageenan, a complex heteropolysaccharide produced by red algae [51]. Members of the fucosidase GH95 and GH29 enzyme families are known to degrade fucoidan, a complex fucosaccharide prominent in brown algae [50, 52]. Notably, arabino- and fucopolysaccharides also make up a significant proportion of coral mucus, a major component of DOM in coral reefs that sponges have been shown to utilise [53, 54]. Supporting this observation, isotopic investigation of the fate of coral mucus and algal polysaccharides in sponges showed that the microbiome participates in metabolism of these compounds, particularly in sponges with high microbial abundance and diversity [4, 5]. Enzymes from the CE families 15 and 7 have been primarily characterised in terrestrial plants, where they act as glucuronyl esterases and acetyl-xylan esterases, degrading lignocellulose and removing acetyl groups from hemicellulose (e.g. xylans) [55]. Characterisation of CE15 and CE7 from marine microbes is rare, though activity on xylans, which are a structural component of marine algae, has previously been demonstrated [55–57].

GHs acting on sialic acids (GH33) and glycosaminoglycans (GH88) were also enriched in the sponge-associated MAGs and may act on compounds found within sponge tissue [13] (Fig. 2). In contrast, no genes for the degradation of collagen (collagenases), one of the main structural components of the sponge skeleton, were identified. Sialic acid-linked residues are found in the sponge mesohyl [58], and although the impact of cleavage on the host is unknown, analogy can be made to other symbioses. For example, sialidases are common in the commensal bacteria present in the human gut, where they are used to cleave and metabolise the sialic acid-containing mucins lining the gut wall [59]. Increased sialidase activity is associated with gut dysbiosis and inflammation [60], and careful control of sialidase-containing commensals is therefore necessary to maintain gut homoeostasis [59]. As glycosaminoglycans are also part of sponge tissue [13, 61], the same may apply to microorganisms encoding GH88 family enzymes. However, these genes are also implicated in the degradation of external sugar compounds, such as ulvans, a major sugar storage compound found in green algae that can make up to 30% of their dry weight [62]. Thus, the ecological role of GH88 family enzymes within the sponge microbiome requires further investigation.

Enrichment of GHs and CEs was largely restricted to the Poribacteria, Latescibacteria (class UBA2968), Spirochaetota, Chloroflexota (classes UBA2235 and Anaerolineae, but not Dehalococcoidia) and Acidobacteriota (class Acidobacteriae). These findings corroborate previous targeted genomic characterisations of the Chloroflexota and Poribacteria [13, 14] but show that they are part of a larger set of polysaccharide-degrading lineages. Identification of disparately related microbial taxa across several sponge lineages (Figs. 1 and 2) that encode similar pathways for polysaccharide degradation, and therefore occupy a similar ecological niche, supports the existence of functional guilds within the sponge microbiome when viewed at the level of individual pathways. Given the fundamental role of marine sponges in recycling coral reef DOM, studies targeting these specific guilds are needed to quantify their contribution to reef DOM transformation.

Because sponges filter and retain biomass from an extensive range of reef taxa (eukaryotic algae, bacteria, archaea, etc.), they are exposed to a greatly expanded variety of MGEs from these organisms, including viruses, transposable elements and plasmids [33, 63]. For this reason, sponge-associated microorganisms likely require a diverse toolbox of molecular mechanisms for resisting infection. Both RM and CRISPR systems are capable of recognising and cleaving MGEs as part of the bacterial immune repertoire. RM systems are part of the innate immune system of bacteria and archaea and are encoded by a single (Type II) or multiple proteins (Type I, III and IV) that recognise and cleave foreign DNA based on a defined target sequence. In contrast, CRISPR systems are part of the adaptive immune system of some bacteria and archaea and encode a target sequence derived from the genome of a previous infective agent that is used by a CRISPR-associated protein (CAS) to identify and cleave foreign DNA. RM (Fig. S4) and CAS (Fig. S5) genes were both enriched (Table S3) in the sponge-associated MAGs and relatively evenly distributed across taxa, with the exception of the Planctomycetota and Verrucomicrobiota, where they were largely absent. As these MAGs average 93 ± 5% completeness, this result is unlikely to be due to genome incompleteness. This finding contrasts with comparative investigations of Planctomycetota genomes from other environments [64], and additional research is required to ascertain the mechanisms used by sponge-associated Planctomycetota and Verrucomicrobiota to avoid infection. Although Type III RM genes were enriched in sponge MAGs, they were also present in all seawater MAGs. In contrast, Types I and II RM genes were present almost exclusively in the sponge-associated MAGs. In conjunction with an enrichment in CRISPR systems, this expanded repertoire of defence systems likely reflects the increased burden from MGEs associated with the host's role in filtering and concentrating diverse sources of reef biomass. Supporting this hypothesis, metagenomic surveys of sponge-associated viruses revealed a more diverse viral population than could be recovered from the surrounding seawater [63]. Further, we found that genes encoding toxin-antitoxin systems, which are present on MGEs such as plasmids, were also enriched in sponge-associated MAGs. These observations suggest that RM and CRISPR systems are important features of microbe-sponge symbiosis, allowing the symbionts to colonise and persist within their host by avoiding viral infection or being overtaken by MGEs.

Pathways for the synthesis of amino acids were also enriched in the sponge microbiome. The inability of animals to produce several essential amino acids has been proposed as a primary reason that they harbour microbial symbionts [65–68], and it has long been thought that sponges acquire at least some of their essential amino acids from their microbiome [69, 70]. Further, gene-centric characterisation of the Xestospongia muta and R. odorabile microbiomes revealed pathways to synthesise and transport essential amino acids [33, 70]. However, these same amino acid pathways are also used catabolically by the microorganisms, and transporters could simply be importing amino acids into the microbial cell. Further, as sponges are almost constantly filter feeding, essential amino acids could be acquired through consumption of microorganisms present in seawater. Comparison of sponge MAGs with those from seawater revealed enrichment of specific pathways for the synthesis of lysine, arginine, histidine, threonine, valine and isoleucine (Table S4). However, visualisation of the distribution of these genes revealed that almost all MAGs in both sponges and seawater produce all amino acids, though specific lineages may use different pathways to achieve this (Fig. S6). The enrichment observed in the sponge MAGs was therefore ascribed to differences in pathway completeness between sponge-associated and seawater microbes, rather than an enhanced ability of sponge symbionts to produce any specific amino acid. In contrast, compounds such as taurine, carnitine and creatine have also been proposed as important host-derived carbon sources for

Phylum

Poribacteria Chloroflexota Gemmatimonadota Acidobacteriota

9

n % ARP/LRR/TPR/WD40 i

b

9

5

S s

e n 0 i

a I 6 R b

wate I 1

R I C I 5 I 8 R 9 R R

C 2 2 2

C B C P C 4 3 4

I n n P AM S 1 r R 6 i i

P P 5 P A 3 s C b b AM n A A m i easi M i a 0 8 M 2 8 M S P b

s C C 1 C 1 C C 1 1 n bin C C S B 4 A 5 i 4 C 6 5 a 4 1 2 S 6 15 S A t

A 1 A A I S A n A n A 8 s B 6 2 M 0 4 t R 6 4 e i 2 6 2 1

2 m A 2 I B I 4 s B R R ea B R R R n 1 R 6 R R R e 0 s b 1 6 n 2 i 1 C n 4 n 9 R e n n 1 bi i n 0 i 1 s 0 i 5 6 3 i i S 0 1 4 6 3 i 3 b 2 6 S5 bin 1

2 I C C 4 r b S r a r 7 b 6 n 4 n 8 in 2 w I R 6 b 2 n ea 6 4 b 17 s b n b 6 2 i

6 i b n R P 2 i B R e r n i b i e wate b 1 4 b B b e 8 2 R b 6 b 2 I b 8 i 6 1 e t 7 b n a C 7 b t s 3 t b A I R 2 b e r H 0 1 b 6 b i 3 1 b i i i C i 2 3 in i 4 i 8 b wat R 1 i 9 n a s a H n t r 8 n n te b b n a e n n a in 7 b e M 6 2 i i 7 C b O 6 O O A 4 R n O 86 bin 1 R 3 e 3 O w 1 n a t e A A O 4 a C i i R O b 5 w in B915 b 1

7 3 n 1 b n 1 I w 8 i 6 w 3 t 6 r 6 0 1 b P 6 3 awa H SB0 5 1 b H n H a n 7 P w HO3 b 1 H P H a i 2 0 i a w b r 4 7 b 2 0 6 a H i 6 L a 3 S 2 n a 0 4 4 s e n S 3 b 6 n i 4 2 A 12 b te R R 4 O A i n A e a w b R bi a b 4 3 R RHO e 6 se be e b RH 6 e r O1 n 36 S5 b C e C b w bin bi S 3632 7 O 6 2 s a C i 9 e t s 3 i s 40 m awa r 2 1 b t b i i n 7 t e n a 6388 b i A n 3 4 3 n s e O a s t n C COS3638 6 s t AR2 b S 7 e 66 in 8 b e r C e 3 ti i C b b 6 s w e n R 4 S 221 e C 2 na r 4 8 t 28 s se e O Bdellovibrionota s n 4 i i i A aw tin A 6 40 9 a a 4 bi i n n COS s a b 8 2 O 7 t te e ttin i n er eas repeat proteins 2 a A C 6 t wa s R R n 20 COS3 3 ter 6 t 9 e a e S36386 s a 3 6 bin 1 1 5 C P 3 Latescibacterota r e 1 a n 37 ate a 3 a wa t 3 1 COS bin n bin 29 wa 6 1 6 S363 tina 3 i be a a A i n 1 3 COS36 wa t 632 b COS36405 bin 1 36 3 5 n w er e bi O b w CO 6 bi s 363 2 b 15 OS bin te r t r i 8 e b n C e t a e n 5 at te s 3 7 i 3 ea A ter be tt t 2 n C 3 86 7 r 7 i HO a i t r s 2 n 1 P O3 b w a 3 r e n e 2 b 34 6 w i n s 4 6 09 7 b 1 R na 363 bi 63 i 7 1 0 a a r 1 in 7 A H w s e 1 n 9 a 261 3 ea b s 3 1 a 9 eas 4 6 R n a 2 n 1 i te 2 b s 5 bin 20 2 i S3 si m 6 2 n 21 n 6 b n OS3 i I b 6 i R r 6 i se 1 3 7 O m n 4 C b i SB 1 1 bi n 70 8 bin C 2 2 i Y4 bi i I m 5 C bin R 2 7 bin 7 i n 6 IR 2 7 b 1 b b S n STY T 4 P I 6 4 6 73 C S 1 bin 6 7 TY 915 R S 1 n 0 I C A B 1 7 i 6 R b S P B A C i OS n 6 0 M SB066 9 2 n i 3 bin 5 SB06 C PA A 91 P 1 4 b n C AR B bi 1 2 b 7 A COS bi M 5 S P 5 S 2 C 4 M 2 n 1 5 AR A bi i 6 AM 7 S 6 n AR 6 C AM SB 20 M SB0 S1 P S n 81 n B b 2 C P AM i n 0661 5 b 9 CAR C i b SB i b C P b n 6 2 n R I 5 6 i C n IR 3 0 0 6 b i 2 C n R i I C 0 6 4 4 n 4 21 O 6 b 1 I R C4 bi IR R 5 b i 1 I S1 bin 4 n 56 S n R n R in 5 2 I C I S364 R C P 4 bi 1 H 2 1 CO O bi 2 C R b n 1 1 bi O A bin 4 C H b i O 3 7 n 21 14 P AM R 2 bi n 1 20 P R i n O n i A H A 5 in H 1 bin 2 9 M 3 2 8 RH b 4 C O n SB0 O 3 4 2 b RHO2 S O 2 b 1 1 9 1 i 1 n 7 2 bin 4 n B066 S 5 RHO n b 3 bin 31 i 
I 6 4 i O 2 bi R i n b 7 n RHO3 b 666 bin C b H 3 5 3 0 I I i 2 R 65 R P R 2 n 47 b 8 RHO C I IR R C AM C i AM SB0664 bi b n M SB0661 I 4 i IR C4 bin P B06 R C P I C4 n A R 1 R B0668 bin Marinisomatota C AM b I C P P S 37 6 AM SB Myxococcota C C i S A b n R P B 1 I C P O i AM M SB 0 n 4 AM S bi R C 67 S 5 I P 4 AM SB06 4 n R 98 4 I 3 S 0 5 C n b 2 i n 66 i I B06 b i 7 IR 24 3 R n b 5 i RC P 1 6 n I A C 9 8 b P bin I 6 1 P R i 4 A 1 b n bin 7 7 A C in 2 8 1 M I b RHO1 b n 7 0.01 I R i R R 3 in HO2 1 C b C S H b 27 0 R HO3 n bin 1 i 2 I B O in R RC P n 5 0 b C4 i 2 A 2 1 6 6 i b B0661 bin 4 M SB b n 8 IR PA A 7 06 36 0 i 1 bin S C P n in 6 IRC3 b 25 M SB0 OS4 A b 3 1 AM in 0 IRC2 661 bin 37 in 0 b P M SB 7 6 i RC 62 b n 2 I A 1 b 7 2 6 5 b 41 P n i R 6 n IRC bi B0 664 6 b H i 4 S R 2 n 20 A O1 6 IRC P AM SB0 P M H b A 4 R O b in A 5 C M SB0 6 n HO2 b 3 b P 8 in 2 R A 3 I in bi C 5 P 8 i RC O n 73 6 I b 8 C 1 I S 63 R O in 54 IRC R 4 3 C S S 10 H b RHO P 3 i 1 R O3 b n 7 A b 2 CO bin M H i R n 2 in 2 O 64 in 1 S H i 1 COS36404 bin 2 n 1 b B0 O b HO Nitrospinota 71 9 1 R 6 in 53 bin 10 7 b IR 2 in O1 bin 6386 in 8 Bacteroidota RHO3 b 87 b bin C b 5 H 2 I S3 1 R 1 in 21 1 R 63 1 C 3 b CO S 405 IR 3 in 6 4 C bi 1 R 2 CO S3 er 22 n H 2 n 2 t b O a bi O3 b 1 4 9 R i 9 C 6 H n aw n n i O 1 bi 38 I i 1 se 6 R AP 1 n A 3 b P I C R 7 A S R A in 2 O C P H in 5 A R O2 b 6 C i 9 b 0.109 P M S n H 3 8 AM 8 in 25 I O1 b bi S 1 R 5 COS36404 b b 8 B0 n O in C S IR B06 66 5 C bin b P in 2 8 C AM S 2 55 OS4 2 C S 7 I P I 6 b 3 R A R 4 in O 638 C M SB066 C4 b C in B b 8 S3 b P 067 in IR A O i 1 C C M 7 n 0 HO2 n 29 se P S b 4 i A 9 R a B 1 in O1 bin 25 bin 9 wate M 06 1 16 se S b RH 4 n 5 66 in 3 9 6 a B HO3 b 6 2 water r 06 R 0 se s b 3 n 4 B in easi I 6 in bi b 7 a R S 0678 bi 7 w C 1 34 IRC4 bin at b 66 s m 3 b i C2 AM bin e e IR n 2 R 8 Nitrospirota r a S I M SB B0 6 14 i s s C n A S 6 i B 4 b bin Spirochaetota e m 9 A 3 P 4 a P 1 IRC P M 7 si S 1 in C A 6 A R SB0 m B 0 S9 b b I bin 
915 3 M 067 5 S in 9 C P A B B 3 IR P S 91 5 9 i B067 5 S n RC AM S 4 4 1 I P 1 S3 bin 5 C R AM 9 H bi IR P in R O C b H 1 b n 9 4 IR 1 IR O 6 C R 3 in bin 1 IRC P HO2 b b 49 3 n 5 A in 47 R RC n 5 M S H I i O1 S3 bi b IR B in in 18 1 C 06 b 50 CO b 37 P i A in 066 AM SB066 6 n 7 P B 0.208 2 A S IR b 0 IR C in C 1 b 1 IRC4 b AM P 1 12 6 A in M SB067 1 RC P in I 14 I 38 b R bi n 8 C n bi 6 3 b 31 IRC3 bin 26 4 n i C B06 6 n R bi S b 3 I 1 A in 26 7 C M R P A 6 A IR P n H b C i I O in b UBP10 R 3 14 IR 5 C R HO2 bi in Verrucomicrobiota P n b 4 A 4 COS2 M R 8 S1 n 3 HO1 b O 2 S in bi IR B066 C 4 1 1 C 51 n bin in P 1 COS b 2 bi A 6 2 M SB IR b 6 R n 2 in A bi C 41 C B066 0 4 b 3 bin 7 66 R M S 675 in A A 0 5 C P B IR R b 13 C H in M S O3 b 1 IRC A 9 PAM R 6 P 1 H C O2 in 38 bin SB067 IR 2 b R in 4 CA 0 9 bin 74 37 IR b A C in P bin 5 R 2 18 A 3 IR H b C1 in C O in IR P 1 b 1 A IR 2 O3 b 38 M S in in 0.307 C RH 1 50 b IR B bi C1 C 066 n PA 1 IR bin 7 8 3 IR M IR b in 6 PF00400.31 C SB06 C in IRC2 b Dadabacteria 3 12 1 P AM SB b S 6 in CO bin 2 1 2 2 b 8 S 11 PF13646.5 0 in 4 O n Planctomycetota 66 2 0 C bi 4 b 77 IR IR in 4 n 14 6 C C COS3 bin bi6 bin 1 P 1 b 9 4 4 A I S SB0 M R in 1 O M 066 S C C A B n 30 1 B0 4 b 2 P S bi PF02985.21 i C M 62 6 n R A 6 61 2 I 1 5 P B0 0 bin S 1 A IRC M PA 29 A PF13428.5 R P bin H bi C 66 6 O n R 2 b 38 I bin 9 4 SB06 bin in C 8 PF13424.5 IR 1 IR C 3 PAM 261 R 3 bi r 4 H IRC e IR O1 b n 112 bin 2 PF13414.5 C 21 P R i eawat IR AM H n 30 s C O P S 3 bi water 22 PF13371.5 AM SB066B067 68 n 70 sea n 5 bi bi PA PF12688.6 n 22 A in 53 4 A b 7 Proteobacteria R bin 2 P 1 H A O1 8 bin 9 0.406 b 3 i Deinococcota IR n 6 RHO1 bin C 2 PF07721.13 1 b HO2 4 R i R n HO n bi 11 3 b PF07720. 
RHO RHO3 in 69 bin 1 1 S4 in 2 b O 10 b R C n HO2 b in bi 1 in 26 PF14580.5 9 b IRC in IRC4 M SB0670 61 5 1 A 06 n 4 7 P B bi R bin C S H 1 IR M 668 PF13855.5 O3 b 4 PA 0 C SB R in IR M H 51 PA PF13306.5 IR O1 C IRC n 38 PA bin i A 1 M PA 1 O2 b PF12799.6 SB b RH 3 0662 in 83 bin 3 b R 9 in CA in R 3 b H 5 R1 O2 b A 13 C in RH in b O2 b 7 R2 PF13857.5 CA 4 R in n H 20 O3 R4 bi 26 Cyanobacteriota PF13637.5 b CA n Nanoarchaeota in 6 bi 2 6 8 1 RHO 1 bin bin 75 PF13606.5 06 4 HO3 SB n 0.505 R bi PAM 665 PF12796.6 C 0 IR SB M PA C IR 34 bin C1 IR PF00023.29 25 C3 bin IR in 7 A b AP 10 in C2 b IR 35 in 1 b IRC 6 n 2 bi C3 IR 35 4 bin 4 bin 64 5 IRC SB06 in M 5 b PA 66 Actinobacteriota Crenarchaeota IRC PAM SB0 IRC 2 in 7 A b AP 7 bin O1 RH bin 22 0.604 3 HO R 21 bin S1 CO n 10 3 bi OS C 22 bin S4 CO 18 in 6 1 bin 61 b IRC SB06 1 bin PAM 65 IRC M SB06 PA IRC n 28 RHO3 bi 6 in in 1 3 b 2 b IRC 066 SB PAM IRC 74 bin RHO3 bin 3 S1 CO n 17 S3 bi CO bin 40 S4 0.703 CO n 4 2 bi OS 3 C in 1 86 b 363 COS 32 bin RHO1 51 bin COS4 n 43 2 bi RHO bin 23 RHO3 32 bin APA in 39 661 b SB0 PAM IRC 5 A bin in 6 AP 664 b SB0 PAM S IRC TY2 9 1 bin RHO bi 42 n 2 bin ST 4 RHO 34 0.802 WD40 Y 3 bin 3 bin RHO 8 bin 4 STY1 bi 6 APA in 30 C4 b IR in 34 n 5 662 b S PAM SB0 T IRC in 33 HEAT Y4 664 b B0 b AM S RHO1 bin 37in 7 IRC P n 33 APA bi bin 20 B0662 AM S RHO2 b IRC P in 15 in 25 IRC3 b TPR in 22 RH IRC4 b IRC O3 bin 43 bin 8 PAM RHO1 SB0 in 6 66 RHO2 b IRC 7 b n 6 PA in 13 APA bi M 4 bin 11 LRR S SB066 B AM 06 IRC P 0.901 66 7 68 bin 2 IRC bin 15 M SB06 PAM S IRC PA B066 15 2 bin 33 APA bin IRC in 79 P RHO3 b AM SB0677 bin 16 5 ARP 61 bin 4 SB06 IRC PAM bin 28 SB0662 IRC IRC PAM PAM SB0663 3 bin 2 bin 5 RHO Class APA bin 96 A PA b bin 4 in 56 RHO1 36 C RHO2 bin OS3640 6 bin 1 Archaea APA bin 43 3 COS3 RHO1 bin 68 6387 bi n 15 O3 bin 54 RH 1 5 SB0661 bin 1 COS3 IRC PAM 6386 bin 19 66 bin 1 IRC PAM SB06

APA bin 80 COS1 bi in 27 n 11 IRC PAM SB0662 b

IRC1 bin 20 COS4 bin 17 IRC4 bin 11 seawater seasim SB9152 S1 bin 12 APA bin 13 IRC PAM SB0661 bin 43

IRC PAM SB0661 bin 44

IRC1 bin 25 WGA-4E Anaerolineae IRC4 bin 20 APA bin 86

COS4 bin 10 COS36405 bin 12

COS36386 bin 9 IRC PAM SB0666 bin 11 IRC COS36406 bin 7 PAM SB0661 bin 22 RHO3 b COS36388 bin 13 in 25 IRC RHO3 bin 1 1 bin 31 IRC4 bin 21 RHO1 bin 1 IRC2 bin 13 IRC PAM SB0664 bin 5 IRC3 14 bin 22 IRC PAM SB0667 bin Bacteria Gemmatimonadetes Thermoanaerobaculia COS3 bin 3 IRC3 bin 7 COS3 bin 15 COS4 bin 28 IRC2 bin 8 APA bin 87 IRC PAM SB0665 b CAR2 bin 5 in 11 IRC PAM SB 6 bin 13 0664 bin 15 im SB9157 S seawater seas n 8 IRC PAM B9156 S5 bi SB0661 bin 19 ter seasim S seawa IR bin 8 C1 bin 19 SB9152 S1 ater seasim seaw IRC3 bin bin 9 13 B9158 S7 r seasim S seawate AP 29 A bin 1 RHO1 bin COS3 b 57 in 18 RHO3 bin RH 59 O2 bin 60 RHO1 bin uknown bin61 RHO3 b 3 in 68 APA bin RHO1 n 11 bin 20 RHO3 bi R 2 HO2 bin 5 RHO2 bin 9 IRC bin 82 PAM SB066 RHO1 1 bin 16 RHO3 b in 55 in 67 COS4 b IRC PA bin 9 M SB0 CAR2 662 bin 21 IRC PAM S bin 3 B066 CAR1 6 bin 6 IR C PAM bin 7 SB067 0662 0 bin 1 PAM SB IR 9 IRC C2 bin n 12 3 OS2 bi C IR C4 bi bin 15 n 17 COS1 IRC bin 19 1 bin 2 COS4 1 UBA2968 Luteitaleia 5 IRC3 b bin 8 in 11 RHO1 A PA bin 52 bin 42 IRC4 IRC1 b bin 17 in 3 RC1 0 I IRC 59 PAM 3 bin SB0 RHO 661 b % HEAT I in 34 2 RC PA bin 6 M SB06 O2 6 RH 4 bin IRC P 22 48 AM SB 1 bin 06 RHO 66 bin 21 IRC in 22 4 bin APA b 33 IRC 8 PAM bin 5 SB066 HO3 2 R C bin 9 0 OS3 bin 6 6386 HO1 bin 1 R CO 0 S36 in 8 388 O3 b bin 11 RH RHO 11 1 bin bin 4 6405 0 Marinisomatia uknown RHO3 b COS3 repeat proteins 6 in 52 4 bin 3640 C COS OS n 55 3 bin 4 PA bi A R n 3 HO2 7 bi bin 9 a 3632 C ettin 6 OS1 er b in bin wat 26 b 1 sea C ttina 363 OS r be n 7 2 bi ate 28 bi n 17 seaw 363 tina APA bet 17 bin ater bin 89 aw 309 se 36 IRC na 18 PA betti bin M SB water RH 0664 sea 5 O2 b ettina 36308 n bin in 2 er b bi 18 7 wat 36308 RH sea na O3 b in 29 in ter betti 8 b 26 awa IRC se n 9 PA water 4261 bi M sea 618 SB0 r 42 IRC 661 ate 2 4 bi bin w n n 8 32 sea 28 bi 363 IRC na 1 b Bacteroidia Acidobacteriae tti n 2 in e 6 bi 15 ter b 4261 IRC wa er n 3 PAM S sea awat 9 bi se 630 B0 a 3 CO 66 tin 4 S2 5 b 
bet n bin in 25 ter 12 bi 1 wa 1 IR 8 0.01 22 sea 1 C P water bin AM sea SB0675 36326 CO 5 S3 b in bin 9 in 29 bettina 18 b ter 26 CO awa er 4 n 4 S1 se at 0 bi bi eaw 31 n 2 s 36 RH tina 27 O3 bet bin bin r PA C 53 ate A 4 OS eaw in 3640 s 4 b 5 5 S R bin 915 HO2 14 SB n 3 b im 8 bi in 67 as 32 C se O ter ina 36 S3 wa ett bin 3 63 ea b 08 CO 87 s ter 6 bin a a 363 in S364 2 aw in 9 b se ett 30 C 06 b 36 O bin 1 ater a 3 S3 ttin bin 64 Rhodothermia Bacteriovoracia seaw 10 C 04 b ter be 363 OS in wa 0 36 3 ea ttina 38 s be 8 bin 3 8 32 IR bin 5 ater 6 29 C aw a 3 in PA se ttin 2 b M SB be 11 CO er 22 S3 066 er n 28 bi 8 awat at bi CO n 1 bi se aw 12 2 n 2 se 21 6 S2 1 r 2 n 1 bi te bi n 1 a R4 CO 6 eaw CA 18 S4 0.059 s n bi bi C n 3 R3 OS 6 CA 0 1 in 1 bin 12 b C 2 OS CAR2 in 1 3638 CO 6 S bin 18 CAR1 b 36 387 R1 bin 7 C b CA OS in 22 in 6 1 b C bin R1 O 10 A 7 S2 C n 1 b bi in 7 R3 CO CA 4 S3 bin b 1 CO in Spirochaetia Bdellovibrionia AR 6 S 2 C n 1 4 i bin 8 R2 b RH CA 18 O1 n b in 18 RH 10 O CAR4 bi in 2 b R bin S2 H 2 18 O3 b 2 53 in b AP in SB91 0 A 8 m 1 bin asi C 63 e B9152 S1 A S 0665 bin 8 R2 im b 3 bin C in awater s C AR 3 se r seas PAM SB IR 18 4 b te C a IR bin IR in 14 w 8 C R1 n P 0.108 sea CA bi AM 4 IR SB06 C CAR 3 2 bin 7 bin 1 IR 5 1 9 C 6 b in 1 in 1 CAR b b 8 IR in 7 C3 b 10 CAR3 in b IR in R2 4 C 10 CA 2 P bin IR AM Chlamydiia Oligoflexia 3 C SB066 P RHO AM IR 5 7 C SB067 b bin 4 bi in RHO1 bin 39 I n 1 RC 1 8 2 OS4 13 5 b C in PA in 6 b IR M 6 C S 66 n 5 B066 0 bi PA B IR M 4 C SB066 4 bin IRC3 n P PAM S bi R A C H M 16 IR AR3 2 O2 b SB066 1 b C in in b R H in 41 17 O 2 in 5 1 bi CAR2 R b n 10 2 H in 1 O 5 CAR4 b bin 3 b A i R1 PA n 4 CA 71 C bi 9 O n bin S 42 PA 2 A C b in 40 O in b 1 S 1 0.157 A 1 9 P bin C A O bin 19 R4 S A 4 bi C in 1 C O n 32 Verrucomicrobiae Polyangia R3 b 11 S A 3 C in C 638 O 1 S3 6 b OS4 b bin CO 6 in C 2 38 4 2 S 7 HO C 3 b R bin O 63 in 1 1 S 8 3 8 9 n 70 C 638 b RHO O in A bi 0 S4 P 6 6 A IR b bi bin 1 C in n 31 P 4 
C A 3 bin 6 O M SB066 RHO3 1 2 S HO n 1 C 363 R O 0 S 87 S4 bi 3 1 O C 6 b b C O 3 in in 21 S 88 16 S2 bin 1 C 3 O 63 bi CO 1 S 8 n 17 615 bin 2 R 3 7 b H 63 3 O2 b 8 in 9 LI2 bin n R 8 C H b 4 O1 in 4 in 1 LI1 bi R seawater 42 C H bi 0 6 O3 b n in 1 C 14 CLI1 bin b O 3 4 S in S 1 C 1 5 4 OS b 0 UBA8108 UBA9160 bin in 20 0.206 915 7 n 1 4 I b SB 0 R in 42 3638 C S 17 bi R 3 O H b asim C O in e r 426 1 3 112 bin 1 R b 5 2 H ter s 2 O2 bi in C 2 O 6 seawate 1 bin 3 S n seawa 3 C 638 8 eawater 2 S bin 82 OS s A 58 P C 2 6 B915 A OS b b in 32 S 3 in 22 1 bin C 638 im O HO as 41 S3 7 e R C IRC1 bin 7 O b b bin S i in 20 53 C 4 n 1 seawater seasim SB9153 S2 bin 2 n O b water s S i IRC4 1 n C 1 sea in 58 O b 3 b S in 13 RHO1 bi C 2 O b S i C 3 n in 18 O 5 SB0662 b b S in M COS4 bin 144 CO 3 A 2 6406 13 P 1 S COS C 3 O 6 IRC 0665 bin 27 S 3 b 4 bin C 8 in 6 3 8 SB 5 O 6 b 4 bin 23 1 S 3 M 4 C 3 8 in A C in O 6 7 1 P b S 40 b IR 4 C 3 in 3 AM SB06 3 O 638 4 IRC P C S bi COS O 3 n Planctomycetes UBA8248 9 640 6 2 0.255 IRC S3 b COS1 bin 8 2 I i R n C b 5 5 I i b R 3 b n i RHO1 bin 4 C 1 n 3 9 4 OS4 bin P in C C O AM SB0664 IR S 0662 bin 59 4 C SB 2 b RHO3 bin R bi in H 3 AM O2 b n R 5 6 HO b RHO2 bin 33 in 22 R i IRC P H 1 n b 3 C O3 b in O 3 AM SB0677 bin 7 C S SB0678 bin 15 in 1 i 3 B0675 bin 3 2 OS 3 n C P S 6 2 R AM C 4 7 I P R1 b 3 O 3 0 AM CAR3 bin 15 C 640 5 P A S C O 3 b IRC S 63 6 in C HO1 bin 7 3 b 1 IRC 2 O 6 8 in 9 0 R S 7 n I 3 bin 45 R 3 8 b A 1 C 6 8 i P IR 4 n 6 8 bi in b A 7 1 C P 0 i 1 A 4 n I 4 R M b 9 77 b in C bi i S n eawater 42618 bin b IR n 10 s 2 P B B06 C A 1 0 S IR M 2 6 AM SB06 P 6 P C AM M I S 5 Deinococci Nitrospiria R P B b A C A PA SB06 06 in IRC IRC3 bin P M C P 7 1 AM SB0668 C1 bin 1 R A A IR P S 5 9 R 7 H b M B0 AM SB0664 bin 2 I i 7 b P R O n SB i n 1 0 n 5 IRC HO2 b 1 6 6 b 0.304 I b 9 B0675 bin 1 R 0 2 i IRC S i 6 n CAR3 bin 2 C n b C 6 39 P 67 7 in 22 7 S6 bi 5 bin 3 O i AM A n 5 b P S I S R M in 915 I C 4 5 56 R S 1 5 b IRC 2 S1 bin 1 

symbionts [69], but pathways for their catabolism were enriched in seawater rather than sponge-associated MAGs. While these findings do not invalidate the possibility that microbial communities play a role in amino acid provisioning to the host or that they utilise host-derived taurine, carnitine, or creatine, they suggest that these are not key processes mediating microbe-sponge symbiosis.

To form stable symbioses, bacteria must persist within the sponge tissue and avoid phagocytosis by host cells. Microbial proteins containing ELR motifs have been identified in a range of animal- and plant-associated microbes and are thought to modulate the host’s intracellular processes to facilitate stable symbiotic associations [71, 72]. For example, ELR-containing proteins from sponge-associated microbes have been shown to confer the ability to evade host phagocytosis when experimentally expressed in E. coli [10, 73]. ELR-containing proteins from the ankyrin (ARP), leucine-rich, tetratricopeptide and HEAT repeat families were enriched in the sponge-associated MAGs. In contrast, WD40 repeats were not found to be enriched but are included here as they have previously been reported as abundant in Poribacteria and symbionts of other marine animals [13, 31]. Most ELRs were present across all taxa but were much more prevalent in specific lineages (Fig. 3). For example, sponge-associated Poribacteria, Latescibacterota and Acidobacteriota encoded a high proportion of all ELR types, while other lineages, such as the Gemmatimonadota (an average of 0.25% of coding genes per sponge-associated MAG versus 0.09% in seawater MAGs), Verrucomicrobiota (2%), Deinococcota (0.85%), Acidobacteriota (0.20%; specifically class Luteitaleia at 0.55%) and Dadabacteria from C. orientalis (0.62%) encoded a comparatively high percentage of ARPs, and Nitrospirota encoded a high percentage of HEAT_2 family proteins (0.55% versus 0.05% in seawater MAGs) relative to other taxa. In contrast, ELR abundances were substantially lower, or absent, in the Actinobacteriota, the class Bacteroidia within the phylum Bacteroidota, and the Thaumarchaeota, suggesting these microorganisms utilise alternative mechanisms to maintain their stable associations with the host.

Fig. 3 Phylogenetic tree showing the distribution of eukaryote-like repeat proteins, namely ankyrin (ARP), leucine-rich (LRR), tetratricopeptide (TPR), HEAT and WD40, across MAGs with >85% completeness (N = 884). Values represent the percentage of coding genes per MAG devoted to each gene class. Internal branches of the tree are coloured by phylum, while the outer strip is coloured by class, and both are listed clockwise in the order in which they appear. MAGs from seawater are denoted by grey labels with red text.

The mechanisms by which ELRs interact with sponge cells remain largely unknown, although microbes in other host systems are known to deliver ELR-containing effector proteins into host cells via needle-like secretion systems (types III, IV and V) or extracellular contractile injection systems [74, 75], where they interact with the cellular machinery of the host to modify its behaviour. In sponges, it is also possible that ELRs could be secreted into the extracellular space by type I or II secretion systems. Interestingly, although most sponge MAGs encoded eukaryote-like proteins (Fig. 3), few lineages encoded the necessary genes to form secretion systems (Fig. S7). It is therefore unlikely that ELRs are introduced to the sponge host via traditional secretion pathways used in other animal-symbiont systems.

Maintaining stable association with the sponge may also require mechanisms for attachment to the host tissue. For example, cadherin domains are Ca²⁺-dependent cell–cell adhesion proteins that are abundant in eukaryotes and have been found to serve the same function in bacteria [76]. Similarly, fibronectin III domains mediate cell adhesion in eukaryotes, but also occur in bacteria where they play various roles in carbohydrate binding and biofilm formation [77, 78]. In addition, some bacterial pathogens utilise fibronectin-binding proteins to gain entry into host tissue by binding to host fibronectin [77, 78]. Genes containing cadherin domains were enriched in the sponge-associated MAGs and were identified in most bacterial lineages, but were notably absent in the Cyanobacteriota and Verrucomicrobiota (Fig. 4). Genes containing fibronectin III domains and those for fibronectin-binding proteins were also enriched in sponge-associated MAGs and were distributed across most lineages, though they were particularly abundant in the Actinobacteriota and Chloroflexota. However, although fibronectin III-containing genes were taxonomically widespread, those encoding fibronectin-binding proteins were restricted to the phyla Poribacteria, Gemmatimonadota, Latescibacterota, Cyanobacteriota, class Anaerolineae within the Chloroflexota (but not Dehalococcoidia), class Rhodothermia within the Bacteroidota, Spirochaetota, Nitrospirota and the archaeal phylum Thaumarchaeota. Interestingly, the taxonomic distribution of these genes shares significant overlap with lineages encoding the genes for sponge sialic acid and glycosaminoglycan degradation, suggesting that attachment to the host may be necessary for utilisation of these carbohydrates (Fig. 2). However, as the host, bacterial, and archaeal components of the sponge holobiont have fibronectin III domains, symbionts encoding fibronectin-binding proteins may use these to adhere to the host tissue or potentially to form biofilms (bacteria–bacteria attachment). In either case, the enrichment and wide distribution of cadherins, fibronectins and fibronectin-binding proteins in the sponge MAGs suggest that cell–cell adhesion is critical for successful establishment in the sponge niche.

Fig. 4 Phylogenetic tree showing the distribution of cadherins, fibronectins and fibronectin-binding proteins across MAGs with >85% completeness (N = 884). Values represent the copy number of each gene per MAG. Internal branches of the tree are coloured by phylum while the outer strip is coloured by class. Both are listed clockwise in the order in which they appear. Seawater MAGs are denoted by grey labels with red text. Gene classes shown in the legend: fibronectin-attachment protein (PF07174.10), fibronectin binding protein (PF05833.10), fibronectin III domain (PF00041.20) and cadherin domain (PF00028.16).

Distribution of genes encoding ELRs, polysaccharide-degrading enzymes (GHs and CEs), cadherins, fibronectins, RMs and CRISPRs across distantly related taxa suggests that they were either acquired from a common ancestor or that they represent more recent LGT events, potentially mediated by MGEs, which are enriched in sponge-associated microbial communities [69]. Here, we identify 4963 LGTs from five sponges for which sufficient sequence data were available (>100 Mbp total sequence length across all MAGs), as well as 136 LGTs from seawater MAGs, averaging 1.64 and 0.52 LGTs per Mbp of sequence, respectively (Fig. 5 and Table S5). Sequence similarity of LGTs from MAGs within a sponge species was higher than between sponge species, indicating relatively recent gene transfers (Fig. S8). A higher frequency (Fig. S9) and lower genetic divergence of LGTs among MAGs derived from the same sponge species likely results from the close physical distance between members of each microbiome, as has been observed in other host-symbiont systems [79]. The identification of lateral transfers between microbes from different sponge species may highlight the horizontal acquisition of these microbes or that a recent ancestor inhabited the same host. Notably, LGTs included a subset of genes that were enriched within the sponge-associated MAGs, such as GH33 (sialidases) and CE7 (acetyl-xylan esterases), attachment proteins (cadherins and fibronectin III), RM and CAS proteins, and members of all ELR families other than WD40 (Figs. 6 and S10). The observation that a significant number of sponge-enriched genes were laterally transferred between disparate microbial lineages suggests that the processes they mediate provide a strong selective advantage within the sponge niche, though further research is required to validate these findings.

Fig. 5 Visualisation of LGTs detected within the MAGs for the five sponges passing the cumulative MAG length criteria (>100 Mbp). The inner strip is coloured by phylum while the outer strip is coloured by host sponge. Bands connect donors and recipients, with their colour corresponding to the donors and the width correlating to the number of LGTs. Host sponges shown in the legend: Aplysina aerophoba, Carteriospongia foliascens, Coscinoderma matthewsi, Ircinia ramosa and Rhopaloeides odorabile.

Sponges are important constituents of coral reef ecosystems because of their critical role in DOM cycling and retention via the sponge-loop. Despite their importance, functional characterisation of sponge symbiont communities has been restricted to just a few lineages of interest, potentially biasing our view of sponge symbiosis. Here we present a comprehensive characterisation of sponge symbiont MAGs spanning the complete range of taxa found in marine sponges (Fig. 7), most of which were previously uncharacterised. We revealed enrichment in glycolytic enzymes (GHs and CEs) reflecting specific functional guilds capable of aiding the sponge in the degradation of reef DOM. Further, we identified several ELRs, CRISPRs and RMs that likely facilitate stable association with the sponge host, showing the specificity of ELR types with individual microbial lineages. We also clarified the role of Thaumarchaeota as a keystone taxon for ammonia oxidation across sponge species and showed that processes previously thought to be important, such as amino acid provisioning and taurine, creatine and carnitine metabolism, are unlikely to be central mechanisms mediating sponge-microbe symbiosis. Many of the enriched genes are laterally transferred between microbial lineages, suggesting that LGT plays an important role in conferring a selective advantage to specific sponge-associated microorganisms. Taken together, these data illustrate how evolutionary processes have distributed and partitioned ecological functions across specific sponge symbiont lineages, allowing them to occupy or share specific niches and live symbiotically with their sponge hosts.
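Two of the summary statistics reported above are simple ratios: the percentage of coding genes per MAG devoted to a gene class (Fig. 3) and the number of LGTs per Mbp of sequence (Fig. 5). The following is a minimal Python sketch of how such values could be computed, not the authors' pipeline (their commands and scripts are in File S7). The `Mag` record, its field names and the helper functions are hypothetical and introduced only for illustration; only the >85% completeness and >100 Mbp cut-offs come from the text. How the reported averages of 1.64 and 0.52 LGTs per Mbp were aggregated is not stated here, so the sketch shows only the per-dataset ratio.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical per-MAG summary; field names are illustrative only."""
    name: str
    host: str             # sponge species, or "seawater"
    completeness: float   # estimated completeness in percent
    length_bp: int        # assembled MAG length in base pairs
    coding_genes: int     # total predicted protein-coding genes
    class_genes: int      # genes assigned to the class of interest (e.g. ARP)


def percent_of_coding_genes(mag: Mag) -> float:
    """Percentage of a MAG's coding genes devoted to one gene class."""
    if mag.coding_genes == 0:
        return 0.0
    return 100.0 * mag.class_genes / mag.coding_genes


def filter_complete(mags, min_completeness=85.0):
    """Keep MAGs above the completeness cut-off (>85% in Figs. 3 and 4)."""
    return [m for m in mags if m.completeness > min_completeness]


def hosts_passing_length(mags, min_total_mbp=100.0):
    """Return {host: total bp} for hosts whose cumulative MAG length
    exceeds the cut-off (>100 Mbp for the LGT analysis)."""
    totals = {}
    for m in mags:
        totals[m.host] = totals.get(m.host, 0) + m.length_bp
    return {h: bp for h, bp in totals.items() if bp / 1_000_000 > min_total_mbp}


def lgt_density_per_mbp(n_lgts: int, total_length_bp: int) -> float:
    """Number of LGT events per Mbp of sequence."""
    return n_lgts / (total_length_bp / 1_000_000)
```

With such records, a per-lineage figure like those quoted for ARPs would be the mean of `percent_of_coding_genes` over the MAGs of that lineage after `filter_complete`, and an LGT density would be `lgt_density_per_mbp` applied to the LGT count and summed MAG length of each host returned by `hosts_passing_length`.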

Fig. 6 Visualisation of gene flow among microbial phyla for gene families enriched in sponge-associated MAGs. The inner ring and band connecting donor and recipient is coloured by protein family of the gene being transferred (CE7, GH33, ELRs, cadherin or fibronectin), with the width of the band correlating to the number of LGTs. Recipient MAGs are shown in grey. The outer ring is coloured by microbial phylum. Representation of RM and CAS gene LGTs can be found in Fig. S10.

Fig. 7 Schematic overview of microbial interactions with the host as inferred from the functional potential encoded by the sponge-associated microbial MAGs. Fbn fibronectin, cdh cadherins, RM restriction-modification systems, CAS CRISPR-associated proteins, ELP eukaryotic-like repeat proteins, CE7 carbohydrate esterase family 7, GH33 glycosyl hydrolase family 33.

Data availability

Metagenomic assemblies and MAGs from this study can be found under NCBI bioproject ID PRJNA602572. All GraftM packages can be found at https://data.ace.uq.edu.au/public/graftm/7/. Commands and scripts used to execute analyses and generate figures can be found in File S7.

Acknowledgements We are indebted to Illumina and the Earth Microbiome Project for metagenomic sequencing, including these individuals: Luke Thompson, Jon Sanders, Rodolfo Salido Benitez, Karenina Sanders, Caitriona Brennan, Jeremiah Minich, MacKenzie Bryant, Lindsay DeRight Goldasich, Greg Humphrey and Rob Knight. Mari Carmen Pineda Torres and the AIMS Seasim staff are thanked for sample collection. TT and WS were additionally supported by an R&D project registered as ANP 21005-4, “PROBIO-DEEP - Survey of potential impacts caused by oil and gas exploration on deep-sea marine holobionts and selection of potential bioindicators and bioremediation processes for these ecosystems” (UFRJ/Shell Brasil/ANP), sponsored by Shell Brasil under the ANP R&D levy as “Compromisso de Investimentos com Pesquisa e Desenvolvimento”. We acknowledge the Traditional Owners of the sea country where this work took place. We pay our respects to their elders past, present and emerging and we acknowledge their continuing spiritual connection to their sea country.

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Knowlton N. Coral reef biodiversity–habitat size matters. Science. 2001;292:1493–5.
2. Hentschel U, Usher KM, Taylor MW. Marine sponges as microbial fermenters. FEMS Microbiol Ecol. 2006;55:167–77.
3. De Goeij JM, Van Oevelen D, Vermeij MJA, Osinga R, Middelburg JJ, De Goeij AFPM, et al. Surviving in a marine desert: the sponge loop retains resources within coral reefs. Science. 2013;342:108–10.
4. Rix L, De Goeij JM, Mueller CE, Struck U, Middelburg JJ, Van Duyl FC, et al. Coral mucus fuels the sponge loop in warm- and cold-water coral reef ecosystems. Sci Rep. 2016;6:1–11.
5. Rix L, de Goeij JM, van Oevelen D, Struck U, Al-Horani FA, Wild C, et al. Differential recycling of coral and algal dissolved organic matter via the sponge loop. Funct Ecol. 2017;31:778–89.
6. Rix L, Ribes M, Coma R, Jahn MT, de Goeij JM, van Oevelen D, et al. Heterotrophy in the earliest gut: a single-cell view of heterotrophic carbon and nitrogen assimilation in sponge-microbe symbioses. ISME J. 2020;14:2554–67.
7. Moeller FU, Webster NS, Herbold CW, Behnam F, Domman D, Albertsen M, et al. Characterization of a thaumarchaeal symbiont that drives incomplete nitrification in the tropical sponge Ianthella basta. Environ Microbiol. 2019;21:3831–54.
8. Engelberts JP, Robbins SJ, de Goeij JM, Aranda M, Bell SC, Webster NS. Characterization of a sponge microbiome using an integrative genome-centric approach. ISME J. 2020;14:1100–10.
9. Hentschel U, Piel J, Degnan SM, Taylor MW. Genomic insights into the marine sponge microbiome. Nat Rev Microbiol. 2012;10:641–54.
10. Jahn MT, Arkhipova K, Markert SM, Stigloher C, Lachnit T, Pita L, et al. A phage protein aids bacterial symbionts in eukaryote immune evasion. Cell Host Microbe. 2019;26:542–50.e5.
11. Pascelli C, Laffy PW, Botté E, Kupresanin M, Rattei T, Lurgi M, et al. Viral ecogenomics across the Porifera. Microbiome. 2020;8:1–22.
12. Slaby BM, Hackl T, Horn H, Bayer K, Hentschel U. Metagenomic binning of a marine sponge microbiome reveals unity in defense but metabolic specialization. ISME J. 2017;11:2465–78.
13. Kamke J, Sczyrba A, Ivanova N, Schwientek P, Rinke C, Mavromatis K, et al. Single-cell genomics reveals complex carbohydrate degradation patterns in poribacterial symbionts of marine sponges. ISME J. 2013;7:2287–2300.
14. Bayer K, Jahn MT, Slaby BM, Moitinho-Silva L, Hentschel U. Marine sponges as Chloroflexi hot spots: genomic insights and high-resolution visualization of an abundant and diverse symbiotic clade. mSystems. 2018;3:1–19.
15. Lackner G, Peters EE, Helfrich EJN, Piel J. Insights into the lifestyle of uncultured bacterial natural product factories associated with marine sponges. Proc Natl Acad Sci USA. 2017;114:E347–E356.
16. Montalvo NF, Davis J, Vicente J, Pittiglio R, Ravel J, Hill RT. Integration of culture-based and molecular analysis of a complex sponge-associated bacterial community. PLoS One. 2014;9:1–8.
17. Botté ES, Nielsen S, Abdul Wahab MA, Webster J, Robbins S, Thomas T, et al. Changes in the metabolic potential of the sponge microbiome under ocean acidification. Nat Commun. 2019;10:1–10.
18. Marotz C, Amir A, Humphrey G, Gogul G, Knight R. DNA extraction for streamlined metagenomics of diverse environmental samples. Biotechniques. 2017;62:290–3.
19. Sturm M, Schroeder C, Bauer P. SeqPurge: highly-sensitive adapter trimming for paired-end NGS data. BMC Bioinforma. 2016;17:1–7.
20. Nurk S, Meleshko D, Korobeynikov APP. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017;1:30–47.
21. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
22. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;gr:186072.114.
23. Li D, Luo R, Liu CM, Leung CM, Ting HF, Sadakane K, et al. MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods. 2016;102:3–11.
24. Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil PA, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol. 2018;36:996.
25. Chaumeil P-A, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics. 2019;36:1–3.
26. Matsen FA, Kodner RB, Armbrust EV. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinforma. 2010;11:538.
27. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9:1–8.
28. Olm MR, Brown CT, Brooks B, Banfield JF. DRep: A tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 2017;11:2864–8.
29. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014. ISBN 3‐900051‐07‐0. http://www.R-project.org.
30. Glasl B, Robbins S, Frade PR, Marangon E, Laffy PW, Bourne DG, et al. Comparative genome-centric analysis reveals seasonal variation in the function of coral reef microbiomes. ISME J. 2020;14:1435–50.
31. Robbins SJ, Singleton CM, Chan CX, Messer LF, Geers AU, Ying H, et al. A genomic view of the reef-building coral Porites lutea and its microbial symbionts. Nat Microbiol. 2019;4:2090–2100.
32. Boyd JA, Woodcroft BJ, Tyson GW. GraftM: a tool for scalable, phylogenetically informed classification of genes within metagenomes. Nucleic Acids Res. 2018;46:e59.
33. Fan L, Reynolds D, Liu M, Stark M, Kjelleberg S, Webster NS, et al. Functional equivalence and evolutionary convergence in complex communities of microbial sponge symbionts. Proc Natl Acad Sci USA. 2012;109:E1878–87.
34. Nguyen LT, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–74.
35. Alves RJE, Minh BQ, Urich T, Von Haeseler A, Schleper C. Unifying the global phylogeny and environmental distribution of ammonia-oxidising archaea based on amoA genes. Nat Commun. 2018;9:1–17.
36. Singleton C. Microbial community structural and metabolic dynamics in thawing permafrost. PhD thesis, School of Chemistry and Molecular Biosciences, University of Queensland, Australia; 2018. https://doi.org/10.14264/uql.2018.815.
37. Katoh K, Toh H. Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 2008;9:286–98.
38. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016;44:W242–W245.
39. Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42:222–30.
40. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30.
41. Yin Y, Mao X, Yang J, Chen X, Mao F, Xu Y. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 2012;40:W445–W451.
42. Cantarel BL, Coutinho PM, Rancurel C, Bernard T, Lombard V, Henrissat B. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 2009;37:233–8.
43. Song W, Wemheuer B, Zhang S, Steensen K, Thomas T. MetaCHIP: community-level horizontal gene transfer identification through the combination of best-match and phylogenetic approaches. Microbiome. 2019;7:1–14.
44. Bansal MS, Alm EJ, Kellis M. Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss. Bioinformatics. 2012;28:283–91.
45. Thomas T, Moitinho-Silva L, Lurgi M, Björk JR, Easson C, Astudillo-García C, et al. Diversity, structure and convergent evolution of the global sponge microbiome. Nat Commun. 2016;7:11870. https://doi.org/10.1038/ncomms11870.
46. Zhang F, Pita L, Erwin PM, Abaid S, López-Legentil S, Hill RT. Symbiotic archaea in marine sponges show stability and host specificity in community structure and ammonia oxidation functionality. FEMS Microbiol Ecol. 2014;90:699–707.
47. Steger D, Ettinger-Epstein P, Whalan S, Hentschel U, De Nys R, Wagner M, et al. Diversity and mode of transmission of ammonia-oxidizing archaea in marine sponges. Environ Microbiol. 2008;10:1087–94.
48. Øverland M, Mydland LT, Skrede A. Marine macroalgae as sources of protein and bioactive compounds in feed for monogastric animals. J Sci Food Agric. 2019;99:13–24.
49. Stiger-Pouvreau V, Bourgougnon N, Deslandes E. Carbohydrates from seaweeds. In: Fleurence J, Levine I, editors. Seaweed in Health and Disease Prevention. Cambridge, MA, USA: Academic Press; 2016. pp. 223–74.
50. Kappelmann L, Krüger K, Hehemann JH, Harder J, Markert S, Unfried F, et al. Polysaccharide utilization loci of North Sea Flavobacteriia as basis for using SusC/D-protein expression for predicting major phytoplankton glycans. ISME J. 2019;13:76–91.
51. Ficko-Blean E, Préchoux A, Thomas F, Rochat T, Larocque R, Zhu Y, et al. Carrageenan catabolism is encoded by a complex regulon in marine heterotrophic bacteria. Nat Commun. 2017;8:1685. https://doi.org/10.1038/s41467-017-01832-6.
52. Dong S, Chang Y, Shen J, Xue C, Chen F. Purification, expression and characterization of a novel α-L-fucosidase from a marine bacteria Wenyingzhuangia fucanilytica. Protein Expr Purif. 2017;129:9–17.
53. Hadaidi G, Gegner HM, Ziegler M, Voolstra CR. Carbohydrate composition of mucus from scleractinian corals from the central Red Sea. Coral Reefs. 2019;38:21–7.
54. Meikle P, Richards GN, Yellowlees D. Structural investigations on the mucus from six species of coral. Mar Biol. 1988;99:187–93.
55. De Santi C, Willassen NP, Williamson A. Biochemical characterization of a family 15 carbohydrate esterase from a bacterial marine Arctic metagenome. PLoS One. 2016;11:1–22.
56. Baath JA, Mazurkewich S, Poulsen JCN, Olsson L, Lo Leggio L, Larsbrink J. Structure-function analyses reveal that a glucuronoyl esterase from Teredinibacter turnerae interacts with carbohydrates and aromatic compounds. J Biol Chem. 2019;294:6635–44.
57. Hettiarachchi SA, Kwon YK, Lee Y, Jo E, Eom TY, Kang YH, et al. Characterization of an acetyl xylan esterase from the marine bacterium Ochrovirga pacifica and its synergism with xylanase on beechwood xylan. Microb Cell Fact. 2019;18:1–10.
58. Garrone R, Thiney Y, Pavans de Ceccaty M. Electron microscopy of a mucopolysaccharide cell coat in sponges. Experientia. 1971;27:1324–6.
59. Juge N, Tailford L, Owen CD. Sialidases from gut bacteria: a mini-review. Biochem Soc Trans. 2016;44:166–75.
60. Huang YL, Chassard C, Hausmann M, Von Itzstein M, Hennet T. Sialic acid catabolism drives intestinal inflammation and microbial dysbiosis in mice. Nat Commun. 2015;6. https://doi.org/10.1038/ncomms9141.
61. Fernàndez-Busquets X, Burger MM. Circular proteoglycans from sponges: First members of the spongican family. Cell Mol Life Sci. 2003;60:88–112.

62. Reisky L, Préchoux A, Zühlke MK, Bäumgen M, Robb CS, Gerlach N, et al. A marine bacterial enzymatic cascade degrades the algal polysaccharide ulvan. Nat Chem Biol. 2019;15:803–12.
63. Laffy PW, Wood-Charlson EM, Turaev D, Jutz S, Pascelli C, Botté ES, et al. Reef invertebrate viromics: diversity, host specificity and functional capacity. Environ Microbiol. 2018;20:2125–41.
64. Koonin EV, Makarova KS, Wolf YI. Evolutionary genomics of defense systems in archaea and bacteria. Annu Rev Microbiol. 2017;71:233–61.
65. Gordon BR, Leggat W. Symbiodinium - Invertebrate symbioses and the role of metabolomics. Mar Drugs. 2010;8:2546–68.
66. Hurst GDD. Extended genomes: symbiosis and evolution. Interface Focus. 2017;7. https://doi.org/10.1098/rsfs.2017.0001.
67. Ma N, Ma X. Dietary amino acids and the gut-microbiome-immune axis: physiological metabolism and therapeutic prospects. Compr Rev Food Sci Food Saf. 2019;18:221–42.
68. Douglas AE. Nutritional interactions in insect-microbial symbioses: aphids and their symbiotic bacteria Buchnera. Annu Rev Entomol. 1998;43:17–37.
69. Pita L, Rix L, Slaby BM, Franke A, Hentschel U. The sponge holobiont in a changing ocean: from microbes to ecosystems. Microbiome. 2018;6:1–18.
70. Fiore CL, Labrie M, Jarett JK, Lesser MP. Transcriptional activity of the giant barrel sponge, Xestospongia muta Holobiont: molecular evidence for metabolic interchange. Front Microbiol. 2015;6:364. https://doi.org/10.3389/fmicb.2015.00364.
71. Frank AC. Molecular host mimicry and manipulation in bacterial symbionts. FEMS Microbiol Lett. 2019;366:1–10.
72. Reynolds D, Thomas T. Evolution and function of eukaryotic-like proteins from sponge symbionts. Mol Ecol. 2016;25:5242–53.
73. Nguyen MTHD, Liu M, Thomas T. Ankyrin‐repeat proteins from sponge symbionts modulate amoebal phagocytosis. Mol Ecol. 2014;23:1635–45.
74. Costa TRD, Felisberto-Rodrigues C, Meir A, Prevost MS, Redzej A, Trokter M, et al. Secretion systems in Gram-negative bacteria: structural and mechanistic insights. Nat Rev Microbiol. 2015;13:343–59.
75. Chen L, Song N, Liu B, Zhang N, Alikhan NF, Zhou Z, et al. Genome-wide identification and characterization of a superfamily of bacterial extracellular contractile injection systems. Cell Rep. 2019;29:511–21.e2.
76. Fraiberg M, Borovok I, Weiner RM, Lamed R. Discovery and characterization of cadherin domains in Saccharophagus degradans 2-40. J Bacteriol. 2010;192:1066–74.
77. Schwarz-Linek U, Werner JM, Pickford AR, Gurusiddappa S, Ewa JHK, Pilka S, et al. Pathogenic bacteria attach to human fibronectin through a tandem β-zipper. Nature. 2003;423:177–81.
78. Hymes JP, Klaenhammer TR. Stuck in the middle: fibronectin-binding proteins in gram-positive bacteria. Front Microbiol. 2016;7:1–9.
79. Smillie CS, Smith MB, Friedman J, Cordero OX, David LA, Alm EJ. Ecology drives a global network of gene exchange connecting the human microbiome. Nature. 2011;480:241–4.