Supplementary materials for:
Fidelity varies in the symbiosis between a gutless marine worm and its microbial consortium

Yui Sato*¹, Juliane Wippler¹, Cecilia Wentrup², Rebecca Ansorge¹, Miriam Sadowski¹, Harald Gruber-Vodicka¹, Nicole Dubilier*¹, Manuel Kleiner*³
¹Max Planck Institute for Marine Microbiology, Celsiusstr. 1, D-28359 Bremen, Germany
²University of Vienna, Department of Microbiology and Ecosystem Science, Althanstr. 14, A-1090 Vienna, Austria
³Department of Plant and Microbial Biology, North Carolina State University, Raleigh, North Carolina, USA

Contents:
1. Supplementary text
1.1. Detection limit of symbionts based on single-copy marker genes
1.2. Assessment of symbiont community compositions based on 16S ribosomal RNA genes
1.3. Symbiont 16S ribosomal RNA gene sequences indicated a linkage between haplotypes of Candidatus Thiosymbion and host mitochondria
1.4. Phylogenetic reconstruction of mitochondria and symbionts based on SNPs identified using a deterministic genotyping approach
1.5. SNP-identification based on genotype probabilities enhances capabilities of population-level metagenomic analyses on host-associated microbiota
1.6. Estimation of the effective population size of symbionts within an Olavius algarvensis individual based on genome-wide SNP abundance
Reference for supplementary text

2. Supplementary figures
Supplementary Figure S1 Phylogenomic tree of symbionts in Olavius algarvensis in relation to reference bacterial genomes
Supplementary Figure S2 Phylogeny of 16S ribosomal RNA gene sequences for symbionts in Olavius algarvensis in relation to reference bacterial sequences
Supplementary Figure S3 Phylogenies of mitochondria and Candidatus Thiosymbion within the major mitochondrial lineages A and B
Supplementary Figure S4 Core SNP-trees based on called genotypes using a deterministic approach
Supplementary Figure S5 Correlation of mitochondrial pairwise genetic distances with corresponding genetic distances of symbionts in Olavius algarvensis
Supplementary Figure S6 Effective population size estimates of the symbiont per Olavius algarvensis individual
Supplementary Figure S7 Sequence alignments of 16S ribosomal RNA genes of Olavius algarvensis symbionts
Supplementary Figure S8 Distribution of mean relative read coverage among single-copy genes within a single host per symbiont species
Supplementary Figure S9 Symbiont composition of individual Olavius algarvensis samples of two COI-haplotypes (A and B) from two locations (Sant’Andrea and Cavoli)

3. Supplementary tables
Supplementary Table S1 Reference genome statistics of Olavius algarvensis symbionts
Supplementary Table S2 Assessment of strain variability of symbionts within Olavius algarvensis individuals
Supplementary Table S3 Comparison of SNP-identification methods

1. Supplementary text

1.1. Detection limit of symbionts based on single-copy marker genes
In the 80 metagenomes of Olavius algarvensis, we assessed (i) symbiont community composition and (ii) symbiont prevalence (the number of host individuals with the respective symbiont species detected; n = 20 hosts per host group). These are assessed by quantifying sequences of single-copy genes (SCGs) that are specific to symbiont species.
Between 162 and 431 SCGs per symbiont species were extracted from their reference genomes. Means and deviations of SCG read coverages showed that a small subset of SCGs, especially in the Candidatus (Ca.) Thiosymbion symbiont, contain short repeat sequences that attract a high abundance of erroneously mapped reads (Supplementary Figure S8). These reference SCGs were excluded from symbiont abundance calculations to avoid overestimation. To this end, the relative abundance of each symbiont was calculated from the mean read coverage of SCGs in the interquartile depth range, i.e. genes whose depths ranked between the 25th and 75th percentiles, with all zero-coverage genes ranked together. Consequently, a symbiont was detectable by this method only when reads covered more than 25% of the SCGs in its reference. This method provides a robust assessment of symbionts, although its detection limit is at times conservative. For example, this method per se indicated the presence of the spirochetal symbiont in all host individuals but one. However, in the individual in which the spirochete symbiont was below the detection limit, sequences matching 29 out of 162 SCGs (17%) as well as the 16S rRNA gene were detected; we therefore deemed the spirochete to be present in this individual. Consequently, we interpreted all host individuals as harboring the spirochete (Figure 2).

1.2. Assessment of symbiont community compositions based on 16S ribosomal RNA genes
Relative abundances of O. algarvensis symbionts were also estimated by mapping metagenome reads to reference 16S ribosomal RNA gene (SSU) sequences of the symbionts, in addition to the SCG-based approach above.
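For reference, the SCG-based estimate from Section 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' script: the function names and the exact rank cut-offs (integer quarters of the gene count) are hypothetical, and the input is assumed to be one mean read coverage value per SCG.

```python
def interquartile_mean_coverage(coverages):
    """Mean coverage of the SCGs ranked between the 25th and 75th percentiles.

    Assumption: the percentile cut-offs are taken as integer quarters of
    the gene count; the source does not state the exact rounding rule.
    """
    if not coverages:
        raise ValueError("at least one SCG coverage value is required")
    ranked = sorted(coverages)      # zero-coverage genes sort together at the bottom
    n = len(ranked)
    lower = n // 4                  # first rank inside the interquartile range
    upper = n - n // 4              # one past the last rank inside the range
    middle = ranked[lower:upper]
    return sum(middle) / len(middle)


def relative_abundances(coverage_by_symbiont):
    """Normalize interquartile mean coverages so that they sum to 1."""
    means = {name: interquartile_mean_coverage(cov)
             for name, cov in coverage_by_symbiont.items()}
    total = sum(means.values())
    if total == 0:
        return {name: 0.0 for name in means}
    return {name: value / total for name, value in means.items()}
```

Under these assumptions, a symbiont with reads on only 29 of 162 SCGs has an interquartile mean of zero, because every gene ranked inside the interquartile range has zero coverage. This matches the conservative detection limit described in Section 1.1, and trimming the top quartile is also what keeps repeat-containing SCGs with inflated coverage out of the estimate.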
For estimation of symbiont relative abundance based on SSU sequences, quality-filtered reads matching symbiont SSU sequences were quantified with Kallisto using representative SSU reference sequences (NCBI accession numbers: spirochete, AJ620502; Delta4, AJ620497; Gamma3, AJ620496; Delta3, AM493254; Delta1, AF328857; Ca. Thiosymbion, AF328856; the Delta1a and Delta1b symbionts, which share highly similar SSU sequences¹, were both represented by AF328857). The results showed community structures in all samples that were nearly identical to those based on SCG sequences (Supplementary Figure S9, Figure 2). The only difference was that relative abundances of spirochete symbionts appeared slightly greater in the SSU-based estimation than in the SCG-based estimation, which is likely due to differences in the copy number of SSU in the spirochete symbiont compared to the others.

1.3. Symbiont 16S ribosomal RNA gene sequences indicated a linkage between haplotypes of Candidatus Thiosymbion and host mitochondria
As an initial assessment of partner fidelity, we assembled SSU sequences of symbionts in each O. algarvensis sample and searched for SNPs that are characteristic of certain host groups (COI-haplotypes and locations). Only the Ca. Thiosymbion symbiont showed a SNP site in the SSU sequences that was linked to the host COI-haplotype but not to locations, while no other symbionts showed SNPs in SSU sequences that could be linked to certain host groups (Supplementary Figure S7).

1.4. Phylogenetic reconstruction of mitochondria and symbionts based on SNPs identified using a deterministic genotyping approach
We identified SNPs for mtDNA and symbionts using a deterministic genotyping approach for phylogenetic reconstruction, in addition to the probabilistic approach to SNP-identification, to compare the results of the two methods.
For SNP-identification by deterministic genotyping, the same symbiont and mitochondrial reads were analyzed with the SNIPPY pipeline v3.2 (https://github.com/tseemann/snippy), with the same reference genomes of mtDNA and symbionts as described in the main document. Genotypes were first called at all reference nucleotide positions with a minimum coverage of 5× for each sample. Core SNP sites were subsequently identified among genotype-called sites that were covered ≥5× in all samples. Similar to the probabilistic approach described in the main document, when no core SNP site was found, samples with insufficient genotype data were excluded using a cut-off of lateral coverage (% reference sites with coverage ≥5×) for SNP identification of a given symbiont. Phylogenetic trees with bootstrap support were computed from the resulting core SNP nucleotide alignment with IQ-TREE v1.5.5² using the best-model finder implemented within IQ-TREE. Phylogenetic trees of the symbionts and mitochondria were visualized in the iTOL web tool³.
Because the deterministic genotype-calling method (i) relies on deeply sequenced genomic sites to account for sequencing errors and (ii) identifies core SNP sites at common loci where genotypes were called in all samples, it enables more conservative and more robust SNP identification. However, it also limits the number of detectable SNP sites when low-coverage sequences are studied. In our study, this approach required us to exclude many samples when no SNP site could be detected due to low sequence coverage, or when certain symbionts were absent. Consequently, the number of samples and SNPs we could include in downstream analyses was substantially reduced when using the deterministic genotyping approach, compared to the genotype-probability based method (Suppl. Table S3).
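The core-site logic described in this section (sites covered ≥5× in every sample, then kept only if the called genotypes differ) can be illustrated with a minimal Python sketch. It is not the SNIPPY implementation: the function names and the input layout (one called base and one read depth per reference position for each sample) are assumptions made purely for illustration.

```python
MIN_DEPTH = 5  # minimum coverage for a genotype call, as stated in the text


def callable_sites(depths, min_depth=MIN_DEPTH):
    """Reference positions whose read depth reaches the minimum coverage."""
    return {pos for pos, depth in enumerate(depths) if depth >= min_depth}


def lateral_coverage(depths, min_depth=MIN_DEPTH):
    """Fraction of reference sites with coverage >= min_depth."""
    return len(callable_sites(depths, min_depth)) / len(depths)


def core_snp_sites(samples, min_depth=MIN_DEPTH):
    """Positions callable in every sample where the called bases differ.

    samples maps a sample name to a (bases, depths) pair, both indexed
    by reference position (hypothetical layout).
    """
    core = None
    for _bases, depths in samples.values():
        sites = callable_sites(depths, min_depth)
        core = sites if core is None else core & sites
    snps = []
    for pos in sorted(core or ()):
        alleles = {bases[pos] for bases, _depths in samples.values()}
        if len(alleles) > 1:        # variable among samples, so a core SNP
            snps.append(pos)
    return snps
```

Because the core set is an intersection across samples, a single low-coverage sample can shrink it sharply or empty it. That is why, in this kind of scheme, samples below a lateral coverage cut-off are dropped before core SNPs are identified, at the cost of fewer samples in downstream analyses.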
An exception was the spirochete symbiont, where slightly more SNPs were captured by the genotype-calling method (99 sites) than by the genotype probability-based approach (88 sites). This was likely due to the exceptionally high genetic variability of the spirochete symbionts within and between host groups (see Figure 4g), which resulted in many variable sites being removed when the latter, probabilistic approach filtered out statistically insignificant SNP sites. Nevertheless, phylogenies based on limited SNPs and samples using the deterministic genotyping