Online Supplementary Figures of Chapter 3
Fabio Gori
Figures 1-30 contain pie charts showing the population characterization resulting from the taxonomic assignment computed by the methods. On the simulated datasets the true population distribution is also shown.
MTR: Bacillales (47.11%), Thermoanaerobacterales (0.76%), Clostridiales (33.58%), Lactobacillales (7.99%), Others (10.55%)
LCA: Bacillales (48.38%), Thermoanaerobacterales (0.57%), Clostridiales (32.14%), Lactobacillales (10.07%), Others (8.84%)
True Distribution: Prochlorales 333 (5.84%), Bacillales 1974 (34.61%), Halanaerobiales 249 (4.37%), Thermoanaerobacterales 587 (10.29%), Clostridiales 1640 (28.75%), Lactobacillales 535 (9.38%), Herpetosiphonales 386 (6.77%)
Figure 1: Population distributions (rank Order) of M1, coverage 0.1x, by MTR and LCA, and the true population distribution.
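The percentages in the legends can be reproduced from the slice labels of the pie charts. The following is a minimal sketch for the True Distribution panel of Figure 1. It assumes that the bare numbers printed on the slices are counts and that each count belongs to the taxon whose legend percentage it reproduces; the names `true_counts` and `shares` are illustrative only and do not come from the original chapter.

```python
# Slice counts of the Figure 1 "True Distribution" panel.
# Assumption: each count is paired with the taxon whose legend
# percentage it reproduces (the figure does not state the pairing).
true_counts = {
    "Prochlorales": 333,
    "Bacillales": 1974,
    "Halanaerobiales": 249,
    "Thermoanaerobacterales": 587,
    "Clostridiales": 1640,
    "Lactobacillales": 535,
    "Herpetosiphonales": 386,
}


def shares(counts):
    """Return each taxon's share of the total, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {taxon: round(100 * n / total, 2) for taxon, n in counts.items()}


if __name__ == "__main__":
    for taxon, pct in shares(true_counts).items():
        print(f"{taxon} ({pct:.2f}%)")
```

Under these assumptions the computed shares agree with the legend values of the panel, for example 1974 out of 5704 gives 34.61% for Bacillales.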
MTR: Bacillus (47.34%), Clostridium (14.61%), Lactobacillus (8.71%), Anaerocellum (11.41%), Alkaliphilus (5.14%), Others (12.79%)
LCA: Bacillus (51.41%), Clostridium (8.08%), Lactobacillus (9.23%), Anaerocellum (15.79%), Alkaliphilus (5.17%), Others (10.31%)
True Distribution: Herpetosiphon 386 (6.77%), Prochlorococcus 333 (5.84%), Bacillus 1974 (34.61%), Clostridium 1088 (19.07%), Lactobacillus 535 (9.38%), Halothermothrix 249 (4.37%), Caldicellulosiruptor 587 (10.29%), Alkaliphilus 552 (9.68%)
Figure 2: Population distributions (rank Genus) of M1, coverage 0.1x, by MTR and LCA, and the true population distribution.
MTR: Prochlorales (0.07%), Bacillales (47.97%), Thermoanaerobacterales (0.66%), Clostridiales (32.18%), Lactobacillales (7.76%), Others (11.35%)
LCA: Prochlorales (0.10%), Bacillales (49.02%), Thermoanaerobacterales (0.59%), Clostridiales (30.62%), Lactobacillales (9.50%), Others (10.16%)
True Distribution: Prochlorales 3293 (5.65%), Bacillales 21382 (36.68%), Halanaerobiales 2320 (3.98%), Thermoanaerobacterales 6154 (10.56%), Clostridiales 15936 (27.34%), Lactobacillales 5263 (9.03%), Herpetosiphonales 3950 (6.78%)
Figure 3: Population distributions (rank Order) of M1, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Prochlorococcus (0.08%), Bacillus (47.02%), Clostridium (12.39%), Lactobacillus (7.34%), Anaerocellum (11.64%), Caldicellulosiruptor (0.01%), Alkaliphilus (5.17%), Others (16.35%)
LCA: Prochlorococcus (0.13%), Bacillus (50.58%), Clostridium (7.84%), Lactobacillus (7.96%), Anaerocellum (15.14%), Caldicellulosiruptor (0.02%), Alkaliphilus (4.58%), Others (13.74%)
True Distribution: Herpetosiphon 3950 (6.78%), Prochlorococcus 3293 (5.65%), Bacillus 21382 (36.68%), Clostridium 10400 (17.84%), Lactobacillus 5263 (9.03%), Halothermothrix 2320 (3.98%), Caldicellulosiruptor 6154 (10.56%), Alkaliphilus 5536 (9.50%)
Figure 4: Population distributions (rank Genus) of M1, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Chroococcales (5.33%), Prochlorales (0.08%), Bacillales (55.35%), Thermoanaerobacterales (0.76%), Clostridiales (32.40%), Others (6.07%)
LCA: Chroococcales (3.89%), Prochlorales (0.11%), Bacillales (58.88%), Thermoanaerobacterales (0.58%), Clostridiales (30.25%), Others (6.28%)
True Distribution: Prochlorales 13242 (7.47%), Bacillales 82311 (46.46%), Halanaerobiales 8704 (4.91%), Thermoanaerobacterales 23108 (13.04%), Clostridiales 41237 (23.27%), Herpetosiphonales 8576 (4.84%)
Figure 5: Population distributions (rank Order) of M1, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: Synechococcus (5.03%), Prochlorococcus (0.09%), Bacillus (53.14%), Clostridium (7.44%), Anaerocellum (14.57%), Caldicellulosiruptor (0.02%), Alkaliphilus (6.60%), Others (13.11%)
LCA: Synechococcus (3.08%), Prochlorococcus (0.14%), Bacillus (59.89%), Clostridium (4.02%), Anaerocellum (17.51%), Caldicellulosiruptor (0.03%), Alkaliphilus (5.36%), Others (9.97%)
True Distribution: Herpetosiphon 8576 (4.84%), Prochlorococcus 13242 (7.47%), Bacillus 82311 (46.46%), Clostridium 18981 (10.71%), Halothermothrix 8704 (4.91%), Caldicellulosiruptor 23108 (13.04%), Alkaliphilus 22256 (12.56%)
Figure 6: Population distributions (rank Genus) of M1, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: Malpighiales (4.26%), Desulfuromonadales (19.45%), Burkholderiales (72.56%), Others (3.73%)
LCA: Malpighiales (5.61%), Desulfuromonadales (22.01%), Burkholderiales (68.01%), Others (4.36%)
True Distribution: Desulfuromonadales 1984 (21.87%), Burkholderiales 7086 (78.13%)
Figure 7: Population distributions (rank Order) of M2, coverage 0.1x, by MTR and LCA, and the true population distribution.
MTR: Comamonas (0.09%), Populus (5.40%), Acidovorax (15.57%), Geobacter (18.70%), Burkholderia (49.23%), Others (11.00%)
LCA: Comamonas (0.15%), Populus (7.58%), Acidovorax (3.46%), Geobacter (19.96%), Burkholderia (57.46%), Others (11.39%)
True Distribution: Comamonas 1415 (15.60%), Geobacter 1984 (21.87%), Burkholderia 3680 (40.57%), Delftia 1991 (21.95%)
Figure 8: Population distributions (rank Genus) of M2, coverage 0.1x, by MTR and LCA, and the true population distribution.
MTR: Malpighiales (4.60%), Desulfuromonadales (19.34%), Burkholderiales (71.18%), Others (4.89%)
LCA: Malpighiales (6.00%), Desulfuromonadales (21.22%), Burkholderiales (67.31%), Others (5.47%)
True Distribution: Desulfuromonadales 20375 (22.09%), Burkholderiales 71882 (77.91%)
Figure 9: Population distributions (rank Order) of M2, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Comamonas (0.11%), Populus (4.76%), Acidovorax (14.06%), Geobacter (18.36%), Burkholderia (49.82%), Delftia (0.04%), Others (12.84%)
LCA: Comamonas (0.18%), Populus (7.79%), Acidovorax (3.25%), Geobacter (19.05%), Burkholderia (56.15%), Delftia (0.08%), Others (13.51%)
True Distribution: Comamonas 14357 (15.56%), Geobacter 20375 (22.09%), Burkholderia 37184 (40.30%), Delftia 20341 (22.05%)
Figure 10: Population distributions (rank Genus) of M2, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Malpighiales (8.05%), Desulfuromonadales (2.21%), Burkholderiales (84.30%), Others (5.44%)
LCA: Malpighiales (10.29%), Desulfuromonadales (2.21%), Burkholderiales (80.83%), Others (6.67%)
True Distribution: Desulfuromonadales 4852 (2.77%), Burkholderiales 170140 (97.23%)
Figure 11: Population distributions (rank Order) of M2, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: Populus (8.48%), Acidovorax (15.11%), Geobacter (0.83%), Burkholderia (61.36%), Delftia (0.06%), Others (14.16%)
LCA: Populus (12.92%), Acidovorax (3.34%), Geobacter (1.16%), Burkholderia (69.24%), Delftia (0.10%), Others (13.25%)
True Distribution: Geobacter 4852 (2.77%), Burkholderia 92999 (53.14%), Delftia 77141 (44.08%)
Figure 12: Population distributions (rank Genus) of M2, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: Rhizobiales (2.92%), Pseudomonadales (21.51%), Bifidobacteriales (8.27%), Chromatiales (0.06%), Xanthomonadales (17.72%), Alteromonadales (46.09%), Others (3.42%)
LCA: Rhizobiales (0.85%), Pseudomonadales (18.94%), Bifidobacteriales (10.18%), Chromatiales (0.09%), Xanthomonadales (21.34%), Alteromonadales (45.62%), Others (2.98%)
True Distribution: Rhizobiales 745 (6.30%), Pseudomonadales 2571 (21.74%), Bifidobacteriales 1007 (8.52%), Chromatiales 11 (0.09%), Xanthomonadales 2268 (19.18%), Alteromonadales 5222 (44.16%)
Figure 13: Population distributions (rank Order) of M3, coverage 0.1x, by MTR and LCA, and the true population distribution.
MTR: Shewanella (48.41%), Pseudomonas (20.79%), Bifidobacterium (8.86%), Stenotrophomonas (18.26%), Others (3.69%)
LCA: Shewanella (50.16%), Pseudomonas (18.58%), Bifidobacterium (11.75%), Stenotrophomonas (17.25%), Others (2.25%)
True Distribution: Shewanella 5222 (44.16%), Pseudomonas 2571 (21.74%), Bifidobacterium 1007 (8.52%), Stenotrophomonas 2268 (19.18%), Halorhodospira 11 (0.09%), Parvibaculum 745 (6.30%)
Figure 14: Population distributions (rank Genus) of M3, coverage 0.1x, by MTR and LCA, and the true population distribution.
MTR: Rhizobiales (2.46%), Pseudomonadales (21.21%), Bifidobacteriales (8.11%), Chromatiales (0.09%), Xanthomonadales (18.07%), Alteromonadales (46.32%), Others (3.73%)
LCA: Rhizobiales (0.82%), Pseudomonadales (17.88%), Bifidobacteriales (9.97%), Chromatiales (0.12%), Xanthomonadales (22.00%), Alteromonadales (46.20%), Others (3.02%)
True Distribution: Rhizobiales 6650 (5.69%), Pseudomonadales 24939 (21.32%), Bifidobacteriales 9771 (8.35%), Chromatiales 139 (0.12%), Xanthomonadales 22929 (19.61%), Alteromonadales 52521 (44.91%)
Figure 15: Population distributions (rank Order) of M3, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Shewanella (46.50%), Pseudomonas (20.64%), Bifidobacterium (8.60%), Stenotrophomonas (18.56%), Parvibaculum (0.01%), Others (5.70%)
LCA: Shewanella (49.93%), Pseudomonas (17.44%), Bifidobacterium (11.30%), Stenotrophomonas (17.49%), Parvibaculum (0.01%), Others (3.83%)
True Distribution: Shewanella 52521 (44.91%), Pseudomonas 24939 (21.32%), Bifidobacterium 9771 (8.35%), Stenotrophomonas 22929 (19.61%), Halorhodospira 139 (0.12%), Parvibaculum 6650 (5.69%)
Figure 16: Population distributions (rank Genus) of M3, coverage 1x, by MTR and LCA, and the true population distribution.
MTR: Bifidobacteriales (0.66%), Alteromonadales (97.01%), Others (2.33%)
LCA: Bifidobacteriales (0.79%), Alteromonadales (97.35%), Others (1.86%)
True Distribution: Bifidobacteriales 1157 (0.69%), Alteromonadales 165819 (99.31%)
Figure 17: Population distributions (rank Order) of M3, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: Shewanella (96.71%), Bifidobacterium (0.70%), Others (2.58%)
LCA: Shewanella (97.52%), Bifidobacterium (0.83%), Others (1.65%)
True Distribution: Shewanella 165819 (99.31%), Bifidobacterium 1157 (0.69%)
Figure 18: Population distributions (rank Genus) of M3, coverage 4x, by MTR and LCA, and the true population distribution.
MTR: root 19 (1.21%), Bacteroidetes 49 (3.11%), Proteobacteria 1329 (84.33%), Actinobacteria 159 (10.09%), Others 20 (1.27%)
LCA: root 19 (1.57%), Bacteroidetes 39 (3.23%), Proteobacteria 1017 (84.19%), Actinobacteria 126 (10.43%), Others 7 (0.58%)
Figure 19: Population distributions (rank Phylum) of Saltern dataset by MTR (top) and LCA (bottom).
MTR: root 11 (0.72%), Gammaproteobacteria 99 (6.47%), Actinobacteria (class) 159 (10.39%), Alphaproteobacteria 1137 (74.31%), Betaproteobacteria 74 (4.84%), Flavobacteria 22 (1.44%), Others 28 (1.83%)
LCA: root 11 (1.05%), Gammaproteobacteria 56 (5.33%), Actinobacteria (class) 126 (11.99%), Alphaproteobacteria 763 (72.60%), Betaproteobacteria 68 (6.47%), Flavobacteria 19 (1.81%), Others 8 (0.76%)
Figure 20: Population distributions (rank Class) of Saltern dataset by MTR (top) and LCA (bottom).
MTR: root 135 (10.25%), Rhizobiales 19 (1.44%), Rickettsiales 840 (63.78%), Actinomycetales 57 (4.33%), Caudovirales 11 (0.84%), Burkholderiales 12 (0.91%), Flavobacteriales 22 (1.67%), Rhodobacterales 185 (14.05%), Others 36 (2.73%)
LCA: root 70 (8.67%), Rhizobiales 7 (0.87%), Rickettsiales 544 (67.41%), Actinomycetales 44 (5.45%), Caudovirales 11 (1.36%), Burkholderiales 9 (1.12%), Flavobacteriales 19 (2.35%), Rhodobacterales 103 (12.76%)
Figure 21: Population distributions (rank Order) of Saltern dataset by MTR (top) and LCA (bottom).
MTR: root 158 (16.14%), Clavibacter 12 (1.23%), Roseobacter 82 (8.38%), Candidatus Pelagibacter 702 (71.71%), Others 25 (2.55%)
LCA: root 75 (11.81%), Roseobacter 6 (0.94%), Candidatus Pelagibacter 544 (85.67%), Others 10 (1.57%)
Figure 22: Population distributions (rank Genus) of Saltern dataset by MTR (top) and LCA (bottom).
MTR: root 257 (1.12%), Proteobacteria 1652 (7.17%), Ascomycota 9116 (39.59%), Cnidaria 10977 (47.67%), Arthropoda 240 (1.04%), Crenarchaeota 308 (1.34%), Others 477 (2.07%)
LCA: root 233 (1.41%), Proteobacteria 1310 (7.93%), Ascomycota 8267 (50.02%), Cnidaria 5867 (35.50%), Arthropoda 170 (1.03%), Crenarchaeota 295 (1.79%), Others 384 (2.32%)
Figure 23: Population distributions (rank Phylum) of Coral dataset by MTR (top) and LCA (bottom).
MTR: root 248 (1.13%), Gammaproteobacteria 1316 (6.00%), Demospongiae 190 (0.87%), Anthozoa 10608 (48.39%), Insecta 215 (0.98%), Eurotiomycetes 509 (2.32%), Sordariomycetes 7922 (36.14%), Thermoprotei 279 (1.27%), Others 633 (2.89%)
LCA: root 207 (1.68%), Gammaproteobacteria 1068 (8.68%), Demospongiae 136 (1.11%), Anthozoa 5847 (47.53%), Insecta 170 (1.38%), Eurotiomycetes 375 (3.05%), Sordariomycetes 3749 (30.48%), Thermoprotei 272 (2.21%), Others 477 (3.88%)
Figure 24: Population distributions (rank Class) of Coral dataset by MTR (top) and LCA (bottom).
MTR: root 495 (2.36%), Eurotiales 472 (2.25%), Hypocreales 6710 (31.92%), Sordariales 736 (3.50%), Scleractinia 10173 (48.40%), Hymenoptera 150 (0.71%), Caudovirales 119 (0.57%), Pseudomonadales 1060 (5.04%), Pleosporales 123 (0.59%), Nitrosopumilales 128 (0.61%), Others 853 (4.06%)
LCA: root 250 (3.65%), Eurotiales 321 (4.69%), Hypocreales 1899 (27.76%), Sordariales 378 (5.53%), Scleractinia 2198 (32.13%), Hymenoptera 104 (1.52%), Caudovirales 119 (1.74%), Pseudomonadales 976 (14.27%), Pleosporales 88 (1.29%), Nitrosopumilales 76 (1.11%), Others 432 (6.31%)
Figure 25: Population distributions (rank Order) of Coral dataset by MTR (top) and LCA (bottom).
MTR: Acinetobacter 1031 (9.03%), Aspergillus 279 (2.44%), Gibberella 3492 (30.57%), Neurospora 133 (1.16%), Podospora 80 (0.70%), Chaetomium 146 (1.28%), T4-like viruses 57 (0.50%), Porites 4540 (39.75%), Phaeosphaeria 90 (0.79%), Magnaporthe 313 (2.74%), Nitrosopumilus 128 (1.12%), Others 1133 (9.92%)
LCA: Acinetobacter 944 (20.15%), Aspergillus 80 (1.71%), Gibberella 1804 (38.51%), Neurospora 76 (1.62%), Podospora 76 (1.62%), Chaetomium 105 (2.24%), T4-like viruses 57 (1.22%), Porites 643 (13.72%), Phaeosphaeria 51 (1.09%), Magnaporthe 169 (3.61%), Nitrosopumilus 76 (1.62%), Others 604 (12.89%)
Figure 26: Population distributions (rank Genus) of Coral dataset by MTR (top) and LCA (bottom).
MTR: Bacteroidetes 88971 (79.69%), Firmicutes 19223 (17.22%), Verrucomicrobia 2391 (2.14%), Others 1065 (0.95%)
LCA: Bacteroidetes 74488 (79.75%), Firmicutes 15756 (16.87%), Verrucomicrobia 2220 (2.38%), Others 935 (1.00%)
Figure 27: Population distributions (rank Phylum) of Chicken dataset by MTR (top) and LCA (bottom).
MTR: Bacilli 3574 (3.25%), Clostridia 14379 (13.07%), Bacteroidia 87683 (79.72%), Verrucomicrobiae 2370 (2.15%), Others 1980 (1.80%)
LCA: Bacilli 3145 (3.58%), Clostridia 9797 (11.14%), Bacteroidia 71234 (81.02%), Verrucomicrobiae 2193 (2.49%), Others 1548 (1.76%)
Figure 28: Population distributions (rank Class) of Chicken dataset by MTR (top) and LCA (bottom).
MTR: Verrucomicrobiales 2305 (2.13%), Bacteroidales 86466 (79.99%), Clostridiales 13925 (12.88%), Lactobacillales 3230 (2.99%), Others 2174 (2.01%)
LCA: Verrucomicrobiales 2193 (2.52%), Bacteroidales 71234 (81.74%), Clostridiales 9531 (10.94%), Lactobacillales 2694 (3.09%), Others 1494 (1.71%)
Figure 29: Population distributions (rank Order) of Chicken dataset by MTR (top) and LCA (bottom).
MTR: root 1068 (1.13%), Bacteroides 72136 (76.33%), Ruminococcus 2402 (2.54%), Clostridium 6743 (7.13%), Lactobacillus 2883 (3.05%), Faecalibacterium 1058 (1.12%), Alistipes 1260 (1.33%), Akkermansia 2276 (2.41%), Parabacteroides 1223 (1.29%), Others 3458 (3.66%)
LCA: root 562 (0.81%), Bacteroides 56533 (81.18%), Ruminococcus 1172 (1.68%), Clostridium 2055 (2.95%), Lactobacillus 2130 (3.06%), Faecalibacterium 508 (0.73%), Alistipes 1202 (1.73%), Akkermansia 2166 (3.11%), Parabacteroides 1073 (1.54%), Others 2235 (3.21%)
Figure 30: Population distributions (rank Genus) of Chicken dataset by MTR (top) and LCA (bottom).