Searching More Genomic Sequence with Less Memory for Fast and Accurate Metagenomic Profiling
Shea N. Gardner, Sasha K. Ames, Maya B. Gokhale, Tom R. Slezak, Jonathan E. Allen

Supplementary Material

Figure S1: Comparison of each LMAT-ML database we tested against the species calls made by the LMAT-Grand database, varying the minimum number of reads called by an ML database to generate a ROC-like curve. The LMAT-ML databases performed so similarly that we selected the pruning level that gave acceptable performance. Pruning at the minimum level, as in the LMAT-ML-Min database, is akin to using a lowest common ancestor algorithm; it performed slightly worse than pruning at 10 or no pruning.

Table S1: Read counts summed across 131 HMP samples for reads classified by LMAT-Grand as Eukaryota.

Superkingdom/Kingdom/Phylum   Sum of Number Reads
Eukaryota                     21870093
Acanthamoeba                  73
  Acanthamoeba castellanii    73
Albuginales                   45
  Albuginaceae                45
Alveolata                     912
  (blank)                     912
Apicomplexa                   131762
  Aconoidasida                7901
  Coccidia                    123805
  Conoidasida                 16
  (blank)                     40
Apusomonadidae                90
  Thecamonas                  90
Bacillariophyta               16
  Coscinodiscophyceae         16
Bangiophyceae                 12
  Bangiales                   12
Blastocystis                  2812
  Blastocystis hominis        2812
Choanoflagellida              542
  Codonosigidae               30
  Salpingoecidae              512
Chromerida                    992
  Chromera                    992
Dictyosteliida                2140
  Dictyostelium               1569
  Polysphondylium             362
  (blank)                     209
Dinophyceae                   4632
  Suessiales                  4632
Entamoeba                     12172
  Entamoeba dispar            123
  Entamoeba histolytica       479
  Entamoeba invadens          33
  Entamoeba nuttalli          11250
  (blank)                     287
Euglenida                     19
  Eutreptiales                19
Eustigmatophyceae             143
  Eustigmatales               143
Fonticula                     60
  Fonticula alba              60
Foraminifera                  614
  Reticulomyxidae             614
Fungi                         5396742
  Blastocladiomycota          104
  Chytridiomycota             175
  Cryptomycota                47
  Dikarya                     5372409
  fungal sp. EF0021           15
  Glomeromycota               3095
  Microsporidia               836
  Mortierellomycotina         109
  Mucoromycotina              1076
  Neocallimastigomycota       14265
  (blank)                     4611
Heterolobosea                 12
  Schizopyrenida              12
Hyphochytriomycetes           316
  Hyphochytriaceae            316
Ichthyosporea                 59
  Capsaspora                  59
Intramacronucleata            97992
  Oligohymenophorea           97979
  Spirotrichea                13
Isochrysidales                926
  Noelaerhabdaceae            926
Kinetoplastida                13303
  Trypanosomatidae            13303
Metazoa                       11729755
  Arthropoda                  251
  Bilateria                   232
  Chordata                    11729245
  Protostomia                 27
Mycetozoa                     12
  (blank)                     12
Myxogastromycetidae           5369
  Physariida                  5369
Oomycetes                     141
  (blank)                     141
Opisthokonta                  933
  (blank)                     933
Pelagophyceae                 651
  Aureococcus                 651
Perkinsida                    25
  Perkinsidae                 25
Peronosporales                6857
  Peronosporaceae             5945
  Phytophthora                912
Pythiales                     326
  Pythiaceae                  326
Raphidophyceae                49
  Chattonellales              49
Saprolegniales                112
  Saprolegniaceae             112
Stramenopiles                 96
  (blank)                     96
Trichomonadida                414
  Trichomonadidae             414
Viridiplantae                 25527
  Chlorophyta                 42
  Streptophyta                25467
  (blank)                     18
Xanthophyceae                 178
  Tribonematales              178
(blank)                       4433262
  (blank)                     4433262
(blank) (blank) (blank)
Grand Total                   21870093

Supplementary Table S2: BLAST analysis of unsupported calls by MiniKraken

                                          Number         Fraction       Fraction reads
                                          samples with   BLAST matches  with matches
                                          unsupported    to called      to synthetic
Species                                   call           species        constructs      LMAT-Grand top call
Achromobacter_xylosoxidans                111            0.001          0.995           species,synthetic construct
Propionibacterium_acnes                   70             0.377          0.004           genus,Propionibacterium
Prevotella_dentalis                       64             0.188          0.071           genus,Prevotella
Bacteroides_vulgatus                      57             0.269          0.021           genus,Bacteroides
Streptococcus_dysgalactiae                55             0.055          0.002           genus,Streptococcus
Corynebacterium_variabile                 52             0.004          0.003           order,Actinomycetales
Prevotella_ruminicola                     51             0.019          0.101           genus,Prevotella
Bacteroides_salanitronis                  50             0.016          0.047           order,Bacteroidales
Propionibacterium_avidum                  49             0.053          0.050           genus,Propionibacterium
Actinobacillus_suis                       47             0.011          0.018           genus,Haemophilus

Supplementary Table S3: BLAST analysis of unsupported calls by Clinical Pathoscope

                                          Number         Fraction       Fraction reads
                                          samples with   BLAST matches  with matches
                                          unsupported    to called      to synthetic
Species                                   call           species        constructs      LMAT top call
Bacteroides_salanitronis                  57             0.004          0.191           phylum,Bacteroidetes
Streptococcus_dysgalactiae                56             0.019          0.043           phylum,Firmicutes
Propionibacterium_acnes                   55             0.284          0.079           genus,Propionibacterium
Bacteroides_vulgatus                      50             0.175          0.148           genus,Bacteroides
Corynebacterium_variabile                 50             0.007          0.076           order,Actinomycetales
Xylanimonas_cellulosilytica               49             0.000          0.168           genus,Actinomyces
Streptococcus_equi                        47             0.040          0.047           phylum,Firmicutes
Actinobacillus_suis                       46             0.006          0.181           genus,Haemophilus
Mannheimia_succiniciproducens             46             0.008          0.193           family,Pasteurellaceae
Isoptericola_variabilis                   46             0.000          0.130           genus,Actinomyces

Supplementary Table S4: BLAST analysis of unsupported calls by GOTTCHA from 131 HMP samples.

                                          Number         Fraction       Fraction reads
                                          samples with   BLAST matches  with matches
                                          unsupported    to called      to synthetic
Species                                   call           species        constructs      LMAT top call
Propionibacterium_acnes                   60             0.348          0.004           genus,Propionibacterium
Streptococcus_phage_PH10                  46             0.115          0.001           genus,Streptococcus
Enterobacteria_phage_phiX174_sensu_lato   29             0.974          0.233           root
Streptococcus_phage_SM1                   29             0.249          0.001           genus,Streptococcus
Streptococcus_phage_EJ-1                  29             0.232          0.000           genus,Streptococcus
Enterobacteria_phage_lambda               28             0.343          1.000           synthetic construct
Streptococcus_phage_MM1                   27             0.218          0.001           root
Haemophilus_phage_HP2                     26             0.137          0.005           root
Capnocytophaga_ochracea                   26             0.232          0.003           genus,Capnocytophaga
Streptococcus_thermophilus                25             0.107          0.001           genus,Streptococcus

Supplementary Table S5: BLAST analysis of unsupported calls by Metaphlan2

                                          Number         Fraction       Fraction reads
                                          samples with   BLAST matches  with matches
                                          unsupported    to called      to synthetic
Species                                   call           species        constructs      LMAT top call
Dasheen_mosaic_virus                      55             0.000          0.008           Homo sapiens
Streptococcus_sp._GMD4S                   54             0.108          0.000           genus,Streptococcus
Catonella_morbi                           31             0.494          0.000           phylum,Firmicutes
Abiotrophia_defectiva                     30             0.498          0.000           phylum,Firmicutes
Vicia_cryptic_virus                       10             0.000          0.000           Homo sapiens
Propionibacterium_phage_P101A             9              0.088          0.000           no rank,unclassified Siphoviridae
Propionibacterium_acnes                   7              0.486          0.000           genus,Propionibacterium
Staphylococcus_capitis                    6              0.686          0.004           genus,Staphylococcus
Haemophilus_aegyptius                     6              0.074          0.006           genus,Haemophilus
Atopobium_parvulum                        6              0.049          0.000           genus,Atopobium
Propionibacterium_phage_P100D             5              0.553          0.000           no rank,unclassified Siphoviridae
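
The Figure S1 caption describes building a ROC-like curve by varying the minimum number of reads required before a species is called, with the LMAT-Grand calls as the reference. The Python sketch below illustrates that kind of threshold sweep under stated assumptions: the supplement does not give the exact axes, so recall and false-positive count are assumed here, and the function name `roc_like_points`, the species labels, and the read counts are all hypothetical, not values or code from the paper.

```python
from typing import Dict, List, Set, Tuple


def roc_like_points(
    read_counts: Dict[str, int],
    reference_species: Set[str],
    thresholds: List[int],
) -> List[Tuple[int, float, int]]:
    """Return (threshold, recall, false_positive_count) for each threshold.

    A species counts as "called" when its read count is at least the
    threshold. Calls are then compared with the reference species set.
    """
    points = []
    for t in thresholds:
        called = {sp for sp, n in read_counts.items() if n >= t}
        true_pos = len(called & reference_species)
        false_pos = len(called - reference_species)
        recall = true_pos / len(reference_species) if reference_species else 0.0
        points.append((t, recall, false_pos))
    return points


# Made-up counts for illustration only (assumption, not data from the paper).
ml_counts = {"species_A": 500, "species_B": 40, "species_C": 3, "species_D": 1}
# Hypothetical reference calls standing in for the LMAT-Grand species list.
grand_calls = {"species_A", "species_B", "species_E"}

points = roc_like_points(ml_counts, grand_calls, [1, 10, 100])
for t, recall, fp in points:
    print(f"min_reads={t:>3}  recall={recall:.2f}  false_positives={fp}")
```

Raising the threshold removes low-count calls first, which is why the sweep trades false positives against missed reference species; plotting the resulting points per database would give one curve per pruning level.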