E01-068-01 Explanation standard 16S-ITS microbial profiling analysis V3

Taxonomic classification pipeline

OVERVIEW OF ANALYSIS STEPS INVOLVED IN THE TAXONOMIC CLASSIFICATION PIPELINE

Short paired sequence reads were generated using the Illumina MiSeq system and converted into FASTQ files using the BCL2FASTQ pipeline version 2.18. The Illumina paired reads were merged into single reads (so-called pseudoreads) through sequence overlap, after removal of the forward and reverse primers. Chimeric pseudoreads were removed and the remaining reads were aligned to the RDP 16S rRNA gene databases.

Based on the alignment scores of the pseudoreads, the taxonomic classes were assigned by associating each pseudoread with its best match. The taxonomic depth of the lineage is based on the identity threshold of the rank: species 99%, genus 97%, family 95%, order 90%, class 85%, phylum 80%. A brief overview of the pipeline is given in the figure below.

[Figure: pipeline overview. Boxes shown: target amplification (e.g. 16S rRNA, ITS); Illumina MiSeq sequencing; pseudoreads through sequence overlap; Q-check; chimera removal; taxonomic classification based on DNA databases; interactive visualization in online portal.]
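The identity thresholds described in the overview can be illustrated with a minimal Python sketch. It is not the pipeline's actual code. The function names, the inclusive comparison (`>=`), and the input format are assumptions made for illustration, and the rank names paired with each percentage follow the usual species-to-phylum order and should be treated as an assumption wherever the text does not state them.

```python
# Identity thresholds (percent) per rank, from deepest to shallowest.
# Rank names follow the usual species-to-phylum order (assumed where
# the source text does not label a percentage explicitly).
RANK_THRESHOLDS = [
    ("species", 99.0),
    ("genus", 97.0),
    ("family", 95.0),
    ("order", 90.0),
    ("class", 85.0),
    ("phylum", 80.0),
]

# Lineage order from shallowest to deepest, used for trimming.
LINEAGE_ORDER = ["phylum", "class", "order", "family", "genus", "species"]


def deepest_rank(identity_pct):
    """Return the deepest rank whose threshold is met, or None.

    Assumption: thresholds are inclusive (identity >= threshold).
    """
    for rank, threshold in RANK_THRESHOLDS:
        if identity_pct >= threshold:
            return rank
    return None


def trim_lineage(lineage, identity_pct):
    """Keep only the ranks of `lineage` supported by the identity.

    `lineage` is a dict mapping rank name to taxon name (hypothetical
    input format for this sketch).
    """
    rank = deepest_rank(identity_pct)
    if rank is None:
        return []
    keep = LINEAGE_ORDER[: LINEAGE_ORDER.index(rank) + 1]
    return [(r, lineage[r]) for r in keep if r in lineage]
```

For example, under these assumptions a pseudoread whose best match has 96% identity would be reported down to family level only, even if the matching database entry carries a genus and species name.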

MORE INFORMATION

1 Ribosomal Database Project: data and tools for high throughput rRNA analysis. Cole JR, Wang Q, Fish JA, et al. (Nucl Acids Res. 2014).

PO Box 1336, 2302 BH Leiden, The Netherlands. T +31 (0)71 523 39 17, E [email protected], W baseclear.com

Metagenomics online analysis portal

OVERVIEW PER PROJECT: TREE AND BAR-CHART

In the project overview, a tree is constructed on-the-fly: samples that share a similar taxonomic composition are displayed in close proximity. Bray-Curtis dissimilarity is used as the distance measure, after which neighbour joining is used for clustering. In addition, a bar-chart is generated for each sample showing the community composition at a user-defined taxonomic level (e.g. species, genus, family, order, class, or phylum).

FIGURE INTERPRETATION

The tree indicates the distance between samples based on the taxonomic composition. The individual samples are indicated next to the tree, as well as their taxonomic composition.
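The Bray-Curtis dissimilarity that underlies the tree can be sketched as follows. This is a minimal illustration, not the portal's implementation. The function names and the dictionary input format are hypothetical, and whether the portal computes the measure on raw read counts or on relative abundances is not stated, so raw counts are assumed here. The neighbour-joining step is not shown.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples.

    Each argument maps taxon name -> read count. The result is 0.0 for
    identical compositions and 1.0 when no taxa are shared.
    """
    taxa = set(counts_a) | set(counts_b)
    diff = sum(abs(counts_a.get(t, 0) - counts_b.get(t, 0)) for t in taxa)
    total = sum(counts_a.get(t, 0) + counts_b.get(t, 0) for t in taxa)
    return diff / total if total else 0.0


def distance_matrix(samples):
    """Pairwise dissimilarities for a dict of sample name -> counts.

    Returns the sorted sample names and a square matrix (list of lists)
    that could be passed to a neighbour-joining routine.
    """
    names = sorted(samples)
    matrix = [[bray_curtis(samples[a], samples[b]) for b in names]
              for a in names]
    return names, matrix
```

With samples `{"Bacteroides": 6, "Prevotella": 4}` and `{"Bacteroides": 3, "Prevotella": 7}`, the summed absolute difference is 6 and the total count is 20, giving a dissimilarity of 0.3.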

FEATURES

The user can cluster samples on-the-fly at different taxonomic depths, e.g. family or genus. The ‘sort OTU abundance’ feature provides the option to sort the colors of the bars either by the overall abundance of each taxon or by the abundance of the taxa within each individual sample. The taxonomic composition of an individual sample can be viewed by clicking on the sample name.
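The two sorting modes can be illustrated with a short sketch. The function names are hypothetical, and summing raw read counts across samples to obtain the "overall abundance" is an assumption; the portal may use relative abundances instead.

```python
def overall_order(samples):
    """Taxa ordered by summed abundance over all samples (descending).

    `samples` maps sample name -> {taxon: read count}. Ties are broken
    alphabetically so the order is deterministic.
    """
    totals = {}
    for counts in samples.values():
        for taxon, n in counts.items():
            totals[taxon] = totals.get(taxon, 0) + n
    return sorted(totals, key=lambda t: (-totals[t], t))


def per_sample_order(counts):
    """Taxa of one sample ordered by abundance within that sample."""
    return sorted(counts, key=lambda t: (-counts[t], t))
```

In the first mode every bar uses the same taxon order, which makes samples easy to compare. In the second mode each bar starts with its own most abundant taxon.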

OVERVIEW PER PROJECT: SPECIES RICHNESS AND DIVERSITY ESTIMATES

At the far right, three different alpha diversity/richness metrics (Chao1 richness estimator, Shannon entropy of counts, and Simpson’s index) are displayed for each sample. To compute these estimates, taxonomic assignments are first made using the OTU picking of USEARCH (97% similarity). The metrics are then calculated after subsampling from the entire set, to account for different sampling depths.
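The three metrics and the subsampling step can be sketched in Python as below. This is an illustration under stated assumptions, not the portal's code: the function names are hypothetical, Chao1 is shown in its bias-corrected form, Shannon entropy uses the natural logarithm, and Simpson's index is taken as 1 minus the sum of squared proportions. The portal may use other variants of each (for instance a base-2 logarithm).

```python
import math
import random


def subsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from OTU counts.

    `counts` maps OTU name -> read count. A fixed seed keeps the
    example reproducible.
    """
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    picked = random.Random(seed).sample(pool, depth)
    out = {}
    for otu in picked:
        out[otu] = out.get(otu, 0) + 1
    return out


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    values = [n for n in counts.values() if n > 0]
    f1 = sum(1 for n in values if n == 1)  # singletons
    f2 = sum(1 for n in values if n == 2)  # doubletons
    return len(values) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon entropy of the counts (natural logarithm assumed)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def simpson(counts):
    """Simpson's index as 1 - sum(p_i^2) (one common definition)."""
    total = sum(counts.values())
    return 1 - sum((n / total) ** 2 for n in counts.values())
```

Subsampling every sample to the same depth before calling the metric functions keeps deeply sequenced samples from appearing richer only because more reads were drawn from them.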

FIGURE INTERPRETATION

The blue bars depict the diversity or richness estimates, depending on the chosen metric. The width of each bar is displayed relative to the greatest estimate.

NOTE

Alpha-diversity metrics are calculated using the sum of data from one study and hence cannot be shown when samples were selected from different studies.

OVERVIEW PER SAMPLE: TAXONOMIC COMPOSITION TABLE

The table displays the taxonomic composition of a single sample at a user-defined taxonomic level. The number of reads assigned to each taxonomic group is displayed, as well as its relative abundance compared to the total number of reads. More information is provided under the red information button (i).
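The relation between read counts and relative abundance in such a table can be sketched as follows. The function name, the row layout, and the rounding to two decimals are assumptions made for illustration.

```python
def composition_table(read_counts):
    """Build table rows of (taxon, reads, relative abundance in %).

    `read_counts` maps taxonomic group -> number of assigned reads.
    Rows are sorted by read count (descending), then by name.
    """
    total = sum(read_counts.values())
    rows = []
    for taxon, reads in sorted(read_counts.items(),
                               key=lambda kv: (-kv[1], kv[0])):
        pct = 100.0 * reads / total if total else 0.0
        rows.append((taxon, reads, round(pct, 2)))
    return rows
```

For a sample with 600, 300, and 100 reads in three groups, the relative abundances are 60%, 30%, and 10% of the 1000 reads in total.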

FEATURES

The table features sort and search options on all columns. The table can be downloaded in CSV, Excel, and PDF format, and the table data can also be copied directly to the clipboard using COPY (CTRL+C).

OVERVIEW PER SAMPLE: MULTI-LEVEL KRONA PIE-CHART

The interactive pie-chart is constructed for each sample and allows fast and easy interpretation using its powerful selection and zoom features. The inner rings refer to higher taxonomic levels (e.g. phylum); the outer rings refer to lower taxonomic levels (e.g. species).

FEATURES

By pressing the left mouse button on a taxonomic group, a more detailed pie-chart is generated which contains the underlying taxonomic levels. Other features include text-based searching, font size selection and, importantly, the option to generate publication-quality images, which can easily be downloaded using the “Snapshot” feature.

MORE INFORMATION

Interactive metagenomic visualization in a web browser. Ondov BD, Bergman NH, and Phillippy AM (BMC Bioinformatics. 2011).
