
Metagenomics and Metatranscriptomics

Jinyu Yang Bioinformatics and Mathematical Biosciences Lab

July 15th, 2016

Background

Conventional sequencing begins with a culture of identical cells as a source of DNA.

Bottleneck

• The vast majority of microbial biodiversity has been missed by cultivation-based methods

• Large groups of microorganisms cannot be cultured and thus cannot be sequenced

For example:

16S ribosomal RNA sequences

1. Relatively short

2. Often conserved within a species

3. Generally different between species

Many 16S rRNA sequences have been found that do not belong to any known cultured species, so there are numerous non-isolated organisms.

Cultivation-based methods find < 1% of the bacterial and archaeal species in a sample.

Next-generation sequencing

The price of DNA sequencing continues to fall

All DNA from all the members of the communities in environmental samples

The term “Metagenomics” first appeared in 1998

Metagenomics:

1. Genomic analysis of microbial DNA that is extracted directly from communities in environmental samples

2. Alleviates the need for isolation and lab cultivation of individual species

3. Focuses on microbial communities

Metagenomics

Taxonomic analysis (“who is out there?”): assign each read to a taxon

Functional analysis (“what are they doing?”): map each read to a functional role/pathway

Comparative analysis (“how do different samples compare?”)

Level 1: sequence composition

Level 2: taxonomic diversity

Level 3: functional complement, etc.
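A comparison at the taxonomic level (Level 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each sample is assumed to be already reduced to per-taxon read counts, the taxon names and counts are invented, and Bray-Curtis dissimilarity is one commonly used measure chosen here for the example, not a method named in the slides.

```python
def relative_abundance(counts):
    """Convert raw per-taxon read counts to fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    0 means identical composition, 1 means no taxa in common.
    """
    taxa = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return diff / total


# Hypothetical read counts per taxon for two samples (illustrative only).
soil = relative_abundance({"Proteobacteria": 60, "Firmicutes": 30, "Actinobacteria": 10})
water = relative_abundance({"Proteobacteria": 20, "Cyanobacteria": 80})

print(round(bray_curtis(soil, water), 3))  # 0.8
```

Normalizing to relative abundance first keeps the comparison from being dominated by differences in sequencing depth between samples.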

Microbial Communities

Metagenomics (DNA-based): entire genetic complements, including phylogenetic and functional genes

Metatranscriptomics (mRNA-based): subset of genes that are transcribed from the active population

Metabolomics: global metabolic profiles produced from community function

…-based: … that are ultimately expressed by regulations

Metatranscriptomics

Metagenomics: functional diversity and metabolic diversity

Which genes are expressed?

Metagenomic studies provide a snapshot of the genetic composition of the community.

Metatranscriptomics is the study of the function and activity of the complete set of transcripts (RNA-seq) from environmental samples.

Metatranscriptomics: profiling of community-wide gene expression (RNA-seq)

• Gene activity diversity

• Gene expression abundance

• Differential gene expression analysis

Gene activity diversity

How many different genes are expressed in a microbial community across all species?

How diverse are the expressed functions and pathways?
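Both questions can be approximated from a table of read counts. The sketch below assumes reads have already been mapped to genes and pathways; the names, counts, and the `min_reads` cutoff are hypothetical, and the Shannon index is used here as one possible diversity measure rather than one prescribed by the slides.

```python
import math


def expressed_gene_count(read_counts, min_reads=1):
    """Number of genes with at least `min_reads` mapped reads."""
    return sum(1 for n in read_counts.values() if n >= min_reads)


def shannon_index(read_counts):
    """Shannon diversity (natural log) of a count profile.

    Higher values mean reads are spread more evenly over more categories.
    """
    total = sum(read_counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in read_counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


# Hypothetical community-wide counts, pooled across all species.
gene_reads = {"nifH": 12, "amoA": 0, "rpoB": 40, "recA": 3}
pathway_reads = {"glycolysis": 50, "nitrogen fixation": 50}

print(expressed_gene_count(gene_reads))        # 3
print(round(shannon_index(pathway_reads), 3))  # 0.693
```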

Gene expression abundance

Which genes are most highly expressed under a specific environmental condition?

What is the most important functionality (pathway) needed in an environment?
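Ranking genes by expression needs counts that are comparable across genes of different lengths. The sketch below uses transcripts per million (TPM), a common normalization that is an assumption here and not taken from the slides; the gene names, counts, and lengths are invented.

```python
def tpm(read_counts, gene_lengths):
    """Transcripts per million: length-normalize, then scale to sum to 1e6."""
    rates = {g: read_counts[g] / gene_lengths[g] for g in read_counts}
    scale = sum(rates.values())
    if scale == 0:
        raise ValueError("no reads mapped to any gene")
    return {g: r / scale * 1e6 for g, r in rates.items()}


def top_genes(expression, k=3):
    """Gene names sorted from highest to lowest expression, top k only."""
    return sorted(expression, key=expression.get, reverse=True)[:k]


# Hypothetical mapped-read counts and gene lengths (in bases).
counts = {"geneA": 100, "geneB": 300, "geneC": 100}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 500}

expression = tpm(counts, lengths)
print(top_genes(expression, k=1))  # ['geneC']
```

Note that geneB has the most reads but is not the most highly expressed once its length is taken into account.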

Differential gene expression analysis

Which genes show the largest change in expression levels between different conditions (biomarker detection)?
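A first look at this question is a per-gene log2 fold change. The sketch below is deliberately simplified: it assumes one normalized expression value per gene per condition, uses a pseudocount of 1 (an arbitrary choice) to avoid dividing by zero, and ignores replicates and statistical testing, which a real differential expression analysis would need. Gene names and values are invented.

```python
import math


def log2_fold_changes(cond_a, cond_b, pseudocount=1.0):
    """log2((b + pseudocount) / (a + pseudocount)) for every gene seen."""
    genes = set(cond_a) | set(cond_b)
    return {
        g: math.log2((cond_b.get(g, 0.0) + pseudocount)
                     / (cond_a.get(g, 0.0) + pseudocount))
        for g in genes
    }


def rank_by_change(fold_changes):
    """Genes ordered by absolute fold change, largest first (ties by name)."""
    return sorted(fold_changes, key=lambda g: (-abs(fold_changes[g]), g))


# Hypothetical normalized expression in two conditions.
control = {"g1": 7, "g2": 100, "g3": 3}
treated = {"g1": 31, "g2": 100, "g3": 0}

lfc = log2_fold_changes(control, treated)
print(rank_by_change(lfc))  # ['g1', 'g3', 'g2']
```

Here g1 is up fourfold and g3 is down fourfold in the treated condition, while g2 is unchanged.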

Raw RNA-seq reads

Preprocessing

• Low-quality trimming

• rRNA identification and removal

• Dereplication
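Two of the preprocessing steps, low-quality trimming and dereplication, can be sketched in plain Python. The thresholds `min_q` and `min_len` and the example reads are hypothetical, and reads are represented as a sequence plus a list of Phred scores rather than parsed from a FASTQ file. rRNA identification is left out because it depends on a reference database of rRNA sequences.

```python
def trim_low_quality(seq, quals, min_q=20):
    """Drop bases from the 3' end until one has quality >= min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def dereplicate(seqs):
    """Collapse exact duplicates into {sequence: copies}, first-seen order."""
    copies = {}
    for s in seqs:
        copies[s] = copies.get(s, 0) + 1
    return copies


def preprocess(reads, min_q=20, min_len=5):
    """Trim each read, discard reads that end up too short, dereplicate."""
    kept = []
    for seq, quals in reads:
        trimmed, _ = trim_low_quality(seq, quals, min_q)
        if len(trimmed) >= min_len:
            kept.append(trimmed)
    return dereplicate(kept)


# Hypothetical reads: (sequence, per-base Phred quality scores).
reads = [
    ("ACGTACGT", [30, 30, 30, 30, 30, 30, 10, 5]),
    ("ACGTAC", [30, 30, 30, 30, 30, 30]),
    ("TTGA", [30, 30, 2, 2]),
]

print(preprocess(reads))  # {'ACGTAC': 2}
```

The first read is trimmed to match the second, so the two collapse into one sequence with two copies; the third read is too short after trimming and is discarded.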

Annotation and Assembly

Analysis

• Gene activity diversity

• Gene expression abundance

• Differential expression

Thanks!