Metagenomics and Metatranscriptomics
Jinyu Yang
Bioinformatics and Mathematical Biosciences Lab
July 15, 2016

Background
Conventional sequencing begins with a culture of identical cells as a source of DNA.

Bottleneck
The vast majority of microbial biodiversity has been missed by cultivation-based methods.
Large groups of microorganisms cannot be cultured and thus cannot be sequenced.
For example, 16S ribosomal RNA (rRNA) sequences are:
1. Relatively short
2. Often conserved within a species
3. Generally different between species
Many 16S rRNA sequences have been found that do not belong to any known cultured species, which indicates that there are numerous non-isolated organisms.
Cultivation-based methods find fewer than 1% of the bacterial and archaeal species in a sample.

Next-generation sequencing
The price of DNA sequencing continues to fall.
This makes it feasible to sequence all genes from all members of the communities in environmental samples.

Metagenomics
The term "metagenomics" first appeared in 1998.
Metagenomics:
1. Genomic analysis of microbial DNA extracted directly from communities in environmental samples
2. Alleviates the need for isolation and lab cultivation of individual species
3. Focuses on microbial communities
Taxonomic analysis ("who is out there?"): assign each read to a taxonomy
Functional analysis ("what are they doing?"): map each read to a functional role/pathway
Comparative analysis ("how do different samples compare?"):
Level 1: sequence composition
Level 2: taxonomic diversity
Level 3: functional complement, etc.
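As an illustration of Level 2 (taxonomic diversity), the sketch below compares two samples once each read has been assigned to a taxon. The slides do not name a specific metric, so the Shannon index and Bray-Curtis dissimilarity are assumptions chosen here as common examples, and the read assignments are made-up toy data.

```python
import math
from collections import Counter

def shannon_index(taxon_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(taxon_counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for count in taxon_counts.values():
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h

def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity: 0 = identical composition, 1 = no shared taxa."""
    taxa = set(counts_a) | set(counts_b)
    shared = sum(min(counts_a.get(t, 0), counts_b.get(t, 0)) for t in taxa)
    total = sum(counts_a.values()) + sum(counts_b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0

# Hypothetical per-read taxonomic assignments (toy data, one entry per read)
sample_a = Counter(["Bacteroides", "Bacteroides", "Prevotella", "Escherichia"])
sample_b = Counter(["Bacteroides", "Prevotella", "Prevotella", "Prevotella"])

diversity_a = shannon_index(sample_a)
diversity_b = shannon_index(sample_b)
dissimilarity = bray_curtis(sample_a, sample_b)
```

In practice the read-to-taxon assignment would come from a dedicated classifier; this sketch only covers the comparison step that follows.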

Microbial Communities
Metagenomics (DNA-based): entire genetic complements, including phylogenetic and functional genes
Metatranscriptomics (mRNA-based): subset of genes that are transcribed from the active population
Metaproteomics (protein-based): proteins that are ultimately expressed by gene regulation
Metabolomics: global metabolic profiles produced from community function

Metatranscriptomics
Metagenomics: functional diversity, metabolic diversity
Which genes are expressed?
Metagenomic studies provide a snapshot of the genetic composition of the community.
Metatranscriptomics is the study of the function and activity of the complete set of transcripts (RNA-seq) from environmental samples.
Metatranscriptomics: profiling of community-wide gene expression (RNA-seq)
Gene activity diversity
Gene expression abundance
Differential gene expression analysis
Gene activity diversity
How many different genes are expressed in a microbial community across all species?
How diverse are the functions and pathways?
Gene expression abundance
Which genes are most highly expressed under a specific environmental condition?
What is the most important functionality (pathway) needed in an environment?
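To make "gene expression abundance" concrete, here is a minimal sketch that normalizes read counts by gene length and ranks genes by abundance. The slides do not specify a normalization, so the use of TPM (transcripts per million) is an assumption, and the gene names, counts, and lengths are hypothetical.

```python
def tpm(read_counts, gene_lengths_bp):
    """Transcripts per million: length-normalize counts, then scale to sum to 1e6."""
    rate = {g: read_counts[g] / gene_lengths_bp[g] for g in read_counts}
    total = sum(rate.values())
    if total == 0:
        return {g: 0.0 for g in rate}
    return {g: r / total * 1e6 for g, r in rate.items()}

def top_genes(abundance, n=3):
    """Return the n most abundant genes as (gene, abundance) pairs, highest first."""
    return sorted(abundance.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical mapped-read counts and gene lengths for one sample
counts = {"geneA": 100, "geneB": 300, "geneC": 50}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 250}

abundance = tpm(counts, lengths)
highest = top_genes(abundance, n=1)
```

Here geneC ranks first despite having the fewest reads, because it is much shorter than the other two genes; this is why raw counts alone are a poor proxy for expression level.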
Differential gene expression analysis
Which genes show the largest change in expression levels between different conditions (biomarker detection)?
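The question above can be sketched as a log2 fold-change ranking between two conditions. This is a simplified illustration under stated assumptions: the abundances are hypothetical, the pseudocount of 1 is an arbitrary choice to avoid division by zero, and there is no statistical testing. A real analysis would use replicates and a dedicated differential expression method.

```python
import math

def log2_fold_changes(abund_cond1, abund_cond2, pseudocount=1.0):
    """log2(cond2 / cond1) per gene, with a pseudocount so zero values are defined."""
    genes = set(abund_cond1) | set(abund_cond2)
    return {
        g: math.log2(
            (abund_cond2.get(g, 0.0) + pseudocount)
            / (abund_cond1.get(g, 0.0) + pseudocount)
        )
        for g in genes
    }

def rank_by_change(lfc):
    """Sort genes by absolute log2 fold change, largest change first."""
    return sorted(lfc.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical normalized abundances in two conditions
control = {"geneA": 7.0, "geneB": 15.0, "geneC": 3.0}
treated = {"geneA": 31.0, "geneB": 15.0, "geneC": 1.0}

lfc = log2_fold_changes(control, treated)
ranking = rank_by_change(lfc)
```

In this toy data geneA changes the most (a four-fold increase after the pseudocount), which would make it the first biomarker candidate to inspect.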

1. Raw RNA-seq reads
2. Preprocessing: low-quality trimming, rRNA identification and removal, dereplication
3. Annotation and assembly
4. Analysis: gene activity diversity, gene expression abundance, differential expression

Thanks!