Getting More Out of the Human Microbiome Data: Untangling the 16S rRNA Gene-Based Phylogeny and Taxonomy of Clostridiales

Christel Chehoud
Mentors: Dirk Gevers and Brian Haas

Human Microbiome Project (HMP)

• Exploration of Microbial Communities
• Human development
• Physiology
• Immunity
• Disease
• Nutrition

http://nihroadmap.nih.gov/hmp/

16S Ribosomal RNA

• Large RNA component of the small subunit of the ribosome
• 1300-1600 bp
• Conserved Regions
• Variable Regions
• Markers for classifying

Building Phylogenetic Trees to Infer Evolutionary Relationships

V. Dendrosporobacter      … A C T C C T G T A T G T C C A G …
V. Succinispira           … A C A A A T G T A T A G A C T C …
V. Acidaminococcus        … A C G T G T G T A T C G C C C G …
V. Succiniclasticum       … A C G A T T G T A T C A T C T T …
V. Phascolarctobacterium  … A C C G C T G T A T T G G C G A …

QUANTITATIVELY INFER RELATIONSHIPS

(Figure: phylogenetic tree built from the aligned sequences; leaf labels include Acidaminococcus, Dendrosporobacter, Succiniclasticum and Phascolarctobacterium.)
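
To make "quantitatively infer relationships" concrete, here is a minimal sketch that computes pairwise distances between the five aligned fragments shown on the slide. It assumes a simple p-distance (the fraction of differing positions); the slides do not say which distance measure or tree-building method was actually used, and the function name `p_distance` is only illustrative.

```python
from itertools import combinations

# Aligned fragments copied from the slide, with the spaces removed.
fragments = {
    "Dendrosporobacter":     "ACTCCTGTATGTCCAG",
    "Succinispira":          "ACAAATGTATAGACTC",
    "Acidaminococcus":       "ACGTGTGTATCGCCCG",
    "Succiniclasticum":      "ACGATTGTATCATCTT",
    "Phascolarctobacterium": "ACCGCTGTATTGGCGA",
}


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# One distance per unordered pair of genera.
distances = {
    (g1, g2): p_distance(fragments[g1], fragments[g2])
    for g1, g2 in combinations(fragments, 2)
}

# Print the pairs from closest to most distant.
for (g1, g2), d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{g1:22s} {g2:22s} {d:.3f}")
```

A distance table like this is the usual input to a distance-based tree builder; with only 16 positions per fragment the values are illustrative rather than meaningful.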

Taxonomy vs. Phylogeny

(Figure, shown on two slides: tree whose leaves are labeled Dendrosporobacter, Succinispira, Dendrosporobacter, Dendrosporobacter, Acidaminococcus, Succinispira and Phascolarctobacterium. The second slide adds the annotation "Classifier gets confused".)

Conflicts in Organism Classification: Taxonomy vs. Phylogeny

Morphology, Physiology, Function
Evolutionary Relatedness

Sequencing the Human Microbiome: Sequences that Cannot Be Confidently Classified at the Genus Level

Order             Total Number of Sequences for Order   Percent of the Whole Order
Lactobacillales   72,498                                1%
Clostridiales     28,067                                26%
Actinomycetales   22,188                                4%
Pasteurellales    19,366                                10%
Bacteroidales     18,653                                13%
Bacillales        14,535                                1%

Goal

Systematically resolve conflicts between phylogenetic groupings and existing taxonomic nomenclature in order to improve classification accuracy

(Figure, built up over five slides: the tree leaves start as Dendrosporobacter, Succinispira, Dendrosporobacter, Dendrosporobacter, Acidaminococcus, Succinispira and Phascolarctobacterium; on the final slides the second leaf is relabeled from Succinispira to Dendrosporobacter.)

Compiling a Reference Set

• Trusted Taxonomies
• Living Tree Project
• Ribosomal Database Project Gold Set
• Human Oral Microbiome Database
• GreenGenes Hugenholtz Classification

Building a Tree

Original Classification

GGH Bacteria BSV43
GGH Bacteria Firmicutes
LTP Succinispira
LTP Veillonellaceae Acidaminococcus
GGH Bacteria Firmicutes 014B-C8
LTP Veillonellaceae Acidaminococcus
RDP Veillonellaceae Acidaminococcus
RDP Veillonellaceae Acidaminococcus
LTP Veillonellaceae Phascolarctobacterium
GGH Bacteria Firmicutes p-2464-18B5
GGH Unclassified
GGH Bacteria Firmicutes Unclassified
LTP Veillonellaceae Succiniclasticum
GGH Firmicutes Succiniclasticum
RDP Veillonellaceae Succiniclasticum

Adding the Classifier’s Predicted Classification

Predicted Classification                Score
RDP Veillonellaceae Succinispira        .21
RDP Veillonellaceae Succiniclasticum    .48
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     1.0
RDP Veillonellaceae Acidaminococcus     .87
RDP Veillonellaceae Succiniclasticum    .35
RDP Veillonellaceae Succinispira        .57
RDP Veillonellaceae Succiniclasticum    .56
RDP Veillonellaceae Succiniclasticum    1.0
RDP Veillonellaceae Succiniclasticum    1.0
RDP Veillonellaceae Succiniclasticum    1.0

Flagging Anomalous Sequences
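
The slides do not state the rule used to flag a sequence as anomalous. As one plausible reading, the sketch below flags any sequence whose classifier score falls under a cutoff. The cutoff `MIN_SCORE = 0.8` and the helper `flag_anomalies` are assumptions made for illustration (0.8 is a cutoff often used with the RDP Classifier), not the presenter's actual criterion.

```python
# Predicted genus and classifier score for the 15 sequences on the slide.
predictions = [
    ("Succinispira", 0.21),
    ("Succiniclasticum", 0.48),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 1.0),
    ("Acidaminococcus", 0.87),
    ("Succiniclasticum", 0.35),
    ("Succinispira", 0.57),
    ("Succiniclasticum", 0.56),
    ("Succiniclasticum", 1.0),
    ("Succiniclasticum", 1.0),
    ("Succiniclasticum", 1.0),
]

MIN_SCORE = 0.8  # assumed cutoff; the slide does not give one


def flag_anomalies(rows, min_score=MIN_SCORE):
    """Return (row index, genus, score) for rows scoring below min_score."""
    return [(i, genus, score)
            for i, (genus, score) in enumerate(rows)
            if score < min_score]


for i, genus, score in flag_anomalies(predictions):
    print(f"row {i}: {genus} scored {score:.2f}")
```

Under this assumed cutoff, five of the fifteen rows are flagged; a stricter or looser cutoff changes that count.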

(The predicted classifications and scores are shown again with an added "Anomaly" column.)

Grouping based on Evolutionary Distance

(Figure: tree partitioned at a 9% evolutionary distance into groups numbered 1 to 8, followed by a second slide comparing groupings at 12%, 9% and 6%.)

Incorporating 6%, 9%, 12% and 15% Groupings

(Table: the predicted classifications and scores with added columns headed "Groups (%)": 6, 9, 12, 15.)
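
Grouping at several distance levels can be sketched as follows. This is only an illustration: it assumes single-linkage grouping on a pairwise distance table, whereas the slides do not say how the 6%, 9%, 12% and 15% groups were derived (they may well come from branch lengths in the tree). The sequence names `s1` to `s5` and all distances are made up.

```python
def group_by_distance(names, dist, threshold):
    """Single-linkage groups: merge any two sequences within `threshold`."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), d in dist.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical sequences and pairwise distances, for illustration only.
names = ["s1", "s2", "s3", "s4", "s5"]
dist = {
    ("s1", "s2"): 0.04, ("s1", "s3"): 0.08, ("s2", "s3"): 0.07,
    ("s1", "s4"): 0.14, ("s2", "s4"): 0.13, ("s3", "s4"): 0.11,
    ("s4", "s5"): 0.05, ("s1", "s5"): 0.16, ("s2", "s5"): 0.15,
    ("s3", "s5"): 0.13,
}

for threshold in (0.06, 0.09, 0.12, 0.15):
    print(f"{threshold:.0%}: {group_by_distance(names, dist, threshold)}")
```

Looser thresholds merge more sequences into fewer groups, which is why recording a sequence's group at every level gives a coarse-to-fine view of where it sits.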

Monophyletic Analyzer
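
The slides do not describe how the analyzer works internally. A minimal sketch of a monophyly check, assuming a rooted tree written as nested tuples with genus names as leaves, might look like this. The tree topology below is hypothetical; only the leaf labels come from the "Taxonomy vs. Phylogeny" figure.

```python
def leaves(node):
    """Yield the leaf labels of a nested-tuple tree; a leaf is a string."""
    if isinstance(node, str):
        yield node
    else:
        for child in node:
            yield from leaves(child)


def is_monophyletic(tree, genus):
    """True if some clade holds every leaf of `genus` and nothing else."""
    total = sum(1 for leaf in leaves(tree) if leaf == genus)
    if total == 0:
        return False  # the genus is absent from the tree

    def search(node):
        labels = list(leaves(node))
        if len(labels) == total and all(label == genus for label in labels):
            return True
        if isinstance(node, str):
            return False
        return any(search(child) for child in node)

    return search(tree)


# Hypothetical rooted topology using the genus labels from the slides.
tree = (
    (
        (("Dendrosporobacter", "Succinispira"),
         ("Dendrosporobacter", "Dendrosporobacter")),
        "Acidaminococcus",
    ),
    ("Succinispira", "Phascolarctobacterium"),
)

for genus in sorted(set(leaves(tree))):
    status = "monophyletic" if is_monophyletic(tree, genus) else "problematic"
    print(f"{genus}: {status}")
```

In this toy tree Dendrosporobacter and Succinispira are reported as problematic because their leaves are split across clades, which is the kind of conflict the talk sets out to resolve.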

(Figure: tree in which one grouping is labeled "Problematic".)

Eliminating the Anomalous Sequences

(The predicted classifications and scores are shown again, with two sequences marked X; in the extracted text the marks fall between the rows scored .87, .35 and .57.)

Editing One Taxon at a Time

(Figure, built up over four slides: group labels are edited one taxon at a time. The labels on the final slide, in the order extracted, are BrGenV, BrGenIV, Succinispira, Acidaminococcus, BrGenIII, four X marks, SucciniclasticumI and SucciniclasticumII.)

Inconsistencies in Genera

Only 30% of 110 genera perfectly group together

Inconsistencies in Genera

21 multi-family genera

Post-Classification Refinement

Inconsistencies in Genera

Percentage of Unclassifiable Sequences Decreases to 8%
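
As a quick consistency check, the drop to 8% can be related to the roughly 5,000 additional classified sequences reported on this slide. The calculation assumes that the chart refers to Clostridiales and that its "Before" value is the 26% listed for that order in the earlier table; neither is stated outright on the slide.

```python
total_clostridiales = 28_067   # sequences in the order, from the earlier table
before, after = 0.26, 0.08     # unclassifiable fraction before and after (assumed)

unclassified_before = total_clostridiales * before
unclassified_after = total_clostridiales * after
newly_classified = unclassified_before - unclassified_after

print(f"unclassifiable before: {unclassified_before:,.0f}")
print(f"unclassifiable after:  {unclassified_after:,.0f}")
print(f"newly classified:      {newly_classified:,.0f}")
```

Under these assumptions the difference is about 5,050 sequences, in line with the rounded figure of 5,000 on the slide.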

(Bar chart: percentage of unclassifiable sequences, y-axis from 0 to 30, bars for Before and After. Annotation: classifies an additional 5,000 sequences.)

The Remaining 8% of Sequences

(Same bar chart, annotated "Novel Taxa?")

Future Direction

• Continue revising the family and genus names
• Expand the analysis for the entire tree
• Incorporate the species level in the analysis
• Attempt to sort through the remaining 8%

Acknowledgements

• Human Microbiome Project Group
• Dirk Gevers
• Brian Haas
• Ashlee Earl
• Bil Lewis
• Dawn Cuilla
• Georgia Giannoukos
• Michael Feldgarden
• Doyle Ward
• Bruce Birren

• Administration
• Lucia Vielma
• Eboney Smith