Supporting Information

Supporting Information

Supporting Information Lozupone et al. 10.1073/pnas.0807339105 SI Methods nococcus, and Eubacterium grouped with members of other Determining the Environmental Distribution of Sequenced Genomes. named genera with high bootstrap support (Fig. 1A). One To obtain information on the lifestyle of the isolate and its reported member of the Bacteroidetes (Bacteroides capillosus) source, we looked at descriptive information from NCBI grouped firmly within the Firmicutes. This taxonomic error was (www.ncbi.nlm.nih.gov/genomes/lproks.cgi) and other related not surprising because gut isolates have often been classified as publications. We also determined which 16S rRNA-based envi- Bacteroides based on an obligate anaerobe, Gram-negative, ronmental surveys of microbial assemblages deposited near- nonsporulating phenotype alone (6, 7). A more recent 16S identical sequences in GenBank. We first downloaded the gbenv rRNA-based analysis of the genus Clostridium defined phylo- files from the NCBI ftp site on December 31, 2007, and used genetically related clusters (4, 5), and these designations were them to create a BLAST database. These files contain GenBank supported in our phylogenetic analysis of the Clostridium species in the HGMI pipeline. We thus designated these Clostridium records for the ENV database, a component of the nonredun- species, along with the species from other named genera that dant nucleotide database (nt) where 16S rRNA environmental cluster with them in bootstrap supported nodes, as being within survey data are deposited. GenBank records for hits with Ͼ98% these clusters. sequence identity over 400 bp to the 16S rRNA sequence of each of the 67 genomes were parsed to get a list of study titles Annotation of GTs and GHs. NCBI accessions for GTs and GHs associated with the hits. These study titles were used to evaluate were retrieved from the CAZy website at www.cazy.org (8). GT the types of environments where close relatives of each of the 67 and GH annotations for the 21 human gut-derived genomes from isolates had been found (Table S1 and Table S2). the HGMI were obtained with an automated procedure followed by manual curation. In the automated procedure, the protein Generation and Evaluation of the 16S rRNA Consensus Tree. 16S models are compared to a library of Ϸ80,000 previously anno- rRNA sequences were aligned with NAST (1) and hypervariable tated GH and GT functional module sequences using a combi- positions filtered out with lanemaskPH (2). This alignment was nation of BLAST (E Ͻ1e-6) (9) and HMMER (E Ͻ1e-6) with used to make 1,000 bootstrapped NJ trees using ClustalW and HMM models (10) of families derived from the same library. python code. The consensus tree was generated using the Distantly related sequences (those with borderline E-values and majority rule consensus method as implemented in PyCogent (3) poor alignment coverage) were manually inspected to ascertain and rooted between the Bacteria and the Archaea. the preservation of key functional elements and active site The 16S rRNA tree allowed us to evaluate the taxonomy of 21 residues, when known. This procedure provided consistent re- newly sequenced genomes from the HGMI project. In accor- sults with annotations at the CAZy website. dance with previous studies that showed genera within the Note that the two gut Archaeal genomes were excluded from Firmicute class Clostridia to be extremely heterogeneous (4, 5), the GH-based genome clustering analysis because neither had we found that some members of the genera Clostridium, Rumi- any GH genes. 1. DeSantis TZ, et al. (2006) NAST: A multiple sequence alignment server for comparative 7. Jensen NS, Canale-Parola E (1985) Nutritionally limited pectinolytic bacteria from the analysis of 16S rRNA genes. Nucleic Acids Res 34:394–399. human intestine. Appl Environ Microbiol 50:172–173. 2. Hugenholtz P (2002) Exploring prokaryotic diversity in the genomic era. Genome Biol 8. Coutinho PM, Henrissat B (1999) Carbohydrate-active enzymes: An integrated data- 3:1–8. base approach. Recent Advances in Carbohydrate Bioengineering, eds Gilbert HJ, 3. Knight R, et al. (2007) PyCogent: A toolkit for making sense from sequence. Genome Davies G, Henrissat B, Svensson B (The Royal Society of Chemistry, Cambridge), pp 3–12. Biol 8:R171. 9. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search 4. Collins MD, et al. (1994) The phylogeny of the genus Clostridium: Proposal of five new tool. J Mol Biol 215:403–410. genera and eleven new species combinations. Int J Syst Bacteriol 44:812–826. 10. Eddy SR (1996) Hidden Markov models. Curr Opin Struct Biol 6:361–365. 5. Wiegel J, Tanner R, Rainey FA (2006) An introduction to the family Clostridiaceae. Prokaryotes, (Springer, New York), Vol 4, pp 654–678. 6. Farrow JAE, Lawson PA, Hippe H, Gauglitz U, Collins MD (1995) Phylogenetic evidence that the gram-negative nonsporulating bacterium Tissierella (Bacteroides) preaeacuta is a member of the Clostridium subphylum of the gram-positive bacteria and descrip- tion of Tissierella creatinini sp nov. Int J Syst Bacteriol 45:436–440. Lozupone et al. www.pnas.org/cgi/content/short/0807339105 1of6 Table S1. Sequenced genomes from the Human Gut Microbiome Initiative Environmental Distribution (Ͼ98% ID hits to ncbi ENV Full Name Lineage database) Description/Original Isolation RefSeq Bacteria: Bacteroidetes; Bacteroidetes (class); Bacteroidales Bacteroides caccae Bacteroidaceae Human gut Normal human gut microbiota NZ࿝AAVM00000000 ATCC 43185 and purported pathogenic factor in inflammatory bowel disease / Human feces Bacteroides ovatus Bacteroidaceae Human gut Normal human gut microbiota NZ࿝AAXF00000000 Bacteroides stercoris Bacteroidaceae Human gut Normal human gut microbiota / NZ࿝ABFZ00000000 Human feces Bacteroides uniformis Bacteroidaceae None Normal human gut microbiota NZ࿝AAYH00000000 Parabacteroides Porphyromonadaceae Human gut Normal human gut microbiota / NZ࿝AAXE00000000 merdae Human feces Bacteria; Actinobacteria; Coriobacteridae Collinsella aerofaciens Coriobacteriales; Human gut Normal human gut microbiota / NZ࿝AAVN00000000 Coriobacteriaceae Human feces Bacteria; Firmicutes; Mollicutes Eubacterium dolichum Human gut Normal human gut microbiota / NZ࿝ABAW00000000 DSM 3991 Human feces Bacteria; Firmicutes; Clostridia; Clostridiales Clostridium bolteae Cluster XIVa Human gut Normal human gut microbiota / NZ࿝ABCC00000000 ATCC BAA-613 Human feces Ruminococcus obeum Cluster XIVa Human and pig gut Normal human gut microbiota / NZ࿝AAVO00000000 ATCC 29174 Human feces Ruminococcus gnavus Cluster XIVa Human gut Normal human gut microbiota / NZ࿝AAYG00000000 ATCC 29149 Human feces Clostridium scindens Cluster XIVa Human and mouse gut Normal human gut microbiota / NZ࿝ABFY00000000 ATCC 35704 Human feces Dorea longicatena Cluster XIVa Human gut Normal human gut microbiota / NZ࿝AAXB00000000 DSM 13814 Human feces Ruminococcus torques Cluster XIVa Human gut Normal human gut microbiota / NZ࿝AAVP00000000 ATCC 27756 Human feces Eubacterium Cluster XIVa Human gut Normal human gut microbiota / NZ࿝AAVL00000000 ventriosum ATCC Human feces 27560 Coprococcus eutactus Cluster XIVa Human gut Normal human gut microbiota / NZ࿝ABEY00000000 ATCC 27759 Human feces Clostridium sp. L2–50 Cluster XIVa Human gut Normal human gut microbiota / NZ࿝AAYW00000000 Human feces Peptostreptococcus Peptostreptococcaceae Human mouth, gut, and Normal microbiota in human NZ࿝ABEE00000000 micros ATCC 33270 lung mouth and gut: pathogen / Lung infection Faecalibacterium Cluster IV Human and pig gut Normal human microbiota flora NZ࿝ABED00000000 prausnitzii M21/2 / Human feces Eubacterium siraeum Cluster IV Human gut Normal human gut microbiota / NZ࿝ABCA00000000 DSM 15702 Human feces Anaerotruncus Cluster IV Human and avian gut Normal human gut microbiota / NZ࿝ABGD00000000 colihominis DSM Human feces 17241 Bacteroides capillosus Cluster IV Human and turkey gut Normal human gut microbiota: NZ࿝AAXG00000000 ATCC 29799 human abscesses and wounds Lozupone et al. www.pnas.org/cgi/content/short/0807339105 2of6 Table S2. Previously sequenced genomes included in the analysis Environmental Distribution (Ͼ98% ID hits to ncbi ENV Description/Original Gut(G) Full Name Lineage database) Isolation Non-Gut(N) RefSeq Bacteria;Proteobacteria Escherichia coli HS Gammaproteobacteria; Cosmopolitan in free-living Non-disease causing GNC࿝009800 Enterobacteriales; communities and abundant in strain/Human gut Enterobacteriaceae vertebrate gut Pseudoalteromonas atlantica Gammaproteobacteria; Seawater Associated with marine NNC࿝008228 T6c Alteromonadales; biofilms and shell disease Pseodoalteromonadaceae in shellfish/ Diseased crab shells Shewanella frigidimarina NCIMB Gammaproteobacteria; Marine water, ice, coral and saline Marine bacterium/Water NNC࿝008345 400 Alteromonadales; lakes from the North Sea Shewanellaceae Hahella chejuensis KCTC 2396 Gammaproteobacteria; Marine Marine/ Marine sediment N NC࿝007645 Oceanospirillales; Hahellaceae Alcanivorax borkumensis SK2 Gammaproteobacteria; Marine sediment, water, and Marine/ Marine sediment N NC࿝008260 Oceanospirillales; coral Alcanivoracaceae Roseobacter denitrificans OCh Alphaproteobacteria; Diverse Saline Environments Marine anaerobe/ Marine NNC࿝008209 114 Rhodobacterales; (Marine Ice, bacterioplankton, sediment Rhodobacteraceae and sediment) Mesorhizobium sp. BNC1 Alphaproteobacteria; Soil, waste-water treatment Sewage / Sewage N NC࿝008254 Rhizobiales; Phyllobacteriaceae Bacteria:Bacteroidetes Parabacteroides distasonis ATCC Bacteroidetes

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    6 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us