Meeting Report: “Metagenomics, Metadata and Meta-Analysis” (M3) Workshop at the Pacific Symposium on Biocomputing 2010
Total Page:16
File Type:pdf, Size:1020Kb
Standards in Genomic Sciences (2010) 2:357-360 DOI:10.4056/sigs.802738 Meeting Report: “Metagenomics, Metadata and Meta-analysis” (M3) Workshop at the Pacific Symposium on Biocomputing 2010 Lynette Hirschman1, Peter Sterk2,3, Dawn Field2, John Wooley4, Guy Cochrane5, Jack Gilbert6, Eugene Kolker7, Nikos Kyrpides8, Folker Meyer9, Ilene Mizrachi10, Yasukazu Nakamura11, Susanna-Assunta Sansone5, Lynn Schriml12, Tatiana Tatusova10, Owen White12 and Pelin Yilmaz13 1 Information Technology Center, The MITRE Corporation, Bedford, MA, USA 2 NERC Center for Ecology and Hydrology, Oxford, OX1 3SR, UK 3 Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK 4 University of California San Diego, La Jolla, CA, USA 5 European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Hinxton, Cambridge, UK 6 Plymouth Marine Laboratory (PML), Plymouth, UK 7 Seattle Children’s Hospital, Seattle, WA, USA 8 Genome Biology Program, DOE Joint Genome Institute, Walnut Creek, California, USA 9 Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA 10 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA 11 DNA Data Bank of Japan, National Institute of Genetics, Yata, Mishima, Japan 12 University of Maryland School of Medicine, Baltimore, MD, USA 13 Microbial Genomics Group, Max Planck Institute for Marine Microbiology & Jacobs University Bremen, Bremen, Germany This report summarizes the M3 Workshop held at the January 2010 Pacific Symposium on Biocomputing. The workshop, organized by Genomic Standards Consortium members, in- cluded five contributed talks, a series of short presentations from stakeholders in the genom- ics standards community, a poster session, and, in the evening, an open discussion session to review current projects and examine future directions for the GSC and its stakeholders. Introduction The M3 Workshop at the Pacific Symposium on Workshop, this 2010 PSB meeting included six Biocomputing (PSB) 2010 was organized by sessions and two other workshops: members of the Genomic Standards Consortium to Computational Challenges in Comparative continue the outreach by the GSC to the broader Genomics multi-omics community and to the computational • biology community. The workshop was a follow- Computational studies of non-coding RNAs on to two successful workshops held during the second half of 2009: the International Conference • Dynamics of Biological Networks on Intelligent Systems for Molecular Biology • Multi-resolution Modeling of Biological (ISMB) Metagenomics, Metadata and MetaAnalysis Macromolecules (M3) Special Interest Group (SIG) [1], and the M5 • (Metagenomics, Metadata, MetaAnalysis, Models, Personal Genomics and Metainfrastructure) workshop held in • Reverse Engineering and Synthesis of conjunction with the Supercomputing ’09 (SC09) Biomolecular Systems conference, Portland, OR, United States. • In silico Biology Workshop PSB serves as a meeting ground to explore topical issues of interest to a cross section of the compu- • GPD-Rxn Workshop: Genotype-Phenotype- tational biology community. In addition to the M3 Drug Relationship Extraction from Text • The Genomic Standards Consortium Meeting Report: “Metagenomics, Metadata and Meta-analysis” (M3) Background type, etc). A recent paper published in PNAS illu- The Genomic Standards Consortium (GSC) orga- strates the power of this approach [5]. It reports a nized this workshop as part of its goal to create study aimed at elucidating the relationships be- richer descriptions for the collection of genomes tween metabolic pathways and environmental and metagenomes through the development of parameters in microbial communities using the standards and tools for supporting compliance data and metadata from the Global Ocean Survey and exchange of contextual information [2]. Estab- (GOS), an earlier landmark paper in the history of lished in September 2005, this international the field of metagenomics [6]. The kick-off of the community includes representatives from the In- Human Microbiome Project and the resulting data ternational Nucleotide Sequence Database Colla- sets will open enormous new possibilities for the boration (INSDC), major genome sequencing cen- coordinated integration of contextualized meta- ters, bioinformatics centers and a range of re- genomes. search institutions. The rapid pace of genomic and metagenomic se- M3 Workshop Structure quencing projects [3], which now include studies The workshop goal was to attract experimental- of microbiomes, will only increase as the use of ists and computational researchers making “next- ultra-high-throughput sequencing methods be- generation” use of contextual metadata. The comes more commonplace. It is clear that we need workshop was divided into two parts – a set of new standards to capture additional contextual contributed talks to highlight specific research data as well as tools to support its use in down- activities, and a panel of leaders in the metage- stream computational analyses. It is also clear that nomics community who discussed the broad is- these standards will be vital to exploring the com- sues related to generation of metagenomics data, plex interactions that take place in communities – metadata standards and tools to support the me- both microbial communities, such as those sam- ta-analysis. In addition, the workshop included a pled in marine environments, and host-microbial poster session to highlight recent advances re- communities, such as those now being sampled in lated to the M3 goals and GSC activities. the Human Microbiome Project. Contributed Talks The GSC has been responsible for promulgating The contributed talks covered the three “M”s: the MIGS/MIMS standard (Minimal Information about Genomic/Metagenomics Sequences) [3], Metagenomics th and, at the 8 GSC workshop in September 2009, a • Using 100 years of data to contextualize meta- new standard MIENS (Minimal Information about genomics in the Western English Channel. Jack an ENvironmental Sequence) [4]. These standards Gilbert, Plymouth Marine Laboratory, UK are being incorporated into the INSDC (Interna- tional Nucleotide Sequence Database Collabora- • Metagenomics reveals functional shifts in the tion) as part of a new “structured comment field”. bovine rumen microbiota composition with This development was explored in a panel session propionate intake. Michael E. Sparks, Animal that was part of the workshop, involving repre- and Natural Resources Institute, USDA, Agri- sentatives from DDBJ, EMBL and GenBank. cultural Research Service, Beltsville, USA As one of its activities, the GSC has launched a new Metadata electronic journal SIGS (Standards in Genomic • Gemina: Ontology and metadata standards de- Sciences)in order to provide an open-access pub- velopment provide core of infectious pathogen lication for the rapid dissemination of both ge- surveillance and geospatial tool. Lynn Schriml, nome and metagenome reports compliant with University of Maryland School of Medicine, the MIGS/MIMS standards; the first three issues Baltimore, USA have included “Short Genome Reports” on 32 se- quenced bacterial genomes. Meta-analysis The M3 Workshop at PSB 2010 built directly on • Comparative Microbial genomics of resistance the past GSC workshops and the ISMB SIG [1]. Its genes in Staphylococcus aureus. Anja Staus- focus was on comparative studies of (me- gaard, The Technical University of Denmark, ta)genomes that bring these sequences into “con- 2800 Kgs. Lyngby, Denmark text” (i.e., by geolocation, habitat, organism pheno- 358 Standards in Genomic Sciences Hirschman et al. • Accurate taxonomic assignment of short pyro- pants. Suggestions included the International sequencing reads. Jose Clemente, Center for Symposium for Microbial Ecology (ISME) meeting Information Biology and DNA Databank of Ja- in August 22-27th in Seattle. This had now led to pan, National Institute of Genetics, Mishima, the inclusion of a GSC round table discussion at Japan. this meeting on Monday the 23rd August 2010. There was discussion of both previous meetings in The first two talks (Gilbert, Sparks) described which the GSC was invited to participate, including comparative metagenomic studies that demon- the 109th General Meeting of the American Society strated the power provided by data measured (e.g. for Microbiology (ASM), the Argonne Soils Work- geographic location, salinity, temperature, or pH) shop and SC09, as well as upcoming GSC spon- and curated (e.g., habitat or host) using appropri- sored events including the M3 and BioSharing SIG ate metadata standards. The third talk by Schriml at ISMB 2010, July 9-10 in Boston, and the GSC9 described a new set of curated metadata stan- meeting at JCVI April 28-30th 2010 in Rockville. In dards that aided in the integration and inter- addition, Nikos Kyrpides made a plea for the GSC operability of disparate datasets, drawing on GSC to reach beyond the microbial community to in- sponsored work on the Environmental Ontology clude the plant genome community as well as EnvO. The final two talks demonstrated the power many of the model organism groups. of meta-analysis: Stausgaard used a comparative genomics approach to identify and analyze resis- There was discussion about a different meaning of tance genes in Staphylococcus aureus); Clemente “standards” that might serve as a kind of “Con- looked at taxonomic assignment of sequences of sumer Reports” model for comparing and con- short read-length, a significant hurdle