
Standards in Genomic Sciences (2011) 5:243-247 DOI:10.4056/sigs.2134923

The Earth Microbiome Project: The Meeting Report for the 1st International Earth Microbiome Project Conference, Shenzhen, China, June 13th-15th 2011

Jack A. Gilbert1,2, Mark Bailey3, Dawn Field3, Noah Fierer4,5, Jed A. Fuhrman6, Bin Hu7, Janet Jansson8, Rob Knight9, George A. Kowalchuk10,11, Nikos C. Kyrpides12, Folker Meyer1,13, Rick Stevens1,13

1 Argonne National Laboratory, Argonne, IL, USA
2 Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
3 Centre for Ecology & Hydrology, Natural Environment Research Council, Crowmarsh Gifford, Wallingford, Oxon, UK
4 Dept. of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA
5 Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, USA
6 Dept. of Biological Sciences, University of Southern California, Los Angeles, CA, USA
7 Beijing Genomics Institute at Shenzhen, Guangdong, China
8 Lawrence Berkeley National Laboratory, Earth Sciences Division, Berkeley, CA, USA
9 Howard Hughes Medical Institute and Department of Chemistry & Biochemistry, University of Colorado at Boulder, Boulder, USA
10 Department of Microbial Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
11 Department of Ecological Science, VU University Amsterdam, Amsterdam, The Netherlands
12 DOE Joint Genome Institute, Walnut Creek, CA, USA
13 Computation Institute, University of Chicago, Chicago, IL, USA

This report details the outcome of the 1st International Earth Microbiome Project Conference. The 2-day conference was held at the Kingkey Palace Hotel, Shenzhen, China, on the 14th-15th June 2011, and was hosted by BGI (formerly the Beijing Genomics Institute). The conference was arranged as a formal launch for the Earth Microbiome Project, to highlight some of the exciting research projects and the results of the preliminary pilot studies, and to provide a discussion forum for the types of technology and experimental approaches that will come to define the standard operating procedures of this project.

Introduction

The Earth Microbiome Project [1-3] is an ambitious endeavor that aims to generate the largest repository of comparable environmental sequence data yet attempted. The EMP is driven by a fundamental need to understand life on Earth and its interactions with the environment. To achieve this, it has become clear that we need a deep exploration of the microbiome of Earth through systematic characterization of the microbial communities and their diversity across the planet. The need is fueled by scientific and economic justifications for a large-scale and rapid assessment of global microbial biodiversity. The technical challenges associated with sample acquisition, data generation and data analysis are essentially issues of scale and can only be resolved with sufficient support from the community and the funding agencies.

The benefits of generating a comprehensive planet-wide survey of comparable data are manifold, including an unprecedented knowledge resource that will allow fundamental advances in the study of microbial biodiversity, biogeography, ecology, global protein and gene diversity, evolution and community dynamics. Advances in sequencing technology, coupled with advances in computing and data analysis and the rise in massively-parallel researcher communication networks (social networking science), make it possible to now consider a distributed and scalable approach to the problem of sample collection, processing, sequencing and analysis for hundreds of thousands of environmental locations.

The 1st International EMP Conference was designed to showcase the rationale, tools, and design of the EMP, highlighting the technical challenges and the potential. The EMP defines a suite of standard protocols and procedures for the processing and analysis of thousands of samples from disparate environments and locations. While the 'no-size-fits-all' paradigm is a fundamental problem for any global survey, the benefits of generating such a survey outweigh these complications. Generating an integrated understanding of the role of microbes in the ecosystem turnover of each system on Earth, and exploring the complexity of interaction between each system, will help to define and build a new model of Earth's biodiversity, which will help to define and refine our capability to manage the resources of this planet.

The main goal of the EMP is a systematic characterization of microbial life on Earth, which is exceptionally challenging and is comparable to, if not exceeding, the challenge faced by astrophysicists and astronomers in exploring the universe. There are approximately 5 × 10^30 microbial cells on Earth [4,5], which is a billion times the number of stars in the known universe [6], and their genetic complexity is exceptional and is both cause and effect of their ubiquity in every niche on Earth. Yet, no ocean is bottomless, and the number and type of functional adaptations to environmental conditions must be finite even if in flux. However, while it is vital that we understand the players and plays associated with the microbial world, this census is only a small aspect of the EMP. One of the main goals is to generate a suite of microbial community models that enable us to predict, for example, the changes in metabolite turnover in diverse environmental systems over different spatial and temporal scales, to help us better manage our environment and to plan for and mitigate future changes in the environment, e.g. climate change.

The 1st International EMP Conference was an open meeting with over 100 attendees. There were 8 invited guest speakers, including Rick Stevens (Argonne National Laboratory and University of Chicago), who gave the keynote on the morning of the first day. In addition, we had 22 offered talks with presenters from a range of nations (China, USA, Germany, France, New Zealand, Australia, and Spain). The meeting was loosely divided into two themes. The theme for Day 1 was microbial ecology, which focused on why we need the EMP, tools and models for the EMP, some preliminary data from the pilot study, and a number of exciting case studies from EMP collaborators. The theme for Day 2 was chiefly dedicated to standards and bioinformatic techniques, which included novel data analysis tools, standard data acquisition, and some considerations from previous or existing massive sequencing projects, including Terragenome, The Microbial Earth project, The Gordon and Betty Moore Foundations Sequencing Project, and Meta-HIT.

Day 1

The 1st International EMP Conference (Twitter hashtag #EMP1, #earthmicrobiome, #earthmicrobiomeproject) was opened by a welcome speech by Professor Huanming Yang (Director of BGI), who gave a marvelous introduction to the reason for scientific meetings, which he expounded as 'to make friends and drive collaboration'. He also reiterated BGI's excitement at being involved with the EMP, and noted that this study was both ambitious and worthwhile. Professor Yang also introduced Professor Rick Stevens (Argonne National Laboratory, University of Chicago, USA), who gave the keynote for the conference. Professor Stevens discussed the origins, rationale and prospects for the EMP, exploring the parallels with the Sloan Foundation's Digital Sky Survey. He pointed out that the EMP's task was far more difficult, but with much more significant consequences for humankind.

Session I: Microbial Ecology, the role of the EMP in re-defining research

The first invited speaker, George Kowalchuk (The Netherlands Institute of Ecology, The Netherlands), gave an exciting talk about why the EMP is important, and how the generation of comparable data from many ecosystems can help us to redefine our exploration of the microbial world. He argued that it was essential to combine large- and small-scale studies to build up a multidimensional picture of microbial life. Secondly, Jack A. Gilbert (Argonne National Laboratory and the University of Chicago, USA) gave a brief welcome and thank you note to the local committee for helping to organize the conference. He then outlined the EMP's fundamental goals, and provided some initial data from the main pilot study of the first 10,000 samples processed. The data were from 5,387 samples and comprised only 16S rRNA sequences, all generated using the same DNA extraction, amplification and sequencing protocol. The samples came from streamwater, soil, marine sediment, human skin, air, coal-beds, lake water, human guts and human oral cavities. The resulting alpha diversity was shown as well as a PCoA plot of all 5,387 samples comprising >210 million sequences of the 16S rRNA gene V4 region generated using Illumina GAIIx amplicon sequencing. Dr. Gilbert also presented results from the Western English Channel study, highlighting several new regional scale models derived from 16S rRNA and metagenomic data generated over a prolonged time series. These models highlighted the end-goal of the EMP, to generate taxonomic and metabolic turnover predictions across space and time. Two offered talks followed. The first was from Juan Imperial (Polytechnic University of Madrid, Spain), who made a case for a global survey of legume-rhizobial symbionts. The second was from Guanghua Wang (Northeast Institute of Geography and Agroecology, China), who gave the first virus-based talk of the conference and highlighted the need for the EMP to explore viral biodiversity as well, specifically exploring T4-type bacteriophages.

Following a coffee break and group photograph, two further invited talks were given. Jed Fuhrman (University of Southern California, USA) gave an excellent overview of the history and taxonomic profiling of marine microbial metagenomes, as some of the most studied microbial ecosystems on Earth. He highlighted the absolute necessity for time series studies to determine the variability in any given system. Janet Jansson (Lawrence Berkeley National Laboratory, USA) provided an exciting overview of work on the terrestrial microbiome, including the world's largest metagenomic project, JGI's Great Prairie Grand Challenge pilot study.

Session II: Microbial Genomics and Diversity

Following lunch, Jun Wang (BGI, China) gave an excellent talk on the genome sequencing of the recent Escherichia coli strain from the outbreak in Germany (May/June 2011). He highlighted the importance of the EMP in helping to define the environmental reservoirs of human pathogens. John Stephen (Australian Genome Research Facility Ltd, Australia) followed with an introduction to a new initiative to generate a national terrestrial soils map for microbial life in Australia; this was an excellent example of an early stage adopter of the EMP protocols for generating comparable databases of large scale surveys. Torsten Thomas (The University of New South Wales, Australia) made an excellent case for exploring the microbial world on physical surfaces, specifically sponges and corals, providing excellent examples of extant data and the lack of comparability among these data. Janet Seifert (Rice University, USA) gave a passionate argument for exploring the global diversity of marine stromatolites, which are considered among the oldest microbial ecosystems on Earth, and represent a valuable tool for exploring microbial evolution. Zhongjun Jia (Institute of Soil Science, Chinese Academy of Science, China) then provided an excellent example of a country-wide survey of soil from China, with a focus on the need for collecting detailed and comprehensive environmental data records.

Session III: EMP case studies

Following the coffee break, S. Craig Cary (The University of Waikato, New Zealand) gave an excellent example of a sample collection from Antarctica, for which rich metadata is available, and gave a wonderful example of how to design a regional scale survey. Tong Zhang (The University of Hong Kong, Hong Kong) discussed the microbiota of human engineered environments, suggesting the EMP should not overlook these. As an example, he discussed the microbiome of wastewater treatment plants. Haiyan Chu (Institute of Soil Science, Chinese Academy of Science, China) gave an excellent example of a global-scale analysis of microbial life in soils, demonstrating that the communities in the arctic were fundamentally similar to the communities from many different latitudes.

This closed the official sessions for the first day. The attendees were then offered a tour of the facilities at BGI, Shenzhen, followed by a banquet for all attendees.

Day 2

Session IV: Bioinformatic analyses and lessons learned

The first talk on Day 2 was given by Rob Knight (University of Colorado at Boulder, USA), who discussed the lessons learned from the Human Microbiome Project and many of the other projects and datasets. Yangqing Peng (BGI, China) presented a new suite of bioinformatic tools for exploring genome reassembly from metagenomic data. Adina Howe (Michigan State University, USA) also discussed tools for sequence data assembly, including the need to break up the data into smaller portions prior to assembly. Hans-Joachim Ruscheweyh (Tübingen University, Germany) followed with an excellent presentation of the MEGAN software package for metagenomic data analysis, and how the associated metadata can be used to group metagenomes by environmental parameters.

Following the coffee break, Yuzhen Ye (Indiana University, USA) presented FragGeneScan as a tool for predicting genes in short and error-prone reads. Tom O. Delmont (École centrale de Lyon, France) provided some compelling data from the Terragenome Project exploring the soil microbiota from different ecosystems, and the implications of differences in DNA extraction techniques. In a departure from the original agenda, Hongwei Zhou (Southern Medical University, China) presented an interesting method for reducing the dominance of abundant members of the community so that a greater proportion of the rare community can be identified. Scott C. Edmunds (BGI, China) then presented examples of how to disseminate data following generation, including the idea of data DOIs and citation. Heshan Lin (Virginia Tech, USA) discussed the use of graphics processing units (GPUs) for accelerating short-read mapping and local realignment for sequence data. Cheng-Cang Wu (Lucigen Corporation, USA) concluded the session with an excellent talk on the use of long-insert clone libraries as another method for exploring the microbial dynamics in different ecosystems.

Session V: Data analysis and annotation

Following lunch, Folker Meyer (Argonne National Laboratory, University of Chicago, USA) explored the use of cloud computing and MG-RAST to exploit the vast data bonanza being generated by studies similar to the EMP. Nikos Kyrpides (DOE-JGI, USA) then discussed the need for comprehensive coverage of genome sequences from cultured isolates to help ground-truth observations in metagenomic data, highlighting the Microbial Earth Project (MEP). This project aims to sequence the genomes of all the type strains of Bacteria and Archaea, currently estimated to be around 9,000 taxa. Dr. Kyrpides also suggested that the EMP was so important as to be comparable to the moon race in the 1960s, and, as such, it demanded a government agency (analogous to NASA) in each country to fund and facilitate the effort. This could be realized if the microbiology community would come together and form a distributed research center supporting EMP, which would eventually develop into a Microbial Environmental Genomics Agency (MEGA). K. Eric Wommack gave the second viral talk of the conference, highlighting the efforts to sequence and survey viral life on Earth. He also introduced Virome as a platform specifically designed for the annotation of viral metagenomic data. Jack A. Gilbert then gave a stand-in lecture for Suzanne Kennedy (MoBio, USA), which focused on the reproducibility of different DNA extraction methodologies, and introduced some products from MoBio designed to improve the quantity and quality of DNA extracted from different samples.

Following the coffee break, Lanjuan Li (Chinese Academy of Engineering, China) gave an excellent talk as the final presenter of the conference, discussing the implications of Hepatitis B infection on human gut microbiota.

To close the meeting, Jack A. Gilbert thanked all attendees and speakers, and gave special thanks to Hanqiao Kang and Zimin Zhu for all their assistance in making the conference such a success.

Following dinner, a panel discussion was held on the need for DNA extraction standards in the EMP. The premise of this working group was to explore the concerns of the community regarding the adoption of a single DNA extraction methodology for all samples. As already highlighted by two of the talks in the core conference session, DNA extraction can vary from sample to sample, and different methods generate different profiles of the same samples. Importantly, it was evident immediately that no one technique would be an ideal solution. However, it was also made very clear that without a single extraction methodology there could be no absolute comparability between different samples and hence the idea of a systematic global survey would lose much of its value. The protocol adopted by the EMP for the first 10,000 samples pilot study was the MoBio PowerSoil DNA Isolation Kit (both 96-well and single column, depending on number of samples being processed). The manufacturer's protocol was amended with an initial 65 ºC heating step immediately after the addition of the bead solution, and before the shaking step.

The outcome of this working group session was that sample and extraction bias will exist no matter which method is adopted, and the need for comparability should override primary concerns that different samples and different taxa will be differentially extracted in different systems. However, some recommendations were made, specifically that, following the initial pilot study, it was imperative that the biases associated with different techniques be more comprehensively assessed, and that this should be the basis of a second pilot study. Additionally, it was recommended that a robust DNA extraction protocol identified from this pilot study, defined as being able to extract the most DNA from the most environments, be adopted for future EMP studies. Importantly, an annual review of the DNA extraction protocol would be imperative for the continued development and refinement of the EMP. Finally, it was admitted that no one DNA extraction protocol would work for all samples, in that for some samples, the amount of DNA generated would be too low for any analysis. For the pilot study, it was recommended that these samples initially not be included so as to aid the generation of a massive, comparable dataset as quickly as possible. However, as EMP progresses it will be imperative that there be a focus on exploring the level of overlap between different techniques. An archive of these selected EMP sub-samples could be created where different extraction techniques were used that had a known variation to the core extraction protocol.

Wrap-up

It was agreed that another meeting be held in June 2012 to explore the evolution of the EMP, with the recommendation that the gathering convene in the Netherlands.

Acknowledgments

This work was supported in part by the U.S. Dept. of Energy under Contract DE-AC02-06CH11357. We also want to thank Eppendorf, MoBio, BGI, Lucigen, and Hua Yue Enterprise Holdings Ltd. for their sponsorship of the meeting.

References

1. Gilbert JA, Meyer F, Jansson J, Gordon J, Pace N, Tiedje J, Ley R, Fierer N, Field D, Kyrpides N, et al. The Earth Microbiome Project: Meeting report of the "1st EMP meeting on sample selection and acquisition" at Argonne National Laboratory October 6 2010. Stand Genomic Sci 2010; 3:249-253.
2. The Earth Microbiome Project. http://www.earthmicrobiome.org
3. Gilbert JA, Meyer F, Antonopoulos D, Balaji P, Brown CT, Brown CT, Desai N, Eisen JA, Evers D, Field D, et al. Meeting report: the terabase workshop and the vision of an Earth microbiome project. Stand Genomic Sci 2010; 3:243-248.
4. Whitman WB, Coleman DC, Wiebe WJ. Prokaryotes: the unseen majority. Proc Natl Acad Sci USA 1998; 95:6578-6583.
5. Kyrpides NC. Fifteen years of microbial genomics: meeting the challenges and fulfilling the dream. Nat Biotechnol 2009; 27:627-632.
6. Gilbert JA. Beyond the Infinite - tracking bacterial gene expression. Microbiology Today 2010; 37:82-85.

http://standardsingenomics.org