Global Genome Biodiversity Network: Saving a Blueprint of the Tree of Life – a Botanical Perspective
Total Page:16
File Type:pdf, Size:1020Kb
Annals of Botany 118: 393–399, 2016 doi:10.1093/aob/mcw121, available online at www.aob.oxfordjournals.org Global Genome Biodiversity Network: saving a blueprint of the Tree of Life – a botanical perspective O. Seberg1,*, G. Droege2, K. Barker3, J. A. Coddington3, V. Funk3, M. Gostel3, G. Petersen1 and P. P. Smith4 1Natural History Museum of Denmark, Faculty of Science, University of Copenhagen, Sølvgade 83, opg. S, DK-1307 Copenhagen, Denmark, 2Botanic Garden and Botanical Museum Berlin-Dahlem, Freie Universitat€ Berlin, Ko¨nigin-Luise-Str. 6–8, D-14195 Berlin, Germany, 3National Museum of Natural History, Smithsonian Institution, Washington, DC 20560, USA and 4Botanic Gardens Conservation International, 199 Kew Road, Richmond TW9 3BW, Surrey, UK. *For correspondence. E-mail [email protected] Received: 10 March 2016 Returned for revision: 10 May 2016 Accepted: 19 May 2016 Published electronically: 20 June 2016 Background Genomic research depends upon access to DNA or tissue collected and preserved according to high-quality standards. At present, the collections in most natural history museums do not sufficiently address these standards, making them often hard or impossible to use for whole-genome sequencing or transcriptomics. In re- sponse to these challenges, natural history museums, herbaria, botanical gardens and other stakeholders have started to build high-quality biodiversity biobanks. Unfortunately, information about these collections remains fragmented, scattered and largely inaccessible. Without a central registry or even an overview of relevant institutions, it is diffi- cult and time-consuming to locate the needed samples. Scope The Global Genome Biodiversity Network (GGBN) was created to fill this vacuum by establishing a one- stop access point for locating samples meeting quality standards for genome-scale applications, while complying with national and international legislations and conventions. Increased accessibility to genomic samples will further genomic research and development, conserve genetic resources, help train the next generation of genome re- searchers and raise the visibility of biodiversity collections. Additionally, the availability of a data-sharing platform will facilitate identification of gaps in the collections, thereby empowering targeted sampling efforts, increasing the breadth and depth of preservation of genetic diversity. The GGBN is rapidly growing and currently has 41 mem- bers. The GGBN covers all branches of the Tree of Life, except humans, but here the focus is on a pilot project with emphasis on ‘harvesting’ the Tree of Life for vascular plant taxa to enable genome-level studies. Conclusion While current efforts are centred on getting the existing samples of all GGBN members online, a pilot project, GGI-Gardens, has been launched as proof of concept. Over the next 6 years GGI-Gardens aims to add to the GGBN high-quality genetic material from at least one species from each of the approx. 460 vascular plant fami- lies and one species from half of the approx. 15 000 vascular plant genera. Key words: Arboretum, biodiversity repository, biodiversity genomics, biobanks, botanic gardens, DNA banking, genome-quality samples, Global Genome Initiative. INTRODUCTION damage, interfering with sequencing (e.g. by cross-linking DNA and proteins in formalin-preserved tissues; see, Friedman The molecular revolution has changed almost every field of and DeSalle, 2008; Zimmermann et al.,2008). However, our modern biology. Consequently the demand for access to biolog- ability to deal with these problems is quickly improving (see ical samples appropriate for genomic research (quickly pre- Der Sarkissian et al., 2015), and, for certain applications, the served, properly vouchered and correctly identified) by the degraded nature of DNA in museum samples is not an obsta- scientific community has also increased significantly. This de- cle – as DNA does not have to be sheared prior to library build- mand has been answered partially through growing efforts in ing – and most, if not all, samples yield sequences suitable for fieldwork, in parallel with a supply of material from culture col- either genome sequencing or plastomes and mitogenomes. As a lections, seed banks, botanical gardens, zoos, aquaria and – to consequence, a new field, museum genomics or museomics an increasing degree – material stored in natural history mu- (Men et al.,2008), has emerged, and the use of the collections seums and herbaria. Improvements in sequencing technology has broadened beyond traditional uses in taxonomy and phylo- have increased our ability to use traditional collections, but, de- genetics, to encompass new fields such as population genomics, spite the power of new molecular techniques, the use of tradi- adaptation genomics, phylogenomics, and ecological and con- tional historical material frequently remains a challenge: (1) the servation genomics. DNA in specimens is often fragmented; (2) historical preserva- Thus, to fulfil new research aims, and maximize the use of tion techniques usually fail to inhibit endo- and exonuclease ac- their collections with minimal destructive sampling, traditional tivity; or (3) the DNA has become almost inaccessible due to biodiversity repositories have adapted increasingly to their new preservatives and fixatives that cause widespread post-mortem role and have started to store tissue or DNA samples to help VC The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ Downloaded from https://academic.oup.com/aob/article-abstract/118/3/393/1741085by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. by guest on 19 April 2018 394 Seberg et al. — Saving a blueprint of the Tree of life overcome these barriers. Currently, storage is achieved in sev- focuses on DNA and tissue banks attached to traditional natural eral different ways and, although storage in liquid nitrogen va- history or culture collections, but membership to the network is pour might be considered the ‘gold’ or four-star standard (see open to any biodiversity biobank (e.g. seed banks, environmen- Wong et al., 2012; Spooner and Ruess, 2014), other storage tal genebanks, zoos, aquaria and other types of biological repos- methods such as tissue dried in silica or preserved in ethanol itories, as well as representatives of government, academic and can be useful as well. However, the shear number of known other organizations involved in genomic biodiversity research). and unknown eukaryotic species, 1Á12–1Á75 million and 8Á7– Members are expected to have interests in (1) genomic research 12Á25 million, respectively (lower estimates from Mora et al., and research infrastructure connected to biodiversity and the 2011; higher from Goombridge and Jenkins, 2002), signifi- environment; (2) interacting with other members and the cantly surpasses the capability of any single repository. GGBN Secretariat; and (3) contributing to the achievement of Though sample quality is important, it is not the only rele- the GGBN’s goals. vant parameter; sample diversity should also be considered. If The GGBN’s primary goal is long-term storage and improv- we look closely, only truly charismatic vertebrate groups such ing the discovery of tissue and DNA (genomic DNA), as well as birds and mammals are ‘well’ represented in GenBank, i.e. as the associated voucher specimens to allow verification of there is at least one sequence from 8687 of all approx. 10 000 previous determinations in the future. For many taxonomic known bird species, and from 4587 of the approx. 5500 known groups and applications, molecular based identification has be- mammal species (Chapman, 2009). This contrasts markedly come increasingly important with its potential for standardized, with even one of the most charismatic seed plant groups, automated, scalable biodiversity identification. To reach that Orchidaceae, which is only represented by at least one sequence goal, well-documented reference databases are required to en- for 6265 species of the estimated approx. 22 500 species able automated sequence comparison, e.g. by BLAST against (Mabberly, 2008). The vertebrate groups have clearly received nucleotide collection or primary sequence databases operating a disproportionate level of attention. The bias is even much in the International Nucleotide Sequence Database stronger when it comes to more taxonomically diverse groups Collaboration (INSDC; Cochrane et al., 2011; see, however, such as invertebrates, most vascular plants, green algae and Rach et al., 2008). Presently, identification attempts often fail, fungi. primarily due to the lack of reference databases. Consequently, the need for co-ordinated sampling efforts, storage and documentation strategies, as well as data and sam- ple quality management is more urgent than ever. The only THE GGBN PLATFORM way to collect and proportionately sample the Tree of Life and salvage a genomic blueprint of key organisms for generations The GGBN Data Portal (data.ggbn.org; Droege et al.,2014) to come is to divide up this task globally and to make the re- addresses and improves the use of samples and data by provid- sources available to the wider scientific community. ing standardized access to genome-quality samples and related To face these challenges, the Global Genome Biodiversity