BIOSCAN: DNA Barcoding to Accelerate Taxonomy and Biogeography for Conservation and Sustainability

Journal: Genome
Manuscript ID: gen-2020-0009.R1
Manuscript Type: Current opinion
Date Submitted by the Author: 12-Mar-2020
Complete List of Authors: Hobern, Donald; International Barcode of Life Consortium, iBOL Secretariat
Keyword: DNA barcoding, Metabarcoding, Biodiversity, Taxonomy, Biomonitoring
Is the invited manuscript for consideration in a Special Issue?: Trends in DNA Barcoding and Metabarcoding 2019

Donald Hobern, [email protected], ORCID: 0000-0001-6492-4016
Centre for Biodiversity Genomics, 50 Stone Road East, University of Guelph, Guelph, ON, N1G2W1, Canada

In the 17 years since the seminal paper on biological identifications through use of COI as a DNA barcode (Hebert et al., 2003), this short gene region has become the tool of choice for countless biological applications, most conspicuously in support of animal taxonomy (Coleman & Radulovici, 2020) and biomonitoring. Parallel efforts have led to the adoption of ITS (Xu, 2016) and two chloroplast loci (CBOL Plant Working Group, 2009; Hollingsworth, 2011) as barcodes for fungi and plants. The International Barcode of Life Consortium (iBOL) continues to foster the wider application of these techniques. DNA barcoding is now a standard approach across multiple fields of biology, something reflected both in increasingly routine references in the literature and in the growth in associated journals and in attendees at the biennial Barcode of Life conferences.

Over the same period, advances in genomic technologies and bioinformatics have made it possible routinely and accurately to sequence materials for a tiny fraction of the 2003 cost. High-throughput sequencing has opened the door both for whole-genome studies to become nearly routine and for large-scale metagenomics to explore whole communities of organisms.

Given these developments, it is easy to regard DNA barcoding as a transitional solution that reflects the technical limitations of the period when it was first proposed. Long-read sequencing and genome skimming approaches offer evolutionary and functional insights that cannot be retrieved from 650 base pairs of a mitochondrial gene. However, DNA barcoding and metabarcoding still offer significant benefits that cannot be achieved and may never be achieved simply by increasing the power of sequencing platforms. The adoption of short, standard DNA barcodes across whole kingdoms of life ensures that new sequences quickly and efficiently fill gaps in reference libraries and maximises the number of field-collected sequences that can be recognised using cheap and simple protocols, including use of "mini-barcodes" (Hajibabaei et al., 2006; Gao et al., 2019).

Barcode sequences offer an interface between taxonomy and all the fields of biology that depend on precise and timely species identifications, including environmental genomics. Identification of organisms remains the foundational step for most human interactions with biodiversity. DNA barcoding addresses many of the limitations historically associated with securing such identifications. Perhaps more significantly, it allows species identifications to be carried out at unprecedented scales and precision.
The efficiency of DNA barcoding as an identification tool offers the prospect of high-quality broad-scale information on ecosystem composition and resource usage at a time when the need and appetite for big data on the environment is greater than ever before (e.g. Díaz et al., 2009).

Identification of known species and sorting unrecognised organisms have historically been labour-intensive tasks, severely limited by available expertise. For the most speciose groups, only a handful of taxonomists globally may have the skills and knowledge to provide such a service. Even when such expertise is available (usually limited to small taxonomic groups and certain geographic regions), determinations may only be possible for some life stages, one sex, certain forms, or in combination with some associated ecological data. Across the tropics and in other biodiverse regions, for megadiverse groups such as insects, mites and fungi, the number of described species may be a small fraction of the total and formal identifications may be unachievable.

These challenges have limited progress in addressing some of the most fundamental and persistent questions in biology. We have no complete overview of all the world's species, the products of Earth's evolutionary history that share this planet. Estimates even of the number of extant species vary across an order of magnitude or more, with numbers in the range of five (Costello et al., 2013) to eleven (Taxonomy Decadal Plan Working Group, 2018) million considered most likely. Even when species have been described, the associated information may be hard to access.
To catalogue the community of species found in Darwin's "entangled bank, clothed with many plants of many kinds, with birds singing on the bushes, with various insects flitting about, and with worms crawling through the damp earth" (Darwin, 1859) using classical methods would require the skills of dozens or even hundreds of experts. Fully to represent this community as a structured ecosystem would take years of effort. To expand this understanding to a protected area, let alone a bioregion or the whole planet, would be inconceivable.

The burden of correctly identifying organisms is not restricted to taxonomy, ecology and other fields of biodiversity research. Accurate species determinations are essential for pest management, biosecurity at ports and borders, conservation, and human and animal health. In each case, agencies and businesses have been obliged to develop and maintain their own taxonomic capabilities or else to rely on scarce professional and academic resources. For the wider public, the difficulty even of finding the name for an organism hampers understanding, appreciation and use of biodiversity (Janzen, 2010).

DNA barcoding has developed as the genomics solution to address the difficulty and cost of securing such identifications. The adoption of short, standardised, easily replicated markers is a strength since it simplifies sequencing pipelines, maximises their applicability across vast swathes of life, and facilitates cost minimisation. Barcodes signal identity, or at least close association, between pairs of organisms. Within many community surveys and biomonitoring efforts, recognising that multiple individuals belong to a single species is in itself valuable information, even in the absence of a known species name.
Where properly identified specimens are available and have been barcoded, these can anchor this identity within a known species and its associated biology. Over time, as more reference specimens are added to the barcode libraries, sets of historically detected but unrecognised individuals can receive formal identification. DNA barcoding thus allows for biomonitoring and ecological research to proceed and deliver results asynchronously from attaching species names to the sequence clusters. Ecology and taxonomy can continue in parallel with the expectation that additions to field data and to the library of reference sequences will ultimately interconnect in complementary ways.

As reference barcodes become available for all species within a group, identification of individuals can be simplified to sequencing of DNA and look-up of the associated barcodes. This is particularly critical in areas such as agricultural biosecurity and product certification, where there may be the possibility of detecting organisms from anywhere in the world. Delays in identifying potential new arrivals can have catastrophic effects. Economic modelling indicates expected impacts or treatment costs in the billions of dollars for the most significant invasive pests, such as Agrilus planipennis Fairmaire, 1888 (emerald ash borer) in North America (Kovacs et al., 2010) and for Anoplolepis gracilipes Smith, 1857 (yellow crazy ant) in Australia (Spring et al., 2019).

The development of metabarcoding techniques, including use of environmental DNA for species detection and community, indicates the ultimate and most efficient endpoint for these activities. Once the vast majority of known species and undescribed species are represented by reference barcodes, very low-cost processes can deliver data on all of the species present in any community.
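
The look-up step described above, and the way unnamed sequences can receive a name later as the reference library grows, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any iBOL or BOLD workflow: the function names `p_distance` and `identify`, the 20-base toy sequences, the labels "Species A" and "Species B", and the distance threshold of 0.1. Real pipelines align roughly 650 base pairs of COI and use more careful matching criteria.

```python
from typing import Dict, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def identify(
    query: str, library: Dict[str, str], threshold: float
) -> Tuple[Optional[str], Optional[float]]:
    """Return (name, distance) of the nearest reference barcode.

    The name is None when the library is empty or the nearest reference is
    further away than `threshold` (a hypothetical cut-off chosen by the caller).
    """
    best_name: Optional[str] = None
    best_dist: Optional[float] = None
    for name, ref in library.items():
        d = p_distance(query, ref)
        if best_dist is None or d < best_dist:
            best_name, best_dist = name, d
    if best_dist is not None and best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist


# Toy reference library (hypothetical label and sequence).
library = {"Species A": "ACGTACGTACGTACGTACGT"}

# Two field-collected toy sequences: one close to the reference, one not.
near_query = "ACGTACGTACGTACGTACGA"
far_query = "TTTTACGTACGGGGGTACCC"

THRESHOLD = 0.1  # illustrative value only

first_pass = identify(far_query, library, THRESHOLD)

# Later, a taxonomist adds a reference specimen for the unnamed cluster.
updated_library = dict(library)
updated_library["Species B"] = far_query
second_pass = identify(far_query, updated_library, THRESHOLD)
```

In this sketch the unmatched sequence is still usable as a recognisable unit in the first pass, and it picks up a name in the second pass without any change to the field data, which is the asynchronous relationship between ecology and taxonomy that the text describes.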

A further advantage of barcode-based approaches is that they rely on very short sections of DNA with limited potential for industrial application or wider research. 122 national parties have ratified the Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization (
