Identifying Canadian Freshwater Fishes Through DNA Barcodes
Nicolas Hubert1, Robert Hanner2, Erling Holm3, Nicholas E. Mandrak4, Eric Taylor5, Mary Burridge3, Douglas Watkinson6, Pierre Dumont7, Allen Curry8, Paul Bentzen9, Junbin Zhang2, Julien April1, Louis Bernatchez1*

1 Département de biologie, Pavillon Charles-Eugène-Marchand, Université Laval, Sainte-Foy, Québec, Canada, 2 Canadian Barcode of Life Network, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada, 3 Department of Natural History, Royal Ontario Museum, Toronto, Ontario, Canada, 4 Great Lakes Laboratory for Fisheries and Aquatic Sciences, Fisheries and Oceans Canada, Burlington, Ontario, Canada, 5 Department of Zoology, Vancouver, British Columbia, Canada, 6 Fisheries and Oceans Canada, Central & Arctic Region, Freshwater Institute, Winnipeg, Manitoba, Canada, 7 Ministère des Ressources naturelles et de la faune du Québec, Direction de l'aménagement de la faune de Montréal, de Laval et de la Montérégie, Longueuil, Québec, Canada, 8 Fish and Wildlife Research Unit, University of New Brunswick, Fredericton, New Brunswick, Canada, 9 Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada

Abstract

Background: DNA barcoding aims to provide an efficient method for species-level identifications using an array of species-specific molecular tags derived from the 5′ region of the mitochondrial cytochrome c oxidase I (COI) gene. The efficiency of the method hinges on the degree of sequence divergence among species, and species-level identifications are relatively straightforward when the average genetic distance among individuals within a species does not exceed the average genetic distance between sister species. Fishes constitute a highly diverse group of vertebrates that exhibit deep phenotypic changes during development.
In this context, the identification of fish species is challenging and DNA barcoding provides new perspectives in ecology and systematics of fishes. Here we examined the degree to which DNA barcoding discriminates freshwater fish species from the well-known Canadian fauna, which currently encompasses nearly 200 species, some of which are of high economic value like salmons and sturgeons.

Methodology/Principal Findings: We bi-directionally sequenced the standard 652 bp "barcode" region of COI for 1360 individuals belonging to 190 of the 203 Canadian freshwater fish species (95%). Most species were represented by multiple individuals (7.6 on average), the majority of which were retained as voucher specimens. The average genetic distance was 27-fold higher between species than within species, as K2P distance estimates averaged 8.3% among congeners and only 0.3% among conspecifics. However, shared polymorphism between sister species was detected in 15 species (8% of the cases). In these cases, the distributions of K2P distances between individuals and between species overlapped, and DNA barcodes allowed identification only to the species group. Conversely, deep hidden genetic divergence was revealed within two species, suggesting the presence of cryptic species.

Conclusions/Significance: The present study provides evidence that freshwater fish species can be efficiently identified through the use of DNA barcoding, especially the species complex of small-sized species, and that the present COI library can be used for subsequent applications in ecology and systematics.

Citation: Hubert N, Hanner R, Holm E, Mandrak NE, Taylor E, et al. (2008) Identifying Canadian Freshwater Fishes through DNA Barcodes. PLoS ONE 3(6): e2490. doi:10.1371/journal.pone.0002490

Editor: Hans Ellegren, University of Uppsala, Sweden

Received December 5, 2007; Accepted May 19, 2008; Published June 18, 2008

Copyright: © 2008 Hubert et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This research was supported through funding to the Canadian Barcode of Life Network from NSERC, Genome Canada (through the Ontario Genomics Institute). Other sponsors listed at www.BOLNET.ca.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: [email protected]

PLoS ONE | www.plosone.org | June 2008 | Volume 3 | Issue 6 | e2490

Introduction

DNA barcoding is designed to provide accurate and automated species identifications through the use of molecular species tags based on short, standardised gene regions [1,2]. While humanity is facing increasing evidence of the erosion of Earth's biodiversity, this approach is proving its effectiveness in characterising the complexity of the biodiversity realm at a pace unequalled by other characters [3]. The primary goals of DNA barcoding focus on the assembly of reference libraries of barcode sequences for known species in order to develop reliable molecular tools for species identification in nature. Current results suggest that, in a large array of organisms, species are generally well delineated by a particular sequence or by a tight cluster of very similar sequences that allow unambiguous identifications [4,5,6,7,8,9,2,10,11,12].

Despite the great promise of DNA barcoding, it has been controversial in some scientific circles [13,14]. Yet, recent results illustrated some straightforward benefits from the use of a standardised molecular approach for identification [1,2]. First, intraspecific phenotypic variation often overlaps that of sister taxa in nature, which can lead to incorrect identifications if based on phenotype only [e.g. 15]. Second, DNA barcodes are effective whatever the life stages under scrutiny [e.g. 16, 17]. Third, cryptic variation and often spectacular levels of undetected taxonomic diversity have been frequently reported [e.g. 18, 19, 20]. Finally, DNA barcode libraries are fully available as they are deposited in a major sequence database, and attached to a voucher specimen whose origin and current location are recorded [2,3]. Once libraries are available, recent studies illustrate the vast array of applications that can be applied to them such as forensic engineering [21,22], ecology of cryptic communities [23], the tracking of invasive species [24,25] and identification of prey from predator stomach samples [e.g. 26].

With the aim of assigning specimens to known species based on molecular tags, a 648-bp segment of the 5′ region of the mitochondrial cytochrome c oxidase I (COI) gene forms the library of primary barcodes for the animal kingdom [1]. Mitochondrial DNA (mtDNA) presents several advantages that make it well suited for large-scale molecular tagging. First, this genome is present in a large number of copies, yielding substantial amounts of genomic DNA from a variety of extraction methods. Second, the high mutation rate and small effective population size often make it an informative genome about evolutionary patterns and processes [27,28]. For a barcoding approach to species identification to succeed, however, within-species DNA sequences need to be more similar to one another than to sequences in different species. Several processes such as pseudogenes ontogenesis, introgressive hybridisation, and retention of ancestral polymorphism pose potential difficulties in capturing species boundaries using mtDNA sequences [29,30,31,32]. The detection of mixed genealogy between closely related species has been previously estimated to occur in nearly 20 percent of the cases in the wild [30]. Recent barcoding studies emphasised that this percentage can vary widely among phyla, yet species assignment failures typically do not exceed 5 to 10 percent in a large array of organisms [2].

The economic importance and identification challenges associated with fishes prompted the launch of an international Fish Barcoding of Life (FISH-BOL) initiative (http://www.fishbol.org/)

resource for the DNA barcoding community [3]. The BOLD database currently hosts specimen records for which, essentially, seven data elements are listed:

1. Species name
2. Voucher data
3. Collection record
4. Identifier of the specimen
5. COI sequence of at least 500 bp
6. PCR primers used to generate the amplicon
7. Trace files

The core data element in BOLD is a biphasic record consisting of both a "specimen page" and a "sequence page" (Figure 1). Access to these pages is possible through a direct link in the project console (1 in Figure 1) that includes a comprehensive list of all specimens included in the project. The specimen page (2 in Figure 1) assembles varied data about the source of each specimen, including the specimen's donor and identifier, taxonomy, collection data (including geospatial coordinates and digital images), and the repository and catalog number of the voucher specimen. Each specimen page is coupled to a sequence page (3 in Figure 1) that records the barcode sequence (FASTA format), PCR primers and trace files, amino acid translation, and ultimately the GenBank accession number as well. Information from both the specimen and sequence pages can be incorporated into taxon ID trees that can be used in the identification system, while onboard mapping functions support investigations into spatial molecular ecology.

After preparing the barcode records in BOLD, data were uploaded into GenBank. Appendix S1 provides the voucher specimen ID, BOLD specimen record number, and GenBank accession number for each record. The Consortium for the Barcode of Life,