Novel Molecular Resources to Facilitate Future Genetics Research on Freshwater Mussels (Bivalvia: Unionidae)
Total Page:16
File Type:pdf, Size:1020Kb
data Data Descriptor Novel Molecular Resources to Facilitate Future Genetics Research on Freshwater Mussels (Bivalvia: Unionidae) Nathan A. Johnson 1,* and Chase H. Smith 2,3,* 1 Wetland and Aquatic Research Center, U.S. Geological Survey, 7920 NW 71st Street, Gainesville, FL 32653, USA 2 Biology Department, Baylor University, Waco, TX 76798, USA 3 Department of Integrative Biology, University of Texas at Austin, 2415 Speedway, Austin, TX 78712, USA * Correspondence: [email protected] (N.A.J.); [email protected] (C.H.S.) Received: 8 July 2020; Accepted: 28 July 2020; Published: 30 July 2020 Abstract: Molecular data have been an integral tool in the resolution of the evolutionary relationships and systematics of freshwater mussels, despite the limited number of nuclear markers available for Sanger sequencing. To facilitate future studies, we evaluated the phylogenetic informativeness of loci from the recently published anchored hybrid enrichment (AHE) probe set Unioverse and developed novel Sanger primer sets to amplify two protein-coding nuclear loci with high net phylogenetic informativeness scores: fem-1 homolog C (FEM1) and UbiA prenyltransferase domain-containing protein 1 (UbiA). We report the methods used for marker development, along with the primer sequences and optimized PCR and thermal cycling conditions. To demonstrate the utility of these markers, we provide haplotype networks, DNA alignments, and summary statistics regarding the sequence variation for the two protein-coding nuclear loci (FEM1 and UbiA). Additionally, we compare the DNA sequence variation of FEM1 and UbiA to three loci commonly used in freshwater mussel genetic studies: the mitochondrial genes cytochrome c oxidase subunit 1 (CO1) and NADH dehydrogenase subunit 1 (ND1), and the nuclear internal transcribed spacer 1 (ITS1). All five loci distinguish among the three focal species (Potamilus fragilis, Potamilus inflatus, and Potamilus purpuratus), and the sequence variation was highest for ND1, followed by CO1, ITS1, UbiA, and FEM1, respectively. The newly developed Sanger PCR primers and methodologies for extracting additional loci from AHE probe sets have great potential to facilitate molecular investigations targeting supraspecific relationships in freshwater mussels, but may be of limited utility at shallow taxonomic scales. Dataset: https://doi.org/10.5066/P9Q3CFL5. Dataset License: CC0. This work was authored as part of the Contributor’s official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law. This is an open access article that has been identified as being free of known restrictions under copyright law, including all related and neighboring rights (https://creativecommons.org/publicdomain/mark/ 1.0/). You can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission. Keywords: anchored hybrid enrichment; conservation genetics; endangered species; mitochondrial DNA; nuclear DNA; primer design; Potamilus; Sanger sequencing Data 2020, 5, 65; doi:10.3390/data5030065 www.mdpi.com/journal/data Data 2020, 5, 65 2 of 12 1. Summary Recent studies utilizing molecular tools have been integral in resolving the evolutionary history of freshwater mussels (Bivalvia: Unionida), despite a heavy reliance on the Sanger sequencing of mitochondrial DNA (mtDNA). Nearly all phylogenetic studies on freshwater mussels have largely relied on the commonly used marker internal transcribed spacer 1 (ITS1) to incorporate inference from the nuclear genome [1–6], which was first used in freshwater mussels nearly 20 years ago [7]. Even though researchers continue to utilize this locus, the number of studies reporting issues related to excessive heterozygosity, primarily due to length polymorphisms, continues to increase [1,2,6,8–14]. Other nuclear loci, such as histone H3 and 28S, have been utilized in freshwater mussel phylogenetic studies; however, these markers are well known to show limited diversity at shallow taxonomic scales and have primarily been used to resolve deep level phylogeny [15–20]. In recent years, the decreasing costs of next-generation sequencing platforms have significantly increased the ability to generate molecular supermatrices in non-model taxa [21,22], including freshwater mussels [23,24]. In particular, the recently developed anchored hybrid enrichment (AHE) probe set Unioverse [23] has drastically improved the ability to resolve phylogeny in freshwater mussels. The Unioverse probe set consists of 811 protein-coding loci derived from genomic and transcriptomic resources across Bivalvia that can be captured across all freshwater mussels to resolve phylogenetic relationships. Despite the decreasing costs of next-generation sequencing, the utilization of AHE probe sets can be cost-prohibitive for small-scale projects or molecular investigations that incorporate hundreds of individuals to investigate intra- or interspecific relationships. However, AHE probe sets offer opportunities for the development of primers for the amplification and Sanger sequencing of select protein-coding loci that can be used for small-scale projects. Here, we evaluated the phylogenetic informativeness of loci in the Universe probe set and report the development of novel primer pairs for the amplification of two protein-coding nuclear genes fem-1 homolog C (FEM1) and UbiA prenyltransferase domain-containing protein 1 (UbiA). To demonstrate the utility of these markers and facilitate their use in future studies, we provide the PCR primer sequences, optimized PCR conditions and thermal cycling parameters, haplotype networks, DNA alignments, and summary statistics regarding sequence variation for the two protein-coding nuclear loci (FEM1 and UbiA) and three loci that are commonly used in studies in freshwater mussels: the mitochondrial genes cytochrome c oxidase subunit 1 (CO1) and NADH dehydrogenase subunit 1 (ND1), and the nuclear ITS1 locus. All five loci distinguish among the three focal species (Potamilus fragilis, Potamilus inflatus, and Potamilus purpuratus) and should be amplifiable across the subfamily Ambleminae. The observed sequence variation was highest for ND1, followed by CO1, ITS1, UbiA, and FEM1, respectively (Figure1). We also provide the detailed methodology used in the marker selection to expedite the identification of additional candidate loci and primer development from available AHE data. The newly developed Sanger PCR primers and methodologies for extracting additional loci have great potential to facilitate molecular investigations targeting supraspecific relationships in freshwater mussels, but may be of limited utility at shallow taxonomic scales. Data 2020, 5, 65 3 of 12 Data 2020, 5, x FOR PEER REVIEW 10 of 13 FigureFigure 1. TCS 1. TCS haplotype haplotype networks networks for for (A ()A cytochrome) cytochrome c c oxidaseoxidase subunit 1 1 (CO1), (CO1), (B (B) fem-1) fem-1 homolog homolog C (FEM1),C (FEM1), (C) internal (C) internal transcribed transcribed spacer spacer 1 (ITS1), 1 (ITS1), (D )(D NADH) NADH dehydrogenase dehydrogenase subunitsubunit 1 1 (ND1), (ND1), and and (E) (E) UbiA prenyltransferase domain-containing protein 1 (UbiA). Each colored circle represents a UbiA prenyltransferase domain-containing protein 1 (UbiA). Each colored circle represents a unique unique haplotype, the colors correspond to individual species, the black circles represent unsampled haplotype, the colors correspond to individual species, the black circles represent unsampled haplotypes, haplotypes, and the hash marks indicate nucleotide differences between haplotypes. and the hash marks indicate nucleotide differences between haplotypes. Table 4. Summary of diversity indices based on all five loci utilized in this study. Abbreviations and 2. Data Description symbols are as follows: cytochrome c oxidase subunit 1 (CO1); fem-1 homolog C (FEM1); internal transcribed spacer 1 (ITS1); NADH dehydrogenase subunit 1 (ND1); UbiA prenyltransferase domain-containing protein 2.1. Specimen Details 1 (UbiA); sample size (n); nucleotide diversity (π); number of haplotypes (nh); number of segregating All thesites metadata (S); and number related of parsimony-informative to the specimensused sites (P). in this study, including the collection location, GPS coordinates, and museumLocus catalogn numbers,π are providednh (https://Sdoi.org /10.5066P /P9Q3CFL5)[25]. CO1 28 0.047864 14 82 78 2.2. Molecular Data FEM1 28 0.003431 3 4 4 We present the DNAITS1 sequence28data from0.039169 five markers:6 the333 mitochondrial 31 genes CO1 and ND1, the nuclear non-codingND1 marker28 ITS1, and0.062111 the protein-coding13 146 nuclear genes138 FEM1 and UbiA. UbiA 28 0.004002 4 8 7 Our five-locus DNA alignment consisted of 3368 bp of mitochondrial and nuclear sequence data (CO14.= User657 bp;Notes ND1 = 900 bp; FEM1 = 501 bp; UBiA = 765 bp; ITS1 = 545 bp). The number of loci sequenced for each individual varies from two to five loci, with all loci available for 28 individuals (Table1). TheAll the specific data and sample metadata sizes described for each locusin this are study as follows:are at https://doi.org/10.5066/P9Q3CFL5 CO1 ( n = 102); ND1 (n = 103); [1], FEM1 and all the novel GenBank accessions for this study were as follows: CO1: MT662002–MT662099; (n = 29); UBiA (n = 29); and ITS1 (n = 31). A subset of individuals was chosen for the additional nDNA FEM1: MT669773–MT669799; ITS1: