
Molecular Ecology Resources (2013) doi: 10.1111/1755-0998.12091

DNA barcoding at riverscape scales: assessing biodiversity among fishes of the genus Cottus (Teleostei) in northern Rocky Mountain streams

MICHAEL K. YOUNG, KEVIN S. MCKELVEY, KRISTINE L. PILGRIM and MICHAEL K. SCHWARTZ
U.S. Forest Service, Rocky Mountain Research Station, 800 East Beckwith Avenue, Missoula, MT 59801, USA

Abstract

There is growing interest in broad-scale biodiversity assessments that can serve as benchmarks for identifying ecological change. Genetic tools have been used for such assessments for decades, but spatial sampling considerations have largely been ignored. Here, we demonstrate how intensive sampling efforts across a large geographical scale can influence identification of taxonomic units. We used sequences of mtDNA cytochrome c oxidase subunit 1 and cytochrome b, analysed with maximum parsimony networks, maximum-likelihood trees and genetic distance thresholds, as indicators of biodiversity and identity among the taxonomically challenging fishes of the genus Cottus in the northern Rocky Mountains, USA. Analyses of concatenated sequences from fish collected in all major watersheds of this area revealed eight groups with species-level differences that were also geographically circumscribed. Only two of these groups, however, were assigned to recognized species, and these two assignments resulted in intraspecific genetic variation (>2.0%) regarded as atypical for individual species. An incomplete inventory of individuals from throughout the geographical ranges of many species represented in public databases, as well as sample misidentification and a poorly developed taxonomy, may have hampered species assignment and discovery. We suspect that genetic assessments based on spatially robust sampling designs will reveal previously unrecognized biodiversity in many other taxa.

Keywords: Cottidae, cryptic species, sculpins, species discovery

Received 19 December 2012; revision received 24 January 2013; accepted 6 February 2013

Correspondence: Michael K. Young, Fax: +1 406 543 2663; E-mail: [email protected]

Published 2013. This article is a U.S. Government work and is in the public domain in the USA.

Introduction

Projections of a rapidly changing climate and increasing human population in North America have led to calls for broad-scale biodiversity assessments that can serve as benchmarks for identifying ecological change. Assessing biological diversity requires identifying taxa of interest and describing their distributions. Species-level diversity has long been the primary metric by which biodiversity is measured, in part because organisms at the species level are often easily identified on the basis of their morphology, behaviour or acoustics. In many countries, diversity at levels below that of species is neglected with regard to conservation (Laikre 2010), but to some degree that reflects the difficulty in cataloguing variation at lower taxonomic levels when using traditional methods. Increasingly, genetic tools are permitting more detailed and accurate assessments of biodiversity because of their ability to identify conservation units within species and resolve cryptic species complexes (Bickford et al. 2007; Valentini et al. 2009).

Among the largest ongoing efforts to catalogue biodiversity are those predicated on DNA barcoding (Ratnasingham & Hebert 2007), which relies on the sequencing and comparison of a standardized portion of the genome—most often the cytochrome c oxidase subunit 1 (COI) region of mtDNA (Hebert et al. 2003a)—for species delineation and identification. Although initially viewed as highly controversial (Rubinoff 2006), it has subsequently proven to be effective in many circumstances (Teletchea 2010). Although nuclear DNA sequencing has certain advantages and is becoming more feasible for biodiversity assessment (Taylor & Harris 2012), the lack of recombination, rarity of indels, ease and low cost of amplifying and sequencing, and high mutation rates of mtDNA have favoured its use (Zink & Barrowclough 2008).

Nevertheless, assigning samples collected in biodiversity surveys to recognized species is often problematic. The first issue arises when sequences of samples are compared with those in reference collections. Public databases such as those maintained by the National Center for Biotechnology Information or the Barcode of Life Data system (BOLD) contain tens of millions of reference sequences for hundreds of thousands of species. Although these databases represent an enormous, publicly available catalogue of life, they suffer several shortcomings: incomplete taxonomic and geographical coverage (Nielsen & Matz 2006; Elias et al. 2007), inclusion of poor-quality sequences (Harris 2003) and misidentification of voucher specimens (Kvist et al. 2010). These can cause problems during the final step in the genetic assessment of biodiversity, which is to determine whether a sampled individual is a representative of an existing species or an undescribed one. Most examples of species identification from sequence data use distance-based methods and are founded on the observation that intraspecific genetic distances (generally <1%) tend to be much smaller than interspecific distances among congeners (Hebert et al. 2003b). Because minimum genetic distances of 1–3% are typical of species-level differences for many groups of vertebrates (Avise & Walker 1999; Ratnasingham & Hebert 2007; Ward 2009), this approach is generally accepted (Goldstein & DeSalle 2010), especially when corroborated by other methods (Ross et al. 2008; Zou et al. 2011). More controversial has been the use of distance-based methods for the discovery of new species (DeSalle et al. 2005). It has been argued that individuals with sequences that differ by some interspecific threshold (e.g. 2% among freshwater fishes; Ward 2009) from sequences of all other recognized forms may constitute new species (Hebert et al. 2004). Much of the critique of species discovery via this approach has focused on improved methods of calculation and selection of distance thresholds (Meyer & Paulay 2005; Meier et al. 2008; Collins et al. 2012) or use of alternatives such as character-based or model-based methods (Zou et al. 2011; Powell 2012). Some of the controversy is more fundamental, because somewhat arbitrary genetic distances are acceptable currency for species delineation only under certain hypotheses about what constitutes a species, for example, some variants of the phylogenetic species concept (Hausdorf 2011).

Surprisingly, the consequences of the geographical distribution of sampling for genetically based biodiversity assessments have been largely overlooked (Bergsten et al. 2012). Many studies rely on one or a few individuals of a species collected from a fraction of its distribution (Funk & Omland 2003; Meyer & Paulay 2005; Wiemers & Fiedler 2007). These individuals often reflect locations that were convenient to sample or were obtained as an adjunct to other activities, rather than resulting from a statistically robust sampling design. The field of phylogeography, however, is founded on the notion that ancient and contemporary geological and climatic events, combined with the vagility of an organism, constrain and direct landscape-scale—or for stream-dwelling organisms, riverscape-scale (Fausch et al. 2002)—genetic structure in most species (Avise 2000). Thus, sampling schemes and reference databases must account for this structure to permit reliable delineation of species or major lineages. This is especially important when relying on contrasts in genetic distances within and among taxa because broad-scale sampling often results in increased intraspecific variation and less certainty about interspecific distance thresholds, that is, the barcode gap (Fregin et al. 2012).

Fine-grained yet broad-scale sampling is critical for taxa that may exhibit limited dispersal and localized divergence, particularly among poorly studied groups. One such group comprises fishes in the genus Cottus, commonly known as sculpins, which are primarily freshwater, benthic species found in lakes, rivers and streams throughout the Northern Hemisphere. Sculpins are mid-trophic species that serve as prey for larger piscivores but also consume macroinvertebrates and the egg and larval stages of other fishes. Although they can represent the bulk of aquatic vertebrate biomass in small streams (Cheever & Simon 2009; Raggon 2010), their ecological significance is poorly understood. Contributing to this uncertainty is a lack of taxonomic clarity. Species in this genus are widely acknowledged as being among the most difficult freshwater fishes to identify (Jenkins & Burkhead 1994; Wydoski & Whitney 2003), in part because putatively diagnostic characteristics are geographically variable within species (McPhail 2007). Molecular analyses (Kinziger et al. 2005; Hubert et al. 2008; April et al. 2011) revealed that several species of North American Cottus contain deeply divergent lineages that might represent unrecognized species; thus, the identity and distribution of many members of this genus remain unclear. And because the lifetime home ranges of individual fish can be relatively small, for example, <250 m (Petty & Grossman 2007; Hudy & Shiflet 2009), divergence at small spatial scales may be the norm.

A region in which there has been little work on the biodiversity of sculpins includes the upper Columbia and Missouri River basins in northern Idaho and western Montana. Up to five species of sculpins—Cottus bairdii, Cottus beldingii, Cottus cognatus, Cottus confusus and Cottus rhotheus—have been thought to occur in small streams in this region, although there has been little consensus on their individual distributions (Lee et al. 1980; Wydoski & Whitney 2003; McPhail 2007). Confusion about the identity and distribution of sculpins in this region has also led to ambiguity in conservation
priorities. In Montana and part of British Columbia, C. confusus has at different times been regarded as either a species of special concern or as having never been present (Hendricks 1997; Committee on the Status of Endangered Wildlife in Canada (COSEWIC) 2010).

Our goal was to resolve the distribution and identity of members of this taxonomically difficult group from small streams throughout the U.S. Northern Rocky Mountains. First, we collected sculpins from a spatially comprehensive and randomly selected set of streams that represented all major river basins. Second, we sequenced two regions of the mitochondrial genome to permit comparisons with nearly all described species of North American freshwater sculpins in public databases. Third, we used consensus results from an array of methods to delineate and identify species.

Materials and methods

Sampling

Our data set consisted of sequences from samples collected in the field and sequences obtained from public sequence repositories. Field collections were made (using electrofishing) from 398 streams sampled from 2008 to 2011 on state and federal lands in the upper Columbia and Missouri River basins in northern Idaho and western Montana (Fig. 1). These streams formed part of the PACFISH/INFISH Biological Opinion Effectiveness Monitoring Program network (Kershner et al. 2004). This network comprises a random sample (Stevens & Olsen 1999) of about one-third of all 6th-code subbasins (area, 40–160 km²; Wang et al. 2011) with substantial federal ownership. Nearly all sample sites consisted of the first low-gradient stream reach on public land (Kershner et al. 2004). Captured fish were held briefly in buckets containing stream water. Before releasing them, we retained upper caudal fin clips (on chromatography paper; LaHood et al. 2008) of up to 10 specimens captured at each site, but made no attempt to identify sculpins in the field because most species cannot be easily distinguished, and we sometimes handled hundreds of individuals at each site. We captured sculpins in 187 streams and analysed 1–5 fish (generally 2) from 119 streams (and at two sites on five of these streams). Our intent was to include 2–4 streams representing each 4th-code subwatershed in this area (http://water.usgs.gov/GIS/regions.html; Table S1, Supporting information). All collections were made under scientific collection permits issued (to MKY) by Montana Fish, Wildlife and Parks and the Idaho Department of Fish and Game. All tissue specimens and extracted DNA were vouchered at the Wildlife Genetics Laboratory, Missoula, MT.

Fig. 1 Observations of Cottus in the upper Columbia River and Missouri River basins, Idaho and Montana, USA. These include (a) all sampled sites and those locations for which sculpins were sequenced and (b) the haplotype groups observed at each site.

DNA regions and phylogenetic analyses

Although sequences from many mtDNA regions have been used to identify fish species, we used COI and cytochrome b (cyt b) because their popularity (Ratnasingham & Hebert 2007; Page & Hughes 2010) provided the broadest coverage of sculpins in public databases. We sequenced the COI region for all sculpins (n = 236) in the sample (Table S1, Supporting information). A subsample of these (n = 120) that included most novel COI haplotypes was also sequenced at cyt b. GenBank accession numbers for these sequences are JX282572–282599 (for COI) and JX282526–282571 (for cyt b).

We used the QIAGEN DNeasy Blood and Tissue kit (QIAGEN Inc.) to extract genomic DNA from tissues, following the manufacturer’s instructions for tissue. The COI region was amplified with primers FF2d and FR1d
(Ivanova et al. 2007). The cyt b region was amplified with primers L14724 and H15915 (Schmidt & Gold 1993). Reaction volumes of 50 µL contained 50–100 ng DNA, 1× reaction buffer (Life Technologies), 2.5 mM MgCl2, 200 µM each dNTP, 1 µM each primer and 1 U Taq polymerase (Life Technologies). The PCR programme was 94 °C/5 min, [94 °C/1 min, 55 °C/1 min, 72 °C/1 min 30 s] × 34 cycles, 72 °C/5 min. The quality and quantity of template DNA were determined by 1.6% agarose gel electrophoresis. PCR products were purified with ExoSap-IT (Affymetrix-USB Corporation) according to manufacturer’s instructions.

We used the Big Dye kit and the 3700 DNA Analyzer (ABI; High Throughput Genomics Unit) to obtain DNA sequences. We used the given primers to generate DNA sequence data for COI, whereas we used internal primers LC1, LC2 and LC3 (Kinziger & Wood 2003) to generate cyt b sequences in three fragments. Sequences were viewed and aligned with Sequencher (Gene Codes Corp.).

The COI data set consisted of 620-bp sequences, which included 182 variable and 124 parsimony-informative sites. Mean nucleotide frequencies were A, 0.2274; T, 0.2979; C, 0.2958; and G, 0.1789. The 1034-bp cyt b data set consisted of 370 variable and 256 parsimony-informative sites, for which mean nucleotide frequencies were A, 0.2337; T, 0.2861; C, 0.3247; and G, 0.1555. We observed no indels or gaps, amino acid translations did not reveal any stop codons, and all sequences exceeded lengths typical of nuclear DNA of mitochondrial origin (Zhang & Hewitt 1996).

We used a two-step approach to delineate and identify species of sculpin from the field collections. First, we assigned field-collected samples to particular haplotype groups using three methods to analyse concatenated COI and cyt b sequences (n = 120; Table S1, Supporting information): 95% maximum parsimony networks, maximum-likelihood phylogenetic trees and pairwise genetic distances. We used TCS 1.21 (Clement et al. 2000) to construct 95% maximum parsimony networks. Independent networks were regarded as candidate species (Hart & Sunday 2007). Ninety-five per cent maximum parsimony networks tend to be conservative estimators of species diversity because the probability of pooling distinct taxa into one network can be relatively high, whereas the likelihood of splitting a single species into separate networks is low (Hart & Sunday 2007; Chen et al. 2010). Next, we constructed mid-point-rooted maximum-likelihood phylogenetic trees under the strict tree-based method (Ross et al. 2008) to delineate groups. The evolutionary model GTR + G + I yielded the lowest AICc and highest log-likelihood values for these data. Using this model, we constructed maximum-likelihood trees with 1000 bootstrap replicates incorporating subtree-prune-and-graft branch-swapping in MEGA 5.1 (Tamura et al. 2011). We inspected terminal clades in the majority consensus tree for reciprocal monophyly and for concurrence with the independent maximum parsimony networks. Because of incomplete lineage sorting, species can develop without exhibiting reciprocal monophyly (Funk & Omland 2003); thus, this method also tends to underestimate species richness. Finally, after assignments to putative individual species, we compared the maximum intragroup distance to the minimum intergroup distance to assess whether a barcode gap was evident and consistent (Meier et al. 2006). We used the absolute maximum intragroup distance as the threshold for species delineation. We based calculations on uncorrected p-distances because these have been shown to perform as well or better for detection of barcode gaps than the more broadly used Kimura-2-parameter model (Collins et al. 2012; Srivathsan & Meier 2012).

The second step involved repeating these analyses with the inclusion of public sequences to facilitate species identification. These consisted of all publicly available sequences of Cottus bairdii, Cottus beldingii, Cottus cognatus, Cottus confusus and Cottus rhotheus—those species believed to occupy the region we sampled—with unique haplotypes as well as single representatives of all other species of Cottus from North America present in public sequence repositories (Table S2, Supporting information). For these analyses, COI (n = 28 haplotypes from field sampling and 46 from public sources) and cyt b (n = 46 haplotypes from field sampling and 39 from public sources) sequences were examined separately because no individual in public databases was represented by sequences of both loci. After alignment, all sequences were trimmed to the same number of nucleotides. Sequences with ambiguous nucleotides were excluded from maximum parsimony networks because they can introduce errors in network structure (Joly et al. 2007) but were retained for other analyses. Following Kinziger et al. (2005), we used Leptocottus armatus as an outgroup in maximum-likelihood phylogenetic trees. Because the focus at this step was to assign each group to a described species, we adopted the liberal tree-based method with a distance threshold (Ross et al. 2008), that is, sequences that were sister to or within a monophyletic clade with a recognized species, and were less than the maximum intraspecific distance from that species, were identified as that species. We based distance thresholds on the absolute maximum intragroup distance for each marker–haplotype group combination for our field-collected samples, which led to a variable distance threshold. When haplotype groups were represented by a single sequence, we adopted the largest intragroup distance among all haplotype groups for that marker as the threshold distance.
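The distance calculations described above can be illustrated with a minimal Python sketch. It is not the software used in the study, and it assumes sequences that are already aligned and trimmed to equal length, with no ambiguous nucleotides. The function names (p_distance, summarize, thresholds), the group labels X and Y and the 20-bp toy haplotypes are hypothetical. The sketch computes uncorrected p-distances, the maximum intragroup and minimum intergroup distance for each group, and the per-group threshold with the single-haplotype fallback.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance (%): percentage of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and trimmed to equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * differences / len(seq_a)


def summarize(groups):
    """Map each group to (maximum intragroup, minimum intergroup) distance.

    The maximum intragroup distance is None for a single-haplotype group.
    """
    summary = {}
    for name, seqs in groups.items():
        intra = [p_distance(a, b) for a, b in combinations(seqs, 2)]
        inter = [
            p_distance(a, b)
            for other, other_seqs in groups.items()
            if other != name
            for a in seqs
            for b in other_seqs
        ]
        summary[name] = (max(intra) if intra else None, min(inter))
    return summary


def thresholds(summary):
    """Per-group threshold: the group's own maximum intragroup distance, or
    the largest intragroup distance among all groups when the group has a
    single haplotype."""
    observed = [intra for intra, _ in summary.values() if intra is not None]
    fallback = max(observed)
    return {
        name: intra if intra is not None else fallback
        for name, (intra, _) in summary.items()
    }


# Hypothetical 20-bp aligned haplotypes (not data from the study)
toy_groups = {
    "X": ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"],
    "Y": ["TTGTACGTACCCACGTACGT"],
}

summary = summarize(toy_groups)
limits = thresholds(summary)
# A fixed barcode gap exists only if every minimum intergroup distance
# exceeds every maximum intragroup distance.
gap_exists = min(inter for _, inter in summary.values()) > max(limits.values())
print(summary, limits, gap_exists)
```

The sketch needs at least two groups, one of which must contain two or more haplotypes; otherwise there is no intragroup distance from which to derive the fallback threshold.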


We interpreted consensus among the three methods and both markers—as well as geographical concordance—as evidence either (i) for the assignment of a haplotype group to a described species or (ii) that the group represented a lineage or species not represented in public databases.

Results

Species delineation

On the basis of the concatenated sequences, there was substantial concordance among methods with respect to the classification of field-sampled sculpins. Calculation of 95% maximum parsimony networks generated eight distinct networks of haplotypes (Fig. 2) that were also resolved as reciprocally monophyletic clades in maximum-likelihood trees (Fig. 3). Comparisons of all possible pairwise genetic distances revealed that minimum intergroup distances (mean, 4.18%) tended to exceed maximum intragroup distances (mean, 0.58%; Table 1). Nevertheless, overlap between the minimum intergroup distance (haplotype groups F–G, 1.09%) and maximum intragroup difference (haplotype group B, 1.21%) precluded designating a fixed threshold (or barcode gap) that would delineate all groups. The eight haplotype groups (A–H) delimited by the network- and tree-building approaches, however, were also geographically distinct; most were confined to particular river basins within the study area. Consequently, we regarded these eight groups as potential species in further analyses.

Fig. 2 Maximum parsimony networks of concatenated cytochrome c oxidase subunit 1 and cytochrome b oxidase sequences for sculpins collected in this study. Each circle represents a haplotype, and sizes are proportional to the number of individuals with that haplotype. Each line segment represents a single mutation, and small black dots represent unobserved haplotypes. Panels (a)–(h).

Species identification

Assignment of haplotype groups to known species depended on the region used and on the pool of species to which each haplotype group could be compared. Network analyses of COI sequences collapsed the haplotype groups into three networks, only two of which were unambiguous. Group B was connected to two haplotypes identified as Cottus beldingii. Haplotype groups D and E were joined in a single network unconnected to any other. The remaining haplotype groups were joined in a single complex network that included public sequences of Cottus bairdii, Cottus caeruleomentum, Cottus cognatus and Cottus rhotheus. In contrast, analyses of cyt b sequences returned seven unambiguous networks. Again, haplotype group B was connected to a public sequence identified as Cottus beldingii. The remaining networks consisted of one (A, C, F, G, H) or two (D–E) haplotype groups and were unconnected to sequences of any recognized species.

Similarly, maximum-likelihood trees based on COI sequences were less well resolved and supported than those based on cyt b sequences, but the topologies produced by both regions were comparable with regard to their sister taxa and to monophyly of the haplotype groups (Fig. 4). In the COI-based tree, bootstrap support was high (≥86%) for all terminal clades containing
haplotype groups, and half of the groups were reciprocally monophyletic from all other sculpins. Applying the distance threshold (Table 2) resulted in identifying group B as C. beldingii, combining groups D and E as a single unidentified taxon and combining groups F and G and identifying them as C. rhotheus. The remaining groups were not assigned to any recognized species. In the cyt b-based tree, all haplotype groups were in terminal clades with very high bootstrap support (≥97%). All were reciprocally monophyletic other than group B, which was again identified as C. beldingii. The other seven haplotype groups were too distant to be combined with recognized species or one another and were regarded as unidentified species.

Fig. 3 Maximum-likelihood phylogeny of concatenated cytochrome c oxidase subunit 1 and cytochrome b oxidase sequences for sculpins collected in this study. Numbers on the branches are support with respect to 1000 bootstrap replications. Only values ≥70 are shown. Clades are labelled A–H.

Comparisons of pairwise genetic distances indicated differentiation among haplotype groups and most recognized species (Table 2). The nearest neighbour of each haplotype group was inconsistent; for only three of the eight groups was the same nearest neighbour identified in analyses with both markers. Intragroup genetic distances tended to be larger for cyt b than for COI sequences (means, 0.59% vs. 0.44%). In all comparisons, genetic distances to the nearest neighbour were consistently greater with cyt b than with COI.
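The distance half of the COI identification rule can be checked against the values reported in Table 2 with a short Python sketch. This is an illustration under stated assumptions, not the authors' procedure: it tests only whether the minimum distance from a haplotype group to its nearest recognized species is below that group's threshold, it ignores the tree-placement criterion, and it omits comparisons among haplotype groups (such as D and E). The names max_intragroup, nearest_species and candidate_assignments are hypothetical; the numbers are transcribed from the COI columns of Table 2.

```python
# Maximum intragroup COI p-distances (%) from Table 2.
# None marks groups with a single COI haplotype (NA in the table).
max_intragroup = {
    "A": 0.32, "B": 1.13, "C": 0.32, "D": 0.16,
    "E": 0.97, "F": None, "G": None, "H": 0.65,
}

# Nearest recognized species in COI and the minimum distance (%) to it.
nearest_species = {
    "A": ("Cottus hubbsi", 0.48),
    "B": ("Cottus beldingii", 0.00),
    "C": ("Cottus caeruleomentum", 0.81),
    "D": ("Cottus hubbsi", 4.03),
    "E": ("Cottus hubbsi", 4.19),
    "F": ("Cottus rhotheus", 0.65),
    "G": ("Cottus rhotheus", 0.48),
    "H": ("Cottus rhotheus", 1.94),
}


def candidate_assignments(max_intragroup, nearest_species):
    """Return groups whose nearest recognized species lies within threshold.

    Single-haplotype groups use the largest intragroup distance observed
    among all groups for the marker as their threshold.
    """
    observed = [d for d in max_intragroup.values() if d is not None]
    fallback = max(observed)
    result = {}
    for group, (species, min_distance) in nearest_species.items():
        limit = max_intragroup[group]
        if limit is None:
            limit = fallback
        if min_distance < limit:
            result[group] = species
    return result


assignments = candidate_assignments(max_intragroup, nearest_species)
print(assignments)
```

Under these assumptions, only groups B, F and G fall within threshold of a recognized species, which agrees with the COI assignments to C. beldingii and C. rhotheus reported above.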


Table 1 Pairwise distance matrix (uncorrected p-distances, %) for concatenated sequences of field-sampled sculpins

      A       B       C       D       E       F       G       H
A*    0.48†
B     4.95    1.21
C     1.87    4.77    0.24
D     4.04    5.97    4.53    0.18
E     4.28    6.16    4.71    1.27    0.97
F     2.78    6.64    2.96    5.43    5.49    0.12
G     2.90    6.22    2.90    5.85    5.91    1.09    NC‡
H     2.35    5.19    2.60    4.47    4.71    3.44    3.44    0.84

*We based group labels on independent 95% maximum parsimony networks and reciprocally monophyletic clades from maximum-likelihood trees.
†Values on the diagonal are maximum intragroup distances and off-diagonal values are minimum intergroup distances.
‡NC indicates no comparison because only one haplotype was observed.

Discussion

Delineation and identification of haplotype groups

Our spatially comprehensive sample likely provided a thorough assessment of haplotype diversity in stream-dwelling sculpins in the upper Columbia and Missouri River basins. In general, there was consensus among the three approaches and two markers for the existence of eight lineages that differed at levels typical of separate freshwater fish species (Ward 2009). The complex geological and climatic history in this region—involving the Cascadian orogeny, active volcanism, redirection of major river basins and Pleistocene glaciation—has led to diversification in an array of taxa (Shafer et al. 2010), so this diversity may be unsurprising. Quite surprising, however, was that the majority of these groups did not assign to recognized species, despite North American fishes having been generally well inventoried in genetic surveys (Hubert et al. 2008; April et al. 2011). Below, we briefly describe the likely taxonomic position of each group and its distribution.

Group A was the only sculpin in the upper Missouri River basin and was also present in some portions of the Columbia River basin immediately adjacent to the Continental Divide. Although historically assigned to Cottus bairdii (Holton & Johnson 2003), sculpins from this portion of the study area are now believed to represent an unrecognized species (Neely 2003; McPhail 2007). Our results support this interpretation, but with a caveat: it may represent a range extension for Cottus hubbsi. Both phylogenetic trees suggest that the closest relatives of group A comprise those sculpins also formerly thought to be C. bairdii—C. bairdii punctulatus, C. b. semiscaber, Cottus bendirei, Cottus extensus and C. hubbsi (Neely 2003). Because one individual of the latter species had a very similar COI haplotype, comprehensive sampling within its range should inform taxonomic placement of fish in group A.

Group B was found exclusively within the Clearwater River basin and aligned with some public sequences of Cottus beldingii in all analyses. Although we identify group B as likely members of that species, this assignment is problematic. Late in the 20th century, sculpins morphologically identified as C. beldingii were collected at locations close to those we sampled (Maughan 1976). Fish from one of these locations, however, were once described as a separate species, Cottus tubulatus (Hubbs & Schultz 1932). Furthermore, haplotypes in this group—from the northernmost distribution of C. beldingii—were separated by genetic distances of 2.70–4.52% from those of C. beldingii present in the upper Snake River, Great Basin and Willamette River. This high intraspecific divergence warrants taxonomic attention, particularly given that the type location for this species is Lake Tahoe, Nevada (Lee et al. 1980).

Group C was found throughout the Pend Oreille and lower Kootenai River basin. Although sculpins from this area have traditionally been recognized as Cottus cognatus (Wydoski & Whitney 2003; McPhail 2007), public sequences labelled as C. cognatus were neither the nearest neighbour to this group (based on distance) nor among the most closely related (based on placement in the phylogenetic trees). Because the range boundary of C. cognatus in eastern North America is 500–1500 km distant (Lee et al. 1980), it is unlikely that intermediate haplotypes exist that would cause this species and group C to join in phylogenetic networks or trees. Moreover, a C. cognatus haplotype from extreme eastern Siberia also differs greatly (2.80–2.99%) from group C. Consequently, we regard this group as a potentially undescribed species.

Groups D and E were found in overlapping portions of the upper Clearwater River basin, were most similar to one another in all analyses and were sometimes joined as a single group. Their nearest neighbour in any analysis is a single sequence of C. confusus collected from near the type location of this species in the Salmon River near Ketchum, Idaho (Lee et al. 1980), but this haplotype is relatively distant (2.22–2.70%) from either group. Nevertheless, C. confusus has been described as the only sculpin present in the upper Clearwater River (Maughan 1976). Interpreting the identity of groups D and E as part of a highly variable species or as independent, unnamed lineages will require analyses of additional samples from throughout the historical range of C. confusus.

The distribution of groups F and G is consistent with that of C. rhotheus (Maughan et al. 1980; Wydoski
H1 H4 (a) A1 (b) 94 H3 74 A4 H2 90 A3 99 H5 A2 90 H6 C. hubbsi H9 H7 C. girardi 94 C. bairdii 8 H8 91 100 F1 C. rhotheus 1 G1 C. rhotheus 2 89 G1 C. rhotheus 1 99 F3 H1 99 F1 97 99 H2 F2 H3 97 C. baileyi C. bairdii 7 C. carolinae C. cognatus 4 C. cognatus 1 89 C. cognatus 3 95 C. cognatus 3 70 C. cognatus 1 99 C. bendirei C. cognatus 2 C. hubbsi C. cognatus 5 98 C. bairdii semiscaber C. hypselurus C. bairdii punctulatus C. rhotheus 2 C. extensus 92 A8 C. bairdii 9 A9 C. bairdii 2 A7 C. bairdii 5 A1 86 75 C. bairdii 12 99 A3 92 C. bairdii 14 A2 89 C. cognatus 6 89 A4 C. cognatus 7 A5 86 C. bairdii 15 81 A6 98 C. bairdii 1 C. bairdii 16 81 C. bairdii 1 C. bairdii kumlieni C. bairdii 4 C. caeruleomentum C. bairdii 3 C. cognatus 2 C. bairdii 2 C. bairdii 10 C. girardi C7 C13 C2 C9 C5 C5 C1 100 C6 C4 C11 C3 C8 C8 C12 77 C6 C1 C. bairdii 13 C2 C4 90 C. bairdii 6 C. bairdii 11 C3 91 C7 C. caeruleomentum C10 C. carolinae C. hypselurus C. cf. bairdii 70 C. cf. hypselurus C. chattahoochee C. bairdii 3 99 C. tallapoosae C. bairdii 5 99 D1 99 C. bairdii 4 100 D2 82 C. bairdii bairdii 99 E4 C. confusus D1 E3 99 82 E1 D2 100 D3 82 E2 93 C. beldingii 1 E4 98 E3 C. beldingii 2 E1 98 B4 99 94 76 E2 B5 C. greenei B2 99 C. leiopomus B1 C. beldingii 2 89 C. beldingii 3 97 98 B3 B3 99 B4 71 C. beldingii 4 C. beldingii 1 99 B1 89 B2 0.01

0.01

Fig. 4 Maximum-likelihood phylogeny for sculpins. These represent fish collected during this study and species of Cottidae from North America represented in GenBank for (a) cytochrome c oxidase subunit 1 (COI) and (b) cytochrome b oxidase (cyt b)sequences.Numbers on the branches are support with respect to 1000 bootstrap replications. Only values  70 are shown. The outgroup (Leptocottus armatus) and distantly related Cottus of the C. asper clade (C. aleuticus, C. asper, C. asperrimus, C. gulosus, C. klamathensis, C. marginatus, C. perplexus, C. pitensis, C. princeps, and C. tenuis; Kinziger et al. 2005) and C. ricei are not shown.

& Whitney 2003; McPhail 2007). In addition, COI haplotypes of C. rhotheus were more similar to those of F and G than they were to one another, and the type location (Spokane River Falls, Washington) was relatively close to some of our sampling sites. Consequently, we regard these groups as possible members of this species, but note that this results in an intraspecific genetic distance among all available samples (at cyt b) exceeding 2.5%, a threshold generally associated with species-level differences.

Group H was distributed throughout the Coeur d’Alene and St. Joe River basins in Idaho, with a disjunct range in the central Clark Fork River basin in Montana. This distribution is not consistent with that of any described species. In some treatments, sculpins from near the Idaho sampling locations were identified as C. confusus (Bailey & Bond 1963), but members of group H differ by more than 4.05% from that species (and groups D and E) and are unrelated on the basis of our analyses. Given the large differences between group H and all other species and haplotype groups of sculpins, we consider it likely that this group constitutes an undescribed species.

Did genetically based biodiversity assessment succeed?

We regard the characterization of sculpin biodiversity in the U.S. northern Rocky Mountains as a partial success. Genetic analyses of spatially comprehensive collections of Cottus revealed eight geographically and genetically delineated groups of sculpins that could be regarded as distinct taxa. Nevertheless, we were able to assign (and


Table 2 Pairwise genetic distances (%) within haplotype groups (maxima, bolded in the typeset original) and to their three nearest neighbours (range)

Haplotype group  Nearest neighbour group     COI         Nearest neighbour group     cyt b

A                (within group)              0.32*       (within group)              0.77
                 Cottus hubbsi               0.48–0.65   Cottus extensus             1.54–1.83
                 C                           0.97–1.29   Cottus bairdii punctulatus  1.64–1.93
                 Cottus cognatus             1.13–2.10   C. cognatus                 1.74–2.70
B                (within group)              1.13        (within group)              1.26
                 Cottus beldingii            0.00–4.52   C. beldingii                0.77–3.19
                 C                           3.87–4.84   Cottus leiopomus            4.16–4.83
                 C. hubbsi                   4.52–4.84   Cottus bairdii punctulatus  4.83–5.02
C                (within group)              0.32        (within group)              0.29
                 Cottus caeruleomentum       0.81–1.13   C. bairdii punctulatus      2.41–2.61
                 A                           0.97–1.29   A                           2.41–2.90
                 Cottus bairdii              0.97–1.94   C. extensus                 2.51–2.70
D                (within group)              0.16        (within group)              0.19
                 E                           0.97–1.94   E                           1.26–1.83
                 C. hubbsi                   4.03–4.19   Cottus confusus             2.22–2.41
                 A                           4.19–4.52   A                           3.86–4.34
E                (within group)              0.97        (within group)              1.06
                 D                           0.97–1.94   D                           1.26–1.83
                 C. hubbsi                   4.19–4.84   C. confusus                 2.32–2.70
                 C                           4.19–5.00   C. cognatus                 3.67–5.14
F                (within group)              NA†         (within group)              0.19
                 Cottus rhotheus             0.65–2.90   G                           1.06–1.16
                 G                           1.13        Cottus rhotheus             2.51–2.70
                 C. bairdii                  1.77–3.55   A                           3.19–3.67
G                (within group)              NA          (within group)              NA
                 C. rhotheus                 0.48–2.74   F                           1.06–1.16
                 F                           1.13        C. rhotheus                 2.41–2.51
                 C                           1.77–1.94   A                           3.28–3.57
H                (within group)              0.65        (within group)              0.97
                 C                           1.61–2.26   C. bairdii punctulatus      2.22–2.61
                 A                           1.94–2.74   C. extensus                 2.22–2.61
                 C. rhotheus                 1.94–3.23   C. bairdii semiscaber       2.32–2.70
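
The kinds of entries reported in Table 2 (a within-group maximum and a range of distances to a neighbouring group, both as uncorrected p-distances) can be illustrated with a minimal Python sketch. The function names and the toy sequences are hypothetical and are not data from this study. The sketch also assumes that sites with gaps or ambiguity codes are skipped pairwise, which may differ from the settings used to produce the published values.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Uncorrected p-distance (%): share of compared sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    valid = set("ACGT")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption of this sketch: skip gaps and ambiguity codes pairwise.
        if a in valid and b in valid:
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared

def within_group_max(haplotypes):
    """Largest pairwise distance in a group; None for a single haplotype (NA)."""
    if len(haplotypes) < 2:
        return None
    return max(p_distance(a, b) for a, b in combinations(haplotypes, 2))

def distance_range(group, other):
    """(minimum, maximum) of all pairwise distances between two groups."""
    distances = [p_distance(a, b) for a in group for b in other]
    return min(distances), max(distances)

# Hypothetical toy alignment of 20 sites (not sequences from the study).
group_x = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
group_y = ["ACGTTCGTACGAACGTACGT"]

print(within_group_max(group_x))         # within-group maximum for X
print(within_group_max(group_y))         # None: single haplotype, i.e. NA
print(distance_range(group_x, group_y))  # range between X and Y
```

Comparing such within-group maxima against a fixed cut-off (for example the 2.5% value discussed for cyt b) is one simple way to flag groups whose internal divergence approaches species-level differences.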

*Values are from uncorrected p-distances.
†NA indicates that no intragroup distance could be calculated because a haplotype group was represented by a single haplotype.

tenuously at best) few of the haplotype groups to a recognized species on the basis of genetic sequences. Is it possible that we have discovered up to eight previously undescribed taxa (e.g. species or subspecies) or is the current approach to genetically based identification suspect?

A primary tenet of such identification is that all species that are potentially related to an unknown sample are available for comparison. We expected to meet this criterion because we examined sequences from nearly all named species of Cottus in North America and because barcode coverage of North American freshwater fishes is regarded as nearly complete (Hanner et al. 2011). Although it is plausible that the groups we observed constitute novel taxa, representatives of sculpins in public databases from the region we sampled were almost nil, which we regard as indicative of a more fundamental issue. Current protocols for global, genetically based species inventories aim for five individuals of most species, and up to 25 individuals of widely distributed taxa (Steinke & Hanner 2011). If one presupposes that intraspecific genetic variation is uniformly low, a few samples of an organism, perhaps even one, would be adequate for species diagnosis or discovery. But characterizing biodiversity at and below the level of species appears to require several-fold larger samples (Bergsten et al. 2012). Moreover, we observed geographically complex and highly divergent genetic structure among many of our haplotype groups (Fig. 2), which may be common among organisms that exhibit limited vagility or occupy patchily distributed habitats (Hammer et al. 2010) in environments of varying stability over evolutionary time. In such instances, reference collections of convenience samples collected at coarse scales will misrepresent taxonomic diversity by underestimating intraspecific genetic diversity and failing to include novel lineages that constitute new species.

Improving this situation requires progress on several fronts. Foremost is the addition of data sets derived from

systematic, fine-scale sampling. Based both on sculpin biology and on the wide genetic divergence between reference samples putatively belonging to the same species, we believe that river-basin-level genetic structuring of sculpins (or any taxa with comparable life histories) is likely across North America, particularly in regions that have not been recently glaciated (Bernatchez & Wilson 1998). We echo the suggestion that a thorough revision of the taxonomy of sculpins is warranted (Kinziger et al. 2005), but emphasize that far more attention to the spatial scale and grain of sampling will be necessary to achieve a robust phylogeny, if for no other reason than such sampling is more likely to collect all extant species (Hillis et al. 2003).

Correct assignment to described species or discovery of new species also requires that reference specimens are correctly identified. The extent of divergence within some species of sculpin, for example, C. bairdii, C. beldingii, C. confusus and C. rhotheus, suggests that multiple taxa may be concealed under these species designations (April et al. 2011). Although this may reflect an imprecise taxonomy, misidentification of specimens also seems likely and plagues many samples in reference collections (Becker et al. 2011; Cerutti-Pereyra et al. 2012). A system of tissue or DNA vouchering with precise georeferencing of the locations of collection, as now required for submission to some sequence repositories (Ratnasingham & Hebert 2007; Puillandre et al. 2012), would permit resequencing of problematic individuals or recollection at those sites and begin to establish geographical boundaries for haplotype groups. And particularly for species that are genetically structured at small spatial scales, a critical adjunct to this process may be to sequence individuals from the type locations of many species with an aim towards assigning a reference sequence to an individual representative of the holotype (Kvist et al. 2010; Lowenstein et al. 2011).

Finally, the methods of species discovery and the markers used for it merit further attention. Shortcomings of individual genetic methods and markers have been widely reported (Meier et al. 2006), and we observed similar issues. For example, one weakness of most network- or tree-based methods is that they fail to discriminate between closely related species (Hart & Sunday 2007). This was particularly evident among 95% maximum parsimony networks, and to a lesser extent maximum-likelihood trees, based on COI sequences. In contrast, both methods yielded comparable results (a high degree of haplotype group discrimination) when based on cyt b sequences. Because COI sequences are below the threshold at which adding more nucleotides greatly increases and stabilizes phylogenetic resolution (c. 1000–1200 nucleotides; Pollock et al. 2002; Roe & Sperling 2007), relying solely on COI as a biodiversity tag, at least for this group of fishes, appears problematic. An additional critique has been that networks and trees are probabilistic methods and that diagnostic portions of a sequence, that is, one or more nucleotide-location combinations that are fixed and unique, would provide a more definitive assignment of species identity and be more consistent with historical taxonomic practice (DeSalle et al. 2005; Zou et al. 2011). Yet, until multiple individuals from throughout the geographical range of each species have been examined, character sets that are truly diagnostic will remain speculative (Elias et al. 2007). Finally, we regard mtDNA-based delineations of haplotype groups as evidence for particular hypotheses. Analyses with nuclear DNA are essential to confirm or refute these hypotheses and to examine the influence of ancient or modern introgression on these groups and on difficulties in morphological species identification (Nolte et al. 2009).

Is species identification necessary?

Strictly speaking, genetic analyses of diversity do not require taxonomic assignment of the groups that constitute components of that diversity. Yet, assignment to a taxonomic group, typically a species, has heuristic value for studies of , phylogeography and ecology, as well as being a prerequisite for designation as warranting conservation attention (Turner 1999; Mace 2004). Consequently, the designation of provisional taxa is widely practised (Blaxter et al. 2005; April et al. 2011), although its legitimacy is sometimes questioned because of disagreements about defining meaningful conservation units (Valentini et al. 2009; Funk et al. 2012) or even what constitutes a species (de Queiroz 2007; Frankham et al. 2012). Also, these taxonomic placeholders function as an ad hoc form of DNA taxonomy, which in its fullest expression as a replacement for the Linnaean system has been roundly criticized by many in the taxonomic community (Seberg et al. 2003). Nevertheless, we agree with the widespread consensus that they fill a need by highlighting potentially valid taxa (Janzen et al. 2009; Goldstein & DeSalle 2010) and that a more formal system for contributing and recognizing this diversity is urgently needed (Maddison et al. 2012) to ensure that these quasitaxonomic entities have standing until integrated taxonomic practices provide a more thorough treatment of their position in the tree of life (Padial et al. 2010).

Acknowledgements

We thank S. Adams and D. Schmetterling for piquing our interest in the investigation of sculpins in this region. P. Unmack, R. Collins and M. Knight provided helpful comments on an earlier version of this manuscript.


References

April J, Mayden RL, Hanner RH, Bernatchez L (2011) Genetic calibration of species diversity among North America’s freshwater fishes. Proceedings of the National Academy of Sciences of the USA, 108, 10602–10607.
Avise JC (2000) Phylogeography: The History and Formation of Species. Harvard University Press, New Haven, Connecticut.
Avise JC, Walker D (1999) Species realities and numbers in sexual vertebrates: perspective from an asexually transmitted genome. Proceedings of the National Academy of Sciences of the USA, 96, 992–995.
Bailey RM, Bond CE (1963) Four new species of freshwater sculpins, genus Cottus, from western North America. Occasional Papers of the Museum of Zoology, University of Michigan, 634, 1–27.
Becker S, Hanner R, Steinke D (2011) Five years of FISH-BOL: brief status report. Mitochondrial DNA, 22(Suppl 1), 3–9.
Bergsten J, Bilton DT, Fujisawa T et al. (2012) The effect of geographical scale of sampling on DNA barcoding. Systematic Biology, 61, 851–869.
Bernatchez L, Wilson CC (1998) Comparative phylogeography of Nearctic and Palearctic fishes. Molecular Ecology, 7, 431–452.
Bickford D, Lohman DJ, Sodhi NS et al. (2007) Cryptic species as a window on diversity and conservation. Trends in Ecology and Evolution, 22, 148–155.
Blaxter M, Mann J, Chapman T et al. (2005) Defining operational taxonomic units using DNA barcode data. Proceedings of the Royal Society of London, Series B: Biological Sciences, 360, 1935–1943.
Cerutti-Pereyra F, Meekan MG, Wei NWV, O’Shea O, Bradshaw CJA (2012) Identification of rays through DNA barcoding: an application for ecologists. PLoS ONE, 7, e36479.
Cheever BM, Simon KS (2009) Seasonal influence of brook trout and mottled sculpin on lower trophic levels in an Appalachian stream. Freshwater Biology, 54, 524–535.
Chen H, Strand M, Norenburg JL et al. (2010) Statistical parsimony networks and species assemblages in cephalotrichid nemerteans (Nemertea). PLoS ONE, 5, e12885.
Clement M, Posada D, Crandall KA (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657–1659.
Collins RA, Boykin LM, Cruickshank RH, Armstrong KF (2012) Barcoding’s next top model: an evaluation of nucleotide substitution models for specimen identification. Methods in Ecology and Evolution, 3, 457–465.
Committee on the Status of Endangered Wildlife in Canada (COSEWIC) (2010) COSEWIC Assessment and Status Report on the Shorthead Sculpin Cottus confusus in Canada. Committee on the Status of Endangered Wildlife in Canada, Ottawa, Ontario.
DeSalle R, Egan MG, Siddall M (2005) The unholy trinity: taxonomy, species delimitation and DNA barcoding. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 360, 1905–1916.
Elias M, Hill RI, Willmott KR et al. (2007) Limited performance of DNA barcoding in a diverse community of tropical butterflies. Proceedings of the Royal Society of London, Series B: Biological Sciences, 274, 2881–2889.
Fausch KD, Torgersen CE, Baxter CV, Li HW (2002) Landscapes to riverscapes: bridging the gap between research and conservation of stream fishes. BioScience, 52, 483–498.
Frankham R, Ballou JD, Dudash MR et al. (2012) Implications of different species concepts for conserving biodiversity. Biological Conservation, 153, 25–31.
Fregin S, Haase M, Olsson U, Alström P (2012) Pitfalls in comparisons of genetic distances: a case study of the avian family Acrocephalidae. Molecular Phylogenetics and Evolution, 62, 319–328.
Funk DJ, Omland DE (2003) Species-level paraphyly and polyphyly: frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annual Review of Ecology, Evolution, and Systematics, 34, 397–423.
Funk WC, McKay JK, Hohenlohe PA, Allendorf FW (2012) Harnessing genomics for delineating conservation units. Trends in Ecology and Evolution, 27, 489–496.
Goldstein PZ, DeSalle R (2010) Integrating DNA barcode data and taxonomic practice: determination, discovery, and description. BioEssays, 33, 135–147.
Hammer MP, Unmack PJ, Adams M, Johnson JB, Walker KF (2010) Phylogeographic structure in the threatened Yarra pygmy perch Nannoperca obscura (Teleostei: Percichthyidae) has major implications for declining populations. Conservation Genetics, 11, 213–223.
Hanner R, DeSalle R, Ward RD, Kolokotronis S-O (2011) The fish Barcode of Life (FISH-BOL) special issue. Mitochondrial DNA, 22(Suppl 1), 1–2.
Harris DJ (2003) Can you bank on GenBank? Trends in Ecology and Evolution, 18, 317–319.
Hart MW, Sunday J (2007) Things fall apart: biological species form unconnected parsimony networks. Biology Letters, 3, 509–512.
Hausdorf B (2011) Progress toward a general species concept. Evolution, 65, 923–931.
Hebert PDN, Cywinska A, Ball SL, DeWaard JR (2003a) Biological identifications through DNA barcodes. Proceedings of the Royal Society of London, Series B: Biological Sciences, 270, 313–321.
Hebert PDN, Ratnasingham S, DeWaard JR (2003b) Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proceedings of the Royal Society of London, Series B: Biological Sciences, 270(Suppl 1), 96–99.
Hebert PDN, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the USA, 101, 14812–14817.
Hendricks P (1997) Status, Distribution, and Biology of Sculpins (Cottidae) in Montana: A Review. Montana Natural Heritage Program, Helena.
Hillis DM, Pollock DD, McGuire JA, Zwickl DJ (2003) Is sparse taxon sampling a problem for phylogenetic inference? Systematic Biology, 52, 124–126.
Holton GD, Johnson HE (2003) A Field Guide to Montana Fishes, 3rd edn. Montana Fish, Wildlife and Parks, Helena.
Hubbs CL, Schultz LP (1932) Cottus tubulatus, a new sculpin from Idaho. Occasional Papers of the Museum of Zoology, University of Michigan, 242, 1–9.
Hubert N, Hanner R, Holm E et al. (2008) Identifying Canadian freshwater fishes through DNA barcodes. PLoS ONE, 3, e2490.
Hudy M, Shiflet J (2009) Movement and recolonization of Potomac sculpin in a Virginia stream. North American Journal of Fisheries Management, 29, 196–204.
Ivanova NV, Zemlak TS, Hanner RH, Hebert PDN (2007) Universal primer cocktails for fish DNA barcoding. Molecular Ecology Notes, 7, 544–548.
Janzen DH, Hallwachs W, Blandin P et al. (2009) Integration of DNA barcoding into an ongoing inventory of complex tropical biodiversity. Molecular Ecology Resources, 8(Suppl 1), 1–26.
Jenkins RE, Burkhead NM (1994) Freshwater Fishes of Virginia. American Fisheries Society, Bethesda, Maryland.
Joly S, Stevens MI, van Vuuren BJ (2007) Haplotype networks can be misleading in the presence of missing data. Systematic Biology, 56, 857–862.
Kershner JL, Archer EK, Coles-Ritchie M et al. (2004) Guide to effective monitoring of aquatic and riparian resources. U.S. Forest Service General Technical Report, RMRS-GTR-121.
Kinziger AP, Wood RM (2003) Molecular systematics of the polytypic species Cottus hypselurus (Teleostei: Cottidae). Copeia, 2003, 624–627.
Kinziger AP, Wood RM, Neely DA (2005) Molecular systematics of the genus Cottus (Scorpaeniformes: Cottidae). Copeia, 2005, 303–311.
Kvist S, Oceguera-Figueroa A, Siddall ME, Erseus C (2010) Barcoding, types and the Hirudo files: using information content to critically evaluate the identity of DNA barcodes. Mitochondrial DNA, 21, 1998–2005.
LaHood ES, Miller JJ, Apland C, Ford MJ (2008) A rapid, ethanol-free fish tissue collection method for molecular genetic analyses. Transactions of the American Fisheries Society, 137, 1104–1107.
Laikre L (2010) Genetic diversity is overlooked in international conservation policy implementation. Conservation Genetics, 11, 349–354.


Lee DS, Gilbert CR, Hocutt CH, Jenkins RE, McAllister DE, Stauffer JR (1980) Atlas of North American Freshwater Fishes. North Carolina State Museum of Natural History, Raleigh.
Lowenstein JH, Osmundson TW, Becker S, Hanner R, Stiassny MLJ (2011) Incorporating DNA barcodes into a multi-year inventory of the fishes of the hyperdiverse Lower Congo River, with a multi-gene performance assessment of the genus Labeo as a case study. Mitochondrial DNA, 22(Suppl 1), 52–70.
Mace GM (2004) The role of taxonomy in species conservation. Proceedings of the Royal Society of London, Series B: Biological Sciences, 359, 711–719.
Maddison DR, Guralnick R, Hill A, Reysenback A-L, McDade LA (2012) Ramping up biodiversity discovery via online quantum contributions. Trends in Ecology and Evolution, 27, 72–77.
Maughan OE (1976) A survey of fishes of the Clearwater River. Northwest Science, 50, 76–86.
Maughan OE, Edmundson EE, Farris AE, Wallace RL (1980) A comparison of fish species above and below Palouse Falls, Palouse River, Washington-Idaho. Northwest Science, 54, 5–8.
McPhail JD (2007) The Freshwater Fishes of British Columbia. University of Alberta Press, Edmonton.
Meier R, Kwong S, Vaidya G, Ng PKL (2006) DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Systematic Biology, 55, 715–728.
Meier R, Zhang G, Ali F (2008) The use of mean instead of smallest interspecific distances exaggerates the size of the “barcoding gap” and leads to misidentification. Systematic Biology, 57, 809–813.
Meyer CP, Paulay G (2005) DNA barcoding: error rates based on comprehensive sampling. PLoS Biology, 3, 2229–2238.
Neely DA (2003) A Systematic and Taxonomic Reassessment of the Mottled Sculpin Species Complex, Cottus bairdii Girard (Teleostei: Cottidae). Doctoral dissertation, University of Alabama, Tuscaloosa.
Nielsen R, Matz M (2006) Statistical approaches for DNA barcoding. Systematic Biology, 55, 162–169.
Nolte AW, Gompert Z, Buerkle CA (2009) Variable patterns of introgression in two sculpin hybrid zones suggest genomic isolation differs among populations. Molecular Ecology, 18, 2615–2627.
Padial JM, Miralles A, De la Riva I, Vences M (2010) The integrative future of taxonomy. Frontiers in Zoology, 7, 16.
Page TJ, Hughes JM (2010) Comparing the performance of multiple mitochondrial genes in the analysis of Australian freshwater fishes. Journal of Fish Biology, 77, 2093–2122.
Petty JT, Grossman GD (2007) Size-dependent territoriality of mottled sculpin in a southern Appalachian stream. Transactions of the American Fisheries Society, 136, 1750–1761.
Pollock DD, Zwickl DJ, McGuire JA, Hillis DM (2002) Increased taxon sampling is advantageous for phylogenetic inference. Systematic Biology, 51, 664–671.
Powell JR (2012) Accounting for uncertainty in species delineation during the analysis of environmental DNA sequence data. Methods in Ecology and Evolution, 3, 1–11.
Puillandre N, Bouchet P, Boisselier-Dubayle M-C et al. (2012) New taxonomy and old collections: integrating DNA barcoding into the collection curation process. Molecular Ecology Resources, 12, 396–402.
de Queiroz K (2007) Species concepts and species delimitation. Systematic Biology, 56, 879–886.
Raggon MF (2010) Seasonal Variability in Diet and Consumption by Cottid and Salmonid Fishes in Headwater Streams in Western Oregon. Master’s thesis, Oregon State University, Corvallis, USA.
Ratnasingham S, Hebert PDN (2007) BOLD: the barcoding of life data system (www.barcodinglife.org). Molecular Ecology Notes, 7, 355–364.
Roe AD, Sperling FAH (2007) Patterns of evolution of mitochondrial cytochrome c oxidase I and II DNA and implications for DNA barcoding. Molecular Phylogenetics and Evolution, 44, 325–345.
Ross HA, Murugan S, Li WLS (2008) Testing the reliability of genetic methods of species identification via simulation. Systematic Biology, 57, 216–230.
Rubinoff D (2006) Utility of mitochondrial DNA barcodes in species conservation. Conservation Biology, 20, 1026–1033.
Schmidt TR, Gold JR (1993) Complete sequence of the mitochondrial cytochrome b gene in the cherryfin shiner, Lythrurus roseipinnis (Teleostei: Cyprinidae). Copeia, 1993, 880–883.
Seberg O, Humphries CJ, Knapp S et al. (2003) Shortcuts in systematics? A commentary on DNA-based taxonomy. Trends in Ecology and Evolution, 18, 63–65.
Shafer ABA, Cullingham CI, Côté SD, Coltman DW (2010) Of glaciers and refugia: a decade of study sheds new light on the phylogeography of northwestern North America. Molecular Ecology, 19, 4589–4621.
Srivathsan A, Meier R (2012) On the inappropriate use of Kimura-2-parameter (K2P) divergences in the DNA-barcoding literature. Cladistics, 28, 190–194.
Steinke D, Hanner R (2011) The FISH-BOL collaborators’ protocol. Mitochondrial DNA, 22(Suppl 1), 10–14.
Stevens DL Jr, Olsen AR (1999) Spatially restricted surveys over time for aquatic resources. Journal of Agricultural, Biological, and Environmental Statistics, 4, 415–428.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution, 28, 2731–2739.
Taylor HR, Harris WE (2012) An emergent science on the brink of irrelevance: a review of the past 8 years of DNA barcoding. Molecular Ecology Resources, 12, 377–388.
Teletchea F (2010) After 7 years and 1000 citations: comparative assessment of the DNA barcoding and the DNA taxonomy proposals for taxonomists and non-taxonomists. Mitochondrial DNA, 21, 206–226.
Turner GF (1999) What is a fish species? Reviews in Fish Biology and Fisheries, 9, 281–297.
Valentini A, Pompanon F, Taberlet P (2009) DNA barcoding for ecologists. Trends in Ecology and Evolution, 24, 110–117.
Wang L, Infante D, Esselman P et al. (2011) A hierarchical spatial framework and database for the National River Fish Habitat Condition Assessment. Fisheries, 36, 436–449.
Ward RD (2009) DNA barcode divergence among species and genera of birds and fishes. Molecular Ecology Resources, 9, 1077–1085.
Wiemers M, Fiedler K (2007) Does the barcoding gap exist? A case study in blue butterflies (Lepidoptera: Lycaenidae). Frontiers in Zoology, 4, 8.
Wydoski RS, Whitney RR (2003) Inland Fishes of Washington, 2nd edn. American Fisheries Society, Bethesda, Maryland.
Zhang DX, Hewitt GM (1996) Nuclear integrations: challenges for mitochondrial DNA markers. Trends in Ecology and Evolution, 11, 247–251.
Zink RM, Barrowclough GF (2008) Mitochondrial DNA under siege in avian phylogeography. Molecular Ecology, 17, 2107–2121.
Zou S, Li Q, Kong L, Yu H, Zheng X (2011) Comparing the usefulness of distance, monophyly and character-based DNA barcoding methods in species identification: a case study of Neogastropoda. PLoS ONE, 6, e26619.

M.K. Young, K.S. McKelvey, and M.K. Schwartz designed the study and conducted the fieldwork. K.L. Pilgrim and M.K. Young extracted and sequenced the DNA. M.K. Young conducted the statistical analyses. M.K. Young wrote the paper with contributions from all co-authors.

Data Accessibility

DNA sequences: GenBank accessions JX282572–282599 (for COI) and JX282526–282571 (for cyt b).
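
Readers who wish to retrieve these sequences may find it convenient to expand the abbreviated accession ranges into full identifiers. The following Python sketch does so under the assumption that the second number in each range shares the letter prefix of the first; the function name is hypothetical and no database is queried.

```python
import re

def expand_accession_range(text):
    """Expand e.g. 'JX282572-282599' into a list of full accession strings.

    Assumption of this sketch: the end of the range reuses the letter
    prefix of the start. Both en dashes and hyphens are accepted.
    """
    start, end = re.split("[\u2013-]", text.strip())
    prefix = re.match(r"[A-Za-z]+", start).group(0)
    width = len(start) - len(prefix)            # keep any zero padding
    first = int(start[len(prefix):])
    last = int(re.sub(r"^[A-Za-z]+", "", end))  # tolerate a repeated prefix
    if last < first:
        raise ValueError("range end precedes range start")
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]

coi_accessions = expand_accession_range("JX282572\u2013282599")
cytb_accessions = expand_accession_range("JX282526\u2013282571")

print(len(coi_accessions), coi_accessions[0], coi_accessions[-1])
print(len(cytb_accessions), cytb_accessions[0], cytb_accessions[-1])
```

The resulting lists can then be passed to whichever retrieval tool is preferred.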


Supporting Information

Additional Supporting Information may be found in the online version of this article:

Table S1 Haplotypes and locations of Cottus collected in the upper Columbia and Missouri River basins

Table S2 Cytochrome c oxidase subunit 1 and cytochrome b oxidase haplotype identifiers, accession numbers, and sampling locations of sculpins in GenBank used in these analyses
