Conservation genetics of ( elongatus):

insights from environmental DNA and phylogeography

A Thesis Submitted to the Committee on Graduate Studies in Partial Fulfillment of the Requirements for the Degree of Master of Science in the Faculty of Arts and Science

TRENT UNIVERSITY Peterborough, , © Copyright by Natasha R. Serrao 2016 Environmental and Life Sciences M.Sc. Graduate Program May 2016

Abstract

Conservation genetics of Redside Dace (Clinostomus elongatus):

insights from environmental DNA and phylogeography

Natasha R. Serrao

Recent range reductions of endangered species have been linked to urban development, increased agricultural activities, and introduction of non-native species. I used Redside Dace (Clinostomus elongatus) as a focal species to examine the utility of novel monitoring approaches, and to understand historical and contemporary processes that have influenced their present distribution. I tested the efficacy of environmental DNA

(eDNA) to detect Redside Dace, and showed that eDNA was more sensitive for detecting species presence than traditional electrofishing. Parameters such as season, number of replicates, and spatial versus temporal sampling need to be accounted for when designing an eDNA monitoring program, as they influence detection effectiveness and power. I also assessed the species’ phylogeographic structure using both mitochondrial and microsatellite DNA analysis. The data from the microsatellite markers indicate that

Redside Dace populations are genetically structured, with the exception of several populations from the Allegheny River basin. Combined sequence data from three mitochondrial genes (cytochrome b, ATPase 6 and ATPase 8) indicated that Redside

Dace persisted within three Mississippian refugia during the last glaciation. Secondary contact between two lineages was indicated by both mitochondrial and microsatellite data. The combined results from the eDNA and conservation genetics studies can be used to inform Redside Dace recovery efforts, and provide a template for similar efforts for other aquatic endangered species.

ii

Keywords: Redside Dace (Clinostomus elongatus), endangered species, environmental

DNA (eDNA), DNA, detection probability, conservation genetics, phylogeography

iii

Dedication

This thesis is dedicated with love to my grandmother Rosy Serrao who passed away at the start of my degree. She was one of the most influential female figures in my life and I aspire to have her strength. I love you nana, and think about you all the time!

iv

Acknowledgements

I am extremely grateful for all the amazing experiences and wonderful people I have had the opportunity to work with over the last three years. I have had the lab support and assistance from a number of talented individuals in the Fish Genetics Lab. I especially thank

Maggie Boothroyd for being there to listen and offer advice during my entire degree, and being such a great friend to me. Kristyne Wozney has helped me with all aspects of my thesis, and has offered invaluable lab troubleshooting, as well as life advice along the way.

I am grateful to Cait Nemeczek for teaching me how to do my first eDNA extraction, her friendship, and words of encouragement. I thank Caleigh Smith for help with everything microsatellite and mitochondrial, and always managing to put a smile on my face. I also thank Jenn Bronnenhuber and Anne Kidd for their insight and advice on various aspects of my project.

I am very appreciative to Scott Reid’s Aquatic Endangered Species Research Team for all their help in the field, and collecting water samples for my project during Fall 2012.

To Victoria Kopf, thank you for all your efforts for everything Redside Dace, teaching me to climb, and entertaining my terrible jokes. I wouldn’t have been able to survive this degree without your friendship and support. I would also like to thank Matt Sweeting for all his patience with me in the field; I could not have asked for a better person to have experienced my first field season with. I also thank Andrea Dunn and colleagues (Conservation Halton) for helping me collect buccal swabs at Sixteen Mile Creek.

I would like to acknowledge all my American collaborators for going out into the field and collecting Redside Dace samples for my project. Douglas Carlson (New York

v

State Department of Environmental Conservation), Konrad Schmidt and Jenny

Kruckenberg (North American Native Fishes Association), Holly Jennings (Forest Service) and John Pagel (Ottawa National Forest), Brant Fisher ( Department of Natural

Resources), Brian Zimmerman (Ohio State University), David Thorne and Isaac Gibson

(West Virginia Division of Natural Resources), John Lyons ( Department of

Natural Resources), Nate Tessler (EnvironScience), and David Miko ( Fish and Boat Commission), Matthew R. Thomas (Kentucky Department of Fish and Wildlife

Resources) generously provided samples that would otherwise have not been possible to collect. Additionally, Aaron Clauser (Clauser Environmental), Wayne Starnes (North

Carolina State Museum of Natural Sciences), Aaron Snell (Streamside Ecological Services,

Inc), and Douglas Fischer (Pennsylvania Fish and Boat Commission) also provided valuable advice during my project.

I thank my mum and my dad for their unconditional love and support, and being there to help me with all four Peterborough moves! I wouldn’t be where I am without the values you have instilled in me. I thank my siblings Nicole and Daniel for providing comedic relief, and helping me through the final stretch of this degree. I am extremely grateful to Shannon Fera for her words of wisdom, and for taking me in under her wing and helping me through the initial rough patch of my graduate degree. I also thank Jessica

Tomlin for her daily calls, editing thesis drafts, and 17 years of sustained friendship- I am so lucky to have you in my life! Lauren Banks, Tim Bartley, Ryan Franckowiak, Bob

Hanner, Christine Terwissen, Spencer Walker, Cristen Watt also deserve special thanks for valuable insight into various aspects of my project.

vi

Lastly, I would like to thank my graduate committee for all their help and guidance over my three years. I thank Al Dextrase for his patience, hours of hands-on help with occupancy modelling, and always being there to help despite short notice. I thank Joanna Freeland for her speedy feedback, attention to details, and for her words of encouragement during my project. I am appreciative to Scott Reid for providing me with the field technical support, and teaching me the importance of being a professional. You are a brilliant research scientist and I am grateful to have had the opportunity to work with you. To Chris- thank you for your mentoring, life counselling, and for helping me grow. Your friendship has been one of my most valued ones at Trent and I have come to view you as my academic father. Thank you!

vii

Table of Contents

Abstract ...... ii Dedication ...... iv Acknowledgements ...... v Table of Contents ...... viii List of Figures ...... xi Chapter 1: General introduction ...... 1 References ...... 8 Chapter 2 ...... 15 Abstract ...... 15 Introduction ...... 16 Results ...... 31 Discussion ...... 35 References ...... 40 Chapter 3 ...... 57 Abstract ...... 57 Introduction ...... 58 Methods ...... 63 Results ...... 71 Discussion ...... 78 References ...... 90 Chapter 4: General Discussion ...... 130

viii

List of Tables

Table 2.1: Mean, standard deviation, maximum, and minimum values of environmental variables for 29 sites sampled for eDNA testing for Redside Dace...... 45

Table 2.2: Estimates of Redside Dace detection probability and occupancy and ΔAICc values from models for spring and fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates...... 46

Table 3.1: Locations with drainage, jurisdiction, code names, latitude/longitude and number of samples obtained for mtDNA and microsatellite genetic samples used for study...... 113

Table 3.2: ATPase variable sites for 23 unique haplotypes of C. elongatus (1st column), nucleotide positions at which mutations occur (1st row), number of individuals (N) and populations that contain that particular haplotype. Haplotype 1A represents reference sequence for table; dots within a cell represent nucleotide positions identical to the reference sequence...... 115

Table 3.3: Summary of ATPase 6 and 8 and Cytochrome b sequence results for 27 Redside Dace populations, showing numbers of sequenced individuals (N), number of haplotypes detected (Nh), haplotypic richness (HR), haplotype diversity (h), and nucleotide diversity (π) for 27 Redside Dace populations...... 116

Table 3.4: Cytochrome b variable sites for 35 unique haplotypes of C. elongatus (1st column), showing nucleotide positions at which mutations occur (1st row), number of individuals (N) and populations that contain that particular haplotype. Dots within a cell represent nucleotide positions identical to the reference sequence...... 119

Table 3.5: Haplotype name, number of individuals and population occurrences for 47 unique haplotypes based on combined sequences (ATPase 6 and 8, and cytochrome b)...... 123

Table 3.6: Genetic description of 28 Redside Dace populations (see Table 3.1 for localities). Columns represent letter codes, number of individuals genotyped (N),

ix

observed number of alleles (Na), standardized allelic richness (AR) for n=20 gene copies, observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient (FIS)...... 125

Table 3.7: Pairwise FST values among 28 Redside Dace populations along with sample size for each population...... 126

Table 3.8: Analysis of Molecular Variance (AMOVA) for total evidence (cytochrome b and ATPase 6 and 8) mitochondrial DNA data based on hypothesized (i) Mississippi and Atlantic refugia (2 groups), (ii) mitochondrial DNA bootstrap supported groups (3 refugia), and (iii) microsatellite Principal Coordinate Analysis clustering (3 refugia hypothesis), and hierarchical FST analysis for (iv) eastern versus western groups (v) three groups identified by STRUCTURE, and (vi) contemporary drainage patterns ...... 128

x

List of Figures

Figure 2.1: Map of 29 Redside Dace eDNA sampling sites from Fall 2012 and Spring 2013 sampling season (grey circles), 10 lake negative control sampling sites (black triangles) from Spring 2013, and Otonabee River field blank (star) to help establish detection threshold...... 47

Figure 2.2: Plot of log10 transformed template DNA copy number (x-axis) versus dilutions for four Redside Dace eDNA samples in order to test for inhibition at four sampling locations (LC1= Lynde Creek 1, LC2= Lynde Creek 2, MC1= Mitchell Creek 1, MC2= Mitchell Creek 2)...... 48

Figure 2.3: Histogram of negative controls copy numbers/reaction of amplified Redside Dace eDNA (x-axis) versus frequency (y-axis) for four types of: (a) filter control (n=168, x̅ =0.091, s=0.288), (b) lake control (n=32, x̅ =0.048, s=0.13), (c) DNA extraction control (n=31, x̅ =0.081, s=0.29), (d) field control (n=27, x̅ =0.18, s=0.39)...... 49

Figure 2.4: Scatter plot for mean copy number /reaction of each sample run in triplicates (y-axis) versus the coefficient of variation of those values (CV; x-axis) (left) and histogram of CV versus the frequency of samples that fall under the CV (right)...... 50

Figure 2.5: Boxplot of qPCR standards with known DNA concentrations (1000 copies/ reaction down to 1 copy/ reaction) set as “eDNA unknowns” versus copy number log10 transformed (y-axis), as a test for qPCR accuracy...... 51

Figure 2.6: Boxplot of Redside Dace standards (106 down to 100 copies/ reaction) at the threshold cycle (Ct) where the copy number passes the baseline threshold for (A) omitted (data points for the standard curve were removed to improve R2 value) (B) All standards (no data points excluded)...... ………52

Figure 2.7: Barplot of total temporal Redside Dace detections (x-axis) found at each of the eleven sampled sites (y-axis). Five temporal replicates were collected at each season (fall and spring) twice, with a time lapse of approximately 10 d between sampling weeks within a season. Site labels on y-axis are listed in Appendix 2.1...... 53

xi

Figure 2.8: Detection probabilities (y-axis) within seasons (x-axis) for Fall Week 1 (FW1), Fall Week 2 (FW2), Spring Week 1 (SW1), and Spring Week 2 (SW2) (error bars represent upper and lower 95% confidence limits of estimates)...... 54

Figure 2.9: Individual site detection probability estimates for index of flow versus detection probability (top), and temperature versus detection probability (bottom), during Spring at 5 replicates...... 55

Figure 2.10: A comparison of the number of sites (out of n=29) with Redside Dace DNA detections (x-axis), versus the number of replicates sampled (y-axis), for a) the four spatially replicated samples collected at each site and b) the four temporally replicated samples collected at each site in each season...... 56

Figure 3.1: Distribution map of sampling locations for Redside Dace (Clinostomus elongatus). Inset map shows the species’ global range (reproduced from COSEWIC 2007 report, with permission), with the polygon enclosing the species range...... 98

Figure 3.2: Mutational network observed among C. elongatus haplotypes for ATPase 6 and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.2; each node represents one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution...... 99

Figure 3.3: Neighbour-joining dendrogram of relationships among ATPase 6 and 8 haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.2) are represented by numbers outside brackets, while number of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50 %...... 100

Figure 3.4: Mutational network observed among C. elongatus haplotypes for cytochrome b based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.4; each node represents one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution of haplogroups (purple=haplogroup C; light green=haplogroup D)…………………………………….101

xii

Figure 3.5: Neighbour-joining dendrogram of relationships among cytochrome b haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.4) are represented by numbers outside brackets, while numbers of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50 %...... 102

Figure 3.6: Mutational network observed among C. elongatus haplotypes for total evidence for cytochrome b and ATPase 6 and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.5; while each node represent one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution of haplogroups (orange=haplogroup 1, olive green=haplogroup 3, red=haplogroup 2)...... 103

Figure 3.7: Neighbour-joining dendrogram of relationships among total evidence (cytochrome b and ATPase 6 and 8) halotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.5) are represented by numbers outside brackets, while number of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50%...... 104

Figure 3.8: Distribution of haplogroups (orange=haplogroup 1, green=haplogroup 3, red=haplogroup 2, black = unassigned) for combined cytochrome b and ATPase 6 and 8 data using groups identified via mutational network (Figure 3.6) and genetic distance (Figure 3.7)...... 105

Figure 3.9: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, where K represents number of genetically unique populations. Analyses were run at K=1 to K=29, and methods outlined in Chapter 2. Results analysed using (i) log likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al. (2005)...... 106

Figure 3.10: Bayesian clustering assignment implemented in STRUCTURE for 28 populations at (a) K=3 for range-wide analysis (b) results of separate STRUCTURE runs on the three identified subsets for fine-scale analysis, showing optimal K values along with preceding and successive K values. All runs were implemented with no admixture, and independent allele frequencies. Colours between different runs have no association with each other...... 107

xiii

Figure 3.11: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, for three genetic groups identified by K=3 on Figure 3.10, where K represents number of genetically unique populations. Analyses were run for red group from K=1 to K=10 (top left), green group from K=1 to K=20 (top right), and blue group from K=1 to K=10 bottom group using methods outlined in Chapter Two. Results analysed using (i) log likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al. (2005)...... 108

Figure 3.12: Principal coordinate analysis (PCoA) of genetic structure across all sampled Redside Dace populations (red= cluster A, blue= cluster B, green=cluster C). Inset map shows the geographic distributions of each genetic group...... 109

Figure 3.13: Neighbour joining dendrogram of genetic relationships among sampled populations based on Nei et al. (1983) DA genetic distance for 10 microsatellite loci. Numbers at branch nodes represent bootstrap support values > 50% based on 500 bootstrap replicates. Groups correspond to those identified in Figure 3.12...... 110

Figure 3.14: Plot of isolation by distance for pairwise population comparisons of transformed geographic distance [ln (distance in km+1)] versus genetic divergence [(FST)/(1-FST)]...... 111

Figure 3.15: Isolation by distance plot of transformed geographic distance [ln (distance in km+1)] versus genetic divergence [(FST)/(1-FST)] for population population pairs with geographic distances of less than 123 km. Points in yellow represent pairwise comparisons among the Allegheny River populations...... 112

xiv

List of Appendices

Appendix 2.1: Field data collected at 29 sites including sampling dates, fish caught, habitat characteristics (channel width, channel depth, conductivity, temperature), and GPS coordinates...... 139

Appendix 2.2a: Raw qPCR values (copies/reaction) for fast mix during Spring (S) and Fall (F) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates...... 141

Appendix 2.2b: Raw qPCR values (copies/reaction) for environmental mix during Spring(S) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates...... 143

Appendix 2.3: 10 Lake control sites absent for Redside Dace along with their GPS coordinates and date sampled...... 145

Appendix 2.4: Comparison of Environmental versus Fast mastermix ...... 146

Appendix 2.5: Separating the signal from the noise: using receiver operator characteristics to optimize sensitivity and specificity of environmental DNA detections ...... 152

Appendix 2.6: Comparison of electrofishing and eDNA detections during Fall and Spring sampling season (total of 29 sites)...... 165

Appendix 2.7: Estimates of Redside Dace detection probability and occupancy, AICc, ΔAICc, AIC weights, number of parameters, and -2log values from models for spring and fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates...... 166

Appendix 2.8: Comparison of costs for eDNA versus electrofishing ...... 168

xv

Appendix 3.1: Primer sequences used for microsatellite DNA analysis, along with their GenBank accession numbers, repeat motifs, size range (bp) and annealing temperatures...... 170

Appendix 3.2: Proportion of polymorphic loci across ten microsatellite primers for 28 Redside Dace populations...... 172

Appendix 3.3: List of 20 populations that deviate from Hardy-Weinberg equilibrium expectations...... 173

Appendix 3.4: Total evidence haplotype numbers with corresponding ATPase 6 and 8 and cytochrome b haplotypes...... 174

xvi

1

Chapter 1: General introduction

Within the last century, the rate of vertebrate species loss has increased to approximately 100 times greater than the background extinction rate (Ceballos et al.

2015). This rapid loss of biodiversity has occurred primarily as a result of anthropogenic causes, including habitat fragmentation and loss (Soulé and Kohm 1989, Frankham et al.

2002). The North American freshwater fauna are experiencing a loss of biodiversity comparable to biota in tropical forests; extinction rates for freshwater mussels are projected to increase by 6% within the next decade, with 39% of fish species listed as imperiled (Ricciardi and Rasmussen 1999, Dudgeon et al. 2006, Jelks et al. 2008,

Ceballos et al. 2015).

Conservation biology is a multidisciplinary science that encompasses monitoring, biogeography, ecology, and genetics in order to protect species (Soulé et al. 1985). To reduce the loss of biodiversity, the field of conservation biology seeks to counter some of the threats to species decline (Soulé et al. 1985). The main goals are to protect species or populations that have been negatively impacted by human-mediated activities including habitat loss, invasive species, urbanization, and agricultural activities (Magurran 2009).

Many initiatives have been implemented to protect freshwater systems (Suski and Cooke

2006, George et al. 2009), while continuing research on the influence of anthropogenic processes on individual species and community assemblages (Leidy et al. 2011, Ramirez-

Llodra et al. 2011). Current research has explored the extent of habitat loss (Skole and

Tucker 1993), the effects of fragmentation and its contribution to lower fish species density (Layman et al. 2004), and the negative impact of urbanization on species richness and composition (McKinney 2002).

2

Conservation units

Within Canada and the , conservation units below the species level can be applied to protect species at risk. For species with ranges spanning multiple jurisdictions, with different laws and policies at federal, state, and provincial levels, effective conservation can be challenging (Mooers et al. 2010, Petrou et al. 2013, Favaro et al. 2014). The Endangered Species Act in the United States provides protection for threatened and endangered species, as well as subspecies or distinct population segments within species, along with their corresponding habitats. Defining a distinct population segment can be challenging due to ambiguity in classification below the subspecies level, so the evolutionary unit (EU) concept was developed in order to help identify this. An EU is considered distinct if there is evidence of genetic isolation, geographic and temporal isolation, and behavioural and reproductive isolation (National Research Council 1995).

This can be achieved by looking at ecological, genetic, behavioural and morphological data. Additionally, the term “evolutionary significant unit” exists to help conserve species at risk, and there are many interpretations of its meaning throughout the literature. Ryder

(1986) used it to identify conservation units based on adaptive variation, while Waples

(1991) used it to identify populations based on adaptive variation and reproductive isolation. More recently, Moritz (1994) used a genetics-based approach with conservation units being identified by reciprocal monophyly using mitochondrial DNA (mtDNA) and allele frequency divergences using nuclear DNA. Lastly, a management unit (MU) refers to conserving populations’ short term by identifying populations containing high mtDNA and nuclear genetic diversity levels (Moritz 1994).

3

Within Canada, “Designatable Units” are applied to identify conservation units based on: (i) the presence of subspecies or varieties, (ii) if populations are discrete from each other, and (iii) if the discreteness has an evolutionary significance (COSEWIC

2014). A population can be classified as discrete based on genetic uniqueness using either neutral markers or examining inherited traits, if there are natural disjunctions between large portions of a species range so that movement between the locations is difficult, and if the species occupies various eco-geographic locations (COSEWIC 2014). Based on the above criteria, a population or group can be considered evolutionarily significant if (i) it has deep phylogenetic divergences from other populations, (ii) is in a unique ecological setting that could have resulted in local adaptations, (iii) shows evidence that the population being examined is the only group of populations (or population) left in the species’ native range, or (iv) its loss would result in an extensive disjunction in the species’ range (COSEWIC 2014).

Using genetics to identify conservation units

Patterns of genetic structure and diversity within species reflect historical and contemporary influences, and an understanding of the impacts of past and current events and processes is beneficial for effective management (Bernatchez and Wilson 1998,

McDermid et al. 2011, Ginson et al. 2015). At a historical scale, genetic data reflect large- scale processes such as vicariant events and climatic changes despite having occurred thousands of years ago, and can be used to understand contemporary genetic structuring

(Avise 2000, Gum et al. 2005, Borden and Krebs 2009). In particular, Pleistocene glaciations played important roles in shaping the contemporary distributions of many freshwater fish species within North America (Hocutt and Wiley 1986 and chapters

4 therein). Advancing ice sheets displaced fish species into refugia at glacial margins, and their current distributions are reflective of postglacial dispersal during and after deglaciations (Hocutt and Wiley 1986, Mandrak and Crossman 1992, Wilson and Hebert

1996, McDermid et al. 2011). Phylogeographic studies using mitochondrial DNA

(mtDNA) have been used to make inferences about glaciation events and postglacial dispersal routes that provide insight into evolutionary lineages important for management units (Bernatchez and Wilson 1998, Ginson et al. 2015). Mitochondrial DNA is well suited for this type of study, as it is maternally inherited without recombination, is considered selectively neutral, and has a higher mutation rate than most nuclear genes

(Brown et al. 1979, Avise et al. 1987, Moritz et al. 1987).

At a more contemporary scale, knowledge of connectivity between populations, inbreeding levels, and metapopulation structuring, can also be inferred using genetic data

(Berendzen and Dugan 2008, Blakney et al. 2014, Ginson et al. 2015). These data can also be used to examine the impact of anthropogenic habitat alterations such as urbanization and habitat fragmentation (Blakney et al. 2014, Mather et al. 2015). Patterns of historical structuring can be complemented using faster-evolving markers such as nuclear microsatellite DNA loci, which are biparentally inherited, selectively neutral, and have high mutation rates (Li et al. 2002, Ellegren 2004, Selkoe and Toonen 2006, Kirk and Freeland 2011). Microsatellite data can provide information on genetic diversity, effects of inbreeding, gene flow, and population structuring within and among contemporary populations (McCusker et al. 2014, Ginson et al. 2015, Glass et al. 2015).

These complementary genetic data sources can be used separately or in concert to help

5 inform conservation and management decisions for conservation efforts (Moritz 1994,

Neff et al. 2011).

Monitoring challenges

An effective monitoring program should account for knowledge of a species’ distribution, their habitat, and allow for adaptive management practices (Campbell et al.

2002). It is common for species at risk to have poorly identified habitat ranges and requirements at the time of listing, and few studies have accounted for population viability and unoccupied habitat when making designations (Camaclang et al. 2015).

Monitoring efforts to obtain this information have often been undertaken rather haphazardly, with specific objectives, experimental design and statistical analysis being overlooked, likely due to the limited resources available for the recovery of a species

(Noss 1990). Once comprehensive monitoring has been undertaken, sites can be assessed for recovery efforts.

A novel application for conservation genetics is using environmental DNA

(eDNA) to document occurrences of aquatic endangered species. Environmental DNA detection, is growing in popularity for its ability to detect occurrences of aquatic species

(Ficetola et al. 2008, Darling and Mahon 2011, Mahon et al. 2013). Using this technique, species’ presence can be documented based on the presence of their DNA in water or sediments (Ficetola et al. 2008). This technique has both the potential to detect species at low density levels, and is non-invasive, yet its application to endangered species has been limited (Bronnenhuber and Wilson 2013, Wilcox et al. 2013). The focus of most eDNA work has been on invasive species (Olson et al. 2012, Goldberg et al. 2013, Jerde et al.

6

2013). Environmental DNA has been evaluated as an effective tool in controlled environments (Ficetola et al. 2008, Thomsen et al. 2012, Goldberg et al. 2013), marine and freshwater systems (Thomsen et al. 2012, Jerde et al. 2013, Goldberg et al. 2013) and aquatic sediments (Turner et al. 2014). Studies have also evaluated the effect of flow on detection rates (Deiner and Altermatt 2014), the effectiveness of various DNA extraction methods (Deiner et al. 2014), the number of PCR replicates needed to avoid false positives and negatives in metabarcoding (Ficetola et al. 2014), and seasonal effects on detection rates (Deiner and Altermatt 2014, Jane et al. 2014). These advances have helped refine eDNA as a monitoring tool, although several limitations still exist that need to be addressed in future studies (Roussel et al. 2015). Standard reporting procedures need to be incorporated across studies so that basic information including (i) copy numbers present in negative controls, (ii) limits of detection for target species and (iii) standardized measures of what constitutes a “false positive” are known (Roussel et al.

2015). To date, comparison of detection effectiveness of eDNA versus traditional sampling methods, and comparison with measures of occupancy/ abundance estimates have been rarely assessed (Roussel et al. 2015). Also, while detection probability estimates are useful for determining optimal sampling conditions, few studies have used this approach to account for imperfect detection (Schmidt et al. 2013; Ficetola et al. 2014;

Hunter et al. 2015).

Test Species

The Redside Dace, Clinostomus elongatus (Teleostei: ), is a small freshwater that typifies many conservation concerns and information needs to create effective recovery strategies. Redside Dace are stream fish that are generally found

7 in pools with overhanging vegetation that support terrestrial , their primary food source (Novinger and Coon 2000). The species has a disjunct distribution throughout the upper Drainage, , Allegheny River and upper Susquehanna

River, as well as many tributaries in the Basin (COSEWIC 2007). Within

Canada, Redside Dace populations are restricted to , with the exception of one population east of Sault Saint Marie (Redside Dace Recovery Team 2010).

Populations have been declining due to habitat loss and degradation throughout their range (Parker et al. 1988, COSEWIC 2007, Redside Dace Recovery Team 2010). Redside

Dace is thought to be extirpated from 10 of 24 Ontario watersheds, with eight of the remaining 14 locations experiencing decline (COSEWIC 2007). Scott and Crossman

(1973) identified Reside Dace as a species to study based on the documented declines.

Despite the species being designated as Endangered in 2007 by the Committee on the

Status of Endangered Wildlife in Canada (COSEWIC 2007), it is not yet protected at the federal level. Within Ontario, Redside Dace and its habitat are protected under the province’s Endangered Species Act, and a provincial recovery strategy has been developed (Redside Dace Recovery Team 2010).

Research on Redside Dace has largely focused on habitat associations, monitoring approaches, and threats; however, limited information exists on their spatial genetic structure and diversity (Berendzen and Dugan 2008, Houston et al. 2010, Redside Dace

Recovery Team 2010, Sweeten 2012). Although a large portion of the current Redside

Dace range was glaciated during the Pleistocene, their postglacial origins still remain unresolved. Based on distributional data, Underhill (1986) suggested that Redside Dace colonized their contemporary range from a single (Mississippian) glacial refugium,

8 whereas Mandrak and Crossman (1992) suggested that contemporary populations may have originated from two refugia (Atlantic and Mississippian). Additionally, few studies have looked at contemporary Redside Dace structuring using microsatellite DNA

(Berendzen and Dugan 2008, Sweeten 2012), which can provide important information on genetic diversity levels, gene flow, and potential inbreeding. In order to investigate large-scale historical and fine-scale contemporary influences on genetic structure and diversity within and among populations of Redside Dace, information on geographic variation of both mitochondrial (mtDNA) and microsatellite DNA would help inform conservation efforts.

My research aimed to advance the conservation of Redside Dace by applying conservation genetic tools to identify and map genetic diversity within the species range, and to improve monitoring efforts. In Chapter 2, I sought to assess the sensitivity of eDNA for documenting the presence of Redside Dace, and developed basic sampling protocols to minimize false positive and false negative error rates. In Chapter 3, I characterized the phylogeography and contemporary genetic structure and diversity for

Redside Dace across its global range, with an emphasis on Ontario populations. The results from these two studies provide new knowledge and tools to help inform Redside

Dace conservation and address actions identified in the Ontario Recovery Strategy for the species.

References

Avise JC, Arnold J, Ball RM, Bermingham E, et al. (1987) Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics, 18, 489-522. Avise JC (2000) Phylogeography. Harvard University Press, Cambridge, MA.

9

Berendzen PB, Dugan JF (2008) Establishing conservation units and population genetic parameters of fishes of greatest conservation need distributed in Southeast Minnesota. State Wildlife Grants Program. Division of Ecological Resources, Minnesota Department of Natural Resources. Available from: http://files.dnr.state.mn.us/eco/nongame/projects/consgrant_reports/2008/2008_berendzen _etal.pdf.

Bernatchez L, Wilson CC (1998) Comparative phylogeography of Nearctic and Palearctic fishes. Molecular Ecology, 7, 431–452.

Blakney JR, Loxterman JL, Keeley ER (2014) Range-wide comparisons of northern leatherside chub populations reveal historical and contemporary patterns of genetic variation. Conservation Genetics, 15, 757-770.

Borden WC, Krebs RA (2009) Phylogeography and postglacial dispersal of smallmouth bass (Micropterus dolomieu) into the Great Lakes. Canadian Journal of Fisheries and Aquatic Sciences, 66, 2142-2156.

Bronnenhuber JE, Wilson CC (2013) Combining species-specific COI primers with environmental DNA analysis for targeted detection of rare freshwater species. Conservation Genetics Resources, 5, 971–975.

Brown WM, George M, Wilson AC (1979) Rapid evolution of mitochondrial DNA. Proceedings of the National Academy of Sciences, 76, 1967-1971.

Camaclang AE, Maron M, Martin TG, Possingham HP (2015) Current practices in the identification of critical habitat for threatened species. Conservation Biology, 29, 482- 492.

Campbell SP, Clark JA, Crampton LH, Guerry AD, et al. (2002) An assessment of monitoring efforts in endangered species recovery plans. Ecological Applications, 12, 674-681.

Ceballos G, Ehrlich PR, Barnosky AD, García A, et al. (2015) Accelerated modern human–induced species losses: Entering the sixth mass extinction. Science Advances, 1, e1400253.

COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa, ON.

COSEWIC (2014) Guidelines for Recognizing Designatable Units. Committee on the Status of Endangered Wildlife in Canada.Available from: http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.

10

Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based methods for monitoring biological invasions in aquatic environments. Environmental Research, 111, 978-988.

Deiner K, Altermatt F (2014) Transport distance of invertebrate environmental DNA in a natural river. PLoS ONE, 9, e88786.

Deiner K, Walser JC, Mächler E, Altermatt F (2014) Choice of capture and extraction methods affect detection of freshwater biodiversity from environmental DNA. Biological Conservation, 183, 53-63.

Dudgeon D, Arthingto AH, Gessner MO, Kawabata Z, et al. (2006) Freshwater biodiversity: importance, threats, status and conservation challenges. Biological Reviews, 81, 163-182.

Ellegren H (2004) Microsatellites: simple sequences with complex evolution. Nature Reviews Genetics, 5, 435-445.

Favaro B, Claar DC, Fox CH, Freshwater C, et al. (2014) Trends in extinction risk for imperiled species in Canada. PloS ONE, 9, e113118.

Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using environmental DNA from water samples. Biology Letters, 4, 423-425.

Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false presences, and the estimation of presence / absence from eDNA metabarcoding data. Molecular Ecology Resources, 15, 543–556.

Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics. Cambridge University Press, Cambridge.

George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009) Guidelines for propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.

Ginson R, Walter RP, Mandrak NE, Beneteau CL, et al. (2015) Hierarchical analysis of genetic structure in the habitat-specialist Eastern Sand Darter (Ammocrypta pellucida). Ecology and Evolution, 5, 695–708.

Glass WR, Walter RP, Heath DD, Mandrak NE, et al. (2015) Genetic structure and diversity of spotted gar (Lepisosteus oculatus) at its northern range edge: implications for conservation. Conservation Genetics, 16, 889-899

Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits P (2013) Environmental DNA as a new method for early detection of New Zealand mudsnails (Potamopyrgus antipodarum). Freshwater Science, 32, 792–800.

11

Gum B, Gross R, Kuehn R (2005) Mitochondrial and nuclear DNA phylogeography of European grayling (Thymallus thymallus): evidence for secondary contact zones in central Europe. Molecular Ecology, 14, 1707-1725.

Hocutt CH, Wiley EO (1986) The Zoogeography of North American Freshwater Fishes. John Wiley and Sons, New York.

Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western North American cyprinid genus Richardsonius, with an overview of phylogeographic structure. Molecular Phylogenetics and Evolution, 55, 259-273.

Hunter ME, Oyler-McCance SJ, Dorazio RM, Fike JA, et al. (2015) Environmental DNA (eDNA) sampling improves occurrence and detection estimates of invasive Burmese pythons. PLoS ONE, 10, e0121655.

Jane S, Taylor WM, Mckelvey KS, Young MK (2014) Distance, flow and PCR inhibition: eDNA dynamics in two headwater streams. Molecular Ecology Resources, 15, 216-227.

Jelks HL, Walsh SJ, Burkhead NM, Contreras-Balderas S, et al. (2008) Conservation status of imperiled North American freshwater and diadromous fishes. Fisheries, 33, 372- 407.

Jerde CL, Chadderton WL, Mahon AR, Renshaw MA, et al. (2013) Detection of Asian carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of Fisheries and Aquatic Sciences, 70, 522-526.

Sweeten J (Manchester University) (2012) Redside dace (Clinostomus elongatus) in Mill Creek, Wabash County, Indiana: A strategy for research and augmentation. Indiana Department of Natural Resources.

Kirk J, Freeland JR (2011) Applications and implications of neutral versus non-neutral markers in molecular ecology. International Journal of Molecular Sciences, 12, 3966- 3988.

Layman CA, Arrington DA, Langerhans RB, Silliman BR (2004) Degree of fragmentation affects fish assemblage structure in Andros Island (Bahamas) estuaries. Caribbean Journal of Science, 40, 232-244.

Leidy RA, Cervantes‐Yoshida K, Carlson SM (2011) Persistence of native fishes in small streams of the urbanized San Francisco Estuary, California: acknowledging the role of urban streams in native fish conservation. Aquatic Conservation: Marine and Freshwater Ecosystems, 21, 472-483.

12

Li YC, Korol AB, Fahima T, Beiles A, Nevo E (2002) Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Molecular Ecology, 11, 2453-2465.

Magurran AE (2009) Threats to freshwater fish. Science, 325, 1215-1216.

Mahon AR, Barnes MA, Li F, Egan SP, et al. (2013) DNA-based species detection capabilities using laser transmission spectroscopy DNA-based species detection capabilities using laser transmission spectroscopy. Journal of the Royal Society Interface, 10, 20120637.

Mandrak NE, Crossman E (1992) Postglacial dispersal of freshwater fishes into Ontario. Canadian Journal of Zoology, 70, 2247-2259.

Mather A, Hancox D, Riginos C (2015) Urban development explains reduced genetic diversity in a narrow range endemic freshwater fish. Conservation Genetics, 16, 625-634.

McCusker MR, Mandrak NE, Egeh B, Lovejoy NR (2014) Population structure and conservation genetic assessment of the endangered Pugnose Shiner, Notropis anogenus. Conservation Genetics, 1, 343-353.

McDermid JL, Wozney JK, Kjartanson SL, Wilson CC (2011) Quantifying historical, contemporary, and anthropogenic influences on the genetic structure and diversity of lake sturgeon (Acipenser fulvescens) populations in northern Ontario. Journal of Applied Ichthyology, 27, 12–23.

McKinney ML (2002) Urbanization, biodiversity, and conservation. BioScience, 52, 883- 890.

Mooers AO, Doak DF, Findlay CS, Green DM, et al. (2010) Science, policy, and species at risk in Canada. BioScience, 60, 843-849.

Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in Ecology & Evolution, 9, 373-375.

Moritz C, Dowling TE, Brown WM (1987) Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annual Review of Ecology and Systematics, 18, 269-292.

National Research Council (NRC) (1995) Science and the Endangered Species Act. National Academy Press, Washington, D.C.

Neff BD, Garner SR, Pitcher TE (2011) Conservation and enhancement of wild fish populations: preserving genetic quality versus genetic diversity. Canadian Journal of Fisheries and Aquatic Sciences, 68, 1139-1154.

13

Noss RF (1990) Indicators for monitoring biodiversity: a hierarchical approach. Conservation Biology, 4, 355-364.

Novinger DC, Coon TG (2000) Behavior and physiology of the redside dace, Clinostomus elongatus, a threatened species in . Environmental Biology of Fishes, 57, 315–326.

Olson ZH, Briggler JT, Williams RN (2012) An eDNA approach to detect eastern hellbenders (Cryptobranchus a. alleganiensis) using samples of water. Wildlife Research, 39, 629-636.

Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus elongatus, in Canada. Canadian Field Naturalist, 102,163-169.

Petrou EL, Hauser L, Waples RS, Seeb JE, et al.(2013) Secondary contact and changes in coastal habitat availability influence the nonequilibrium population structure of a salmonid (Oncorhynchus keta). Molecular Ecology, 22, 5848-5860.

Ramirez-Llodra E, Tyler PA, Baker MC, Bergstad OA, et al. (2011) Man and the last great wilderness: human impact on the deep sea. PLoS ONE, 6, e22588.

Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus elongatus) in Ontario. Ontario Ministry of Natural Resources. Peterborough, ON.

Ricciardi A, Rasmussen JB (1999) Extinction rates of North American freshwater fauna. Conservation Biology, 13, 1220-1222.

Roussel JM, Paillisson JM, Tréguier A, Petit E (2015) The downside of eDNA as a survey tool in water bodies. Journal of Applied Ecology, 52, 823-826.

Ryder OA (1986) Species conservation and systematics: the dilemma of subspecies. Trends in Ecology & Evolution, 1, 9-10.

Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy models in the analysis of environmental DNA presence/absence surveys: a case study of an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.

Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Fisheries Research Board of Canada, Bulletin 183, Toronto, ON. 966 pp.

Selkoe KA, Toonen RJ (2006) Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters, 9, 615-629.

Skole D, Tucker C (1993) Tropical deforestation and habitat fragmentation in the Amazon: satellite data from 1978 to 1988. Science, 260, 1905-1910.

14

Soulé ME (1985) What is conservation biology? A new synthetic discipline addresses the dynamics and problems of perturbed species, communities, and ecosystems. BioScience, 35, 727-734.

Soulé ME, Kohm KA (1989) Research Priorities for Conservation Biology. Island Press, Washington, DC.

Suski CD, Cooke SJ (2007) Conservation of aquatic resources through the use of freshwater protected areas: opportunities and challenges. Biodiversity and Conservation, 16, 2015-2029.

Thomsen PF, Kielgast J, Iversen LL, Wiuf C, et al. (2012) Monitoring endangered freshwater biodiversity using environmental DNA. Molecular Ecology, 21, 2565–2573.

Turner CR, Uy KL, Everhart RC (2014) Fish environmental DNA is more concentrated in aquatic sediments than surface water. Biological Conservation, 183, 93-102.

Underhill JC (1986) The fish fauna of the Laurentian Great Lakes, the St. Lawrence Lowlands, Newfoundland and Labrador. In: Zoogeography of North American Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 105-136. John Wiley and Sons, NY.

Waples RS (1991) Pacific salmon, Oncorhynchus spp., and the definition of “species” under the Endangered Species Act. Marine Fisheries Review, 53, 11-22.

Wilcox TM, McKelvey KS, Young MK, Jane SF, Lowe WH (2013) Robust detection of rare species using environmental DNA: the importance of primer specificity. PLoS ONE, 8, e59520.

Wilson CC, Hebert PDN (1996) Phylogeographic origins of lake trout (Salvelinus namaycush) in eastern North America. Canadian Journal of Fisheries and Aquatic Sciences, 53, 2764-2775.

15

Chapter 2: Using environmental DNA (eDNA) to detect endangered Redside Dace,

Clinostomus elongatus

Abstract

Detection and monitoring of species at risk in aquatic environments necessitates methods that are non-intrusive and are able to identify target organisms at low densities.

Environmental DNA (eDNA) as a monitoring tool has been applied extensively to invasive species, but research on species at risk has been limited. In this study, Redside Dace

(Clinsotomus elongatus), an endangered fish native to southwestern Ontario within Canada, was used to determine if eDNA is a sensitive tool for monitoring cyprinids and other stream fishes. A total of 29 historic Redside Dace sites were sampled, with five temporal and four spatial eDNA replicates collected at each site, and later analyzed using qPCR. Additionally,

I assessed if seasonal differences in spawning activity and stream flow would impact

Redside Dace eDNA detections. Using occupancy modelling, the results from my study indicate that overall detection probability is higher in the spring than in the fall, varied between sampling weeks within a season, that collecting water over spatial and temporal scales are comparable, and a minimum of three replicates are needed to reliably detect

Redside Dace at a specific site. A comparison of naive detections for electrofishing versus eDNA monitoring in the fall indicated that eDNA was able to detect Redside Dace at more sites than electrofishing. The results from my study indicate that eDNA surveying is a sensitive tool for species detection, that can complement conventional sampling methods to increase detection success.

16

Introduction

The rapid loss of biodiversity globally has provoked a need to conserve species

(Frankham et al. 2002). To counter the effects of biodiversity loss, many jurisdictions have enacted legislation to protect species at risk and their habitats. Important activities associated with these laws include gathering scientific information regarding the species distribution, biology and habitat requirements, and encouraging stewardship activities that facilitate recovery. For this to be successful, an extensive inventory and monitoring program needs to be put in place to determine the range of occupied habitat for a species, and site-specific population trends (Campbell et al. 2002). A basic understanding of habitat ranges for most species is lacking, and this can therefore be a difficult undertaking

(Thompson 2004).

The detection and monitoring of aquatic species at risk requires methods that are non-intrusive and are able to identify target organisms at low densities. Traditional monitoring for aquatic organisms including electrofishing, snorkelling and netting can not only be costly, but also time consuming, and requires trained labour (Darling and Mahon

2011). From a conservation perspective, these monitoring approaches could cause increased stress and reduce fitness to the already at-risk species (Nielsen 1998). To overcome these limitations, environmental DNA (referred to as eDNA herein) detection is an emerging technique that is becoming more frequently used because of its potential for species detection (Ficetola et al. 2008). Referring to the detection of target DNA from an environmental sample, its benefit lies in documenting a species’ presence without having to use invasive survey techniques. This DNA-based identification holds potential for

17 unambiguous species identification at various life stages (Hebert et al. 2003, Victor et al.

2009, Serrao et al. 2014), and for greater sensitivity at detecting the target organism over traditional methods for monitoring surveys (Darling and Mahon 2011).

Environmental DNA methodology has been applied to a wide range of aquatic species to document their occurrence, distribution, and habitat occupancy. The primary application of eDNA to date has been on aquatic invasive species such as the American

Bullfrog (Rana catesbeiana) (Ficetola et al. 2008), Bighead Carp (Hypophthalmichthys nobilis), Silver Carp (H. molitrix) (Jerde et al. 2013) and New Zealand Mudsnail

(Potamopyrgus antipodarum) (Goldberg et al. 2013). In comparison, eDNA methods for monitoring species at risk have been largely understudied (Bronnenhuber and Wilson

2013, Janosik and Johnston 2015, Laramie et al. 2015). Since the advent of eDNA technology, lab methodology has progressed from using traditional polymerase chain reaction (PCR) (Ficetola et al. 2008), to using newer platforms such as real-time PCR

(qPCR) (Wilcox et al. 2013), next-generation sequencing (Thomsen et al. 2012) and laser transmission spectroscopy (Mahon et al. 2013). Our understanding of DNA’s properties has also increased over this period. For example, Goldberg et al. (2013) determined that in a controlled environment, eDNA from the New Zealand Mudsnail could no longer be detected 45 days after individuals were removed. Similarly, Takahara et al. (2012) found that when Common Carp (Cyprinus carpio) were placed inside a 9L tank, the eDNA shed by the organisms increased until around day 6, after which it reached an equilibrium.

Despite the utility of eDNA monitoring in many study systems, the importance of sampling design and factors that influence detection have been largely overlooked

(Yoccoz 2012, Schmidt et al. 2013).

18

Redside Dace (Clinostomus elongatus) present a unique opportunity for testing the efficacy of eDNA detection for documenting species occurrence because limited knowledge is available on their habitat distribution, and few studies have focused on endangered aquatic species. Redside Dace typically consume terrestrial insects and therefore require clear waters to visually detect their prey, as well as overhanging vegetation to attract their prey (Daniels and Wisniewski 1994, Novinger and Coon 2000).

They are stream fish that are found at mid-water depths within pools, and move to riffles for spawning when water temperatures reach between 16-18 ºC (Novinger and Coon

2000, Redside Dace Recovery Team 2010). Redside Dace occupy a disjunct distribution throughout the upper Mississippi River Drainage, Great Lakes Basin, Ohio River and upper (COSEWIC 2007, Novinger and Coon 2000). Within Canada,

Redside Dace populations are restricted to southern Ontario, with the exception of the

Two Tree River population that has recently been found on St. Joseph Island near Sault

Saint Marie, with unknown origins (Redside Dace Recovery Team 2010). Populations have been declining as a result of habitat loss and degradation throughout their range

(Parker et al. 1987, COSEWIC 2007, Redside Dace Recovery Team 2010). In 1973,

Redside Dace was identified as a species to study on the grounds that they were less common than 30 years prior (Scott and Crossman 1973). In 1987, Redside Dace was designated as being of Special Concern by the Committee on the Status of Endangered

Wildlife in Canada (COSEWIC) (Parker et al. 1988) and reassessed as Endangered in

2007 (COSEWIC 2007). The species was listed as Endangered under Ontario’s

Endangered Species Act in 2009 (OMNRF 2015).

19

The distribution of this species is poorly documented and large knowledge gaps exist to determine extant populations, relative abundance, and occupancy (Redside Dace

Recovery Team 2010). Additionally, a need exists to implement a long-term monitoring program to examine Redside Dace populations and their habitats through time (Redside

Dace Recovery Team 2010). Before 1979, Redside Dace were not targeted species in surveys, and therefore historical knowledge gaps exist regarding their distribution and abundance (Poos et al. 2012). A few studies have looked at the effectiveness of assessing

Redside Dace abundance using a backpack electrofisher and a bag seine (Reid et al. 2008,

Poos et al. 2012), and standardized approaches for monitoring Redside Dace presence and abundance exist (Wilson and Dextrase 2008).

The overall objective of my study was to evaluate whether eDNA is a sensitive tool for regional monitoring of stream fishes and the detection of aquatic species at risk, using Redside Dace as a study species. I also investigated whether seasonal differences in stream flow and spawning activity influence Redside Dace detectability. If there was high water flow within a season, I predict fewer Redside Dace detections because of a dilution effect. Specific objectives were to: (i) characterize the repeatability of sampling results within a season, (ii) determine the minimum number of water samples to be collected at a site to ensure confidence in detecting Redside Dace that are present, (iii) compare detection rates between multiple water samples collected over a one-hour temporal scale versus spatially replicated samples, and (iv) compare presence/absence of eDNA and electrofishing-based detections of Redside Dace. The results of this study will be important for informing future collection efforts for Redside Dace.

20

Methods

Field sampling

Water samples were collected from 29 sites in southern Ontario streams where

Redside Dace are present at varying population densities (COSEWIC 2007, Reid et al.

2008, Poos et al. 2012) (Figure 2.1). Samples were collected during the spring (May and

June 2013) and fall (September 2012 and 2013). Habitat measurements taken at each site were channel width (mean = 3.60 m, median = 3.22 m), maximum channel depth (mean =

0.22 m, median = 0.22 m), water temperature (mean = 15.1 ºC, median = 15.1 ºC), and conductivity (mean = 779.6 μS, median = 724 μS). Average habitat characteristics for each season are listed in Table 2.1 and a list of sites, locations, and habitat characteristics are provided in Appendix 2.1.

At each of the 29 sites sampled (Appendix 2.1), nine 1-Lwater samples were collected to compare spatial and temporal sensitivity, and evaluate the effect of sample replicates on Redside Dace detections. Throughout this chapter, the use of the term “site” refers to the area at which temporal and spatial replicates were collected from the watercourses outlined in Appendix 2.1. The starting point of sampling for each site was located (approximately 2-3 m) downstream of a pool. Pools were targeted for sampling, as they are the preferred habitat for Redside Dace (Novinger and Coon 2000). Site length was set at 10 times the wetted stream width, with a minimum length of 40 m, and contained a minimum of one riffle-pool sequence according to the Ontario Stream

Assessment Protocol (Stanfield 2013). Temporal sampling consisted of collecting five 1-

L water samples at the downstream end of the site at 15 minute intervals (time 0, 15 min,

21

30 min, 45 min, and 60 min). Five spatial samples were also collected within each site, using the t = 60 min temporal sample as the first spatial replicate. The remaining four samples were systematically collected in an upstream manner; each separated by 10 m (if stream width ≤ 4 m) or a longer distance (if width > 4 m) defined as wetted width x 10/4.

Each site was sampled in both spring and fall to assess seasonal variation in eDNA detection probability. Eleven of the 29 sites (Appendix 2.1) were resampled approximately one week after initial sampling for both fall and spring field seasons, in order to assess repeatability within each season.

A 1-L field control sample was collected at the start of each sampling trip from the Otonabee River (known absence of Redside Dace) to ensure sterile field techniques.

During the spring of 2013, three 1-L water samples were collected from 10 lakes that do not support Redside Dace populations (Mandrak and Crossman 1992) in order to define a minimum copy number per DNA reaction that would be accepted as a positive detection

(Figure 2.1).

After water sampling at a site was complete, the same area was sampled with a

Smith-RootTM backpack electrofisher during the fall season. It was assumed that Redside

Dace populations consisting of greater than 1 individual/100m2 would be detectable via electrofishing (Reid et al. 2008). If Redside Dace were absent from the first sampling pass, a second (and up to three) electrofishing passes were completed. The median electrofishing effort completed across each site was 929 s, the average number of passes per site was two, and the total effort across all 29 sites was 30,416 s. Each pass was separated by 10 min to allow the water to clear. Counts of Redside Dace for each pass

22 were recorded at all 29 sites, and all individuals were released after sampling.

Electrofishing was the only traditional method used to collect Redside Dace.

Sample filtration

Water samples were stored in refrigeration (4 C) until filtration. Filtration took place within 48 h of sample collection. One-litre samples were individually filtered through a three-manifold filtering apparatus (EZ-Stream™ vacuum pump) with filter sizes of 47 mm GFC and 934-AH membrane pore. Between samples, filter funnels and their bases were immersed in 10% bleach solution for 10-15 min to destroy DNA on filtering equipment. Equipment was then rinsed thoroughly with tap water, followed by a final rinse with double-distilled water to ensure that residual bleach was removed.

Forceps were flame-sterilized between samples in 95% ethanol after contact with processed filters. Filters were then placed in small labeled petri dishes and stored at -80

C until DNA extraction. eDNA marker development

Primer and probe sets were designed to amplify an 83 base pair segment of the barcoding region of the mitochondrial cytochrome c oxidase subunit I (COI) gene in

Redside Dace. This gene region, which is approximately 650 base pairs in length, was used in this study because it serves as a tool for species identification and discovery

(Hebert et al. 2003). The primers and probe were designed using Primer Express 3.0.1 software (Applied Biosystems). To verify the specificity of the primers and ensure no cross-reaction between species found in the same area, the primers were tested across a

PCR temperature gradient of 55-65 C against tissue-derived DNA from Blacknose Dace

23

(Rhinichthys atratulus) and Creek Chub (Semotilus atromaculatus). The two species are commonly found in waters with Redside Dace. The most genetically similar fish to

Redside Dace include (Clinostomus funduloides), and cyprinids in the genera Richardsonius and Lotichthys. The distributions of these species do not overlap with Redside Dace and therefore are not likely to amplify (Houston et al. 2010). No cross-amplifications were apparent at all temperatures, indicating that the designed primers were specific to Redside Dace.

DNA extraction

To avoid aerosol contamination, water filtering, DNA extraction and PCR amplification of samples took place in separate rooms. Samples were extracted following the MoBio PowerWater DNA Isolation Kit (http://www.mobio.com) protocols, with several modifications shown in bold. Filters were removed from the -80 C freezer, allowed to thaw and transferred to a 15 mL falcon tube using forceps. Forceps were flame-sterilized between samples. 1000 L of PW1 heated at 70 C was added to each tube. Tubes were placed in the shaker to lyse for a minimum of 30 min after which they were centrifuged at 2,000 x g for 1 min in order for the liquid to spin to the bottom.

Using a transfer pipette, supernatant was placed in a clean 2-mL collection tube, after which tubes were centrifuged at 13,0000 x g for 1 min. The supernatant was pipetted into a clean 2-mL tube after which 200 L of PW2 was added. The entire solution was mixed using a vortex (VWR Advanced Digital Shaker). Tubes were stored at 4 C for 5 min.

PW2 was used to remove inhibitory compounds, such as proteins, cell debris and other non-DNA materials. Tubes were centrifuged at 13,000 x g for 1 min and the supernatant

24 was transferred into a clean 2-mL centrifuge tube, leaving the pellet in the tube. 650 L of PW3 was placed in the incubator at 70 C for several minutes, and was added into a new tube and mixed using a vortex. PW3 is a high concentration salt that binds to DNA, allowing the DNA to bind to the silica-based spin column. In the fumehood, 650 L from each tube was pipetted into a spin column, centrifuged at 13,000 x g for 1 min, and the flow-through was discarded. This step was repeated twice until all supernatant from each tube was washed through the spin column. The spin column was placed into a clean 2-mL tube. Next, 650 L of PW4 was added to the 2-mL tube, spun at 13,000 x g for 1 min and the flow-through was subsequently discarded; PW4 was used to remove any residual salts that may inhibit downstream PCR reactions. 650 L of PW5 was added to the same tube, centrifuged at 13,000 x g for 1 min, and the flow-through was discarded again. The same tubes were immediately re-centrifuged at 13,000 x g for 2 min; PW5 was used to make sure that PW4 is completely removed from the DNA. Spin baskets were placed into the final tube, after which the DNA was eluted with 100 L of low TE (10 mM Tris pH 8,

0.1 mM EDTA) and the solution was centrifuged at 13,000 x g for 1 min. The spin basket was then discarded. Low TE was used instead of buffer PW6 (as illustrated through protocols) because it is superior for long-term storage of DNA. qPCR amplification

Quantitative PCR (polymerase chain reaction) or real-time PCR (referred to as qPCR herein), was used as the assay for DNA detection. Real-time PCR is similar to conventional PCR, with the major technical difference being the use of a fluorescently- labeled probe. This assay uses Taqman with minor groove binding properties. During the

25 annealing phase, the Redside Dace-specific forward (RSD-F: 5’-

GCTAGCTTCTTCTGGCGTTGA-3’) and reverse primer (RSD-R: 5’-

CTGCATGGGCAAGGTTACCT-3’) bind to the target strand. Additionally, a probe

(6FAM-CGGAACAGGATGAACGG-MGBNFQ) which consists of a 5’ fluorescent reporter and a 3’ quencher hybridize to the target strand. When the probe is intact, the 3’ quencher absorbs the signal from the 5’ fluorescent reporter through fluorescent resonance energy transfer (FRET). However, when the Taq polymerase adds the complementary nucleotides into the target strand and reaches the probe, it cleaves the probe through its 5’-3’ nuclease activity, which then causes the probe to split and emit a fluorescence signal. The fluorescence signal is proportional to the quantity of starting template quantity, and is determined through the use of standard interpolation (Heid et al.

1996). Benefits to using qPCR over traditional PCR include: 1) qPCR does not require gel electrophoresis; this reduces contamination because the amplified qPCR product is unopened; 2) the qPCR reaction takes less time to run; and, 3) qPCR measures copy number after each cycle (Heid et al. 1996).

Development of qPCR standards

DNA standards were generated using two reference Redside Dace specimens

(RSD1-4, RSD2-2), and were developed to quantify the copies of target DNA present in the environmental sample. The reference specimens were PCR-amplified to target the 707 bases of COI, inclusive of primers (Ivanova et al. 2007). PCR product was quantified using a Picogreen plate (BMG FluoStar Galaxy 96-well plate system). The volume of

Redside Dace DNA needed for 10 billion copies of DNA/reaction was calculated, based on the Picogreen reading and the molecular weight for the COI region. The calculated

26 volume was used as a starting point for serial dilution: 10 L of 1010 copies/reaction were added to 90 L of low TE, to achieve a concentration of 109 copies/reaction, after which the solution was mixed thoroughly (Wozney and Wilson 2012). A new tip was used to pipette 10 L of 109 copies/reaction to 90L of low TE, to achieve 108 copies/reaction.

This was repeated until a concentration of 1 copy/reaction was achieved. For each qPCR run, 106 copies/reaction down to 1 copy/reaction were assayed as quantitative controls.

A standard curve was generated by plotting the known concentration of DNA against the cycle at which the signal passed the cycle threshold (Ct). The Ct is chosen to be significantly higher than the background fluorescence, in order to make accurate inferences of signal versus noise. Standards were used to identify the cycle number at which a "known" quantity of DNA passes the fluorescence threshold. Cycle number was used to infer the number of DNA copies associated with each sample. For each qPCR assay, 2 standards were run, each in duplicate to compare within-pipette variability.

The data for the raw qPCR values across all sites can be found in Appendix 2.2, and the locations of the 10 negative control lakes can be found in Appendix 2.3. qPCR reactions

PCR reactions were set up in a laminar flow UV fumehood to minimize the risk of

DNA contamination. Standards were pipetted into wells in a separate room that was dedicated for amplified Redside Dace DNA to avoid cross-contamination. A preliminary test for inhibition was done based on Redside Dace eDNA lab samples that were previously collected. Two sample replicates were collected from Lynde Creek (LC1 and

LC2), and two eDNA sample replicates were collected from Mitchell Creek (MC1 and

27

MC2). Samples were diluted to determine if inhibitors were present using a dilution series of undiluted DNA, 1:2, 1:5, 1:10, 1:20, 1:30; each sample was run twice on the qPCR assay to determine the within-sample variation. If inhibitors were present in the reaction, an increase in copy number would be expected in more diluted samples. For both field seasons (Sept 2nd 2012-June 11th 2014), each eDNA sample was run with 15L of the following cocktail: 10 L of TaqMan® Fast Universal PCR Master Mix (2✕) (referred to as fast mix hereafter), 0.4 L of RSD-R, 0.4 L of RSD-F, 0.4 L of RSD-probe, 3.8L of ddH2O, 5 L of stock DNA. Each sample was run in triplicate to assess the level of within-sample variability. There was no evidence of inhibition as a result of decreased copy numbers at higher dilution series (Figure 2.2). StepOnePlus thermocycling conditions for the fast mix were as follows: initial denaturation for 2 min at 95 °C, followed by a 2 step-process of 1 s denaturation at 95 °C, and a 20 s annealing at 60 °C, repeated for 40 cycles.

Water samples from fall 2012 were run using fast mix, while water samples from spring 2013 were run using both fast mix and TaqMan® Environmental Master Mix 2.0

(referred to as environmental mix herein), the latter of which was first used in the lab during 2013. The environmental mix could not be tested on fall 2012 water samples due to the potential for DNA degradation over the past year that they were in the freezer. The fast mix was compared to the environmental mix in order to see if there was a significant difference between the two mixes at detecting low copy numbers. Based on these results, no significant difference would indicate that the data from the environmental mix and the fast mix could be analyzed in conjunction, but a significant difference would indicate that

28 the data from each mix would have to be analyzed separately. The paired comparison of the PCR results from each of the TaqMan master mixes is detailed in Appendix 2.4.

Statistical analyses

Each qPCR sample was run in triplicate and the average of the three runs were used for the analysis. The precision and accuracy of the qPCR platform was assessed in order to determine how reliable the platform was for determining DNA copy number present within a sample. The coefficient of variation (CV) was calculated to assess the within-trial precision for each triplicate sample. This was determined by dividing the standard deviation by the mean for each PCR replicate. Values that had a standard deviation of 0 and a mean of 0 were left as 0 for the coefficient of variation (since 0/0 is undefined). To measure the qPCR accuracy, known concentrations of DNA (1000 copies/reaction down to 1 copy/reaction) were treated as “unknown” DNA samples. This was replicated seven times to get a true measure of data variability. A second measure of accuracy was determined by graphing the copy number of each standard against the cycle at which it passes the threshold (Ct) for (i) all standards generated in the assay and (ii) standard points used in the regression to generate the copy numbers (often points at 1 copy/reaction were omitted because they deviated far away from the line of best fit).

Failure rates were calculated for each mix at the 1 copy/reaction to determine how many failed to generate a Ct at 40 cycles. These values were assigned a Ct of 40 cycles.

An occupancy modelling approach based on multiple temporal samples was used to estimate detection probabilities, and to assess the influence of water temperature, stream flow levels, and number of water samples collected on Redside Dace detection

29

(MacKenzie et al. 2002). The detection probability associated with electrofishing was not specifically modelled, so the detection ability of the gears was compared using naïve detections. A site was considered positive for Redside Dace eDNA presence if at least one of the nine sampling replicates had Redside Dace DNA. The approach estimates the probability of site occupancy and detection probability (the probability of detecting a species in an individual survey or sample if it is present) using maximum likelihood procedures (MacKenzie et al. 2002). Detection probabilities can be more accurately estimated by adding covariates and using the logistic formula exp(XB)/(1+exp(XB), where X represents covariate data, and B represents the vector of model parameters

(MacKenzie et al. 2002). Detection probability is an important statistic for monitoring programs because it allows one to determine the conditions under which sampling would be most effective and the amount of sampling effort required to increase chances of detection. Single-season occupancy modelling assumes that: 1) during the sampling period, the site is closed to occupancy changes, 2) false positives do not occur, 3) detecting a species at one site is independent of detecting a species at another site, and 4) probability of occupancy and detection are constant across sites or are modelled as a function of covariates (MacKenzie et al. 2002). For this study, it was assumed that occupancy is constant across sites.

Program PRESENCE 6.2 (Hines 2006) was used to estimate detection probabilities and standard errors. Three copies/reaction were used as a threshold for a positive detection in individual samples (see Appendix 2.5), because it incorporates all the negatives that had a reading of greater than 0 copies/reaction. To examine the repeatability of sampling results within a season using eDNA, detection probabilities

30 were calculated at the 11 repeated sites using a null model (no covariates) during spring week one and week two, and fall week one and week two (objective i). Additionally, candidate sets of four models including detection covariates were tested at three, four, and five temporal replicates during the spring and the fall to look at between season repeatability (objective ii). The models included in each candidate set were: 1) constant detection probability (null model, detection probability is the same across all surveys and sites), p(.); 2) detection probability with water temperature as a site-specific covariate, p(temp); 3) detection probability with an index of flow as a covariate p(flow); and 4) detection probability with water temperature and flow as covariates, p(temp+flow).

Correlations between the two covariates were assessed using Spearman’s rank correlation rho, using the stats package in R studio (R Core Team 2013). Temperature was chosen as a covariate because it could influence factors that break down/preserve DNA; I predicted that an increase in temperature would result in lower detection probability. An index of flow (depth x wetted width) was chosen as a covariate because environmental DNA concentrations are considered to negatively covary with flow (Klymus et al. 2014).

Estimated occupancy and detection probabilities were obtained for models analyzed using

PRESENCE. An information theoretic approach was used to compare competing models based on Akaike’s Information Criterion corrected for small sample sizes (AICc). The number of sampled sites was used as the sample size when calculating AICc (e.g., n = 29 for the spring and fall sampling seasons). Goodness of fit was assessed using the global model in each candidate set (p(temp+flow)), using the Pearson chi-square statistic and

10,000 bootstraps; over dispersion in the data was assessed by estimating ĉ (MacKenzie and Bailey 2004). Where there was over dispersion (ĉ > 1), Quasi-AIC corrected for small sample size (QAICc) was used for model selection.

31

The detection probability of the spatial sampling could not be directly modelled due to the spatial nature of the replicates. Therefore the spatial and temporal sampling schemes were directly compared by examining detection histories for both methods at sites where Redside Dace were detected (objective iii). To compare the efficiencies of fall electrofishing and eDNA samples, the number of unique sites with Redside Dace detections for each gear type, and the number of sites with detections by both sampling methods were determined (objective iv). The minimum number of temporal water samples needed to be 95% confident of detecting Redside Dace when present at a particular site, was calculated using the formula: 1-[1-p]k, where k represents number of replicates, and p represents the model-averaged detection probability (Pellet and Schmidt

2005). Minimum sample effort was calculated for each season and each replicate scenario.

Results

The fall eDNA samples (n = 316) had a mean value of 13.0 copies/reaction (based on triplicate mean per sample), a maximum value of 1156.9 copies/reaction, and a median of

1.3 copies/reaction. The spring eDNA samples (n = 316) had a mean value of 8.8 copies/reaction, a maximum value of 154.8 copies/reaction, and a median value of 2.8 copies/reaction. These values were obtained for all water samples, including those with no detections. Overall, 69% of the qPCR negative controls (field blanks, lab negatives, lake controls) contained no Redside Dace DNA; however, the samples that did have a copy number reading contained minute amounts of DNA. Control samples (n = 258) had a maximum copy number of 3.3 copies/reaction at the filter control, and had an overall

32 mean copy number of 0.09 copies/reaction, standard deviation of 0.3, and a median of 0 copies/reaction (Figure 2.3).

High variability was found between qPCR sample triplicates for both control and environmental samples. The coefficient of variation (CV) (n = 890) had a mean of 57.7%,

(median CV=36.7%, max = 173) (Figure 2.4). The mean copy number of samples (n =

88) having a CV value of greater than 150% was 0.2 copies/reaction (median=0.2 copies/reaction, max=1.9 copies/reaction). Real-time PCR results from eDNA standards of known concentrations indicated a decrease in the accuracy of qPCR to quantify DNA copy numbers within a reaction, as copy number decreased (Figure 2.5). The variance in qPCR copy number increased as copy number decreased, as indicated by the greater variation (larger boxplot spread) at 10 copies/reaction and 1 copy/reaction. At a test concentration of 1 template copy/reaction, the control assay was unable to detect Redside

Dace in two of the seven replicates. The qPCR output for all controls had a reading of 0 copies/reaction. A comparison of all standards used to test environmental samples, indicates that there was high overlap between the 10 copies/reaction and 1 copy/reaction

Ct (Figure 2.6), and a failure rate of 34% for the 1 copy/reaction standards to amplify.

Redside Dace eDNA was detected at 18 of 29 sites sampled during fall, and 16 of

29 sites sampled during spring using only eDNA monitoring. Redside Dace were detected at 13 sites in both spring and fall and were detected at 5 unique sites in the fall and at 3 unique sites in the spring using only eDNA detection. Electrofishing only detected

Redside Dace at 14 of 29 sites sampled in the fall; eDNA was able to detect Redside Dace at 7 sites that had no detections with electrofishing, while electrofishing had 3 sites with

33 detections that were not detected by eDNA. Redside Dace were detected using both sampling methods at 11 of 29 sites (See Appendix 2.6).

The number of sites with positive detections differed between sampling weeks

(sampling separated by approximately 10 d) within the same season (Figure 2.7). Of the

11 sampled sites, six were positive for Redside Dace during fall week one while only three of these same sites tested positive for Redside Dace during fall week two. Number of positive detections were higher during spring, with seven sites testing positive for

Redside Dace during week one and nine positives during week two (with all sites that tested positive in week one testing positive in week two). At two sites, Redside Dace eDNA was not detected during either season or either week (Figure 2.7). The inconsistency between sampling weeks is also reflected by differences in estimated detection probabilities. Fall water sampling during week one resulted in a detection probability of 0.63 + 0.09. The detection probability increased approximately 0.40 to 1.0, implying perfect detection during week two (Figure 2.8). Spring detection probabilities were more similar between weeks (week one detection probability = 0.68 + 0.08; week two detection probability = 0.80 + 0.06) (Figure 2.8).

Detection probabilities for eDNA sampling were consistently high; ranging from

0.62 to 0.90 depending on the model and season (Table 2.2). There were no correlations between flow index and water temperature during spring (rs=-0.1, p>0.05) and fall (rs=-

0.2, p>0.05). None of the candidate sets show evidence of lack of fit (i.e., p > 0.05) and only the candidate set for fall samples with five replicates showed signs of overdispersion

(ĉ = 1.39). For each model that was tested in both seasons, spring detection probability was always higher than fall detection probability. For example, the highest model-average

34 estimate of p during the fall was 0.73 (+ 0.096); while the lowest model-average estimate of p during the spring was 0.74 (+ 0.081). Within a season, the importance of covariates for both spring and fall detection probabilities was variable. During the fall season at the various replicates, the null model was always the best or second ranked model (Appendix

2.7). For the spring season, both the temperature model and the additive model of temperature and flow, were the best models (Appendix 2.7). A plot of individual site detection probabilities against flow and temperature indicates that detection probabilities increase as flow decreases and temperature increases. Detection probabilities were lowest at temperatures below 13 ºC (Figure 2.9).

As the number of temporal samples increased, detection probability decreased, although there was an increase in the estimated occupancy rates for both fall and spring.

At three replicates for the spring null model, there was an occupancy estimate of 0.52 (+

0.09) and a detection probability of 0.89 (+ 0.05), while there was an occupancy estimate of 0.58 (+ 0.09) and a detection probability of 0.80 (+ 0.04) at five temporal replicates. A similar pattern was present during the fall sampling season. At three replicates for the fall null model, there was an occupancy estimate of 0.45 (+ 0.09) and detection probability of

0.76 (+ 0.07), while at five temporal replicates there was an occupancy of 0.53 (+ 0.09) and detectability of 0.65 (+ 0.06) (Table 2.2). Standard errors of detection probability estimates tended to be slightly smaller for the models with additional temporal replicates.

Using the formula 1-[1-p] k and model-averaged estimates of p for each replicate during each season, two or three temporal samples were needed to be 95% confident of detection

Redside Dace when present at a site (Appendix 2.7).

A comparison of water samples collected over a short-time period (temporal) at single locations versus along a stream reach (spatial) indicated that both approaches

35 provided similar results in the fall. Spatial sampling performed slightly better in the spring. With one water sample, 13 sites tested positive for Redside Dace eDNA in the spring for both temporal and spatial sampling (Figure 2.10). In the fall, one temporal replicate resulted in 12 positive detections and one spatial replicate resulted in 10 positive detections. When the number of samples increased to the fourth collection replicate, both temporal and spatial sampling in the spring detected Redside Dace at 15 sites. In the fall, temporal and spatial sampling detected Redside Dace at 13 sites and 15 sites, respectively

(Figure 2.10).

Discussion

A successful monitoring program includes efficient survey design, statistical power needed to detect change, and sensitive methodology (Legg and Nagy 2006). Major advantages of using eDNA as a monitoring tool, are that it is non-intrusive to target and non-target species and their habitats, sampling can be done without specialized or costly field equipment, and volunteers are able to participate in field collections with limited expertise (Portt et al. 2006, Darling and Mahon 2011, Biggs et al. 2015). Despite the increased use of eDNA monitoring (Ficetola et al. 2008, Darling and Mahon 2011, Barnes et al. 2014), few studies have explored basic sampling procedures needed to be confident of a species’ absence or presence at a particular site (Schmidt et al. 2013, Ficetola et al.

2014). This study has helped to fill in these knowledge gaps by intensively collecting water samples spatially, temporally and seasonally throughout 29 sites in watersheds where Redside Dace are known to be present.

Environmental DNA detection probabilities differed between the fall and spring sampling seasons, as well as between weeks within the same season. Spring had a higher

36 number of sites with positive detections, as well as a higher detection probability estimate compared to the fall. These results were contrary to my initial prediction, in which I expected that the higher water flow during the spring would have created a DNA dilution effect, thereby reducing the genetic signal present within the water. The lack of a dilution effect could result from the limited differences in stream flow between seasons. The results from this study were consistent with the literature, however, in which Jane et al.

(2014) found that eDNA copy number was steady across distances at high flow compared to low flow. Additionally, different sampling weeks within the same season (approximate

10-day sampling difference) had varying detection probability estimates, and were all relatively high. While estimates were fairly consistent between weeks during the spring, the fall detection probability had both the highest and lowest detection values of the 4 weeks (Figure 2.8). A limitation to these detection probability estimates, are that sample size was low (n = 11), which could result in biases. Additionally, differences in detection probability may be the result of changes in Redside Dace abundance between sampling weeks, changes in occupancy status (for example, the closure assumption may be violated, and the weeks could represent different sampling seasons), or it could be the result of other factors not measured. Detection probability might be higher in the spring as a result of spawning season (Scott and Crossman 1973), which would cause higher fish activity and higher local abundances of adult fish, and therefore more cells would be shed into the environment (Klymus et al. 2014).

Environmental DNA had the highest detection probability at three temporal replicates, while highest occupancy rates were obtained at five temporal replicates. A tradeoff occurs between sampling more sites and obtaining more replicates at a site, and

37 one must be able to determine if the added effort at one location is worth both the time and the cost (Gibbs et al. 1998). Based on the results from this study, it is recommended that three temporal replicates be sampled at each site. The extra initiative to collect five temporal samples only resulted in an additional one and two sites with positive detections during the spring and fall, respectively, and therefore the additional efforts could be aimed at either targeting other reaches within a stream, or sampling additional watersheds.

The spatial replicate sampling was more effective than temporal replicates for total number of positive detections; however, the differences between the two sampling approaches were not substantial. A study of how eDNA varies horizontally and vertically within a water column determined that high eDNA variability does exist over a smaller spatial scale (~10 metre intervals), and that before more extensive sampling for aquatic species occurs, initial surveys should employ fine-scale collections (Eichmiller et al.

2014, Laramie et al. 2015). In the context of my study, the main advantage of using temporal replicates would be to obtain detection probability estimates (since spatial replicates are more difficult to model, and they are likely sampling a larger total sample site). However, temporal replicates at a single location may fail to detect DNA from

Redside Dace further upstream that are detected by upstream spatial replicates within the site. By taking spatial replicates, it would (i) allow DNA to be collected over a wider spatial range, given that DNA is unevenly distributed throughout the waterbody, and (ii) sample collection would take a shorter time as a result of no time lapse between spatial replicates other than the time required to move upstream with sampling gear.

38

Despite a strong correspondence of sites with Redside Dace detections via electrofishing and eDNA monitoring, eDNA was able to detect Redside Dace at a greater number of sites. This finding is consistent throughout the literature, as demonstrated within the Slackwater Darter (Etheostoma boschungi) (Janosik and Johnston 2015),

Eastern Hellbender (Cryptobranchus a. alleganiensis) (Spear et al. 2015), and Great

Crested Newt (Triturus cristatus) (Biggs et al. 2015) study systems. A point worth noting is that eDNA did not detect Redside Dace at three sites where the species was detected by electrofishing. This could be the result of low-density populations, an example being

Lynde Creek, which has been the focus of several conservation efforts (Redside Dace

Recovery Team 2010). Therefore, both methods have their own advantages and disadvantages, and instead of one method replacing the other, the two should be able to complement each other. Important considerations when using traditional gear are the seasonal bias, age class bias, and difficulty standardizing efforts when using multiple gear types (Pope and Willis 1996, Bonvechio et al. 2008, Meye and Ikomi 2012, Fischer and

Quist 2014).

Conclusion

The major findings of the study were (i) spring sampling resulted in higher detection probabilities than fall sampling, (ii) eDNA repeatability was variable between sampling weeks of the same season, (iii) a minimum of three replicates was needed at a site to ensure confidence that Redside Dace were detected when present, (iv) results were comparable for temporal versus spatial sampling, and (v) eDNA detected Redside Dace at more sites than electrofishing, and (vi) detection probability using eDNA was consistently high. Despite the increased use of eDNA for invasive species, its application to species at

39 risk has been largely lacking. This study demonstrated that eDNA is a sensitive tool for detecting Redside Dace, however, sampling design is key in order to increase the chances of detecting a species. Based on results, a recommendation for future monitoring efforts that can be extended to other cyprinids and stream fishes would be to use electrofishing as a first pass to monitoring. Reasons why one might get a signal from eDNA despite electrofishing failing could be as a result of the fish being upstream of the sampling site, as well as low population-densities that might make it difficult to catch the fish. Despite eDNA having a higher success rate of detecting Redside Dace, the logistical costs associated with eDNA exceeded the benefits of using electrofishing as a first pass, and should be implemented if traditional gear are unable to detect a species presence (see

Appendix 2.8). Along with budgetary costs, using traditional gear has the benefit of physically verifying species presence. If Redside Dace are not detected at a site using traditional gear, eDNA sampling should be used as a second approach because of its potential for increased sensitivity. Precise documentation of an endangered species’ distribution is critical not only for their protection, but also for the protection of other species that occupy similar habitats. It is also important to be able to recognize true absences, so that project developers are able to use the land without worry of future restrictions if an endangered species were to be found after development. With limited resources available for the recovery of a species, comprehensive knowledge of their distribution is critical so that appropriate efforts can be invested on locations that contain the target species.

Despite the knowledge that this study has contributed to the growing field of eDNA, there is a lot to be learned. Future efforts could focus on (i) determining how other

40 factors such as turbidity and overhanging vegetation influence Redside Dace occupancy,

(ii) adding more sampling sites so that within-season detection probability can be more thoroughly investigated, and (iii) investigating the effectiveness of eDNA to detect other stream fishes of management concern, such as Brook Trout (Salvelinus fontinalis). While more information needs to be collected before eDNA can be used routinely, it is a sensitive tool for species monitoring and its use should continue to be explored for other aquatic endangered species and systems.

References

Barnes MA, Turner CR, Jerde CL, Renshaw MA, et al. (2014) Environmental conditions influence eDNA persistence in aquatic systems. Environmental Science & Technology, 48, 1819–1827.

Biggs J, Ewald N, Valentini A, Gaboriaud C, et al. (2015) Using eDNA to develop a national citizen science-based monitoring programme for the great crested newt (Triturus cristatus). Biological Conservation, 183, 19–28.

Bonvechio TF, Pouder WF, Hale MM (2008) Variation between electrofishing and otter trawling for sampling Black Crappies in two Florida Lakes. North American Journal of Fisheries Management, 28, 188–192.

Bronnenhuber JE, Wilson CC (2013) Combining species-specific COI primers with environmental DNA analysis for targeted detection of rare freshwater species. Conservation Genetics Resources, 5, 971–975.

Campbell SP, Clark JA, Crampton LH, Guerry AD, et al. (2002) An assessment of monitoring efforts in endangered species recovery plans. Ecological Applications, 12, 674–681.

COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. Vii+59 pp. (www.sararegistry.gc.ca/status/status_e.cfm).

Daniels RA, Wisniewski SJ (1994) Feeding ecology of redside dace, Clinostomus elongatus. Ecology of Freshwater Fish, 3, 176-183.

41

Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based methods for monitoring biological invasions in aquatic environments. Environmental Research, 111, 978–988.

Eichmiller JJ, Bajer PG, Sorensen PW (2014) The relationship between the distribution of common carp and their environmental DNA in a small lake. PLoS ONE, 9,e112611.

Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using environmental DNA from water samples. Biology letters, 4, 423–425.

Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false presences, and the estimation of presence / absence from eDNA metabarcoding data. Molecular Ecology Resources, 15, 543–556.

Fischer JR, Quist MC (2014) Gear and seasonal bias associated with abundance and size structure estimates for lentic freshwater fishes. Journal of Fish and Wildlife Management, 5, 394–412.

Frankham R, Ballou JD, Briscoe DA (2002) Introduction to Conservation Genetics. Cambridge, UK: Cambridge University press. 617 p.

Gibbs JP, Droege S, Eagle P (1998) Monitoring populations of plants and . American Institute of Biological Sciences, 48, 935–940.

Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits P (2013) Environmental DNA as a new method for early detection of New Zealand mudsnails (Potamopyrgus antipodarum). Freshwater Science, 32, 792–800.

Hebert PDN, Cywinska A, Ball SL, deWaard JR (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society of London. Series B, 270, 313–321.

Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western North American cyprinid genus Richardsonius, with an overview of phylogeographic structure. Molecular Phylogenetics and Evolution, 55, 259-273.

Heid C, Stevens J, Livak JK, Williams PM (1996) Real time quantitative PCR. Genome Research, 6, 986–994.

Hines JE (2006) PRESENCE2 - Software to estimate patch occupancy and related 615 parameters. USGS-PWRC. http://www.mbr-pwrc.usgs.gov/software/presence.html.

Ivanova NV, Zemlak TS, Hanner RH, Hebert PDN (2007) Universal primer cocktails for fish DNA barcoding. Molecular Ecology Notes, 7, 544–548.

42

Jane S, Taylor WM, Mckelvey KS, Young MK (2014) Distance, flow and PCR inhibition: eDNA dynamics in two headwater streams. Molecular Ecology Resources, 15, 216-227.

Janosik AM, Johnston CE (2015) Environmental DNA as an effective tool for detection of imperiled fishes. Environmental Biology of Fishes, 98, 1889–1893.

Jerde CL, Chadderton WL, Mahon AR, Renshaw MA, et al. (2013) Detection of Asian carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of Fisheries and Aquatic Sciences, 70, 522–526.

Klymus KE, Richter CA, Chapman DC, Paukert C (2014) Quantification of eDNA shedding rates from invasive bighead carp Hypophthalmichthys nobilis and silver carp Hypophthalmichthys molitrix. Biological Conservation, 183, 77-84.

Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an endangered salmonid using environmental DNA analysis. Biological Conservation, 183, 29–37.

Legg CJ, Nagy L (2006) Why most conservation monitoring is, but need not be, a waste of time. Journal of environmental management, 78, 194–199.

MacKenzie DI., Nichols JD., Lachman GB., Droege S, et al. (2002) Estimating site occupancy rates when detection probabilities are less than one. Ecology, 83, 2248-2255.

MacKenzie DI, Bailey LL (2004) Assessing the fit of site-occupancy models. Journal of Agricultural, Biological, and Environmental Statistics, 9, 300-318.

MacKenzie DI, Nichols JD, Royle JA, Pollock KH (2006) Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Burlington, MA.

Mahon AR, Barnes MA, Li F, Egan SP, et al. (2013) DNA-based species detection capabilities using laser transmission spectroscopy DNA-based species detection capabilities using laser transmission spectroscopy. Journal of the Royal Society Interface, 10, 20120637.

Mandrak NE, Crossman EJ (1992) A Checklist of Ontario Freshwater Fishes. Royal Ontario Museum, Toronto, ON.

Meye JA, Ikomi RB (2012) Seasonal fish abundance and fishing gear efficiency in river Orogodo, Niger Delta, Nigeria. World Journal of Fish and Marine Sciences, 4, 191–200.

Nielsen JL (1998) Scientific sampling effects : electrofishing California’s endangered fish populations. Fisheries, 23, 6–12.

43

Novinger DC, Coon TG (2000) behavior and physiology of the Redside Dace, Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of Fishes, 57, 315–326.

Parker B, Mckee P, Campbell RR (1987) COSEWIC status report on the redside dace Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. 1-20pp.

Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus elongatus, in Canada. Canadian Field Naturalist, 102,163-169.

Pellet J, Schmidt BR (2005) Monitoring distributions using call surveys: estimating site occupancy, detection probabilities and inferring absence. Biological Conservation, 123, 27-35.

Poos M, Lawrie D, TU C, Jackson DA, Mandrak NE (2012) Estimating local and regional population sizes for an endangered minnow, redside dace (Clinostomus elongatus), in Canada. Aquatic Conservation: Marine and Freshwater Ecosystems, 22, 47–57.

Pope KL, Willis DW (1996) Seasonal influences on freshwater fisheries sampling data. Reviews in Fisheries Science, 4, 57–73.

Portt CB, Coker GA, Ming DL, Randall RG (2006) A review of fish sampling methods commonly used in Canadian freshwater habitats. Canadian Technical Report of Fisheries Aquatic Sciences. 2604 p.

R Core Team (2013) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.

Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus elongatus) in Ontario. Ontario Recovery Strategy Series. Prepared for the Ontario Ministry of Natural Resources, Peterborough, Ontario. vi + 29 pp.

Reid SM, Jones NE, Yunker G (2008) Evaluation of single-pass electrofishing and rapid habitat assessment for monitoring Redside Dace. North American Journal of Fisheries Management, 28, 50–56.

Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy models in the analysis of environmental DNA presence/absence surveys: a case study of an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.

Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Fisheries Research Board of Canada, 184.

44

Serrao NR, Steinke D, Hanner RH (2014) Calibrating snakehead diversity with DNA barcodes: expanding taxonomic coverage to enable identification of potential and established invasive species, PLoS ONE, 9, e99546.

Spear SF, Groves J.D, Williams LA, Waits LP (2015) Using environmental DNA methods to improve detectability in a hellbender (Cryptobranchus alleganiensis) monitoring program. Biological Conservation, 183, 38–45.

Stanfield, L (2013) Ontario Stream Assessment Protocol. Version 9.0. Fisheries Policy Section. Ontario Ministry of Natural Resources. Peterborough, Ontario. 505 Pages.

Takahara T, Minamoto T, Yamanaka H, Doi H, Kawabata Z (2012) Estimation of fish biomass using environmental DNA. PLoS ONE, 7, e35868.

Thompson WL (2004). Sampling Rare or Elusive Species: Concepts and Techniques for Estimating Population Parameters. Island Press, Washington DC, USA.

Thomsen PF, Kielgast J, Iversen LL, Wiuf C, et al. (2012) Monitoring endangered freshwater biodiversity using environmental DNA. Molecular Ecology, 21, 2565–2573.

Victor BC, Hanner R, Shivji M, Hyde J, Caldow C (2009) Identification of the larval and juvenile stages of the Cubera Snapper, Lutjanus cyanopterus, using DNA barcoding. Zootaxa, 2215, 24–36.

Wilcox TM, McKelvey KS, Jane SF, Lowe WH, et al. (2013) Robust detection of rare species using environmental DNA: the importance of primer specificity. PLoS ONE, 8, e59520.

Wozney KM, Wilson PJ (2012) Real-time PCR detection and quantification of elephantid DNA: Species identification for highly processed samples associated with the ivory trade. Forensic Science International, 219, 106-112.

Wilson CC, Dextrase AJ (2008) Sampling protocols for Redside Dace. Prepared for the Ontario Ministry of Natural Resources, Peterborough, Ontario. vi + 4 pp.

Yoccoz NG (2012) The future of environmental DNA in ecology. Molecular Ecology, 21(8), 2031-2038.

45

Table 2.1: Mean, standard deviation, maximum, and minimum values of environmental variables for 29 sites sampled for eDNA testing for Redside Dace.

Spring Fall Standard Habitat Standard Mean Deviation Min Max Mean Min Max Characteristic Deviation

Mean Channel 3.54 1.83 1.02 7.20 3.66 2.00 0.86 7.80 Width (m) Mean Water 0.24 0.08 0.07 0.40 0.20 0.07 0.09 0.36 Depth (m) Index of Flow 0.90 0.64 0.12 2.64 0.75 0.53 0.09 2.09 (m2) Temperature 15.8 2.2 12.0 20.2 14.4 3.3 7.6 20.6 (ºC) Conductivity 674 244.6 183.0 1302.0 852.4 289.7 490.0 1514.0 (μS)

46

Table 2.2: Estimates of Redside Dace detection probability and occupancy and ΔAICc values from models for spring and fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates.

R Model Fall Fall ΔAICc Spring Spring ΔAICc Occupancy Detection Occupancy Detection ψ (+ SE) probability ψ (+ SE) Probability P (+ SE) P (+ SE) 3 ψ(.)p(.) 0.45 0.76 0.52 0.89 (0.094) (0.07) 0 (0.0929) (0.050) 2.26 ψ(.)p(temp) 0.49 0.69 0.5 0.81 (0.10) (0.093) 0.22 (0.0929) (0.087) 0 ψ (.)p(flow) 0.45 0.77 0.52 0.90 (0.094) (0.11) 2.35 (0.093) (0.06) 3.82 ψ (.)p(temp+flow) 0.49 0.71 0.56 0.84 (0.10) (0.11) 2.38 (0.099) (0.076) 0.24 4 ψ(.)p(.) 0.44 0.77 0.55 0.86 (0.092) (0.060) 1.29 (0.092) (0.044) 7.53 ψ(.)p(temp) 0.48 0.70 0.59 0.79 (0.10) (0.082) 0 (0.09) (0.069) 1.14 ψ(.)p(flow) 0.45 0.76 0.55 0.85 (0.092) (0.092) 3.56 (0.093) (0.059) 7.26 ψ(.)p(temp+flow) 0.48 0.70 0.58 0.78 (0.1014) (0.10) 2.69 (0.099) (0.088) 0 5 ψ(.)p(.) 0.52 0.65 0.58 0.80 (0.093) (0.056) 0.56 (0.092) (0.044) 8.19 ψ(.)p(temp) 0.52 0.66 0.59 0.76 (0.093) (0.077) 2.14 (0.093) (0.077) 5.72 ψ(.)p(flow) 0.53 0.62 0.58 0.79 (0.096 ) (0.085) 0 (0.092) (0.058) 5.23 ψ(.)p(temp+flow) 0.53 0.63 0.61 0.73 (0.096) (0.10) 1.76 ( 0.097) (0.081) 0

47

Figure 2.1: Map of 29 Redside Dace eDNA sampling sites from Fall 2012 and Spring 2013 sampling season (grey circles), 10 lake negative control sampling sites (black triangles) from Spring 2013, and Otonabee River field blank (star) to help establish detection threshold.

48

Figure 2.2: Plot of log10 transformed template DNA copy number (x-axis) versus dilutions for four Redside Dace eDNA samples in order to test for inhibition at four sampling locations (LC1= Lynde Creek 1, LC2= Lynde Creek 2, MC1= Mitchell Creek 1, MC2= Mitchell Creek 2).

49

Figure 2.3: Histogram of negative controls copy numbers/reaction of amplified Redside Dace eDNA (x-axis) versus frequency (y-axis) for four types of: (a) filter control (n=168, x̅ =0.091, s=0.288), (b) lake control (n=32, x̅ =0.048, s=0.13), (c) DNA extraction control (n=31, x̅ =0.081, s=0.29), (d) field control (n=27, x̅ =0.18, s=0.39).

50

Figure 2.4: Scatter plot for mean copy number /reaction of each sample run in triplicates (y-axis) versus the coefficient of variation of those values (CV; x-axis) (left) and histogram of CV versus the frequency of samples that fall under the CV (right).

51

Figure 2.5: Boxplot of qPCR standards with known DNA concentrations (1000 copies/ reaction down to 1 copy/ reaction) set as “eDNA unknowns” versus copy number log10 transformed (y-axis), as a test for qPCR accuracy.

52

Figure 2.6: Boxplot of Redside Dace standards (106 down to 100 copies/reaction) at the threshold cycle (Ct) where the copy number passes the baseline threshold for (A) omitted (data points for the standard curve were removed to improve R2 value) (B) All standards (no data points excluded).

R2 F3 F2 F1 P1 DU3 DU2 DU1 L2 L1 D1

0 1 2 3 4 5 Number of Detections

Fall 2 Fall 1 Spring 2 Spring 1

Figure 2.7: Barplot of total temporal Redside Dace detections (x-axis) found at each of the eleven sampled sites (y-axis). Five temporal replicates were collected at each season (fall and spring) twice, with a time lapse of approximately 10 d between sampling weeks within a season. Site labels on y-axis are listed in Appendix 2.1.

53

54

Figure 2.8: Detection probabilities (y-axis) within seasons (x-axis) for Fall Week 1 (FW1), Fall Week 2 (FW2), Spring Week 1 (SW1), and Spring Week 2 (SW2) (error bars represent upper and lower 95% confidence limits of estimates).

55

1.2

1

0.8

0.6

0.4 Detection Detection Probability

0.2

0 0 0.5 1 1.5 2 2.5 Flow index (m2)

1.2

1

0.8

0.6

0.4 Detection Detection Probability 0.2

0 0 5 10 15 20 25 Temperature (oC)

Figure 2.9: Individual site detection probability estimates for index of flow versus detection probability (top), and temperature versus detection probability (bottom), during Spring at 5 replicates.

four

three Legend Spatial-Spring

two Temporal-Spring

Spatial-Fall Number Replicatesof Number one Temporal-Fall

0 2 4 6 8 10 12 14 16 Number of Sites

Figure 2.10: A comparison of the number of sites (out of n=29) with Redside Dace DNA detections (x-axis), versus the number of replicates sampled (y-axis), for a) the four spatially replicated samples collected at each site and b) the four temporally replicated samples collected at each site in each season.

56

57

Chapter 3: Conservation genetics of Redside Dace (Clinostomus elongatus): phylogeography and contemporary spatial structure

Abstract

Redside Dace Clinostomus elongatus (Teleostei: Cyprinidae) is a species of conservation concern that is declining throughout its range as a result of urban development and agricultural activities. The purpose of this study was to use mitochondrial and microsatellite data to characterize genetic diversity for Redside Dace to understand how past and current events have shaped their genetic relationships.

Phylogeographic structure among 28 Redside Dace populations throughout Ontario and the United States was assessed by sequence analysis of the mitochondrial cytochrome b and ATPase 6 and 8 genes. Populations were also genotyped using 10 microsatellite loci to examine genetic diversity within and among populations as well as contemporary spatial structuring. Mitochondrial DNA data revealed three geographically distinct lineages, which were highly concordant with the three groups identified via microsatellite analysis. Additionally, secondary contact was observed within the Allegheny River and tributaries to . The refugial groups in this study differed from the one refugium (Mississippian), and two refugia (Mississippian and Atlantic) hypotheses presented in the literature. With the exception of three allopatric populations within the

Allegheny watershed, high genetic structuring between populations suggests their isolation, indicating that recovery efforts should be population-based.

58

Introduction

Contemporary species’ distributions have largely been influenced by historical environmental changes during the Quaternary (Hocutt and Wiley 1986, Hanfling et al.

2002, Gum et al. 2005). Glaciation events during the Pleistocene played an important role in influencing evolutionary history; in North America, most of Canada and New England was repeatedly covered by ice sheets (Hocutt and Wiley 1986, Overpeck et al. 1992).

During cycles of glacial advance and retreat, species now found in formerly glaciated areas were repeatedly displaced into peripheral (usually southern) refugia in order to survive (Hocutt and Wiley 1986). These historical events have had a profound impact on contemporary species’ distributions and genetic structure and diversity within species

(Bernatchez and Wilson 1998, Hewitt 2004). More recently, anthropogenic influences have also had profound impacts on contemporary species distributions (Wang et al. 2001,

Leidy et al. 2011). Anthropogenic influences can negatively impact aquatic environments by restricting fish dispersal, causing habitat loss, and degrading water quality (Helfman

2007). This includes the construction of dams that impede fish migration, the removal of riparian vegetation which increases turbidity and sedimentation into waters, and increased agricultural activities resulting in the release of harmful chemicals into the water

(Helfman 2007). Disentangling how past and more recent events have influenced fish distribution can be a daunting task; however, it is necessary to understand how historic and contemporary processes have contributed to observed patterns.

In many cases, genetic information can improve our understanding of species biology, ecology, and behaviour, and can help inform management decisions. Many studies have demonstrated that freshwater fishes have retained genetic historical

59 signatures that reflect glacial influences and events (Bernatchez and Wilson 1998); this can provide important information on postglacial colonization routes and evolutionary lineages for management and conservation (Wilson and Hebert 1998, Ginson et al. 2015).

At a more contemporary scale, conservation genetics provides information on evolutionary lineages within species, their genetic diversity, migration levels and population structure, which can aid in the identification of appropriate conservation and management units for a species of concern (Frankham et al. 2002, McDermid et al. 2011).

This information can also be used to assess the genetic consequences of natural or anthropogenic factors that impact population health and viability (Avise 2000, Frankham et al. 2002). Genetic information at multiple temporal scales, along with information on distribution range, abundances, and critical habitat designation are some of the considerations required to conserve rare species (Frankham et al. 2002, MacKenzie et al.

2006, Helfman 2007).

The Redside Dace, Clinostomus elongatus (Teleostei: Cyprinidae), is a small freshwater fish that typifies many of these conservation concerns and information needs.

Redside Dace are stream fish that are generally found in pools, and occupy a disjunct distribution throughout the upper Mississippi River Drainage, Great Lakes Basin, Ohio

River and upper Susquehanna River (Novinger and Coon 2000, COSEWIC 2007). Within

Canada, Redside Dace populations are largely restricted to southern Ontario, with the exception of one northern population near Sault Saint Marie (Redside Dace Recovery

Team 2010). Populations have been declining throughout their range as a result of urbanization and agricultural activities, and 40% of Ontario populations are thought to be extirpated (Parker et al. 1988, COSEWIC 2007, Redside Dace Recovery Team 2010). In

60

1973, Redside Dace was identified as a species of conservation concern on the grounds that they were less common than 30 years prior (Scott and Crossman 1973). In 1987,

Redside Dace was designated as being of Special Concern by the Committee on the

Status of Endangered Wildlife in Canada (COSEWIC) (Parker et al. 1988) and reassessed as Endangered in 2007 (COSEWIC 2007). The species was listed as Endangered under

Ontario’s Endangered Species Act in 2009 (OMNRF 2015). Within Ontario, urban development is a major threat to populations in watersheds draining into western Lake

Ontario (Redside Dace Recovery Team 2010). Research to date has focused attention on habitat associations, monitoring techniques, threats, and approaches to augmentation; by contrast, only limited information exists on its historical origins and genetic structuring

(Berendzen and Dugan 2008, Houston et al. 2010, Redside Dace Recovery Team 2010,

Sweeten 2012).

Redside Dace were able to persist in glacial refugia and subsequently recolonize

Ontario; however, competing hypotheses exist in the literature of one versus two refugia.

Based on distributional data, Hocutt and Wiley (1986) suggested that Redside Dace colonized their contemporary range from a single (Mississippian) glacial refugium, whereas Mandrak and Crossman (1992) suggested that contemporary populations may have originated from two (Atlantic and Mississippian) refugia. There exists the potential for more than two glacial refugia, as these hypotheses are only based on distributional data; the use of genetic data can provide insight into distinct lineages that can be used to infer number of refugial groups. Additionally, large scale phylogeography has never been applied across the Redside Dace range, and can be informative for detecting structure, ancestry and relationships within a species range (Avise 2000). It is worth noting that

61 multiple refugial groups are a common interpretation (Mandrak and Crossman 1992,

Soltis et al. 2006, Ginson et al. 2015) among other southern Ontario freshwater fishes. A better understanding of glacial refugia and postglacial dispersal routes should provide valuable insights into how these affected present patterns in genetic diversity and structure of Redside Dace, as well as other species occupying similar ranges.

Few studies have looked at contemporary and historical genetic structuring of

Redside Dace populations. Houston et al. (2010) assessed molecular systematic relationships within a subset of the cyprinid family based on mitochondrial (mtDNA) and nuclear data, and resolved Clinostomus as the sister group to the Richardsonius-

Lotichthys clade, refuting previous interpretations which grouped Clinostomus as a sister group to Richardsonius. They also used mtDNA to infer that C. elongatus and C. funduloides diverged approximately 2.6 million years ago, and determined the mutation rate of cytochrome b to be ~1.7% per million years (Houston et al. 2010). Berendzen et al. (2008) assessed genetic patterns of Redside Dace using cytochrome b within and among several tributaries of the upper Mississippi River, and found shallow patterns of genetic divergences among the three tributaries. After assessing the same populations using microsatellite analysis, high among population variation was observed, indicating that conservation efforts would need to be drainage based (Berendzen et al. 2008). More recently, Pitcher et al. (2009) designed a set of eight microsatellite loci for Redside Dace and its congener (C. funduloides) to be used in characterizing population structuring and genetic diversity within Redside Dace. These loci have subsequently been used to assess mate choice and reproductive success in captive Redside Dace (Beausoleil et al. 2012), but have yet to be applied to spatial or conservation genetic questions. A comprehensive

62 understanding of how Redside Dace declines throughout Ontario may affect local and regional levels of genetic diversity can be used to select candidates that would serve as source populations for re-introductions, to examine the effects of inbreeding depression, as well as fragmentation. Examining the global and local variation in genetic diversity of

Redside Dace has been identified as a research priority in the Ontario Redside Dace

Recovery Strategy (Redside Dace Recovery Team 2010).

In this study, I assessed the phylogeography and contemporary genetic structure and diversity of Clinostomus elongatus across its North American range, with an emphasis on Ontario populations. Sequence analysis of mtDNA was used to look at large- scale phylogeography and infer post-glacial dispersal patterns, while microsatellite data were used to characterize contemporary genetic structuring, patterns of gene flow, and genetic diversity levels within and among sampled populations. My major study objectives were to (i) identify phylogeographic lineages based on mtDNA sequencing in order to determine the number of glacial refugia from which contemporary populations originated, and (ii) characterize the genetic diversity and structure of Redside Dace populations in Ontario and across its range using microsatellite loci. Using mtDNA, a single refugium for all Redside Dace during the Pleistocene would be supported by shallow mtDNA lineages with low bootstrap support and little or no spatial structuring

(Avise 2000, Maggs et al. 2008). Alternatively, if Redside Dace dispersed after glaciation from more than one refugium, I would predict that multiple divergent genetic clusters with high bootstrap support and largely distinct geographic distributions would be detected. For microsatellite data, strong contemporary genetic structuring among Redside

Dace populations would be expected because of their specific habitat requirements

63

(Novinger and Coon 2000) and limited dispersal abilities between watersheds (Poos and

Jackson 2012).

Methods

Field Sampling

Redside Dace samples were obtained across the species’ range. Samples were collected intensively throughout Ontario and broadly across other parts of their range in order to test contrasting postglacial origin hypotheses as well as to assess how the spatial genetic structuring of the Ontario populations compare with that throughout the rest of their range. Samples were collected across Ontario using a bag seine or backpack electrofisher (Table 3.1; Figure 3.1). Eleven populations were sampled within Ontario during 2012 and 2013 and DNA samples were collected. The buccal swab technique was used to collect DNA samples when more than 30 fish were caught during the first sampling day (Reid et al. 2012). The caudal fin clip technique was used to collect DNA samples when less than 30 fish were caught during the first sampling day. Fin clipping ensured that individuals from populations that were sampled repeatedly (in order to obtain target sample sizes) were not resampled. DNA tissue samples from 21 populations across other parts of its range were provided by collaborators in the United States including the

Mississippi River, Great Lakes, Ohio River, Allegheny River, Monongahela River and

Susquehanna River basins in order to assess genetic diversity across the global range

(Table 3.1).

DNA extraction

64

DNA from two samples were used for primer optimization of microsatellite and mitochondrial loci, and were extracted using Qiagen DNeasy Blood and Tissue kits

(QIAGEN); all other samples were extracted using a simple lysis and isopropanol method

(Sambrook et al. 1989). DNA was extracted from fin clips using QIAGEN following the manufacturer’s instructions with some exceptions: the samples were lysed at 65 C in the incubator overnight; after adding AW2, each sample was spun at 13, 000 x g for 5 min; sample DNA was eluted in low TE (10 mM Tris pH 8, 0.1 mM EDTA).

DNA from fin clips and buccal swabs were extracted in 96 well plates using an isopropanol protocol with 250 µL of lysis buffer consisting of approximately 250 L of

1xTNES lysis buffer (50 mM Tris pH 8, 100 mM NaCl, 1 mM EDTA, 1 % SDS weight per volume) and 1mg proteinase K. Plates were sealed with a silicon mat and placed in a

37C incubator overnight to lyse samples.

After incubation, samples were centrifuged for one min at 1030 Relative

Centrifugal Force (RCF) in order to spin condensation down to the well bottom. The silicon mat was removed, and 10 L of 5M NaCl was placed in each well, followed by

500 L of 80% isopropanol. The plate was then resealed and centrifuged at 2360 RCF for

45 min. The supernatant was discarded, 1000 L of 70% ETOH was added to each well, the plate was resealed and quickly vortexed, and then centrifuged at 2360 RCF for 45 min. The supernatant was again discarded, and the deep well plate was dried in a 65C incubator for approximately 30 to 40 minutes. The DNA pellet at the bottom of the plate was eluted in 150 L of sterile 1x TE buffer (10 mM Tris pH8, 1 mM EDTA). The plate was resealed, vortexed, and incubated overnight at 4 C.

65

Gel Visualization

After overnight incubation, the resuspended DNA was transferred into a stock plate. To determine if DNA was successfully extracted, samples were visualized using a

1.5% agarose gel; 2 L of 2X SYBR Green was added to 2 L of DNA, centrifuged down to combine SYBR Green with DNA, and the 4 L mixture was added into each well. A DNA mass ladder (BioShop) was included in each row of the gel as a size standard (3 L of mass ladder plus 3L of SBYR Green). The gel was run at 95 V for approximately 1.5 hours, and a picture was taken using ultraviolet (UV) photoimaging.

Depending on band strength, samples were diluted with varying volumes of low TE

(10mM Tris pH 8, 0.1mM EDTA), to make 100 L of working DNA solutions of approximately 6 ng/L. mtDNA PCR amplification

The ATPase 6 and 8, and cytochrome b mitochondrial genes were used for assessing phylogeographic structure in Redside Dace. The primer sequences used to amplify ATPase 6 and 8 were ATP 8.2_L8331 (5’-AAAGCRTYRGCCTTTTAAGC-3’) and CO3.2_H9236 (5’- GTTAGTGGTCAKGGGCTTGGRTC-3’) (Sivasundar et al.

2001). Each 10 L PCR reaction consisted of 2 L template DNA [6 ng/L], 1 L BSA

[200 ng/L], 0.2 L dNTPs [10 mM], 2 L 10X Buffer (with 15 mM MgCL2), 0.4 L

MgCl2 [25 mM], 0.2 L of each primer [10 mM], 0.05 L Taq [5 units/L], and 3.95 L ddH2O. PCR cycling conditions for ATPase 6 and 8 were an initial hot start of 94 ºC for 3 min, followed by 30 cycles of denaturation at 94 ºC for 1 min, annealing at 56 ºC for 90 s and extension at 72 ºC for 90 s, with a final extension at 72 ºC for 1 min. For cytochrome

66 b, the primers used to amplify the fragment were LA-a (5’-GTGACTTGAAAAACCACC

GTT-3’) and HA-a (5’-CAACGATCTCCGGTTTACAAGAC-3’(Dowling and Naylor

1997). Each 10 L PCR reaction consisted of 2 L template DNA [6ng/L], 1 L BSA

[200 ng/L], 0.2 L dNTPs [10 mM], 2 L 10X Buffer (with 15 mM MgCL2), 0.4 L

MgCl2 [25mM], 0.2 L of each primer [10 mM], 0.05 L Taq [5 units/L], and 3.95 L ddH2O. The PCR cycling conditions for this gene region were an initial hot start of 94 ºC for 8 min, followed by 35 cycles of denaturation at 95 ºC for 30 s, annealing at 50 ºC for

30 s and extension at 72 ºC for 90 s, with a final extension at 72 ºC for 7 min. PCR products were visualized on a 1.5% agarose gel to determine amplification success.

Sequencing

Post-PCR clean-up was done using ExoSAP to remove excess primers and dNTPs. Before sequencing, 8 L of PCR product had the following mixture added to it:

0.9 L Antarctic Phosphatase Buffer, 5,000 units/mL Antarctic Phosphatase, and 20,000 units/mL Exonuclease (New England Biolabs). Reactions were placed in the thermocycler with the following conditions: 37 ºC for 15 min, 80 ºC for 15 min followed by a cool-down to 10 ºC.

Sequencing reactions were carried out in 12 µL reactions with each well containing the following quantities of reagents: 0.5 L of BigDye dye terminator mix 3.1

(Applied Biosystems), 1L of 5X buffer, 9 L of ddH2O product and 0.5 L of PCR product. With PCR reactions that generated faint bands, the reaction quantities were changed to 1.0 L of PCR product and 8.5 L of ddH2O. PCR products were sequenced in both directions using amplification primers as well as internal primers 8.3 (5’-

67

AAYCCTGARACTGACCATG-3’) for ATPase 6 and 8, and LDrs (5’-

CCATTTGTCATCGCCGGTGC-3’), HDrs (5’- GGGTTATTTGACCCTGTTTCGT-3’) for cytochrome b (Dowling and Naylor 1997, Houston et al. 2010). The PCR plate was then placed into the thermocycler under the following conditions: an initial hot start of 96

°C for 2 min, followed by 30 cycles of denaturation at 96 °C for 30 s, annealing at 55°C for 15 s, and an extension at 60 °C for 4 min. Following the cycle sequencing reaction, ethanol precipitation was done. 1.1L of 5M sodium acetate was heated to 37°C prior to being added to the reaction, followed by 37 L of 95% ethanol. The plate was vortexed and placed in the centrifuge at 2360 RCF for 45 min. The supernatant was discarded, 150

L of 70% ethanol was added to each well, and the plate was centrifuged at 2360 RCF for

45 min. Post centrifugation, the supernatant was discarded and the plate was dried in a 65

°C incubator. Once dried, the pellet was reconstituted in 10L of HiDi formamide

(Applied Biosystems), and sequenced product was visualized on an ABI 3730 sequencer.

Microsatellite PCR amplification

For microsatellite amplification, primers for ten published loci (Dimsoski et al.

2000, Bessert and Orti 2003, Pitcher et al. 2009) were optimized to be run in multiplex

(see Appendix 3.1 for list of primers). Each PCR reaction consisted of 2 L template

DNA [6 ng/L], 1 L BSA [200 ng/ L], 0.2 L dNTPs [10 mM], 2 L 10X Buffer (with

15 mM MgCL2), 0.05 L Taq [5 units/L], and remainder ddH2O depending on number of primers added to the reaction with a total volume of 10 L. Primer concentrations for multiplex reactions were as follows: Multiplex 1- RSD 86 [0.3 M], RSD 42A [0.3 M],

RSD 70 [0.3 M], RSD 2-91 [0.3 M]; Multiplex 2- RSD 142 [0.05], RSD 179 [0.05];

68

Multiplex 3- RSD 2-58 [0.3 M], CA 12 [0.3 M]; Multiplex 4- CA11 [0.3 M],

Ppro118 [0.3 M]. Multiplexes MP1 and MP2 had an initial hot start of 94 ºC for 10 min, followed by 30 cycles of denaturation at 92 ºC for 1 min, annealing at 54 ºC for 1 min and extension at 72 ºC for 1 min 30 s, with a final extension at 72 ºC for 7 min. Multiplexes

MP3 and MP4 had an initial hot start of 94 ºC for 10 min, followed by 30 cycles of denaturation at 92ºC for 1 min, annealing at 57ºC for 1 min and extension at 72 ºC for 90 s, with a final extension at 72 ºC for 7 min. Post PCR, 1 mL of HiDi was mixed with 4 L of 350 ROX size standard (Applied Biosystems). 10 L of the HiDi formamide (Applied

Biosystems) and 350 ROX mixtures were pipetted into each well of a 96-well plate and 1

L of PCR product was added. The plate was spun down and used for genotyping using an ABI 3730 sequencer.

Data Analysis

Mitochondrial Data

Sequencher v.4.05 (GeneCodes) was used to trim primers, assemble and manually edit bidirectional sequences from raw electropherogram “trace” files. Sequence data were aligned using MEGA v.6.06 (Tamura et al. 2007). Basic haplotype information was obtained using DnaSP v.5.10.01, including the number of unique haplotypes, variable sites, and haplotype and nucleotide diversity for each gene region (Librado and Rozas

2009). Haplotype richness was calculated for ATPase 6 and 8, and cytochrome b, in

HAPLOTYPE ANALYSIS v.1.05 (Eliades and Eliades 2009) using a sample size of ten for cytochrome b, and nine for ATPase 6 and 8, to account for differences in sample size across populations. Mutational differences among haplotypes were assessed in PopART

69 v.1.7.2 (http://popart.otago.ac.nz) using minimum spanning networks (Bandelt et al.

1999). Dendrograms were created in MEGA v.6.06 (Tamura et al. 2007) for neighbour- joining (p-distance) (Saitou and Nei 1987), and maximum parsimony with a 500 bootstrap support (Felsenstein 1985). Mutational networks and genetic distance dendrograms were created for ATPase 6 and 8, and cytochrome b, separately, as well as for the combination of the two gene regions (referred to as “total evidence” hereafter). In order to test various refugial hypotheses, the populations were grouped into a priori subsets to examine variation present at various hierarchical levels. An analysis of molecular variance (AMOVA) was implemented in Arlequin 3.5 (Excoffier and Lischer

2010) to determine how the variation was partitioned.

Additionally, to determine time of divergence for Redside Dace cytochrome b lineages, the C. elongatus - C. funduloides divergence of 2.6 million years and a variation of 1.7% per million years (Houston et al. 2010) was applied to cytochrome b haplogroups detected in this study.

Microsatellite Data

Genotype data were manually scored using GeneMapper v.4.1 (Applied

Biosystems). Microchecker v.2.2.3 (Van Oosterhout et al. 2004) was used to test for null alleles and genotyping errors in the dataset. Genetic polymorphism levels were calculated for number of alleles (Na), observed heterozygosity (Ho), expected heterozygosity (HE), inbreeding coefficient (FIS), and Hardy Weinberg equilibrium using GenAlEx v.6.5 (Peakall and Smouse 2006). To account for sample size differences, HP-Rare v.1.1

(Kalinowski 2005) was used to calculate allelic richness standardized to a sample size of

70

10 individuals or 20 alleles. Genepop v4.2 on the web (http://genepop.curtin.edu.au/) was used to determine if linkage disequilibrium occurred among genotyped loci (Raymond and Rousset 1995). Pairwise population FST values were calculated with Arlequin v.3.5

(Excoffier and Lischer 2010), and statistical significance was based on 10,000 dememorization steps.

Individual-based analysis was first run to determine if population structuring exists, followed by population-based analysis to examine genetic differentiation between groups. STRUCTURE, a Bayesian-based clustering program, was used to assign individuals to clusters (K). The program was run with a 50,000 step burn-in period, followed by 50,000 Markov Chain Monte Carlo (MCMC) resampling steps, with a total of 8 iterations per value of K. Allele frequencies were not correlated, and the no admixture model was selected because the populations under study are distinct (Pritchard et al. 2000). The K value was set from 1 (panmixia) to 29 (indicating discrete population structuring for all sampled groups). Additionally, STRUCTURE was re-run based on the populations that clustered together when K=3 (one western and two eastern refugial groups). The objective of these runs was to characterize population structure at regional and local scales. Cluster 1 was run from K 1 to 10, Cluster 2 was run from K 1 to 10, and

Cluster 3 was run from K 1 to 20. Structure Harvester v 0.6.94 (Earl and vonHoldt 2011) was used to visualize the optimal number of clusters using the data generated from

STRUCTURE with (i) maximum likelihood for K and (ii) highest second order rate of change using ∆K (Evanno et al. 2005). CLUMPP v.1.1.2 (Jakobsson and Rosenberg

2007) was used to collate independent run results from STRUCTURE, which were then visualized using program DISTRUCT v.1.1 (Rosenberg 2007). Principal Coordinate

71

Analysis (PCoA) was run using GenAlEx (Peakall and Smouse 2006). The purpose of this test was to use an ordination method to visualizeat genetic distances among sites sampled. POPTREE2 (Takezaki et al. 2010) was used to generate genetic distance dendrograms using neighbour joining of DA genetic distance measure (Nei et al. 1983).

The DA genetic distance measure has been proven to construct precise and accurate dendrograms under many evolutionary models (Takezaki et al. 2010). Lastly, a hierarchical FST analysis was completed in Arlequin v.3.5 to account for the subpopulation substructure variation by running an AMOVA. This was run to examine groupings between (i) east versus western groups, (iii) clustering according to K=3 for

STRUCTURE output and PCoA analysis, and (iii) current drainage distributions (Great

Lakes, Upper Mississippi, Ohio River, and Susquehanna). A Mantel test was run in

GenAlEx to test for a significant relationship between genetic and geographic distance between all population pairs, using pairwise FST values and geographic distances based on map coordinates. This relationship was then graphed, with the geographic distance on the x-axis, ln (geographic +1) transformed, and the genetic distance on the y-axis, Fst/(1-

Fst), with a total of 999 permutations (Rousset 1997).

Results

Mitochondrial polymorphism

A total of 312 individuals were successfully sequenced for ATPase 6, ATPase 8, and cytochrome b. Twenty-three unique haplotypes were identified from ATPase 6 and 8 sequences obtained from 338 Redside Dace individuals (Table 3.2). No gaps were present in the data. Of the 806 nucleotide positions in this gene region, 29 were variable

72

(polymorphic), with 25 parsimony-informative sites, and 4 singleton sites. Haplotype diversity within populations ranged from 0 to 0.667, and nucleotide diversity ranged from

0 to 0.00127 (Table 3.3). Allegheny River drainage populations WOO and BHR had the highest haplotype diversities, but also had larger sample sizes (Table 3.3). These results were consistent when haplotype richness was calculated, standardized for a sample size of ten. For cytochrome b, 35 unique haplotypes were identified from 327 Redside Dace individuals. Three sequences were removed from the analysis due to ambiguity in base calling. Of the 1101 nucleotide positions in this region, no gaps were present, and 41 sites were variable, of which 33 were parsimony informative, while 8 were singleton sites

(Table 3.4). The overall haplotype diversity for all sequenced individuals was 0.897, while nucleotide diversity was 0.003. Haplotype diversity within populations ranged from

0 to 0.838, while nucleotide diversity ranged from 0 to 0.005 (Table 3.3). Twelve populations were characterized by a single haplotype. The highest haplotype and nucleotide diversities were associated with three Allegheny River populations (DOD,

BHR, and EBM) (Table 3.3). These results were consistent when haplotype richness was calculated, standardized to a sample size of ten.

For the total evidence analysis, ATPase 6 and 8 and cytochrome b sequences from

312 individuals were combined to create a 1908 base pair fragment. This resulted in 47 unique haplotypes with 37 parsimony informative sites and 10 singletons (Table 3.5).

Mutational Network and Distance Based Analysis

For ATPase, a mutation network analysis identified the presence of two haplogroups, A and B, which were separated by a minimum of six mutational steps

73

(Figure 3.2). The TTR population was represented by haplotype 9, and were separated by a minimum of five mutational steps between the two haplogroups. Haplogroup A was represented by 18 unique haplotypes with a maximum of five mutations between any two haplotypes, while B was represented by 4 unique haplotypes, with a maximum of three mutations between any two haplotypes. The two groups were supported by high genetic structuring between the eastern and western groups with high bootstrap support (77%) for haplogroup A from haplogroup B (Figure 3.3). Phylogeographical structuring was observed, with haplogroup A representing an eastern distribution, and haplogroup B representing a western distribution (Figure 3.2). Haplotypes 1, 3, 9, and 15 were the most common haplotypes found in multiple drainages. A group was defined as having a minimum of six mutational steps away from each other, as well as having a high bootstrap support (>70) within the neighbour joining dendrogram.

For cytochrome b, a mutation network analysis revealed the presence of two haplogroups (labelled C and D) which were separated by a minimum of seven mutational steps (Figure 3.4). Haplogroup C was represented by 26 unique haplotypes with a maximum of six mutations away from any two haplotypes, while haplogroup D was represented by nine unique haplotypes with a maximum of three haplotypes between any two mutations. The two groups are supported by high bootstrap support (99%) (Figure

3.5). Haplogroup D was restricted to the Allegheny and Ohio River drainages, whereas haplogroup C was present within all sampled drainages. Haplotypes 2, 9, and 19 were the most common haplotypes observed. Based on the molecular clock estimated by Houston et al. (2010) of 1.7% per million years for cytochrome b, the genetic distance of 0.9%

74 observed between cytochrome b haplogroups C and D would suggest their divergence took place approximately 0.5 million years ago.

For the total evidence data (ATPase 6 and cytochrome b combined), a mutation network analysis revealed the presence of three composite haplogroups (labelled 1, 2, and

3) which were separated by a minimum of five mutational differences between each group (Figure 3.6; Appendix 3.4). Haplogroup 1 was the most common and was represented by 27 unique haplotypes, haplogroup 2 was represented by 8 unique haplotypes, and haplogroup 3 was represented by 11 unique haplotypes. The three groups were supported by high bootstraps (>70%), and haplogroup 7 did not fall within any of the lineages and therefore grouped on its own (Figure 3.6, 3.7). A map of the three haplogroups and their distributions indicated that haplogroup 2 was restricted to the western portion of the species range (upper Mississippi River), while haplogroups 1 and 3 were only observed in its eastern range (Figure 3.8). Haplogroup 1 was found exclusively within the Ohio River, Susquehanna River, and Great Lakes drainages, and haplogroup 3 was found exclusively within the Ohio River watershed. The distributions of the two eastern haplogroups overlapped within the Allegheny River watershed a headwater system of the Ohio River drainage (Figure 3.8).

Microsatellite polymorphism

All ten microsatellite loci were polymorphic across all populations. Within individual populations, the number of polymorphic loci ranged from 60% to 100%, with a mean of 88.6% and a standard error of 1.9% (Appendix 3.2). The number of alleles at each locus ranged from 11 to 27 (mean =17.5). The mean number of alleles per locus

75 within a population was 3.8 when standardized (at 20 alleles) and 4.8 when uncorrected.

Across all populations, linkage was found between loci RSD2-91 and RSD42A, and Ca11 and RSD142 (p<0.05). With few exceptions, pairwise comparisons within populations were not significant, and therefore all loci were used for analysis. Most loci by population tests (93%) were in Hardy-Weinberg equilibrium (Appendix 3.3). When null alleles were analyzed on a population-by-population basis using MICROCHECKER, there was no evidence for null alleles in at least 26 of 28 populations analyzed, and all loci were used in subsequent analysis.

Allelic richness and observed heterozygosity were greatest in eastern mid-latitude

Redside Dace populations, specifically those in the Allegheny River (WOO, EBM, BHR, and DOD) (Table 3.6). The lowest values were found in tributary populations in Ontario (TTR, SAU, and GUL). High levels of Redside Dace genetic diversity were also measured from some of the Ohio River drainage populations (EFI and BRU) at the more southern part of the species’ range. Populations on the north shore of Lake Ontario tributaries had low to moderate genetic diversity levels, while the populations found in the western portion of its range were characterized by low levels of diversity (Table 3.6).

Population Structure Analysis

Hierarchical STRUCTURE analysis confirmed population differentiation at range-wide, regional, and local scales. The number of clusters (K) identified within

STRUCTURE was 17 using the likelihood [Pr(X/K)] values (Figure 3.9), while the number of clusters using the Evanno et al. (2005) ΔK method was 2 (Figure 3.9).

STRUCTURE was used at a range-wide level to confirm the unique lineages identified

76 via mtDNA analysis (Figure 3.10). A K=1 value, which would result if Redside Dace were panmictic, was not supported within STRUCTURE because it had the lowest likelihood [Pr(X/K)] value (Figure 3.9). At K=2, the clustering analysis partitioned the populations into “eastern” versus “western” groups. Although the Evanno et al. (2005) method suggested K=2 to be the best method based on second order rate of change in ∆K, population assignments based on K=3 groups showed better concordance with the mtDNA neighbour-joining dendrogram and likelihood [Pr(X/K)] value. K=3 was best supported at the range-wide level, and with the exception of Lake Ontario tributaries, the populations that fell into the major mtDNA groups matched that of STRUCTURE results.

STRUCTURE partitioned the populations into a western group (the Northwest

Mississippi River, the Great Lakes (Lake Superior), and two eastern groups (Figure 3.10).

The first eastern group was observed in the Ohio River, tributaries of Lake Huron and

Lake Erie, and the lower Mississippi River; the second eastern group was observed in

Lake Ontario tributaries, the Allegheny River, and the Susquehanna River.

Further STRUCTURE analysis was separately undertaken within each of the three groups identified at K=3. Each subgroup showed evidence of genetic substructure: the substructure runs indicated that the first eastern (blue) group had a total of eight groups, while the western (red) group had a total of five groups, and the second eastern (green) group had a total of sixteen (Figure 3.10, 3.11). These values were obtained based on the ln(P(K)) method because they provided the most accurate optimal solutions for K. Within each of the three groups, strongpopulation structuring was observed across all populations. An exception to this, was three of the Allegheny River populations within

77 the second eastern lineage (DOD, WOO, BHR), as well as the Monongahela population

(STR), which consistently grouped together across different solutions for K (Figure 3.10).

PCoA identified the same groups as STRUCTURE at K=3 (Figure 3.10, 3.12).

Group B and group C (eastern populations) grouped closer together, while group A

(western) appeared to be more genetically distinct (Figure 3.12). Axes one and two accounted for 33.2% (eigenvalue = 10.1) and 14.2% (eigenvalue = 4.3) of the variation in the microsatellite data, for a cumulative value of 47.4%. The neighbour joining dendrogram of pairwise genetic distances generated using Nei et al.’s (1983) DA indicated substantial genetic divergence among populations (Figure 3.13). Groups A and C identified by PCoA analysis were supported by high bootstrap values in the neighbour- joining dendrogram, however, group B showed two discordances (i) one of the populations within the neighbour joining dendrogram in group B (MIL) was more genetically similar to the western group, and (ii) the Ohio River populations within group

B were more genetically similar to the group C than to the Great Lakes populations within Group B.

Pairwise genetic differences (FST tests) between populations were all significant (p

< 0.05), and pairwise FST values ranged from 0.08, to 0.62 (Table 3.7). A significant isolation by distance was revealed when geographical distance was plotted against pairwise genetic distance (R2 = 0.28, p < 0.0001) (Figure 3.14). The second IBD plot of a subset of populations of geographic distance (Ln (1+Distance)), showed that the genetic differences were higher between geographically proximatepopulations, whereas lower genetic distances were found between more distant Allegheny River populations (Figure

3.15).

78

Hierarchical Analysis of Molecular Variance (AMOVA)

AMOVA identified significant genetic variation at all hierarchical levels, for the hypotheses tested (Table 3.8). For mtDNA, the greatest amount of variation was observed among the three groups identified via total evidence analysis (71.1%), while 21.9% of the variation was attributed to populations within groups, and 7.0% of the variation was attributed to within populations. The second highest amount of variation was observed between the Atlantic and Mississippian populations (53.8%), with 32.1% of the variation attributed to populations within the drainage groups, and 14.1% of the variation attributed to within populations. The least amount of variation was observed among the three groups identified by STRUCTURE and PCoA (47.1%), with 35.2% of the variation attributed to populations within groups, and 17.7% of the variation attributed to within populations

(Table 3.8). The AMOVA results from the FST analysis indicated that most of the variation occurred within populations of all three hypotheses tested (54.6% for two groups, 59.2% for three groups, 61.4% for four groups), rather than among populations within groups (24.6% for two groups, 21.1% for three groups, 28.5% for four groups), or among groups (20.8% for two groups, 19.7% for three groups, 10.1% for four groups)

(Table 3.8).

Discussion

The genetic data from this study indicated that Redside Dace were isolated within multiple glacial refugia during the last ice age, and post-glacial recolonization routes can be inferred based on phylogeographic patterns. Both mitochondrial and microsatellite data exhibited evidence of three distinct phylogeographic lineages with restricted

79 geographic distributions. The restricted geographic range and high genetic variation between the three mtDNA haplogroups, suggests their historical persistence in separate refugia. Co-occurrence of mtDNA lineages occurred within the Allegheny River drainage, suggesting historical secondary contact during postglacial colonization. Using microsatellite data, high genetic structuring was observed, suggesting little to no gene flow and reciprocal isolation among sampled populations. Moderate genetic diversity levels were found in urbanized areas of Ontario, whereas high diversity levels were found within the Allegheny River drainage likely as a result of fewer landscape-level disturbances, and the southern part of the Redside Dace range likely as a result of those areas being in unglaciated areas. The data generated from this study can be used as a baseline for future Redside Dace genetic monitoring.

Evidence for multiple glacial lineages

Multiple glacial refugia were supported by the highest among-group variation in the AMOVA being explained by the mtDNA total evidence lineages. My findings do not support the single Mississippian hypothesis (Hocutt and Wiley 1986); there are three lineages with high bootstrap support and spatial structuring, which is contrary to shallow mtDNA lineages and limited spatial structuring expected by one refugium. My data support a multiple refugia hypothesis, as supported by both mtDNA and microsatellite evidence, all reflective of a Mississippian origin. However, instead of one Atlantic and one Mississippian lineage as hypothesized, three Mississippian groups were identified.

An Atlantic origin, as suggested by Mandrak and Crossman (1992), may be more difficult to achieve because the Appalachian mountains could have created a barrier to dispersal, as seen with the Slider turtle (Trachemys scripta), Bowfin (Amia calva), and Largemouth

80

Bass (Micropterus salmoides) (Soltis et al. 2006). If an Atlantic lineage was present, the eastern Lake Ontario populations should have separated into its own haplogroup, instead of grouping along with the eastern Mississippian lineages, as seen with Banded Killifish

(Fundulus diaphanous) (April and Turgeon 2006) and Lake Trout (Salvelinus namaycush) (Wilson and Hebert 1998). Redside Dace diverged from their congener (C. funduloides) approximately 2.6 million years BP during the Pliocene (Houston et al.

2010), and so the splitting of the three lineages would have had to take place after the speciation event.

The narrow western distribution of haplogroup two was characterised by high genetic distances from haplogroup one and three, suggesting its long-term isolation

Borden and Krebs (2009) identified a similar pattern for Smallmouth Bass (Micropterus dolomieu), which was also characterized by low genetic diversity levels west of the

Mississippi River. The “Driftless Area,” located in southwestern Wisconsin was not covered by glaciers and likely served as a refugium to Redside Dace and other freshwater fishes such as Brook Trout (Salvelinus fontinalis) (Berendzen and Dugan 2008, Hoxmeier et al. 2015). To reach their current distributions within the western range, Redside Dace likely used the Brule-Portage outlet of Lake Duluth (precursor to Lake Superior), which formed approximately 11, 500 years BP (Mandrak and Crossman 1992). Redside Dace may have reached as far as ancestral Lake Huron or Lake Algonquin either using Lake

Duluth or the Michigan Upper Peninsula. Similar eastward dispersal from a western source has been inferred for the Common Gartersnake (Thamnophis sirtalis) (Placyk et al.

2007), which would have used terrestrial corridors. Of the three lineages, haplogroup one was characterized by the highest number of haplotypes and also had the widest

81 geographic range, with tributaries draining into the Ohio River, , Lake Ontario,

Allegheny River and the Susquehanna River. Co-occurrence of haplogroup one and three was found within the Allegheny River basin, suggesting their secondary contact following deglaciation. Within haplogroup one, the highest genetic diversity levels were found within the tributaries of the Kentucky and Licking rivers (tributaries of the Ohio River), and were consistent with high diversity levels in southern, unglaciated ranges (Hewitt

1996, Bernatchez and Wilson 1998). These populations are located within the Eastern

Highland Region, which was unaffected by the ice sheets and contained high amounts of endemism and species richness (Mayden 1988). The Kentucky River served as an important refugial source for Smallmouth Bass (Borden and Krebs 2009), the Painted- hand Mudbug crayfish (Cambarus polychromatus) (Simon and Burskey 2014) and the

Orangethroat Darter (Etheostoma spectabile) (Bossu et al. 2013). In general, the southern tributaries of the Teays River system were a refugial source for freshwater fishes during the last ice age (Hocutt et al. 1986).

Using cytochrome b haplotype divergence data, and the temporal divergence estimate for C.elongatus and C. funduloides published by Houston et al. (2010), haplogroup one and three may have separated from each other ~0.5 million years BP during the Kansan glaciation (Hocutt et al. 1986), and were likely connected via the

Teays River System. During the early Pleistocene, the Teays originally started at the headwaters of West Virginia and went northwest through Ohio, Indiana and Illinois, where it then connected to the ancestral Mississippi River (Ver Steeg 1946). Connections within this drainage system are thought to have been altered starting in the Nebraskan around 1 million years BP, resulting in the current Ohio River System (Hocutt et al.

82

1986). Haplogroup 3 was found within the upper Ohio River system. Based on the genetic haplotype distribution and knowledge of past events, I hypothesize that Redside Dace were present in unglaciated West Virginia, and gained access to the Allegheny and

Susquehanna River drainage via proglacial Lake Monongahela (Hocutt and Wiley 1986,

Mandrak and Crossman 1992).

The data presented here suggest the presence of three separate Mississippian refugia. For haplogroups one and two, the distinct geographical distributions and the high number of mutations are reflective of two separate refugia. The two haplogroups also appear to have not come into contact based on the haplogroups corresponding to non- overlapping geographic localities. Similarly, haplogroup one and three likely persisted in separate refugia: their divergence and distinct geographic distributions reflect their isolation and subsequent dispersal from separate refugia (Maggs et al. 2008). The two glacial groups appear to have come into contact with each other after the ice melted; this is supported by co-occurrence of haplogroups one and three within the Allegheny, as well as the high mtDNA and microsatellite diversity level in this area, which is found when two groups come into contact with each other (Maggs et al. 2008).

A point worth mentioning is the discordance between the ATPase 6 and 8, as well as the cytochrome b data set. ATPase 6 and 8 and cytochrome b, each identify two groups that are different from each other. This could be the result of selective sweeps, mutation, or background selection (Ballard and Whitlock 2004). While some authors have argued for analysing data from each gene region separately as a result of heterogeneity in datasets (Miyamoto and Fitch 1995), other studies have taken a total evidence approach, because it is thought that multiple genes evolving at different rates could interact in a

83 positive way to resolve phylogenetic relationships (Kluge 1989, Mousson et al. 2005).

Given the high congruence between the microsatellite and the total evidence mtDNA dataset, the latter approach seemed to be the most appropriate.

The combined use of both mtDNA and microsatellite markers provide invaluable contributions to understanding the interplay between historical and contemporary genetic structuring. Co-occurrence of the two eastern mtDNA lineages occurred within the

Allegheny River drainage, and microsatellite data suggest that historical secondary contact extended up to the western tributaries of Lake Ontario. The predominance of haplogroup one over haplogroup three in secondary contact zones within western Lake

Ontario tributaries could have been the result of natural selection, genetic drift, or smaller number of founding individuals for haplotype three coming into contact with an already established haplogroup one (Bernatchez and Wilson 1998, Avise 2000, Galtier et al.

2009, Jezkova et al. 2013). Mitochondrial DNA is haploid and therefore has a lower effective population size, and reaches fixation at a fast rate (Avise 2000, Avise 2001), and would therefore under-estimate the extent of secondary contact. Microsatellite markers are diploid, have a high mutation rate, and in conjunction with the mtDNA, they allow for a comprehensive understanding of the extent of secondary contact based on the discrepancy between the two lines of evidence. The use of microsatellite data on their own would make it difficult to infer secondary contact, while the use of mtDNA on its own would underestimate the degree of secondary contact.

I hypothesized that variation in mtDNA data was likely the result of multiple interstadials, which would have allowed for long-distance dispersal of haplogroup one via glacial meltwaters (Lewis et al. 1995, Dyke 2004). Lake Ontario was likely an important

84 area of secondary contact for Redside Dace, along with other north-eastern freshwater fishes during deglaciation (Mandrak and Crossman 1992, Stepien and Faber 1998, April et al. 2013). Additionally, Lake Erie was a critical post colonization route for species dispersal during the Pleistocene because it was the first area to be deglaciated (Lewis et al. 2008). Lake Ontario was covered by ice sheets during the last ice age and it wasn’t until 13, 000 years ago that the glaciers started to melt, resulting in the formation of Lake

Iroquois, which received waters from Lake Whittlesey (ancestral Lake Erie) (Hocutt and

Wiley 1986). The end of the Erie ice lobe was marked by a large flooding through New

York to Hudson Valley, and subsequently to the Atlantic Ocean; this could have contributed to Redside Dace refugial mixing (Schmidt 1986, Mandrak and Crossman

1992). Another possible scenario is that the two refugial groups mixed prior to colonization, and then later dispersed. If haplogroup one reached fixation before dispersal, this would create similar results as seen under the secondary contact scenario.

Contemporary structuring

As well as reflecting intraspecific phylogeographic lineages, microsatellite data show evidence of high genetic structuring within Redside Dace populations, indicating a general lack of gene flow among populations. While microsatellite markers are usually used for looking at contemporary genetic structuring (Freeland et al. 2011), the PCoA and

STRUCTURE results reflect phylogeographic ancestry, suggesting that historical factors have had an important influence on present day genetic patterns. Despite the evidence of historical signatures, however, high genetic structuring was found among most Redside

Dace populations, suggesting that local populations became reciprocally isolated after colonization. The lack of connectivity between populations as a result of limited dispersal

85 abilities (Poos & Jackson 2012), can cause genetic drift and therefore result in genetic differentiation between populations. Redside Dace are also habitat specialists because they prefer slow-moving waters, overhanging vegetation, occupy mid-position levels in pools, and require cool and clear waters in order to persist, resulting in smaller home ranges and patchy distributions (Novinger & Coon 2000, Poos & Jackson 2012). The findings presented here are consistent with other Redside Dace genetic studies, which documented high genetic structuring among Ohio River drainage populations (Sweeten

2012), as well as populations in Mississippi River tributaries (Berendzen et al. 2008).

This study greatly expanded on these previous studies by providing evidence for similar genetic structuring throughout the species range. Similar results are observed in other small-bodied fishes in Ontario including Eastern Sand Darter (Ammocrypta pellucida)

(Ginson et al. 2015), Greenside Darter (Etheostoma blennioides) (Beneteau et al. 2009), and Pugnose Shiner (Notropis anogenus) (McCusker et al. 2014).

Effects of urbanization and agriculture on genetic diversity levels

Increasing urban development in adjacent landscapes is thought to have resulted in the decline and isolation of Redside Dace populations (MNR 2011). Despite over 80% of Ontario populations residing within areas of high urbanization (COSEWIC 2007), high genetic diversity levels were found throughout much of the Redside Dace range. This could be the result of samples being collected from headwater populations that are still abundant and on the edge of the advancing front of urbanization. Similar patterns have been reported in freshwater mussels (Galbraith et al. 2015) and Lake Sturgeon (Wozney et al. 2011), where genetic diversity levels did not reflect population status. The lowest genetic diversity levels were observed in the Saugeen River and Gully Creek (Lake

86

Huron drainage), which are both characterized by high agricultural activities. Redside

Dace were historically widespread in the Saugeen River watershed, but are now limited to a small number of headwater catchments. Within Gully Creek, Redside Dace numbers are thought to be below those needed for long-term population viability (Poos et al. 2012).

Poos et al. (2012) found that Redside Dace were located throughout Gully Creek, but only at low densities, and suggested that their low numbers could be the result of the chronic, as opposed to episodic, nature of chemicals being released into the water. The reduced genetic diversity in these systems might therefore reflect reduced numbers of Redside

Dace and increased genetic drift, or might also reflect genetic loss due to local inbreeding

(Redside Dace Recovery Team 2010). It is worth noting that the low sample sizes for

Saugeen River sites might have artificially reduced diversity estimates. Poos et al. (2012) also flagged Gully Creek as a population of special concern, with current demographics falling below the population size required to ensure long-term viability.

Allegheny River populations

STRUCTURE analysis identified that three populations within the larger

Allegheny River watershed (DOD, BHR, and WOO) were genetically similar. The STR population may also belong to this group, as it is most genetically similar to BHR despite being from the Monogahela watershed. While EBM is part of the Allegheny River watershed, it was genetically different than other populations. Two potential explanations for the genetic similarity of the three populations could be the result of natural dispersal and connectivity among the Allegheny River populations, or human transfer among sites via bait bucket introductions; Redside Dace is most abundant within Pennsylvania, in comparison to the rest of its range. As a result, it is permitted for use as bait within this

87 jurisdiction. For natural dispersal, if high connectivity was present between the three

Allegheny River populations, genetically similar populations could have resulted due to high gene flow. There is no evidence to refute the natural dispersal scenario; while the populations surveyed are geographically distant and Redside Dace are non-migratory fish, they are also very abundant in the Allegheny watershed, and no published literature exists on genetic structuring and/or dispersal. The focus of past Redside Dace studies have been on populations that were in areas of urbanization and agricultural activities

(Sweeten 2012, Poos & Jackson 2012, Poos et al. 2012) , and this is the first study to look at this species in an area of high abundance. Bait bucket introductions into streams is another possible scenario. If bait bucket transfers are occurring, high genetic similarities would result due to recent shared ancestry, irrespective of geographic distance, because gene flow is human-mediated, and can therefore span large distances. The lower FST values between the Allegheney River populations over greater distances could be explained by such transfers. Literature suggests that anglers still dispose of live baitfish into the waters even though there are strict regulations that permit this, which could have resulted in the release of Redside Dace as baitfish into non-native waters (Litvak &

Mandrak 1993).

The available data do not refute/disprove predictions from evidence for either hypothesis. To resolve the probable cause underlying the observed genetic patterns in the

Allegheny watershed, fine-scale genetic sampling within the three Allegheny streams would be required to document potential metapopulation structure and test or refute both hypotheses. If a natural dispersal scenario were valid, I would expect to see genetic similarity between all sampled populations, and a plot of isolation by distance would

88 reveal low divergence over low to moderate geographic distances, and divergences increasing with geographic distance. If a bait bucket scenario were valid, I would not expect to see the same patterns of increased genetic distance over increased geographic distance; the introductions would be human-mediated and fish are able to travel large distances.

Management Implications

Defining conservation units below the species level has yet to be done in a large- scale context for Redside Dace, and the genetic data generated from this study can be used to inform recovery efforts. The three monophyletic lineages identified using microsatellite and mtDNA analyses, and their genetic distinctiveness from each other, indicate their long-term separation. The reciprocal monophyly observed within the three groups, meets one of the criteria for defining Evolutionary Significant Units (Moritz

1994) and Management Units (Moritz 1994) within the United States, and Designatable

Units (COSEWIC 2012) within Canada. For identifying DUs, however, both evolutionary and/or ecological significance are also necessary for recognizing distinct groups

(COSEWIC 2014). To date, as only limited ecological data exist for comparisons, it may therefore be worth considering whether or not the genetic groups identified should be recognized as separate DUs, or a single DU.

Regardless of whether populations in Canada are classified as a single DU or multiple DUs, recovery efforts focused on multiple distinct genetic conservation units will be important for recovery efforts. Successfully conserving a species, involves preserving multiple populations from a variety of ecological settings that are able to

89 support themselves (Redford et al. 2011). By identifying multiple populations to protect within these three major groups, this would ensure adequate conservation of all lineages throughout the Redside Dace range. Additionally, the jurisdiction of Ontario is unique in that it contains all three lineages identified by both mtDNA and microsatellite analysis, so all groups and surrounding habitat receive protection under the Ontario Endangered

Species Act.

The genetic data will be invaluable for informing translocation and population augmentation efforts, which have been limited for Redside Dace recovery. The mtDNA data (total evidence network and dendrogram) will be crucial for delineating major lineages, while the microsatellite data (genetic diversity levels) will be important for translocation efforts to avoid both outbreeding and inbreeding depression, and to avoid introducing novel genotypes to the population (George et al. 2009). To date, only one augmentation initiative has been attempted, with approximately 500 individuals translocated from Asher Creek to Mill Creek in Indiana (Sweeten 2012). Preliminary results suggest limited spawning. A modelling study undertaken by Poos et al. (2012) suggests that the minimum number of Redside Dace individuals required for population viability is between 2900 and 4300 breeding individuals. Novinger and Coon (2000) wanted to determine the feasibility of moving New York individuals to the endangered

Michigan population, and determined this would not be appropriate based on physiological and behavioural experiments. Genetic data presented here also indicates that this would be inappropriate on genetic grounds, because the two populations fall within different evolutionary lineages. For future recovery efforts, genetic data will allow for higher confidence in identifying appropriate “source” and “recipient” candidates for

90 translocation experiments, before investing resources into physiological and behavioural experiments.

References

April J, Hanner RH, Dion‐Côté AM, Bernatchez L (2013) Glacial cycles as an allopatric speciation pump in north‐eastern American freshwater fishes. Molecular Ecology, 22, 409-422.

April J, Turgeon J (2006) Phylogeography of the banded killifish (Fundulus diaphanus): glacial races and secondary contact. Journal of Fish Biology, 69, 212-228.

Avise JC (2000) Phylogeography. Harvard University Press, Cambridge, MA.

Avise JC (2001) Cytonuclear genetic signatures of hybridization phenomena: rationale, utility, and empirical examples from fishes and other aquatic animals. Reviews in Fish Biology and Fisheries, 3, 253-263.

Bandelt H, Forster P, Röhl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16, 37–48.

Beausoleil JMJ, Doucet SM, Heath DD, Pitcher TE (2012) Spawning coloration, female choice and sperm competition in the redside dace, Clinostomus elongatus. Animal Behaviour, 83, 969-977.

Beneteau CL, Mandrak NE, Heath DD (2009) The effects of river barriers and range expansion of the population genetic structure and stability in Greenside Darter (Etheostoma blennioides) populations. Conservation Genetics, 10, 477-487.

Berendzen PB, Dugan JF (2008) Establishing conservation units and population genetic parameters of fishes of greatest conservation need distributed in Southeast Minnesota. State Wildlife Grants Program. Division of Ecological Resources, Minnesota Department of Natural Resources. Available from: http://files.dnr.state.mn.us/eco/nongame/projects/consgrant_reports/2008/2008_berendzen _etal.pdf.

Bernatchez L, Wilson CC (1998) Comparative phylogeography of Nearctic and Palearctic fishes. Molecular Ecology, 7, 431–452.

Bessert ML, Orti G (2003) Microsatellite loci for paternity analysis in the fathead minnow, Pimephales promelas (Teleostei: Cyprinidae). Molecular Ecology Notes, 3, 532–534.

91

Borden WC, Krebs RA (2009) Phylogeography and postglacial dispersal of smallmouth bass (Micropterus dolomieu) into the Great Lakes. Canadian Journal of Fisheries and Aquatic Sciences, 66, 2142-2156.

Bossu CM, Beaulieu JM, Ceas PA, Near TJ (2013) Explicit tests of palaeodrainage connections of southeastern North America and the historical biogeography of Orangethroat Darters (Percidae: Etheostoma: Ceasia). Molecular Ecology, 22, 5397- 5417.

Ceballos G, Ehrlich PR, Barnosky AD, García A et al. (2015) Accelerated modern human–induced species losses: entering the sixth mass extinction. Science Advances, 1, e1400253.

COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa, ON.

COSEWIC (2012) Guidelines for Recognizing Designatable Units. Committee on the Status of Endangered Wildlife in Canada. http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.

Dimsoski P, Toth GP, Bagley MJ (2000) Microsatellite characterization in central stoneroller Campostoma anomalum (Pisces: Cyprinidae). Molecular Ecology, 9, 2155– 2234.

Dowling TE, Naylor GJP (1997) Evolutionary relationships of in the genus Luxilus (Teleostei : Cyprinidae) as determined from cytochrome b sequences. Copeia, 1997, 758–765.

Dyke AS (2004) An outline of North American deglaciation with emphasis on central and northern Canada. Quaternary Glaciations: Extent and Chronology, 2, 373-424.

Earl DA, vonHoldt BM (2012) STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources, 4, 359-361.

Eliades N-G, Eliades DG (2009) HAPLOTYPE ANALYSIS: software for analysis of haplotypes data. Distributed by the authors. Forest Genetics and Forest Tree Breeding, Georg-Augst University Goettingen, Germany.

Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology, 14, 2611– 2620.

92

Excoffier L, Lischer HEL (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567.

Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution, 39, 783-791.

Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics. Cambridge University Press, Cambridge.

Freeland JR, Kirk H, Petersen S (2011) Molecular Ecology, Second Edition. Wiley- Blackwell, New York.

Galbraith HS, Zanatta DT, Wilson CC (2015) Comparative analysis of riverscape genetic structure in rare, threatened and common freshwater mussels. Conservation Genetics, 16, 845-857.

Galtier N, Nabholz B, Glémin S, Hurst GDD (2009) Mitochondrial DNA as a marker of molecular diversity: a reappraisal. Molecular Ecology, 18, 4541-4550.

George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009). Guidelines for propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.

Ginson R, Walter RP, Mandrak NE, Beneteau CL, et al. (2015) Hierarchical analysis of genetic structure in the habitat-specialist Eastern Sand Darter (Ammocrypta pellucida). Ecology and Evolution, 5, 695–708.

Gum B, Gross R, Kuehn R (2005) Mitochondrial and nuclear DNA phylogeography of European grayling (Thymallus thymallus): evidence for secondary contact zones in central Europe. Molecular Ecology, 14, 1707-1725.

Hänfling B, Hellemans B, Volckaert FAM, Carvalho GR (2002) Late glacial history of the cold‐adapted freshwater fish Cottus gobio, revealed by microsatellites. Molecular Ecology, 11, 1717-1729.

Helfman GS (2007) Fish Conservation: A Guide to Understanding and Restoring Global Aquatic Biodiversity and Fishery Resources. Island Press, Washington, DC.

Hewitt GM (1996) Some genetic consequences of ice ages, and their role in divergence and speciation. Biological journal of the Linnean Society, 58, 247-276.

Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 359, 183-195.

93

Hocutt CH, Jenkins RE, Stauffer Jr JR (1986) Zoogeography of the fishes of the central Appalachians and central Atlantic coastal plain. In: Zoogeography of North American Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 161-211. John Wiley and Sons, New York.

Hocutt CH, Wiley EO (1986) The Zoogeography of North American Freshwater Fishes. John Wiley and Sons, New York.

Houston DD, Shiozawa DK, Riddle BR (2010) Phylogenetic relationships of the western North American cyprinid genus Richardsonius, with an overview of phylogeographic structure. Molecular Phylogenetics and Evolution, 55, 259–73.

Hoxmeier RJH, Dieterman DJ, Miller LM (2015) Brook Trout Distribution, Genetics, and Population Characteristics in the Driftless Area of Minnesota. North American Journal of Fisheries Management, 35, 632-648.

Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics, 23, 1801-1806.

Jezkova T, Leal M, Rodríguez‐Robles JA (2013) Genetic drift or natural selection? Hybridization and asymmetric mitochondrial introgression in two Caribbean lizards (Anolis pulchellus and Anolis krugi). Journal of Evolutionary Biology, 26, 1458-1471.

Kalinowski ST (2005) HP-Rare 1.0: a computer program for performing rarefaction on measures of allelic richness. Molecular Ecology Notes, 5, 187–189.

Sweeten J (Manchester University) (2012) Redside dace (Clinostomus elongatus) in Mill Creek, Wabash County, Indiana: A strategy for research and augmentation. Indiana Department of Natural Resources.

Kluge AG (1989) A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Systematic Zoology, 38, 7-25.

Leidy RA, Cervantes‐Yoshida K, Carlson SM (2011) Persistence of native fishes in small streams of the urbanized San Francisco Estuary, California: acknowledging the role of urban streams in native fish conservation. Aquatic Conservation: Marine and Freshwater Ecosystems, 21, 472-483.

Lewis CM, Moore TC, Rea DK, Dettman DL, et al. (1995) Lakes of the Huron basin: their record of runoff from the Laurentide Ice Sheet. Quaternary Science Reviews, 13, 891-922.

94

Lewis CM, Karrow PF, Blasco SM, McCarthy FM, et al. (2008) Evolution of lakes in the Huron basin: deglaciation to present. Aquatic Ecosystem Health & Management, 11, 127- 136.

Librado P, Rozas J (2009) DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics, 25, 1451–1452.

Litvak MK, Mandrak NE (1993) Ecology of freshwater baitfish use in Canada and the United States. Fisheries, 18, 6-13.

MacKenzie DI, Nichols JD, Royle JA, Pollock KH (2006) Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Burlington, MA.

Maggs CA, Castilh R, Foltz D, Henzler C, et al. (2008) Evaluating signatures of glacial refugia for North Atlantic benthic marine taxa. Ecology, 89, S108-S122.

Mandrak NE, Crossman E (1992) Postglacial dispersal of freshwater fishes into Ontario. Canadian Journal of Zoology, 70, 2247-2259.

Mayden RL (1988) Vicariance biogeography, parsimony, and evolution in North American freshwater fishes. Systematic Biology, 37, 329-355.

McCusker MR, Mandrak NE, Egeh B, Lovejoy NR (2014) Population structure and conservation genetic assessment of the endangered Pugnose Shiner, Notropis anogenus. Conservation Genetics, 15, 343-353.

McDermid JL, Wozney JK, Kjartanson SL, Wilson CC (2011) Quantifying historical, contemporary, and anthropogenic influences on the genetic structure and diversity of lake sturgeon (Acipenser fulvescens) populations in northern Ontario. Journal of Applied Ichthyology, 27, 12–23.

Ministry of Natural Resources (MNR) (2011) DRAFT Guidance for Development Activities in Redside Dace Protected Habitat. Ontario Ministry of Natural Resources, Peterborough, Ontario.

Miyamoto MM, Fitch WM (1995) Testing species phylogenies and phylogenetic methods with congruence. Systematic Biology, 44, 64-76.

Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in Ecology & Evolution, 9, 373-375.

Mousson L, Dauga C, Garrigues T, Schaffner F, Vazeille M, Failloux AB (2005) Phylogeography of Aedes (Stegomyia) aegypti (L.) and Aedes (Stegomyia) albopictus

95

(Skuse)(Diptera: Culicidae) based on mitochondrial DNA variations. Genetical research, 86, 1-11.

Nei, M, Tajima F, Tateno Y (1983) Accuracy of estimated phylogenetic trees from molecular data. Journal of Molecular Evolution, 19, 153-170.

Novinger DC, Coon TG (2000) Behavior and physiology of the redside dace, Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of Fishes, 57, 315–326.

Overpeck JT, Webb RS, Webb T (1992) Mapping eastern North American vegetation change of the past 18 ka: No-analogs and the future. Geology, 20, 1071-1074.

Parker BJ, McKee P, Campbell RR (1988) Status of the redside dace, Clinostomus elongatus, in Canada. Canadian Field Naturalist, 102, 163-169.

Peakall R, Smouse PE (2006) Genalex 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288–295.

Pitcher TW, Beneteau CL, Walter RP, Wilson CC et al. (2009) Isolation and characterization of microsatellite loci in the redside dace, Clinostomus elongatus. Conservation Genetics Resources, 1, 381–383.

Placyk JS, Burghardt GM, Small RL, King RB, et al. (2007) Post-glacial recolonization of the Great Lakes region by the common gartersnake (Thamnophis sirtalis) inferred from mtDNA sequences. Molecular Phylogenetics and Evolution, 43, 452-467.

Poos MS, Jackson DA (2012) Impact of species-specific dispersal and regional stochasticity on estimates of population viability in stream metapopulations. Landscape Ecology, 27, 405-416.

Poos MS, Lawrie D, Tu C, Jackson DA, et al. (2012) Estimating local and regional population sizes for an endangered minnow, redside dace (Clinostomus elongatus), in Canada. Aquatic Conservation: Marine and Freshwater Ecosystems, 22, 47-57.

Pritchard JK, Stephens M and Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945–959.

Raymond M, Rousset F (1995) GENEPOP (Version 1.2): Population Genetics Software for Exact Tests and Ecumenicism. Journal of Heredity, 86, 248-249.

Redford KH, Amato G, Baillie J, Beldomenico P, et al. (2011) What does it mean to successfully conserve a (vertebrate) species? BioScience, 61, 39-48.

96

Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus elongatus) in Ontario. Ontario Ministry of Natural Resources. Peterborough, ON.

Reid SM, Kidd A, Wilson CC (2012) Validation of buccal swabs for noninvasive DNA sampling of small‐bodied imperiled fishes. Journal of Applied Ichthyology, 28, 290-292.

Rosenberg NA (2004) DISTRUCT: a program for the graphical display of population structure. Molecular Ecology Notes, 4, 137-138.

Rousset F (1997) Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics, 145, 1219–1228.

Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular biology and evolution, 4, 406-425.

Sambrook J, Fritsch EF, Maniatis T (1989) Molecular Cloning. Cold Spring Harbor Laboratory Press, New York.

Schmidt RE (1986) Zoogeography of the Northern Appalachians . In: Zoogeography of North American Freshwater Fishes (eds Hocutt CH, Wiley EO), pp. 137-159. John Wiley and Sons, New York.

Scott WB, Crossman EJ (1973) Freshwater Fishes of Canada. Bulletin of Fisheries Research Board of Canada 184.

Simon TP, Burskey JL (2014) Spatial Distribution and Dispersal Patterns of Central North American Freshwater Crayfish (Decapoda: Cambaridae) with Emphasis on Implications of Glacial Refugia. International Journal of Biodiversity.

Sivasundar A, Bermingham E, Orti G (2001) Population structure and biogeography of migratory freshwater fishes (Prochilodus: Characiformes) in major South American rivers. Molecular Ecology, 10, 407-417.

Soltis DE, Morris AB, McLachlan JS, Manos PS, et al. (2006) Comparative phylogeography of unglaciated eastern North America. Molecular Ecology, 15, 4261– 4293.

Stepien CA, Faber JE (1998) Population genetic structure, phylogeography and spawning philopatry in walleye (Stizostedion vitreum) from mitochondrial DNA control region sequences. Molecular Ecology, 7, 1757-1769.

Takezaki N, Nei M, Tamura K (2010) POPTREE2: Software for constructing population trees from allele frequency data and computing other population statistics with Windows interface. Molecular Biology and Evolution, 27, 747–752.

97

Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Molecular Biology and Evolution, 24, 1596– 1599.

Van Oosterhout C, Hutchinson WF, Willis DP, Shipley P (2004) MICRO‐CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes, 4, 535-538.

Ver Steeg K (1946) The Teays River. Ohio Journal of Science, 46, 297-307.

Wang L, Lyons J, Kanehl, P, Bannerman R (2001) Impacts of urbanization on stream habitat and fish across multiple spatial scales. Environmental Management, 28, 255-266.

Wilson CC, Hebert PD (1996) Phylogeographic origins of lake trout (Salvelinus namaycush) in eastern North America. Canadian Journal of Fisheries and Aquatic Sciences, 53, 2764-2775.

Wilson CC, Hebert PD (1998) Phylogeography and postglacial dispersal of lake trout (Salvelinus namaycush) in North America. Canadian Journal of Fisheries and Aquatic Sciences, 55, 1010-1024.

Wozney KM, Haxton TJ, Kjartanson S, Wilson CC (2011) Genetic assessment of lake sturgeon (Acipenser fulvescens) population structure in the Ottawa River. Environmental Biology of Fishes, 90, 183-195.

Figure 3.1:Distribution map of sampling locations for Redside Dace (Clinostomus elongatus). Inset map shows the species’ global range (reproduced from COSEWIC 2007 report, with permission), with the polygon enclosing the species range.

98

Haplogroup A

Haplogroup B

Figure 3.2: Mutational network observed among C. elongatus haplotypes for ATPase 6 and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.2; each node represents one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution.

99

100

Haplogroup A

Haplogroup B

Figure 3.3: Neighbour-joining dendrogram of relationships among ATPase 6 and 8 haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.2) are represented by numbers outside brackets, while number of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50 %.

Haplogroup D

Haplogroup C

Figure 3.4: Mutational network observed among C. elongatus haplotypes for cytochrome b based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.4; each node represents one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution of haplogroups (purple=haplogroup C; light green=haplogroup D).

101

102

Haplogroup C

Haplogroup D

Figure 3.5: Neighbour-joining dendrogram of relationships among cytochrome b haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.4) are represented by numbers outside brackets, while numbers of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50 %

Haplogroup 1

Haplogroup 3

Haplogroup 2

Figure 3.6: Mutational network observed among C. elongatus haplotypes for total evidence for cytochrome b and ATPase 6 and 8 based on parsimony. Each numeric circle corresponds to a haplotype listed in Table 3.5; while each node represent one nucleotide substitution. Branch lengths do not correspond to genetic distance. Inset map shows the geographic distribution of haplogroups (orange=haplogroup 1, olive green=haplogroup 3, red=haplogroup 2).

103

104

Haplogroup 1

Haplogroup 2

Haplogroup 3

Figure 3.7: Neighbour-joining dendrogram of relationships among total evidence (cytochrome b and ATPase 6 and 8) haplotypes based on p-distances with 500 bootstrap replicates. Haplotype numbers (Table 3.5) are represented by numbers outside brackets, while number of individuals are represented by numbers inside brackets. Numbers at branch nodes show bootstrap support values >50%.

Figure 3.8: Distribution of haplogroups (orange=haplogroup 1, green=haplogroup 3, red=haplogroup 2, black= unassigned) for combined cytochrome b and ATPase 6 and 8 data using groups identified via mutational network (Figure 3.6) and genetic distance (Figure 3.7).

105

0 350 0 5 10 15 20 25 30 35

-5000 Mean LnP(K) Delta K 300

-10000 250

-15000 200 -20000

150 Delta K

Mean Mean LnP(K) -25000

100 -30000

-35000 50

-40000 0

Number of Clusters (K)

Figure 3.9: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, where K represents number of genetically unique populations. Analyses were run at K=1 to K=29, and methods outlined in Chapter 2. Results analysed using (i) Log Likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al. (2005). 106

Figure 3.10: Bayesian clustering assignment implemented in STRUCTURE for 28 populations at (a) K=3 for range-wide analysis (b) results of separate STRUCTURE runs on the three identified subsets for fine-scale analysis, showing optimal K values along with preceding and successive K values. All runs were implemented with no admixture, and independent allele

frequencies. Colours between different runs have no association with each other. 107

Figure 3.11: Results from Bayesian clustering analyses in STRUCTURE for Redside Dace individuals, for three genetic groups identified by K=3 on Figure 3.10, where K represents number of genetically unique populations. Analyses were run for red group from K=1 to K=10 (top left), green group from K=1 to K=20 (top right), and blue group from K=1 to K=10 (bottom) group using methods

outlined in Chapter Two. Results analysed using (i) Log Likelihood (L(K)), and (ii) ∆K approach outlined in Evanno et al. (2005). 108

SAU Group B GUL

EBC RED EFI

MIL BRU HAN Group C

COB PCoA PCoA (14.2%) 2 SMC DON ROU FOU TTR MIT LRR HUM LCR EBM CAR NFR STR Group A UNN KET WOO OST DOD BHR

PCoA 1 (33.2%)

Figure 3.12: Principal coordinate analysis (PCoA) of genetic structure across all sampled Redside Dace populations (red= cluster A, blue= cluster B, green=cluster C). Inset map shows the geographic distributions of each genetic group.

109

110

Group A

Group B

Group C

DA

Figure 3.13: Neighbour joining dendrogram of genetic relationships among sampled populations based on Nei et al. (1983) DA genetic distance for 10 microsatellite loci. Numbers at branch nodes represent bootstrap support values > 50% based on 500 bootstrap replicates. Groups correspond to those identified in Figure 3.12.

Figure 3.14: Plot of isolation by distance for pairwise population comparisons of transformed geographic distance [ln (distance in km+1)] versus genetic divergence [(FST)/(1-FST)].

111

Figure 3.15: Isolation by distance plot of transformed geographic distance [ln (distance in km+1)] versus genetic divergence [(FST)/(1-FST)] for population population pairs with geographic distances of less than 123 km. Points in yellow represent pairwise comparisons among the Allegheny River populations.

112

Table 3.1: Locations with drainage, jurisdiction, code names, latitude/longitude and number of samples obtained for mtDNA and microsatellite genetic samples used for study.

State/Province Drainage Population Code Latitude Longitude ATPase Cyt Microsatellite Name (ºN) (ºW) b Minnesota Little Cannon LCR 44.35 -92.96 12 14 33 Mississippi R River Minnesota North Fork NFR 44.30 -92.79 14 15 35 Mississippi R Zumbro Michigan Lake Superior Unnamed Creek UNN 46.68 -90.02 12 12 36 Michigan Lake Superior Schroeder Creek SCH 46.56 -89.83 7 6 -- Wisconsin Mississippi R Little Rib River LRR 45.09 -89.81 12 10 23 Wisconsin Mississippi R East Fork EFR 42.53 -89.13 1 1 -- Raccoon Creek Indiana Ohio R Mill Creek MIL 40.77 -85.90 13 14 35 Indiana Ohio R Hanna Creek HAN 39.66 -84.88 13 10 44 Ontario Lake Huron Two Tree River TTR 46.25 -84.03 22 21 40 Kentucky Kentucky R East Fork Indian EFI 37.87 -83.66 11 14 29 (Ohio R) Creek Kentucky Kentucky R Red River RED 37.83 -83.63 10 9 10 (Ohio R.) Kentucky Licking R Brushy Fork BRU 37.95 -83.51 14 14 29 (Ohio R) Ontario Lake Huron Gully Creek GUL 43.61 -81.66 14 15 36 Ohio East Branch EBC 41.54 -81.29 16 16 28 Lake Erie Chagrin River Ontario Lake Huron Saugeen River SAU 44.25 -80.41 2 0 14 West Virginia Monongahela R Straight Fork STR 38.85 -80.37 10 9 12 (Ohio R) West Virginia Monongahela R Whiteday Creek WDC 39.43 -79.97 3 2 -- (Ohio R)

Ontario Sixteen Mile SMC 43.57 -79.89 -- -- 45 113 Lake Ontario Creek

Table 3.1 (continued)

State/Province Drainage Population Code Latitude Longitude ATPase Cyt Microsatellite Name (ºN) (ºW) b Ontario Lake Ontario Fourteen Mile FOU 43.42 -79.77 12 12 43 Ontario Kettleby Creek KET 44.00 -79.56 18 16 35 Ontario Lake Ontario Humber River HUM 43.92 -79.56 -- -- 36 Ontario Lake Ontario Mitchell Creek MIT 43.97 -79.14 -- -- 34 Ontario Lake Ontario Carruthers Creek CAR 43.92 -79.03 14 12 50 Pennsylvania Allegheny R East Branch EBM 41.01 -78.76 12 12 34 (Ohio R) Mahoning Pennsylvania Allegheny R Bloomster Hollow BHR 41.75 -78.52 22 21 31 (Ohio R) Run New York Allegheny R Dodge Creek DOD 42.13 -78.24 18 17 21 (Ohio R) New York Susquehanna R Otselic River OST 42.76 -75.74 12 14 17 New York Black R (Lake Kidder Creek KID 43.93 -75.64 5 5 -- Ontario) New York Black R (Lake Cobb Creek COB 43.85 -75.64 12 13 22 Ontario)

114

Table 3.2: ATPase variable sites for 23 unique haplotypes of C. elongatus (1st column), nucleotide positions at which mutations occur (1st row), number of individuals (N) and populations that contain that particular haplotype. Haplotype 1A represents reference sequence for table; dots within a cell represent nucleotide positions identical to the reference sequence.

0 0 0 0 1 1 1 1 1 2 2 2 2 3 3 3 4 5 5 5 5 5 6 6 6 6 6 6 6 Hap 3 5 5 6 3 4 5 5 8 0 1 7 9 4 5 8 0 1 3 5 6 9 0 5 5 6 7 7 9 N Populations 5 0 8 7 6 8 0 2 1 5 4 4 5 7 8 6 0 1 2 6 8 8 7 0 2 7 7 9 4 DOD, OST, COB, KID, DON, CAR, 1A C C C G C G G G G G G A A A A C A C T T G A C C G G C A G 136 FOU, HAN, MIL, KET, EBC, WOO, BHR 2A . . . . . A ...... 1 DOD DOD, WOO, 3A ...... G ...... 39 BHR, EBM, STR, WDC 4A ...... G A ...... 2 OST 5A ...... G ...... 3 CAR 6A ...... G . 4 GUL 7A ...... A ...... G . 10 GUL 8A ...... T . . 2 HAN 9A T ...... A A . . . G . . T . C ...... 23 EFR, TTR 10A ...... A . . . 11 MIL 11A ...... A ...... 5 KET 12A ...... C . . . G ...... 23 EFI, RED 13A ...... C . . . G ...... A ...... 12 BRU 14A T . . . T ...... G . . . . C . . . T A . . . . 15 SCH, UNN SCH, LCR, 15A T . . . T ...... G . . . . C . . . . A . . . . 38 NFR, LRR 16A T . . . T ...... G G . . . . C . . . . A . . . . 2 LCR

17A T . . . T . A ...... G . . . C C . . . . A . . . . 2 NFR 115 18A ...... G ...... C 1 WOO

19A ...... G . . G ...... 3 BHR 20A . T T ...... A ...... 1 BHR 21A ...... T ...... 1 EBM 22A ...... T ...... G . 2 SAU 23A . . . A ...... G ...... 2 WDC

116

Table 3.3: Summary of ATPase 6 and 8 and Cytochrome b sequence results for 27 Redside Dace populations, showing numbers of sequenced individuals (N), number of haplotypes detected (Nh), haplotypic richness (HR) and haplotype diversity (h) and nucleotide diversity (π) for 27 Redside Dace populations.

ATPase 6 and 8 Cytochrome b Population N Nh Rh h π N Nh Rh (n=9) h π (n=10) LCR 12 2 0.99 0.30 3.80 x 10-4 14 3 1.8 0.56 9.00 x 10-4 NFR 14 2 0.93 0.27 6.50 x 10-4 15 1 0.00 0 0 UNN 12 1 0.00 0 0 12 1 0.00 0 0 SCH 7 2 -- 0.57 7.1 x 10-4 6 1 -- 0 0 LRR 12 1 0.00 0 0 10 3 1.9 0.60 6.1 x 10-4 MIL 13 2 0.96 0.28 3.5 x 10-4 14 2 1.00 0.53 4.8 x 10-4 HAN 13 2 0.96 0.28 3.5 x 10-4 10 1 0.00 0 0 TTR 22 1 0.00 0 0 21 1 0.00 0 0 EFI 11 1 -- 0 0 14 2 0.64 0.14 1.3 x 10-4 RED 10 1 0.00 0 0 9 1 0.00 0 0 BRU 14 2 0.26 3.3 x 10-4 14 1 0.00 0 0 GUL 14 2 1.0 0.44 5.5 x 10-4 15 2 0.96 0.34 6.2 x 10-4 EBC 16 1 0.00 0 0 16 1 0.00 0 0 SAU 2 1 -- 0 0 0 0 -- 0 0 STR 10 1 0.00 0 0 9 2 1.00 0.40 3.5 x 10-4

117

Table 3.3 (continued)

ATPase 6 and 8 Cytochrome b Population N Nh Rh h π N Nh Rh (n=9) h π (n=10) -4 WDC 3 2 -- 0.67 8.3 x 10 2 1 -- 0 0 FOU 12 1 0.00 0 0 12 3 1.96 0.68 7.6 x 10-4 KET 18 2 1.0 0.43 5.3 x 10-4 16 2 0.98 0.40 1.09 x 10- 3 WOO 13 3 1.77 0.61 8.3 x 10-4 11 4 2.64 0.67 4.82 x 10- 4 DON 14 1 0.00 0 0 13 1 0.00 0 0 CAR 14 2 0.99 0.36 4.5 x 10-4 12 1 0.00 0 0 EBM 12 2 0.83 0.17 4.1 x 10-4 12 5 3.46 0.80 3.00 x 10- 3 BHR 22 4 2.30 0.64 1.27 x 10- 21 7 3.66 0.84 5.22 x 10- 3 3 DOD 18 3 1.53 0.45 5.9 x 10-4 17 6 3.80 0.83 4.07 x 10- 3 OST 12 2 0.99 0.30 7.5 x 10-4 14 2 0.64 0.14 1.3 x 10-4 KID 5 1 -- 0 0 5 1 -- 0 0 COB 12 1 0.00 0 0 12 2 0.75 0.17 1.50 x 10- 4

118

Table 3.4: Cytochrome b variable sites for 35 unique haplotypes of C. elongatus (1st column), showing nucleotide positions at which mutations occur (1st row), number of individuals (N) and populations that contain that particular haplotype. Dots within a cell represent nucleotide positions identical to the reference sequence. Hap 1 3 9 1 1 1 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 4 5 5 5 6 6 1 0 0 0 2 2 1 1 2 3 5 8 3 7 9 0 0 2 3 3 5 6 1 3 7 0 1 3 6 8 2 3 7 4 5 8 0 5 9 2 8 6 5 8 6 1 3 9 3 8 2 1B A A T A A T C C G T T C G C G G C A G T A G A G A C C 2B ...... A . C T . . T . . . A . . A . . . . T 3B ...... A C C T . . T . . . A . . A . . . . T 4B ...... A . C T . . T . . G A . . A . . . . T 5B ...... A . C T . . T A . . A . . A . . . . T 6B ...... T ...... T ...... 7B ...... A . C T . . T . . . A . . A . . . . T 8B ...... A . C T . . T . . . A . . A . . . . T 9B ...... A . C T ...... A . . A . . . . T 10B ...... A . C T . . T . . . A . . A . . . . T 11B ...... A . C T ...... A . . A . . . . T 12B . . . . G . . . A . C T ...... A . . A . . . . T 13B ...... A . . T ...... A . . A . A . . T 14B ...... A . C T . . . . T . A . . A . . . . T 15B . G ...... A . C T . . . . T . A . . A . . . . T 16B ...... A . C T . . T . . . A C . A . . . T T 17B ...... A . C T ...... A . . A . . . . T 18B ...... A . C T ...... A . . A . . G . T 19B ...... A . . T ...... A . . A . . . . T 20B ...... A . . T . . . . T . A . . A . . . . T 21B ...... A . . T ...... A . . A . . . . T 22B ...... T A . C T . . T . . . A . . A . . . . T 23B . . . G . . . . A . C T . . T . . . A . . A . . . . T 24B . . . . . C ...... 25B ...... G . . . . 26B ...... T ...... 27B . . . . . C . . A . C T . . T . . . A . . A . . . . T 28B ...... A . C T A . T . . . A . G A . . . . T 119

29B ......

Table 3.4 (continued)

Hap 1 3 9 1 1 1 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 4 5 5 5 6 6 1 0 0 0 2 2 1 1 2 3 5 8 3 7 9 0 0 2 3 3 5 6 1 3 7 0 1 3 6 8 2 3 7 4 5 8 0 5 9 2 8 6 5 8 6 1 3 9 3 8 2

30B . . C ...... 31B ...... 32B ...... A . C T . . T . . . A . . A . . . . T 33B ...... A . . T ...... A . . A . . . . T 34B G ...... A . . T ...... A . . A . . . . T 35B ......

120

Table 3.4 (continued)

1 1 1 1 6 6 6 6 7 7 7 8 8 8 0 0 0 0 Hap 4 7 7 9 2 2 4 3 3 8 Populations 1 2 2 6 N 2 2 5 9 5 9 4 7 9 5 7 0 3 1 1B C G T G G T A C C A C G T A 23 DOD, WOO, EBM, STR, WDC 2B ...... T . . . 56 DOD, COB, KID, OST, CAR, FOU, BHR 3B ...... T . . . 1 DOD 4B ...... T . . . 3 DOD, BHR 5B ...... T . . . 5 DOD, BHR 6B ...... 1 DOD 7B ...... T . C . 1 OST 8B ...... T . T . . . 1 COB 9B ...... T . . . 56 DON, FOU, HAN, KET, EBC, 10B ...... T . . G 2 FOU 11B T ...... T . . . 3 GUL 12B ...... T . . . 12 GUL 13B ...... T . . G 22 EFR, TTR 14B ...... T . . . 8 MIL 15B ...... T . . . 6 MIL 16B ...... T . . . 4 KET 17B ...... T . . G 36 EFI, BRU, RED 18B ...... T . . G 1 EFI 19B ...... T . . G 48 UNN, SCH, LCR, NFR, LRR 20B ...... T . . G 2 LCR 21B ...... T . . T A . G 3 LCR 22B . . . . . C . . . . T . . . 3 WOO 23B ...... T . . . 1 WOO 24B ...... 1 WOO 25B ...... 6 BHR 26B ...... 4 BHR 27B ...... T . . . 1 BHR 28B ...... T . . . 3 BHR 29B . . . T ...... 1 EBM 121

30B . C . T ...... 4 EBM

Table 3.4 (continued)

1 1 1 1 6 6 6 6 7 7 7 8 8 8 0 0 0 0 Hap 4 7 7 9 2 2 4 3 3 8 Populations 1 2 2 6 N 2 2 5 9 5 9 4 7 9 5 7 0 3 1

31B ...... G ...... 2 EBM 32B ...... G T . . . 1 EBM 33B . . . . A . . . . . T . . G 3 LRR 34B ...... T . . G 1 LRR 35B . . C ...... 2 STR

122

123

Table 3.5: Haplotype name, number of individuals and population occurrences for 47 unique haplotypes based on combined sequences (ATPase 6 and 8, and cytochrome b).

Haplotype N Populations 1 2 WDC 2 18 STR, EBM, WOO, DOD 3 2 STR 4 3 LRR 5 27 LRR, LCR, NFR, SCH 6 1 LRR 7 21 TTR, EFR 8 4 EBM 9 2 EBM 10 1 EBM 11 1 EBM 12 3 BHR 13 3 BHR, DOD 14 6 BHR 15 49 BHR, FOU, CAR, KID, DOD, OST, COB 16 3 BHR 17 1 BHR 18 5 BHR, DOD 19 3 WOO 20 1 WOO 21 1 WOO 22 1 WOO 23 21 EFI, BRU, RED 24 54 EBC, KET, HAN, FOU, DON 25 2 NFR 26 1 LCR 27 3 LCR 28 2 LCR 29 15 SCH, UNN 30 1 EFI 31 12 BRU 32 4 KET 33 7 MIL 34 4 MIL 35 1 MIL 36 1 MIL 37 2 HAN 38 10 GUL 39 3 GUL

124

Table 3.5: (continued)

Haplotype N Populations 40 2 FOU 41 2 CAR 42 1 COB 43 2 OST 44 1 OST 45 1 DOD 46 1 DOD 47 1 DOD

125

Table 3.6: Genetic description of 28 Redside Dace populations (see Table 3.1 for localities). Columns represent letter codes, number of individuals genotyped (N), observed number of alleles (Na), standardized allelic richness (AR) for n=20 gene copies, observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient (FIS).

Pop N Na AR HO HE FIS LCR 33.0 5.5 3.89 0.39 0.38 0.08 NFR 34.0 3.5 2.86 0.36 0.35 -0.03 UNN 36.0 2.9 2.51 0.40 0.43 0.11 LRR 27.9 5.3 3.69 0.42 0.38 -0.09 MIL 35.0 3.9 3.25 0.49 0.49 -0.01 HAN 44.0 7.0 4.92 0.61 0.62 0.03 TTR 39.5 2.9 2.50 0.36 0.36 0.10 EFI 27.7 5.2 4.36 0.59 0.59 0.00 RED 10.0 3.3 3.30 0.61 0.51 -0.18 BRU 29.9 5.2 4.27 0.62 0.60 -0.03 GUL 33.7 3.1 2.63 0.39 0.41 0.08 EBC 28.0 6.1 4.91 0.65 0.60 -0.10 SAU 13.4 2.8 2.69 0.35 0.35 0.01 STR 12.0 3.5 3.39 0.45 0.44 0.06 SMC 44.0 5.4 4.01 0.54 0.54 0.00 FOU 42.5 4.7 3.85 0.57 0.55 -0.04 KET 33.4 3.8 3.34 0.48 0.45 -0.04 HUM 35.7 5.4 4.17 0.53 0.50 -0.04 WOO 31.0 6.8 5.19 0.58 0.57 0.06 DON 29.9 3.5 3.11 0.49 0.47 -0.06 ROU 45.9 5.1 3.83 0.59 0.56 -0.07 MIT 33.9 5.0 4.23 0.58 0.58 0.02 CAR 49.9 3.3 2.82 0.45 0.45 0.02 EBM 33.8 6.6 5.08 0.68 0.64 -0.07 BHR 31.0 7.8 5.72 0.59 0.61 0.03 DOD 21.0 6.1 4.96 0.60 0.56 -0.06 OST 16.8 4.7 4.13 0.51 0.48 -0.07 COB 22.0 5.1 4.03 0.50 0.48 -0.04

Table 3.7: Pairwise FST values among 28 Redside Dace populations along with sample size for each population. N EFI BRU LCR NFR HUM HAN MIL FOU KET SMC TTR CAR ROU DON MIT GUL DOD OST EFI 28 0.00 BRU 30 0.08 0.00 LCR 33 0.47 0.47 0.00 NFR 34 0.49 0.48 0.10 0.00 HUM 36 0.27 0.25 0.50 0.51 0.00 HAN 44 0.15 0.14 0.43 0.45 0.27 0.00 MIL 35 0.30 0.34 0.44 0.47 0.41 0.33 0.00 FOU 43 0.23 0.22 0.46 0.48 0.10 0.24 0.38 0.00 KET 34 0.28 0.27 0.57 0.58 0.21 0.30 0.46 0.21 0.00 SMC 45 0.26 0.27 0.49 0.50 0.19 0.24 0.37 0.18 0.25 0.00 TTR 40 0.48 0.49 0.39 0.38 0.51 0.44 0.45 0.49 0.58 0.51 0.00 CAR 50 0.33 0.31 0.47 0.48 0.21 0.33 0.39 0.22 0.31 0.32 0.48 0.00 ROU 46 0.22 0.21 0.46 0.48 0.13 0.21 0.40 0.13 0.18 0.17 0.49 0.25 0.00 DON 30 0.21 0.25 0.54 0.57 0.24 0.29 0.41 0.22 0.19 0.22 0.57 0.34 0.20 0.00 MIT 34 0.18 0.19 0.47 0.49 0.16 0.19 0.35 0.15 0.12 0.14 0.50 0.27 0.11 0.15 0.00 GUL 36 0.30 0.35 0.60 0.62 0.43 0.33 0.45 0.39 0.46 0.35 0.62 0.49 0.38 0.39 0.35 0.00 DOD 21 0.26 0.26 0.51 0.54 0.24 0.27 0.43 0.23 0.18 0.23 0.54 0.35 0.21 0.21 0.15 0.42 0.00 OST 17 0.34 0.35 0.54 0.57 0.34 0.30 0.47 0.32 0.30 0.28 0.57 0.44 0.28 0.28 0.23 0.48 0.13 0.00 COB 22 0.29 0.32 0.54 0.56 0.22 0.29 0.40 0.24 0.28 0.13 0.56 0.35 0.17 0.23 0.17 0.36 0.22 0.23 UNN 36 0.45 0.45 0.29 0.32 0.47 0.42 0.44 0.42 0.51 0.46 0.33 0.45 0.44 0.50 0.44 0.57 0.47 0.49 WOO 31 0.26 0.24 0.50 0.52 0.20 0.26 0.41 0.21 0.14 0.22 0.53 0.32 0.21 0.19 0.13 0.40 0.02 0.17 BHR 31 0.23 0.22 0.47 0.49 0.21 0.23 0.39 0.21 0.14 0.21 0.50 0.31 0.19 0.20 0.13 0.38 0.02 0.12 EBM 34 0.24 0.22 0.47 0.49 0.20 0.22 0.39 0.22 0.19 0.25 0.49 0.29 0.20 0.25 0.16 0.38 0.12 0.22 EBC 28 0.22 0.27 0.46 0.47 0.36 0.21 0.29 0.32 0.39 0.30 0.47 0.38 0.31 0.32 0.27 0.33 0.35 0.37 SAU 14 0.37 0.39 0.61 0.62 0.47 0.38 0.53 0.40 0.53 0.43 0.62 0.53 0.42 0.48 0.43 0.44 0.46 0.52 STR 12 0.30 0.29 0.57 0.59 0.35 0.31 0.44 0.33 0.30 0.33 0.59 0.42 0.32 0.35 0.24 0.50 0.18 0.35 LRR 28 0.47 0.47 0.36 0.33 0.50 0.43 0.46 0.46 0.57 0.49 0.35 0.45 0.47 0.55 0.48 0.61 0.51 0.56 RED 10 0.09 0.18 0.52 0.54 0.32 0.20 0.38 0.29 0.37 0.31 0.53 0.40 0.27 0.29 0.24 0.32 0.31 0.37

126

Table 3.7: (continued)

COB UNN WOO BHR EBM EBC SAU STR LRR RED EFI BRU LCR NFR HUM HAN MIL FOU KET SMC TTR CAR ROU DON MIT GUL DOD OST COB 0.00 UNN 0.51 0.00 WOO 0.23 0.46 0.00 BHR 0.21 0.43 0.04 0.00 EBM 0.25 0.43 0.12 0.08 0.00 EBC 0.31 0.44 0.35 0.33 0.32 0.00 SAU 0.47 0.57 0.45 0.42 0.41 0.39 0.00 STR 0.35 0.53 0.18 0.13 0.19 0.40 0.55 0.00 LRR 0.54 0.16 0.50 0.48 0.46 0.45 0.61 0.57 0.00 RED 0.32 0.49 0.31 0.27 0.26 0.25 0.41 0.37 0.52 0.00 127

Table 3.8: Analysis of Molecular Variance (AMOVA) for total evidence (Cytochrome b and ATPase 6 and 8) mitochondrial DNA data based on hypothesized (i) Mississippi and Atlantic refugia (2 groups), (ii) mitochondrial DNA bootstrap supported groups (3 refugia), and (iii) microsatellite Principal Coordinate Analysis clustering (3 refugia hypothesis), and hierarchical FST analysis of microsatellite data for (iv) eastern versus western groups (v) three groups identified by STRUCTURE, and (vi) contemporary drainage patterns.

Source of variation d.f. Sum of Variance % variation P-value squares components

(i) mtDNA: Mississippi versus Atlantic Among groups 1 337.08 2.73 53.79 <0.001 Among populations within 25 486.73 1.63 32.14 <0.001 groups Within populations 288 205.29 0.71 14.07 <0.001 Total 314 1029.10 5.07

(ii) mtDNA: 3 refugial groups (based on mtDNA bootstrap support for total evidence) Among groups 2 613.21 3.55 71.10 <0.001 Among populations within 28 316.88 1.09 21.91 <0.001 groups Within populations 284 99.01 0.35 6.99 <0.001 Total 314 1029.10 4.99

(iii) 3 refugial groups (mtDNA data, grouped by microsatellite for STRUCTURE and PCoA) Among groups 2 416.34 1.90 47.07 <0.001 Among populations within 24 407.45 1.42 35.24 <0.001 groups

Within populations 288 205.29 0.71 17.69 <0.001 128 314 1029.10 4.03

Total

Table 3.8 (continued)

Source of variation d.f. Sum of Variance Percent of P-value squares components % variation

(iv) Eastern versus Western populations Among groups 1 568.63 0.96 20.78 <0.001 Among populations within 26 1908.36 1.13 24.64 <0.001 groups Within populations 1736 4359.26 2.51 54.58 <0.001 Total 1763 6836.24 4.60

(v) K=3 STRUCTURE populations Among groups 2 1015.34 0.84 19.71 <0.001 Among populations within 25 1461.65 0.90 21.12 <0.001 groups Within populations 1736 4359.25 2.51 59.17 <0.001 Total 1763 6836.24 4.24

(vi) Contemporary drainage patterns Among groups 3 660.92 0.41 10.09 <0.001 Among populations within 24 1816.07 1.17 28.50 <0.001 groups Within populations 1736 4359.25 2.51 61.40 <0.001 Total 1763 6836.24 4.90

129

130

Chapter 4: General Discussion

The results presented in both data chapters highlighted the value of genetic tools and information for improving knowledge and management plans for species at risk.

Chapter 2 results demonstrated the effectiveness of environmental DNA (eDNA) for documenting the presence of Redside Dace, and that eDNA monitoring can be more sensitive than electrofishing at species detection. Environmental DNA results also showed that sampling design, number of replicates, and season are all important considerations for the application of the technique. Used properly, eDNA monitoring should help to increase the chances of detecting species when present at a site, and support the implementation of recovery actions (Darling and Mahon 2011). Chapter 3 results similarly showed the value of genetic information for species conservation, with both mitochondrial and microsatellite DNA identifying three phylogeographic lineages within Redside Dace. Combined mtDNA and microsatellite data also indicate the occurrence and extent of secondary contact between two of the lineages during postglacial colonization. Microsatellite data also showed that contemporary populations of Redside Dace are highly structured with little to no gene flow occurring, and that levels of genetic diversity within populations do not reflect the regional declines that have been observed.

Despite the potential for false positive and false negative detection in other studies

(Darling and Mahon 2011), Chapter 2 is the first to set qPCR detection thresholds using the Receiver Operator Characteristic (ROC) approach (Fan et al. 2006). While other studies accounted for false negatives by estimating imperfect detection (Schmidt et al.

2013, Ficetola et al. 2014, Hunter et al. 2015), there are few that quantify false positive

131

error rates (see critique by Wilson et al. 2015), and many fail to acknowledge amplification in the negative control wells (Laramie et al. 2015, Roussel et al. 2015).

Although widely used in medical diagnostics to evaluate the efficacy of diagnostic tests

(false versus true positive and negative test results; Kumar and Indrayan 2011), ROC has not previously been applied to interpreting eDNA detection levels and error rates. The application of ROC criteria to eDNA detections could substantially reduce the error rates and associated uncertainties highlighted by Darling and Mahon (2011). Setting an arbitrary low eDNA threshold to maximize potential detections could increase the rate of false positives, and lead to mistakenly protecting unoccupied habitats. Alternatively, setting a higher or overly conservative threshold would result in an increased risk of false negatives, potentially leading to a lack of protection for habitats supporting species of conservation concern, as seen in other species (reviewed in Miller et al. 1989). In at least two cases [European weather loach (Misgurnus fossilis) and spotted gar (Lepisosteus oculatus)], species were presumed locally extirpated, but eDNA yielded positive detections (Sigsgaard et al. 2015, Boothroyd et al. 2015). Integrating the ROC approach into future eDNA studies would be useful to reduce the risk of the negative conservation- related consequences of setting too low or too high a threshold.

The identification of three phylogeographic lineages within Redside Dace

(Chapter 3) is a significant contribution towards species management and recovery plans.

Conservation below the species level is critical for recovery efforts in order to maintain the adaptive resources and potential of intraspecific lineages, as well as conserving the species’ evolutionary legacy (Moritz 1994, Frankham et al. 2002, Geist 2011). My study is the first initiative to identify Redside Dace evolutionary lineages and hierarchical

132

genetic structuring across the species range. Based on these results, conservation strategies for managing Redside Dace should incorporate phylogeographic ancestry into recovery planning, as well as for reintroduction or translocation efforts.

Microsatellite data were also useful for identifying and mapping the different lineages within individual jurisdictions. Although all three lineages are present in the

United States, Redside Dace populations in individual states showed membership to only one major microsatellite group, despite mitochondrial evidence of secondary contact in

Pennsylvania and New York. By contrast, all three microsatellite-based groups were detected within Ontario, although all but one sampled population belonged to the same mitochondrial lineage. Strong spatial structuring and limited gene flow shown by microsatellite data in Chapter 3 also suggest that conservation efforts should be focused on local or regional scales. The majority of populations were genetically distinct from all others regardless of geographic distance or proximity. Accordingly, conservation efforts should take a population-based approach to manage Redside Dace where possible. It should be noted that the genetic diversity data did not always reflect population status.

Demographic data from some populations included in Chapter 3 indicate they may be below the numbers needed for long-term population viability (Poos et al. 2012) despite exhibiting moderate to high genetic diversity.

Both microsatellite and mitochondrial data suggest that Redside Dace populations in Ontario represent three distinct groups, which should be taken into account for future recovery efforts. The federal Species at Risk Act and the provincial Endangered Species

Act consider protection for units below the species level, and all populations in Ontario are currently considered at the species level provincially and by COSEWIC as a single

133

designatable unit (DU) for conservation (COSEWIC 2007). While evidence of genetic discreteness is important for identifying DUs, evidence of evolutionary and/or ecological significance is also required (COSEWIC 2014). Although very little ecological data is available for population comparisons, Novinger and Coon (2000) observed physiological differences between Redside Dace populations from separate lineages, suggesting adaptive ecological differences among the different genetic groups. It may therefore be worth considering whether the genetic groups identified in my thesis should be recognized as separate DUs. Regardless of whether populations in Canada are classified as a single or multiple DUs, recognition of multiple distinct genetic conservation units is important for recovery efforts; translocation of fish between different DU may not be successful, as demonstrated by Novinger and Coon (2000).

The combined results from Chapters 2 and 3 provide potent information and tools for aiding Redside Dace restoration and recovery efforts. Environmental DNA monitoring can be used in advance of re-introduction efforts to determine if candidate habitats still support unrecognized remnant populations. Negative eDNA results from a presumed extirpated site should reduce the risk of inadvertently stocking fish on top of a local population, with potential negative consequences (George et al. 2009). Conversely, if a site considered to be extirpated yields positive eDNA detections for Redside Dace and translocation is still considered appropriate, caution should be taken that the source population for translocations or stocking is from the same lineage as the recipient population. Translocating individuals into a re-introduction site that contains an undetected remnant population could substantially alter the genetic composition of the remnant population, as well as result in outbreeding depression (George et al. 2009).

134

Additionally, if an extirpated site tests “negative” for Redside Dace, eDNA could be used after the reintroduction has occurred to evaluate the success of translocation or stocking experiments.

For re-introduction efforts, the combination of genetic data and population abundance estimates (e.g. Redside Dace Recovery Team 2010, Poos et al. 2012) will be important to identify source populations for reintroductions at extirpated sites.

Mitochondrial and microsatellite data can be used in advance of re-introduction efforts to identify suitable populations to serve as sources for re-introduction based on evolutionary lineage(s) and levels of genetic diversity. Microsatellite markers can be used post re- introduction to determine translocation or stocking success; genetic diversity levels can be monitored after introduction, and assignment tests can be employed to identify how well the source population(s) has/have contributed genetic material to successive generations (Hansen et al. 2001, Piller et al. 2005). While captive breeding would allow for a higher number of fish to be released into the wild with minimal consequences to source populations, a potential undesirable genetic consequence of relying on hatchery production could be the release of maladapted fish (Araki et al. 2009). As microsatellite data have been used in sperm competition trials to assess the fitness importance of mate choice in Redside Dace (Beausoleil et al. 2012), it would also be possible to assess the effectiveness of hatchery mating based on mate choice to ensure that multiple fish are contributing to future generations in order to avoid inbreeding, versus random mating to select for specific strains. Augmentation using closely-related wild fish would result in minimal negative genetic consequences to the recipient population; however, population abundance estimates would also be advisable to ensure that the genetic diversity of source

135

populations are not compromised (Weeks et al. 2011). Additionally, genetic recapture initiatives can take place in lieu of physical tagging for mark-recapture in order to obtain population size estimates (Lukacs and Burnham 2005).

Future Directions

My thesis has made significant contributions to knowledge of Redside Dace conservation, and is a starting point for future recovery efforts. Environmental DNA can be used for long-term monitoring (Chapter 2), and the genetic data (Chapter 3) will serve as a baseline for future translocation efforts (Redside Dace Recovery Team 2010). As per recommendations of Chapter 3, multiple populations within each evolutionary lineage should be protected, so that the evolutionary potential and historical legacy of Redside

Dace can be maintained. Once sites are selected for re-introduction, degraded habitats can be restored, and suitable genetic stocks can be identified as source populations for recovery (Meffe 1995). Considerations for translocations should include employing ancestry matching versus environmental matching for selecting appropriate source and recipient populations, determining if one or multiple source populations should be translocated (Meffe 1995), and to decide whether transferring wild fish between populations or relying on captive bred fish would be more appropriate (George et al.

2009, Houde et al. 2015).

136

References

Araki H, Cooper B, Blouin MS (2009) Carry-over effect of captive breeding reduces reproductive fitness of wild-born descendants in the wild. Biology Letters, 5, 621-624.

Beausoleil JMJ, Doucet SM, Heath DD, Pitcher TE (2012) Spawning coloration, female choice and sperm competition in the redside dace, Clinostomus elongatus. Animal Behaviour, 83, 969-977.

Boothroyd M, Mandrak NE, Fox M, Wilson CC (2015) Environmental DNA (eDNA) detection and habitat occupancy of threatened spotted gar (Lepisosteus oculatus). Aquatic Conservation: Marine and Freshwater Ecosystems. Manuscript accepted.

COSEWIC (2007) COSEWIC assessment and updated status report on the Redside Dace Clinostomus elongatus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa, ON.

COSEWIC (2014) Guidelines for Recognizing Designatable Units. Committee on the Status of Endangered Wildlife in Canada. Available from: http://www.cosewic.gc.ca/eng/sct2/sct2_5_e.cfm.

Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based methods for monitoring biological invasions in aquatic environments. Environmental Research, 111, 978-988.

Fan J, Upadhye S, Worster A (2006) Understanding receiver operator characteristics (ROC) curves. Canadian Journal of Emergency Medicine, 8, 19-20.

Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false presences, and the estimation of presence / absence from eDNA metabarcoding data. Molecular Ecology Resources, 15, 543–556.

Frankham R, Briscoe DA, Ballou JD (2002) Introduction to Conservation Genetics. Cambridge University Press, Cambridge.

Geist J (2011) Integrative freshwater ecology and biodiversity conservation. Ecological Indicators, 11, 1507-1516.

George AL, Kuhajda BR, Williams JD, Cantrell MA, et al. (2009) Guidelines for propagation and translocation for freshwater fish conservation. Fisheries, 34, 529-545.

Hansen MM, Kenchington E, Nielsen EE (2001) Assigning individual fish to populations using microsatellite DNA markers. Fish and Fisheries, 2, 93-112.

137

Houde ALS, Garner SR, Neff BD (2015) Restoring species through reintroductions: strategies for source population selection. Restoration Ecology, 23, 746-753.

Hunter ME, Oyler-McCance SJ, Dorazio RM, Fike JA, et al. (2015) Environmental DNA (eDNA) sampling improves occurrence and detection estimates of invasive Burmese pythons. PloS one, 10, e0121655.

Kumar R, Indrayan A (2011) Receiver operating characteristic (ROC) curve for medical researchers. Indian Pediatrics, 48, 277-287.

Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an endangered salmonid using environmental DNA analysis. Biological Conservation, 183, 29-37.

Lukacs PM, Burnham KP (2005) Review of capture–recapture methods applicable to noninvasive genetic sampling. Molecular Ecology, 14, 3909-3919.

Meffe GK (1995) Genetic and ecological guidelines for species reintroduction programs: application to Great Lakes fishes. Journal of Great Lakes Research, 21, 3-9.

Miller RR, Williams JD, Williams JE (1989) Extinctions of North American fish during the past century. Fisheries, 6, 22-38.

Minckley WL (1995) Translocation as a tool for conserving imperiled fishes: experiences in western United States. Biological Conservation, 72, 297-309.

Moritz C (1994) Defining ‘evolutionarily significant units’ for conservation. Trends in Ecology & Evolution, 9, 373-375.

Novinger DC, Coon TG (2000) Behavior and Physiology of the redside dace, Clinostomus elongatus, a threatened species in Michigan. Environmental Biology of Fishes, 57, 315–326.

Piller KR, Wilson CC, Lee CE, Lyons J (2005) Conservation genetics of inland lake trout in the upper Mississippi River basin: stocked or native ancestry? Transactions of the American Fisheries Society, 134, 789-802.

Poos MS, Jackson DA (2012) Impact of species-specific dispersal and regional stochasticity on estimates of population viability in stream metapopulations. Landscape Ecology, 27, 405-416.

Redside Dace Recovery Team (2010) Recovery Strategy for Redside Dace (Clinostomus elongatus) in Ontario. Prepared for the Ontario Ministry of Natural Resources. Peterborough, ON.

138

Roussel JM, Paillisson JM, Tréguier A, Petit E (2015) The downside of eDNA as a survey tool in water bodies. Journal of Applied Ecology, 52, 823-826.

Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy models in the analysis of environmental DNA presence/absence surveys: a case study of an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.

Sigsgaard EE, Carl H, Møller PR, Thomsen PF (2015) Monitoring the near-extinct European weather loach in Denmark based on environmental DNA from water samples. Biological Conservation, 183, 46-52.

Wilson CC, Wozney KM, Smith CM (2015) Recognizing false positives: synthetic oligonucleotide controls for environmental DNA surveillance. Methods in Ecology and Evolution, doi: 10.1111/2041-210X.12452

Weeks AR, Sgro CM, Young AG, Frankham R, et al. (2011) Assessing the benefits and risks of translocations in changing environments: a genetic perspective. Evolutionary Applications, 4, 709-725.

Appendix 2.1: Field data collected at 29 sites including sampling dates, fish caught, habitat characteristics (channel width, channel depth, conductivity, temperature), and GPS coordinates.

Site Name Code Lat Long. Date Resample Mean Mean Mean Water Conductivi Total Fish sampled Date Channel Water width Temp ty (μS) Caught Spring Width Depth * (oC) (m) (m) mean depth Lynde Creek 4 L4 43.90 -78.96 30-May-13 4.78 0.32 1.53 17.9 704 0 Lynde Creek 3 L3 43.97 -78.96 30-May-13 5.07 0.24 1.22 14.2 643 0 Lynde Creek 1 L1 43.92 -78.98 14-May-13 28-May-13 0.86 0.11 0.09 8.9 596 1 Lynde Creek 2 L2 43.92 -78.99 14-May-13 28-May-13 5.09 0.19 0.97 7.6 670 0 Lynde Creek 5 L5 43.95 -78.99 11-Jun-14 1.72 0.27 0.46 17.9 704 0 East Carruthers 2 E2 43.92 -79.03 11-Jun-14 3.49 0.12 0.41 15.8 951 Creek 2 East Carruthers 1 E1 43.93 -79.03 07-Jun-13 2.28 0.36 0.82 13.1 803 Creek 1 Spring Creek DU2 43.93 -79.07 15-May-13 28-May-13 7.25 0.22 1.60 11.5 494 0 Ganateskiagon 1 DU3 43.88 -79.10 15-May-13 28-May-13 3.40 0.09 0.31 12.1 588 Creek Mitchell Creek DU1 43.97 -79.14 15-May-13 28-May-13 1.23 0.16 0.20 9.5 490 34 Morning Side 0 R4 43.82 -79.21 16-May-13 3.24 0.22 0.71 14 1281 Creek 0 Morning Side R3 43.83 -79.23 16-May-13 2.33 0.17 0.40 15.9 1367 Creek

Berczy Creek R6 43.88 -79.33 17-May-13 3.51 0.21 0.74 12.4 1282 1 Trib to Berczy 2 R2 43.88 -79.35 17-May-13 06-Jun-13 1.39 0.10 0.14 15.1 1514 Creek Leslie St. Trib R5 43.88 -79.39 23-May-13 3.97 0.24 0.95 17.6 960 21 Leslie St. Trib R1 43.90 -79.39 21-May-13 2.30 0.18 0.41 16.9 806 8

Don River Creek 0 139 D1 43.86 -79.46 21-May-13 06-Jun-13 2.45 0.12 0.29 18.9 1046 1

East Humber 0 H2 43.93 -79.53 06-Jun-13 4.07 0.23 0.9 14.6 724 Drive East Humber 2 H1 43.92 -79.56 31-May-13 7.80 0.27 2.1 20.6 759 River Purpleville Creek P1 43.84 -79.60 23-May-13 06-Jun-13 4.11 0.27 1.1 17.8 909 0 Humber Trail W1 43.90 -79.61 27-May-13 6.98 0.25 1.7 12.6 795 0 Fourteen Mile 2 F2 43.42 -79.73 24-May-13 04-Jun-13 6.56 0.11 0.72 12.3 1340 0 Fourteen Mile 1 F1 43.42 -79.76 24-May-13 04-Jun-13 3.05 0.13 0.40 17.9 1300 3 Fourteen Mile 3 F3 43.42 -79.77 24-May-13 04-Jun-13 2.39 0.28 0.67 10.4 775 6 Sixteen Mile 1 SI 43.57 -79.89 31-May-13 2.43 0.10 0.24 17.9 670 7 Silver Creek CR1 43.64 -79.92 24-May-13 5.83 0.22 1.28 11.2 650 0 Stan J 2 SJ2 43.49 -81.66 03-Jun-13 0.88 0.19 0.17 16.8 609 0 Gully Creek 1 GC1 43.61 -81.68 03-Jun-13 6.15 0.12 0.74 13.4 600 3 Stan J 1 SJ1 43.50 -81.70 03-Jun-13 1.38 0.21 0.29 13.3 690 0

140

Appendix 2.2a: Raw qPCR values (copies/reaction) for TaqMan Fast qPCR mastermix (Applied Biosystems Inc.) during Spring (S) and Fall (F) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates.

Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5 F-S1 F-S2 F-S3 F-S4 F-T1 F-T2 F-T3 F-T4 F-T5 L4 0.26 0.14 0.00 0.00 0.18 0.09 1.47 0.58 0.051 0.00 0.00 0.03 0.02 0.06 0.00 0.04 0.00 0.00 L3 0.00 0.00 0.68 0.66 0.20 0.00 0.00 0.31 0.20 0.00 0.34 0.18 0.23 0.00 0.13 0.08 0.00 0.00 L1 0.10 0.00 0.08 0.17 0.00 0.15 0.03 0.00 0.00 0.73 1.68 1.15 0.00 0.19 0.83 0.11 0.60 0.00 L2 0.32 0.33 0.00 0.62 0.00 0.65 0.87 1.84 0.00 0.40 0.14 2.79 0.71 0.00 0.00 0.40 0.14 3.30 L5 0.09 0.00 0.00 0.25 0.00 0.00 0.00 0.19 0.00 0.00 0.00 0.00 0.19 0.00 0.00 0.00 0.27 0.00 E2 0.00 0.51 0.19 0.211 0.85 0.21 0.190 0.401 0.48 2.73 6.84 0.12 3.75 3.69 3.17 7.15 4.85 1.05 E1 5.80 10.55 10.03 14.59 13.62 11.84 9.95 9.08 9.65 0.00 0.41 0.63 1.29 0.37 0.00 0.00 1.40 0.36 DU2 0.42 0.39 0.30 0.00 0.00 0.53 1.43 0.54 0.00 6.30 1.32 2.03 1.94 1.96 0.60 0.75 1.90 1.64 DU3 0.51 2.47 1.65 1.127 2.35 3.86 4.31 2.77 0.61 0.42 1.00 2.17 1.88 2.35 1.11 3.12 5.30 2.14 DU1 0.33 0.76 0.78 1.12 0.32 0.70 1.55 0.83 0.91 107.5 14.07 81.39 90.33 12.21 80.90 14.21 6.72 27.45 R4 0.73 1.23 0.45 0.070 1.11 0.00 0.56 0.64 0.26 0.20 0.00 0.00 0.31 0.00 0.00 0.00 0.00 0.65 R3 0.36 1.31 1.33 1.24 0.30 0.83 0.90 1.99 2.56 1.29 0.40 0.00 0.37 6.30 1.06 0.00 1.75 0.30 R2 8.91 3.18 5.25 3.46 7.83 4.50 4.67 6.40 13.41 0.74 0.53 1.38 0.76 0.77 0.00 0.00 0.32 0.05 R5 8.32 4.66 7.04 7.85 4.12 4.38 11.74 55.32 4.45 7.20 5.07 5.93 6.47 4.38 1.75 3.64 5.91 4.04 R6 9.53 6.53 5.46 8.06 5.40 5.48 8.16 5.67 3.78 1.15 4.130 3.59 4.71 11.95 4.77 4.90 1.29 2.55 R1 0.25 40.46 79.43 61.73 35.59 74.18 27.05 15.71 12.15 11.80 21.11 9.55 9.66 14.91 8.97 55.98 10.50 18.13 D1 14.35 8.88 6.90 5.37 24.64 23.61 25.83 11.12 33.23 3.53 1.98 0.79 5.56 4.35 1.54 1.35 2.58 1.72 H2 1.24 1.32 1.77 0.48 1.85 0.08 1.89 1.18 1.82 2.63 0.22 1.50 0.84 0.34 2.26 0.51 0.52 0.12 H1 5.14 11.28 7.94 6.97 6.28 11.41 12.11 10.52 10.26 1.08 1.50 1.04 27.88 1.27 0.71 0.25 0.74 0.18 P1 5.34 3.12 6.16 1.63 1.88 0.96 2.97 1.53 3.70 1.77 3.23 5.23 3.04 5.98 3.44 2.08 5.26 5.20 W1 1.23 0.00 0.65 1.74 1.03 0.84 0.00 0.58 0.90 27.87 0.96 0.00 2.53 0.15 2.68 0.00 0.05 1.28 F2 4.24 4.55 2.80 3.89 3.49 3.21 3.89 2.70 1.52 0.80 0.20 2.04 0.70 1.52 0.00 0.20 0.28 0.71

141 F1 8.98 16.36 14.15 8.24 0.77 16.56 154.80 10.21 7.67 1.04 6.32 20.51 6.23 4.74 3.20 4.63 3.26 1.51

Appendix 2.2a (continued)

Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5 F-S1 F-S2 F-S3 F-S4 F-T1 F-T2 F-T3 F-T4 F-T5 F3 9.65 7.16 8.48 7.62 4.71 1.40 20.08 15.24 13.39 1156.9 587.06 3.57 3.03 17.96 22.93 0.18 14.29 731.15 SI 21.74 5.13 8.44 17.04 13.23 12.38 21.69 12.44 8.80 18.96 11.56 15.92 4.04 9.73 7.36 18.87 6.08 15.32 CR1 1.85 1.61 0.63 2.36 0.76 1.93 2.76 2.31 0.94 0.00 0.00 0.00 0.19 0.00 0.00 0.33 0.00 0.75 SJ2 72.52 72.58 44.31 52.94 42.09 35.19 59.55 64.88 71.06 0.00 0.60 0.00 0.00 0.37 0.52 0.42 0.00 2.41 GC1 2.15 4.12 6.10 3.746 4.64 7.52 0.39 3.02 1.67 4.65 2.61 2.28 0.38 1.10 2.06 2.06 0.31 3.84 SJ1 15.40 19.96 7.48 33.20 21.51 17.45 23.78 23.72 18.13 11.53 7.98 11.21 18.11 3.01 3.42 6.35 6.93 11.29

142

Appendix 2.2b: Raw qPCR values (copies/reaction) for TaqMan Environmental qPCR mastermix (Applied Biosystems Inc.) during Spring(S) sampling season at 29 sampled Redside Dace sites for temporal (T1-T5) and spatial (S1-S4) replicates.

Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5 L4 0.10 0.28 0.36 1.91 0.39 0.28 3.98 0.88 0.60 L3 0.15 0.21 0.86 0.47 0.49 0.348 0.48 0.50 0.15 L1 0.17 0.45 0.53 1.745 0.61 0.15 0.40 0.02 1.06 L2 0.24 2.14 0.64 3.11 1.45 3.27 1.86 5.65 1.51 L5 0.00 0.53 0.08 0.00 0.15 0.09 0.088 0.30 0.12 E2 0.50 0.00 0.55 0.75 0.97 0.24 0.49 1.39 0.55 E1 8.51 9.54 11.25 25.38 20.92 18.78 19.81 18.21 17.50 DU2 0.385 0.73 0.72 0.95 0.49 0.32 1.19 0.77 2.97 DU3 1.386 3.42 1.89 2.08 3.04 2.31 3.96 5.30 3.12 DU1 0.11 3.64 0.73 2.65 1.83 0.29 3.50 2.65 4.36 R4 1.24 0.36 0.00 0.00 0.31 0.56 1.58 0.57 0.60 R3 0.98 5.24 3.45 2.62 3.38 2.19 4.30 6.58 4.75 R2 4.65 2.76 3.99 4.20 9.80 7.05 7.76 10.18 12.65 R5 8.38 6.82 8.57 4.80 5.46 8.69 8.01 48.55 14.21 R6 10.07 5.73 7.67 10.07 7.30 8.90 14.10 8.94 5.83 R1 0.54 65.89 76.62 70.28 51.20 110.13 52.47 34.94 28.12 D1 23.85 13.94 16.33 6.13 39.68 31.05 31.29 17.34 43.11 H2 1.11 0.60 0.26 0.52 0.29 0.11 1.56 0.00 0.40 H1 4.62 14.03 7.90 5.53 9.71 15.01 13.46 11.32 11.80 P1 6.79 4.62 7.78 3.57 1.57 2.01 1.56 3.68 4.06 W1 0.59 1.77 1.46 1.26 1.50 0.52 3.01 1.57 1.73

F2 1.53 5.14 1.22 6.44 1.39 1.64 4.81 2.22 2.32 143 F1 10.18 21.75 11.44 7.82 11.88 16.55 146.93 10.60 11.58

Appendix 2.2b (continued)

Code S-S1 S-S2 S-S3 S-S4 S-T1 S-T2 S-T3 S-T4 S-T5 F3 6.97 4.17 7.97 10.53 5.81 0.20 8.53 10.71 8.60 SI 17.69 7.51 12.34 15.62 13.83 12.35 24.92 10.84 13.71 CR1 2.35 2.92 0.89 2.72 1.53 1.285 3.12 2.09 3.53 SJ2 58.88 49.59 40.01 44.93 30.04 18.73 37.12 34.03 50.50 GC1 3.99 2.70 1.93 4.77 3.92 3.62 2.66 4.97 3.41 SJ1 10.65 13.48 6.52 18.92 16.38 17.74 17.16 13.21 17.30

144

145

Appendix 2.3: 10 Lake control sites absent for Redside Dace along with their GPS coordinates and date sampled.

Site Name Latitude Longitude Date Sampled Given

Stoco Lake 44.47 -77.31 June 14th, 2013

Moira Lake 44.48 -77.47 June 14th, 2013

Eels Lake 44.89 -78.11 June 13th, 2013

Cedar Lake 44.97 -77.76 June 14th, 2013

Paudask Lake 44.99 -78.07 June 13th, 2013

Monck Lake 44.99 -78.11 June 13th, 2013

Auger Lake 45 78.05 June 13th, 2013

Mayo Lake 45.05 -77.57 June 13th, 2013

Weslemkoon 45.08 -77.45 June 13th, 2013 Lake 1

Fraser Lake 45.19 -77.64 June 13th, 2013

146

Appendix 2.4: Comparison of Environmental versus Fast mastermix

Introduction

Real-time qPCR mastermixes, TaqMan® Fast Universal PCR Master Mix and

Environmental Master Mix 2.0, are both used in quantification assays, however, the sensitivity of the two have yet to be compared. The purpose of this section was to determine which mix was better at detecting eDNA at lower copy numbers. This will have important implications for future eDNA studies; a more sensitive mix will result in a higher proportion of true positives for monitoring programs.

Methods

Water samples from fall 2012 were run using TaqMan® Fast Universal PCR

Master Mix (referred to as fast mix herein), while water samples from spring 2013 were run using both fast mix as well as TaqMan® Environmental Master Mix 2.0 (referred to as environmental mix herein), the latter of which was first used in the lab during 2013.

The environmental mix could not be tested on fall 2012 water samples due to the potential for DNA damage as a result of freezing and thawing, which could have resulted in DNA degradation over the past year of being in the freezer. The fast mix was compared to the environmental mix to see if there was a significant difference between the two. Based on these results, no significant difference would indicate the environmental and fast mixes could be analyzed in conjunction, but a significant difference would indicate that the data from each mix would have to be analyzed separately.

147

The environmental mix was run with the samples collected in spring 2013/2014 and fall 2013. For two replicates, each sample was run with 15 L of the following cocktail: 10 L TaqMan® Environmental Master Mix 2.0, 0.4 L of RSD-R, 0.4 L of

RSD-F, 0.4 L of RSD-probe, 3.8 L of ddH2O, and 5 L of stock DNA. For one of the replicate plates: each sample was run with 15 L of the following cocktail in order to test for inhibition: 10 L TaqMan® Environmental Master Mix 2.0, 0.4 L of RSD-R, 0.4 L of RSD-F, 0.4 L probe, 1.2 L ddH2O, 2.2 L 10x TaqMan® Exogenous Internal

Positive Control, 0.4 L 50x TaqMan® Exogenous Internal Positive Control DNA. The internal positive control was run with the environmental mastermix reaction to determine if inhibition was present within the sample. StepOnePlus thermocycling conditions for

TaqMan® Environmental Master Mix 2.0 were as follows: initial denaturation for 10 min at 95 °C for, followed by a 2 step-process of a 15 s denaturation at 95 °C, and a 1 min annealing at 60 °C, repeated for 40 cycles. The fast mix was run on all field samples, while the environmental mix could not be used for the 2012 field collections because the lab just started using the mix in 2013. A paired t-test was used to determine if there was a significant difference between the mean of the environmental and fast mixes. To determine the direction and range of this difference, a Bland-Altman plot was generated by graphing the mean value of DNA copies/reaction for the fast and environmental mix against the difference of one method minus the other (Bland and Altman 2003). Due to the wide range of x-values, the x-scale was logged.

148

Results

The mean DNA copies/reaction for samples assayed using the environmental and fast mixes were significantly different based on a combination of a paired t-test and a

Bland-Altman plot. A paired t-test (n = 393 for both) showed that the log means of the two mixes are significantly different from each other, indicating that the data could not be pooled together (t = -6.5, p<0.05). Assumptions of normality and homogeneity of variance (F392=1.04, p>0.05) were met by adding a constant of 0.5 to copy number values and log base ten transforming. For values between zero and one copy/reaction, there appeared to be no difference between mixes because values within this range fell on the y

= 0 line. Above one copy/reaction however, an increase in the mean of the fast and environmental mixes is accompanied by an increase in the difference between mixes.

Although the points on the graph fall both above and below y = 0, the majority of y- values are greater than zero, indicating a bias towards higher values for the environmental mix. This is supported by an overall mean difference between the two mixes of y = 0.88

DNA copies/reaction, and the limits of agreement are y = -10.05 and y = 11.80 (Figure

A4-1). Despite the environmental mix being more sensitive at lower copy numbers, the fast mix results were used in the analysis because the qPCRs were run for all samples.

The IPC assay for inhibition could only be run for the environmental mix, and results indicated that for the most part, environmental water samples do not have inhibitors present, as illustrated by the similar spread in values of the control and environmental samples (Figure A4-2). Environmental samples had a median value of

28.87 copies/reaction (n = 392, x̅ = 28.60, s = 0.81), while the control samples had a median value of 28.84 copies/reaction (n = 139, x̅ = 28.73, s = 1.48). Of the 139 controls,

149

11 samples did not have a Ct value reading, while two environmental samples (L5-S1-

Fall and H2-T4) did not have a Ct value reading, indicating that these samples were potentially inhibited. Further tests were not done on these additional samples due to time constraints.

Reference

Bland JM, Altman DG (2003) Applying the right statistics: analyses of measurement studies. Ultrasound in Obstetrics & Gynocology, 22, 85-93.

Limits of agreement (y=11.80)

Mean difference of mixes (y=0.88)

Limits of agreement (y=-10.05)

Figure A4-1: Bland-Altman analysis of method comparison between TaqMan Fast and Environmental qPCR mastermixes (Applied Biosystems Inc.), with the x-axis showing the average copy number of a particular samples using the fast and environmental mix, and the y-axis showing the difference in copy number between the mixes using that same sample.

150

151

Figure A4-2: Boxplot of Internal Positive Control (IPC) values at the threshold cycle (Ct) they cross at during the environmental mix qPCR assay. All negative controls are represented by the “controls” boxplot on the left, while all eDNA water samples are represented by the “environmental” boxplot on the right.

152

Appendix 2.5: Separating the signal from the noise: using receiver operator characteristics to optimize sensitivity and specificity of environmental DNA detections

Introduction

A successful monitoring program is one that is able to detect change within a system soon after it takes place (Lindenmayer et al. 2013). DNA-based approaches are being utilized for monitoring because of their potential for higher specificity and sensitivity, non-intrusiveness, as well as their reduced costs (Darling and Mahon 2011).

In particular, environmental DNA (eDNA) detection, which refers to detecting target

DNA from a water sample, is more commonly being applied as a monitoring tool for both invasive and endangered species (Ficetola et al. 2008, Wilcox et al. 2013). False positive and false negative error rates are not uncommon in species monitoring programs. A ‘false positive’ refers to a species reported as being present at a particular site but is actually absent, while a ‘false negative’ refers to a species reported as being absent at a particular site, but is actually present (goes undetected). Despite the growing number of studies to use real-time PCR (qPCR) methodology to determine species presence, the accuracy of this platform in the eDNA field has been largely under-evaluated.

A Receiver Operator Characteristic (ROC) is a statistical test that can be used to measure the sensitivity and specificity of a test at varying thresholds (Metz 1978). The sensitivity of a test is defined as [True Positive (TP) / True Positive (TP) + False Negative

(FN)], and the test’s specificity as [True Negative (TN) / (True Negative (TN) + False

Positive (FP)] at varying pre-defined thresholds (Figure A5-1; Metz 1978). First used in

World War Two to differentiate the ability of radar operators to identify “noise” from

“true signal” (Fan et al. 2006), it is more commonly used in the medical literature to

153 evaluate the efficacy of diagnostic tests (Kumar and Indrayan 2011). ROC has been previously used to evaluate the efficacy of qPCR (Nutz et al. 2011). However, ROC has yet to be used to design eDNA monitoring programs and compare the effect of different

DNA copy thresholds on detection limits.

Methods

All eDNA samples were classified as either “control” or “detection”, in order for

ROC analysis to proceed. The “control” group consisted of filter, DNA extraction, field and lake negative controls; these are the samples that are theoretically free of Redside

Dace DNA (identified as the “negative” group). The “detection” group, consisted of any environmental samples that had a qPCR output of greater than zero, indicating Redside

Dace DNA presence (identified as the “positive” group). The ROC analysis arranges all datapoints in numerical order, and goes through every point (“positive” and “negative”), and sets that particular value as the threshold. Each datapoint is classified as one of four options: true positive (TP), true negative (TN), false positive (FP), and false negative

(FN) at each threshold. The TP is any “positive” that has a value above the threshold, while a TN is any “negative” value that falls below the threshold. The FP is any

“negative” value that falls above the threshold, while the FN is any “positive” value that falls below the threshold. This is calculated for all “positive” and “negative” cases.

To measure the accuracy of the qPCR assay at each threshold, sensitivity and specificity values are calculated. Sensitivity is measured as TP/(TP+FN), which are the

“positives” that are recognized as detections at a particular threshold. Specificity, on the other hand is measured as TN/(TN+FP), and are the “negatives” that are recognized as

154 non-detections at a particular threshold. Once these values are calculated for all data points, a curve of sensitivity versus specificity can be generated. Overall, an increase in threshold would result in higher sensitivity and a lower specificity, while a lower threshold would result in a lower sensitivity and higher specificity. The area under the curve (AUC) can then be calculated for all the data points to determine the probability that the qPCR assay will return a value that is greater for a random Redside Dace water sample (in this case), than a randomly chosen negative control sample. An AUC value of

1.0 indicates that the particular test is able to accurately classify “positive” from

“negative”, while a value of 0.5 indicates that the assay isn’t appropriate for distinguishing a “positive” test result from a “negative” test result (Figure A5-1).

Setting the threshold is a trade-off between sensitivity and specificity. Values between one and ten were chosen as candidate thresholds in order to set a threshold, reflective of the higher variance values observed within this range in the standards using the fast mastermix. The ROC curves and AUC values were created for the fast and environmental mixes and calculated in R using the program pROC, and sensitivities and specificities were calculated using MedCalc (http://www.medcalc.org/). Additionally, the approach taken by Kumar and Indrayan (2011) was taken into consideration when trying to identify the optimal threshold.

Results

At one copy/reaction, sensitivity was the highest (60.2%), while specificity was lowest (98.5%). The rate of true positives was highest at one copy/reaction, with a value of

42.8% (Table A5-1). The inverse relationship was observed when a higher threshold was

155 set; higher specificity was obtained at the cost of lower sensitivity. For example, at 10 copies/reaction, sensitivity was lowest (18.7%) and specificity was highest (100%). While there were no false positives detected at the ten copy threshold, the rate of true positive dropped from 42.8% at 1 copy/reaction to 13.2% at 10 copies/reaction (Table A5-1). The

AUC for the qPCR assay was 0.89 for fast mix and 0.84 for environmental mix (Figure

A5-1).

Based on the highest sensitivity: specificity ratio (Kumar and Indrayan 2011), the optimal threshold was 0.5 copies/reaction. This threshold was not used in my study because 0.5 copies/ reaction is theoretically impossible. Despite 1 copy/reaction having the highest sensitivity: specificity ratio, a threshold of 3 copies/reaction was selected. A more conservative threshold was used because of the inconsistencies and failures associated with amplifying 1 copy/reaction. This decision is supported by four lines of evidence: (i) the standards generated for each qPCR assay were quantified using DNA copies from 1x106 down to 1 copy/ reaction. When generating the curve, some of the data points at 1 copy/reaction had to be removed due to outliers (Figure 2.6), (ii) there was a failure rate of 34% when amplifying DNA at 1 copy/reaction for all standards, (iii) for the experiment in which the eDNA standards were set as unknowns to assess how accurately qPCR could identify copy numbers, two of the seven replicates did not amplify at 1 copy/ reaction, despite the DNA at 1 copy/ reaction being taken from the same working stock solution (iv) the MIQE guidelines suggest that 3 copies/reaction would be the most sensitive. Therefore, assuming ideal PCR conditions and by setting a threshold of 1 copy/reaction, there would be no room for error, which would be a difficult assumption to meet given the stochastic nature of qPCR (Bustin et al. 2009). From a management

156 perspective, a threshold of 3 copies/reaction would a) still be more sensitive than electrofishing, and b) be ideal for reducing the chances of having false positives with the more conservative threshold (in comparison to 1 copy/reaction).

Discussion

Contamination has been identified as a major issue for the reliable eDNA detection of target species, because of its ability to inflate false positive rates (Darling and

Mahon 2011, Thomsen and Willerslev 2014). Despite this, many studies have reported that contamination was not evident in their study (Turner et al. 2014, Laramie et al. 2015,

Spear et al. 2015), but it is unclear if this means (i) all negative controls had a qPCR value of 0 copies/reaction, or (ii) if there was background reading present but the values fell below the threshold and were therefore dismissed. My study indicated that positive readings in the negative controls are not frequent, but do occur due to the stochastic nature of PCR. This poses a problem for analyzing data, because of the difficulties in differentiating between a “true” positive versus a “false” positive at low level detections.

The use of a Receiver Operator Characteristics analysis in future studies is therefore encouraged in order to quantify error rates associated with setting different thresholds.

Setting an appropriate threshold is a critical first step for monitoring surveys because it dictates which sites are considered “positive” versus “negative”, and therefore impacts downstream data analysis. In spite of the growing number of studies and their focus on answering the more complex ecological questions in the eDNA literature, very little attention has been given to detection thresholds and its impacts on false positive error rates (Ficetola et al. 2014, Schmidt et al. 2013). At lower levels of DNA within a

157 sample, quantification error exists, which could negatively influence monitoring programs (Klymus et al. 2014). Furthermore, for the studies that have discussed detection limits, there is no accepted standard threshold for qPCR. For example, while Wilcox et al.

(2013) set a target copy of 0.5 target copies/reaction, Turner et al. (2014) set 30 copies/reaction as their 95% limit of detection (LOD), while the LOD for Eichmiller et al.

(2014) was 50 copies/ reaction. This study attempts to overcome this inconsistency in the literature, and is the first one to use ROCs as a way of examining the implications of setting various thresholds on error rates.

Low error rates are important for implementing new technology into a monitoring program. The false negative error rates calculated in this study using the ROC framework were approximately 40%, higher than that calculated by Laramie et al. (2015), which had a rate of approximately 8.2%. Laramie et al. (2015) determined false negative rates by calculating (replicate number where no Chinook eDNA was detected)/ (number of replicate sites where eDNA was confirmed to be present). The limitation of this approach is that it does not account for low-level detections that fall really close to the threshold set for all replicates, as this was only done for samples where there was at least one positive.

A second study derived a false negative eDNA rate of 8.7% and obtained this estimate by determining the eDNA samples that tested negative for the target species out of total samples collected (collection sites represent areas where newts are present in large amounts) (Biggs et al. 2015). A limitation of using this approach is that it is only calculating false negative error rates associated with field sampling and the potential to not collect eDNA in the water bottle, given that a species is present at the site. The estimation produced from this study differs from these two, because a more holistic

158 approach was taken by calculating the environmental samples that had a qPCR output of greater than 0 copies/reaction, but below 3 copies/reaction.

This study indicated that electrofishing was less sensitive for Redside Dace presence than eDNA at a lower threshold (Th=1), while electrofishing was more sensitive than eDNA monitoring when a larger threshold is applied (Th=10). At a higher set eDNA threshold, one can be more confident that the positive sites identified via eDNA methodology are “true detections,” and this was demonstrated by the high agreement of sites that tested positive using both methods (Figure A5-1). At the other end of the spectrum, by setting a threshold of 1 copy/reaction, eDNA was able to detect at all sites that tested positive using electrofishing, while detecting additional sites that traditional methods could not (Figure A5-3). The important consideration here is whether these additional sites actually contain Redside Dace, or if there is the potential to increase false positive error rates as a result of setting a lower threshold (Type I error) (Burns and

Valdivia 2007). The limitation of setting a higher threshold is that we can be less certain that we have identified the sites containing Redside Dace using eDNA (ie/ increasing our false negative rate; Type II error) (Burns and Valdivia 2007), however this comes with the advantage of being able to invest time and resources into populations that are likely to contain the target species of interest. Additionally, extra sampling effort could be invested into sites with eDNA detections that fall below the copy threshold.

159

References

Biggs J, Ewald N, Valentini A, Gaboriaud C, et al. (2015) Using eDNA to develop a national citizen science-based monitoring programme for the great crested newt (Triturus cristatus). Biological Conservation, 183, 19–28.

Burns M, Valdivia H (2007) Modelling the limit of detection in real-time quantitative PCR. European Food Research and Technology, 226, 1513-1524.

Bustin SA, Benes V, Garson JA, Hellemans J, et al. (2009) The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clinical Chemistry, 55, 611-622.

Darling JA, Mahon AR (2011) From molecules to management: adopting DNA-based methods for monitoring biological invasions in aquatic environments. Environmental Research, 111, 978–988.

Eichmiller JJ, Bajer PG, Sorensen PW (2014) The relationship between the distribution of common carp and their environmental DNA in a small lake. PLoS ONE, 9,e112611.

Fan J, Upadhye S, Worster A (2006) Understanding receiver operator characteristics (ROC) curves. Canadian Journal of Emergency Medicine, 8, 19-20.

Ficetola GF, Miaud C, Pompanon F, Taberlet P (2008) Species detection using environmental DNA from water samples. Biology Letters, 4, 423–425.

Ficetola GF, Pansu J, Bonin A, Coissac E, et al. (2014) Replication levels, false presences, and the estimation of presence / absence from eDNA metabarcoding data. Molecular Ecology Resources, 15, 543–556.

Kumar R, Indrayan A (2011) Receiver Operator Characteristic (ROC) curve for medical researchers. Indian Pediatrics, 48, 277-287.

Klymus KE, Richter CA, Chapman DC, Paukert C (2014) Quantification of eDNA shedding rates from invasive bighead carp Hypophthalmichthys nobilis and silver carp - Hypophthalmichthys molitrix. Biological Conservation, 183, 77-84.

Laramie MB, Pilliod DS, Goldberg CS (2015) Characterizing the distribution of an endangered salmonid using environmental DNA analysis. Biological Conservation, 183, 29–37.

Lindenmayer D, Piggott M, Wintle B (2013) Counting the books while the library burns: why conservation monitoring programs need a plan for action. Frontiers in Ecology and the Environment, 11, 549-555.

160

Metz CE (1978) Basic principles of ROC analysis. Seminars in Nuclear Medicine, 4, 283- 298.

Nutz S, Döll K, Karlovsky P (2011) Determination of the LOQ in real-time PCR by receiver operating characteristic curve analysis: application to qPCR assays for Fusarium verticillioides and F. proliferatum. Analytical and Bioanalytical Chemistry, 401, 717-726.

Schmidt BR, Kery M, Ursenbacher S, Hyman OJ, Collins JP (2013) Site occupancy models in the analysis of environmental DNA presence/absence surveys: a case study of an emerging amphibian pathogen. Methods in Ecology and Evolution, 4, 646–653.

Spear SF, Groves J.D, Williams LA, Waits LP (2015) Using environmental DNA methods to improve detectability in a hellbender (Cryptobranchus alleganiensis) monitoring program. Biological Conservation, 183, 38–45.

Thomsen P, Willerslev E (2014) Environmental DNA – An emerging tool in conservation for monitoring past and present biodiversity. Biological Conservation, 183, 4-18.

Turner CR, Uy KL, Everhart RC (2014) Fish environmental DNA is more concentrated in aquatic sediments than surface water. Biological Conservation, 183, 93-102.

Wilcox TM, McKelvey KS, Jane SF, Lowe WH, Whiteley AR, Schwartz MK (2013) Robust detection of rare species using environmental DNA: the importance of primer specificity. PLoS ONE, 8, e59520.

Table A5-1: Comparison of fast and environmental TaqMan® mastermixes (Applied Biosystems Inc.) for True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN), Sensitivity (Sn) and Specificity rates (Sp) at thresholds of 1, 3, 5, and 10 target eDNA copies/reaction.

Threshold True True False False Sensitivity Specificity (copies/reaction) Negative Positive Positive Negative 1 28.5%(254) 42.8%(381) 28.2%(251) 0.5%(4) 60.28% 98.45% 3 28.9%(257) 29.2%(260) 41.8%(372) 0.1%(1) 41.40% 99.61% 5 29%(258) 22.3%(199) 48.7%(433) 0%(0) 31.90% 100% 10 29%(258) 13.2%(118) 57.8%(514) 0%(0) 18.67% 100%

161

162

Figure A5-1: Schematic representation of using Receiver Operator Characteristic (ROC) to discriminate between true and false positive and negative PCR results, as well as to evaluate test sensitivity and specificity. TN= True Negative; FN= False Negative; FP= False Positive; TP= True Positive.

163

Figure A5-2: Receiver Operator Characteristic (ROC) curve of specificity (x-axis) plotted against sensitivity (y-axis) with (left) an AUC = 0.89 and TaqMan Fast qPCR mastermix (Applied Biosystems, Inc); total of 632 eDNA samples with a qPCR output of > 0 copies/reaction, and 258 negative control samples were used to generate the curve. (right) an AUC = 0.84 with Environmental Mastermix 2.0; total of 388 eDNA samples, and n=170 negative controls). A curve that falls below the diagonal line would indicate an inaccurate assay.

164

Figure A5-3: Venn Diagrams indicating the number of sites (n=29) with Redside Dace detections in the Fall 2012 using electrofishing only (blue), eDNA only (green), and electrofishing and eDNA combined (overlap of green and blue). Redside Dace were detected at 14 of the 29 locations using single-pass electrofishing (Reid et al. 2008); eDNA detections are shown based on qPCR thresholds of 1, 3,5, and 10 DNA copies/reaction.

165

Appendix 2.6: Comparison of electrofishing and eDNA detections during Fall and Spring sampling season (total of 29 sites).

Fall Spring

Total eDNA detections 18 16

eDNA positive detections unique to season under study 5 3

Positive electrofishing sites 14 n/a

Positive electrofishing sites with no positive eDNA 3 n/a detections

Positive eDNA sites with no positive electrofishing 7 n/a

Appendix 2.7: Estimates of Redside Dace detection probability and occupancy, AICc, ΔAICc, AIC weights, number of parameters, and -2log values from models for spring and fall field seasons (horizontal headings), at temporal sampling (R) of 3, 4, and 5 replicates.

Model Occupancy ψ Detection AICc ΔAIC AIC # -2log Model No. of replicates (+ SE) probability p c weights Param likelihood average required for (+ SE) eters (+ SE) reliable detection Fall; Replicates=3, (n = 29, χ2 = 4.39, p = 0.59, ĉ = 0.75) ψ(.)p(.) 0.45 (0.094) 0.76 ( 0.07) 86.14 0 0.4 2 81.68

ψ(.)p(temp) 0.49 (0.10) 0.69 ( 0.093) 86.36 0.22 0.36 3 79.4 0.73 3

ψ (.)p(flow) 0.45 ( 0.094) 0.77 ( 0.11) 88.49 2.35 0.12 3 81.53 (0.096) ψ(.)p(temp*flow) 0.49 (0.10) 0.71 ( 0.11) 88.52 2.38 0.12 4 78.85

Replicates=4, (n = 29, χ2 = 12.69, p = 0.51, ĉ = 0.91) ψ(.)p(temp) 0.48 ( 0.10) 0.70 (0.082) 99.17 0 0.52 3 92.21

ψ(.)p(.) 0.44( 0.092) 0.77 (0.060) 100.46 1.29 0.27 2 96 0.72 3 ψ(.)p(temp+flow) 0.48 ( 0.1014) 0.70 (0.10) 101.86 2.69 0.13 4 92.19 (0.079) ψ(.)p(flow) 0.45 ( 0.092) 0.76 ( 0.092) 102.73 3.56 0.09 3 95.77 Replicates=5, (n = 29, χ2 = 41.51, p = 0.08, ĉ =1.39)* ψ(.)p(.) 0.52 ( 0.093) 0.65 ( 0.056) 105.39 0 0.44 3 136.82 0.64 3 ψ(.)p(flow) 0.53 ( 0.096 ) 0.62 ( 0.085) 105.39 0 0.29 4 134.26 (0.075) ψ(.)p(temp) 0.52 (0.093) 0.66( 0.077) 107.80 2.14 0.13 4 136.4 ψ(.)p(temp+flow) 0.53 ( 0.096) 0.63 ( 0.10) 109.03 3.64 0.07 5 134.02

*QAICc used for model selection due to over dispersion.

166

Spring

Replicates=3, (n = 29, χ2 = 1.96, p = 0.96, ĉ =0.29) ψ(.)p(temp) 0.5 (0.0929) 0.81 (0.087) 74.22 0 0.46 3 67.26 ψ(.)p(temp*flow) 0.56 ( 0.099) 0.84 (0.076) 75.17 0.95 0.28 4 65.5

ψ(.)p(.) 0.52 (0.093) 0.89 (0.050) 75.98 1.76 0.19 2 71.52 0.84 (0.083) 2

ψ (.)p(flow) 0.52 (0.093) 0.90 (0.06) 78.04 3.82 0.07 3 71.08

Replicates=4, (n = 29, χ2 = 5.40, p = 0.99, ĉ =0.37) ψ(.)p(temp+flow) 0.58 (0.099) 0.78 (0.088) 90 0 0.53 4 80.33 ψ(.)p(temp) 0.59 ( 0.09) 0.79 (0.069) 90.43 0.43 0.43 3 83.47 0.79 (0.081) 2 ψ(.)p(.) 0.55 ( 0.092) 0.86 (0.044) 96.32 6.32 0.02 2 91.86

ψ(.)p(flow) 0.55 (0.093) 0.85 (0.059) 96.55 6.55 0.02 3 89.59

Replicates=5, (n = 29, χ2 = 21.71, p = 0.84, ĉ =0.73) ψ(.)p(temp+flow) 0.61(0.097) 0.73 (0.081) 121.87 0 0.82 4 112.2

ψ(.)p(flow) 0.58 (0.092) 0.79 (0.058) 126.39 4.52 0.09 3 119.43 0.74 (0.081) 3 ψ(.)p(temp) 0.59 (0.093) 0.76 (0.077) 126.88 5.01 0.07 3 119.92

ψ(.)p(.) 0.58 (0.092) 0.80 (0.044) 128.85 6.98 0.03 2 124.39

167

168

Appendix 2.8: Comparison of costs for eDNA versus electrofishing

Past studies have indicated that eDNA water collection yields higher detection and entails lower costs than conventional sampling (Fukumoto et al. 2015, Goldberg et al.

2013), which is an important consideration when implementing programs. I estimated the cost of using either method to sample Redside Dace sites. The cost comparison of eDNA versus electrofishing is dependent on how (i) many eDNA water samples are being taken at a site, (ii) water sample turbidity (determines filtration time), and (iii) qPCR replicates being run. Based on my electrofishing fall data, 0.5 hours was spent monitoring each site with two field technicians, excluding the time required to process and identify fishes.

With a cost of approximately $20/hour for field technician support, this would come up to a total of $20 for one hour. A conservative estimate of processing one eDNA sample in the lab run in triplicate is approximately $35 (excluding lab-tech hours cost). For lab-tech hours, to process 90 samples, it would take approximately 7 hours to filter samples

(assuming that filtration per samples takes no longer than 10-15 minutes), 24 hours to do

DNA extractions (30 extractions in 8 hours), one hour to run a gel with the 96 samples, and 1.5 hours to run each sample in triplicate. Therefore, on a per-sample basis, this would take an average of 0.5 of an hour to process one sample, or a total of 1.5 hours to process three samples (assuming triplicates water samples were taken at each site). This cost does account for the costs of running Redside Dace standard controls, pre and post filter controls, field blanks and extraction negatives. If temporal replicates were taken at each site, with a 15 minute interval, the time spent electrofishing and collecting the eDNA samples would be equal, and eDNA water collection would incur more of an expense.

However, if we were to take spatial replicates at each site, the field tech time spent to

169 collect water samples at each site, and the time spent to electrofish, would equal the time spent to run two samples in the lab (this does not take into account the $35/sample).

Additionally, other important considerations to think about that could influence initial start-up costs for eDNA monitoring are: (i) primer optimization, (ii) investment in lab equipment for eDNA techniques, and (iii) turbidity of samples influences filtration time.

References

Fukumoto S, Ushimaru A, Minamoto T (2015) A basin-scale application of environmental DNA assessment for rare endemic species and closely related exotic species in rivers: a case study of giant salamanders in Japan. Journal of Applied Ecology, 52, 358–365.

Goldberg CS, Sepulveda A, Ray A, Baumgardt J, Waits P (2013) Environmental DNA as a new method for early detection of New Zealand mudsnails (Potamopyrgus antipodarum). Freshwater Science, 32, 792–800.

Appendix 3.1: Primer sequences used for microsatellite DNA analysis, along with their GenBank accession numbers, repeat motifs, size range (bp) and annealing temperatures.

Locus GenBank Primer Sequence (5’-3’) Repeat Motif Range (bp) Annealing Reference Accession Temp (ºC)

RSD42A GQ150754 F: AACTGCAGACAGGGATCTGG (TC)14(AC)7 171-205 54 Pitcher et al. 2009

R: TATCTGTGCCTGCTGGTGAG

RSD2- GQ150756 F: TGAAATCAAAATGGTCAGTCCTT (CA)13(TA)6 195-223 57 Pitcher et al. 2009 58 R: TGCGCTAAACGTCATCAGAG

RSD70 GQ150757 F: TGCAGTGGTTTGCAATCTAAG (GT)14 239-255 54 Pitcher et al. 2009

R: CCGACGACCCCTTTAAGAAT

RSD86 GQ150758 F: CACAAAAACGGGATGAATTG (TG)20 209-227 54 Pitcher et al. 2009

R: GCGAACTGCAGCACTTACAG

RSD2- GQ150759 F: ACAGCCACTATACCTGAAATCAA (TCTA)21 181-273 54 Pitcher et al. 2009 91 R: CGCAAATAAAGGTGACTTGAC

RSD142 GQ150760 F: CACCCTGCTGTTTCTGTTCA (TATC)20 191-311 54 Pitcher et al. 2009

R: ATTGCTTTCCCTGTGAATCG

RSD179 GQ150761 F: GCTAGTCAAACTGGTCTCTTTCC (AT)2GTCT(G 197-219 54 Pitcher et al. 2009

T)16 R: GGCTGCCAGCAAATATTAGAA

170

CA11 AF277582 F: TCCCTCACTGTGCCCTACA (TAGA)7 251-355 57 Dimsoski et al. 2000 R: GGCGTAGCAATCATTATACCT

CA12 AF277584 F: GTGAAGCATGGCATAGCACA 169-301 57 Dimsoski et al. (TAGA) (CA 10 2000 R: CAGGAAAGTGCCAGCATACAC GA)4(TAGA)2

Ppro118 AY254352 F: CCGGATGCACTGGTGGAGAAAA (CTCA)2(CA)2 199-291 57 Bessert et a. 2003

R:CCAGCAATCATAGCAGGCAGGAA C

171

172

Appendix 3.2: Proportion of polymorphic loci across ten microsatellite primers for 28 Redside Dace populations.

Population %P LCR 90 NFR 60 UNN 90 LRR 80 MIL 90 HAN 90 TTR 80 EFI 90 RED 90 BRU 100 GUL 80 EBC 100 SAU 70 STR 80 SMC 100 FOU 100 KET 90 HUM 90 WOO 100 DON 90 ROU 90 MIT 90 CAR 90 EBM 100 BHR 100 DOD 90 OST 70 COB 90

173

Appendix 3.3: List of 20 populations that deviate from Hardy-Weinberg equilibrium expectations.

Population: Locus: EFI RSD142 LCR RSD179 HAN RSD86 MIL Ppro118 KET Ca12 KET Ppro118 KET RSD2-91 KET RSD42A TTR RSD70 MIT Ca12 MIT RSD179 GUL Call GUL Pro118 GUL RSD86 GUL RSD2-58 UNN Ppro118 WOO RSD70 EBM RSD42A STR RSD2-58 LRR RSD2-91

174

Appendix 3.4: Total evidence haplotype numbers with corresponding ATPase 6 and 8 and cytochrome b haplotypes.

Total Evidence ATPase 6 and 8 Cytochrome b haplotype Haplotype1 Number Haplotype23 Number number1 2 3 1 3 3 35 4 15 33 5 15 19 6 15 34 7 9 13 8 3 30 9 3 31 10 21 32 11 3 1 12 1 28 13 1 4 14 3 25 15 1 2 16 19 26 17 1 27 18 1 5 19 1 22 20 18 1 21 3 24 22 1 23 23 12 17 24 1 9 25 17 19 26 15 20 27 15 21 28 16 19 29 14 19 30 12 18 31 13 17 32 11 16 33 10 14 34 10 15 35 1 14 36 1 15 37 8 9 38 7 12 39 6 11 40 1 10

175

41 5 2 42 1 8 43 4 2 44 1 7 45 2 2 46 3 6 47 1 3