Metabarcoding approach to identifying early life stages of Great Lakes fishes

by

Kavishka Gallage

A thesis submitted in conformity with the requirements for the degree of Master of Science Ecology and Evolutionary Biology University of Toronto

© Copyright by Kavishka Gallage 2020

Metabarcoding approach to identifying early life stages of Great Lakes fishes

Kavishka Gallage

Master of Science

Ecology and Evolutionary Biology University of Toronto

2020

Abstract

Accurately identifying fishes in their early life stages using morphology is challenging and time-consuming, and requires taxonomic expertise. Metabarcoding is a method that can be used to identify species in batch samples (Cruaud et al. 2017). Detection of early life stages of fishes is important for understanding life history patterns and critical spawning habitat. In this study, metabarcoding is used to identify 1119 egg and larval batch samples from the Sydenham River and Rondeau Bay. I identified 34 species from the Sydenham River and 8 species from Rondeau Bay, along with the spawning months of these species based on date of capture. I determined the materials and supplies cost of metabarcoding in this study to be $6597.33, compared to $62289.09 for individual-based barcoding. This study shows the potential of metabarcoding as a broad-scale detection and identification method for early life stages of Great Lakes fishes.


Acknowledgements

Many people contributed to the completion of my Master’s thesis through academic and emotional support. First, I would like to thank my supervisors, Dr. Nicholas Mandrak and Dr. Nathan Lovejoy, for their continuous support throughout my thesis. I am grateful for the opportunity to work with you and learn from you, and to work on a project in which I was able to intertwine my passion for molecular biology and conservation. Thank you for the guidance and encouragement through the countless setbacks during my thesis. I would also like to thank my supervisory committee, Dr. Jason Weir and Dr. Roberta Fulthorpe, for their helpful comments and questions during my committee meeting, which contributed to the advancement and success of this thesis.

Thank you to the staff at Fisheries and Oceans Canada for the time spent completing field work and for providing me with my sample set. Thank you also for the continuous support throughout the project.

I would like to thank Alex Van Nynatten for all of your help during the troubleshooting of this project; it was extremely useful to be able to openly communicate my ideas. I thank Nathan Lujan for teaching me advanced molecular lab techniques and designing this project. I thank the other members of the Lovejoy lab for their continuous support and enthusiasm during my lab work. I thank the Mandrak lab for their continuous support during my conference presentations and the completion of my thesis. You all have been great friends and colleagues to learn from throughout my thesis.

Finally, I would like to thank Connor Grey, my friends, and my family for all of your emotional support throughout my academic career. I thank my parents for inspiring me to become the person I am today and inspiring me to work in fish conservation through countless days spent snorkeling around coral reefs. I would like to thank Connor for your continuous emotional support throughout my thesis by keeping me focused on my goals and helping me to be the best version of myself.


Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
    Chapter 2
    Appendix A
    Appendix C
List of Figures
    Chapter 2
    Appendix A
    Appendix B
List of Appendices
Chapter 1 Introduction to metabarcoding and the Great Lakes
    Barcoding for Detection
    DNA barcoding
    DNA metabarcoding
    Abundance in larvae and early detection
    Limitations of metabarcoding
    Applications of barcoding and metabarcoding
    Barcoding to detect diversity
    Barcoding to detect threats
    Combining barcoding with High Throughput Sequencing
    Possible future applications
    Introduction to Laurentian Great Lakes
    History of changes to the Great Lakes
    Threats to Great Lakes
    Species at risk
    Aquatic invasive species
    Sydenham River watershed
    Rondeau Bay
    Purpose of the study
    Significance of this Study
    References
Chapter 2 Metabarcoding approach to identifying Great Lake fishes in their early life stages
    Introduction
    Methods
    Sample collection
    DNA extraction
    Reference Library and Primer design
    Positive and negative controls
    Library preparation
    Bioinformatics
    Cost-effectiveness analysis
    Results
    Species detection
    Spatial patterns
    Temporal patterns
    Cost-effectiveness analysis
    Discussion
    Species detection
    Spatial patterns
    Spawning timing
    Detecting species at risk and invasive species
    Conservation relevance
    Cost effectiveness
    Conclusion
    References
General Conclusions
    Conclusion of the study
    Experimental design
    Significance of the results
    References
Appendices
    Appendix A
    Appendix B
    Appendix C


List of Tables

Chapter 2

Table 2-1. Sampling information for egg and larval batch samples from three sampling locations in the Sydenham River, southwestern Ontario, and Rondeau Bay, Lake Erie. The table lists sampling dates, number of sampling days, and number of samples.

Table 2-2. Universal PCR primer pair used to amplify a wide variety of taxa from the Great Lakes watershed and potential invasive species.

Table 2-3. Detection of species using metabarcoding (MB) (eggs and larvae; this study) and conventional sampling (Cvn) (typically adults; Eakins et al. 2020) for species in the Sydenham River and Rondeau Bay, Lake Erie. Reported spawning months for each species detected based on Eakins et al. (2020) are shown, as are the months that eggs and larvae were detected using metabarcoding. * Species detected with fewer than 10 reads. Species listed under SARA are denoted by designation: Special Concern (SC), Threatened (TH), Endangered (EN).

Table 2-4. Comparison of spawning months for species detected in the Sydenham River (Alvinston, Florence, and Oil Springs).

Table 2-5. Comparison of materials and supplies costs for barcoding and metabarcoding. Total barcoding cost is based on the price to barcode one specimen multiplied by the number of specimens in this study. The metabarcoding cost per sample was calculated based on the price to barcode the total number of specimens in the study divided by the number of individuals. Labour and infrastructure costs are not included. Costs for chemicals and materials commonly found in the lab and used in very low quantity were omitted.

Appendix A

Table A-S1. PCR1 primer list. Modified primers containing PCR primer, heterogeneity spacers, and linking primer region.

Table A-S2. PCR2 primer list. PCR2 primers consist of half of the sequencing primer, index (in red), and Illumina adaptor. Different combinations of i7 and i5 index primers were used to assign IDs to PCR1 product.

Table A-S3. Mock communities (positive controls) consisting of species found in the Sydenham River (SYD) and Rondeau Bay (RB). S7 contains all seven species used in the positive controls and S3 contains three species that are in both mock communities. The Success column indicates the ability of the metabarcoding pipeline to detect the species in each mock community in three replicates.

Table A-S4. The primer sets used during PCR2 to ID samples. Indexes in PCR2 primers were used to demultiplex read output from sequencing runs.

Table A-S5. Commands used in the mothur software to filter for PCR/sequencing errors and to classify detections.

Table A-S6. Read count for each species detected in negative controls during sequencing runs one and two. The read counts were used to correct for contamination in the egg and larval batch samples.

Table A-S7. Summary of read count for species included in positive controls. Three read counts represent the count for each replicate (total = 3) for each mock community. The positive detection threshold was determined based on the lowest read count found in positive controls (count = 2).

Table A-S8. Species list for reference library and reference alignment with their respective GenBank number.

Appendix C

Table C-S1. Glossary of terms used in this thesis.


List of Figures

Chapter 2

Figure 2-1. Localities sampled for fish early life-history stages in southwestern Ontario. The Alvinston, Oil Springs, and Florence sites are on the Sydenham River, which drains to Lake St. Clair and is thought to provide nursery habitat for Great Lakes fishes. Samples from Rondeau Bay were collected from the northern shore, which is highly vegetated and provides nursery and spawning habitat for fishes.

Figure 2-2. Total number of detections of each species across all batch samples, with colours indicating localities. There were 310 detections from egg batch samples and 1114 detections from larval batch samples.

Appendix A

Figure A-S1. A neighbour-joining tree used to estimate the level of divergence for the barcode region between all Great Lakes fishes and potential invasive species. This was used to determine the potential of my barcode to differentiate between closely related species.

Figure A-S2. Cytochrome c oxidase subunit 1 (COI) sequence map, base position 5487 to 7037 in the full mtDNA sequence, displaying the base position of forward primer, reverse primer, and the barcode region. This sequence map is based on the full mtDNA sequence of Walleye (Sander vitreus) (GenBank #: NC_028285).

Appendix B

Figure B-S1. Species read numbers for each detection based on A) egg and B) larval batch samples.

Figure B-S2. The relationship between mean read count and the number of species detected in mixed batch samples after filtering for contamination and PCR and sequencing error.


List of Appendices

Appendix A Supplementary information for Chapter 2 methods section

Appendix B Supplementary information for Chapter 2 results section

Appendix C Glossary of terms


Chapter 1 Introduction to metabarcoding and the Great Lakes

In recent years, DNA barcoding has been at the forefront of species identification and biomonitoring (Hebert et al. 2003; Thomsen and Willerslev 2015), as barcoding effectively removes the limitations associated with morphological identification of phenotypically similar species (Hebert et al. 2003). Advancements in high-throughput sequencing have allowed for barcoding of large bulk samples, i.e. metabarcoding (Taberlet et al. 2012; Cruaud et al. 2017). Identifying critical spawning and nursery habitat of species is important in conservation (Rosenfeld and Hatfield 2006), and this is a common knowledge gap in recovery strategies for species listed under the Species at Risk Act (SARA), such as Eastern Sand Darter (Ammocrypta pellucida) (COSEWIC 2009) and Lake Chubsucker (Erimyzon sucetta) (Fisheries and Oceans Canada 2017). Ichthyoplankton occurs in higher abundance than adults of a given species, making it an effective tool in biomonitoring (Lasker 1987). Conventional identification in ichthyoplankton surveys requires a high level of taxonomic expertise given the morphological similarity of closely related species (Ahlstrom and Moser 1976; Loh et al. 2014). Integrating metabarcoding into ichthyoplankton surveys will increase the accuracy (Ko et al. 2013) and cost-effectiveness (Hulley et al. 2018) of surveying ichthyoplankton assemblages. In this study, I aim to evaluate the use of metabarcoding methods to detect and identify the early life stages of fishes.

My sampling sites include the Sydenham River and Rondeau Bay. The Sydenham River, a tributary of Lake St. Clair, is highly diverse, with more than 80 fish species (Dextrase et al. 2003), and the Rondeau Bay wetlands, a coastal embayment on the northern shore of Lake Erie, provide nursery habitat for many fishes in Lake Erie and Rondeau Bay (Edwards et al. 2006). The goals of this study are: 1) to develop and use a metabarcoding protocol to detect species diversity in batch samples of egg and larval fishes; 2) to use the metabarcoding protocol to evaluate fish biodiversity in the Sydenham River drainage of Lake St. Clair and Rondeau Bay, Lake Erie; 3) to assess the temporal patterns of egg and larval occurrence; and 4) to compare the costs of metabarcoding with those of single-specimen barcoding.

1.1 Barcoding for detection

DNA barcoding

DNA barcoding uses interspecific variation in conserved genetic markers to identify species. It effectively removes limitations associated with the identification of species with high phenotypic plasticity, cryptic morphologies, ambiguity in developmental stages, and those that require a high level of taxonomic expertise (Hebert et al. 2003). One of the most commonly used genetic markers for barcoding is the mitochondrial cytochrome c oxidase subunit I (COI) gene, owing to the availability of primers that bind and amplify COI from a wide array of species. Regions of the COI gene evolve rapidly, providing finer phylogenetic resolution than other mitochondrial genes, and the gene has an extensive curated reference database for classification (Hebert et al. 2003; Elbrecht and Leese 2017).

DNA barcoding is possible due to advancements in molecular ecology through the combination of polymerase chain reaction (PCR)-based DNA amplification (Mullis 1990), universal primers (Weisburg et al. 1991), and species identification through sequencing (Hebert et al. 2003). DNA barcoding can be performed on bulk samples containing entire organisms or on DNA recovered from environmental samples (e.g. water, feces), referred to as environmental DNA (eDNA) (Taberlet et al. 2018). A disadvantage of eDNA is that DNA derived from environmental samples may be highly degraded and will, therefore, usually require a shorter barcode to amplify enough fragments to be barcoded (Thomsen et al. 2012; Taberlet et al. 2018). Hajibabaei et al. (2006) showed that as few as 100 bp of DNA sequence is enough to identify and differentiate most species. Using entire organisms yields DNA of sufficient quality that the barcode fragment can be as short as the barcodes used in eDNA or as long as 500-800 bp (the standardized barcodes in most reference libraries; Hebert et al. 2003; Taberlet et al. 2012). A longer barcode will aid in accurately distinguishing closely related species not distinguishable by a mini-barcode (Meusnier et al. 2008). Use of whole organisms will provide higher-quality DNA through tissue extraction; however, this can be costly and time-consuming. To create a target barcode region, there must be a highly variable region among the DNA sequences of the taxa, flanked by two highly conserved regions of 15-20 bp used to attach species-specific or degenerate primers. Degenerate primers are a mixture of primers that cover all possible nucleotide combinations at the variable positions of a given primer binding site. Using degenerate primers enables a wide variety of taxa to be amplified at once (Weisburg et al. 1991). The primers allow amplification of the barcode region of each sample through PCR. The amplicons can then be sequenced by Sanger sequencing with capillary electrophoresis (Sanger et al. 1977).
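
To make the idea of a degenerate primer concrete, the following Python sketch expands a primer written with IUPAC ambiguity codes into every concrete sequence it represents. The primer string is a made-up illustration, not one of the primers used in this thesis, and the function name expand_degenerate is hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical 8-base primer with two degenerate positions (R and Y).
variants = expand_degenerate("ACRTGYCA")
print(len(variants))  # 4 (two choices at R times two choices at Y)
print(variants)       # ['ACATGCCA', 'ACATGTCA', 'ACGTGCCA', 'ACGTGTCA']
```

The number of concrete primers grows multiplicatively with each degenerate position, so a more degenerate primer covers more template variants while each individual variant makes up a smaller share of the primer pool.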

Sequencing reads generated from the PCR amplicons are compared to the sequences of voucher specimens from a barcode library to confirm the identity of the individual. In an attempt to standardize a barcode library, Hebert et al. (2003) created the Barcode of Life. This resource provides scientists with a standard taxonomic library that can be used to compare unknown specimens to standardized barcodes of identified specimens (Hebert et al. 2003; Taberlet et al. 2012). After the introduction of the Barcode of Life, the number of articles on barcoding increased exponentially alongside advancements in molecular techniques and equipment, such as the use of various DNA sampling techniques (e.g. expansion of eDNA sampling; Thomsen and Willerslev 2015), more variety in DNA concentration, use of different DNA extraction methods, primer optimization, and PCR optimization (Bohmann et al. 2014). Barcoding was initially introduced as a method to identify species on an individual basis and to increase discovery of new species (Hebert et al. 2003). Use of barcoding for species detection and biomonitoring is becoming more prominent as scientists and policy makers alike appreciate its efficiency (Bohmann et al. 2014; Littlefair and Clare 2016). Barcoding is now used as a method for early-detection surveillance of invasive species and to assess the biodiversity of aquatic systems through a combination of barcoding with high-throughput sequencing (HTS) methods, called metabarcoding (Balasingham et al. 2018).
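
As a minimal illustration of comparing an unknown sequence against a barcode library, the Python sketch below scores a query against each reference by percent identity and reports the closest match. It assumes the sequences are already aligned and of equal length; the sequences, species labels, and the 90% cutoff are invented for illustration and are not taken from the Barcode of Life or from this study. Real workflows rely on alignment-based search tools rather than position-by-position comparison.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def best_match(query, library, min_identity=90.0):
    """Return (name, identity) for the closest reference.

    The name is None when the best identity falls below min_identity,
    an arbitrary illustrative cutoff.
    """
    name, ref = max(library.items(), key=lambda item: percent_identity(query, item[1]))
    identity = percent_identity(query, ref)
    return (name if identity >= min_identity else None, identity)

# Toy, made-up 20 bp "barcodes"; real COI barcodes are hundreds of bp long.
library = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACCTACGAACGT",
}
query = "ACGTACGTACGTACGTACGA"  # differs from species_A at one position

print(best_match(query, library))  # ('species_A', 95.0)
```

Raising the cutoff makes the assignment more conservative: with min_identity=99.0 the same query is left unassigned, which mirrors how a strict threshold trades missed identifications for fewer false ones.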

DNA metabarcoding

Metabarcoding combines DNA barcoding-based identification with high-throughput sequencing methods to provide an effective and efficient way to identify, in parallel, bulk samples of many individuals that likely contain more than one species (Taberlet et al. 2012). Metabarcoding uses degenerate primers to mass amplify a mixture of DNA extracted from tissue (Ratnasingham and Hebert 2007) or eDNA. The use of highly versatile degenerate primers allows the barcode region to be amplified across a wide variety of taxa (Cruaud et al. 2017). The amplicons from PCR are standardized to low quantities per sample to reduce read bias among species. The DNA is then pooled and can be subsampled for sequencing (Creer et al. 2016; Cruaud et al. 2017). The pooled sample is sequenced using HTS platforms, such as Illumina MiSeq and HiSeq, which allow sequencing of hundreds of samples simultaneously (Taberlet et al. 2012). Current HTS platforms are capable of producing millions of sequencing reads from a pooled sample in a single run, whereas Sanger sequencing is limited to DNA from a single specimen at a time (Illumina™ 2019). This results in high read output per organism in the pooled sample, which allows mass identification of multiple specimens simultaneously (Littlefair and Clare 2016; Cruaud et al. 2017). Sequencing reads are assembled into clusters of similar sequences called Operational Taxonomic Units (OTUs). Each cluster has a representative sequence that is then compared to an existing reference library of the genetic marker for mass identification (Cristescu 2014).
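
The clustering step can be sketched as a simple greedy procedure: each read joins the first existing cluster whose representative it matches at or above a similarity threshold, and otherwise founds a new cluster. This is a simplified illustration in Python, not the clustering algorithm of any particular pipeline; the 97% default threshold, the toy reads, and the function names are assumptions.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length reads."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(reads, threshold=0.97):
    """Group reads into OTUs, using the first read of each OTU as its representative."""
    otus = []  # list of (representative, members) pairs
    for read in reads:
        for representative, members in otus:
            same_length = len(read) == len(representative)
            if same_length and identity(read, representative) >= threshold:
                members.append(read)
                break
        else:
            # No existing representative was similar enough: found a new OTU.
            otus.append((read, [read]))
    return otus

# Toy 10 bp reads (made up); a looser threshold is used because the reads are so short.
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "AAAAAAAAAA"]
otus = greedy_otus(reads, threshold=0.9)
for representative, members in otus:
    print(representative, len(members))
# AAAAAAAAAA 3
# CCCCCCCCCC 1
```

Because the first read seen becomes the representative, the result of a greedy scheme like this depends on read order; pipelines that cluster this way commonly sort reads by abundance first.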

In the past decade, metabarcoding has been widely used to answer fundamental questions in ecology and conservation biology. As the anthropogenic impact on the planet increases, understanding the changes to natural ecosystems becomes vital for conservation. The use of metabarcoding for monitoring, and as a method to improve the detection of rare species, has been proposed to meet this urgent need (e.g. Robson et al. 2016). This method is highly adaptable to a variety of taxonomic groups and capable of identifying species to a high taxonomic resolution (Cruaud et al. 2017). Metabarcoding has the potential to establish a DNA-based method for a global network of biodiversity surveillance and monitoring (Taberlet et al. 2012; Cristescu 2014) and has been used to detect the presence of rare, at-risk, and invasive species (Taberlet et al. 2012). Furthermore, it significantly reduces the cost and time required to identify bulk samples of individuals. Metabarcoding provides an opportunity for taxonomists and geneticists alike to more effectively and efficiently identify and study a wide range of species at all life stages.

Abundance in larvae and early detection

Understanding the early life stages of fishes and the factors affecting their survival will aid in determining their recruitment patterns. The increase in anthropogenic activity that influences biotic and abiotic factors of aquatic systems has led to declines in survival of larvae and, thereby, the subsequent adult population (Kennish 1992). By understanding the general trends and patterns of fish larvae and eggs in an aquatic ecosystem (e.g. distribution of the larvae and eggs, species richness, mortality rate), we can determine the influences that aid in succession of the adult population (Lasker 1987). Accurately identifying fishes in their early life stages is challenging and requires high expertise. Generally, early life stages of fishes, which include eggs, fry, larvae, and juveniles, are captured through sampling methods that are highly effective for plankton due to the small size of fishes during their early life stages. Early life stages of fishes are commonly referred to as ichthyoplankton because of their similarity in behaviour to planktonic species (Ahlstrom and Moser 1976). Ichthyoplankton surveys are conducted for three main reasons: 1) to survey the distribution and abundance of a single species based on their larvae and eggs to estimate the biomass of the adult spawning population; 2) to estimate the survival of the year class to understand the factors influencing survival of ichthyoplankton; and 3) to survey all fishes present in an area to understand their life history and habitat use (Ahlstrom and Moser 1976). Ichthyoplankton surveys require a high level of expertise; identification of ichthyoplankton requires taxonomists who can recognize and differentiate the subtle, often ambiguous characters that separate closely related species (e.g. body shape, pigment patterns, osteology) (Ahlstrom and Moser 1976; Loh et al. 2014). During ichthyoplankton surveys, individuals can be at different stages of development, and the presence of cryptic species makes it difficult to distinguish between closely related species (Ahlstrom and Moser 1976). Recent efforts have relied on DNA barcoding as a method to identify ichthyoplankton, as it is not limited by the barriers present in conventional sampling and identification efforts (Becker et al. 2015; Hulley et al. 2018).

Conventionally, the most commonly used method to identify fishes is morphology, based on comparing differences in anatomy and other phenotypic characteristics to distinguish species (Hebert and Gregory 2005). However, it can be very difficult to identify specimens in their early developmental stages using morphology because many fishes are anatomically similar in early life stages (Strauss and Bond 1990; Kim and Oh 2015). Fish eggs are identified to taxa through the presence, location, density, and colour of oil globules and melanophores based on their similarities to previously identified individuals (Kim and Oh 2015). In a study by Becker et al. (2015), neotropical fish larvae were difficult to identify to species using morphology; a DNA barcoding approach correctly identified all 40 eggs and 57 larvae, whereas the morphological approach was only 22.5% accurate. Identifying the early life stages of fishes is difficult and can only be successfully achieved by taxonomic specialists; barcoding is capable of removing these barriers to better identify ichthyoplankton to the species level.


Limitations of metabarcoding

Molecular approaches to species identification have limitations due to the degradation of DNA and a lack of complete genome databases (Strauss and Bond 1990; Ratnasingham and Hebert 2007). Molecular methods are also vulnerable to false positives and false negatives through PCR and sequencing errors (Teletchea et al. 2009). Primer specificity, which reflects the likelihood of mismatches between the primer and template sequences (Dieffenbach et al. 1993), can limit the ability to detect a wide range of species (Pawlowski et al. 2018). Primer bias can lead to false negatives in samples with high species richness due to unsuccessful amplification of all target species (Hatzenbuhler et al. 2017). The current standard barcode is a ~600 bp section of the COI sequence; however, this barcode is too long for HTS methods such as Illumina (Kocher et al. 2017). This has led to studies that implement a shorter barcode region with high diversity to identify species (Hajibabaei et al. 2006; Taberlet et al. 2018). Despite these limitations, metabarcoding is an effective tool for species identification at any life stage and can be used as a method for detecting species richness. A metabarcoding approach is also more cost-effective, provides better resolution for identification, is capable of detecting cryptic species, and is, overall, more accurate and effective than conventional methods (Hebert and Gregory 2005).

1.2 Applications of barcoding and metabarcoding

Barcoding to detect diversity

Biodiversity information is critical in implementing appropriate management actions to conserve species that are at risk of extinction. Barcoding provides the opportunity to identify a variety of fishes using universal primers. Ward et al. (2005) used barcoding to assess local biodiversity, using a cocktail of two different sets of primers to identify a wide variety of Australian sharks, rays, and teleost species. Barcoding can also be used to understand the complex structure of highly diverse groups of fishes in freshwater and marine habitats (Valdez-Moreno et al. 2009). It can be difficult to differentiate species that have recently diverged due to similarity in their genetic code and morphology at the species level. Barcoding can be used to identify these potentially cryptic species and haplotypes in highly diverse taxa to fill the gaps in taxonomic knowledge of these elusive species (Landi et al. 2014; McCusker et al. 2012); for example, the discovery of a new species of swamp eel in Catemaco Lake in the Neotropics (Valdez-Moreno et al. 2009). Olds et al. (2016) sampled a small stream at multiple sites using electrofishing and compared their findings to eDNA sampling. All species detected using electrofishing were also detected using eDNA, with additional species detected only through eDNA. Barcoding is also used to identify ichthyoplankton, whose morphological identification can be biased depending on the taxonomist. Barcoding has been shown to reduce this bias by identifying morphologically similar larvae more accurately than conventional methods (Ko et al. 2013).

Barcoding to detect threats

As new colonizers of ecosystems, aquatic invasive species (AIS) are initially present in very low abundances, but can rapidly grow in number (Mack et al. 2000). It is essential to detect AIS prior to establishment of a reproducing population. Metabarcoding has shown its potential as a biodiversity monitoring tool and can be used for early detection of invasive species. AIS are usually generalists that exploit resources in an ecosystem to establish a reproducing population, and are capable of dispersing beyond their initial point of introduction, making the entire watershed vulnerable to colonization unless limited by natural or human barriers (Ricciardi 2006). Asian carps are an emerging threat to native species in the Great Lakes. Currently, four species of Asian carps threaten the Great Lakes: Grass Carp (Ctenopharyngodon idella), Bighead Carp (Hypophthalmichthys nobilis), Black Carp (Mylopharyngodon piceus), and Silver Carp (Hypophthalmichthys molitrix) (Cudmore et al. 2012, 2017). Several methods have been used for the early detection of Asian carps in the Great Lakes (Marson et al. 2016). One method uses eDNA to detect the presence of Bighead or Silver Carp in targeted areas of concern. eDNA-based surveillance methods exploit the advantages of an aquatic environment; the suspension of tissues, scales, and other biotic material allows detection of DNA from elusive species (Jerde et al. 2013). The eDNA evidence reported in Jerde et al. (2013) showed the presence of Bighead and Silver Carp eDNA on the Great Lakes side of an electric barrier erected to prevent movement from the Mississippi basin to the Great Lakes basin. This evidence was later confirmed when commercial fishermen caught an adult Bighead Carp in Lake Michigan close to a positive eDNA detection site (Jerde et al. 2013). The eDNA-based Great Lakes surveillance plan proposed by the Asian Carp Regional Coordinating Committee (ACRCC) could potentially detect Asian carps early enough to implement management and protection plans to reduce the damage from these invasive species (ACRCC 2020).

Combining Barcoding with High Throughput Sequencing

Metabarcoding surpasses species-specific individual barcoding by allowing mass detection of species in communities. The use of metabarcoding is relatively recent compared to barcoding; the combination of barcoding with a high-throughput sequencing platform provides the opportunity for more efficient protocols. Several studies in the last decade have highlighted the possibilities and current advancements in metabarcoding. Metabarcoding measures species richness more effectively than conventional methods, such as nets and electrofishing (Valentini et al. 2016; Evans et al. 2017). eDNA-based metabarcoding provides a non-destructive method to assess fish diversity and community composition (Stat et al. 2018). Some studies have attempted to identify fish diversity and community composition through metabarcoding of the stomach contents of predatory fishes (Siegenthaler et al. 2018). Metabarcoding of ichthyoplankton has been used to identify early life stages of fishes (Duke and Burton 2020), to establish potential early detection systems for AIS (Hatzenbuhler et al. 2017), and to establish early life-stage surveillance systems that assess fisheries sustainability through successful reproduction of stock populations (Maggia et al. 2017).

Possible future applications

Barcoding can be used for a wide variety of applications, including stomach content analysis (Leray et al. 2013), identifying fishes at their early life stages (Ko et al. 2013), and evaluating misidentification of commercial fish products (Ardura et al. 2010). Increased demand for seafood has led to destructive practices, including mislabelled or misidentified fish products being sold commercially. DNA-based identification tools have been used to accurately identify commercially sold fish products (Ardura et al. 2010). Inaccurate or missing catch identification has led to unstable fisheries (Watson and Pauly 2001). Monitoring genetic variation and accurately detecting stock populations can help to maintain fisheries and assess their adaptation to the changing environment (Frankham 1995; Watson and Pauly 2001). A complete and accurate database of freshwater fishes of Canada will aid in standardizing fish identification and making it more accurate (Ratnasingham and Hebert 2007; Hubert et al. 2007). Using multiple mitochondrial sequences for barcoding may also aid in differentiating species that are difficult to evaluate using just one molecular marker, such as cytochrome c oxidase I (COI) (Ardura et al. 2013).

1.3 Introduction to Laurentian Great Lakes

History of changes to the Great Lakes

The modern-day Laurentian Great Lakes are the result of the retreat of the Laurentide continental ice sheet approximately 14,000 years ago (Emminizer 2020). These lakes are among the largest freshwater lakes in the world, and their basin includes numerous smaller lakes and extensive tributaries. The Great Lakes contain roughly 20% of the Earth’s surface fresh water and have a diversity of microhabitats (Cudmore and Crossman 2000). The ichthyofauna of the Great Lakes is evolutionarily recent. Fishes dispersing from Wisconsinan glacial refugia established much of the modern Great Lakes fish community in the basin (Bailey and Smith 1981). One of the more recent changes to the Great Lakes ichthyofauna is directly related to increased anthropogenic activity within the Great Lakes basin, which has led to the introduction of non-native species through various vectors, such as intentional stocking, ballast water, angler release, and aquarium release (Ricciardi 2006; Mandrak and Cudmore 2010, 2013). Many of these invasive species have established populations within the Great Lakes, which has significantly impacted the biodiversity of the Great Lakes (Emery 1985; Ricciardi 2006; Mandrak and Cudmore 2010, 2013).

Threats to Great Lakes

Like many ecosystems around the world, the Great Lakes ecosystem is being impacted by a variety of threats. The fish fauna of the Great Lakes changed as European settlers introduced aquatic species that they preferred to fish and eat, and technology that negatively affected native fishes (Mandrak and Cudmore, 2010). Extensive anthropogenic activity has resulted in the global extinction of three species (Blue Pike (Sander vitreus glaucus), Deepwater Cisco (Coregonus johannae), and Shortnose Cisco (Coregonus reighardi)) and the extirpation of 18 species, and many other species are at risk in at least one of the Great Lakes (Mandrak and Cudmore 2010, 2013). Over the past two centuries, more than 180 invasive species have been detected in the Great Lakes basin (Ricciardi 2006).

The successful introduction and establishment of invasive species are a great threat to native ecosystems and the economy (Lodge et al. 2016). Invasive species have a direct impact on native ecosystems through predation and competition with native fauna, and an indirect impact through alteration of trophic interactions and transmission of diseases (Helfman 2007). As the number of invasive species has increased in the Great Lakes, so has the number of extirpated species (Mandrak and Cudmore 2010). Therefore, it is necessary to implement management actions to prevent the further introduction of aquatic invasive species to the Great Lakes ecosystem and to conserve the extant populations of at-risk species.

Species at risk

Within the past two centuries, many native species in Canada were extirpated due to a lack of effective conservation actions. According to Collen et al. (2014), one in three freshwater species is at risk of extinction worldwide, and aquatic species are at a higher risk of extinction than terrestrial species. High rates of habitat degradation and habitat loss have resulted in at-risk species worldwide that are highly susceptible to local extinction (Collen et al. 2014). In Canada, the negative impacts of AIS have accelerated the decline of populations of at-risk species (Mandrak and Cudmore 2010; Dextrase and Mandrak 2006). In 2003, the Species at Risk Act (SARA) was enacted to protect at-risk wildlife in Canada. The purpose of this act was to prevent the extirpation of Canadian biodiversity and provide recovery strategies for species at risk of extinction (S.C. 2002, c. 29). Effective recovery strategies and action plans require information regarding target population and distribution trajectories. Action plans outline the actions required to achieve the distribution and abundance targets identified in recovery strategies (Irvine et al. 2005). Therefore, it is important to accurately estimate species richness, distribution, and abundance. In the Great Lakes basin, there are many at-risk fish species listed as Special Concern, Threatened, and Endangered under the Act (Mandrak and Cudmore 2013).

Aquatic invasive species

The cumulative increase in aquatic invasive species over the past two centuries has contributed greatly to the decline of native species. In the past century, the Great Lakes basin has become densely populated with the immigration of European settlers, who have directly and indirectly affected the decline in native species (Mandrak and Cudmore 2010). Recreational fishing in the Great Lakes basin has led to a high number of non-native species being released into aquatic systems through the stocking of non-native species and the use of non-native bait (Mandrak and Cudmore 2010; Drake and Mandrak 2014). The impacts of invasive species are especially severe when they affect at-risk species living in limited ranges. For example, two endemic Threespine Stickleback (Gasterosteus aculeatus) forms in Hadley Lake were assessed by COSEWIC as extinct after the introduction of Brown Bullhead (Ameiurus nebulosus), which preyed on the sticklebacks (Hatfield 2001). The increasing number of AIS in the Great Lakes basin over the past century has occurred alongside other stresses, such as habitat degradation and overexploitation, resulting in the decline of native species (Ricciardi 2006; Mandrak and Cudmore 2010). The synergistic behaviour of a multi-stressor system increases the impact of individual threats, ultimately leading to population extinction if species are unable to adjust to the multiple stressors (Halpern et al. 2008). Through the collaborative efforts of Fisheries and Oceans Canada, provincial and territorial governments, universities, and other research organizations, AIS prevention methods are being developed to address the introduction, establishment, and spread of aquatic invasive species in Canadian aquatic ecosystems (Fisheries and Oceans Canada 2018a). One of the primary challenges to the management of AIS is the lack of technology for early detection of invasions into aquatic systems.

Sydenham River watershed

The Sydenham River is a large river system located in southwestern Ontario and is home to a unique and diverse group of fish taxa. The river drains into Lake St. Clair and has two main branches, the North and East Sydenham River. The Sydenham River watershed has a low relief system (Fisheries and Oceans Canada 2018a), meaning there is no significant change in elevation along the length of the river. Land use in the watershed is mainly agricultural, which results in runoff that increases turbidity and nutrient loads in the river (Fisheries and Oceans Canada, 2018a). The Sydenham River watershed provides critical habitat for its at-risk species and is one of the most species-rich watersheds in Canada for SAR (Staton et al. 2003). The Sydenham River is located within the Carolinian ecoregion; the milder climate in this region allows for a greater diversity of fauna and flora, making it a conservation hot spot in Canada (Staton and Mandrak 2006). The Sydenham River is considered a priority conservation hot spot in southwestern Ontario due to its high species richness, high number of species at risk, and numerous threats to species at risk (Staton and Mandrak 2006). Like many other ecosystems, anthropogenic activities threaten these species (Fisheries and Oceans Canada 2018a). Identifying critical spawning and nursery habitat of the freshwater fishes in the Sydenham River will aid in implementing effective recovery strategies to protect these potentially vulnerable aquatic species.

The mid-section of the East Sydenham River was sampled because many of the rare species detected in the Sydenham River have overlapping habitat preferences in this section (Staton et al., 2003). Most of these species prefer areas with gravel substrate and moderate current, which are present throughout the middle and lower sections of the East Sydenham River (Staton et al., 2003). This section of the river is less affected by agricultural runoff, high levels of suspended material, and oil content than other parts of the watershed, making the water quality suitable for greater diversity (Staton et al., 2003).

Rondeau Bay

Rondeau Bay is a coastal wetland located on the north shore of Lake Erie and is home to a wide range of species due to its unique habitats of extensive wetlands and shallow open water. These wetlands are used by a variety of taxa for feeding, spawning, and nursery habitats. A recent survey of the coastal wetlands of Rondeau Bay captured 49 fish species using mini-fyke and bag seine gear types (Montgomery et al. 2020). Rondeau Bay provides critical habitat for at-risk species that have limited distributions across Canada (e.g., Spotted Gar (Fisheries and Oceans Canada 2018b), Spotted Sucker (Edwards and Staton 2009), and Warmouth (COSEWIC 2015)). Rondeau Bay supports the largest population of Spotted Gar in Canada (Fisheries and Oceans Canada 2018b). Eggs on the vegetation of Rondeau Bay were sampled to detect these species at risk.

Purpose of the study

Sensitive detection methods are imperative to detect various native and invasive fish species in the Great Lakes. In my study, I aim to improve molecular-based methods to detect and identify Great Lakes fishes at their early life stages. The goals of this study are to: 1) develop and use a metabarcoding protocol to detect species diversity in batch samples of egg and larval fishes; 2) use a metabarcoding protocol to evaluate fish biodiversity in the Sydenham River drainage of Lake St. Clair and in Rondeau Bay, Lake Erie; 3) assess the temporal patterns of egg and larva occurrence; and, 4) compare the costs of metabarcoding with those of single-specimen barcoding. To achieve these goals, I acquired 1119 batch samples of eggs and larvae, collected by Fisheries and Oceans Canada from sites in Rondeau Bay and the Sydenham River in 2017, and processed them through a metabarcoding pipeline consisting of DNA extraction, DNA amplification, library preparation, high-throughput DNA sequencing, and bioinformatics.

Significance of this study

Molecular ecology applies advanced molecular biology techniques to answer fundamental ecological questions. The ability to accurately identify species through small amounts of tissue, hair, feces, or traces of DNA is a first step toward larger goals, including understanding behaviour (Farrell et al. 2000; Walker et al., 2016) and estimating population sizes (Kohn et al., 1999; Mills et al. 2000). Ichthyoplankton surveys provide a better understanding of fish distribution, abundance, recruitment, species interaction, and habitat use (Lasker, 1987). Integrating molecular techniques with ichthyoplankton surveys will increase the accuracy and cost efficiency of the surveys (Hulley et al. 2018; Taberlet et al. 2018). It is imperative that we use these techniques to improve the conventional monitoring systems in place to detect and identify aquatic species. The Sydenham River and Rondeau Bay provide case studies of the potential of metabarcoding as a detection and identification tool for the early life stages of fishes. This study evaluates this potential and provides a foundation for future conservation research that combines advanced molecular techniques and bioinformatics to answer fundamental ecological questions.

References

Ardura, A., Pola, I. G., Ginuino, I., Gomes, V., and Garcia-Vazquez, E. 2010. Application of barcoding to Amazonian commercial fish labelling. Food Research International, 43(5), 1549-1552. doi:10.1016/j.foodres.2010.03.016

Ahlstrom, E.H., and Moser, H. G. 1976. Eggs and larvae of fishes and their role in systematic investigations and in fisheries. Revue des travaux de l’Institut des pêches maritimes. 40(3), 379-398.

Asian Carp Regional Coordinating Committee (ACRCC). 2020. Asian Carp Action Plan for Fiscal Year 2020. Council on Environmental Quality, Washington, DC. www.asiancarp.us

Baird, D. J., and Hajibabaei, M. 2012. Biomonitoring 2.0: a new paradigm in ecosystem assessment made possible by next-generation DNA sequencing. Molecular Ecology, 21(8), 2039–2044. doi: 10.1111/j.1365-294x.2012.05519.x

Bailey, R. M., and Smith, G. R. 1981. Origin and Geography of the Fish Fauna of the Laurentian Great Lakes Basin. Canadian Journal of Fisheries and Aquatic Sciences, 38(12), 1539-1561. doi:10.1139/f81-206

Balasingham, K. D., Walter, R. P., Mandrak, N. E., and Heath, D. 2017. Environmental DNA detection of rare and invasive fish species in two Great Lakes tributaries. Molecular Ecology, 27(1), 112-127. doi:10.1111/mec.14395

Becker, R. A., Sales, N. G., Santos, G. M., Santos, G. B., and Carvalho, D. C. 2015. DNA barcoding and morphological identification of Neotropical ichthyoplankton from the Upper Paraná and São Francisco. Journal of Fish Biology, 87(1), 159–168. doi: 10.1111/jfb.12707

Bohmann, K., Evans, A., Gilbert, M. T. P., Carvalho, G. R., Creer, S., Knapp, M., … Bruyn, M. D. 2014. Environmental DNA for wildlife biology and biodiversity monitoring. Trends in Ecology & Evolution, 29(6), 358–367. doi: 10.1016/j.tree.2014.04.003

Collen, B., Whitton, F., Dyer, E. E., Baillie, J. E., Cumberlidge, N., Darwall, W. R., . . . Böhm, M. 2013. Global patterns of freshwater species diversity, threat and endemism. Global Ecology and Biogeography, 23(1), 40-51. doi:10.1111/geb.12096

COSEWIC. 2009. COSEWIC assessment and status report on the Eastern Sand Darter Ammocrypta pellucida, Ontario populations and Quebec populations, in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. vii + 49 pp

COSEWIC. 2015. COSEWIC assessment and status report on the Warmouth Lepomis gulosus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. x + 47 pp.

Creer, S., Deiner, K., Frey, S., Porazinska, D., Taberlet, P., Thomas, W. K., … Bik, H. M. 2016. The ecologist’s field guide to sequence-based identification of biodiversity. Methods in Ecology and Evolution, 7(9), 1008–1018. doi: 10.1111/2041-210x.12574

Cristescu, M. E. 2014. From barcoding single individuals to metabarcoding biological communities: towards an integrative approach to the study of global biodiversity. Trends in Ecology & Evolution, 29(10), 566–571. doi: 10.1016/j.tree.2014.08.001

Cruaud, P., Rasplus, J.-Y., Rodriguez, L. J., and Cruaud, A. 2017. High-throughput sequencing of multiple amplicons for barcoding and integrative taxonomy. Scientific Reports, 7(1). doi: 10.1038/srep41948

Cudmore, B. and Crossman, E.J. 2000. Checklists of the fish fauna of the Laurentian Great Lakes and their connecting channels. Canadian Manuscript Report of Fisheries and Aquatic Sciences, 2550: v + 39 p.

Cudmore, B., Mandrak, N.E., Dettmers, J., Chapman, D.C., and Kolar, C.S. 2012. Binational ecological risk assessment of bigheaded carps (Hypophthalmichthys spp.) for the Great Lakes basin. DFO Canadian Science Advisory Secretariat Science Advisory Document. 2011/114.

Cudmore, B., Jones, L.A., Mandrak, N.E., Dettmers, J.M., Chapman, D.C., Kolar, C.S, and Conover, G. 2017. Ecological Risk Assessment of Grass Carp (Ctenopharyngodon idella) for the Great Lakes Basin. DFO Canadian Science Advisory Secretariat Research Document, 2016/118. vi + 115 p.

Dextrase, A.J., S.K. Staton, and Metcalfe-Smith, J.L. 2003. National recovery strategy for species at risk in the Sydenham River: an ecosystem approach. National Recovery Plan No. 25. Recovery of Nationally Endangered Wildlife (RENEW). Ottawa, Ontario. 73 pp.

Dextrase, A. J., and Mandrak, N. E. 2006. Impacts of alien invasive species on freshwater fauna at risk in Canada. Biological Invasions, 8(1), 13–24. doi: 10.1007/s10530-005-0232-2

Dieffenbach, C. W., Lowe, T. M., and Dveksler, G. S. 1993. General concepts for PCR primer design. Genome Research, 3(3). doi:10.1101/gr.3.3.s30

Drake, D. A., and Mandrak, N. E. 2014. Ecological risk of live bait fisheries: a new angle on selective fishing. Fisheries, 39(5), 201-211. doi:10.1080/03632415.2014.903835

Duke, E. M., and Burton, R. S. 2020. Efficacy of metabarcoding for identification of fish eggs evaluated with mock communities. Ecology and Evolution, 10(7), 3463-3476. doi:10.1002/ece3.6144

Edwards, A., J. Barnucz, and N.E. Mandrak. 2006. Fish assemblage surveys of Rondeau Bay, Ontario: 2004 and 2005. Canadian Manuscript Report of Fisheries and Aquatic Sciences. 2773: v + 43 pp

Edwards, A.L. and S.K. Staton. 2009. Management plan for the Blackstripe Topminnow, Pugnose Minnow, Spotted Sucker and Warmouth in Canada. Species at Risk Act Management Plan Series. Fisheries and Oceans Canada, Ottawa. viii + 43 pp.

Elbrecht, V., and Leese, F. 2017. Validation and development of COI metabarcoding primers for freshwater macroinvertebrate bioassessment. Frontiers in Environmental Science. doi:10.3389/fenvs.2017.00011

Emery, L. 1985. Review of fish introduced into the Great Lakes, 1819-1974. Great Lakes Fishery Commission Technical Report, 45.

Emminizer, T. 2020. Moving ice: How the Great Lakes formed. New York: PowerKids Press.

Evans, N. T., Li, Y., Renshaw, M. A., Olds, B. P., Deiner, K., Turner, C. R., . . . Pfrender, M. E. 2017. Fish community assessment with eDNA metabarcoding: Effects of sampling design and bioinformatic filtering. Canadian Journal of Fisheries and Aquatic Sciences, 74(9), 1362-1374. doi:10.1139/cjfas-2016-0306

Farrell, L. E., Roman, J., and Sunquist, M. E. 2000. Dietary separation of sympatric carnivores identified by molecular analysis of scats. Molecular Ecology, 9(10), 1583-1590. doi:10.1046/j.1365-294x.2000.01037.x

Fisheries and Oceans Canada. 2017. Report on the progress of Recovery Strategy Implementation for the Lake Chubsucker (Erimyzon sucetta) in Canada for the Period 2010 – 2015. Species at Risk Act Recovery Strategy Report Series. Fisheries and Oceans Canada, Ottawa. iii+ 31 pp.

Fisheries and Oceans Canada. 2018a. Action Plan for the Sydenham River in Canada: An Ecosystem Approach. Species at Risk Act Action Plan Series. Fisheries and Oceans Canada, Ottawa. iv + 36 pp.

Fisheries and Oceans Canada. 2018b. Report on the Progress of Recovery Strategy Implementation for the Spotted Gar (Lepisosteus oculatus) in Canada for the Period 2012 – 2017. Species at Risk Act Recovery Strategy Report Series. Fisheries and Oceans Canada, Ottawa. iv + 23 pp.

Frankham, R. 1995. Conservation Genetics. Annual Review of Genetics, 29, 305-327. doi:10.1146/annurev.ge.29.120195.001513

Hajibabaei, M., Smith, M. A., Janzen, D. H., Rodriguez, J. J., Whitfield, J. B., and Hebert, P. D. N. 2006. A minimalist barcode can identify a specimen whose DNA is degraded. Molecular Ecology Notes, 6(4), 959–964. doi: 10.1111/j.1471-8286.2006.01470.x

Halpern, B. S., Walbridge, S., Selkoe, K. A., Kappel, C. V., Micheli, F., D'Agrosa, C., ... Watson, R. 2008. A global map of human impact on marine ecosystems. Science, 319(5865), 948–952. doi: 10.1126/science.1149345

Hatfield, T. and Ptolemy, J. 2001. Status of the stickleback species pair, Gasterosteus spp., in Paxton Lake, Texada Island, British Columbia. Canadian Field-Naturalist. 115. 591-596.

Hatzenbuhler, C., Kelly, J. R., Martinson, J., Okum, S., and Pilgrim, E. 2017. Sensitivity and accuracy of high-throughput metabarcoding methods for early detection of invasive fish species. Scientific Reports, 7(1). doi:10.1038/srep46393

Hebert, P. D. N., Cywinska, A., Ball, S. L., and Dewaard, J. R. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1512), 313–321. doi: 10.1098/rspb.2002.2218

Hebert, P. D. N., and Gregory, T. R. 2005. The promise of DNA barcoding for taxonomy. Systematic Biology, 54(5), 852–859. doi: 10.1080/10635150500354886

Helfman, G. 2007. Fish conservation: a guide to understanding and restoring global aquatic biodiversity and fishery resources. Environmental Science. doi: 10.5860/choice.45-3200

Hulley, E. N., Taylor, N. D., Zarnke, A. M., Somers, C. M., Manzon, R. G., Wilson, J. Y., and Boreham, D. R. 2018. DNA barcoding vs. morphological identification of larval fish and embryos in Lake Huron: Advantages to a molecular approach. Journal of Great Lakes Research, 44(5): 1110-1116. doi:10.1016/j.jglr.2018.07.013

Illumina™. 2019. Targeted next-generation sequencing versus qPCR and Sanger sequencing. Retrieved from https://www.illumina.com/content/dam/illumina-marketing/documents/products/other/infographic-targeted-ngs-vs-sanger-qpcr.pdf

Irvine, J.M., Gross, M.R., Wood, C.C., Holtby, L.B., Schubert, N.D., and Amiro, P.G. 2005. Canada’s Species at Risk Act. Fisheries, 30(12), 11-19. doi: 10.1577/1548-8446(2005)30[11:CSARA]2.0.CO;2

Jerde, C. L., Chadderton, W. L., Mahon, A. R., Renshaw, M. A., Corush, J., Budny, M. L., … Lodge, D. M. 2013. Detection of Asian carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of Fisheries and Aquatic Sciences, 70(4), 522–526. doi: 10.1139/cjfas-2012-0478

Kennish, M.J. 1992. Ecology of Estuaries: Anthropogenic Effects. Boca Raton, USA: CRC Press: 494 pp

Ko, H., Wang, Y., Chiu, T., Lee, M., Leu, M., Chang, K., . . . Shao, K. 2013. Evaluating the accuracy of morphological identification of larval fishes by applying DNA barcoding. PLoS ONE, 8(1). doi:10.1371/journal.pone.0053451

Kocher, A., Thoisy, B., Catzeflis, F., Huguin, M., Valière, S., Zinger, L., … Murienne, J. 2017. Evaluation of short mitochondrial metabarcodes for the identification of Amazonian mammals. Methods in Ecology and Evolution, 8(10), 1276–1283. doi: 10.1111/2041-210x.12729

Kohn, M. H., York, E., Kamradt, D.A., Haught, G., Sauvajot, R., and Wayne, R.K. 1999. Estimating population size by genotyping faeces. Proceedings of the Royal Society of London, Series B 266:657–663.

Landi, M., Dimech, M., Arculeo, M., Biondo, G., Martins, R., Carneiro, M., … Costa, F. O. 2014. DNA barcoding for species assignment: The case of Mediterranean marine fishes. PLoS ONE, 9(9). doi: 10.1371/journal.pone.0106135

Lasker, R. 1987. Use of fish eggs and larvae in probing some major problems in fisheries and aquaculture. American Fisheries Society Symposium, 2(1-16).

Leray, M., Yang, J. Y., Meyer, C. P., Mills, S. C., Agudelo, N., Ranwez, V., . . . Machida, R. J. 2013. A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: Application for characterizing coral reef fish gut contents. Frontiers in Zoology, 10(1), 34. doi:10.1186/1742-9994-10-34

Littlefair, J. E., and Clare, E. L. 2016. Barcoding the food chain: from Sanger to high-throughput sequencing. Genome, 59(11), 946–958. doi: 10.1139/gen-2016-0028

Lodge, D.M., Simonin, P.W., Burgiel, S.W., Keller, R.P., Bossenbroek, J.M., Jerde, C.L. ... Zhang, H. 2016. Risk analysis and bioeconomics of invasive species to inform policy and management. Annual Review of Environment and Resources, 41, 453-488.

Loh, W. K. W., Bond, P., Ashton, K. J., Roberts, D. T., and Tibbetts, I. R. 2014. DNA barcoding of freshwater fishes and the development of a quantitative qPCR assay for the species-specific detection and quantification of fish larvae from plankton samples. Journal of Fish Biology, 85(2), 307–328. doi: 10.1111/jfb.12422

Mack, R., Simberloff, D., Lonsdale, W., Evans, H., Clout, M., and Bazzaz, F. 2000. Biotic invasions: causes, epidemiology, global consequences, and control. Ecological Applications, 10(3), 689-710. doi:10.2307/2641039

Maggia, M.E., Vigouroux, Y., Renno, J.F., Duponchelle, F., Desmarais, E., Nunez, J., ... Mariac, C. 2017. DNA metabarcoding of Amazonian ichthyoplankton swarms. PloS one, 12(1), e0170009. doi: 10.1371/journal.pone.0170009

Mandrak, N.E., and Cudmore, B. 2010. The fall of native fishes and the rise of non-native fishes in the Great Lakes basin. Aquatic Ecosystem Health & Management, 13(3), 255-268. doi: 10.1080/14634988.2010.507150

Mandrak, N.E. and B.C. Cudmore. 2013. Fish species at risk and non-native fishes in the Great Lakes Basin: Past, present and future pp. 167-202 In: Taylor, W.W., A.J., Lynch, and N.J. Leonard (eds). Great Lakes Policy and Management, Second Edition. Great Lakes Fishery Commission, Ann Arbor, MI.

Marson, D., Gertzen, E., and Cudmore, B. 2016. Results of Fisheries and Oceans Canada’s 2014 Asian carp early detection field surveillance program. Canadian Manuscript Report for Fisheries and Aquatic Sciences, 3103.

McCusker, M. R., Denti, D., Guelpen, L. V., Kenchington, E., and Bentzen, P. 2012. Barcoding Atlantic Canada’s commonly encountered marine fishes. Molecular Ecology Resources, 13(2), 177–188. doi: 10.1111/1755-0998.12043

Meusnier, I., Singer, G. A., Landry, J., Hickey, D. A., Hebert, P. D., and Hajibabaei, M. 2008. A universal DNA mini-barcode for biodiversity analysis. BMC Genomics, 9(1),214. doi:10.1186/1471-2164-9-214

Mills, L. S., Citta, J. J., Lair, K. P., Schwartz, M. K., and Tallmon, D. A. 2000. Estimating Animal abundance using non-invasive DNA sampling: promise and pitfalls. Ecological Applications, 10(1), 283-294. doi:10.1890/1051-0761(2000)010[0283:eaaund]2.0.co;2

Montgomery, F., Reid, S. M., and Mandrak, N. E. 2020. Extinction debt of fishes in Great Lakes coastal wetlands. Biological Conservation, 241, 108386. doi:10.1016/j.biocon.2019.108386

Mullis, K. B. 1990. The unusual origin of the polymerase chain reaction. Scientific American, 262(4), 56-65. doi:10.1038/scientificamerican0490-56

Oh, J., and Kim, S. 2015. Morphological and molecular characterization of separated pelagic eggs from Lophius litulon (Lophiiformes; Lophiidae). Journal of Fish Biology, 86(6), 1887–1891. doi: 10.1111/jfb.12701

Olds, B. P., Jerde, C. L., Renshaw, M. A., Li, Y., Evans, N. T., Turner, C. R., … Lamberti, G. A. 2016. Estimating species richness using environmental DNA. Ecology and Evolution, 6(12), 4214–4226. doi: 10.1002/ece3.2186

Pawlowski, J., Kelly-Quinn, M., Altermatt, F., Apothéloz-Perret-Gentil, L., Beja, P., Boggero, A., … Kahlert, M. 2018. The future of biotic indices in the ecogenomic era: Integrating (e)DNA metabarcoding in biological assessment of aquatic ecosystems. Science of The Total Environment, 637-638, 1295–1310. doi: 10.1016/j.scitotenv.2018.05.002

Ratnasingham, S., and Hebert, P. D. N. 2007. BOLD: The Barcode of Life data system (http://www.barcodinglife.org). Molecular Ecology Notes, 7(3), 355–364. doi: 10.1111/j.1471-8286.2007.01678.x

Ricciardi, A. 2006. Patterns of invasion in the Laurentian Great Lakes in relation to changes in vector activity. Diversity and Distributions, 12(4), 425–433. doi: 10.1111/j.1366-9516.2006.00262.x

Robson, H. L., Noble, T. H., Saunders, R. J., Robson, S. K., Burrows, D. W., and Jerry, D. R. 2016. Fine-tuning for the tropics: Application of eDNA technology for invasive fish detection in tropical freshwater ecosystems. Molecular Ecology Resources, 16(4), 922-932. doi:10.1111/1755-0998.12505

Rosenfeld, J. S., and Hatfield, T. 2006. Information needs for assessing critical habitat of freshwater fish. Canadian Journal of Fisheries and Aquatic Sciences, 63(3), 683-698. doi:10.1139/f05-242

Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74(12), 5463–5467. doi: 10.1073/pnas.74.12.5463

Siegenthaler, A., Wangensteen, O. S., Soto, A. Z., Benvenuto, C., Corrigan, L., and Mariani, S. 2018. Metabarcoding of shrimp stomach content: Harnessing a natural sampler for fish biodiversity monitoring. Molecular Ecology Resources, 19(1), 206-220. doi:10.1111/1755-0998.12956

Species at Risk Act, S.C. 2002, c 29.

Stat, M., John, J., Dibattista, J. D., Newman, S. J., Bunce, M., and Harvey, E. S. 2018. Combined use of eDNA metabarcoding and video surveillance for the assessment of fish biodiversity. Conservation Biology, 33(1), 196-205. doi:10.1111/cobi.13183

Staton, S.K., Dextrase, A., Metcalfe-Smith, J.L., Di Maio, J., Nelson, M., Parish, J,... Holm, E. 2003. Status and trends of Ontario’s Sydenham River ecosystem in relation to aquatic species at risk. Environmental Monitoring and Assessment, 88, 283-310. doi: 10.1023/A:1025529409422

Staton, S.K. and N.E. Mandrak. 2006. Focusing conservation efforts for freshwater biodiversity. pp. 197-204 In: G. Nelson et al. (eds). Protected areas and species and ecosystems at risk: research and planning challenges. Proceedings of the Parks Research Forum of Ontario Annual Meeting 2005. Parks Research Forum of Ontario, University of Waterloo, Waterloo, ON.

Strauss, R. and Bond, C. 1990. Taxonomic methods: morphology. In: Schreck C. B., Moyle P. B., editors. Methods for fish biology. American Fisheries Society. pp. 109–140.

Taberlet, P., Coissac, E., Hajibabaei, M., and Rieseberg, L. H. 2012. Environmental DNA. Molecular Ecology, 21(8), 1789–1793. doi: 10.1111/j.1365-294x.2012.05542.x

Taberlet, P. 2018. Environmental DNA: for biodiversity research and monitoring. Oxford: Oxford University Press.

Teletchea, F. 2009. Molecular identification methods of fish species: reassessment and possible applications. Reviews in Fish Biology and Fisheries, 19(3), 265–293. doi: 10.1007/s11160-009-9107-4

Thomsen, P. F., and Willerslev, E. 2015. Environmental DNA – An emerging tool in conservation for monitoring past and present biodiversity. Biological Conservation, 183, 4-18. doi:10.1016/j.biocon.2014.11.019

Thomsen, P. F., Kielgast, J., Iversen, L. L., Møller, P. R., Rasmussen, M., and Willerslev, E. 2012. Detection of a diverse marine fish fauna using environmental DNA from seawater samples. PLoS ONE, 7(8). doi: 10.1371/journal.pone.0041732

Valdez-Moreno, M., Ivanova, N. V., Elías-Gutiérrez, M., Contreras-Balderas, S., and Hebert, P. D. 2009. Probing diversity in freshwater fishes from Mexico and Guatemala with DNA barcodes. Journal of Fish Biology, 74(2), 377-402. doi:10.1111/j.1095-8649.2008.02077.x

Valentini, A., Taberlet, P., Miaud, C., Civade, R., Herder, J., … Dejean, T. 2016. Next-generation monitoring of aquatic biodiversity using environmental DNA metabarcoding. Molecular Ecology, 25: 929-942. doi:10.1111/mec.13428

Walker, F. M., Williamson, C. H., Sanchez, D. E., Sobek, C. J., and Chambers, C. L. 2016. Species from feces: Order-wide identification of Chiroptera from guano and other non-invasive genetic samples. PLoS ONE, 11(9). doi:10.1371/journal.pone.0162342

Ward, R. D., Zemlak, T.S., Innes, B.H., Last, P.R., and Hebert, P.D. 2005. DNA barcoding Australia's fish species. Philosophical Transactions of the Royal Society B, 360(1462), 1847-1857. doi:10.1098/rstb.2005.1716

Watson, R., and Pauly, D. 2001. Systematic distortions in world fisheries catch trends. Nature, 414(6863), 534-536. doi:10.1038/35107050

Weisburg, W. G., Barns, S. M., Pelletier, D. A., and Lane, D. J. 1991. 16S ribosomal DNA amplification for phylogenetic study. Journal of Bacteriology, 173(2), 697-703. doi:10.1128/jb.173.2.697-703.1991

Chapter 2 Metabarcoding the early life stages of freshwater fishes in the Great Lakes watershed

2.1 Introduction

DNA metabarcoding has broadened the applicability of molecular-based approaches to biomonitoring. Conventional methods, such as taxonomic keys, morphometrics, and single-specimen barcoding, are time-consuming, costly, and often inaccurate. An advantage of using metabarcoding, compared to conventional DNA barcoding, is the ability to barcode bulk samples on a large scale (Taberlet et al., 2012; Cruaud et al., 2017). Barcoding uses molecular markers, such as the mitochondrial cytochrome c oxidase subunit I (COI) gene, and genetic diversity among taxa to identify species (Hebert et al., 2003; Cruaud et al., 2017). Metabarcoding has proven to be advantageous in early-detection monitoring (Taberlet et al., 2012; Robson et al., 2016), detection of rare species (Valdez-Moreno et al. 2019), estimation of species richness (Olds et al., 2016), and the potential quantification of species abundance (Evans et al., 2016; Mariac et al., 2018) in aquatic systems.

Ichthyoplankton surveys are conducted to determine the distribution and abundance of adult populations, to understand factors influencing survival, and to evaluate current biodiversity in aquatic systems (Ahlstrom and Moser, 1976; Lasker, 1987). Conducting ichthyoplankton surveys requires a high level of taxonomic expertise given the morphological similarity of the eggs and larvae of closely related species (Ahlstrom and Moser, 1976; Loh et al., 2014). Recent efforts have used DNA barcoding to identify ichthyoplankton, as this method is not limited by the barriers of conventional morphological and genetic identification methods (Becker et al., 2015; Hulley et al., 2018). Conventional barcoding is limited to identifying individual specimens, whereas metabarcoding allows the identification of species within a batch sample of multiple individuals (Taberlet et al., 2012). As a result, metabarcoding is ideal for large-scale ichthyoplankton surveys.

To evaluate the utility of metabarcoding in processing ichthyoplankton surveys, fish eggs and larvae were sampled in two aquatic systems in the Canadian Great Lakes basin: the Sydenham River and Rondeau Bay, Lake Erie (Figure 1). The Sydenham River is located in southwestern Ontario and drains into Lake St. Clair. With more than 80 fish species, the Sydenham River has greater fish diversity than most Canadian rivers (Dextrase et al. 2003). Three localities in the East Sydenham River were sampled in this study: Alvinston, Florence, and Oil Springs. Rondeau Bay (Montgomery et al. 2020) is a warm coastal wetland on the northern shore of Lake Erie’s central basin that provides extensive spawning and nursery habitat for fishes; over 50 fish species have been documented there (Edwards et al., 2006). The heavily vegetated northern shore of Rondeau Bay was sampled. Both systems include species at risk (SAR) and aquatic invasive species (AIS) (Dextrase et al. 2003; Montgomery et al. 2020).

Knowledge of species diversity, including endangered and invasive species, and identification of the habitat required for early life stages are critical for conserving the biodiversity of these aquatic systems (Dextrase et al. 2003; Edwards et al. 2006). The East Sydenham River and coastal regions of Rondeau Bay are areas of critical spawning habitat for at-risk fishes (Fisheries and Oceans Canada 2018a; Edwards and Staton 2009); therefore, detection and identification of the early life stages of these species are critical for conservation and management.

In this study, the utility of molecular-based metabarcoding methods to identify the species in larval and egg batch samples is evaluated. The goals of this study are: 1) to develop and use a metabarcoding protocol to detect species diversity in batch samples of egg and larval fishes; 2) to use the metabarcoding protocol to evaluate fish biodiversity in the Sydenham River drainage of Lake St. Clair and Rondeau Bay, Lake Erie; 3) to assess the temporal patterns of egg and larval occurrence; and 4) to compare the costs of metabarcoding and single-specimen barcoding.

2.2 Methods

Fish species present in 1119 early life-stage batch samples collected in the Sydenham River and Rondeau Bay were identified using a 2-step PCR amplification (Cruaud et al., 2017), sequencing with Illumina MiSeq, filtering and analyzing raw data using mothur software (Schloss et al. 2009), and identifying the barcodes against a reference library using the BLAST algorithm built into mothur.

Sample collection

Larval and egg batch samples were collected by Fisheries and Oceans Canada in spring and summer of 2017. Egg batch samples were collected from three localities in the East Sydenham River (Alvinston, Florence, and Oil Springs) and the northern shore of Rondeau Bay, Lake Erie (Figure 1; Table 1). Larval batch samples were collected from Alvinston and Oil Springs (Figure 1; Table 1). Alvinston and Oil Springs are in close proximity (6 km apart), have similar habitat, and are 36 km upstream of Florence (Figure 1). In the Sydenham River, 500 µm drift nets were set for 30-min intervals for each sampling event. The drift nets were set across the stream, with 5 replicates placed at each site on each sampling date. In Rondeau Bay, macrophyte surveys using multiple 0.5 m² quadrats were used to sample the vegetated regions of the northern shore. The submerged vegetation was collected using a rake and later inspected for eggs. Samples from each gear set were separated based on life stage (larvae or eggs) and placed in scintillation tubes with 95% ethanol. In total, 1119 batch samples (352 egg samples; 767 larval samples) with varying numbers of individuals in each sample were collected. The samples were stored in 95% ethanol at room temperature until DNA extraction.

Table 1. Sampling information for egg and larval batch samples from three sampling locations in the Sydenham River, southwestern Ontario, and from Rondeau Bay, Lake Erie. The table lists sampling dates, number of sampling days, and number of samples.

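| Waterbody | Site | Eggs: sampling dates | Eggs: # of sampling days | Eggs: # of samples | Larvae: sampling dates | Larvae: # of sampling days | Larvae: # of samples |
|---|---|---|---|---|---|---|---|
| Sydenham River | Alvinston | April 19 – August 2 | 19 | 115 | May 19 – August 2 | 23 | 387 |
| Sydenham River | Florence | April 19 – August 1 | 22 | 159 | - | - | - |
| Sydenham River | Oil Springs | April 20 – July 12 | 10 | 34 | April 20 – July 19 | 20 | 380 |
| Lake Erie | Rondeau Bay | May 30 – June 15 | 9 | 44 | - | - | - |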

Figure 1. Localities sampled for fish early life-history stages in southwestern Ontario. The Alvinston, Oil Springs, and Florence sites are on the Sydenham River, which drains to Lake St. Clair and is thought to provide nursery habitat for Great Lakes fishes. Samples from Rondeau Bay were collected from the northern shore, which is highly vegetated and provides spawning and nursery habitat for fishes.

DNA extraction

Prior to DNA extraction, batch samples containing more than five individuals were homogenized using a tissue homogenizer. A 1 mL subsample of the homogenized sample was used to extract genomic DNA. Samples containing fewer than five individuals were cut into smaller pieces to increase the surface area for tissue degradation.

The salt-extraction protocol described in Lujan et al. (2020) was used to extract DNA from the batch samples. The tissues from each batch sample were degraded using a cell-lysis solution. Each sample was digested in a thermomixer at 60 ºC and 600 rpm for 24 hours. If the tissues were not digested after 24 hours, more proteinase K was added to accelerate lysis and the sample was placed back in the thermomixer until the tissues were completely digested. After the tissues were digested, the samples were centrifuged for 10 minutes at 13000 rpm to separate debris from the supernatant. The supernatant containing the DNA was transferred to, and mixed in, a centrifuge tube containing 180 µL of 5 M NaCl. The tube was centrifuged for 10 minutes at 13000 rpm and the supernatant was transferred into another centrifuge tube containing 420 µL of isopropanol. The resultant sample was then inverted multiple times to precipitate the DNA, followed by centrifugation for 15 minutes at 13000 rpm to form a DNA pellet. The sample was carefully decanted to remove the supernatant while avoiding any disturbance to the pellet. To remove impurities in the precipitated DNA, the DNA pellet was washed twice using 250 µL of 80% ethanol by vortexing the centrifuge tube for 60 seconds with the ethanol. Ethanol was removed by decanting the centrifuge tube while the DNA pellet remained in the tube, and the remaining ethanol was removed using a SpeedVac™. The DNA was resuspended in 100 µL of 1X TE buffer and placed in a thermomixer at 25 ºC and 600 rpm for 24 hours. The DNA extractions were stored at -20 ºC. A subset of samples was quantified using Qubit™ to determine the average DNA concentration from the salt-extraction process and to test the success of the DNA extractions.

Reference Library and Primer design

Partial COI sequences were collected from GenBank for all extant fish species and all potential invasive species in the Great Lakes basin (based on Roth et al., 2013) to create a reference alignment (also referred to as the reference library in the context of identification). A single sequence was selected for each species (Appendix A – Table S8), and the alignment was trimmed to span the 62nd to the 308th base pair of the COI sequence (Appendix A – Figure S2). Based on this reference alignment, four potential universal primer pairs were designed to amplify a 198 bp region. The primer pairs were tested by amplifying genomic DNA of species from the Great Lakes and potential Great Lakes invasives. The most successful primer pair, KGLF-F and KGLF-R (modified versions of PS1-F and PS1-R from Balasingham et al., 2017), targets a highly variable 198 bp region in the COI sequence (Table 2).

A neighbor-joining tree was used to confirm the phylogenetic position of each sample and a distance matrix was used to calculate the level of divergence of the target region among target taxa (Appendix A – Figure S1). It was determined that the divergence among Great Lakes fishes for the 198 bp barcode was sufficient to identify most species to the species level.
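The divergence check described above can be illustrated with a minimal sketch. It assumes an uncorrected p-distance (the proportion of differing sites) as a stand-in, because the text does not state which distance measure was used, and the three sequences and their names are toy placeholders rather than real COI barcodes.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def pairwise_distances(alignment):
    """Distance for every pair of taxa in a {name: sequence} alignment."""
    items = sorted(alignment.items())
    return {(n1, n2): p_distance(s1, s2)
            for (n1, s1), (n2, s2) in combinations(items, 2)}

# Toy alignment (hypothetical): taxon_1 and taxon_2 share an identical region.
alignment = {
    "taxon_1": "ACGTACGTAC",
    "taxon_2": "ACGTACGTAC",
    "taxon_3": "ACGTTCGTAA",
}
distances = pairwise_distances(alignment)

# Pairs with zero divergence cannot be told apart by this barcode region.
indistinguishable = [pair for pair, d in distances.items() if d == 0.0]
```

Pairs reported in `indistinguishable` correspond to the situation in which two species share an identical barcode region and can only be resolved with additional information.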

Table 2. Universal PCR primer pair used to amplify a wide variety of taxa from the Great Lakes watershed and potential invasive species.

| Direction | Primer Name | Primer |
|---|---|---|
| Forward | KGLF-F | TATTTGGTGCCTGAGCCGGRATRGT |
| Reverse | KGLF-R | CAGAAGCTTATRTTATTTATYCG |
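Both primers in Table 2 contain degenerate positions written with IUPAC codes (R = A or G; Y = C or T). As a small illustration, the sketch below expands each primer into the concrete sequences it encodes. The function name is hypothetical, and only the two codes that occur in these primers are included.

```python
from itertools import product

# IUPAC codes needed for these two primers (R and Y are the only degenerate ones).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]

KGLF_F = "TATTTGGTGCCTGAGCCGGRATRGT"
KGLF_R = "CAGAAGCTTATRTTATTTATYCG"

forward_variants = expand_primer(KGLF_F)
reverse_variants = expand_primer(KGLF_R)

print(len(KGLF_F), len(forward_variants))  # 25 4
print(len(KGLF_R), len(reverse_variants))  # 23 4
```

Each primer has two degenerate positions, so each is a mix of four oligonucleotides.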

The primer pair was modified to link to the PCR2 primers and allow dual-index Illumina MiSeq sequencing (Cruaud et al. 2017; Appendix A – Table S1). To modify the PCR1 primers, heterogeneity spacers and part of the sequencing primer were added. Heterogeneity spacers allow an equal proportion of each base to be sequenced during dual-index sequencing (Fadrosh et al. 2014). The added region of the sequencing primer is where the PCR2 primers hybridize during PCR2. PCR2 primers consist of the second half of the sequencing primer, an index region, and the Illumina adaptor. Multiple PCR2 primers with unique indexes were used to differentiate each sample in the pooled mix (see Appendix A – Table S2). The Illumina adaptor attaches the sequences to the flow cell during Illumina MiSeq sequencing.

Positive and Negative Controls

Blanks (negative controls) were included in each plate to assess levels of contamination; these wells contained the mastermix with 1 µL of water instead of DNA. Mock communities (positive controls) were used to assess the ability to detect multiple species and differences in read output in mixed-species batch samples. Four mock communities, with three replicates each, consisting of different combinations of species present in the sampling sites were used in this study (Appendix A – Table S3). Mock communities were created with diluted genomic DNA (~10 ng/µL) extracted from previously identified specimens.

Library preparation

The 1119 batch samples were amplified and then tagged with indices using a two-step PCR (Cruaud et al., 2017). The first PCR (PCR1) amplifies the target region. The second PCR (PCR2) tags the amplicons with dual indexes for sequencing.

Prior to PCR1, samples were tested through a series of dilutions (1:2, 1:5, and 1:10) to identify potential PCR inhibition. Based on the results from these preliminary tests, all samples were diluted at 1:10 to prevent PCR inhibition.

For PCR1, 96 different primer pairs were used per 96-well plate to amplify the target barcode region: KGLF-R-(1-12) & KGLF-F-(A-H) (Appendix A – Table S1). The thermocycler protocol was as follows: 95 ºC for 1 min, followed by 16 cycles of 95 ºC for 10 s, 62 ºC for 1 min, and 72 ºC for 1 min, and then 25 cycles of 95 ºC for 10 s, 46 ºC for 30 s, and 72 ºC for 1 min. Each PCR reaction was performed using the same mastermix: 14.92 µL of dH2O, 0.75x of PCR buffer, 1.23 mM of MgCl2, 22 µM of dNTPs, 0.4 µM of forward primer, 0.4 µM of reverse primer, 0.4x of Taq Polymerase, and 1 µL of template DNA in a 20 µL reaction volume.

The expected PCR1 amplicon is 315-333 bp long; a 2% agarose gel stained with RedSafe™ dye was used to visualize the PCR1 products. All samples, with or without the target band, proceeded to the PCR2 protocol.
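Combining 8 forward variants (A-H) with 12 reverse variants (1-12) gives one unique PCR1 primer pair for each well of a 96-well plate. The sketch below enumerates those pairs. The assignment of forward variants to plate rows and reverse variants to plate columns, and the well-naming scheme, are assumptions made for illustration; the actual layout is given in Appendix A – Table S1.

```python
from itertools import product

ROWS = "ABCDEFGH"        # forward primer variants KGLF-F-A .. KGLF-F-H
COLUMNS = range(1, 13)   # reverse primer variants KGLF-R-1 .. KGLF-R-12

def plate_layout():
    """Map each well (e.g. 'A1') to an assumed (forward, reverse) primer pair."""
    layout = {}
    for row, col in product(ROWS, COLUMNS):
        layout[f"{row}{col}"] = (f"KGLF-F-{row}", f"KGLF-R-{col}")
    return layout

layout = plate_layout()
print(len(layout))        # 96
print(layout["A1"])       # ('KGLF-F-A', 'KGLF-R-1')
```

Because every well receives a different combination, the primer pair alone is enough to tell wells on the same plate apart.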

For PCR2, dual indexes were used to identify the samples within each Illumina MiSeq sequencing run (see Appendix A – Table S4). This allows the reads from each sequencing run to be demultiplexed, so that every read can be assigned to the sample it came from. The PCR2 thermocycler protocol was as follows: 95 ºC for 1 min, followed by 16 cycles of 95 ºC for 10 s, 62 ºC for 1 min, and 72 ºC for 1 min, and then 25 cycles of 95 ºC for 10 s, 46 ºC for 30 s, and 72 ºC for 1 min. Each PCR2 reaction was performed using the same mastermix: 14.92 µL of dH2O, 0.75x of PCR buffer, 1.23 mM of MgCl2, 22 µM of dNTPs, 0.4 µM of forward primer, 0.4 µM of reverse primer, 0.4x of Taq Polymerase, and 1 µL of PCR1 product as template DNA in a 20 µL reaction volume. The expected PCR2 amplicon is 384-402 bp long; a 2% agarose gel stained with RedSafe™ dye was used to visualize the PCR2 products.

PCR2 product was cleaned and standardized using a Thermofisher Sequalprep Normalization Kit following manufacturer protocols. The normalization kit standardizes the quantity of DNA from each reaction in the final pooled mix. A subsample of this pooled mix was used in the sequencing process. Each set of 96 PCR reactions (96 samples) was pooled into one centrifuge tube (12 tubes in total); the concentration of DNA in the pooled mixes varied between 0.9 and 2.2 ng/µL.

The Illumina libraries were paired-end sequenced using a 2x150 bp MiSeq Reagent Kit v2 Nano on an Illumina MiSeq platform in the Centre for the Analysis of Genome Evolution and Function at the University of Toronto. Two sequencing runs were performed, each sequencing 576 samples.

Bioinformatics

Mothur (Schloss et al. 2009) was used to filter and clean raw read output; commands used in each step are provided in Table S5 in Appendix A. After the raw reads from each sequencing run were demultiplexed based on their assigned indexes, full contigs were produced from these reads. The contigs were then filtered by removing any contig longer than 333 bp (length of the barcode + the PCR1 primer pair). Duplicate sequences were merged to remove noise. The reference alignment was used to create an alignment of all unique sequences after merging duplicates. Using the alignment as a reference, excess base pairs were trimmed from both sides of the barcode. Any sequences that differed by 1 or 2 base pairs were clustered, as recommended by mothur (Schloss et al. 2009), to account for possible sequencing error. The sequences were then filtered to remove chimeric sequences caused by PCR error.
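The length filter, duplicate merging, and 1-2 bp clustering steps can be sketched in a few lines of Python. This is a simplified illustration of the logic, not a reimplementation of the mothur commands listed in Table S5: it compares equal-length sequences position by position instead of working on an alignment, and the function names and toy contigs are hypothetical.

```python
from collections import Counter

MAX_LENGTH = 333  # contigs longer than this were discarded
MAX_DIFFS = 2     # sequences differing by 1 or 2 bases were clustered

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def dereplicate(contigs, max_length=MAX_LENGTH):
    """Drop over-length contigs and count identical sequences."""
    return Counter(seq for seq in contigs if len(seq) <= max_length)

def precluster(counts, max_diffs=MAX_DIFFS):
    """Fold each sequence into a more abundant one within max_diffs mismatches."""
    clusters = {}
    # Visit sequences from most to least abundant (ties broken alphabetically).
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for centre in clusters:
            if len(centre) == len(seq) and hamming(centre, seq) <= max_diffs:
                clusters[centre] += n
                break
        else:
            clusters[seq] = n
    return clusters

# Toy contigs (hypothetical): one abundant sequence, one 1-bp variant of it,
# one unrelated sequence, and one over-length contig.
contigs = (["ACGTACGTAC"] * 5 + ["ACGTACGTAA"] + ["TTTTACGGGG"] * 3
           + ["A" * 400])
unique_counts = dereplicate(contigs)
clustered = precluster(unique_counts)
print(clustered)  # {'ACGTACGTAC': 6, 'TTTTACGGGG': 3}
```

Visiting sequences in order of abundance means that a rare variant, which is more likely to be a sequencing error, is absorbed by the common sequence it resembles and not the other way round.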

The resulting sequences were then identified using the reference library (Appendix A – Table S8). The confidence level of the species identity for each unique sequence was assessed using the classify.seqs command. This command implements a naïve Bayes method that repeatedly draws random samples of the k-mers (where k = 8) within the unknown sequence and tries to match them to sequences from the reference library (Murali et al. 2018). Confidence levels are derived from the percentage of bootstrap replicates that were assigned to a taxonomic rank (in this study, the species level) (Murali et al. 2018). Following the recommendations of Wang et al. (2007), any sequence identified with less than 80% support was removed. This step removes sequences that cannot be classified with high confidence before the unique sequences are assigned to taxa. The classify.seqs command was then used again with the built-in BLAST algorithm to assign taxa to the remaining unique sequences. To verify these identifications, a BLAST search of a few sequences against GenBank, independent of mothur, was used to confirm the taxa assigned in mothur. A neighbour-joining tree of all unique sequences and the reference library was also used to confirm the assignment of taxa.

Phylogenetic analysis was conducted to see how the identified sequences would group in a neighbour-joining tree (Foster et al. 2013). Sequences with the same species identity, including reference sequences, should group together into monophyletic clades. After sequences were filtered using mothur, the total number of sequences (read count) for each species was tallied per sample.
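The idea behind the bootstrap confidence value can be shown with a deliberately simplified sketch. It is not mothur's classifier: a plain count of shared 8-mers stands in for the naïve Bayes likelihood, and the subsample size (one eighth of the words), the number of replicates (100), the function names, and the toy reference sequences are all assumptions made for illustration.

```python
import random
from collections import Counter

K = 8             # word size stated in the text
MIN_SUPPORT = 80  # percent bootstrap support required to keep a sequence

def kmers(seq, k=K):
    """All overlapping k-mers (words) in a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def best_match(words, reference_kmers):
    """Name of the reference sharing the most sampled words (ties: alphabetical)."""
    scores = {name: sum(w in ref for w in words)
              for name, ref in reference_kmers.items()}
    return max(sorted(scores), key=scores.get)

def classify(query, references, replicates=100, seed=0):
    """Return (taxon, percent of bootstrap replicates supporting it)."""
    rng = random.Random(seed)
    reference_kmers = {name: set(kmers(seq)) for name, seq in references.items()}
    words = kmers(query)
    sample_size = max(1, len(words) // 8)  # assumed subsample size per replicate
    votes = Counter(
        best_match(rng.choices(words, k=sample_size), reference_kmers)
        for _ in range(replicates)
    )
    taxon, hits = votes.most_common(1)[0]
    return taxon, 100.0 * hits / replicates

# Toy references that share no 8-mers (hypothetical, not real COI barcodes).
references = {
    "species_a": "ACCACAACCCACACAACCAACCCACAAC",
    "species_b": "GTTGTGGTTTGTGTGGTTGGTTTGTGGT",
}
taxon, support = classify("ACCACAACCCACACAACCAACCCACAAC", references)
keep = support >= MIN_SUPPORT  # sequences below 80% support would be removed
```

A query whose words match one reference in every replicate receives 100% support, whereas a query that is equally similar to two references splits its votes between them and can fall below the 80% cut-off.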

Before analyzing the results, detection data for each sample were filtered using results from positive and negative controls to reduce potential false-positive detections. Following Nguyen et al. (2014), a minimum positive detection threshold was determined based on my positive-control experiments. Species included in the positive controls were detected at levels as low as 2 reads (Appendix A – Table S7); therefore, the minimum detection threshold was set to 2 reads per sample. Setting a higher threshold is more conservative in terms of detections, but could result in less of the true diversity being detected. Negative controls provide an indication of contamination during library preparation and sequencing and can be used to determine species-specific contamination thresholds for detection (McKnight et al., 2019; Nguyen et al. 2014). For each of the two sequencing runs, maximum read counts for each species detected in the negative controls (Appendix A – Table S6) were used as species-specific minimum thresholds for detection of those taxa in each sample (following methods similar to McKnight et al., 2019 and Nguyen et al. 2014). For analyses and interpretations below, read counts were reduced by these species- and run-specific thresholds.
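The two filters can be combined as in the following sketch. The order of operations (subtracting the negative-control maximum first, then applying the 2-read minimum to what remains) is an assumption, as are the function name and the made-up read counts; the real thresholds are in Appendix A – Tables S6 and S7.

```python
MIN_READS = 2  # minimum detection threshold derived from the positive controls

def filter_detections(sample_reads, control_max, min_reads=MIN_READS):
    """Apply run-specific contamination thresholds to one sample.

    sample_reads: {species: read count} for a single batch sample.
    control_max:  {species: maximum read count in the negative controls
                   of the sequencing run that the sample belongs to}.
    """
    detections = {}
    for species, reads in sample_reads.items():
        adjusted = reads - control_max.get(species, 0)
        if adjusted >= min_reads:
            detections[species] = adjusted
    return detections

# Made-up read counts for illustration only.
sample = {"Species A": 120, "Species B": 5, "Species C": 1}
run_control_max = {"Species B": 4}
detections = filter_detections(sample, run_control_max)
print(detections)  # {'Species A': 120}
```

In this toy case Species B is dropped because only 1 read remains after the contamination threshold is subtracted, and Species C is dropped because it never reached the 2-read minimum.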

Cost-effectiveness analysis

The cost per sample of consumable expenses (excluding labour) for metabarcoding and individual-based barcoding was calculated in Canadian Dollars based on the number of batch samples and the number of specimens sequenced in this study. For this study, 13717 specimens from 1119 batch samples were sequenced in two next-generation sequencing runs. For standard barcoding, the per-sample cost was calculated based on standard protocols used to barcode samples and then multiplied by 13717 to determine the total estimated cost for the study. For metabarcoding, the cost to sequence the total number of specimens in this study ($6597.33) was divided by the total number of individuals analyzed (13717) to determine the price per individual egg/larva sample, and divided by the total number of batch samples analyzed (1119) to determine the price of sequencing each batch sample.

All costs were calculated based on 2020 prices, and future costs may differ. For individual barcoding, reagents and other materials were priced at University of Toronto MedStore, and costs of Sanger sequencing were based on charges at The Centre for Applied Genomics at The Hospital for Sick Children, Toronto. DNA extraction assumes use of reagents required for a salt extraction protocol, PCR assumes use of reagents used in the methods of this study, and post-PCR cleanup assumes use of ExoSAP (barcoding) or Thermofisher Sequalprep Normalization Kit (metabarcoding). The cost of sequencing for metabarcoding was based on charges for Illumina MiSeq 2x150 bp runs at The Centre for the Analysis of Genome Evolution and Function at the University of Toronto.

2.3 Results

Species detection

A total of 498224 reads for batch samples were analyzed (average of 531 reads ± 514 reads). A detection in this study is defined as a presence of a species in a batch sample. Most sequences (99%) were identified to the species level. A total of 1424 detections were determined from the batch samples after filtering for contamination. A total of 917 out of 1119 batch samples (250 of 352 egg batch samples and 667 of 767 larval batch samples) provided successful detections. The mean read count per detection decreased as the number of species detected in a sample increased (Appendix B – Figure S2). This variation in read counts could be caused by species-specific primer efficacy (Hatzenbuhler et al. 2017).

The metabarcoding protocol developed in this study was successful at detecting and identifying species present in egg and larval batch samples. In a total of 1424 detections (310 egg-based detections and 1114 larva-based detections), 35 species were detected (Figure 2; Table 3). A total of 34 species were detected from the Sydenham River and 8 species from Rondeau Bay (Figure 2; Table 3), but this difference in the number of species detected is most likely the result of the difference in gear types used and numbers of batch samples analyzed from each location (1075 samples from the Sydenham River and 44 from Rondeau Bay). The species detected using metabarcoding generally agreed with the species detected in these watersheds using conventional methods (Table 3). The most frequently detected species were Greenside Darter (Etheostoma blennioides), Shorthead Redhorse (Moxostoma macrolepidotum), Spotfin Shiner (Cyprinella spiloptera), and White Sucker (Catostomus commersonii) (Figure 2).

The metabarcoding protocol detected four species listed under the federal Species at Risk Act (SARA): Eastern Sand Darter (Ammocrypta pellucida - Threatened), Lake Chubsucker (Erimyzon sucetta - Endangered), River Darter (Percina shumardi - Endangered), and Spotted Sucker (Minytrema melanops - Special Concern) (www.sararegistry.gc.ca). Three invasive species were also detected: Common Carp (Cyprinus carpio), Round Goby (Neogobius melanostomus), and Sea Lamprey (Petromyzon marinus).

Some species were detected with low read counts (Figure 2; Appendix B – Figure S1). Generally, in metabarcoding studies, read thresholds are used to remove detections with read counts of 1-10 to attain reliable detections in the overall study at the expense of losing rare detections (Alberdi et al. 2017). Based on the positive-control results (Appendix A – Table S7), a read count of 2 was used as the minimum threshold for detection. Greater Redhorse (Moxostoma valenciennesi), Lake Chubsucker, Lake Trout (Salvelinus namaycush), Northern Hogsucker (Hypentelium nigricans), River Darter, Sea Lamprey, and Smallmouth Bass (Micropterus dolomieu) were detected with fewer than 10 reads per sample (see Appendix B – Figure S1). These detections should be viewed as less certain than those evidenced by higher read counts.

Figure 2. Total number of detections of each species across all batch samples, with colours indicating localities. There were 310 detections from egg batch samples and 1114 detections from larval batch samples.

Table 3. Detection of species using metabarcoding (MB) (eggs and larvae; this study) and conventional sampling (Cvn) (typically adults, Eakins et al. 2020) for species in the Sydenham River and Rondeau Bay, Lake Erie. Reported spawning months for each species detected based on Eakins et al. (2020) are shown, as are the months that eggs and larvae were detected using metabarcoding. * species detected with less than 10 reads. Species listed under SARA are denoted by designation: Special Concern (SC), Threatened (TH), Endangered (EN).

| Species (SARA status) | Scientific Name | Sydenham River Cvn | Sydenham River MB | Rondeau Bay Cvn | Rondeau Bay MB | Reported Spawning Months | Months Detected using metabarcoding: Egg | Months Detected using metabarcoding: Larvae |
|---|---|---|---|---|---|---|---|---|
| Banded Killifish | Fundulus diaphanus | X | - | X | X | June–August | June | - |
| Blackside Darter* | Percina maculata | X | X | - | - | May–June | May | April–July |
| Bluegill | Lepomis macrochirus | X | X | X | - | June–August | July | June–July |
| Bluntnose Minnow | Pimephales notatus | X | X | X | X | June–August | June–July | May–August |
| Common Carp | Cyprinus carpio | X | X | X | X | May–August | May–July | May–July |
| Creek Chub | Semotilus atromaculatus | X | X | X | - | May–June | May | - |
| Eastern Sand Darter (TH) | Ammocrypta pellucida | X | X | X | - | June–July | June | June–August |
| Fantail Darter | Etheostoma flabellare | X | X | - | - | May–June | May | May |
| Freshwater Drum | Aplodinotus grunniens | X | X | X | - | May–July | May–July | June |
| Ghost Shiner / Mimic Shiner | Notropis buchanani / Notropis volucellus | X | X | X | - | June–August / June–July | July | May–August |
| Golden Redhorse | Moxostoma erythrurum | X | X | X | - | May–June | May–July | May–July |
| Greater Redhorse* | Moxostoma valenciennesi | X | X | - | - | May–June | - | June |
| Greenside Darter | Etheostoma blennioides | X | X | X | X | April–June | April–July | May–July |
| Buffalo | Ictiobus spp. | X | X | X | X | May–June | May–June | - |
| Johnny Darter | Etheostoma nigrum | X | X | X | - | May–June | June | May |
| Lake Chubsucker* (EN) | Erimyzon sucetta | X | X | X | - | May–June | - | May |
| Lake Trout* | Salvelinus namaycush | - | X | - | - | September–November | - | May & July |
| Logperch | Percina caprodes | X | X | X | - | May–June | April–June | April–June |
| Longnose Gar | Lepisosteus osseus | X | X | X | - | May–June | - | June |
| Mooneye | Hiodon tergisus | X | X | X | X | April–June | April–June | May |
| Northern Hogsucker* | Hypentelium nigricans | X | X | - | - | April–May | May | - |
| Northern Sunfish | Lepomis peltastes | X | X | - | - | June–July | - | July |
| Quillback | Carpiodes cyprinus | X | X | X | - | April–June | May | May–June |
| Redfin Shiner | Lythrurus umbratilis | X | X | X | - | June–August | - | May–July |
| River Darter* (EN) | Percina shumardi | X | X | - | - | April–June | May | - |
| Rock Bass | Ambloplites rupestris | X | X | X | - | May–June | July | June–July |
| Round Goby | Neogobius melanostomus | X | X | X | X | May–July | May–June | May–June |
| Sea Lamprey* | Petromyzon marinus | - | X | X | - | May–June | June | - |
| Shorthead Redhorse | Moxostoma macrolepidotum | X | X | X | X | April–June | May–July | May–July |
| Silver Redhorse | Moxostoma anisurum | X | X | X | - | April–June | June | May–July |
| Smallmouth Bass* | Micropterus dolomieu | X | X | X | - | May–June | - | June–July |
| Spotfin Shiner | Cyprinella spiloptera | X | X | X | - | June–August | May–August | June–August |
| Spotted Sucker (SC) | Minytrema melanops | X | X | - | - | May–June | July | June–July |
| Stonecat | Noturus flavus | X | X | X | - | June–August | - | May–July |
| White Sucker | Catostomus commersonii | X | X | X | - | April–June | April–June | May–July |

Spatial patterns

There were differences in species detected between Florence, Alvinston/Oil Springs, and Rondeau Bay in the egg batch samples (egg batch samples were collected from all four locations). The Alvinston and Oil Springs sampling sites were farther upstream than the Florence site. Creek Chub (Semotilus atromaculatus), Freshwater Drum (Aplodinotus grunniens), Eastern Sand Darter, Spotted Sucker, and Sea Lamprey eggs were detected primarily at the Florence sampling site (Figure 2). Blackside Darter, Johnny Darter, Quillback (Carpiodes cyprinus), and Rock Bass (Ambloplites rupestris) eggs were detected primarily at Alvinston and Oil Springs (Figure 2). Banded Killifish (Fundulus diaphanus) was only detected in Rondeau Bay.

Temporal patterns

Temporal spawning patterns in the dataset were identified based on the number of samples collected from April to August (Table 4). Egg samples were collected from April 19 to August 2 and larval samples were collected from April 20 to August 2 (Table 1). Rondeau Bay was omitted due to low sample size (n = 44, of which 19 samples were successfully processed). Species in larval samples were detected either at the same time as in the egg samples or in the following month (Table 4). Spawning months for species were similar across sampling sites (Table 4).
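Deriving months of detection from dates of capture amounts to grouping detections by species and life stage and collecting the calendar months in which they occur. A minimal sketch follows; the record layout, function name, and capture dates are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def detection_months(records):
    """Months in which each (species, life stage) was detected.

    records: iterable of (species, life_stage, capture_date) tuples.
    """
    months = defaultdict(set)
    for species, stage, captured in records:
        months[(species, stage)].add(captured.month)
    return {key: [MONTHS[m - 1] for m in sorted(found)]
            for key, found in months.items()}

# Made-up detections for illustration only.
records = [
    ("Species A", "egg", date(2017, 4, 19)),
    ("Species A", "egg", date(2017, 5, 3)),
    ("Species A", "larva", date(2017, 5, 19)),
    ("Species A", "larva", date(2017, 7, 12)),
]
by_stage = detection_months(records)
print(by_stage[("Species A", "egg")])    # ['April', 'May']
print(by_stage[("Species A", "larva")])  # ['May', 'July']
```

Keeping eggs and larvae as separate keys makes it straightforward to check whether larvae of a species first appear in the same month as its eggs or in the month that follows.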

Table 4. Comparison of spawning months for species detected in the Sydenham River (Alvinston, Florence, and Oil Springs).

| Common Name | Species Name | Florence: Egg | Alvinston and Oil Springs: Egg | Alvinston and Oil Springs: Larvae |
|---|---|---|---|---|
| Blackside Darter | Percina maculata | - | May | April–July |
| Bluegill | Lepomis macrochirus | July | - | June–July |
| Bluntnose Minnow | Pimephales notatus | July | July | May–August |
| Common Carp | Cyprinus carpio | June–July | May–July | May–July |
| Creek Chub | Semotilus atromaculatus | May | - | - |
| Eastern Sand Darter | Ammocrypta pellucida | June | - | June–August |
| Fantail Darter | Etheostoma flabellare | - | May | May |
| Freshwater Drum | Aplodinotus grunniens | June–July | May | June |
| Ghost Shiner / Mimic Shiner | Notropis buchanani / Notropis volucellus | - | July | May–August |
| Golden Redhorse | Moxostoma erythrurum | July | May | May–July |
| Greater Redhorse | Moxostoma valenciennesi | - | - | June |
| Greenside Darter | Etheostoma blennioides | April–July | May–July | May–July |
| Buffalo | Ictiobus spp. | - | May | - |
| Johnny Darter | Etheostoma nigrum | June | - | May |
| Lake Chubsucker | Erimyzon sucetta | - | - | May |
| Lake Trout | Salvelinus namaycush | - | - | May & July |
| Logperch | Percina caprodes | April–June | May | April–June |
| Longnose Gar | Lepisosteus osseus | - | - | June |
| Mooneye | Hiodon tergisus | - | April–May | May |
| Northern Hogsucker | Hypentelium nigricans | - | May | - |
| Northern Sunfish | Lepomis peltastes | - | - | July |
| Quillback | Carpiodes cyprinus | - | May | May–June |
| Redfin Shiner | Lythrurus umbratilis | - | - | May–July |
| River Darter | Percina shumardi | - | May | - |
| Rock Bass | Ambloplites rupestris | - | July | June–July |
| Round Goby | Neogobius melanostomus | - | May | May–June |
| Sea Lamprey | Petromyzon marinus | - | June | - |
| Shorthead Redhorse | Moxostoma macrolepidotum | May–July | May–July | May–July |
| Silver Redhorse | Moxostoma anisurum | June | - | May–July |
| Smallmouth Bass | Micropterus dolomieu | - | - | June–July |
| Spotfin Shiner | Cyprinella spiloptera | May–August | May–July | June–August |
| Spotted Sucker | Minytrema melanops | July | - | June–July |
| Stonecat | Noturus flavus | - | - | May–July |
| White Sucker | Catostomus commersonii | April–May | April–June | May–July |

Cost-effectiveness analysis

Excluding labour and infrastructure, the total cost of metabarcoding was approximately CAD$6597.33, while the cost to individually barcode the same number of individual eggs and larvae was estimated at CAD$62289.09 (Table 5). This converts to a per-individual cost of CAD$4.55 for individual-based barcoding and CAD$0.84 for metabarcoding. The cost of metabarcoding 1119 batch samples is CAD$5.88 per batch sample.

Table 5. Comparison of materials and supplies costs for barcoding and metabarcoding. Total barcoding cost is based on the price to barcode one specimen multiplied by the number of specimens in this study. The metabarcoding cost per sample was calculated based on the price to barcode the total number of specimens in the study divided by the number of individuals. Labour and infrastructure costs are not included. Costs for chemicals and materials commonly found in the lab and used in very low quantity were omitted.

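| Cost item | Barcoding: 1 individual | Barcoding: 13717 individuals | Metabarcoding: 1 individual | Metabarcoding: 1 batch sample | Metabarcoding: 13717 individuals |
|---|---|---|---|---|---|
| DNA extraction | 0.87 | 11876.46 | 0.12 | 0.87 | 981.43 |
| PCR | 0.16 | 2128.80 | 0.05 | 0.33 | 368.58 |
| Post-PCR clean up | 0.27 | 3703.59 | 0.29 | 2.01 | 2247.33 |
| Sequencing | 3.25 | 44580.25 | 0.38 | 2.68 | 3000.00 |
| Total | 4.55 | 62289.09 | 0.84 | 5.88 | 6597.33 |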

2.4 Discussion

Ichthyoplankton surveys of the Sydenham River and Rondeau Bay provided a snapshot of biodiversity in these aquatic habitats. Based on the egg and larval batch samples, 35 species that use these waterbodies for spawning were detected. Additionally, spawning months for each species detected in the batch samples were determined. The samples from the East Sydenham River at Florence, Alvinston, and Oil Springs contained species at risk and aquatic invasive species. These results provide insight into using metabarcoding as a biomonitoring tool for ichthyoplankton. In this study, metabarcoding was determined to be more cost-effective than barcoding. This study highlights the effectiveness of using metabarcoding to detect and identify early life stages of fishes in the Great Lakes basin.

Species detection

The potential of using a metabarcoding pipeline to determine species diversity in batch samples of ichthyoplankton was assessed in this study. Of the sequences, 99% were assigned a taxon. With this protocol, 34 species that spawn in the East Sydenham River and 8 species that spawn in the vegetated coastal region of Rondeau Bay were detected. In the positive controls, all species in the mock communities were detected. Similarly, multiple species were detected in mixed-species batch samples. Therefore, this metabarcoding protocol was determined to be an efficient method for detecting and identifying species in freshwater ichthyoplankton surveys.

Due to identical barcode regions, a few closely related species were indistinguishable using the methodologies in this study. In the barcode region of the COI sequence, River Carpsucker (Carpiodes carpio) and Quillback sequences are identical and, therefore, indistinguishable; however, River Carpsucker is not known from the Great Lakes basin (Page and Burr 2011; Roth et al. 2013). Therefore, it is highly unlikely that the detections are River Carpsucker, but rather Quillback, which is known to inhabit the Great Lakes basin (Roth et al. 2013). Mimic Shiner (Notropis volucellus) and Ghost Shiner (Notropis buchanani) had identical barcode regions as well. Ghost Shiner was previously considered a subspecies of Mimic Shiner (Mayden 1989), and the two are often mistaken for one another due to morphological similarities (Holm et al. 2010). Ghost Shiner is known to inhabit the Lake St. Clair drainage in Canada, and Mimic Shiner is widespread across Ontario, including the Lake St. Clair drainage (Holm and Houston 1993; Holm et al. 2010). Additionally, misidentification is possible due to the use of GenBank barcodes that are incorrectly identified, or simply because the two species are too closely related. Based on a mitochondrial DNA analysis, there are high rates of gene flow among buffaloes in the Great Lakes basin that, together with intermediate morphological characters, make the buffaloes difficult to identify to species (Bart et al. 2010). The species most readily distinguished morphologically (Holm et al. 2010), Bigmouth Buffalo, is known to inhabit the East Sydenham River (COSEWIC 2009a). A longer COI barcode or multiple barcodes from different molecular markers (Meusnier et al. 2008; Ardura et al. 2013) could possibly be used to differentiate between closely related species, although buffaloes would remain an exception.
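Species pairs that share an identical barcode can, in principle, be flagged in a reference library before detections are interpreted. The following is a minimal sketch of that check, not the pipeline used in this study; the function name and the short toy sequences are hypothetical and stand in for real COI barcodes read from a reference database.

```python
from collections import defaultdict


def group_identical_barcodes(reference):
    """Return groups of species whose reference barcodes are identical.

    `reference` maps a species name to its barcode sequence (a string).
    """
    by_sequence = defaultdict(list)
    for species, sequence in reference.items():
        # Compare case-insensitively so "acgt" and "ACGT" count as identical
        by_sequence[sequence.upper()].append(species)
    # Keep only sequences shared by two or more species
    return [sorted(names) for names in by_sequence.values() if len(names) > 1]


# Toy sequences invented for illustration only (not real COI data)
toy_reference = {
    "Carpiodes carpio": "ACGTTGCA",
    "Carpiodes cyprinus": "ACGTTGCA",
    "Notropis volucellus": "GGCATTAC",
    "Notropis buchanani": "GGCATTAC",
    "Catostomus commersonii": "TTGACCGA",
}
ambiguous_groups = group_identical_barcodes(toy_reference)
```

Any group returned by such a check would need to be resolved with additional evidence, such as known distributions or a second marker, as discussed above.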

Spatial patterns

In the Sydenham River, similar species were detected more frequently at Alvinston and Oil Springs, which was expected as the two sampling sites are ~6 km apart, whereas Florence, ~36 km downstream of Oil Springs, had a higher number of detections of other species (Figure 2). This is likely due to the habitat differences between Florence and Alvinston/Oil Springs. The Alvinston and Oil Springs sampling sites are predominantly bedrock (Parish Geomorphic Limited 2000) with areas of gravel and riffle habitat (Metcalfe-Smith et al. 2003). Commonly detected species in this area were Greenside Darter, Shorthead Redhorse, and White Sucker. The Sydenham River is surrounded by agricultural land, resulting in high volumes of agricultural runoff (Fisheries and Oceans Canada 2018a). The preferred habitat of Greenside Darter contains large filamentous algae, which are abundant in habitats exposed to organic pollution from fertilizer (Holm et al. 2010), such as the Sydenham River (Fisheries and Oceans Canada 2018a). Shorthead Redhorse releases its eggs over rubble in riverine habitats with riffles (Holm et al. 2010), and its larval habitat is associated with areas dominated by sand and gravel substrate (Lane et al. 1996). White Sucker deposits its eggs on rock, rubble, or gravel bottoms (Balon 1975), and its larvae are commonly detected in areas containing cobble, rubble, sand, and silt (Lane et al. 1996).

Conversely, the substrate at Florence is dominated by sand deposits with some clay, silt, and gravel, and the stream is vegetated in warmer months (Parish Geomorphic Limited 2000). Commonly detected species in Florence egg batch samples were Common Carp, Eastern Sand Darter, Freshwater Drum, Spotfin Shiner, and Spotted Sucker. Common Carp releases its eggs in freshly flooded plant material near muddy, hypoxic environments (Balon 1975). Eastern Sand Darter prefers sandy-bottomed riverbeds; spawning is thought to occur over mixed sand and gravel substrate (Johnston 1989), and larval habitat is assumed to be sand-dominated riverbeds (Lane et al. 1996). Spotfin Shiner prefers gravel or sand substrates, and males choose spawning sites in rock crevices or near fallen trees and stumps (Holm et al. 2010). Freshwater Drum and Spotted Sucker prefer larger habitats, such as Lake St. Clair (Holm et al. 2010), and may undertake spawning migrations into the Sydenham River (COSEWIC 2015). Freshwater Drum is a broadcast spawner whose eggs are released in open waters with constant flow to maximize survival (Balon 1975). Spotted Sucker releases its adhesive eggs near riffles in riverine habitats (Holm et al. 2010), and its larvae are commonly found in riverbeds consisting of sand (Lane et al. 1996).

Highly productive wetlands, such as the coastal regions of Rondeau Bay, are important in fish production; the wetlands function as spawning and nursery habitat for phytophilic species and as cover for juvenile fishes, and the abundance of larvae attracts predatory fishes (Herdendorf 1992). Rondeau Bay egg samples were collected from the vegetated coastal wetlands of the bay; therefore, species that depend on aquatic vegetation for reproduction were expected. Fishes in the phytophil reproductive guild deposit their adhesive eggs on submerged live or dead aquatic plants (Balon 1975). These fishes are adapted to habitats with dense vegetation, hypoxic conditions, and muddy beds (Balon 1975). Four of the nine species detected in Rondeau Bay egg batch samples, Bigmouth Buffalo, Banded Killifish, Common Carp, and Greenside Darter, belong to this vegetation-dependent reproductive guild (Balon 1975). Bluntnose Minnow creates nests in cavities of stones or wooden material and releases adhesive eggs (Holm et al. 2010); it is likely that these eggs were collected alongside the vegetation. Mooneye, Round Goby, Shorthead Redhorse, and White Sucker all release their eggs over rock surfaces (Holm et al. 2010); it is likely that these eggs were pulled alongside vegetation by chance during sampling and/or were drifting eggs from riverine species, as the sampling locations at Rondeau Bay were near the mouths of agricultural drains.

Spawning timing

Eggs and larvae were generally captured within known spawning periods (Eakins 2020; Figure 3; Table 3). However, some detections were earlier than previously reported. As fish spawning behaviour is correlated with the temperature of the environment (Kraak and Pankhurst 1997), this may be related to local variation in temperature or may indicate phenological changes related to increased water temperatures caused by climate change (Pankhurst and Munday 2011). Larval detections occurred either in the same month as the egg detections for the same species or in the following month (Table 4). Egg hatching time is generally temperature dependent, and the number of days to hatching may vary among species. At optimal temperatures, eggs of early-spring spawners generally require incubation periods of 10-20 days to hatch, whereas late-spring spawners require consistently high temperatures for fewer days (Teletchea et al. 2009).
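The comparison of capture months against published spawning periods can be expressed as a simple classification. The sketch below is illustrative only: the function name is hypothetical, months are coded as integers 1-12, the known spawning window is assumed not to wrap around the end of the calendar year, and the example months are invented rather than taken from Table 3.

```python
def classify_detection_months(detected_months, known_months):
    """Label each detected month relative to the known spawning window.

    Returns a dict mapping month -> "earlier", "within", or "later".
    Assumes the known window lies within a single calendar year.
    """
    first, last = min(known_months), max(known_months)
    labels = {}
    for month in sorted(set(detected_months)):
        if month < first:
            labels[month] = "earlier"
        elif month > last:
            labels[month] = "later"
        else:
            labels[month] = "within"
    return labels


# Hypothetical example: known spawning in May-June, detections in April and May
example = classify_detection_months([4, 5], [5, 6])
```

Detections labelled "earlier" or "later" would be the candidates for the temperature-related explanations considered above.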

Detecting species at risk and invasive species

Early life stages of species at risk or aquatic invasive species occur in higher abundance than adults; therefore, through passive surveillance (Simmons et al. 2016) of aquatic systems, metabarcoding has the potential to detect these elusive species. For many species at risk, the ecology of eggs and larvae is poorly understood. Knowledge gaps in most recovery strategies include the habitat requirements of early life stages; for example, those of Eastern Sand Darter (COSEWIC 2009b) and Lake Chubsucker (Fisheries and Oceans Canada 2017). Although invasive species may be new to these habitats, they have the potential to impact native fishes through predation, competition, disease, and habitat alteration (Crossman 1991). Detection of successfully reproducing populations of an invasive species may indicate an imminent threat and be important for informing management actions, such as eradication.

Four fish species at risk were detected in the batch samples. Eastern Sand Darter is listed under SARA as Threatened. According to COSEWIC, the species is thought to have been extirpated from 4 of the 11 locations that it has occupied historically in Ontario (COSEWIC 2009b). The Sydenham River provides critical habitat for Eastern Sand Darter between Strathroy and Alvinston (Dextrase et al. 2003; COSEWIC 2009b). Eastern Sand Darter was detected in the spawning months (Eakins 2020; Table 3) and spawning habitat previously reported (Johnston 1989). Spotted Sucker is listed under SARA as Special Concern. The Canadian population of this species is limited to southwestern Ontario, is known to inhabit the Sydenham River, and has been presumed to be in decline for the past century (COSEWIC 2014). Spotted Sucker eggs were detected in habitat similar to that previously reported (Holm et al. 2010) and in July, a month after the reported spawning months (Eakins 2020). Lake Chubsucker occurs in southwestern Ontario but has not been recorded in the Sydenham River (COSEWIC 2008; Staton et al. 2010). Lake Chubsucker is known to inhabit Lake St. Clair but, due to its limited dispersal ability (Staton et al. 2010), it is unlikely that it would disperse upstream to Oil Springs to spawn. Lake Chubsucker was detected in one sample; however, it was detected with few reads (Appendix B – Figure S1), and it is possible that this was a false-positive detection. In future studies, independent Sanger sequencing of the initial DNA extraction is recommended for confirming species detections with low read counts. River Darter, an Endangered species, is extremely rare and has been collected only 29 times in Ontario, including in the Sydenham River (Mandrak 2018). Spawning months for the Ontario population of River Darter are unknown, but the species is known to spawn in early spring based on populations in other Canadian provinces (Balesic 1971). In the batch samples, River Darter was detected in May, which is consistent with the reproductive patterns reported by Balesic (1971).
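The recommendation to confirm low-read detections could be supported by a simple screening step on the read-count table. The following is a minimal sketch under stated assumptions: the function name, the sample labels, the read counts, and the threshold of 10 reads are all hypothetical and were not used in this study; an appropriate threshold would have to be chosen from negative controls and sequencing depth.

```python
def flag_low_read_detections(read_counts, min_reads):
    """Return (sample, species, reads) tuples detected with fewer than min_reads reads.

    `read_counts` maps (sample, species) pairs to the number of assigned reads.
    Pairs with zero reads are not detections and are ignored.
    """
    flagged = []
    for (sample, species), reads in read_counts.items():
        if 0 < reads < min_reads:
            flagged.append((sample, species, reads))
    return sorted(flagged)


# Invented read counts for illustration only
toy_counts = {
    ("S01", "Lake Chubsucker"): 4,
    ("S01", "White Sucker"): 5200,
    ("S02", "Greenside Darter"): 880,
}
# The threshold of 10 reads is an arbitrary assumption
needs_confirmation = flag_low_read_detections(toy_counts, min_reads=10)
```

Detections flagged in this way would be candidates for independent Sanger sequencing rather than being accepted or discarded automatically.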

The Sydenham River is also home to other species at risk, such as Blackstripe Topminnow (Fundulus notatus), Grass Pickerel (Esox americanus vermiculatus), Northern Madtom (Noturus stigmosus), and Pugnose Minnow (Opsopoeodus emiliae) (Fisheries and Oceans Canada 2018a), and Rondeau Bay provides critical spawning habitat for Pugnose Shiner (Notropis anogenus) (Edwards et al. 2012), Spotted Gar (Lepisosteus oculatus) (Fisheries and Oceans Canada 2018b), and Warmouth (COSEWIC 2015). Blackstripe Topminnow, Pugnose Minnow, and Grass Pickerel deposit their adhesive eggs in densely vegetated areas (Edwards and Staton 2009; Holm et al. 2010), and Northern Madtom is known to release its eggs during the night; therefore, these species were not expected to be detected based on the sampling habitats and times used in the Sydenham River. Eastern Sand Darter, Pugnose Shiner, Spotted Gar, and Warmouth are known to spawn in highly vegetated areas like Rondeau Bay in summer (Edwards and Staton 2009; Edwards et al. 2012; Fisheries and Oceans Canada 2018b). It is possible that these species were not detected due to the small sample size and limited sampling dates for Rondeau Bay (Table 1).

Three invasive fish species were detected in the batch samples. Round Goby is invading the Sydenham River, which may pose a threat to species at risk such as Eastern Sand Darter (Poos et al. 2010; Fisheries and Oceans Canada 2012). Round Goby has been detected throughout the East Sydenham River using eDNA (Balasingham et al. 2017) and electrofishing (Poos et al. 2010), which is consistent with the detections in this study (Table 3). Round Goby was also detected in the Rondeau Bay samples. Round Goby was previously detected in this area and is a known threat to species at risk, such as Lake Chubsucker, in Rondeau Bay (Fisheries and Oceans Canada 2017). The Sydenham River was surveyed for Sea Lamprey larval recruitment in May 2012, and there was no evidence of larval lamprey (Adair and Sullivan 2013). Sea Lamprey migrate upstream from the Great Lakes to spawn, and spawning typically occurs in June and July (Manion and Hanson 1980); in the batch samples, Sea Lamprey eggs were detected in June (Table 3). Common Carp was detected abundantly across the river, and its foraging behaviour is known to increase the turbidity (Parkos et al. 2003) of the Sydenham River (Dextrase et al. 2003). Common Carp is described as a high-risk threat to imperiled species through habitat alteration (Almeida et al. 2013). Other potential invasive threats to the Sydenham River and Rondeau Bay include Bighead Carp (Hypophthalmichthys nobilis), Grass Carp (Ctenopharyngodon idella), Rudd (Scardinius erythrophthalmus), and Ruffe (Gymnocephalus cernua); however, there are no established populations of these species in these aquatic systems, and they were not detected in the batch samples (Kocovsky et al. 2012; OFAH/OMNRF Invading Species Awareness Program 2012a, 2012b).

Conservation relevance

The life history of early life stages of native fishes is poorly understood, likely as a result of the difficulty in identifying early life stages of closely related species and capturing them in the field. Detecting and understanding the early life stages of species can provide insight into adult populations in an aquatic system, including their preferred spawning habitat, reproductive timing, and recruitment success (Ahlstrom and Moser 1976; Lasker 1987; Miller et al. 1988). It is imperative that conservation efforts monitor and protect highly diverse habitats (Noss 1990), such as the Sydenham River. The inability to identify and understand life-history patterns of recruitment will hinder attempts at remediating fluvial ecosystems (Chovanec et al. 2003; Roni et al. 2010; Claudet et al. 2010). Monitoring the recovery of larval abundance following mitigation efforts is key to assessing the success of remediation (Cushing 1975; Dumont et al. 2011; McAdam et al. 2017). Based on the results of this study, DNA-based ichthyoplankton surveys show potential for identifying early life stages of fishes when monitoring critical habitats.

Identifying the habitat of species at risk and aquatic invasive species is crucial for the protection and restoration of imperiled species and for the early detection and management of invasive species. Effective recovery strategies and action plans include identifying critical spawning and nursery habitat of species at risk, which is generally identified as a knowledge gap in recovery strategies; for example, those for Eastern Sand Darter (COSEWIC 2009b) and Lake Chubsucker (Fisheries and Oceans Canada 2017). This study shows the potential of metabarcoding as a tool to detect spawning habitats and temporal spawning patterns of native fishes. As new threats emerge in the Great Lakes, such as the Asian carps (Cudmore et al. 2012, 2017), it becomes imperative to have efficient early detection tools for rapid responses to, and management of, potential threats. Currently, eDNA is used to detect the presence or absence of Asian carps at potential points of entry to the Great Lakes (Jerde et al. 2013). Use of eDNA alongside metabarcoding will improve the cost and time efficiency of early detection surveillance programs for aquatic invasive species (Borrell et al. 2017).

Cost effectiveness

Metabarcoding appears to be more cost-effective than individual-based barcoding for identification at a large sampling scale. Individual-based barcoding would cost CAD$4.55 per specimen (Table 5), whereas metabarcoding costs only CAD$0.88 per specimen. However, metabarcoding provides pooled species-presence results for batches, and it is not possible to calculate the abundance of species within batches. In contrast, individual-based barcoding provides more precise results because it provides the species identity of every individual in the analysis.
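The per-specimen comparison can be made explicit with a short calculation. The sketch below uses only the two per-specimen costs reported in this section; the function and variable names are hypothetical, and labour, sample collection, and bulk-purchase effects are deliberately left out.

```python
# Per-specimen materials and supplies costs reported in this section (CAD)
COST_BARCODING_PER_SPECIMEN = 4.55      # individual-based barcoding (Table 5)
COST_METABARCODING_PER_SPECIMEN = 0.88  # metabarcoding


def compare_costs(barcoding_cost, metabarcoding_cost):
    """Return (cost ratio, saving per specimen) for barcoding versus metabarcoding."""
    ratio = barcoding_cost / metabarcoding_cost
    saving = barcoding_cost - metabarcoding_cost
    return ratio, saving


ratio, saving = compare_costs(
    COST_BARCODING_PER_SPECIMEN, COST_METABARCODING_PER_SPECIMEN
)
print(f"Barcoding costs {ratio:.2f} times as much per specimen; "
      f"metabarcoding saves CAD${saving:.2f} per specimen.")
```

On these figures, individual-based barcoding is roughly five times as expensive per specimen, which is the basis for the cost advantage at large sampling scales.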

One other aspect of the cost calculations for metabarcoding is that some components must be purchased in volumes that do not exactly match the needs of differently sized studies. For example, primer stocks as purchased provide more material than was used in this study, but are costed here as if the exact amounts required could be purchased. Similarly, consumables for the PCR purification and quantification step are purchased in units of 960 samples. Thus, start-up costs will be higher than estimated here and, unless multiple projects are carried out, there will be inefficiencies in consumable expenses.
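The effect of fixed purchase units on start-up costs can be illustrated by rounding the required number of reactions up to whole units. This is a sketch only: the function name is hypothetical, and the example assumes one purification reaction per batch sample for the 1119 batch samples in this study, which may not match the actual number of reactions run.

```python
import math


def purchase_units(samples_needed, unit_size=960):
    """Return (units to buy, unused capacity) when consumables come in fixed units."""
    units = math.ceil(samples_needed / unit_size)
    unused = units * unit_size - samples_needed
    return units, unused


# Assumption: one reaction per batch sample (1119 batch samples)
units, unused = purchase_units(1119)
```

Under that assumption, two units would be needed and a large share of the second unit would remain unused unless it were shared with other projects.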

The calculations provided here do not account for sample collection costs or laboratory processing to remove debris; these costs are likely to be the same for both methods. However, the time required to process equivalent numbers of individuals will be less for metabarcoding; for the metabarcoding method used here, tissues from multiple specimens were homogenized and DNA was extracted in the same tube. For individual-based barcoding, each specimen would be separated and treated individually.

Based on these results, it is more time- and cost-effective to metabarcode when performing large-scale biomonitoring. Hulley et al. (2018) determined the costs of identifying larval fishes using barcoding (CAD$11.46 per individual) and morphological identification (CAD$13.69). Their barcoding cost was higher than the cost calculated in this study because they accounted for labour costs, and their study used relatively expensive QIAGEN™ kits for DNA extraction and post-PCR clean-up. The cost and time effectiveness of metabarcoding has been emphasized in several studies monitoring ballast water (Zaiko et al. 2015) and marine fisheries (Hansen et al. 2018), and comparing Sanger sequencing and high-throughput sequencing (Shokralla et al. 2015). The prices reported here are subject to change, as the cost analysis is based on 2020 prices. As the use of next-generation sequencing methods becomes more common, it is possible that these prices will continue to decrease, making metabarcoding even more cost-effective.

2.5 Conclusions

This study highlights the potential of a metabarcoding protocol as a highly effective and efficient method for the identification and biomonitoring of Great Lakes fishes, including their early life stages. The KGLF-F and KGLF-R primer pair is capable of amplifying a wide range of species found in the Great Lakes. Using this approach, 35 species were detected in ichthyoplankton survey samples from the Sydenham River and Rondeau Bay. These results provide important insights into the early life-history biology of fishes (e.g. spawning and nursery habitat) and the monitoring of fish species, including at-risk and invasive species. The cost-effectiveness of using a metabarcoding protocol for large-scale barcoding and biomonitoring was demonstrated in this study. Metabarcoding can be used in many different applications for fish conservation, and its efficiency will continue to improve with future technological advances.

References

Adair, R., and Sullivan, P. 2013. Sea lamprey control in the Great Lakes: Annual report to the Great Lakes Fishery Commission. Great Lakes Fishery Commission, Ann Arbor, Michigan.

Ahlstrom, E.H., and Moser, H. G. 1976. Eggs and larvae of fishes and their role in systematic investigations and in fisheries. Revue des Travaux de l'Institut des Peches Maritimes. 40(3): 379-398.

Alberdi, A., Aizpurua, O., Gilbert, M. T., and Bohmann, K. 2017. Scrutinizing key steps for reliable metabarcoding of environmental samples. Methods in Ecology and Evolution, 9(1), 134-147. doi:10.1111/2041-210x.12849

Almeida, D., Ribeiro, F., Leunda, P. M., Vilizzi, L., and Copp, G. H. 2013. Effectiveness of FISK, an invasiveness screening tool for non-native freshwater fishes, to perform risk identification assessments in the Iberian Peninsula. Risk Analysis, 33(8), 1404-1413. doi:10.1111/risa.12050

Ardura, A., Planes, S., and Garcia-Vazquez, E. 2013. Applications of DNA barcoding to fish landings: Authentication and diversity assessment. ZooKeys, 365: 49-65. doi:10.3897/zookeys.365.6409

Balasingham, K. D., Walter, R. P., Mandrak, N. E., and Heath, D. 2017. Environmental DNA detection of rare and invasive fish species in two Great Lakes tributaries. Molecular Ecology, 27(1), 112-127. doi:10.1111/mec.14395

Balesic, H. 1971. Comparative ecology of four species of darters (Etheostominae) in Lake Dauphin and its tributary, the Valley River. M.Sc. Thesis, University of Manitoba, Department of Zoology.

Balon, E. K. 1975. Reproductive guilds of fishes: A proposal and definition. Journal of the Fisheries Research Board of Canada. 32: 821-864.

Bart, H. L., Clements, M. D., Blanton, R. E., Piller, K. R., and Hurley, D. L. 2010. Discordant molecular and morphological evolution in buffalo fishes (Actinopterygii: Catostomidae). Molecular Phylogenetics and Evolution, 56(2): 808-820. doi:10.1016/j.ympev.2010.04.029

Becker, R. A., Sales, N. G., Santos, G. M., Santos, G. B., and Carvalho, D. C. 2015. DNA barcoding and morphological identification of Neotropical ichthyoplankton from the Upper Paraná and São Francisco. Journal of Fish Biology, 87(1): 159–168. doi: 10.1111/jfb.12707

Borrell, Y. J., Miralles, L., Huu, H. D., Mohammed-Geba, K., and Garcia-Vazquez, E. 2017. DNA in a bottle—Rapid metabarcoding survey for early alerts of invasive species in ports. Plos One, 12(9). doi:10.1371/journal.pone.0183347

Chovanec, A., Hofer, R., and Schiemer, F. 2003. Chapter 18 Fish as bioindicators. In Markert et al. (Eds.), Bioindicators & Biomonitors - Principles, Concepts and Applications, 6: 639-676.

Claudet, J., Osenberg, C. W., Domenici, P., Badalamenti, F., Milazzo, M., Falcón, J. M., . . . Planes, S. 2010. Marine reserves: Fish life history and ecological traits matter. Ecological Applications, 20(3): 830-839. doi:10.1890/08-2131.1

COSEWIC. 2008. COSEWIC assessment and update status report on the Lake Chubsucker Erimyzon sucetta in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. vi + 29 pp.

COSEWIC. 2009a. COSEWIC assessment and update status report on the Bigmouth Buffalo Ictiobus cyprinellus, Great Lakes - Upper St. Lawrence populations and Saskatchewan - Nelson River populations, in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. vii + 40 pp

COSEWIC. 2009b. COSEWIC assessment and status report on the Eastern Sand Darter Ammocrypta pellucida, Ontario populations and Quebec populations, in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. vii + 49 pp

COSEWIC. 2014. COSEWIC status appraisal summary on the Spotted Sucker Minytrema melanops in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. xvi pp.

COSEWIC. 2015. COSEWIC assessment and status report on the Warmouth Lepomis gulosus in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. x + 47 pp.

Crossman, E. J. 1991. Introduced freshwater fishes: a review of the North American perspective with emphasis on Canada. Canadian Journal of Fisheries and Aquatic Sciences, 48(S1), 46-57. doi:10.1139/f91-303

Cruaud, P., Rasplus, J.-Y., Rodriguez, L. J., and Cruaud, A. 2017. High-throughput sequencing of multiple amplicons for barcoding and integrative taxonomy. Scientific Reports, 7(1). doi:10.1038/srep41948

Cudmore, B., Mandrak, N.E., Dettmers, J., Chapman, D.C., and Kolar, C.S. 2012. Binational ecological risk assessment of bigheaded carps (Hypophthalmichthys spp.) for the Great Lakes basin. DFO Canadian Science Advisory Secretariat Science Advisory Document. 2011/114.

Cudmore, B., Jones, L.A., Mandrak, N.E., Dettmers, J.M., Chapman, D.C., Kolar, C.S, and Conover, G. 2017. Ecological Risk Assessment of Grass Carp (Ctenopharyngodon idella) for the Great Lakes Basin. DFO Canadian Science Advisory Secretariat Research Document, 2016/118. vi + 115 p.

Cushing, D. H. 1975. Marine ecology and fisheries. Cambridge, UK: Cambridge University Press.

Dextrase, A.J., S.K. Staton, and Metcalfe-Smith, J.L. 2003. National recovery strategy for species at risk in the Sydenham River: an ecosystem approach. National Recovery Plan No. 25. Recovery of Nationally Endangered Wildlife (RENEW). Ottawa, Ontario. 73 pp.

Dumont, P., D'Amours, J., Thibodeau, S., Dubuc, N., Verdon, R., Garceau, S., Bilodeau, P., Mailhot, Y., and Fortin, R. 2011. Effects of the development of a newly created spawning ground in the Des Prairies River (Quebec, Canada) on the reproductive success of lake sturgeon (Acipenser fulvescens). Journal of Applied Ichthyology, 27: 394-404.

Eakins, R. J. 2020. Ontario Freshwater Fishes Life History Database. Version 5.02. Online database. (http://www.ontariofishes.ca), accessed 04 July 2020

Edwards, A., Barnucz, J., and Mandrak, N.E. 2006. Fish assemblage surveys of Rondeau Bay, Ontario: 2004 and 2005. Canadian Manuscript Report of Fisheries and Aquatic Science. 2773: v + 43 pp.

Edwards, A.L. and S.K. Staton. 2009. Management plan for the Blackstripe Topminnow, Pugnose Minnow, Spotted Sucker and Warmouth in Canada. Species at Risk Act Management Plan Series. Fisheries and Oceans Canada, Ottawa. viii + 43 pp.

Edwards, A.L., Matchett, S.P., Doherty, A. and Staton, S.K. 2012. Recovery strategy for the Pugnose Shiner (Notropis anogenus) in Canada (Proposed). Species at Risk Act Recovery Strategy Series. Fisheries and Oceans Canada, Ottawa ON. x+72 p.

Evans, N. T., Olds, B. P., Renshaw, M. A., Turner, C. R., Li, Y., Jerde, C. L., . . . Lodge, D. M. 2015. Quantification of mesocosm fish and amphibian species diversity via environmental DNA metabarcoding. Molecular Ecology Resources, 16(1): 29-41. doi:10.1111/1755-0998.12433

Fadrosh, D. W., Ma, B., Gajer, P., Sengamalay, N., Ott, S., Brotman, R. M., and Ravel, J. 2014. An improved dual-indexing approach for multiplexed 16S rRNA gene sequencing on the Illumina MiSeq platform. Microbiome, 2(1), 6. doi:10.1186/2049-2618-2-6

Fisheries and Oceans Canada. 2012. Recovery strategy for the Eastern Sand Darter (Ammocrypta pellucida) in Canada. Species at Risk Act Recovery Strategy Series, Fisheries and Oceans Canada, Ottawa. vii + 56 pp.

Fisheries and Oceans Canada. 2017. Report on the progress of Recovery Strategy Implementation for the Lake Chubsucker (Erimyzon sucetta) in Canada for the Period 2010 – 2015. Species at Risk Act Recovery Strategy Report Series. Fisheries and Oceans Canada, Ottawa. iii+ 31 pp.

Fisheries and Oceans Canada. 2018a. Action plan for the Sydenham River in Canada: an ecosystem approach. Species at Risk Act Action Plan Series. Fisheries and Oceans Canada, Ottawa. iv + 36 pp.

Fisheries and Oceans Canada. 2018b. Report on the Progress of Recovery Strategy Implementation for the Spotted Gar (Lepisosteus oculatus) in Canada for the Period 2012 – 2017. Species at Risk Act Recovery Strategy Report Series. Fisheries and Oceans Canada, Ottawa. iv + 23 pp.

Foster, P. G., Bergo, E. S., Bourke, B. P., Oliveira, T. M., Nagaki, S. S., Sant’Ana, D. C., and Sallum, M. A. 2013. Phylogenetic analysis and DNA-based species confirmation in Anopheles (Nyssorhynchus). PLoS ONE, 8(2). doi:10.1371/journal.pone.0054063

Hansen, B. K., Bekkevold, D., Clausen, L. W., and Nielsen, E. E. 2018. The sceptical optimist: Challenges and perspectives for the application of environmental DNA in marine fisheries. Fish and Fisheries, 19(5): 751-768. doi:10.1111/faf.12286

Hatzenbuhler, C., Kelly, J. R., Martinson, J., Okum, S., and Pilgrim, E. 2017. Sensitivity and accuracy of high-throughput metabarcoding methods for early detection of invasive fish species. Scientific Reports, 7(1). doi:10.1038/srep46393

Hebert, P. D. N., Cywinska, A., Ball, S. L., and Dewaard, J. R. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1512): 313–321. doi: 10.1098/rspb.2002.2218

Herdendorf, C. E. 1992. Lake Erie coastal wetlands: An overview. Journal of Great Lakes Research, 18(4), 533-551. doi:10.1016/s0380-1330(92)71321-5

Holm, E., and Houston, J. 1993. Status of the ghost shiner, Notropis buchanani, in Canada. Canadian Field-Naturalist, 107: 440–445.

Holm, E., Mandrak, N. E., and Burridge, M. 2010. The ROM field guide to freshwater fishes of Ontario. Second Edition. Royal Ontario Museum, Toronto, ON.

Hulley, E. N., Taylor, N. D., Zarnke, A. M., Somers, C. M., Manzon, R. G., Wilson, J. Y., and Boreham, D. R. 2018. DNA barcoding vs. morphological identification of larval fish and embryos in Lake Huron: Advantages to a molecular approach. Journal of Great Lakes Research, 44(5): 1110-1116. doi:10.1016/j.jglr.2018.07.013

Jerde, C. L., Chadderton, W. L., Mahon, A. R., Renshaw, M. A., Corush, J., Budny, M. L., . . . Lodge, D. M. 2013. Detection of Asian carp DNA as part of a Great Lakes basin-wide surveillance program. Canadian Journal of Fisheries and Aquatic Sciences, 70(4), 522-526. doi:10.1139/cjfas-2012-0478

Johnston, C.E. 1989. Spawning in the Eastern Sand Darter, Ammocrypta pellucida (Pisces: Percidae) with comments on the phylogeny of Ammocrypta and related taxa. Transactions of the Illinois Academy of Sciences, 82(3-4): 163-168.

Lane, P.A., Portt, C.B., and Minns, C.K. 1996. Nursery habitat characteristics of Great Lakes fishes. Canadian Manuscript Report of Fisheries and Aquatic Science. 2338: v + 42p.

Lasker, R. 1987. Use of fish eggs and larvae in probing some major problems in fisheries and aquaculture. American Fisheries Society Symposium, 2:1-16.

Loh, W. K. W., Bond, P., Ashton, K. J., Roberts, D. T., and Tibbetts, I. R. 2014. DNA barcoding of freshwater fishes and the development of a quantitative qPCR assay for the species-specific detection and quantification of fish larvae from plankton samples. Journal of Fish Biology, 85(2): 307–328. doi: 10.1111/jfb.12422

Lujan, N. K., Weir, J. T., Noonan, B. P., Lovejoy, N. R., and Mandrak, N. E. 2020. Is Niagara Falls a barrier to gene flow in riverine fishes? A test using genome-wide SNP data from seven native species. Molecular Ecology, 29(7), 1235-1249. doi:10.1111/mec.15406

Kocovsky, P. M., Chapman, D. C., and Mckenna, J. E. 2012. Thermal and hydrologic suitability of Lake Erie and its major tributaries for spawning of Asian carps. Journal of Great Lakes Research, 38(1), 159-166. doi:10.1016/j.jglr.2011.11.015

Kraak, G. V., and Pankhurst, N. W. 1997. Temperature effects on the reproductive performance of fish. In: Wood, C.M., and McDonald, D.G. editors. Global warming: implications for freshwater and marine fish. Cambridge University Press, Cambridge. 159-176.

Mandrak, N.E. 2018. Recovery Strategy for the River Darter (Percina shumardi) - Great Lakes – Upper St. Lawrence populations in Ontario. Ontario Recovery Strategy Series. Prepared for the Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario. v + 24 pp.

Manion, P.J., and Hanson, L. H. 1980. Spawning behavior and fecundity of lampreys from the upper three Great Lakes. Canadian Journal of Fisheries and Aquatic Sciences. 37: 1635-1640.

Mariac, C., Vigouroux, Y., Duponchelle, F., García-Dávila, C., Nunez, J., Desmarais, E., and Renno, J. 2018. Metabarcoding by capture using a single COI probe (MCSP) to identify and quantify fish species in ichthyoplankton swarms. Plos One, 13(9). doi:10.1371/journal.pone.0202976

Mayden, R. L. 1989. Phylogenetic studies of North American minnows, with emphasis on the genus Cyprinella (Teleostei: Cypriniformes). University of Kansas Museum of Natural History, Lawrence, Miscellaneous Publication 80. 189 pp.

McAdam, S. O., Crossman, J. A., Williamson, C., St-Onge, I., Dion, R., Manny, B. A., and Gessner, J. 2017. If you build it, will they come? Spawning habitat remediation for sturgeon. Journal of Applied Ichthyology, 34(2): 258-278. doi:10.1111/jai.13566

McKnight, D. T., Huerlimann, R., Bower, D. S., Schwarzkopf, L., Alford, R. A., and Zenger, K. R. 2019. MicroDecon: A highly accurate read-subtraction tool for the post-sequencing removal of contamination in metabarcoding studies. Environmental DNA, 1(1), 14-25. doi:10.1002/edn3.11

Metcalfe-Smith, J. L., Maio, J. D., Staton, S. K., and Desolla, S. R. 2003. Status of the freshwater mussel communities of the Sydenham River, Ontario, Canada. The American Midland Naturalist, 150(1), 37-50. doi:10.1674/0003-0031(2003)150[0037:sotfmc]2.0.co;2

Meusnier, I., Singer, G. A., Landry, J., Hickey, D. A., Hebert, P. D., and Hajibabaei, M. 2008. A universal DNA mini-barcode for biodiversity analysis. BMC Genomics, 9(1): 214. doi:10.1186/1471-2164-9-214

Miller, T. J., Crowder, L. B., Rice, J. A., and Marschall, E. A. 1988. Larval size and recruitment mechanisms in fishes: Toward a conceptual framework. Canadian Journal of Fisheries and Aquatic Sciences, 45(9), 1657-1670. doi:10.1139/f88-197

Montgomery, F., Reid, S. M., and Mandrak, N. E. 2020. Extinction debt of fishes in Great Lakes coastal wetlands. Biological Conservation, 241, 108386. doi:10.1016/j.biocon.2019.108386


Murali, A., Bhargava, A., & Wright, E. S. (2018). IDTAXA: A novel approach for accurate taxonomic classification of microbiome sequences. Microbiome, 6(1). doi:10.1186/s40168-018-0521-5

Nguyen, N. H., Smith, D., Peay, K., and Kennedy, P. 2014. Parsing ecological signal from noise in next generation amplicon sequencing. New Phytologist, 205(4), 1389-1393. doi:10.1111/nph.12923

Noss, R. F. 1990. Indicators for monitoring biodiversity: A hierarchical approach. Conservation Biology, 4(4): 355-364. doi:10.1111/j.1523-1739.1990.tb00309.x

OFAH/OMNRF Invading Species Awareness Program. (2012a). Eurasian Ruffe. Retrieved from: www.invadingspecies.com.

OFAH/OMNRF Invading Species Awareness Program. (2012b). Rudd. Retrieved from: www.invadingspecies.com.

Olds, B. P., Jerde, C. L., Renshaw, M. A., Li, Y., Evans, N. T., Turner, C. R., … Lamberti, G. A. 2016. Estimating species richness using environmental DNA. Ecology and Evolution, 6(12): 4214–4226. doi: 10.1002/ece3.2186

Page, L.M. and Burr, B.M. 2011. Peterson field guide to freshwater fishes of North America north of Mexico. Second edition. Houghton Mifflin Company, Boston. p. 432.

Pankhurst, N. W., & Munday, P. L. 2011. Effects of climate change on fish reproduction and early life history stages. Marine and Freshwater Research, 62(9), 1015. doi:10.1071/mf10269

Parkos, J.J., Santucci, V.J., and Wahl, D.H. 2011. Effects of adult common carp (Cyprinus carpio) on multiple trophic levels in shallow mesocosms. Canadian Journal of Fisheries and Aquatic Sciences, 60: 182-192. doi:10.1139/f03-011

Parish Geomorphic Limited, 2000. Sydenham River: Fluvial geomorphology assessment. Ontario Ministry of Natural Resources and St Clair Conservation Authority.

Poos, M., Dextrase, A. J., Schwalb, A. N., & Ackerman, J. D. 2010. Secondary invasion of the round goby into high diversity Great Lakes tributaries and species at risk hotspots: Potential new concerns for endangered freshwater species. Biological Invasions, 12(5), 1269-1284. doi:10.1007/s10530-009-9545-x

Roth, B., Mandrak, N.E., Hrabik, T., Sass, G., and Peters, J. 2013. Fishes and decapod crustaceans of the Great Lakes Basin. In Taylor, W.W., Lynch, A.J., and Leonard, N.K., editors. Great Lakes Fisheries Policy and Management: A Binational Perspective: 105-136. Michigan State University Press, East Lansing, MI.

Robson, H. L., Noble, T. H., Saunders, R. J., Robson, S. K., Burrows, D. W., and Jerry, D. R. 2016. Fine-tuning for the tropics: Application of eDNA technology for invasive fish detection in tropical freshwater ecosystems. Molecular Ecology Resources, 16(4): 922-932. doi:10.1111/1755-0998.12505

Roni, P., Pess, G., Beechie, T., and Morley, S. 2010. Estimating changes in Coho salmon and Steelhead abundance from watershed restoration: how much restoration is needed to measurably increase smolt production? North American Journal of Fisheries Management, 30(6), 1469-1484. doi:10.1577/m09-162.1

Schloss, P. D., Westcott, S. L., Ryabin, T., Hall, J. R., Hartmann, M., Hollister, E. B., . . . Weber, C. F. (2009). Introducing mothur: Open-Source, Platform-Independent, Community-Supported Software for Describing and Comparing Microbial Communities. Applied and Environmental Microbiology, 75(23), 7537-7541. doi:10.1128/aem.01541-09

Shokralla, S., Porter, T. M., Gibson, J. F., Dobosz, R., Janzen, D. H., Hallwachs, W., . . . Hajibabaei, M. 2015. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform. Scientific Reports, 5(1). doi:10.1038/srep09687

Simmons, M., Tucker, A., Chadderton, W. L., Jerde, C. L., & Mahon, A. R. 2016. Active and passive environmental DNA surveillance of aquatic invasive species. Canadian Journal of Fisheries and Aquatic Sciences, 73(1), 76-83. doi:10.1139/cjfas-2015-0262

Staton, S.K., K.L. Vlasman, and A.L. Edwards. 2010. Recovery strategy for the Lake Chubsucker (Erimyzon sucetta) in Canada. Species at Risk Act Recovery Strategy Series, Fisheries and Oceans Canada, Ottawa. vi + 49 pp.


Stepien, C. A., Elz, A. E., & Snyder, M. R. 2018. Invasion genetics of the silver carp (Hypophthalmichthys molitrix) across North America: Differentiation of fronts, introgression, and eDNA detection. Plos One, 14(3), doi:10.1101/392704

Taberlet, P., Coissac, E., Hajibabaei, M., and Rieseberg, L. H. 2012. Environmental DNA. Molecular Ecology, 21(8): 1789–1793. doi: 10.1111/j.1365-294x.2012.05542.x

Teletchea, F., Gardeur, J., Kamler, E., and Fontaine, P. 2009. The relationship of oocyte diameter and incubation temperature to incubation time in temperate freshwater fish species. Journal of Fish Biology, 74(3): 652-668. doi:10.1111/j.1095-8649.2008.02160.x

Thomsen, P. F., and Willerslev, E. 2015. Environmental DNA – An emerging tool in conservation for monitoring past and present biodiversity. Biological Conservation, 183, 4-18. doi:10.1016/j.biocon.2014.11.019

Valdez-Moreno, M., Ivanova, N. V., Elías-Gutiérrez, M., Pedersen, S. L., Bessonov, K., and Hebert, P. D. 2019. Using eDNA to biomonitor the fish community in a tropical oligotrophic lake. doi:10.1101/375089

Wang, Q., Garrity, G. M., Tiedje, J. M., and Cole, J. R. 2007. Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73(16), 5261-5267. doi:10.1128/aem.00062-07

Zaiko, A., Martinez, J. L., Schmidt-Petersen, J., Ribicic, D., Samuiloviene, A., and Garcia-Vazquez, E. 2015. Metabarcoding approach for the ballast water surveillance – An advantageous solution or an awkward challenge? Marine Pollution Bulletin, 92(1-2): 25-34. doi:10.1016/j.marpolbul.2015.01.008


Chapter 3 General Conclusions

3.1 Conclusion of the study

This study aimed to evaluate the potential of metabarcoding as a new method to identify ichthyoplankton samples from fresh waters. By metabarcoding larval and egg batch samples from the Sydenham River and Rondeau Bay watersheds, I detected 34 species in the Sydenham River and 8 species in Rondeau Bay. The number of species detected is limited by the time of sampling, the number of samples, and the locations sampled. I identified the spawning months of these species, which were generally consistent with previously reported spawning months but extended the known spawning periods for some species. I detected three species at risk and three invasive species. Detecting the spawning habitats of such species is key to conserving imperilled species and managing invasive species. For large-scale identification, metabarcoding is cheaper than single-specimen barcoding: at the sample size used in this study, I determined the cost of metabarcoding to be $6597.33, compared to $62289.09 for individual-based barcoding.
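The cost comparison can be restated as a ratio and a relative saving. The following is a minimal sketch that uses only the two totals reported in this study; the variable names are illustrative and the currency is taken as reported.

```python
# Cost totals reported in this study (dollars, as reported).
METABARCODING_COST = 6597.33
INDIVIDUAL_BARCODING_COST = 62289.09

# How many times more expensive individual-based barcoding is.
cost_ratio = INDIVIDUAL_BARCODING_COST / METABARCODING_COST

# Absolute and relative saving from using metabarcoding instead.
absolute_saving = INDIVIDUAL_BARCODING_COST - METABARCODING_COST
relative_saving = absolute_saving / INDIVIDUAL_BARCODING_COST

print(f"Cost ratio (individual-based / metabarcoding): {cost_ratio:.1f}")
print(f"Saving: ${absolute_saving:.2f} ({relative_saving:.0%} of the individual-based cost)")
```

Under these totals, individual-based barcoding costs roughly nine times as much as metabarcoding for the same set of samples.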

3.2 Experimental design

After analysing my data, I noted some limitations of my study and ways to improve the metabarcoding protocol. The first issue was the inability to differentiate between some closely related species. This can be resolved through use of a longer barcode (Meusnier et al. 2008) or, more efficiently, through use of multiple genomic markers, such as 12S rDNA and cyt b (Ardura et al. 2013). Contamination was another issue because of the low concentration of genomic DNA used in metabarcoding (Nguyen et al. 2014). I used both negative and positive controls to assess potential contamination in my results. Using the negative controls, I corrected for contamination as follows: for each species in each sequencing run, I took the maximum number of reads present in the negative controls and subtracted that number of reads from my detections of that species in the same run (similar to the method of McKnight et al. 2019).
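The read-subtraction step can be sketched in a few lines of Python. This is an illustration rather than the pipeline used in the thesis: the function names are hypothetical, the batch sample is invented, and it assumes that the values in Table A-S6 are the per-run maxima across negative controls and that a species whose corrected count falls to zero or below is dropped.

```python
# Negative-control read counts per sequencing run, taken from Table A-S6
# (only three species are shown to keep the example short).
NEGATIVE_CONTROL_MAX = {
    1: {"Greenside Darter": 17, "Shorthead Redhorse": 21, "Spotfin Shiner": 8},
    2: {"Greenside Darter": 8, "Shorthead Redhorse": 199, "Spotfin Shiner": 264},
}


def max_negative_reads(negative_controls):
    """Return the per-species maximum read count across negative controls."""
    maxima = {}
    for control in negative_controls:
        for species, reads in control.items():
            maxima[species] = max(maxima.get(species, 0), reads)
    return maxima


def correct_sample(sample_reads, run):
    """Subtract the run-specific negative-control maximum from each species."""
    thresholds = NEGATIVE_CONTROL_MAX[run]
    corrected = {}
    for species, reads in sample_reads.items():
        remaining = reads - thresholds.get(species, 0)
        # Assumption: detections with no reads left after correction are dropped.
        if remaining > 0:
            corrected[species] = remaining
    return corrected


# Hypothetical batch sample from sequencing run 2 (read counts are invented).
sample = {"Spotfin Shiner": 300, "Shorthead Redhorse": 150, "Yellow Perch": 12}
print(correct_sample(sample, run=2))
```

In this invented sample, Shorthead Redhorse is removed because its 150 reads do not exceed the 199 reads seen in the run 2 negative controls, while Yellow Perch is unchanged because it never appeared in a negative control.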

A few detections had low read counts. A low read count could be due to contamination not accounted for by my methods or to a rare species in a batch sample containing multiple species. I chose to retain these detections because I had already taken a conservative approach to filtering contamination and PCR/sequencing error. To confirm these results, I would need to amplify these samples individually with species-specific primers and sequence the products using Sanger sequencing.

3.3 Significance of the results

This study shows the potential of metabarcoding to identify species diversity in aquatic systems cost-effectively. The ability to detect and identify ichthyoplankton is limited by expertise in ichthyoplankton taxonomy (Hebert et al. 2003). This barrier is effectively removed by DNA-based identification systems such as metabarcoding. However, a limitation of metabarcoding is the difficulty of identifying recently diverged species whose barcode sequences are too similar. Nonetheless, I was able to identify 99% of my detections to the species level. A common knowledge gap in most recovery potential assessments is the life history and critical habitat of the larval and egg stages of imperilled species (e.g. COSEWIC 2009). The ability to identify ichthyoplankton at a massive sampling scale could accelerate the process of locating and understanding the life-history patterns and critical habitats of these species. In my study, I was able to detect three species at risk in the Sydenham River and Rondeau Bay watersheds, and to determine their spawning months based on my results. Previous studies have shown the potential of metabarcoding as an early detection method for invasive species (Westfall et al. 2019). My study detected Common Carp, Round Goby, and Sea Lamprey through metabarcoding, which shows the potential of metabarcoding to detect spawning locations of invasive species in the Great Lakes. Overall, metabarcoding can be used as an effective tool in conservation and management for the detection of Great Lakes fishes.

References

Ardura, A., Planes, S., and Garcia-Vazquez, E. 2013. Applications of DNA barcoding to fish landings: Authentication and diversity assessment. ZooKeys, 365: 49-65. doi:10.3897/zookeys.365.6409

COSEWIC. 2009. COSEWIC assessment and status report on the Eastern Sand Darter Ammocrypta pellucida, Ontario populations and Quebec populations, in Canada. Committee on the Status of Endangered Wildlife in Canada. Ottawa. vii + 49 pp

Hebert, P. D. N., Cywinska, A., Ball, S. L., and Dewaard, J. R. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1512): 313–321. doi: 10.1098/rspb.2002.2218

McKnight, D. T., Huerlimann, R., Bower, D. S., Schwarzkopf, L., Alford, R. A., and Zenger, K. R. 2019. MicroDecon: A highly accurate read-subtraction tool for the post-sequencing removal of contamination in metabarcoding studies. Environmental DNA, 1(1), 14-25. doi:10.1002/edn3.11

Meusnier, I., Singer, G. A., Landry, J., Hickey, D. A., Hebert, P. D., and Hajibabaei, M. 2008. A universal DNA mini-barcode for biodiversity analysis. BMC Genomics, 9(1): 214. doi:10.1186/1471-2164-9-214

Nguyen, N. H., Smith, D., Peay, K., & Kennedy, P. 2014. Parsing ecological signal from noise in next generation amplicon sequencing. New Phytologist, 205(4), 1389-1393. doi:10.1111/nph.12923

Westfall, K. M., Therriault, T. W., & Abbott, C. L. (2019). A new approach to molecular biosurveillance of invasive species using DNA metabarcoding. Global Change Biology, 26(2), 1012-1022. doi:10.1111/gcb.14886


Appendices

Appendix A: Supplementary material for methods

Table A-S1. PCR1 primer list. Modified primers containing PCR primer, heterogeneity spacers, and linking primer region.

Primer Name   Primer

Forward
R2D2COIF_A    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_B    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_C    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_D    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGAGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_E    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCAGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_F    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGCAGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_G    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTGCAGCTATTTGGTGCCTGAGCCGGRATRGT
R2D2COIF_H    TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGATGCAGCTATTTGGTGCCTGAGCCGGRATRGT

Reverse
C3POCOIR_1    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_2    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_3    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_4    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_5    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_6    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_7    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_8    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_9    GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAGCTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_10   GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAAGCTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_11   GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGAAGCTTTCGCAGAAGCTTATRTTATTTATYCG
C3POCOIR_12   GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAGAAGCTTTCGCAGAAGCTTATRTTATTTATYCG
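The three-part structure of the PCR1 primers can be made explicit by splitting a primer into linking region, heterogeneity spacer, and PCR primer. The sketch below is illustrative only: it assumes that the linking region is the prefix shared by all forward primers in Table A-S1 and that the PCR primer is their shared suffix, and the function name is hypothetical.

```python
# Assumed shared prefix (linking primer region) and shared suffix (PCR primer)
# of the forward primers listed in Table A-S1.
FORWARD_LINKER = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
FORWARD_PCR_PRIMER = "TATTTGGTGCCTGAGCCGGRATRGT"

# Three of the eight forward primers, copied from Table A-S1.
FORWARD_PRIMERS = {
    "R2D2COIF_A": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTATTTGGTGCCTGAGCCGGRATRGT",
    "R2D2COIF_D": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGAGCTATTTGGTGCCTGAGCCGGRATRGT",
    "R2D2COIF_H": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGATGCAGCTATTTGGTGCCTGAGCCGGRATRGT",
}


def extract_spacer(full_primer, linker, pcr_primer):
    """Return the heterogeneity spacer between the linker and the PCR primer."""
    if len(full_primer) < len(linker) + len(pcr_primer):
        raise ValueError("primer is shorter than linker plus PCR primer")
    if not full_primer.startswith(linker) or not full_primer.endswith(pcr_primer):
        raise ValueError("primer does not match the expected linker/PCR primer")
    return full_primer[len(linker):len(full_primer) - len(pcr_primer)]


for name, sequence in FORWARD_PRIMERS.items():
    spacer = extract_spacer(sequence, FORWARD_LINKER, FORWARD_PCR_PRIMER)
    print(f"{name}: spacer length {len(spacer)} ({spacer or 'none'})")
```

Read this way, the forward primers differ only in spacer length, which staggers the start of the amplicon across samples during sequencing.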


Table A-S2. PCR2 primer list. PCR2 primers consist of half of the sequencing primer, index (in red), and Illumina adaptor. Different combinations of i7 and i5 index primers were used to assign IDs to PCR1 product.

Primer Name    Primer

PCR2 i7 Index Primers
D701           CAAGCAGAAGACGGCATACGAGATCGAGTAATGTGACTGGAGTTCAG
D702           CAAGCAGAAGACGGCATACGAGATTCTCCGGAGTGACTGGAGTTCAG
D703           CAAGCAGAAGACGGCATACGAGATAATGAGCGGTGACTGGAGTTCAG
D704           CAAGCAGAAGACGGCATACGAGATGGAATCTCGTGACTGGAGTTCAG
D705           CAAGCAGAAGACGGCATACGAGATTTCTGAATGTGACTGGAGTTCAG
D706           CAAGCAGAAGACGGCATACGAGATACGAATTCGTGACTGGAGTTCAG
D707           CAAGCAGAAGACGGCATACGAGATAGCTTCAGGTGACTGGAGTTCAG
D708           CAAGCAGAAGACGGCATACGAGATGCGCATTAGTGACTGGAGTTCAG
D709           CAAGCAGAAGACGGCATACGAGATCATAGCCGGTGACTGGAGTTCAG
D710           CAAGCAGAAGACGGCATACGAGATTTCGCGGAGTGACTGGAGTTCAG
D711           CAAGCAGAAGACGGCATACGAGATGCGCGAGAGTGACTGGAGTTCAG
D712           CAAGCAGAAGACGGCATACGAGATCTATCGCTGTGACTGGAGTTCAG
iTru7_102_01   CAAGCAGAAGACGGCATACGAGATCGCCTTATGTGACTGGAGTTCAG
iTru7_102_02   CAAGCAGAAGACGGCATACGAGATCAGGTAAGGTGACTGGAGTTCAG
iTru7_102_03   CAAGCAGAAGACGGCATACGAGATTTGCAACGGTGACTGGAGTTCAG
iTru7_102_04   CAAGCAGAAGACGGCATACGAGATGCTGAATCGTGACTGGAGTTCAG
iTru7_102_05   CAAGCAGAAGACGGCATACGAGATGAACGTGAGTGACTGGAGTTCAG
iTru7_102_06   CAAGCAGAAGACGGCATACGAGATAACGCACAGTGACTGGAGTTCAG
iTru7_102_07   CAAGCAGAAGACGGCATACGAGATCGCAACTAGTGACTGGAGTTCAG
iTru7_102_08   CAAGCAGAAGACGGCATACGAGATTGGCTCTTGTGACTGGAGTTCAG
iTru7_102_09   CAAGCAGAAGACGGCATACGAGATTGAGCTGTGTGACTGGAGTTCAG
iTru7_102_10   CAAGCAGAAGACGGCATACGAGATGCCTTAACGTGACTGGAGTTCAG
iTru7_102_11   CAAGCAGAAGACGGCATACGAGATTGTGGCTTGTGACTGGAGTTCAG
iTru7_102_12   CAAGCAGAAGACGGCATACGAGATAACCGTGTGTGACTGGAGTTCAG
iTru7_103_01   CAAGCAGAAGACGGCATACGAGATAATCGCTGGTGACTGGAGTTCAG
iTru7_103_02   CAAGCAGAAGACGGCATACGAGATGGTCACTAGTGACTGGAGTTCAG
iTru7_103_03   CAAGCAGAAGACGGCATACGAGATTAGTCTCGGTGACTGGAGTTCAG
iTru7_103_04   CAAGCAGAAGACGGCATACGAGATACCATGTCGTGACTGGAGTTCAG
iTru7_103_05   CAAGCAGAAGACGGCATACGAGATAGACATGCGTGACTGGAGTTCAG
iTru7_103_06   CAAGCAGAAGACGGCATACGAGATGATGGAGTGTGACTGGAGTTCAG
iTru7_103_07   CAAGCAGAAGACGGCATACGAGATCAGTCACAGTGACTGGAGTTCAG
iTru7_103_08   CAAGCAGAAGACGGCATACGAGATGTTCTTCGGTGACTGGAGTTCAG
iTru7_103_09   CAAGCAGAAGACGGCATACGAGATAAGACACCGTGACTGGAGTTCAG
iTru7_103_10   CAAGCAGAAGACGGCATACGAGATGCCTTCTTGTGACTGGAGTTCAG
iTru7_103_11   CAAGCAGAAGACGGCATACGAGATTCGAACCTGTGACTGGAGTTCAG
iTru7_103_12   CAAGCAGAAGACGGCATACGAGATGGAACATGGTGACTGGAGTTCAG
iTru7_104_01   CAAGCAGAAGACGGCATACGAGATTATGGCACGTGACTGGAGTTCAG
iTru7_104_02   CAAGCAGAAGACGGCATACGAGATCTACAAGGGTGACTGGAGTTCAG
iTru7_104_03   CAAGCAGAAGACGGCATACGAGATAATCCAGCGTGACTGGAGTTCAG
iTru7_104_04   CAAGCAGAAGACGGCATACGAGATCCTCGTTAGTGACTGGAGTTCAG
iTru7_104_05   CAAGCAGAAGACGGCATACGAGATGCAACCATGTGACTGGAGTTCAG
iTru7_104_06   CAAGCAGAAGACGGCATACGAGATGGTATAGGGTGACTGGAGTTCAG
iTru7_104_07   CAAGCAGAAGACGGCATACGAGATCGACCTAAGTGACTGGAGTTCAG
iTru7_104_08   CAAGCAGAAGACGGCATACGAGATGATCTTGCGTGACTGGAGTTCAG
iTru7_104_09   CAAGCAGAAGACGGCATACGAGATAAGGCTCTGTGACTGGAGTTCAG
iTru7_104_10   CAAGCAGAAGACGGCATACGAGATTCCATTGCGTGACTGGAGTTCAG
iTru7_104_11   CAAGCAGAAGACGGCATACGAGATTACTCCAGGTGACTGGAGTTCAG

PCR2 i5 Index Primers
D501           AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTAC
D502           AATGATACGGCGACCACCGAGATCTACACATAGAGGCACACTCTTTCCCTAC
D503           AATGATACGGCGACCACCGAGATCTACACCCTATCCTACACTCTTTCCCTAC
D504           AATGATACGGCGACCACCGAGATCTACACGGCTCTGAACACTCTTTCCCTAC
D505           AATGATACGGCGACCACCGAGATCTACACAGGCGAAGACACTCTTTCCCTAC
D506           AATGATACGGCGACCACCGAGATCTACACTAATCTTAACACTCTTTCCCTAC
D507           AATGATACGGCGACCACCGAGATCTACACCAGGACGTACACTCTTTCCCTAC
D508           AATGATACGGCGACCACCGAGATCTACACGTACTGACACACTCTTTCCCTAC
iTru5_02_A     AATGATACGGCGACCACCGAGATCTACACCTTCGCAAACACTCTTTCCCTAC
iTru5_02_B     AATGATACGGCGACCACCGAGATCTACACGTGGTATGACACTCTTTCCCTAC
iTru5_02_C     AATGATACGGCGACCACCGAGATCTACACCACTGTAGACACTCTTTCCCTAC
iTru5_02_D     AATGATACGGCGACCACCGAGATCTACACAGACGCTAACACTCTTTCCCTAC
iTru5_02_E     AATGATACGGCGACCACCGAGATCTACACCAACTCCAACACTCTTTCCCTAC
iTru5_02_F     AATGATACGGCGACCACCGAGATCTACACAACACGCTACACTCTTTCCCTAC
iTru5_02_G     AATGATACGGCGACCACCGAGATCTACACTGGATGGTACACTCTTTCCCTAC
iTru5_02_H     AATGATACGGCGACCACCGAGATCTACACTTCGAAGCACACTCTTTCCCTAC
iTru5_03_A     AATGATACGGCGACCACCGAGATCTACACAACACCACACACTCTTTCCCTAC
iTru5_03_B     AATGATACGGCGACCACCGAGATCTACACTGAGCTGTACACTCTTTCCCTAC
iTru5_03_C     AATGATACGGCGACCACCGAGATCTACACCACAGGAAACACTCTTTCCCTAC
iTru5_03_D     AATGATACGGCGACCACCGAGATCTACACTGACAACCACACTCTTTCCCTAC


iTru5_03_E     AATGATACGGCGACCACCGAGATCTACACTGTTCCGTACACTCTTTCCCTAC
iTru5_03_F     AATGATACGGCGACCACCGAGATCTACACCCTAGAGAACACTCTTTCCCTAC
iTru5_03_G     AATGATACGGCGACCACCGAGATCTACACGCATAACGACACTCTTTCCCTAC
iTru5_03_H     AATGATACGGCGACCACCGAGATCTACACCAGTGCTTACACTCTTTCCCTAC
iTru5_04_A     AATGATACGGCGACCACCGAGATCTACACCGTATCTCACACTCTTTCCCTAC
iTru5_04_B     AATGATACGGCGACCACCGAGATCTACACCGTCAAGAACACTCTTTCCCTAC
iTru5_04_C     AATGATACGGCGACCACCGAGATCTACACCCATGAACACACTCTTTCCCTAC
iTru5_04_D     AATGATACGGCGACCACCGAGATCTACACGGTACTTCACACTCTTTCCCTAC
iTru5_04_E     AATGATACGGCGACCACCGAGATCTACACACCGCTATACACTCTTTCCCTAC
iTru5_04_F     AATGATACGGCGACCACCGAGATCTACACTTCCAGGTACACTCTTTCCCTAC
iTru5_04_G     AATGATACGGCGACCACCGAGATCTACACTCGAACCTACACTCTTTCCCTAC
iTru5_04_H     AATGATACGGCGACCACCGAGATCTACACTAGTGCCAACACTCTTTCCCTAC
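The index embedded in each PCR2 primer can be recovered from the sequences in Table A-S2, because every i7 primer shares one flanking prefix and suffix and every i5 primer shares another. The sketch below is a hedged illustration: the flanking constants are inferred from the listed primers, the function name is hypothetical, and the orientation in which a sequencer reports an index may differ from the orientation written in the primer.

```python
from itertools import product

# Flanking sequences inferred from the primers listed in Table A-S2.
I7_PREFIX, I7_SUFFIX = "CAAGCAGAAGACGGCATACGAGAT", "GTGACTGGAGTTCAG"
I5_PREFIX, I5_SUFFIX = "AATGATACGGCGACCACCGAGATCTACAC", "ACACTCTTTCCCTAC"

# Two primers of each type, copied from Table A-S2.
I7_PRIMERS = {
    "D701": "CAAGCAGAAGACGGCATACGAGATCGAGTAATGTGACTGGAGTTCAG",
    "D702": "CAAGCAGAAGACGGCATACGAGATTCTCCGGAGTGACTGGAGTTCAG",
}
I5_PRIMERS = {
    "D501": "AATGATACGGCGACCACCGAGATCTACACTATAGCCTACACTCTTTCCCTAC",
    "D502": "AATGATACGGCGACCACCGAGATCTACACATAGAGGCACACTCTTTCCCTAC",
}


def extract_index(primer, prefix, suffix):
    """Return the index located between the shared prefix and suffix."""
    if not primer.startswith(prefix) or not primer.endswith(suffix):
        raise ValueError("primer does not match the expected flanking sequences")
    return primer[len(prefix):len(primer) - len(suffix)]


i7_indexes = {n: extract_index(s, I7_PREFIX, I7_SUFFIX) for n, s in I7_PRIMERS.items()}
i5_indexes = {n: extract_index(s, I5_PREFIX, I5_SUFFIX) for n, s in I5_PRIMERS.items()}

# Each i7/i5 combination gives one unique ID that can label a PCR1 product.
pair_to_primers = {
    (i7_indexes[a], i5_indexes[b]): (a, b)
    for a, b in product(I7_PRIMERS, I5_PRIMERS)
}

for (i7, i5), names in pair_to_primers.items():
    print(i7, i5, "->", names)
```

With 12 i7 and 8 i5 primers per set, the same idea yields 96 unique combinations, which is one way to read the plate-level primer sets in Table A-S4.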

Table A-S3. Mock communities (positive controls) consisting of species found in the Sydenham River (SYD) and Rondeau Bay (RB). S7 contains all seven species used in the positive controls and S3 contains three species that are in both mock communities. Success column indicates ability of metabarcoding pipeline to detect the species in each mock community in three replicates.

Positive control   Brown Bullhead   Common Shiner   Freshwater Drum   Greenside Darter   Smallmouth Bass   Shorthead Redhorse   Yellow Perch
SYD                X                X               -                 X                  X                 X                    -
RB                 X                -               X                 X                  X                 -                    X
S7                 X                X               X                 X                  X                 X                    X
S3                 X                -               -                 X                  X                 -                    -

Table A-S4. The primer sets used during PCR2 to ID samples. Indexes in PCR2 primers were used to demultiplex read output from sequencing runs.

Plate Number   i7 primer set   i5 primer set

Sequencing One
1              D7              D5
2              iTru07_104      iTru05_04
3              D7              iTru05_02
4              D7              iTru05_03
5              D7              iTru05_04
6              iTru07_102      D5

Sequencing Two
7              D7              D5
8              D7              iTru05_02
9              D7              iTru05_03
10             D7              iTru05_04
11             iTru07_102      D5
12             iTru07_102      iTru05_02

Table A-S5. Commands used in mothur to filter for PCR/sequencing errors and to classify detections.

Command           Purpose                                                    Parameters
make.contigs      form full contigs using read output                        -
screen.seqs       remove sequences that are too big and trim sequences       maxambig=0, minlength=181, maxlength=283
unique.seqs       merge duplicate sequences                                  -
align.seqs        align sequences to reference alignment                     -
pre.cluster       cluster sequences that differ by 2 or fewer bases          diffs=2
chimera.vsearch   remove chimeras                                            -
classify.seqs     classify sequences                                         cutoff=80, search=blast
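The screening step in Table A-S5 amounts to a length and ambiguity filter. The sketch below re-expresses that filter in plain Python for illustration; it is not mothur code, the function name is hypothetical, and it assumes that any character other than A, C, G, or T counts as an ambiguous base.

```python
def passes_screen(sequence, minlength=181, maxlength=283, maxambig=0):
    """Return True if a contig satisfies the length and ambiguity limits."""
    sequence = sequence.upper()
    # Assumption: anything that is not A, C, G, or T is an ambiguous base.
    ambiguous = sum(1 for base in sequence if base not in "ACGT")
    return minlength <= len(sequence) <= maxlength and ambiguous <= maxambig


# Invented contigs used only to exercise the filter.
contigs = {
    "ok": "ACGT" * 50,               # 200 bases, no ambiguity
    "too_short": "ACGT" * 45,        # 180 bases
    "too_long": "ACGT" * 71,         # 284 bases
    "ambiguous": "ACGT" * 49 + "ACGN",  # 200 bases, one N
}

kept = [name for name, seq in contigs.items() if passes_screen(seq)]
print(kept)
```

Only the first contig is retained; the others fail on length or on the presence of an ambiguous base.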

Table A-S6. Read count for each species detected in negative controls during sequencing run one and two. The read counts were used to correct for contamination in the egg and larval batch samples.

Species Name         Sequencing Run 1   Sequencing Run 2
Bluntnose Minnow     -                  2
Common Carp          -                  3
Common Shiner        -                  3
Freshwater Drum      -                  7
Greenside Darter     17                 8
Lake Chubsucker      3                  -
Longnose Gar         2                  -
Shorthead Redhorse   21                 199
Spotfin Shiner       8                  264
Stonecat             2                  -

Table A-S7. Summary of read count for species included in positive controls. The three read counts represent the count for each replicate (total = 3) for each mock community. The positive detection threshold was determined based on the lowest read count found in positive controls (count = 2).

Positive control   Brown Bullhead   Common Shiner   Freshwater Drum   Greenside Darter   Smallmouth Bass   Shorthead Redhorse   Yellow Perch
SYD                9, 2, 4          5, 32, 7        -                 7, 22, 9           2, 2, 4           133, 757, 581        -
RB                 5, 17, 9         -               659, 609, 1252    37, 26, 58         24, 7, 28         -                    5, 7, 13
S7                 3, 9, 4          8, 42, 11       449, 1243, 535    14, 31, 11         2, 10, 4          179, 273, 143        3, 18, 4
S3                 14, 15, 23       -               -                 57, 88, 38         19, 43, 23        -                    -
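The positive detection threshold described in the caption of Table A-S7 is the smallest read count observed for any species in any positive-control replicate. A minimal sketch follows, using the counts from Table A-S7; the function name is hypothetical, and treating a count equal to the threshold as a positive detection is an assumption.

```python
# Read counts per replicate for each mock community, copied from Table A-S7.
POSITIVE_CONTROL_READS = {
    "SYD": {"Brown Bullhead": [9, 2, 4], "Common Shiner": [5, 32, 7],
            "Greenside Darter": [7, 22, 9], "Smallmouth Bass": [2, 2, 4],
            "Shorthead Redhorse": [133, 757, 581]},
    "RB": {"Brown Bullhead": [5, 17, 9], "Freshwater Drum": [659, 609, 1252],
           "Greenside Darter": [37, 26, 58], "Smallmouth Bass": [24, 7, 28],
           "Yellow Perch": [5, 7, 13]},
    "S7": {"Brown Bullhead": [3, 9, 4], "Common Shiner": [8, 42, 11],
           "Freshwater Drum": [449, 1243, 535], "Greenside Darter": [14, 31, 11],
           "Smallmouth Bass": [2, 10, 4], "Shorthead Redhorse": [179, 273, 143],
           "Yellow Perch": [3, 18, 4]},
    "S3": {"Brown Bullhead": [14, 15, 23], "Greenside Darter": [57, 88, 38],
           "Smallmouth Bass": [19, 43, 23]},
}

# Lowest read count across all communities, species, and replicates.
threshold = min(
    count
    for community in POSITIVE_CONTROL_READS.values()
    for replicate_counts in community.values()
    for count in replicate_counts
)


def is_positive_detection(read_count, minimum=threshold):
    """Assumption: a detection counts as positive at or above the threshold."""
    return read_count >= minimum


print("Positive detection threshold:", threshold)
```

Because every species added to a mock community produced at least this many reads in every replicate, counts at or above the threshold are consistent with true presence in the positive controls.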

Table A-S8. Species list for reference library and reference alignment with their respective GenBank number.

Species Name  GenBank #
Aplodinotus grunniens  EU522444
Acantharchus pomotis  HQ557340
Acipenser fulvescens  EU524394
Alosa aestivalis  EU523896
Alosa pseudoharengus  EU523899
Alosa sapidissima  EU523901
Ambloplites rupestris  EU524410
Ameiurus catus  KF558302
Ameiurus melas  EU524417
Ameiurus natalis  EU524423
Ameiurus nebulosus  EU524429
Amia calva  EU523910
Ammocrypta clara  JN024773
Ammocrypta pellucida  EU523914
Anguilla rostrata  EU524440
Apeltes quadracus  EU524445
Aphredoderus sayanus  JN024807
Astronotus ocellatus  JQ667500
Betta splendens  JQ818784
Campostoma anomalum  EU524447
Campostoma oligolepis  JN024834
Carassius auratus  EU524448
Carassius carassius  HQ961037
Carassius gibelio  HQ960642
Carpiodes carpio  JN024862
Carpiodes cyprinus  EU524451
Catostomus catostomus  EU524464
Catostomus commersonii  EU524473
Centrarchus macropterus  JN024960
Channa argus  KJ937443
Chrosomus eos  EU525058
Chrosomus erythrogaster  JN028212
Chrosomus neogaeus  EU524274
Clinostomus elongatus  EU524487
Cobitis taenia  KM286530
  EU523943
Coregonus clupeaformis  EU523952
  EU523960
Coregonus  EU523965
  KC009641
Coregonus maraena  HQ960631
Coregonus nigripinnis  EU523980
Coregonus zenithicus  EU523988
Cottus bairdii  EU524493
Cottus cognatus  EU524517
Cottus ricei  EU524001


Table A-S8. continued

Species Name  GenBank #
Couesius plumbeus  EU524002
Ctenopharyngodon idella  KJ552789
Culaea inconstans  EU524535
Cycleptus elongatus  JN025162
Cyclopterus lumpus  KJ204823
Cyprinella analostana  JN025168
Cyprinella lutrensis  EU751766
Cyprinella spiloptera  EU524539
Cyprinella venusta  JN025255
Cyprinella whipplei  JN025266
Cyprinus carpio carpio  HQ682682
Dallia pectoralis  EU524007
Danio albolineatus  KP263429
Dorosoma cepedianum  EU524557
Elassoma zonatum  JN025299
Enneacanthus chaetodon  JN025312
Enneacanthus gloriosus  JN025319
Enneacanthus obesus  HQ557525
Ericymba buccata  JN027465
Erimystax x punctatus  JN025410
Erimyzon oblongus  HQ579034
Erimyzon sucetta  EU524567
Esox americanus americanus  EU524574
Esox americanus vermiculatus  EU524572
Esox lucius  EU524582
Esox masquinongy  EU524593
Esox niger  EU524604
Etheostoma blennioides  EU524017
Etheostoma caeruleum  EU524022
Etheostoma chlorosomum  JN025749
Etheostoma exile  EU524028
Etheostoma flabellare  EU524036
Etheostoma microperca  EU524042
Etheostoma nigrum  EU524051
Etheostoma olmstedi  EU524049
Etheostoma spectabile  JN026390
Etheostoma zonale  JN026580
Exoglossum laurae  HQ557359
Exoglossum maxillingua  EU524615
Fundulus chrysotus  JN026632
Fundulus diaphanus  EU524620
Fundulus dispar  HQ557217
Fundulus notatus  EU524063
Fundulus olivaceus  JN026666
Fundulus sciadicus  JN026675
Gambusia affinis  JN026702
Gambusia holbrooki  JN026706
Gasterosteus aculeatus  EU524633
Gymnocephalus cernua  JN026731
Heterotilapia buttikoferi  KJ938229
Hiodon alosoides  EU524649
Hiodon tergisus  EU524654
Hybognathus hankinsoni  EU524077


Table A-S8. continued

Species Name  GenBank #
Hybognathus placitus  EU524082
Hybognathus regius  EU524665
Hypentelium nigricans  EU524669
Hypophthalmichthys molitrix  KX224123
Hypophthalmichthys nobilis  KF742440
Ichthyomyzon castaneus  EU524087
Ichthyomyzon fossor  EU524090
Ichthyomyzon unicuspis  EU524097
Ictalurus furcatus  EU752099
Ictalurus punctatus  EU752102
Ictiobus bubalus  JN026913
Ictiobus cyprinellus  EU524687
Ictiobus niger  JN026919
Labidesthes sicculus  EU524692
Lampetra appendix  EU524109
Lepisosteus oculatus  EU524699
Lepisosteus osseus  EU524119
Lepisosteus platostomus  JN026964
Lepisosteus platyrhincus  HQ937021
Lepomis cyanellus  EU524711
Lepomis gibbosus  EU524723
Lepomis humilis  EU524729
Lepomis gulosus  JN026996
Lepomis macrochirus  EU524736
Lepomis megalotis  EU524744
Lepomis microlophus  JN027045
Lepomis symmetricus  JN027056
Leuciscus idus  HQ960976
Lota lota  EU524754
Luxilus chrysocephalus  EU524761
Luxilus cornutus  EU524765
Lythrurus umbratilis  EU524791
Macrhybopsis storeriana  EU524799
Margariscus margarita  EU524803
Micropterus dolomieu  EU524816
Micropterus salmoides  EU524834
Minytrema melanops  EU524839
Misgurnus anguillicaudatus  KJ553659
Misgurnus fossilis  KM286762
Morone americana  EU524138
Morone chrysops  EU524140
Morone mississippiensis  JN027259
Morone saxatilis  EU524145
Morone saxatilis x Morone chrysops  KX224125
Moxostoma anisurum  EU524846
Moxostoma carinatum  EU524148
Moxostoma duquesnii  EU524864
Moxostoma erythrurum  JN027287
Moxostoma macrolepidotum  EU524891
Moxostoma valenciennesi  EU524904
Mylopharyngodon piceus  KX224146


Table A-S8. continued

Species Name  GenBank #
Myoxocephalus thompsonii  EU524916
Neogobius melanostomus  EU524154
Nocomis biguttatus  EU524158
Nocomis micropogon  JN027357
Notemigonus crysoleucas  EU524938
Notropis amblops  JN026795
Notropis anogenus  EU524166
Notropis ariommus  JN027405
Notropis atherinoides  EU524962
Notropis bifrenatus  EU524966
Notropis blennius  JN027449
Notropis boops  JN027459
Notropis buchanani  EU524973
Notropis chalybaeus  JN027492
Notropis dorsalis  JN027525
Notropis heterodon  EU524983
Notropis heterolepis  JN027547
Notropis hudsonius  EU525002
Notropis nubilus  JN027608
Notropis photogenis  EU525012
Notropis procne  JN027635
Notropis rubellus  EU525017
Notropis stramineus  EU525028
Notropis texanus  EU524182
Notropis volucellus  EU525032
Noturus flavus  EU525040
Noturus gyrinus  EU525046
Noturus insignis  EU524189
Noturus miurus  EU525053
Noturus stigmosus  EU525054
Oncorhynchus clarkii  EU524192
Oncorhynchus gorbuscha  EU524207
Oncorhynchus keta  EU525057
Oncorhynchus kisutch  EU752129
Oncorhynchus masou masou  GU207328
Oncorhynchus mykiss  EU524218
Oncorhynchus nerka  EU524225
Oncorhynchus tshawytscha  EU524227
Opsopoeodus emiliae  JN027856
Osmerus mordax dentex  HQ712707
Osteoglossum bicirrhosum  JQ667558
Panaque nigrolineatus  GQ225417
Parachromis managuensis  KU692739
Perca flavescens  EU524240
Percina caprodes  EU524248
Percina copelandi  EU524250
Percina evides  JN027999
Percina maculata  EU524257
Percina phoxocephala  JN028107
Percina shumardi  EU524260
Percopsis omiscomaycus  EU524263
Petromyzon marinus  EU524271
Phenacobius mirabilis  HQ557179
Piaractus brachypomus  KJ136024


Table A-S8. continued

Species Name  GenBank #
Pimephales notatus  JN028226
Pimephales promelas  EU525095
Pimephales vigilax  HQ557497
Platichthys flesus  KJ205126
Polyodon spathula  JN028274
Pomoxis annularis  EU525096
Pomoxis nigromaculatus  EU525101
Prosopium coulterii  EU525103
Prosopium cylindraceum  EU524288
Proterorhinus marmoratus  EU524307
Pterygoplichthys pardalis  KU692596
Pterygoplichthys sp  KW11T171, KU568996
Pungitius pungitius  EU525111
Pygocentrus nattereri  KU288905
Pylodictis olivaris  EU525113
Rhinichthys atratulus  EU525116
Rhinichthys cataractae  EU525121
Rhinichthys obtusus  EU524334
Salmo letnica  KJ554643
Salmo salar  EU524349
Salmo trutta  EU524356
Salvelinus alpinus  EU524361
Salvelinus fontinalis  EU524366
Salvelinus namaycush  EU522419
Sander canadensis  EU524369
Sander vitreus  EU524374
Scaphirhynchus platorynchus  JN028406
Scardinius erythrophthalmus  EU524381
Semotilus atromaculatus  EU525136
Semotilus corporalis  EU525145
Silurus glanis  KC501495
Siniperca chuatsi  KP112441
Thymallus arcticus  EU522435
Tinca tinca  EU524390
Umbra limi  EU522446

Figure A-S1. A neighbour-joining tree used to estimate the level of divergence for the barcode region between all Great Lakes fishes and potential invasive species. This was used to determine the potential of my barcode to differentiate between closely related species.

Figure A-S2. Cytochrome c oxidase subunit 1 (COI) sequence map, base position 5487 to 7037 in the full mtDNA sequence, displaying the base position of forward primer, reverse primer, and the barcode region. This sequence map is based on the full mtDNA sequence of Walleye (Sander vitreus) (GenBank #: NC_028285).


Appendix B: Supplementary material for results

Figure B-S1. Species read numbers for each detection based on A) egg and B) larval batch samples.

Figure B-S2. The relationship between mean read count and the number of species detected in mixed batch samples after filtering for contamination and PCR and sequencing error.

Appendix C: Glossary of terms

Table C-S1. Glossary of terms used in this thesis.

Term Definition

Amplicon Amplicon is a piece of DNA or RNA that is the product of an amplification or replication event.

Barcode Fragment of DNA or RNA used to identify species by comparing to a reference library.

Batch sample Sample containing the DNA or RNA of more than one individual and possibly more than one species.

Blanks Negative control wells containing 1µL of water instead of 1µL of DNA or RNA in the PCR reaction.

BLAST Basic local alignment search tool is an algorithm for comparing biological sequence information to a reference database.

Broadcaster spawner Spawning behaviour where the gametes are released into open water for external fertilisation.

Chimeric sequence DNA sequence that originated from multiple transcripts.

Contig Series of overlapping DNA sequences used to reconstruct the original DNA or RNA sequence.

COSEWIC The Committee on the Status of Endangered Wildlife in Canada is an advisory panel to the Minister of Environment and Climate Change Canada that assesses the status of wildlife species at risk of extinction.

Cryptic species Species that contain individuals that are morphologically identical but cannot interbreed.

Degenerate primer Mixture of oligonucleotides in which some positions have a number of possible bases. This allows binding to a wide variety of nucleotide sequences, and thereby amplification of DNA or RNA sequences from a wide variety of species.

Demultiplex Separating sequence data based on sample using an index system.

Detection The presence of a species in a sample.

eDNA Environmental DNA that can be collected from a variety of environmental samples such as water, soil, air, and ice.

Heterogeneity spacer Sequence of nucleotides that allows for an equal proportion of each of the bases to be sequenced during dual-index sequencing.

High-Throughput Sequencing Sequencing platform capable of sequencing multiple DNA or RNA sequences in parallel.

Ichthyoplankton Ichthyoplankton are the eggs and larvae of fishes that drift as a result of water current; includes eggs, fry, larvae, and juveniles.

Illumina adaptor Illumina adaptor is used to attach the sequences to the flowcell during Illumina MiSeq sequencing.

k-mer Subsequence of nucleotides of length k contained within a DNA or RNA sequence.

Mock community Mixture of DNA or RNA from several known species used to simulate the composition of DNA or RNA isolated from a batch sample.

Mothur Software package that allows users to analyze community sequence data within one software.

Operational Taxonomic Unit (OTU) OTU is a cluster of sequences that are closely related and are generally identified as a group.

Phenotypic plasticity Species capable of exhibiting multiple phenotypes based on stimuli from varied environmental conditions.

Phytophilic species Reproductive guild that deposits their adhesive eggs on submerged live or dead aquatic plants.

Positive detection threshold Read threshold to remove potential false positive detections determined based on the read count for species detected in positive controls.

Primer specificity Likelihood of mismatches occurring between the primer and the template sequence.

Read Read is an inferred sequence of base pairs corresponding to a part of a single DNA or RNA fragment.


Reference library Databases containing the DNA or RNA sequences of previously identified taxa and used to compare unknown sequences during taxonomic identification.

Species at Risk Act (SARA) Purpose of SARA is to prevent imperilled species in Canada from disappearing and to provide for the recovery of SARA listed species.

Species-specific contamination threshold Species-specific threshold to adjust the read count of species detected in the negative controls across all batch samples.

Taxonomic key Guide used to identify an unknown organism.

Universal Primer Single oligonucleotide that binds to a particular DNA nucleotide sequence which allows for amplification of a wide variety of DNA templates.
