
Examining patterns of genetic variation in Canadian marine molluscs through DNA barcodes

by

Kara Layton

A Thesis presented to The University of Guelph

In partial fulfilment of requirements for the degree of Master of Science in Integrative Biology

Guelph, Ontario, Canada

© Kara Layton, January, 2012

ABSTRACT

Examining patterns of genetic variation in Canadian marine molluscs through DNA barcodes

Kara Layton
University of Guelph, 2013

Advisor: Professor P.D.N. Hebert

In this thesis I investigate patterns of sequence variation at the COI gene in Canadian marine molluscs. The research presented begins the construction of a DNA barcode reference library for this phylum, presenting records for nearly 25% of the Canadian fauna. This work confirms that the COI gene region is an effective tool for delineating species of marine molluscs and for revealing overlooked species. This study also discovered a link between GC content and sequence divergence between congeneric species. I also provide a detailed analysis of population structure in two bivalves with similar larval development and dispersal potential, exploring how Canada’s extensive glacial history has shaped genetic structure. Both bivalve species show evidence for cryptic taxa and particularly high genetic diversity in populations from the northeast Pacific. These results have implications for the utility of DNA barcoding both for documenting biodiversity and for broadening our understanding of biogeographic patterns in Holarctic species.

Acknowledgements

Firstly, I would like to thank my advisor Dr. Paul Hebert for providing endless guidance and support during my program and for greatly improving my research. You always encouraged my participation in field collections and conferences, allowing many opportunities to connect with colleagues and present my research to the scientific community. I am also incredibly grateful for the invaluable feedback and input I received from my committee members, Dr. Elizabeth Boulding and Dr. André Martel. Your enthusiasm for this project always kept me motivated and I thank you for teaching me all about the wonderful world of malacology. I would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC), the International Barcode of Life project and Genome Canada through the Ontario Genomics Institute for funding this research. I am also grateful to Aboriginal Affairs and Northern Development Canada for providing me with a Northern Scientific Training Program grant that aided in field collections in Churchill, Manitoba. Lastly, I thank the Canadian Centre for DNA Barcoding (CCDB) for help in sequence acquisition. Research teams at the Kasitsna Bay Lab, the Churchill Northern Studies Centre and the Huntsman Marine Science Centre provided logistical support for field collections and I am extremely grateful for their help. I would also like to extend my gratitude to Sarah Hardy, Katrin Iken, Suzanne Dufour, Barry McDonald, Robert Frank, Nicholas Jeffrey and Paolo Pierrossi for aid in specimen collections. A sincere thank you to all graduate students in the Hebert, Adamowicz, Smith, Crease, Hajibabaei and Gregory labs for providing input into my project and contributing greatly to my happiness during this program. I especially want to thank Christy Carr for her advice, patience and compassion- you truly are a fantastic scientist. 
Finally, I need to thank my close friends and family, particularly my parents and Ryan, who not only provided unconditional love but have shaped the person I am today.


Table of Contents

Abstract ... ii
Acknowledgements ... iii
List of Tables ... vi
List of Figures ... vii
General Introduction ... 1
Chapter 1: Patterns of DNA barcode variation in Canadian marine molluscs ... 3
  Abstract ... 3
  Introduction ... 4
  Methods ... 5
    Specimen collection and data scrutiny ... 5
    DNA extraction, amplification and sequencing ... 5
    Data analysis ... 6
  Results ... 7
    Sequence recovery ... 7
    COI variation in molluscs ... 7
    Variation in nucleotide composition ... 8
    Distribution of indels ... 8
  Discussion ... 8
    Sequencing success in Mollusca ... 8
    Patterns of sequence variation ... 9
    Insertions and deletions in COI ... 10
    Patterns of nucleotide composition ... 11
    Conclusions ... 12
Chapter 2: Geographic patterns of mtCOI diversity in two species of Canadian marine bivalves ... 23
  Abstract ... 23
  Introduction ... 24
  Methods ... 25
    Specimen collection ... 25
    DNA extraction, amplification and sequencing ... 26
    Data analysis ... 26
  Results ... 27
    Sequence recovery and haplotype diversity ... 27
    Patterns of genetic diversity ... 28
    Population structure ... 29
  Discussion ... 30
    Comparing diversity and structure in two bivalves with planktotrophic larval development ... 30
    Implications for glacial refugia in the northeast Pacific ... 30
    Evidence of sibling species ... 31
    Conclusions ... 32
General Conclusions ... 44
  Summary of findings ... 44
  The application of a DNA barcode library for Canadian marine molluscs ... 44
  Gene flow in marine populations ... 45
  Conclusions and implications for conservation ... 46
Literature Cited ... 47
Appendix A: Specimen Preservation ... 56
Appendix B: Species Identifications ... 57
Appendix C: Chapter 1 Supplementary Material ... 61
Appendix D: Chapter 2 Supplementary Material ... 65
Appendix E: R Code ... 74


List of Tables

Chapter 1 ... 13
  Table 1.1. Parameter settings for each OTU algorithm ... 13
  Table 1.2. Percent success in recovery of a COI sequence ... 13
  Table 1.3. The number of COI sequences and BINs, intraspecific and nearest neighbour distances and mean GC content for each of 33 orders ... 14
  Table 1.4. Mean intraspecific divergence, number of genetic clusters, number of individuals sampled and locality information for each potential cryptic species complex in this study ... 15
  Table 1.5. Intra and interspecific distances (K2P) for taxonomic groups examined by DNA barcoding in previous literature ... 15
Chapter 2 ... 33
  Table 2.1. Genetic diversity in populations of the bivalve species, Hiatella arctica and Macoma balthica ... 33
  Table 2.2. Overall genetic structure measured by AMOVA for Hiatella arctica ... 34
  Table 2.3. Overall genetic structure measured by AMOVA for Macoma balthica balthica ... 34
  Table 2.4. Overall genetic structure measured by AMOVA for ATL Macoma balthica ... 35
  Table 2.5. FST for populations of Hiatella arctica, Macoma balthica balthica and ATL Macoma balthica ... 36
Appendix B ... 57
  Table B.1. References for species identifications ... 58
Appendix C ... 61
  Table C.1. List of COI primers used for molecular techniques in Chapter 1 ... 61
  Table C.2. List of GenBank specimens used for analysis in Chapter 1 ... 61
Appendix D ... 65
  Table D.1. Detailed collection information for all 172 Hiatella arctica specimens ... 65
  Table D.2. Detailed collection information for all 196 Macoma balthica specimens ... 67


List of Figures

Chapter 1 ... 16
  Figure 1.1. Sampling locations and the number of specimens examined in this study ... 16
  Figure 1.2. Rarefaction curves for the five classes of Canadian marine mollusc represented in this study ... 17
  Figure 1.3. Mean intraspecific divergences (K2P) and nearest neighbour distances for all specimens in this study ... 18
  Figure 1.4. Maximum and mean intraspecific divergences plotted against the number of individuals analyzed for 157 species ... 18
  Figure 1.5. Box plots comparing mean nearest neighbour distance with the number of species sampled from each genus with ≥ 2 representative species ... 19
  Figure 1.6. Neighbour-joining trees (K2P), with locality information, for 9 cryptic species complexes in this study ... 20
  Figure 1.7. Mean nearest neighbour distance (K2P) plotted against mean GC content for the 33 orders of Mollusca represented in this study ... 21
  Figure 1.8. Secondary structure of COI marked with insertions and deletions for gastropods and bivalves ... 22
Chapter 2 ... 37
  Figure 2.1. Collection sites for H. arctica and M. balthica ... 37
  Figure 2.2. Intraspecific sequence divergence (K2P) for H. arctica and M. balthica ... 38
  Figure 2.3. Neighbour-joining tree based on K2P distances for H. arctica. The top scale shows estimated divergence time in millions of years while the bottom scale bar shows sequence divergence ... 39
  Figure 2.4. Neighbour-joining tree based on K2P distances for M. balthica. Subspecies names suggested by Väinölä (2003) are provided. The top scale shows estimated divergence time in millions of years while the bottom scale bar shows sequence divergence ... 40
  Figure 2.5. Median-joining haplotype networks for H. arctica and M. balthica constructed with maximum parsimony ... 41
  Figure 2.6. FST values (Slatkin’s linearized) plotted against geographic distance for populations within the main lineage of H. arctica ... 42
  Figure 2.7. FST values (Slatkin’s linearized) plotted against geographic distance for populations of M. balthica balthica ... 42
  Figure 2.8. FST values (Slatkin’s linearized) plotted against geographic distance for populations of ATL M. balthica ... 43
Appendix B ... 57
  Figure B.1. Hinge dentition in bivalves ... 59
  Figure B.2. View of girdle scales on Tonicella marmorea and Tonicella rubra ... 60
Appendix C ... 61
  Figure C.1. Neighbour-joining tree (K2P) for all barcoded specimens ... 62
Appendix D ... 65
  Figure D.1. Calculations for DXY, Tajima’s D, FST and Bray Curtis index ... 70
  Figure D.2. Bray Curtis similarity values between populations of Hiatella arctica ... 71
  Figure D.3. Bray Curtis similarity values between populations of Macoma balthica ... 72
  Figure D.4. Images of hinge dentition in each of the four cryptic lineages of Hiatella arctica ... 73


General Introduction

The discovery and quantification of marine biodiversity are crucial for evaluating ecosystem structure and interactions, factors of growing importance in the face of climate change (Archambault et al. 2010). Species discovery also forms a basis for conservation and restoration of marine biodiversity (Snelgrove 2010). Traditional approaches to species discrimination have relied solely on morphological traits and often resulted in cryptic species going undetected (Hebert et al. 2003). In addition, the identification of specimens was often impossible because of life stage diversity or damage to diagnostic traits (Hebert et al. 2003, Carr et al. 2010, Radulovici et al. 2010). These facts make clear the need to integrate molecular data into species delineation techniques. The use of the cytochrome c oxidase subunit 1 (COI) gene as a genetic barcode for taxa has created a standardized approach for species delineation (Hebert et al. 2003). DNA barcoding is based on the observation that sequence divergences among species are generally much greater than those within species (Hebert et al. 2003). DNA barcoding studies typically employ distance-based methods to calculate intra- and interspecific divergences, gaining insight into sequence variation both within species and among congeners (Zou et al. 2011). Prior studies have shown that a sequence divergence threshold of 2% is effective in delineating many animal species (Hebert et al. 2003, Witt et al. 2006, Kerr et al. 2009). Moreover, such analysis often highlights cases of cryptic diversity, particularly in marine species with broad ranges (Knowlton 2000, Carr et al. 2010, Bucklin et al. 2011). The frequent discovery of distinct genetic clusters in past investigations of species with Holarctic distributions suggests that genetic studies are critical for accurately quantifying marine biodiversity (Knowlton 2000, Carr et al. 2010, Bucklin et al. 2011).

The extension of knowledge on biodiversity in Canada’s marine ecosystems is critical because of stressors originating from climate change, eutrophication, acidification, overfishing and pollution (Archambault et al. 2010). In addition, human-mediated dispersal, linked to increased vessel traffic, has resulted in invasive species replacing the natural marine fauna (Carlton 1999, Bax et al. 2003, Archambault et al. 2010). These invasive species not only threaten environmental functions, but also hurt the economy through impacts on commercially important fish and invertebrate species. My thesis extends knowledge of Canadian species diversity in the most diverse marine phylum, the Mollusca (Bouchet et al. 2002). This phylum is not only speciose, but also ubiquitous, as its member species occupy diverse habitats and possess varied life history strategies. This study not only uses COI for delineating species, but also integrates these results with traditional taxonomic methods to provide a multi-dimensional approach to species identification. Exploring genetic variation in marine molluscs provides insight into diversity patterns in Canadian waters and will aid future conservation strategies.


Chapter 1 reports progress in the assembly of a barcode reference library for Canada’s marine molluscs. This chapter explores patterns of sequence variation, both within and between species, and also provides a detailed analysis of variation in GC content across molluscan orders. In addition, it probes taxa with deep intraspecific divergences with a view towards the revelation of potential cryptic complexes while also providing comparative analyses of species identification techniques. Chapter 2 involves a focused examination of sequence variation and population structure in two bivalve species with planktonic larval development and a Holarctic distribution. This study was motivated by recent evidence for deep sequence divergence and phylogeographic partitioning in other marine species which were once thought to possess broad ranges (Knowlton 2000, Carr et al. 2010, Bucklin et al. 2011). Furthermore, it has been shown that patterns of population structure can differ in species with similar life history strategies, a result that likely reflects their differential responses to the glacial cycles that have impacted Canada’s coasts for much of the last two million years (Bernatchez & Wilson 1998). In fact, repeated glaciations have played a key role in shaping contemporary patterns of species distribution and population divergence in Canadian waters (Bernatchez & Wilson 1998, Wares & Cunningham 2001). This thesis demonstrates that COI is a useful tool for analyzing patterns of genetic variation in marine molluscs, on local and regional scales. It is also useful for highlighting cases of deep intraspecific divergence that may signal cryptic species. In this study, COI was successful in clarifying patterns of intraspecific variation in two bivalve species, providing evidence for unrecognized species in both taxa, and identifying locations of potential glacial refugia. 
This work highlights the need for greater effort at species documentation, not only to further understanding of marine biodiversity, but also to aid conservation and the implementation of marine protected areas.


Chapter 1 Patterns of DNA Barcode Variation in Canadian Marine Molluscs

Abstract

A 648 base pair segment of the cytochrome c oxidase subunit 1 gene has proven useful for the identification and discovery of species in many animal lineages. This study begins the assembly of a comprehensive barcode reference library for Canadian marine molluscs, examining patterns of sequence variation in 234 morphospecies, about 25% of the marine mollusc fauna in Canada. These taxa showed a mean intraspecific sequence divergence of 0.51%, while congenerics showed a mean divergence of 13.5%. Nine cases of deep (>2%) intraspecific divergence were detected, suggesting possible overlooked species. Structural variation was detected in the barcode region with indels in 38 species, most (71%) in bivalves. GC content varied from 32 to 43% and there was a significant positive correlation between GC content and nearest neighbour distance.


Introduction

DNA barcoding employs sequence diversity in a 648 base pair region of the cytochrome c oxidase subunit 1 (COI) gene to distinguish species (Hebert et al. 2003, Kerr et al. 2009, Carr et al. 2010). Past work has shown that sequence divergences are generally much greater between than within species (Hebert et al. 2003). Because of this fact, DNA barcoding aids both the identification of known species and the discovery of overlooked taxa (Witt et al. 2006). The latter application has revealed that the incidence of sibling species is often high enough to lead to serious inaccuracies in estimates of biodiversity (Knowlton 2000, Carr et al. 2010). In light of this, it is increasingly recognized that molecular approaches need to be incorporated into biodiversity surveys. DNA barcoding is a particularly useful tool for groups with high diversity, especially those that have received little taxonomic attention. Although marine molluscs have been the subject of considerable research, the number of species in Canadian waters remains uncertain, with estimates ranging from 700 to 1200. In part, this uncertainty reflects taxonomic problems linked to the fact that molluscs are the most diverse phylum of marine life with more than 50,000 described species (Bouchet 2006). In addition, molluscs exhibit complex larval stages, frequent cryptic taxa and substantial phenotypic plasticity, all factors that impede morphological approaches to species identification (Drent et al. 2004, Marko & Moran 2009). In fact, the differing juvenile stages of molluscs have sometimes been treated as distinct species, generating inaccurate estimates of biodiversity and geographic ranges (Johnson et al. 2008). The extreme phenotypic plasticity in shell morphology that is common in molluscs poses additional problems for species identification (Johnson et al. 2008, Zou et al. 2011).
Because traditional morphological approaches to identification confront so many challenges in molluscs, it is imperative to integrate molecular diagnostics into this process. Several prior studies have validated the efficacy of DNA barcoding in the discrimination of mollusc species, but most of this work has targeted a particular order or family. For instance, Meyer and Paulay (2005) presented a detailed study of barcode diversity in cowries (Family: Cypraeidae), demonstrating the general effectiveness of the approach, but showing that a fixed sequence threshold could not be used for species diagnosis. They concluded that DNA barcoding was a powerful aid for mollusc identification when paired with strong taxonomic validation and comprehensive sampling. More recent studies have extended these results by establishing the value of DNA barcoding in resolving cryptic species complexes in the molluscan families Muricidae, Thyasiridae, Yoldiidae, Nuculidae and Lepetodrilidae (Mikkelsen et al. 2007, Johnson et al. 2008, Zou et al. 2012). While Zou et al. (2011) found that distance-based analyses were less effective than those based on characters, the former approach successfully delineated 40 Chinese neogastropod species. Despite the demonstrated utility of DNA barcoding in molluscs, no study has aimed to assemble a comprehensive barcode registry for a large geographic region. The present investigation addresses this deficit, beginning the construction of a DNA barcode reference library for Canadian marine molluscs. It compares intra- and interspecific divergences among 33 molluscan orders with prior values, and also examines the utility of different approaches for the designation of OTUs (Operational Taxonomic Units) based on the analysis of sequence variation at COI. This study also investigates variation in nucleotide composition among molluscs and how this property impacts levels of genetic divergence. Finally, insertions and deletions in the COI region are analyzed for all molluscs examined in this study.

Methods

Specimen collection and data scrutiny

A total of 2471 specimens were collected from 2007 to 2012 at sites across Canada (Figure 1.1). Figure 1.2 provides rarefaction curves for each class as a measure of sampling efficiency. Specimen details, sequences and trace files are available on BOLD (www.boldsystems.org, Ratnasingham & Hebert 2007), while the specimens are held at the Biodiversity Institute of Ontario. Typically five specimens per species were collected from intertidal or subtidal habitats using nets, small dredges and SCUBA diving, but samples from the Beaufort Sea were collected from deep subtidal soft-bottom habitats using an Agassiz trawl. Specimens were immediately fixed in 90-100% ethanol, with regular replacement of ethanol to prevent its dilution. During fixation, the opercula of gastropods and the shells of bivalves were separated to ensure preservation of internal tissues. After each collecting trip, specimens were placed in fresh 95% ethanol and stored at -20°C. When possible, specimens were identified to the species level with name usage following the World Register of Marine Species (WoRMS). Approximately 3% of the barcoded specimens could not be identified to the species level because they were immature, but they were assigned to a genus and an interim species.
DNA extraction, amplification and sequencing

Doubly uniparental inheritance (DUI) has been found in some bivalve groups and is characterized by the transmission of a maternal and paternal mitochondrial lineage through eggs and sperm, respectively (Ghiselli et al. 2012). While DUI can cause deep divergences between male and female conspecifics, avoiding gonadal tissue during sampling can resolve this problem given that male somatic tissue is still dominated by the female genome (Passamonti & Ghiselli 2009, Zouros 2012). Accordingly, DNA extracts were prepared from a small sample of muscle tissue from each specimen. Tissue samples were placed in cetyltrimethylammonium bromide (CTAB) lysis buffer solution with proteinase K and incubated for 12 hours at 56°C. DNA was then extracted using a manual glass fibre plate method (Ivanova et al. 2008). After incubation, the DNA was eluted with 40 µl of ddH2O. After re-suspension, 2 µl of each DNA extract was transferred to a well in a second plate and 18 µl of ddH2O was added to dilute salts or mucopolysaccharides that might inhibit PCR. Three primer sets were employed to maximize amplicon recovery (dgLCO1490/dgHCO2198, LCO1490_t1/HCO2198_t1 and BivF4_t1/BivR1_t1). The primer set that generated an amplicon for each specimen, and the primer sequences, are available on BOLD. A primer cocktail (C_LepFolF/C_LepFolR) was used in a second round of PCR for specimens that failed to amplify in the first round of PCR. Each well was filled with 2 µl of diluted DNA and the following reagents were added to give a 12.5 µl PCR reaction: 6.25 µl 10% trehalose, 2 µl ddH2O, 1.25 µl 10× PCR buffer, 0.625 µl MgCl2 (50 mM), 0.125 µl of each forward and reverse primer (10 µM), 0.0625 µl dNTP (10 mM) and 0.06 µl Platinum Taq polymerase. The thermocycling regime consisted of one cycle of 1 min at 94°C, 40 cycles of 40 s at 94°C, 40 s at 52°C, and 1 min at 72°C, and finally 5 min at 72°C. E-Gels (Invitrogen) were used to screen for amplification success and all positive reactions were bidirectionally sequenced using BigDye v3.1 on an ABI 3730xl DNA Analyzer (Applied Biosystems). Sequences were manually edited using CodonCode Aligner and aligned both by eye in MEGA5 and through the BOLD aligner algorithm (CodonCode Corporation, Tamura et al. 2011). MEGA5 was also used to assess the prevalence and location of insertions and deletions (indels) (Tamura et al. 2011). Sequences that contained more than 1% ambiguities, stop codons or double peaks, or that were shorter than 220 bp, were removed from further analysis. Sequencing success was assessed for this study and a Pearson’s chi-squared test was used to determine whether sequence recovery differed significantly among classes.

Data analysis

A Kimura 2-parameter (K2P) distance model was employed in MEGA5 to construct a neighbour-joining (NJ) tree which served as a preliminary basis for species recognition (Kimura 1980, Tamura et al. 2011).
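As an illustration of the K2P model named above, the following Python sketch computes the standard Kimura (1980) distance, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is not the MEGA5 or BOLD implementation; the function name `k2p_distance` and the choice to skip gaps and ambiguity codes (pairwise deletion) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption for this sketch: sites with a gap or ambiguity code in
    either sequence are ignored (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten compared sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))  # 0.1116
```

For highly diverged (saturated) pairs the logarithm's argument can become non-positive, in which case the distance is undefined and `math.log` raises an error.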
Genetic distances, including intra- and interspecific divergence along with nearest neighbour distance, were calculated with the K2P distance model (Kimura 1980) and overall data were compared using the ‘Distance Summary’ and ‘Barcode Gap Analysis’ tools on BOLD (Ratnasingham & Hebert 2007). Maximum intraspecific divergence was plotted against nearest neighbour (NN) distance to determine how often NN distances surpassed intraspecific divergences, ultimately indicating the presence of a barcode gap. In addition, the ‘Sequence Composition’ tool on BOLD was used to examine variation in GC content among species in each order (Ratnasingham & Hebert 2007).

Species numbers were determined by two approaches: i) morphology, through the examination of shell characters and soft tissue, and ii) the analysis of sequence divergence patterns at COI to ascertain the number of sequence clusters present. The morphological approach involved the use of the national mollusc collection deposited at the Canadian Museum of Nature in Gatineau, Québec. The latter approach employed four algorithms designed for this purpose: Barcode Index Number (BIN) (Ratnasingham and Hebert 2013), Automated Barcode Gap Discovery (ABGD) (Puillandre et al. 2011), Clustering 16S rRNA for OTU Prediction (CROP) (Hao et al. 2011) and jMOTU (Jones et al. 2011). The BIN algorithm only analyzed sequences greater than 500 bp in length while the other three algorithms examined all sequences greater than 400 bp. Parameter settings for each OTU algorithm can be found in Table 1.1.

Various packages in Revolution R were used to analyze levels of sequence variation. The Picante and VEGAN packages were used to generate rarefaction curves to assess sampling effort for each class of mollusc (Dixon 2003, Kembel et al. 2010). These software packages were also used to perform linear regressions to determine if the number of individuals sampled within a species impacted values of intraspecific divergence (Dixon 2003, Kembel et al. 2010). P-values less than 0.05 were considered significant. Finally, these packages were used to generate a box plot for comparing mean nearest neighbour distance between genera with two or more species sampled (Dixon 2003, Kembel et al. 2010). For box plot analysis, significance was determined from an analysis of variance (ANOVA). A chi-square test of homogeneity was used to determine whether nucleotide frequency was homogeneous among classes. Species with intraspecific divergences greater than 2% were treated as potential cryptic complexes. Genetic variation, both within and between species, as well as the number of individuals and genetic clusters, was analyzed for each cryptic species. Lastly, the boot and Hmisc packages were used to test whether mean nearest neighbour distance was correlated with mean GC content in molluscan orders (Harrell Miscellaneous 2012).

Results

Sequence recovery

Among the 2471 marine mollusc specimens analyzed, 1334 COI sequences were recovered from 234 morphospecies. The LCO1490_t1 and HCO2198_t1 primer set, along with a 1:10 dilution of DNA and an annealing temperature of 52°C for amplification, generated the highest success in sequence recovery. Success rates were significantly different among chitons, bivalves and gastropods (p < 0.0001) (Table 1.2).
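The comparison of success rates among classes rests on a Pearson's chi-squared statistic computed from a table of successes and failures. The Python sketch below shows that calculation under stated assumptions: the counts are invented for illustration only (they are not the values in Table 1.2), and the function name `chi_square_homogeneity` is hypothetical. The thesis analyses themselves were run in R.

```python
def chi_square_homogeneity(table):
    """Pearson chi-squared statistic and degrees of freedom.

    `table` is a list of rows of counts, one row per group
    (here: one row per mollusc class, columns = success, failure).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if recovery were independent of class.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical (sequenced, failed) counts; NOT data from this study.
hypothetical_counts = [
    [80, 20],  # chitons
    [30, 70],  # bivalves
    [60, 40],  # gastropods
]
stat, df = chi_square_homogeneity(hypothetical_counts)

# 5.991 is the chi-squared critical value for df = 2 at alpha = 0.05.
print(f"chi2 = {stat:.2f}, df = {df}, significant = {stat > 5.991}")
```

With real data, the counts for each class would replace the hypothetical rows, and a p-value would normally be obtained from a statistics package rather than from a fixed critical value.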
Sequences ranged in length from 223 to 658 bp, but 88% were greater than 600 bp. Table 1.3 displays the number of sequences and species examined in the 5 classes and 33 orders represented in this study, along with measures of intra- and interspecific divergence and GC content. The overall mean values for intra- and interspecific divergence were 0.51% and 13.5%, respectively. Values of maximum intraspecific divergence ranged from 0 to 30.58%.

COI variation in marine molluscs

Morphological study indicated the presence of 234 species; 77 were represented by a single specimen while the other 157 species had an average of 8 specimens (range 2-67). All but one of these species had one or more sequence records >400 bp in length. A barcode gap was present for the majority of species in this study with exceptional cases likely reflecting cryptic complexes (Figure 1.3). Algorithms for OTU determination generated estimates of 242 (BIN), 255 (jMOTU), 270 (CROP) and 271 (ABGD). However, 72 specimens representing 27 morphologically identified species lack BIN assignments because their sequence records were <500 bp. The BIN, jMOTU, CROP and ABGD algorithms generated 71, 97, 100 and 114 singletons, respectively. Values for intraspecific variation, both maximum and mean, were not significantly associated with the number of individuals analyzed (Figure 1.4; p = 0.71, p = 0.40). Furthermore, mean nearest neighbour distances were not correlated with the number of species analyzed from a genus (Figure 1.5; p = 0.07). Lastly, nine species in this study demonstrated deep intraspecific divergences that were greater than 2%. Table 1.4 provides values of mean intraspecific divergence, the number of individuals and clusters and locality information detailing the distribution of each cryptic species across Canadian oceans. Figure 1.6 provides neighbour-joining trees (K2P) for each of the nine cases where deep intraspecific divergence was detected.

Variation in nucleotide composition

Mean GC content averaged 37.1% for all species (range 32-43%). A chi-square test of homogeneity demonstrated that nucleotide frequencies were not identical among species in each of the five molluscan classes (p < 0.001). Mean nearest neighbour distances appeared positively correlated with mean GC content, with a correlation coefficient of 0.51, suggesting that as GC content increases across orders so does the genetic distance between congeneric species (Figure 1.7; p = 0.002).

Distribution of indels

Indels were only detected in two of the five classes, Bivalvia and Gastropoda, but they occurred in nearly half (47%) of the bivalve species versus just 9% of the gastropods. Indels were detected in 27 bivalve species involving representatives of 12 families, and in 11 species of gastropods from 4 families. Indels were conserved in 7 of the 10 bivalve families, but varied between genera in 2 families and between species in 1 family. Indels were conserved in 3 of the 4 gastropod families but varied between genera in the Lottidae.
In bivalves, 1 deletion was observed in Cyclocardia borealis, Mactromeris polynyma, Saxidomus gigantea and in all six Tellinidae species while 2-3 deletions occurred in Delectopecten greenlandicus, Astarte montagui and both Crassostrea species. Moreover, 1 insertion occurred in all species of Myidae, Mytilidae, Arcidae and Glycymerididae, while three insertions occurred in all species of Thyasiridae. Finally, 1 insertion and 3 deletions occurred in Astarte borealis while 2 insertions and 3 deletions occurred in Cyclocardia crassidens. In gastropods, all five Lottia species had one insertion and all Pyramidellidae (Boonea cf. bisuturalis, Odostomia sp.1, Odostomia sp. 2) and Onchidiidae (Onchidella borealis and Onchidella cf. carpenteri) species had one deletion, while Limacina helicina had four deletions. Indels ranged in length from 3 to 18 nucleotides in bivalves and from 3 to 12 nucleotides in gastropods. All indels were in multiples of 3 nucleotides, suggesting they did not derive from pseudogenes. Indels were mapped onto the secondary structure of COI and often appeared in proximity to external loops (Figure 1.8).
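The reading-frame argument above can be illustrated with a minimal sketch. It assumes only that indel lengths are recorded in nucleotides; the function names are hypothetical and are not part of the analysis pipeline used in this study.

```python
def preserves_reading_frame(indel_lengths):
    """Return True if every indel length (in nucleotides) is a positive multiple of 3.

    Indels whose lengths are multiples of 3 add or remove whole codons and
    leave the downstream reading frame intact.
    """
    return all(n > 0 and n % 3 == 0 for n in indel_lengths)


def codons_affected(indel_length):
    """Number of whole codons gained or lost by an in-frame indel."""
    if indel_length <= 0 or indel_length % 3 != 0:
        raise ValueError("indel length is not a positive multiple of 3")
    return indel_length // 3


# Length extremes reported in the text: 3-18 nt (bivalves), 3-12 nt (gastropods).
reported_extremes = [3, 18, 12]
print(preserves_reading_frame(reported_extremes))
print([codons_affected(n) for n in reported_extremes])
```

Under this reading, the longest bivalve indel (18 nucleotides) corresponds to six codons and the longest gastropod indel (12 nucleotides) to four.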


Discussion

Sequencing success in Mollusca

Most of the sequences recovered in this study were greater than 600 bp, but some were as short as 250 bp, reflecting the use of internal primers. This study employed multiple rounds of PCR, tested different primer cocktails and modified PCR regimes to minimize contamination and pseudogene recovery. These optimization studies revealed that a 1:10 dilution of DNA, coupled with tailed Folmer primers and a PCR regime with an annealing temperature of 52°C, generated the highest success. This protocol usually produced single amplicons that lacked evidence of pseudogene amplification. Mucopolysaccharides are present in many mollusc groups and because they are not always removed during DNA extraction, they can interfere with DNA polymerase, reducing PCR amplification success. Because only half of the specimens generated an amplicon, significant challenges in barcode recovery remain. No barcode sequences were recovered from six species (Anomia simplex, Astarte crenata, Astarte moerchi, scutulata, Notoacmea testudinalis and Argopecten irradians). Sequencing success significantly differed between classes, with polyplacophorans delivering the highest success (82.8%) and bivalves the lowest success (32.9%). Future work should focus on the development of primer sets that target specific lineages to generate greater success in sequence recovery. Despite the challenges with some mollusc groups, this study still presents sequence records for nearly 25% of the Canadian marine mollusc fauna.

Patterns of sequence variation

Considering all 156 morphospecies represented by two or more records, Canadian marine molluscs showed a mean intraspecific divergence of 0.51%, a value lower than that reported for echinoderms (0.62%) but higher than in polychaetes (0.38%), marine fishes (0.39%) and decapods (0.46%) (Ward et al. 2005, Costa et al. 2007, Ward et al. 2008, Carr et al. 2010).
The relatively high intraspecific divergence in Canadian molluscs likely reflects, at least in part, the impact of overlooked species. For instance, a mean intraspecific divergence of 16.8% was detected for Tachyrhynchus erosus in this study, while Sun et al. (2011) found a far lower mean intraspecific divergence (0.44%) for species in this order (Caenogastropoda). When the nine cases of deep sequence divergence detected in this study were excluded, mean intraspecific divergence dropped to 0.42%. In any case, there was little overlap between intra- and interspecific divergence (Figure 1.3), suggesting that DNA barcoding is very effective in delineating marine mollusc species.

Levels of sequence variation in this study differed greatly from some other recent studies (Table 1.5). While interspecific divergences were consistently high among taxa, some cases of high intraspecific divergence have been reported (Table 1.5). This difference may reflect the fact that many prior studies focused on a single genus or family while this study involved a phylum-wide analysis, potentially allowing for greater coverage of sequence variation in the former. However, the high intraspecific values in the literature may also reflect misidentifications, which would inflate values of within-species divergence. Nonetheless, the present study does confirm the particularly high interspecific variation in the Vetigastropoda (Table 1.5) (Meyer & Paulay 2005). Members of this order are a diverse group of gastropods that inhabit marine environments ranging from the shallow intertidal to deep-sea hydrothermal vents. Their success in invading such a variety of habitats may be facilitated by a high evolutionary rate that is reflected in high levels of sequence divergence between congeneric taxa (Colgan et al. 1999).

This study revealed nine taxa in which intraspecific divergences were greater than 2% (Table 1.4; Figure 1.6). Prior work has established that deep mtDNA divergences do exist in some mollusc species, such as the land snail, Cepaea nemoralis, where distances reach 12.9% (Thomaz et al. 1996). However, in most other cases, deep divergences, especially those involving allopatric lineages, are now thought to represent different species. For example, the deep divergences in populations of the bipolar pteropods Limacina helicina and Clione limacina are now viewed as evidence for separate species in the Arctic and Antarctic (van der Spoel & Dadon 1999, Hunt et al. 2010, Jennings et al. 2010). This study extends the earlier conclusion, suggesting the possible presence of two Clione species in the Arctic, although their divergences are far lower than those between the major Antarctic/Arctic lineages (Table 1.4; Figure 1.6). Glaciation in Canada has played a key role in shaping the genetic structure of contemporary populations and may be responsible for the segregation and differentiation of lineages on opposing coasts (Hewitt 1996, Bernatchez & Wilson 1998, Wares & Cunningham 2001, Maggs et al. 2008, Dapporto 2009).
For example, the deeply divergent lineages of Hiatella arctica, Macoma balthica and detected in this study may represent sibling species whose origin is linked to isolation in glacial refugia (Table 1.4; Figure 1.6, Layton unpublished). Moreover, the two distinct clusters of Mya truncata and Mya arenaria found in this study may include one of three cryptic species of Mya recorded from the Arctic Ocean (Peterson 1999). In any case, deep intraspecific divergences often flag overlooked species in a lineage (Witt et al. 2006). For instance, DNA barcoding unveiled five cryptic species complexes in the Lepetodrilidae, a family of inhabiting deep-sea hydrothermal vents (Johnson et al. 2008). Similarly, COI analysis revealed that the cold-seep bivalve species, Acesta bullisi, should be separated into two species (Järnegren et al. 2007). Such discoveries are not limited to molluscs; DNA barcoding has revealed overlooked diversity in numerous marine taxa, including fishes of Pacific Canada and asteroid species of southeast Australia (Naughton & O’Hara 2009, Steinke et al. 2009a). These results highlight the need for integrating molecular approaches into species identifications. Future work should focus on evaluating population structure in these potential cryptic cases and include a more detailed examination of genetic variants.

Insertions and deletions in COI


This study has demonstrated that indels are considerably more prevalent in bivalves than in other molluscan classes. Interestingly, high rates of nucleotide substitution occur in lineages containing indels (Tian et al. 2008). For instance, a high mutational load was observed in oysters (Crassostrea) and mussels (Mytilus), genera for which three deletions and one insertion were detected in this study (Figure 1.8) (Hedgecock et al. 2004, An and Lee 2012). In fact, Vetsigian and Goldenfeld (2005) found that indels stimulate diversification fronts in the genome and over a short time can facilitate sequence divergence between populations. Moreover, the discovery of a single amino acid deletion in the Heterostropha and Pulmonata groups corroborates findings from Grande et al. (2004), which suggest this deletion may be due either to convergence, as a result of a length constraint, or to several deletions arising in some pulmonates and the Heterostropha (Figure 1.8). Most indels in bivalves and gastropods occurred in close proximity to the first loop, a pattern that Remigio and Hebert (2003) also observed in Heterobranchia and Patellogastropoda (Figure 1.8). Furthermore, the presence of a single amino acid insertion in the Lottidae (Patellogastropoda) and four amino acid deletions in Limacina helicina may be correlated with accelerated rates of substitution in these groups (Remigio and Hebert 2003). Four bivalve orders demonstrated a single insertion event at position 37 (Figure 1.8). This insertion likely arose independently in each order as it was not possessed by all descendants of the monophyletic group (Plazzi et al. 2011). Interestingly, Mikkelsen et al. (2007) discovered that Thyasira specimens harboured 3 to 4 additional codons in the COI gene, a pattern corroborated by our finding of a three amino acid insertion in all Thyasiridae specimens.
Together with prior work, the present study has established that insertions and deletions in the COI gene are relatively common in molluscs, suggesting that future work should aim to determine the functional relevance of this sequence variation as well as its implications for rates of molecular evolution. Finally, indel patterns are useful for inferring phylogenetic relationships, a particularly important application for a phylum which often faces taxonomic scrutiny.

Patterns of nucleotide composition

After examining the frequency of each nucleotide (A, T, G and C) in all sequences, and grouping these sequences by class, heterogeneity was apparent in nucleotide frequencies between classes. Differences in nucleotide composition can provide insight into ratios of nonsynonymous to synonymous substitutions and may also be linked to extrinsic factors (Albu et al. 2008). For instance, Dixon et al. (1992) found a positive correlation between rDNA melting temperature and environmental temperature in hydrothermal-vent polychaetes. Dixon et al. (1992) interpreted this finding to mean that organisms living in more extreme conditions will have a higher GC content. While Wu et al. (2012) recognize that environmental factors may impact GC content, they suggest that GC content is likely governed by more intrinsic controls, particularly mutator genes. We also discovered a moderate, positive correlation between GC content and sequence divergence between congeneric taxa, although unequal sampling between orders may have contributed to this finding. Future work should aim to remove sampling bias and incorporate phylogenetic analyses to further probe this relationship. Examining GC content in a lineage can also provide insight into whole-genome patterns. For instance, Clare et al. (2008) found that sequence composition patterns in COI are likely reflective of patterns in the entire mitochondrial genome, potentially having implications for rates of mitochondrial evolution. Moreover, Wu et al. (2012) found that increased genome size in bacteria is based on an increase in GC content of the genome. In light of these findings, future work should focus on comparing GC content between taxa for which ecological and genomic data are available.

Conclusions

There is a pressing need to gain a more detailed understanding of Canadian biodiversity. Although barcode records are available for some groups, Canadian marine molluscs have, in the past, been the subject of very little attention with regard to molecular taxonomy. The present study begins to address this gap, providing barcode coverage for 25% of the Canadian fauna and establishing the effectiveness of this approach in delineating species of marine molluscs. Because 30% of the species in this study were represented by a single specimen, future studies should extend sample sizes for such cases to confirm that they do not contain cryptic species. This study has revealed that DNA barcoding is not only useful for documenting biodiversity, but also for unveiling patterns of genetic variation and sequence composition across a broad taxonomic group on a large geographic scale.


Table 1.1. Parameter settings for each OTU algorithm.

OTU Algorithm   Parameter Settings
jMOTU           2% threshold = 13 bp differences
                low BLAST identity filter = 99
                sequence alignment overlap = 60% of min. sequence length
CROP            -l 0.3 -u 0.5 (1%)
                -l 0.6 -u 1 (2%)
                -s (3%)
                -l 1.2 -u 2 (4%)
                -g (5%)
ABGD            Pmin = 0.0006
                Pmax = 0.17
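The jMOTU setting in Table 1.1 equates a 2% threshold with 13 bp differences. A minimal sketch of that conversion follows; it assumes the percentage is taken relative to the full 658 bp barcode length reported in the Results, which is an assumption made here for illustration, and the function name is hypothetical.

```python
def threshold_in_bp(percent_threshold, alignment_length_bp):
    """Convert a percent-divergence threshold into a whole number of base differences."""
    return round(percent_threshold / 100 * alignment_length_bp)


# Assumed alignment length: the 658 bp maximum sequence length reported in this study.
print(threshold_in_bp(2, 658))
```

With these inputs, 2% of 658 bp is 13.16 bp, which rounds to the 13 bp cutoff listed in the table.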

Table 1.2. Percent success in recovery of a COI sequence for specimens (n=1964) in three classes of Mollusca.

Class            Specimens Processed   Sequencing Success
Bivalvia         586                   32.9%
Gastropoda       1256                  42.5%
Polyplacophora   122                   82.8%
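The class-level values in Table 1.2 can be combined into an approximate overall success rate by weighting each class by the number of specimens processed. The sketch below is illustrative only; because it starts from the rounded percentages in the table, the result is approximate, and the function name is hypothetical.

```python
# (specimens processed, sequencing success in %) taken from Table 1.2
table_1_2 = {
    "Bivalvia": (586, 32.9),
    "Gastropoda": (1256, 42.5),
    "Polyplacophora": (122, 82.8),
}


def overall_success(rows):
    """Return (total specimens, specimen-weighted success in %)."""
    total = sum(n for n, _ in rows.values())
    successes = sum(n * pct / 100 for n, pct in rows.values())
    return total, 100 * successes / total


total, pct = overall_success(table_1_2)
print(total, round(pct, 1))
```

The specimen counts sum to the 1964 specimens given in the table caption, and the weighted success rate is roughly 42%, dominated by the large gastropod sample.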


Table 1.3. The number of COI sequences and BINs, intraspecific and nearest neighbour distances and mean GC content for each of 33 orders. *denotes no BIN(s) assigned.

Class           Order              Sequences  BINs  Mean intra       Mean inter       Mean GC
                                                    (% K2P) (SE)     (% K2P) (SE)     (%) (SE)
Bivalvia        Arcoida            5          2     0.26 (0.03)      0 (0)            37.3 (0.7)
                Carditoida         4          2     0.11 (0.03)      58.2 (0.07)      35.0 (0.8)
                Euheterodonta      42         10    3.5 (0.04)       19.8 (0.03)      36.3 (0.3)
                Myoida             2          0*    0.47             0 (0)            40.5
                Mytiloida          147        8     0.39 (0)         17.5 (0)         37.5 (0.1)
                Nuculanoida        38         8     0.20 (0.002)     31.8 (0.9)       39.4 (0.5)
                Ostreoida          21         3     0.29 (0.009)     25.7 (0.008)     39.0 (0.3)
                Pectinoida         10         1     0.07 (0.004)     0 (0)            42.6 (0.6)
                Pholadomyoida      3          2     0 (0)            0 (0)            38.5 (0.2)
                Veneroida          112        20    0.4 (0.001)      19.3 (0.004)     35.5 (0.3)
Cephalopoda     Decapodiformes     12         2     0.09 (0.002)     8.0 (0.2)        35.0 (0.2)
                Myopsida           3          1     0.06 (0.03)      0 (0)            39.2 (0.3)
                Octopoda           2          2     0 (0)            0 (0)            32.5
                Oegopsida          22         2     0.65 (0.003)     0 (0)            37.8 (0.1)
Gastropoda      Archaeogastropoda  55         11    0.12 (0.002)     21.0 (0.02)      39.4 (0.3)
                Caenogastropoda    7          3     16.8 (0.7)       0 (0)            39.9 (3.2)
                Cephalaspidea      28         6     0.65 (0.007)     11.8 (0.04)      31.9 (0.3)
                Docoglossa         10         1     0.15 (0.004)     0 (0)            41.8 (0.09)
                Gymnosomata        17         2     2.2 (0.01)       0 (0)            37.6 (0.1)
                Heterostropha      4          3     0.31 (0)         21.5 (0)         35.8 (1.0)
                Littorinimorpha    223        23    0.43 (0.01)      9.6 (0.06)       37.4 (0.1)
                Mesogastropoda     1          1     0 (0)            0 (0)            39.1
                Neogastropoda      162        40    0.23 (0.001)     8.4 (0.001)      35.3 (0.1)
                Nudibranchia       107        29    0.63 (0.1)       20.2 (0.3)       37.2 (0.2)
                Opisthobranchia    12         4     0.24 (0.009)     0 (0)            34.0 (1.3)
                Patellogastropoda  52         6     0.17 (0.001)     23.9 (0.004)     38.4 (0.6)
                Pulmonata          27         3     0.57 (0.003)     11.1 (0.01)      34.6 (0.03)
                                   19         4     0.45 (0.009)     19.3 (0.03)      38.2 (1.5)
                Vetigastropoda     17         6     0.11 (0.004)     39.0 (0.4)       39.9 (0.8)
Polyplacophora  Chitonida          115        15    0.54 (0)         15.4 (0.003)     36.6 (0.2)
                Neoloricata        29         7     1.1 (0.007)      16.8 (0.007)     36.7 (0.3)
Scaphopoda      Dentaliida         20         11    7.2 (0.8)        14.6 (0.2)       32.3 (0.5)
                Gadilida           6          4     0.22 (0.1)       0 (0)            35.8 (0.9)


Table 1.4. Mean intraspecific divergence (% K2P), number of genetic clusters, number of individuals sampled and locality information for each potential cryptic species complex in this study. *(Layton unpublished)

Species                 N     Mean Intra  Clusters  Cluster Locality
Clione limacina         17    2.15        2         2 Arctic
Mya arenaria            13    2.45        2         1 Pacific/Atlantic, 1 Pacific
Cryptonatica affinis    8     2.47        2         2 Arctic
Hiatella arctica*       172   3.9         4         1 Tri-oceanic, 2 Pacific, 1 Atlantic
Mya truncata            6     4.41        2         2 Arctic
Macoma balthica*        163   7.7         3         1 Pacific/Arctic, 2 NW Atlantic
Thyasira gouldi         7     10.04       2         2 Atlantic
Tachyrhynchus erosus    7     16.81       3         2 Arctic, 1 Atlantic
Triopha catalinae       3     17.99       2         2 Pacific

Table 1.5. Intra and interspecific distances (% K2P) for taxonomic groups examined by DNA barcoding in previous literature. These values are compared to values obtained from similar taxa investigated in this study.

                     This Study:               Literature:
Group                Mean intra   Mean inter   Mean intra   Mean inter   Citation
Muricidae            0.12         6.5-11.4     0.4          6.7-25.2     Zou et al. 2012
Littorinimorpha(1)   0.43         9.6          0.81         -            Meyer & Paulay 2005
Turbinidae           0.07         -            0.18         -            Meyer & Paulay 2005
Lottidae             0.17         23.9         0.25         -            Meyer & Paulay 2005
Vetigastropoda(2)    0-0.34       38.3-39.7    0.10-1.34    3.01-31.25   Johnson et al. 2008
Neogastropoda        0.23         8.4          0.64         8.1          Zou et al. 2011

(1) Cypraeidae used in literature but no Canadian representatives of this family
(2) Lepetodrilidae used in literature but no representatives collected in this study


Figure 1.1. Sampling locations and the number of specimens examined in this study. Sequences obtained from GenBank are not included as they lack locality information.

Figure 1.2. Rarefaction curves for the five classes of Canadian marine mollusc represented in this study. Plot A provides curves for Bivalvia and Gastropoda while plot B provides curves for Cephalopoda, Polyplacophora and Scaphopoda.


Figure 1.3. Maximum intraspecific divergence plotted against nearest neighbour distance for all species in this study. Data points falling above the 1:1 line indicate species for which a barcode gap is present.
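The criterion shown in Figure 1.3 can be expressed as a simple comparison: a barcode gap exists when a species' nearest neighbour distance exceeds its maximum intraspecific divergence. The sketch below uses hypothetical species labels and values purely for illustration; they are not data from this study.

```python
def has_barcode_gap(max_intraspecific, nearest_neighbour):
    """A barcode gap is present when the nearest neighbour distance (% K2P)
    exceeds the maximum intraspecific divergence (% K2P)."""
    return nearest_neighbour > max_intraspecific


# Hypothetical (max intraspecific, nearest neighbour) values in % K2P.
hypothetical = {
    "species A": (0.5, 8.0),   # falls above the 1:1 line
    "species B": (10.0, 4.0),  # falls below the 1:1 line
}

flagged = [name for name, (intra, nn) in hypothetical.items()
           if not has_barcode_gap(intra, nn)]
print(flagged)
```

Species flagged in this way are the candidates for closer inspection as possible cryptic complexes.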

Figure 1.4. Maximum and mean intraspecific divergences (% K2P) plotted against the number of individuals analyzed for 157 species. The regressions of maximum and mean divergence on sample size were not significant (p = 0.71 and p = 0.40, respectively).


Figure 1.5. Box plots comparing mean nearest neighbour distance (% K2P) with the number of species sampled from each genus with ≥ 2 representative species (N = 43). The ANOVA was not significant (p = 0.069). Four morphospecies were excluded because they lack genus-level identification.

Panels: A) Mya truncata; B) Cryptonatica affinis; C) Mya arenaria; D) Clione limacina; E) Tachyrhynchus erosus; F) Thyasira gouldi; G) Triopha catalinae; H) Macoma balthica*; I) Hiatella arctica*

Figure 1.6. Neighbour-joining trees (K2P), with locality information, for 9 cryptic species complexes in this study. NJ trees are coloured blue for bivalves and red for gastropods and triangles represent compressed clades, with sample size provided in brackets. *(Layton unpublished).


Figure 1.7. Mean nearest neighbour distance (% K2P) plotted against mean GC content (%) for the 33 orders of Mollusca represented in this study.
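The two quantities plotted in Figure 1.7 can be sketched as follows: GC content as the percentage of G and C among unambiguous bases, and a Pearson correlation coefficient between two per-order vectors. This is a minimal illustration with hypothetical function names, not the software used to produce the figure.

```python
import math


def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100 * sum(b in "GC" for b in bases) / len(bases)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(gc_content("GGCCAATT"))
```

In practice, `pearson_r` would be applied to the per-order mean GC content and mean nearest neighbour distance columns of Table 1.3.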

Figure 1.8. Secondary structure of COI marked with insertions and deletions for A) gastropods and B) bivalves. Insertions are marked with a blue + sign and deletions are marked with a red x.


Chapter 2 Geographic Patterns of mtCOI Diversity in Two Species of Canadian Marine Bivalves

Abstract

Variation in modes of larval development has strong impacts on dispersal potential and gene flow among populations of marine invertebrates. However, Pleistocene glaciations have also played an important role in shaping population structure in benthic taxa in the northern hemisphere, even those with planktotrophic larvae. This study examines patterns of COI sequence divergence in two bivalve species, Hiatella arctica and Macoma balthica, which share a similar mode of larval development (planktotrophic) and a vast distribution in the Nearctic. This study reveals that both species possess high genetic diversity in the northeast Pacific, but H. arctica has less phylogeographic structure and more sequence variation across its range than M. balthica. Three North American lineages of M. balthica were detected, corroborating a recent taxonomic revision. This study also provides the first evidence that H. arctica may include four species. Ecological differences between these species have likely played a role in their differing biogeographical patterns.


Introduction

Life history attributes, vicariance events and environmental tolerances all play a role in determining where species occur (Reid 1990, Hewitt 2000). In fact, contemporary patterns of population structure in the northern hemisphere can only be understood by considering dispersal capacity and past glacial events and how these factors impacted population isolation and differentiation (Wares & Cunningham 2001). Patterns of population structure in Canadian marine benthic invertebrates are particularly interesting (Meehan 1985) because many species rely on planktonic larvae for long-range dispersal (Meehan 1985, Lee & Boulding 2009). Some larvae spend weeks in the plankton, where passive dispersal by oceanic currents can cause enough gene flow to ensure near panmixis across the species range (Jablonski 1986, Bradbury et al. 2008, Keever et al. 2009, Lee & Boulding 2009). Although planktotrophic species generally demonstrate less regional genetic divergence than congeneric species lacking a planktonic stage (Jablonski 1986, Bradbury et al. 2008, Keever et al. 2009, Lee & Boulding 2009), substantial population structure can still occur in some species with planktonic larvae. For instance, Macoma balthica shows clear genetic divergence between populations in the northwest and northeast Atlantic (Väinölä 2003, Nikula et al. 2007). Cases such as this indicate that spatial homogeneity does not inevitably accompany planktonic larval stages.

Glacial periods had a dramatic impact on Canadian marine environments. During the last (Wisconsin) glacial maximum, global sea levels declined by up to 170 metres and the Arctic Ocean was covered by persistent ice. The Atlantic coast of Canada was heavily impacted by the Laurentide ice sheet, but the Cordilleran ice sheet on the west coast produced lesser impacts (Bernatchez & Wilson 1998, Rohling et al. 1998, Hewitt 2000, Mandryk et al. 2001, Marko 2004).
The recurrent opening and closing of the Bering Strait linked to glacial cycles acted as a secondary factor in causing intermittent isolation and exchange of species between the Pacific and Atlantic Oceans (Vermeij 1991, Taylor & Dodson 1994, Dodson et al. 2007). While range expansions following the last deglaciation are responsible for current distributions, prior episodes of glacial activity undoubtedly shaped both levels and patterns of genetic variation (Bernatchez & Wilson 1998). During each glacial advance, species retreated to refugia and then expanded their range during the subsequent interglacial (Vermeij et al. 1990, Hewitt 2000). For example, studies of mitochondrial DNA diversity in two marine species with planktonic larvae, the fish Mallotus villosus and the sea urchin Strongylocentrotus droebachiensis, demonstrated their persistence in glacial refugia in the northwest Atlantic and northeast Pacific (Addison & Hart 2005, Dodson et al. 2007). Because each glacial advance tended to fragment species distributions in a consistent way, populations were effectively separated for prolonged periods, setting the stage for their differentiation (Hewitt 1996, Maggs et al. 2008, Dapporto 2009). As species expanded their range during interglacials, divergent lineages re-established contact at sites along Canada’s coasts (Harper & Hart 2007). Despite their shared exposure to these environmental changes, species with similar life history attributes, but differing ecological preferences, have undoubtedly responded differently to glacial cycles (Bernatchez & Wilson 1998). Thus, glacial cycles have significantly impacted contemporary genetic structure and these patterns vary with dispersal potential and ecological tolerance.

Past studies have provided insights into how glacial cycles have shaped contemporary patterns of genetic variation in some marine taxa, but little work has been directed toward molluscs. The scope for investigation is vast with regard to marine molluscs as nearly 1000 species occur across Canada’s three oceans. However, of all the marine molluscs occurring in Canada, fewer than 25 have distributions spanning all three oceans. Species in this subset are ideal candidates for investigations of population structure and this study targets two wide-ranging bivalves, Hiatella arctica and Macoma balthica. No prior study has examined the genetic structure of H. arctica, a species with a Holarctic distribution (Alison & Marincovich 1982, Schneider & Kaim 2012) and a long planktonic larval phase that lasts several weeks (Lebour 1938). Adults of Hiatella display considerable phenotypic plasticity, with shell growth varying according to the substrate or microhabitat in which the bivalve grows, making identifications in this genus particularly difficult (Alison & Marincovich 1982). The second species examined in this study, M. balthica, also has a Holarctic distribution (Gofas 2012) and a long-lived planktonic larval stage that extends for 1-2 months, allowing for extensive offspring dispersal along coasts. Prior genetic analysis of this species revealed considerable differentiation across its range. In fact, Väinölä (2003) concluded that M. balthica includes two subspecies: M. balthica balthica from the northeast Pacific and Baltic Sea and M. balthica rubra from .
This study had the primary goal of comparing the extent and patterns of genetic structure in Canadian populations of H. arctica and M. balthica, although European populations of M. balthica were also considered. If component populations of these two species were repeatedly isolated into separate glacial refugia, then this should be reflected in their current population structure and possibly by the occurrence of two or more lineages at sites of secondary contact. Alternatively, if certain populations were extirpated during glaciations, then patterns of reduced genetic diversity should reflect localized extinction and subsequent postglacial recolonization.

Methods

Specimen collection

Between 4 and 60 specimens of M. balthica and H. arctica were collected at various sites in the Pacific Ocean (British Columbia, Alaska), the Arctic Ocean (Manitoba) and the Atlantic Ocean (New Brunswick, Nova Scotia, Prince Edward Island, Newfoundland) between 2007 and 2011 (Figure 2.1). Specimens were obtained from rock crevices, algal mats, holdfasts and soft bottoms in the intertidal and from subtidal habitats using otter trawls, dredges and SCUBA diving. Specimens were transferred into 95% ethanol immediately after collection to ensure tissue preservation. Morphological identification of all specimens was confirmed by Dr. André L. Martel at the Canadian Museum of Nature, while subspecies designations for M. balthica follow Väinölä (2003).

DNA extraction, amplification and sequencing

Doubly uniparental inheritance (DUI), characterized by the transmission of a maternal and paternal mitochondrial lineage through eggs and sperm, can cause deep divergences between male and female conspecifics (Passamonti & Ghiselli 2009, Ghiselli et al. 2012). Despite the presence of this genetic phenomenon in some mollusc groups, somatic tissue is dominated by the female genome and thus sampling only this tissue type will avoid problems posed by DUI (Zouros 2012). Accordingly, DNA extracts were prepared from a small sample of adductor muscle tissue from each specimen. Tissue samples were placed in cetyltrimethylammonium bromide (CTAB) lysis buffer solution with proteinase K. The samples were then incubated for 12 hours at 56°C before a manual glass fibre plate method was used for DNA extraction (Ivanova et al. 2008). Following incubation, the DNA was eluted with 40 µl of ddH2O. After re-suspension, 2 µl of each DNA extract was placed into a well in a separate plate with 18 µl ddH2O to ensure the dilution of salts or mucopolysaccharides that might inhibit PCR. Species-specific primer sets were used: HiaF1/HiaR1: AAGTTGTAATCATCGAGATATTGG and TAGACTTCTGGGTGCCCGAAAAACCA for H. arctica and MMacF1/LepR1: CTTTTATTAGCTGCACCTGATAT and TAAACTTCTGGATGTCCAAAAAATCA for M. balthica. Each well was filled with 2 µl of diluted DNA and the following reagents to generate a 12.5 µl PCR reaction mix: 6.25 µl 10% trehalose, 2 µl ddH2O, 1.25 µl 10× PCR buffer, 0.625 µl MgCl2 (50 mM), 0.125 µl of each forward and reverse primer (10 µM), 0.0625 µl dNTP (10 mM) and 0.06 µl Platinum Taq polymerase.
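As a check on the reaction recipe above, the listed volumes can be summed and scaled to a plate. The sketch below is illustrative: the per-reaction volumes come from the text, while the `master_mix` helper and its 10% overage default are hypothetical conveniences rather than part of the protocol used here.

```python
# Per-reaction volumes (µl) as listed in the text.
reaction_ul = {
    "diluted DNA": 2.0,
    "10% trehalose": 6.25,
    "ddH2O": 2.0,
    "10x PCR buffer": 1.25,
    "MgCl2 (50 mM)": 0.625,
    "forward primer (10 uM)": 0.125,
    "reverse primer (10 uM)": 0.125,
    "dNTP (10 mM)": 0.0625,
    "Platinum Taq": 0.06,
}

total_ul = sum(reaction_ul.values())
print(round(total_ul, 4))


def master_mix(n_reactions, overage=0.10):
    """Scale every reagent except template DNA for n reactions.

    The overage fraction (an assumed 10% by default) covers pipetting loss.
    """
    factor = n_reactions * (1 + overage)
    return {name: vol * factor for name, vol in reaction_ul.items()
            if name != "diluted DNA"}
```

The components sum to 12.4975 µl, consistent with the stated 12.5 µl reaction volume after rounding of the polymerase volume.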
The thermocycling regime consisted of one cycle of 1 min at 94°C, 40 cycles of 40 s at 94°C, 40 s at 52°C, and 1 min at 72°C, and finally 5 min at 72°C. An E-Gel 96 (Invitrogen) was used to check 3 µl of each PCR product and reactions that generated an amplicon were bidirectionally sequenced using BigDye v3.1 on an ABI 3730xl DNA Analyzer (Applied Biosystems). Sequences were edited manually using CodonCode (CodonCode Corporation) and were aligned by eye in MEGA5 (Tamura et al. 2011). The COI gene was amplified from 172 and 196 specimens of H. arctica and M. balthica, respectively.

Data analysis

Neighbour-joining (NJ) trees were constructed in MEGA5 using the Kimura-2-parameter (K2P) distance model and 1000 bootstrap replicates (Kimura 1980, Saitou & Nei 1987, Tamura et al. 2011). The H. arctica and M. balthica NJ trees were rooted with Mya arenaria and Macoma inquinata as outgroups, respectively. Each sequence cluster showing more than 2% divergence from other clusters was labelled. Clustering patterns in each NJ tree were compared with those in the corresponding median-joining haplotype network. These networks are based on maximum parsimony and were constructed in Network 4.6.1 (fluxus-engineering.com, Bandelt et al. 1999). Haplotype networks were then recreated in TCS 1.21 to identify ancestral haplotypes (Clement et al. 2000). Divergence times were estimated in MEGA5 assuming a substitution rate of 2% per million years (Hellberg & Vacquier 1999, Marko 2002, Donald et al. 2005, Tamura et al. 2011).

Arlequin 3.5 (Excoffier & Lischer 2010) was employed to examine patterns of genetic structure in each species. The haplotype and nucleotide diversity for each population was calculated, and Tajima’s D test of neutrality with 10,000 simulated samples was used to infer the nature of selection pressures (Nei 1987, Tajima 1989). The number of haplotypes in each species was determined and the proportion of unique haplotypes was quantified to ascertain which site possessed the highest genetic exclusivity. An analysis of molecular variance (AMOVA) was conducted with a K2P distance model and significance was tested with 1000 permutations (Kimura 1980). The AMOVA results were used to determine whether the majority of genetic variation in each species existed within or between populations as a measure of population differentiation (Excoffier & Lischer 2010). Fixation indices (FST) were estimated with a K2P distance model and significance was tested with 100 permutations to determine the partitioning of variance (Weir & Cockerham 1984). An FST value of 1 indicates that two populations share no haplotypes while a value of 0 indicates all haplotypes are shared and have similar frequencies (Weir & Cockerham 1984). Slatkin’s linearized FST (Slatkin 1993) was subsequently plotted against geographic distance to determine whether genetic variation among populations reflected long-term historical divergence or geographic distance (Slatkin 1993, Kyle & Boulding 2000, Marko 2004, Keever et al. 2009). To test the significance of isolation by distance, a Mantel test with 1000 permutations was conducted in Arlequin 3.5 (p-values < 0.05 were treated as significant).

Results

Sequence recovery and haplotype diversity

The 196 COI sequences from M. balthica ranged in length from 377 to 655 bp (mean = 429 bp), while all 172 sequences from H. arctica were 572 bp. All sequences contained less than 1% ambiguous bases and none possessed stop codons or double peaks. The variable length of the M. balthica sequences reflected the use of internal primers to recover sequences from some specimens with degraded DNA. For the analysis of population structure, sequences from H. arctica were trimmed to 572 bp, while sequences of the Pacific/Arctic M. balthica balthica and the northwest Atlantic (ATL) M. balthica were trimmed to 430 bp and 378 bp, respectively. The highest intraspecific divergences (23%) were observed in H. arctica, while intraspecific divergences in the M. balthica complex peaked at 12.5% (Figure 2.2). NJ trees indicated the presence of four lineages in both species, but one of the lineages of M. balthica was only present in Europe (Figures 2.3 and 2.4).
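The divergences reported above are K2P distances, and the Methods convert such distances to divergence times with a 2% per million year rate. The sketch below illustrates both steps using the standard Kimura (1980) formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It is not the MEGA5 implementation; in particular, treating the 2% rate as a pairwise divergence rate is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura-2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    math.log raises ValueError if the sequences are too divergent (saturated).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_time_my(distance_percent, rate_percent_per_my=2.0):
    """Divergence time in millions of years, assuming a pairwise divergence rate."""
    return distance_percent / rate_percent_per_my


d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")  # one transition in ten sites
print(round(100 * d, 2))
```

If the 2% figure were instead a per-lineage rate, the time estimates from `divergence_time_my` would be halved, which is why the rate interpretation is flagged as an assumption.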


Most sequences (155 of 172) and haplotypes (63 of 75) of H. arctica fell into a dominant cluster that was collected at all sites, although the Alaska population had the greatest number of unique haplotypes (27). All 63 members of this clade showed low divergence, separated by just 1 to 3 mutational steps (Figure 2.3; Figure 2.5A). The remaining 12 haplotypes included representatives of three probable cryptic species, two in the Pacific (Cook Inlet, Alaska and Barkley Sound, British Columbia) and a third in New Brunswick (Figure 2.3; Figure 2.5A). TCS suggested that a northwest Atlantic haplotype was ancestral to the dominant cluster with a projected migration route into Hudson Bay (Churchill, MB) and then the northeast Pacific (Alaska, British Columbia) (Figure 2.5A). Subsequent analysis of intraspecific variation in H. arctica only examined specimens belonging to the dominant clade.

M. balthica also showed high intraspecific divergences, but this variation showed a clear geographic pattern with up to 12.5% divergence between M. balthica balthica and ATL M. balthica populations (Figure 2.4). In total, 42 haplotypes of M. balthica were observed with the greatest number (14) in Europe. This high diversity likely reflects the analysis of larger sample sizes in Europe, although only one representative of each European haplotype was submitted to GenBank (Luttikhuizen et al. 2003, Nikula et al. 2007, Becquet et al. 2012). These sequences were excluded from analysis in Arlequin because they would have distorted estimates of haplotype diversity. Interestingly, the northeast Pacific population grouped with both the Churchill and Baltic Sea populations, while the other European sequences formed a distinct cluster (M. balthica rubra) that appeared more similar to M. balthica balthica than to ATL M. balthica (Figure 2.5B). Populations from the northwest Atlantic only included members of the ATL M. balthica cluster, except those from Newfoundland which included some representatives of the M. balthica balthica cluster (Figure 2.5B). In fact, the median-joining network suggested that an ancestral haplotype for the M. balthica balthica cluster was present in Newfoundland (Figure 2.5B). Because the divergence between clusters 1 and 2 of M. balthica was only slightly greater than 3%, they were considered as a single taxon for analysis in Arlequin (Figure 2.5B). Regional divergence was obvious in the ATL M. balthica and M. balthica rubra lineages, while M. balthica balthica was more broadly distributed, occurring in the northeast Pacific, Arctic and Baltic Sea (Figure 2.4; Figure 2.5B). This divergence is corroborated by the high number of mutational steps that separate each cluster of M. balthica (Figure 2.5B).

The sequence divergences of the three cryptic groups (2, 3, 4) of H. arctica suggest that they diverged from a common ancestor 3, 4.5 and 5.5 million years ago, respectively (Figure 2.3). The four clusters of M. balthica all date to more than 1 million years ago, excluding the split between M. balthica balthica and the other lineage in Newfoundland, which dates to around 900,000 years ago (Figure 2.4).

Patterns of genetic diversity


Table 2.1 reports, for H. arctica and M. balthica, the number of unique haplotypes, nucleotide diversity, haplotype diversity and Tajima’s D index for each population. Populations at Churchill, Manitoba showed the lowest variation for both M. balthica and H. arctica. Populations of H. arctica from New Brunswick and British Columbia possessed the highest values for nucleotide and gene diversity, respectively, while Newfoundland had the greatest gene diversity for both M. balthica clades. The high genetic diversity for Newfoundland may be an artefact of the small sample size (N=5, N=2) for both M. balthica lineages. Despite their exposure to intensive glaciations, populations of both H. arctica and M. balthica from New Brunswick were very diverse. Nucleotide and haplotype diversity did not significantly differ between species. Lastly, Tajima’s D values were negative for all populations except those from Newfoundland, suggesting the presence of many low frequency haplotypes at these sites. Again, the slightly positive Tajima’s D values for both Newfoundland populations may simply be a result of undersampling at this location.

Population structure

There was evidence for population subdivision in M. balthica, but not in H. arctica where most of the genetic variation resided within populations (Table 2.2A & B). This result for H. arctica was unaffected by grouping populations from each coast. By contrast, most genetic variation in M. balthica balthica existed among the coasts with deep divergence between populations in the Atlantic and Pacific (Table 2.3A & B). The opposite was true for ATL M. balthica as most of its variation existed within populations (Table 2.4).

Fixation indices provided further insight into the partitioning of variation in H. arctica and M. balthica. In H. arctica, populations from Nova Scotia and New Brunswick showed little divergence (FST = 0.03), while those from British Columbia and New Brunswick were considerably more divergent (FST = 0.28) (Table 2.5A). Populations of M. balthica balthica from Alaska and British Columbia were the most similar (FST = 0.02), while those from Churchill and Newfoundland showed high divergence (FST = 0.93) (Table 2.5B), a surprising result given the proximity of Hudson Bay and the Atlantic coast. The ATL lineage of M. balthica also showed unique genetic structure with a high FST (FST = 0.23) between the Newfoundland and Prince Edward Island populations, although they were only separated by the narrow Northumberland and Cabot Straits (Table 2.5C).

When Slatkin’s linearized FST values were plotted against geographic distance, only H. arctica demonstrated evidence of isolation by distance (Figure 2.6) with a strong, positive correlation (R² = 0.83), and a Mantel test confirmed its significance (p = 0.011). By contrast, M. balthica balthica demonstrated no evidence for isolation by distance (Figure 2.7), while ATL M. balthica (Figure 2.8) demonstrated a negative relationship (R² = 0.78), but Mantel tests indicated that neither value was significant. Only three populations were used to examine evidence of isolation by distance in the ATL M. balthica, suggesting future work should aim to gather additional data.
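The isolation-by-distance analysis combines two steps: linearizing each pairwise FST as FST / (1 - FST) (Slatkin 1993) and testing the correlation with geographic distance by permutation. The sketch below illustrates both under stated assumptions; it is a generic one-sided Mantel test, not the Arlequin 3.5 routine, and the distance and FST matrices are hypothetical toy values rather than data from this study.

```python
import random
from itertools import combinations


def slatkin_linearized(fst):
    """Slatkin's linearized FST: FST / (1 - FST)."""
    return fst / (1.0 - fst)


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_test(genetic, geographic, permutations=1000, seed=0):
    """One-sided Mantel test on two symmetric distance matrices.

    Returns the observed correlation and the share of permutations
    (plus the observed arrangement) with a correlation at least as large.
    """
    n = len(genetic)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geographic[i][j] for i, j in pairs]
    observed = _pearson([genetic[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        # Shuffle population labels of the genetic matrix, rows and columns together
        order = list(range(n))
        rng.shuffle(order)
        permuted = [genetic[order[i]][order[j]] for i, j in pairs]
        if _pearson(permuted, geo_vec) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four sites along a coast (km) and made-up pairwise FST
geo = [[0, 100, 200, 300], [100, 0, 100, 200], [200, 100, 0, 100], [300, 200, 100, 0]]
fst = [[0, 0.05, 0.10, 0.15], [0.05, 0, 0.05, 0.10], [0.10, 0.05, 0, 0.05], [0.15, 0.10, 0.05, 0]]
linearized = [[slatkin_linearized(v) for v in row] for row in fst]
r, p = mantel_test(linearized, geo)
print(r, p)
```

With only a handful of populations the number of distinct permutations is small, so the attainable p-values are coarse; this is one reason the three-population ATL M. balthica comparison is difficult to interpret.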


Discussion

Comparing diversity and structure in two bivalves with planktotrophic larval development

While a widespread lineage was present in both bivalve species, the M. balthica complex also showed evidence of regional divergence. Divergence between species with shared life history strategies has been demonstrated in other molluscs including the direct developing gastropods Nucella lamellosa and Nucella ostrina (Marko 2004). H. arctica showed less regional variation across Canada than M. balthica, although one lineage of the latter species occurs both in Canadian waters and in the Baltic Sea. The presence of widespread lineages in H. arctica and M. balthica suggests both species have high gene flow among their populations (Keever et al. 2009, Lee & Boulding 2009). The spatial homogeneity demonstrated by the main H. arctica lineage may be the result of adults burrowing in kelp holdfasts and attaching to ship hulls, providing a secondary dispersal mechanism (Helmuth et al. 1994). H. arctica and M. balthica differ radically in some life history strategies, including habitat and feeding behaviours (Newell 1965, Ali 1970, Hines & Comtois 1985). M. balthica lives in soft sediment and occupies shallower habitats and thus may have been more susceptible to local extinctions during glacial maxima, similar to the intertidal dogwhelk Nucella lapillus (Dorjes et al. 1986, Colson & Hughes 2004).

Patterns of population fragmentation during glacial periods are often also reflected in measures of genetic diversity. For instance, reduced genetic diversity has been demonstrated in populations of marine fishes severely impacted by glaciation (Bernatchez & Wilson 1998) as well as in Mya arenaria, the softshell clam (Strasser & Barber 2009). However, some taxa which have undergone repeated extinctions during glaciations possess high diversity (Bernatchez & Wilson 1998, Strasser & Barber 2009). For example, high genetic diversity in the dogwhelk, Nucella lapillus, has been linked to rapid expansion following a severe population bottleneck (Colson & Hughes 2004). A similar process may explain the high diversity in populations of both H. arctica and M. balthica from New Brunswick despite their likely exposure to severe glacial conditions (Briggs 1970, Wares & Cunningham 2001). By contrast, the low genetic diversity of populations of both species at Churchill may reflect both the fact that Hudson Bay only formed 8000 years ago (Ashworth 1996) and its isolation from glacial refugia. Conversely, Alaskan populations of both species were diverse, perhaps reflecting their foundation through the admixture of lineages from two or more refugia (Kelly et al. 2006, Sakaguchi et al. 2011).

Implications for glacial refugia in the northeast Pacific

The admixture of previously isolated lineages is thought to explain high levels of genetic diversity in certain populations of the longnose dace (Rhinichthys cataractae) and those of the bluestriped snapper (Lutjanus kasmira) (Girard & Angers 2006, Gaither et al. 2010). Because genetic diversity can be elevated in both zones of admixture and at sites that served as glacial refugia, it is difficult to determine which process explains high diversity in any particular situation (Petit et al. 2003, Kelly et al. 2006, Sakaguchi et al. 2011). Populations of both bivalves examined in this study were diverse in the northeast Pacific, a pattern also seen in the direct developing gastropod N. lamellosa and the brooding sea cucumber Cucumaria pseudocurata (Arndt & Smith 2002, Marko 2004). Although the northeast Pacific coast was glaciated, the extent of glaciation along the Pacific was much less severe than in the northwest Atlantic (Mandryk et al. 2001, Marko 2004). As well, there may have been several glacial refugia in the North Pacific despite literature suggesting that populations were displaced to the south (Warner et al. 1982, Hewitt 2000, Mandryk et al. 2001, Marko 2004). Populations in proximity to refugia often possess many unique haplotypes, while haplotypes in admixture zones combine variants found in two or more refugia (Provan & Bennett 2008). The presence of many unique haplotypes in the Alaskan population of Nucella lamellosa was invoked as evidence for a northern refugium (Marko 2004). H. arctica and M. balthica reinforce this pattern as 78% and 92% of the haplotypes in their Alaskan populations were unique. While most of their haplotypes are unique to this area, the presence of some shared haplotypes suggests possible admixture. The potential for postglacial admixture in Alaska has important implications given that founder populations may be more fit than parental lineages in these environments, ultimately accelerating rates of evolution (Wares et al. 2005, Kelly et al. 2006).

Evidence of sibling species

Populations isolated into separate glacial refugia inevitably diverge, leading in time to speciation (Dodson et al. 2007, Maggs et al. 2008, Dapporto 2009). M. balthica shows evidence of incipient speciation as indicated by the recognition of two subspecies (Väinölä 2003) with largely allopatric ranges. The detection of both M. balthica balthica and ATL M. balthica lineages in Newfoundland suggests that this area is a zone of secondary contact between Pacific and Arctic haplotypes. It also suggests that this region may be an ideal location for more detailed genetic analysis to determine if these lineages show incipient or complete reproductive isolation. It is possible that populations near Newfoundland experienced localized extinction and recolonization, a process reported for some taxa in the northwest Atlantic (Briggs 1970, Wares & Cunningham 2001). In any case, the presence of more than 8% sequence divergence between members of these two clusters suggests they may warrant recognition as sibling species. H. arctica also shows evidence for sibling species with the discovery of four genetically divergent lineages across Canada, despite the limited evidence for regional differentiation in the most abundant of these groups. Additional work is required to ascertain if the divergence at COI is accompanied by divergence at nuclear loci.

These findings suggest that several glacial refugia occurred on the Pacific and Atlantic coasts of Canada and that these refugia fostered speciation. Coyer et al. (2011) discovered two separate glacial refugia for macroalgae (Fucus distichus) in the Newfoundland area alone. Moreover, Nikula et al. (2007) suggest that multiple refugia existed for M. balthica, likely contributing to the deep divergences observed in this complex. In all, the genetic structure of species with planktonic larvae has not only been influenced by dispersal potential but has also been significantly shaped by Canada’s extensive glacial history (Bernatchez & Wilson 1998, Dodson et al. 2007).

Conclusions

This study constitutes the first investigation of population structure in Hiatella arctica and further contributes to our understanding of population structure in Macoma balthica. The present study has shown that while some lineages of M. balthica and H. arctica demonstrate spatial homogeneity across Canada, there is also evidence of genetic subdivision, suggesting that population structure varies in species with similar life history strategies. Because the taxonomic status of the three lineages in the M. balthica complex is uncertain, future work should incorporate additional genes and studies of reproductive compatibility to determine their status. Little phylogeographic structure was present in the main lineage of H. arctica, but the detection of three other deeply divergent lineages suggests the presence of overlooked sibling species. Whether there are morphological differences, at the larval or early juvenile stages, among these divergent H. arctica lineages remains to be determined and certainly warrants further taxonomic research. In addition, this study has revealed that the northeast Pacific is a zone of high diversity, suggesting it served as a glacial refugium or that it is a zone of secondary contact. Future work should compare intertidal and subtidal populations to rule out the possibility of depth related differentiation as well as include a survey of key environments, such as the Chukchi and Labrador Seas, which remain unsampled. Lastly, future work should aim to calibrate evolutionary rates in each genus as well as gain an understanding of genetic diversity in populations in Asian waters.


Table 2.1. Genetic diversity in populations of the bivalve species, H. arctica and M. balthica, as measured by number of haplotypes, haplotype diversity (h), nucleotide diversity (π) and Tajima’s D.

A) H. arctica
Population   N    Haplotypes   Unique H   h      π        Tajima’s D
NS           60   24           17         0.88   0.0071   -1.35
AK           43   23           18         0.94   0.0075   -1.40
MB           19   7            3          0.71   0.0051   -0.82
NB           25   18           13         0.93   0.0078   -1.73
BC           8    7            5          0.96   0.0060   -0.06

B) M. balthica balthica
Population   N    Haplotypes   Unique H   h      π        Tajima’s D
AK           38   12           11         0.69   0.0045   -1.48
MB           19   4            3          0.66   0.0019   -0.20
BC           4    3            2          0.83   0.0035   -0.75
NFLD1        5    4            4          0.90   0.0056   0

C) ATL M. balthica
Population   N    Haplotypes   Unique H   h      π        Tajima’s D
PEI          33   9            6          0.74   0.0053   -1.04
NB           51   12           9          0.83   0.0090   0.01
NFLD2        2    2            0          1      0.016    0
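The summary statistics in Table 2.1 have standard closed forms, and a short sketch may help readers relate the columns to one another. The functions below follow Nei's (1987) haplotype diversity and Tajima's (1989) D; they are illustrative only and do not reproduce Arlequin's output, and the function names are assumptions. The example calls use only counts that appear in Table 2.1.

```python
import math


def haplotype_diversity(counts):
    """Nei's (1987) haplotype diversity from per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sequences")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


def percent_unique(unique_haplotypes, total_haplotypes):
    """Share of a population's haplotypes found at no other site, in percent."""
    return 100.0 * unique_haplotypes / total_haplotypes


def tajimas_d(n, segregating_sites, mean_pairwise_diff):
    """Tajima's (1989) D.

    mean_pairwise_diff is the mean number of differences per sequence
    pair over the whole alignment (not per site).
    """
    if n < 4:
        # The variance term is zero for n = 2 or 3, so D is undefined there
        raise ValueError("Tajima's D needs at least four sequences")
    if segregating_sites == 0:
        raise ValueError("no segregating sites")
    s = segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation
    return (mean_pairwise_diff - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Two sequences carrying two different haplotypes, as for NFLD2 in Table 2.1
print(haplotype_diversity([1, 1]))
# Unique haplotypes in Alaska from Table 2.1: 18 of 23 and 11 of 12
print(round(percent_unique(18, 23)), round(percent_unique(11, 12)))
```

The sample-size guard in `tajimas_d` reflects a property of the statistic itself: with two or three sequences its variance term vanishes, so values for very small samples such as NFLD2 should be read with caution.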


Table 2.2. Overall genetic structure measured by AMOVA (analysis of molecular variance) for H. arctica both A) grouped by coastal populations and B) with no grouping. P-values < 0.05 were treated as significant.

A)
Variation                       df    Sum of Squares   Variance Components   % of Variation   P-value
Among Coasts                    2     36.9             0.30                  12.7             0.06
Among Populations w/in Coasts   2     8.4              0.09                  3.8              0.03
Within Populations              150   301.0            2.0                   83.6             0.00
Total                           154   346.3            2.40

B)
Variation                       df    Sum of Squares   Variance Components   % of Variation   P-value
Among Populations               4     45.3             0.33                  14.1             0.00
Within Populations              150   301.0            2.0                   85.9
Total                           154   346.3            2.33
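The "% of Variation" column in an AMOVA table is each variance component expressed as a share of the total, and with no grouping the among-population share is the overall fixation index. The following sketch shows that arithmetic using the rounded components printed in Table 2.2B; the function names are assumptions, and because the inputs are rounded the results agree with the table only to within rounding.

```python
def percent_of_variation(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    return {source: 100.0 * value / total for source, value in components.items()}


def fixation_index(among, within):
    """Among-population share of total variance (no higher-level grouping)."""
    return among / (among + within)


# Rounded variance components as printed in Table 2.2B (H. arctica, no grouping)
table_2_2b = {"among populations": 0.33, "within populations": 2.0}
shares = percent_of_variation(table_2_2b)
phi = fixation_index(0.33, 2.0)
print(shares, phi)
```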

Table 2.3. Overall genetic structure measured by AMOVA for M. balthica balthica A) grouped by coastal populations and B) with no grouping. P-values < 0.05 were treated as significant.

A)
Variation                       df   Sum of Squares   Variance Components   % of Variation   P-value
Among Coasts                    2    72.0             2.1                   70.9             0.34
Among Populations w/in Coasts   1    1.1              0.04                  1.4              0.19
Within Populations              62   50.3             0.81                  27.8             0.00
Total                           65   123.4            2.95

B)
Variation                       df   Sum of Squares   Variance Components   % of Variation   P-value
Among Populations               3    73.1             1.9                   69.6             0.00
Within Populations              62   50.3             0.81                  30.4
Total                           65   123.4            2.71


Table 2.4. Overall genetic structure measured by AMOVA for ATL M. balthica with no grouping. P-values < 0.05 were treated as significant.

Variation            df   Sum of Squares   Variance Components   % of Variation   P-value
Among Populations    2    7.0              0.10                  6.2              0.07
Within Populations   83   120.3            1.4                   93.9
Total                85   127.3


Table 2.5. FST for populations of A) H. arctica, B) M. balthica balthica and C) ATL M. balthica. P-values < 0.05 are marked with an asterisk.

A)
                   Nova Scotia   New Brunswick   Alaska   B.C.    Churchill
Nova Scotia        0
New Brunswick      0.03*         0
Alaska             0.16*         0.22*           0
British Columbia   0.25*         0.28*           0.08     0
Churchill          0.05*         0.06*           0.15*    0.25*   0

B)
                   Alaska   B.C.    Churchill   Newfoundland
Alaska             0
British Columbia   0.02     0
Churchill          0.09*    0.14*   0
Newfoundland       0.88*    0.88*   0.93*       0

C)
                                P.E.I.   New Brunswick   Newfoundland
Prince Edward Island (P.E.I.)   0
New Brunswick                   0.07*    0
Newfoundland                    0.23     -0.19           0


Figure 2.1. Collection sites for H. arctica and M. balthica. The sample size for each species at a site is shown in the pie.



Figure 2.2. Intraspecific sequence divergence (K2P) for A) H. arctica and B) M. balthica.


[Figure 2.3 image not reproduced; clades are labelled 1 to 4.]

Figure 2.3. Neighbour-joining tree based on K2P distances for H. arctica. The top scale shows estimated divergence times in millions of years, while the bottom scale bar shows % sequence divergence (K2P). Bootstrap probabilities are shown on the NJ tree and red triangles represent compressed clades, with sample size provided in brackets.
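The two scales on the tree are linked by the assumed substitution rate of 2% per million years. A minimal strict-clock sketch of that conversion is given below; it is not the MEGA5 procedure, and because the text does not state whether the rate applies to pairwise divergence or to each lineage, the hypothetical `rate_is_per_lineage` flag exposes both readings. The 6% input is an illustrative value, not a measurement from this study.

```python
def divergence_time_myr(percent_divergence, rate_percent_per_myr=2.0,
                        rate_is_per_lineage=False):
    """Convert % sequence divergence to time (million years) under a strict clock.

    If the rate is per lineage, two lineages accumulate differences in
    parallel, so the pairwise rate is twice the per-lineage rate.
    """
    pairwise_rate = rate_percent_per_myr * (2 if rate_is_per_lineage else 1)
    return percent_divergence / pairwise_rate


# Illustrative input: 6% divergence under each reading of the 2% rate
print(divergence_time_myr(6.0))
print(divergence_time_myr(6.0, rate_is_per_lineage=True))
```

The two readings differ by a factor of two, which is worth keeping in mind when comparing the dates here with those from studies that report their rate convention differently.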


[Figure 2.4 image not reproduced; clade labels within the M. balthica complex: 1 = M. balthica balthica, 2, 3 = M. balthica rubra, 4.]

Figure 2.4. Neighbour-joining tree based on K2P distances for M. balthica. Subspecies names suggested by Väinölä (2003) are provided. The top scale shows estimated divergence times in millions of years, while the bottom scale bar shows sequence divergence (% K2P). Bootstrap probabilities are shown on the NJ tree and blue triangles represent compressed clades, with sample size provided in brackets.


[Figure 2.5 image not reproduced; panels A and B each show clusters labelled 1 to 4.]

Figure 2.5. Median-joining haplotype networks for A) H. arctica and B) M. balthica constructed in Network 4.6.1 with maximum parsimony. All mutational steps are equal to 1 unless shown by numeral. Presumptive ancestral haplotypes are marked with a white star. The size of circles in each network varies with the number of sequences belonging to each haplotype. M. balthica rubra sequences are included.


Figure 2.6. FST values (Slatkin’s linearized) plotted against geographic distance (km) for populations within the main lineage of H. arctica to examine the extent of isolation by distance. Results from a Mantel test for significance are provided on the plot.

Figure 2.7. FST values (Slatkin’s linearized) plotted against geographic distance (km) for populations of M. balthica balthica to examine the extent of isolation by distance. Results from a Mantel test for significance are provided on the plot.


Figure 2.8. FST values (Slatkin’s linearized) plotted against geographic distance (km) for populations of ATL M. balthica to examine the extent of isolation by distance. Results from a Mantel test for significance are provided on the plot.


General Conclusions

Summary of findings

My thesis has explored patterns of sequence variation in the COI gene in Canadian marine molluscs both at the phylum and species level. The results of this work suggest that intraspecific variation at COI may be higher in molluscs than in other marine phyla, and that patterns of sequence divergence vary among species with similar life history strategies. My results expand our knowledge of species diversity and population structure in molluscs and provide crucial insight for conservation strategies, particularly for translocations.

Chapter 1 makes an important contribution toward the construction of a comprehensive barcode library for Canadian marine molluscs, generating records for nearly 25% of the known malacological fauna of Canada. This work provides novel insight into how sequence divergence varies among molluscs, and also reveals a correlation between nearest neighbour distance and GC content. While my work has begun to fill the gap in barcode coverage for Canadian molluscs, many species still lack data.

My investigations in Chapter 1 revealed deep intraspecific divergences in 9 species, motivating the two case studies in Chapter 2. In this chapter, I show that population structure can differ greatly in species with similar larval development and dispersal potential, suggesting that vicariance events have provoked variable outcomes in genetic divergence patterns. Certainly prior glaciations have fragmented the populations of both species, provoking genetic subdivision and the formation of species complexes. Given that patterns of population fragmentation vary between species, future research needs to examine how habitat preferences and larval resilience have impacted the contemporary distribution of species.
The application of a DNA barcode library for Canadian marine molluscs

DNA barcoding has proven effective in delineating species boundaries across many animal groups, but my work is the first attempt to examine its performance in a large number of Canadian marine molluscs. My investigations revealed 9 cases of deep intraspecific divergence (>2%) that likely represent cryptic species, potentially representing new Canadian records. Webb et al. (2012) similarly discovered that apparent cases of high intraspecific divergence in North American Ephemeroptera were largely due to overlooked species complexes that showed large divergence between their haplotype clusters. Such overlooked taxa inflate mean intraspecific divergences and highlight the importance of integrating molecular and morphological approaches to advance our understanding of species boundaries. In light of this, it is imperative that future barcode studies include detailed taxonomic work to resolve misidentifications, cases of synonymy, and overlooked species that otherwise generate a misleading sense of sequence divergence.

Although my study surveyed many species, future work should aim to fill gaps in species coverage, especially by sampling deep sea habitats where many species await discovery (Archambault et al. 2010). It would also be valuable to compare patterns of genetic variation in marine molluscs to those in their freshwater and terrestrial counterparts to gain insight into how patterns of sequence differentiation vary across these habitats. For instance, the majority of marine species in the tropical Pacific were found to be more widespread and to show less differentiation than terrestrial species (Paulay & Meyer 2002). Future work should extend this analysis by comparing patterns of divergence in molluscs from the Arctic, temperate zone and tropics to better understand how species diversity, as well as rates of speciation and diversification, varies among these environments. This research would help to provide the data needed to verify that lower extinction rates are responsible for the greater species richness evident in tropical settings (Schemske 2012).

Gene flow in marine populations

While gene flow is undoubtedly greater in species with planktonic larvae, recent studies have indicated that panmixis is not inevitable. In fact, spatial heterogeneity was noted in a planktotrophic barnacle (Semibalanus balanoides), while the direct developing gastropod (Nucella ostrina) showed spatial homogeneity (Holm & Bourget 1994, Marko 2004). These results suggest that while dispersal potential is important, other factors also impact population structure. While my study assessed population structure in two species with planktotrophic larval development, only a few studies have compared patterns between species with different modes of larval development. Lee and Boulding (2009) found that population structure in littorinid snails from the northeast Pacific differed both within and between larval types. Given this complexity, future work should aim for a broad analysis of population structure in planktotrophic, lecithotrophic and direct developing species.

Furthermore, while planktonic larvae facilitate gene flow among populations, gene flow may vary if larvae differ in their tolerance to environmental conditions. As well, some species show delayed metamorphosis when exposed to low temperatures, so gene flow can be impacted by environmental temperatures (Pechenik 1980). In light of such complexities, a solid understanding of the physiological tolerances of planktonic larvae and of the duration of the larval stage would bring new insight into population differentiation.

If dispersal potential were the only factor determining population structure, then differing phylogeographic patterns would always be apparent between species with planktotrophic larval development and those with direct development. The fact that they are not suggests that vicariance events play a key role. Multiple glacial cycles have impacted Canada’s coasts and while this study utilizes coalescent methods to examine the impact of glaciation on population structure, fossil data would provide crucial insight into both historical distributions and the location of glacial refugia. Lineages showing marked sequence divergence were discovered in about 7% of the species examined in this study. My work emphasizes the value of coalescent approaches for population genetics because of the difficulty of discriminating cryptic species through the fossil record alone.


Conclusions and implications for conservation

Protecting marine biodiversity is a challenging task, but one which is critical given the rate at which i) marine ecosystems are being impacted by human activities and ii) species are becoming extinct. My work has begun the construction of a DNA barcode library for Canadian marine molluscs that not only provides taxonomic assignments, but also locality information. This study is a particularly important contribution to the DNA barcoding literature because it highlights the efficacy of COI for teasing apart patterns of genetic variation on both a broad and local scale. In addition, the documentation of genetic variation in this phylum lends itself to comparative studies with other marine phyla. Such research is crucial to better understand speciation in marine environments and to broaden knowledge of the factors promoting diversification in marine phyla.

Discovering cryptic diversity in the marine realm, particularly in species with a Holarctic distribution, has implications for conservation efforts that target both species diversity and genetic diversity. For instance, identifying distinct genetic clusters in species thought to have broad distributions can impact translocation efforts which aim to replenish locally extirpated populations. These findings can also significantly impact bivalve mariculture. In all, this thesis broadens our understanding of genetic variation in Canadian marine molluscs, aiding future conservation efforts.


Literature Cited

Addison, J.A. & Hart, M.W. (2005). Colonization, dispersal, and hybridization influence phylogeography of North Atlantic sea urchins (Strongylocentrotus droebachiensis). Evolution, 59, 532-543.

Albu, M., Min, X.J., Hickey, D. & Golding, B. (2008). Uncorrected nucleotide bias in mtDNA can mimic the effects of positive Darwinian selection. Molecular Biology and Evolution, 25, 2521-2524.

Ali, R.M. (1970). The influence of suspension density and temperature on the filtration rate of Hiatella arctica. Marine Biology, 6, 291-302.

Alison, R.C. & Marincovich, L.J. (1982). A late Oligocene or earliest Miocene molluscan fauna from Sitkinak Island, Alaska. In: Jurassic (Oxfordian and late Callovian) Ammonites from the Western Interior Region of the United States (ed Imlay, R.W.). Geological Survey professional paper, Washington, 1232, 115.

An, H.S. & Lee, J.W. (2012). Development of microsatellite markers for the Korean mussel, Mytilus coruscus (Mytilidae) using next-generation sequencing. International Journal of Molecular Sciences, 13, 10583-10593.

Archambault, P., Snelgrove, P.V.R., Fisher, J.A.D., Gagnon, J.M., Garbary, D.J., et al. (2010). From sea to sea: Canada’s three oceans of biodiversity. PLoS ONE, 5, e12182.

Arndt, A. & Smith, M.J. (2002). Genetic diversity and population structure in two species of sea cucumber: differing patterns according to mode of development. Molecular Ecology, 7, 1053-1064.

Ashworth, A. (1996). The response of arctic Carabidae (Coleoptera) to climate change based on the fossil record of the Quaternary. Annales Zoologici Fennici, 33, 125-131.

Bandelt, H-J., Forster, P. & Röhl, A. (1999). Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16, 37-48.

Bax, N., Williamson, A., Aguero, M., Gonzalez, E. & Geeves, W. (2003). Marine invasive alien species: a threat to global biodiversity. Marine Policy, 27, 313-323.

Becquet, V., Simon-Bouhet, B., Pante, E., Hummel, H. & Garcia, P. (2012). Glacial refugium versus range limit: Conservation genetics of Macoma balthica, a key species in the Bay of Biscay (France). Journal of Experimental Marine Biology and Ecology, 432-433, 73-82.

Bernatchez, L. & Wilson, C.C. (1998). Comparative phylogeography of Nearctic and Palearctic fishes. Molecular Ecology, 7, 431-452.

Bouchet, P., Lozouet, P., Maestrati, P. & Heros, V. (2002). Assessing the magnitude of species richness in tropical marine environments: exceptionally high numbers of molluscs at a New Caledonia site. Biological Journal of the Linnean Society, 75, 421-436.

Bouchet, P. (2006). The magnitude of marine biodiversity. In: The exploration of marine biodiversity: scientific and technological challenges (ed. Duarte, C.M.). Fundacion BBVA, Bilbao, 31-64.


Bradbury, I.R., Laurel, B., Snelgrove, P.V.R., Bentzen, P. & Campana, S.E. (2008). Global patterns in marine dispersal estimates: the influence of geography, taxonomic category and life history. Proceedings of the Royal Society of London: Biological Series, 275, 1803–1809.

Briggs, J.C. (1970). A faunal history of the North Atlantic Ocean. Systematic Zoology, 19, 19–34.

Bucklin, A., Steinke, D. & Blanco-Bercial, L. (2011). DNA barcoding of marine metazoa. Annual Review of Marine Science, 3, 471-508.

Campillo, S., Serra, M., Carmona, M.J. & Gomez, A. (2011). Widespread secondary contact and new glacial refugia in the halophilic rotifer Brachionus plicatilis in the Iberian Peninsula. PLoS ONE, 6, e20986.

Canestrelli, D., Sacco, F. & Nascetti, G. (2011). On glacial refugia, genetic diversity, and microevolutionary processes: deep phylogeographical structure in the endemic newt Lissotriton italicus. Biological Journal of the Linnean Society, 105, 42-55.

Carlton, J.T. (1999). The scale and ecological consequences of biological invasions in the world’s oceans. In: Invasive species and biodiversity management (eds Sandlund, O.T., Schei, P.J. & Viken, A.). Kluwer Academic Publishers, Dordrecht, 195-212.

Carr, C., Hardy, S.M., Brown, T.M., Macdonald, T.A. & Hebert, P.D.N. (2010). A tri-oceanic perspective: DNA barcoding reveals geographic structure and cryptic diversity in Canadian polychaetes. PLoS ONE, 6, e22232.

Clare, E.L., Kerr, K.C.R., von Königslöw, T.E., Wilson, J.J. & Hebert, P.D.N. (2008). Diagnosing mitochondrial DNA diversity: applications of a sentinel gene approach. Journal of Molecular Evolution, 66, 362-367.

Clement, M., Posada, D. & Crandall, K. (2000). TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657-1660.

Colgan, D.J., Ponder, W.F. & Eggler, P.E. (1999). Gastropod evolutionary rates and phylogenetic relationships assessed using partial 28S rDNA and histone H3 sequences. Zoologica Scripta, 29, 29-63.

Colson, I. & Hughes, R.N. (2004). Rapid recovery of genetic diversity of dogwhelk (Nucella lapillus L.) populations after local extinction and recolonization contradicts predictions from life-history characteristics. Molecular Ecology, 13, 2223-2233.

Costa, F.O., deWaard, J.R., Boutillier, J., Ratnasingham, S., Dooh, R.T., Hajibabaei, M. & Hebert, P.D.N. (2007). Biological identifications through DNA barcodes: the case of the Crustacea. Canadian Journal of Fisheries and Aquatic Sciences, 64, 272-295.

Coyer, J.A., Hoarau, G., Van Schaik, J., Luijckx, P. & Olsen, J.L. (2011). Trans-Pacific and trans-Arctic pathways of the intertidal macroalga Fucus distichus L. reveal multiple glacial refugia and colonizations from the North Pacific to the North Atlantic. Journal of Biogeography, 38, 756-771.

Dapporto, L. (2009). Speciation in Mediterranean refugia and post-glacial expansion of Zerynthia polyxena (Lepidoptera, Papilionidae). Journal of Zoological Systematics and Evolutionary Research, 48, 229-237.

Dixon, D.R., Simpson-White, R. & Dixon, L.R.J. (1992). Evidence of thermal stability of ribosomal DNA sequences in hydrothermal vent organisms. Journal of the Marine Biological Association of the United Kingdom, 72, 519-527.

Dixon, P. (2003). VEGAN, a package of R functions for community ecology. Journal of Vegetation Science, 14, 927–930.

Dodson, J.J., Tremblay, S., Colombani, F., Carscadden, J.E. & Lecomte, F. (2007). Trans-Arctic dispersals and the evolution of a circumpolar marine fish species complex, the capelin (Mallotus villosus). Molecular Ecology, 16, 5030-5043.

Donald, K.M., Kennedy, M. & Spencer, H.G. (2005). Cladogenesis as the result of long-distance rafting events in South Pacific topshells (Gastropoda, Trochidae). Evolution, 59, 1701-1711.

Dorjes, J., Michaelis, H. & Rhode, B. (1986). Long-term studies of macrozoobenthos in intertidal and shallow subtidal habitats near the island of Norderney (East Frisian Coast, Germany). Hydrobiologia, 142, 217-232.

Drent, J., Luttikhuizen, P.C. & Piersma, T. (2004). Morphological dynamics in the foraging apparatus of a deposit feeding marine bivalve: phenotypic plasticity and heritable effects. Functional Ecology, 18, 349-356.

Excoffier, L. & Lischer, H.E.L. (2010). Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567.

Gaither, M.R., Bowen, B.W., Toonen, R.J., Planes, S., Messmer, V., Earle, J. & Robertson, D.R. (2010). Genetic consequences of introducing allopatric lineages of bluestriped snapper (Lutjanus kasmira) to Hawaii. Molecular Ecology, 19, 1107-1121.

Ghiselli, F., Milani, L., Chang, P.L., Hedgecock, D., Davis, J.P., Nuzhdin, S.V. & Passamonti, M. (2012). De Novo assembly of the Manila clam Ruditapes philippinarum transcriptome provides new insights into expression bias, mitochondrial doubly uniparental inheritance and sex determination. Molecular Biology and Evolution, 29, 771-786.

Girard, P. & Angers, B. (2006). The impact of postglacial marine invasions on the genetic diversity of an obligate freshwater fish, the longnose dace (Rhinichthys cataractae), on the Quebec peninsula. Canadian Journal of Fisheries and Aquatic Sciences, 63, 1429-1438.

Gofas, S. (2012). Macoma balthica (Linnaeus, 1758). Accessed through: World Register of Marine Species at http://marinespecies.org/aphia.php?p=taxdetails&id=141579.

Grande, C., Templado, J., Cervera, J.L. & Zardoya, R. (2004). Molecular phylogeny of Euthyneura (Mollusca: Gastropoda). Molecular Biology and Evolution, 21, 303-313.

Hao, X., Jiang, R. & Chen, T. (2011). Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering. Bioinformatics, 27, 611-618.

Harrell, F.E. & Miscellaneous (2012). Hmisc: Harrell miscellaneous. R package version 3.9-3.

Harper, F.M. & Hart, M.W. (2007). Morphological and phylogenetic evidence for hybridization and introgression in a sea star secondary contact zone. Invertebrate Biology, 126, 373-384.

Hebert, P.D.N., Cywinska, A., Ball, S.L. & deWaard, J.R. (2003). Biological identifications through DNA barcodes. Proceedings of the Royal Society of London Series B: Biological Sciences, 270, 313-321.

Hebert, P.D.N., Penton, E.H., Burns, J.M., Janzen, D.H. & Hallwachs, W. (2004). Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the United States of America, 101, 14812-14817.

Hedgecock, D., Li, G., Hubert, S., Bucklin, K. & Ribes, V. (2004). Widespread null alleles and poor cross-species amplification of microsatellite DNA loci cloned from the Pacific oyster (Crassostrea gigas). Journal of Shellfish Research, 23, 379-385.

Hellberg, M.E. & Vacquier, V.D. (1999). Rapid evolution of fertilization selectivity and lysin cDNA sequences in teguline gastropods. Molecular Biology and Evolution, 16, 839-848.

Helmuth, B., Veit, R.R. & Holberton, R. (1994). Long distance dispersal of subantarctic brooding bivalve (Gaimardia trapesina) by kelp rafting. Marine Biology, 120, 421-426.

Hewitt, G.M. (1996). Some genetic consequences of ice ages, and their role in divergence and speciation. Biological Journal of the Linnean Society, 58, 247–276.

Hewitt, G. (2000). The genetic legacy of the Quaternary ice ages. Nature, 405, 907-913.

Hines, A.H. & Comtois, K.L. (1985). Vertical distribution of infauna in sediments of a subestuary of central Chesapeake Bay. Estuaries, 8, 296-304.

Holm, E.R. & Bourget, E. (1994). Selection and population genetic structure of the barnacle Semibalanus balanoides in the northwest Atlantic and Gulf of St. Lawrence. Marine Ecology Progress Series, 113, 247-256.

Hunt, B., Strugnell, J., Bednarsek, N., Linse, K., Nelson, R.J., Pakhomov, E., Seibel, B., Steinke, D. & Würzberg, L. (2010). Poles apart: The “bipolar” pteropod species Limacina helicina is genetically distinct between the Arctic and Antarctic Oceans. PLoS ONE, 5, e9835.

Ingolfsson, A. (1992). The origin of the rocky shore fauna of Iceland and the Canadian Maritimes. Journal of Biogeography, 19, 705-712.

Ivanova, N.V., Fazekas, A.J. & Hebert, P.D.N. (2008). Semi-automated, membrane-based protocol for DNA isolation from plants. Plant Molecular Biology Reporter, 26, 186-198.

Jablonski, D. (1986). Larval ecology and macroevolution in marine invertebrates. Bulletin of Marine Science, 39, 565-587.

Järnegren, J., Schander, C., Sneli, J.A., Rønningen, V. & Young, C. (2007). Four genes, morphology and ecology: distinguishing a new species of Acesta (Mollusca; Bivalvia) from the Gulf of Mexico. Marine Biology, 152, 43-55.

Jennings, R.M., Bucklin, A., Ossenbrügger, H. & Hopcroft, R.R. (2010). Species diversity of planktonic gastropods (Pteropoda and Heteropoda) from six ocean regions based on DNA barcode analysis. Deep-Sea Research Part II: Topical Studies in Oceanography, 57, 2199-2210.

Johnson, S.B., Waren, A. & Vrijenhoek, R.C. (2008). DNA barcoding of Lepetodrilus limpets reveals cryptic species. Journal of Shellfish Research, 27, 43-51.

Jones, M., Ghoorad, A. & Blaxter, M. (2011). jMOTU and Taxonerator: turning DNA barcode sequences into annotated operational taxonomic units. PLoS ONE, 6, e19359.

Keever, C.C., Sunday, J., Puritz, J.B., Addison, J.A., Toonen, R.J., Grosberg, R.K. & Hart, M.W. (2009). Discordant distribution of populations and genetic variation in a sea star with high dispersal potential. Evolution, 63, 3214-3227.

Kelly, D.W., Muirhead, J.R., Heath, D.D. & MacIsaac, H.J. (2006). Contrasting patterns in genetic diversity following multiple invasions of fresh and brackish waters. Molecular Ecology, 15, 3641-3653.

Kembel, S.W., Cowan, P.D., Helmus, M.R., Cornwell, W.K., Morlon, H., Ackerly, D.D., Blomberg, S.P. & Webb, C.O. (2010). Picante: R tools for integrating phylogenies and ecology. Bioinformatics, 26, 1463-1464.

Kerr, K.C.R., Lijtmaer, D.A., Barreira, A.S., Hebert, P.D.N. & Tubaro, P.L. (2009). Probing evolutionary patterns in neotropical birds through DNA barcodes. PLoS ONE, 4, e4379.

Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111-120.

Knowlton, N. (2000). Molecular genetic analyses of species boundaries in the sea. Hydrobiologia, 420, 73–90.

Kyle, C.J. & Boulding, E.G. (2000). Comparative population genetic structure of marine gastropods (Littorina spp.) with and without pelagic larval dispersal. Marine Biology, 137, 835-845.

Lebour, M.V. (1938). Notes on the breeding of some lamellibranchs from Plymouth and their larvae. Journal of the Marine Biological Association of the United Kingdom, 23, 119-144.

Lee, H.J. & Boulding, E.G. (2009). Spatial and temporal population structure of four northeastern Pacific littorinid gastropods: the effect of mode of larval development on variation at one mitochondrial and two nuclear DNA markers. Molecular Ecology, 18, 2165-2184.

Luttikhuizen, P.C., Drent, J. & Baker, A.J. (2003). Disjunct distribution of highly diverged mitochondrial lineage clade and population subdivision in a marine bivalve with pelagic larval dispersal. Molecular Ecology, 12, 2215–2229.

Maggs, C.A., Castilho, R., Foltz, D., Henzler, C., Jolly, M.T., Kelly, J., Olsen, J., Perez, K.E., Stam, W., Väinolä, R., Viard, F. & Wares, J. (2008). Evaluating signatures of glacial refugia for North Atlantic benthic marine taxa. Ecology, 89, S108-S122.

Mandryk, C.A.S., Josenhans, H., Fedje, D.W. & Mathewes, R.W. (2001). Late Quaternary paleoenvironments of Northwestern North America: implications for inland versus coastal migration routes. Quaternary Science Reviews, 20, 301-314.

Marko, P.B. (2002). Fossil calibration of molecular clocks and the divergence times of geminate species pairs separated by the Isthmus of Panama. Molecular Biology and Evolution, 19, 2005-2021.

Marko, P.B. (2004). ‘What’s larvae got to do with it?’ Disparate patterns of post-glacial population structure in two benthic marine gastropods with identical dispersal potential. Molecular Ecology, 13, 597-611.

Marko, P.B. & Moran, A.L. (2009). Out of sight, out of mind: high cryptic diversity obscures the identities and histories of geminate species in the marine bivalve subgenus Acar. Journal of Biogeography, 36, 1861-1880.

Meehan, B.W. (1985). Genetic comparison of Macoma balthica (Bivalvia, Tellinidae) from the eastern and western North Atlantic Ocean. Marine Ecology Progress Series, 22, 69-76.

Meyer, C. (2003). Molecular systematics of cowries (Gastropoda: Cypraeidae) and diversification patterns in the tropics. Biological Journal of the Linnean Society, 79, 401-459.

Meyer, C.P. & Paulay, G. (2005). DNA barcoding: error rates based on comprehensive sampling. PLoS Biology, 3, e422.

Mikkelsen, N.T., Schander, C. & Willassen, E. (2007). Local scale DNA barcoding of bivalves (Mollusca): a case study. Zoologica Scripta, 36, 455-463.

Naughton, K.M. & O’Hara, T.D. (2009). A new brooding species of the biscuit star Tosia (Echinodermata: Asteroidea: Goniasteridae), distinguished by molecular, morphological and larval characters. Invertebrate Systematics, 23, 348-366.

Nei, M. (1987). Molecular Evolutionary Genetics. Columbia University Press, New York.

Newell, R.C. (1965). The role of detritus in the nutrition of two marine deposit feeders, the prosobranch Hydrobia ulvae and the bivalve Macoma balthica. Proceedings of the Royal Zoological Society of London, 144, 25-45.

Nikula, R., Strelkov, P. & Väinölä, R. (2007). Diversity and trans-arctic invasion history of mitochondrial lineages in the North Atlantic Macoma balthica complex (Bivalvia: Tellinidae). Evolution, 61, 928-941.

Passamonti, M. & Ghiselli, F. (2009). Doubly uniparental inheritance: two mitochondrial genomes, one precious model for organelle DNA inheritance and evolution. DNA and Cell Biology, 28, 79-89.

Paulay, G. & Meyer, C.M. (2002). Diversification in the tropical Pacific: comparisons between marine and terrestrial systems and the importance of founder speciation. Integrative and Comparative Biology, 42, 922-934.

Pechenik, J.A. (1980). Growth and energy balance during the larval lives of three prosobranch gastropods. Journal of Experimental Marine Biology and Ecology, 44, 1-28.

Peterson, G.H. (1999). Five recent Mya species, including three new species and their fossil connections. Polar Biology, 22, 322-328.

Petit, R.J., Aguinagalde, I., de Beaulieu, J-L., Bittkau, C., Brewer, S., et al. (2003). Glacial refugia: hotspots but not melting pots of genetic diversity. Science, 300, 1563-1565.

Plazzi, F., Ceregato, A., Taviani, M. & Passamonti, M. (2011). A molecular phylogeny of bivalve mollusks: ancient radiations and divergences as revealed by mitochondrial genes. PLoS ONE, 6, e27147.

Puillandre, N., Lambert, A., Brouillet, S. & Achaz, G. (2011). ABGD, Automated Barcode Gap Discovery for primary species delineation. Molecular Ecology, 21, 1864-1877.

R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

Radulovici, A.E., Archambault, P. & Dufresne, F. (2010). DNA barcodes for marine biodiversity: moving fast forward? Diversity, 2, 450-472.

Ramel, C. (1998). Biodiversity and intraspecific genetic variation. Pure and Applied Chemistry, 70, 2079-2084.

Ratnasingham, S. & Hebert, P.D.N. (2007). BOLD: The Barcode of Life Data System (www.barcodinglife.org). Molecular Ecology Notes, 7, 355-364.

Reid, D.G. (1990). Trans-Arctic migration and speciation induced by climate change: The biogeography of Littorina (Mollusca: Gastropoda). Bulletin of Marine Science, 47, 35-49.

Remigio, E.A. & Hebert, P.D.N. (2003). Testing the utility of partial COI sequences for phylogenetic estimates of gastropod relationships. Molecular Phylogenetics and Evolution, 29, 641-647.

Rohling, R.J., Fenton, M., Jorissen, F.J., Bertrand, P., Ganssen, G. & Caulet, J.P. (1998). Magnitudes of sea-level lowstands of the past 500,000 years. Nature, 394, 162–165.

Saitou, N. & Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406-425.

Sakaguchi, S., Takeuchi, Y., Yamasaki, M., Sakurai, S. & Isagi, Y. (2011). Lineage admixture during postglacial range expansion is responsible for the increased gene diversity of Kalopanax septemlobus in a recently colonised territory. Heredity, 107, 338-348.

Schemske, D.W. (2009). Biotic interactions and speciation in the tropics. In: Speciation and patterns of diversity (eds Butlin, R.K., Bridle, J.R. & Schluter, D.). Cambridge University Press, British Ecological Society, 219-239.

Schneider, S. & Kaim, A. (2012). Early ontogeny of middle Jurassic hiatellids from a wood-fall association: implications for phylogeny and paleoecology of Hiatellidae. Journal of Molluscan Studies, 78, 119-127.

Slatkin, M. (1993). Isolation by distance in equilibrium and non-equilibrium populations. Evolution, 47, 264-279.

Snelgrove, P.V.R. (1999). Getting to the bottom of marine biodiversity: sedimentary habitats. Bioscience, 49, 129-138.

Snelgrove, P.V.R. (2010). Discoveries of Census of Marine Life: making ocean life count. Cambridge University Press.

Steinke, D., Zemlak, T.S., Boutillier, J.A. & Hebert, P.D.N. (2009a). DNA barcoding Pacific Canada’s fishes. Marine Biology, 156, 2641-2647.

Sun, Y., Li, Q., Kong, L. & Zheng, X. (2011). DNA barcoding of Caenogastropoda along coast of China based on the COI gene. Molecular Ecology Resources, 12, 209-218.

Tajima, F. (1989). Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123, 585-595.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. & Kumar, S. (2011). MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution, 28, 2731-2739.

Taylor, E.B. & Dodson, J.J. (1994). A molecular analysis of relationships and biogeography within a species complex of Holarctic fish (genus Osmerus). Molecular Ecology, 3, 235–248.

Thomaz, D., Guiller, A. & Clarke, B. (1996). Extreme divergence of mitochondrial DNA within species of pulmonate land snails. Proceedings of the Royal Society B: Biological Sciences, 263, 363-368.

Tian, D., Wang, Q., Zhang, P., Araki, H., Yang, S., Kreitman, M., Nagylaki, T., Hudson, R., Bergelson, J. & Chen, J.Q. (2008). Single-nucleotide mutation rate increases close to insertions/deletions in eukaryotes. Nature, 455, 105–108.

Újvári, B., Madsen, T., Kotenko, T., Olsson, M., Shine, R. & Wittzell, H. (2002). Low genetic diversity threatens imminent extinction for the Hungarian meadow viper (Vipera ursinii rakosiensis). Biological Conservation, 105, 127-130.

van der Spoel, S. & Dadon, J.R. (1999). Pteropoda. In: South Atlantic (ed. Boltovskoy, D.). Bachhuys Publishers, Leiden, Netherlands, 649–706.

Väinölä, R. (2003). Repeated trans-Arctic invasions in littoral bivalves: molecular zoogeography of the Macoma balthica complex. Marine Biology, 143, 935-946.

Vermeij, G. (1991). Anatomy of an invasion: the trans-Arctic interchange. Paleobiology, 17, 281–307.

Vetsigian, K. & Goldenfeld, N. (2005). Global divergence of microbial genome sequences mediated by propagating fronts. Proceedings of the National Academy of Sciences of the United States of America, 102, 7332-7337.

Ward, R.D., Zemlak, T.S., Innes, B.H., Last, P.R. & Hebert, P.D.N. (2005). DNA barcoding Australia’s fish species. Philosophical Transactions of the Royal Society B: Biological Sciences, 360, 1847–1857.

Ward, R.D., Holmes, B.H. & O’Hara, T.D. (2008). DNA barcoding discriminates echinoderm species. Molecular Ecology Resources, 8, 1202-1211.

Wares, J.P. & Cunningham, C.W. (2001). Phylogeography and historical ecology of the North Atlantic intertidal. Evolution, 55, 2455-2469.

Wares, J.P., Hughes, A.R. & Grosberg, R.K. (2005). Mechanisms that drive evolutionary change: insights from species introductions and invasions. In: Species Invasions: Insights into Ecology, Evolution and Biogeography (eds Sax, D.F., Stachowicz, J.J. & Gaines, S.D.). Sinauer Associates, Sunderland, Massachusetts, 229-257.

Warner, B.G., Mathewes, R.W. & Clague, J.J. (1982). Ice-free conditions on the Queen Charlotte Islands, British Columbia, at the height of the late Wisconsin glaciation. Science, 218, 675-677.

Webb, T. III & Bartlein, P.J. (1992). Global change during the last 3 million years: climatic controls and biotic responses. Annual Review of Ecology, Evolution and Systematics, 23, 141–173.

Webb, J.M., Jacobus, L.M., Funk, D.H., Zhou, X., Kondratieff, B., Geraci, C.J., DeWalt, R.E., Baird, D.J., Richard, B., Phillips, I. & Hebert, P.D.N. (2012). A DNA barcode library for North American Ephemeroptera: progress and prospects. PLoS ONE, 7, e38063.

Weir, B.S. & Cockerham, C.C. (1984). Estimating F-statistics for the analysis of population structure. Evolution, 38, 1358-1370.

Witt, J.D.S., Threloff, D.L. & Hebert, P.D.N. (2006). DNA barcoding reveals extraordinary cryptic diversity in an amphipod genus: implications for desert spring conservation. Molecular Ecology, 15, 3073-3082.

Wu, H., Zhang, Z., Hu, S. & Yu, J. (2012). On the molecular mechanism of GC content variation among eubacterial genomes. Biology Direct, 7, 1-16.

Zou, S., Li, Q., Kong, L., Yu, H. & Zheng, X. (2011). Comparing the usefulness of distance, monophyly and character-based DNA barcoding methods in species identification: a case study of Neogastropoda. PLoS ONE, 6, e26619.

Zou, S., Li, Q. & Kong, L. (2012). Multigene barcoding and phylogeny of geographically widespread muricids (Gastropoda: Neogastropoda) along the coast of China. Marine Biotechnology, 14, 21-34.

Zouros, E. (2012). Biparental inheritance through uniparental transmission: the Doubly Uniparental Inheritance (DUI) of Mitochondrial DNA. Evolutionary Biology, 1-31.

Appendices

Appendix A Specimen Preservation

Most specimens were fixed in the field in 95% ethanol; however, specimens from southeast Alaska were fixed in 100% ethanol because of the resources available at the University of Fairbanks. Upon initial fixation, the operculum of snails was carefully pulled back so that ethanol could reach the soft tissue and limit DNA degradation. Similarly, a small, lateral incision was made along the shells of bivalves to ensure that interior tissues would be properly preserved. Ethanol was replaced in the field once a day for the first 3 days and up to 3 additional times when resources were available. Once back at the lab, specimens were stored in a -20°C freezer or a fridge, depending on available space. Ethanol was refreshed immediately upon return to the lab, even before lots were sorted into individual specimens. Five to ten specimens per species were chosen for processing and were assigned unique sample names. These specimens were imaged with a Canon EOS 30D/50D camera and kept hydrated with several drops of ethanol during imaging. All images, along with specimen information, can be found on the Barcode of Life Data System (BOLD). During sub-sampling, tissue was typically removed from the foot in gastropods and chitons and from the adductor muscle or mantle in bivalves. After tissue was removed from each specimen for molecular analysis, the ethanol was refreshed for all specimens before they were returned to cold storage. Interestingly, we saw high sequencing success in specimens from British Columbia that were fixed on dry ice and stored in a -80°C freezer, suggesting that these are the most effective preservation techniques for marine molluscs.

Appendix B Species Identifications

All specimens were assigned interim species names in the field using the references outlined in Table B.1. After processing specimens, I traveled to the Canadian Museum of Nature to verify species identifications for all barcode clusters recovered in Chapter 1. I worked alongside Dr. André Martel and used the references outlined in Table B.1 to assign species-level identifications to each cluster. Nudibranchs and microsnails were hydrated in 70% ethanol and identified under a microscope, while larger, shelled gastropods were identified dry. Shelled gastropods were identified by the number and shape of radial ridges and striae along their shell as well as by the operculum material, aperture teeth and general whorl shape. Bivalves often could not be identified through external shell morphology alone, so tissue was removed to view muscle scars, the pallial sinus and hinge dentition, all diagnostic features used for identifying bivalves (Figure B.1). Chitons were hydrated in 70% ethanol and examined under a microscope to view girdle scales and spines, key features used for identification in this class. Species identifications are particularly difficult in Tonicella, a diverse genus of chiton. Two commonly misidentified species, T. marmorea and T. rubra, were distinguished through girdle scale patterns (Figure B.2). In T. rubra, girdle scales are 2 to 3 times wider than in T. marmorea; the latter also possesses a much lighter girdle colour.

Table B.1. References for species identifications made in the field and at the Canadian Museum of Nature.

Place of Identification | Reference: Title | Reference: Author
Field Sites | National Audubon Society Regional Guide to Atlantic and Gulf Coast: A Personal Journey | Amos, S.H. (1985)
Field Sites | The Marine Molluscs of Arctic Canada: National Museums of Canada | Macpherson, E. (1971)
Field Sites | The Larousse Guide to Shells of the World | Oliver, A.P.H. (1980)
Field Sites | Seashells of the Northeast Coast from Cape Hatteras to Newfoundland | Gordon, J. & Weeks, T.E. (1982)
Field Sites | Intertidal bivalves: a guide to the common marine bivalves of Alaska | Foster, N.R. (1991)
Field Sites | National Audubon Society Field Guide to North American Seashore Creatures | Meinkoth, N.A. (1981)
Field Sites | Shells & Shellfish of the Pacific Northwest: A Field Guide | Harbo, R.M. (2009)
Canadian Museum of Nature | Peterson Field Guides: Shells of the Atlantic & Gulf Coasts & the West Indies | Abbott, R.T., Morris, P.A. & Peterson, R.T. (1995)
Canadian Museum of Nature | Seashore Life of the Northern Pacific Coast: An Illustrated Guide to Northern California, Oregon, Washington, and British Columbia | Kozloff, E.N. (1983)
Canadian Museum of Nature | Between Pacific Tides | Ricketts, E.F., Calvin, J. & Hedgpeth, J.W. (1992)
Canadian Museum of Nature | American Seashells: The Marine Mollusca of the Atlantic and Pacific Coasts of North America | Abbott, R.T. (1974)
Canadian Museum of Nature | Peterson Field Guides: A Field Guide to the Atlantic Seashore | Gosner, K.L. & Peterson, R.T. (1999)
Canadian Museum of Nature | A New Species of the Genus Macoma (Pelecypoda) from West American Coastal Waters, with comments on Macoma calcarea | Dunnill, R.M. & Coan, E.V. (1968)
Canadian Museum of Nature | Marine Bivalve Molluscs of the Canadian Central and Eastern Arctic: Faunal Composition & Zoogeography | Lubinsky, I. (1980)
Canadian Museum of Nature | Catalogue of the Marine Invertebrates of the Estuary and Gulf of Saint Lawrence | Brunel, P., Bosse, L. & Lamarche, G. (1998)
Canadian Museum of Nature | Seashells of North America: A Guide to Field Identification | Abbott, R.T. (2001)
Canadian Museum of Nature | Bivalve Seashells of Western North America: Marine Bivalve Molluscs from Arctic Alaska to Baja California | Coan, E.V., Scott, P.V. & Bernard, F.R. (2000)

Figure B.1. Hinge dentition in bivalves, a common character used for species identifications. A) Primitive taxodont teeth displayed in a Nucula specimen and B) heterodont teeth present near the umbo in Macoma balthica.

Figure B.2. View of girdle scales on A) Tonicella marmorea and B) Tonicella rubra, two cryptic species of chiton in Canadian oceans. Scale patterns are diagnostic features for identification in this genus.

Appendix C Chapter 1 Supplementary Material

Table C.1. List of COI primers used for molecular techniques in Chapter 1. * indicates those primers that were tested but recovered little to no sequences.

Primer Name (F/R) | Nucleotide Sequence (5’ to 3’) | Reference
LCO1490_t1/HCO2198_t1 | TGTAAAACGACGGCCAGTGGTCAACAAATCATAAAGATATTGG/ CAGGAAACAGCTATGACTAAACTTCAGGGTGACCAAAAAATCA | Floyd
dgLCO-1490/dgHCO-2198 | GGTCAACAAATCATAAAGAYATYGG/ TAAACTTCAGGGTGACCAAARAAYCA | Meyer 2003
BivF4_t1/BivR1_t1 | TGTAAAACGACGGCCAGTGKTCWACWAATCATAARGATATTGG/ CAGGAAACAGCTATGACTAMACCTCWGGRTGVCCRAARAACCA | Prosser unpublished
C_LepFolF/C_LepFolR | ATTCAACCAATCATAAAGATATTGGGGTCAACAAATCATAAAGATATTGG/ TAAACTTCTGGATGTCCAAAAAATCATAAACTTCAGGGTGACCAAAAAATCA | Ivanova
LepF1/LepR1* | ATTCAACCAATCATAAAGATATTGG/ TAAACTTCTGGATGTCCAAAAAATCA | Hebert et al. 2004
FishF2/FishR2* | TCGACTAATCATAAAGATATCGGCAC/ ACTTCAGGGTGACCGAAGAATCAGAA | Ward et al. 2005
C_GasF1_t1*/GasR1_t1 | TGTAAAACGACGGCCAGTTTTCAACAAACCATAARGATATTGGTGTAAAACGACGGCCAGTATTCTACAAACCACAAAGACATCGGTGTAAAACGACGGCCAGTTTTCWACWAATCATAAAGATATTGG/ CAGGAAACAGCTATGACACTTCWGGRTGHCCRAARAATCARAA | Prosser unpublished
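Several primers in Table C.1 contain IUPAC ambiguity codes (Y, R, W, K, M, V and H), so each listed sequence stands for a small pool of oligonucleotides. As a minimal sketch, and not part of the methods used in this thesis, the following Python snippet counts how many distinct sequences a degenerate primer represents. The function name `degeneracy` and the lookup table are hypothetical, introduced here for illustration only.

```python
# Number of nucleotides that each IUPAC code can represent.
IUPAC_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        # A KeyError here flags a character that is not an IUPAC nucleotide code.
        total *= IUPAC_COUNTS[base]
    return total


# dgLCO-1490 (Table C.1) has two Y positions, each standing for C or T.
print(degeneracy("GGTCAACAAATCATAAAGAYATYGG"))  # 4
```

Under this sketch a primer with no ambiguity codes, such as LepF1, has a degeneracy of 1.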

Table C.2. List of GenBank specimens used for analysis in Chapter 1.

Process ID | Sample ID | Class | Species
GBML0009-06 | AB084110 | Scaphopoda | Episiphon yamakawai
GBML0013-06 | AF120639 | Scaphopoda | Dentalium pilsbryi
GBML0014-06 | AF120640 | Scaphopoda | Rhabdus rectius
GBML0015-06 | AY260813 | Scaphopoda | Antalis antillarum
GBML0016-06 | AY260814 | Scaphopoda | Antalis antillarum
GBML0017-06 | AY260815 | Scaphopoda | Antalis dentalis
GBML0018-06 | AY260816 | Scaphopoda | Antalis entalis
GBML0019-06 | AY260817 | Scaphopoda | Antalis entalis
GBML0020-06 | AY260818 | Scaphopoda | Antalis entalis
GBML0021-06 | AY260821 | Scaphopoda | Dentalium pilsbryi
GBML0022-06 | AY260822 | Scaphopoda | Antalis sp. PR-2003
GBML0024-06 | AY260824 | Scaphopoda | Fissidentalium candidum
GBML0025-06 | AY260825 | Scaphopoda | Graptacme eborea
GBML0026-06 | AY260826 | Scaphopoda | Rhabdus rectius
GBML0027-06 | AY260827 | Scaphopoda | Rhabdus rectius
GBML0028-06 | AY260828 | Scaphopoda | Entalina tetragona
GBML0029-06 | AY260829 | Scaphopoda | Gadila aberrans
GBML0030-06 | AY260830 | Scaphopoda | Gadila aberrans
GBML0031-06 | AY260831 | Scaphopoda | Polyschides carolinensis
GBML0032-06 | AY260832 | Scaphopoda | Pulsellum salishorum
GBML0033-06 | AY260833 | Scaphopoda | Pulsellum salishorum
GBML0035-06 | AY342055 | Scaphopoda | Siphonodentalium lobatum
GBML0062-06 | DQ093531 | Scaphopoda | Antalis inaequicostata
GBML0064-06 | NC_005840 | Scaphopoda | Siphonodentalium lobatum
GBML0065-06 | NC_006162 | Scaphopoda | Graptacme eborea

Figure C.1. Neighbour-joining tree (K2P) for all barcoded specimens (1334).
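The K2P label in Figure C.1 refers to the Kimura two-parameter distance (Kimura 1980), which weights transitions and transversions separately. The snippet below is a minimal sketch of that pairwise distance for two aligned sequences; it is an illustration only and not the software used to build the tree. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q); math.log raises ValueError
    # when the sequences are too divergent for the model.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

A matrix of such pairwise distances is the input that a neighbour-joining algorithm (Saitou & Nei 1987) clusters into a tree.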

Appendix D Chapter 2 Supplementary Material

Table D.1. Detailed collection information for all 172 Hiatella arctica specimens in this study.

Sample ID Country Province Region Sector Site Descrip. Lat Long Depth
11BIOAK-0043 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0046 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0047 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0048 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0049 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0050 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0051 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0052 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0053 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0054 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0115 USA AK Cook Inlet Kachemak Bay kelp trawl 59.64 -151.37 15
11BIOAK-0116 USA AK Cook Inlet Kachemak Bay kelp trawl 59.64 -151.37 15
11BIOAK-0117 USA AK Cook Inlet Kachemak Bay kelp trawl 59.64 -151.37 15
11BIOAK-0125 USA AK Cook Inlet Outside Beach rocky intertidal 59.46 -151.71 0
11BIOAK-0126 USA AK Cook Inlet Outside Beach rocky intertidal 59.46 -151.71 0
11BIOAK-0127 USA AK Cook Inlet Outside Beach rocky intertidal 59.46 -151.71 0
11BIOAK-0128 USA AK Cook Inlet Outside Beach rocky intertidal 59.46 -151.71 0
11BIOAK-0135 USA AK Cook Inlet Kasitsna Bay Lab dock side 59.47 -151.55 0
11BIOAK-0150 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0151 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0152 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0153 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0154 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0188 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0198 USA AK Cook Inlet Jakalof Bay dock side 59.47 -151.54 0
11BIOAK-0254 USA AK Cook Inlet China Poot kelp trawl 59.56 -151.25 9
11BIOAK-0255 USA AK Cook Inlet China Poot kelp trawl 59.56 -151.25 9
11BIOAK-0337 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0338 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0606 USA AK Cook Inlet Camel Rock rocky intertidal 59.44 -151.72 0
11BIOAK-0622 USA AK Cook Inlet Outside Beach rocky intertidal 59.46 -151.71 0
11BIOAK-0636 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0637 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0638 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0639 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0640 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0641 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0642 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0643 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0644 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0645 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0646 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0647 USA AK Cook Inlet Kasitsna Bay Lab rocky intertidal 59.47 -151.55 0
11BIOAK-0705 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0706 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0707 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0708 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0710 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0714 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
11BIOAK-0715 USA AK Cook Inlet Little Tutka rocky intertidal 59.47 -151.49 0
10BCMOL-00374 Canada BC Bamfield
11POPHI-0007 Canada BC Edward King Is.
11POPHI-0008 Canada BC Scott's Bay
11POPHI-0009 Canada BC Scott's Bay
11POPHI-0010 Canada BC Scott's Bay
11POPHI-0011 Canada BC Scott's Bay
11POPHI-0012 Canada BC Scott's Bay
11POPHI-0013 Canada BC Prasiola Point
11POPHI-0014 Canada BC Prasiola Point
11POPHI-0015 Canada BC Prasiola Point
11POPHI-0016 Canada BC Scott's Bay
10PROBE-105770.1 Canada MB Churchill Churchill River dredge 58.79 -94.21 15
10PROBE-105780.1 Canada MB Churchill Churchill River dredge 58.79 -94.21 15
10PROBE-105790.1 Canada MB Churchill Churchill River dredge 58.79 -94.21 15
10PROBE-105900.1 Canada MB Churchill Bird Cove intertidal 58.77 -93.87
10PROBE-105910.1 Canada MB Churchill Bird Cove intertidal 58.77 -93.87
10PROBE-105920.1 Canada MB Churchill Bird Cove intertidal 58.77 -93.87
10PROBE-106050 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0
10PROBE-106060 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0


10PROBE-106070 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106080 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106090 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106100 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106110 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106120 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106130 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106140 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106150 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106160 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106170 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10NBMOL-10008.1 Canada NB St. Andrews Passamaquoddy 45.08 -67.08 10NBMOL-10009.1 Canada NB St. Andrews Passamaquoddy 45.08 -67.08 10NBMOL-10010.1 Canada NB St. Andrews Passamaquoddy 45.08 -67.08 10NBMOL-10011.1 Canada NB St. Andrews Passamaquoddy 45.08 -67.08 11BFMOL-0017 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0018 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0019 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0020 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0021 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0022 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0023 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0024 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0025 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0026 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0027 Canada NB St. Andrews Spruce Island boat dive 44.97 -66.91 20 11BFMOL-0040 Canada NB St. Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0041 Canada NB St. 
Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0042 Canada NB St. Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0043 Canada NB St. Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0044 Canada NB St. Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0045 Canada NB St. Andrews The Wolves boat dive 44.95 -66.73 9 11BFMOL-0068 Canada NB St. Andrews Navy Island rocky intertidal 45.06 -67.06 0 11BFMOL-0069 Canada NB St. Andrews Navy Island rocky intertidal 45.06 -67.06 0 11BFMOL-0070 Canada NB St. Andrews Navy Island rocky intertidal 45.06 -67.06 0 11BFMOL-0087 Canada NB St. Andrews Western Passage trawl 44.95 -67.02 9 11BFMOL-0132 Canada NB St. Andrews Western Passage trawl 44.95 -67.02 9 11BFMOL-0157 Canada NB St. Andrews Casco Bay Island boat dive 44.96 -66.93 9 11BFMOL-0158 Canada NB St. Andrews Casco Bay Island boat dive 44.96 -66.93 9 11BFMOL-0159 Canada NB St. Andrews Casco Bay Island boat dive 44.96 -66.93 9 11BFMOL-0160 Canada NB St. Andrews Casco Bay Island boat dive 44.96 -66.93 9 11BFMOL-0161 Canada NB St. Andrews Casco Bay Island boat dive 44.96 -66.93 9 11BFMOL-0297 Canada NB St. 
Andrews Indian Point rocky intertidal 45.07 -67.04 0 11ECMOL-0364 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0365 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0366 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0367 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0368 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0369 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0370 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0371 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0372 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0373 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0374 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0375 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0376 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0377 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0378 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0379 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0380 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0381 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0382 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0383 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0384 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0385 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0386 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0387 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0388 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0389 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0390 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0391 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0392 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0393 Canada NS Bedford 
Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0394 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0395 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0396 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0397 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0398 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0399 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0400 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0


11ECMOL-0401 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0402 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0403 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0404 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0405 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0406 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0407 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0408 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0409 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0410 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0411 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0412 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0413 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0414 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0415 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0416 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0417 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0418 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0419 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0420 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0421 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0422 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0 11ECMOL-0423 Canada NS Bedford Basin Bedford Basin 44.69 -63.64 0

Table D.2. Detailed collection information for all 196 Macoma balthica specimens in this study. * denotes GenBank specimens.

Sample ID Country Province Region Sector Site Descrip. Lat Long Depth 11BIOAK-0439 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0424 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0330 USA AK Cook Inlet Kasitsna Bay rocky intertidal 59.47 -151.55 0 11BIOAK-0328 USA AK Cook Inlet Kasitsna Bay rocky intertidal 59.47 -151.55 0 11BIOAK-0326 USA AK Cook Inlet Kasitsna Bay rocky intertidal 59.47 -151.55 0 11BIOAK-0248 USA AK Cook Inlet China Poot gravel mud flat 59.57 -151.30 0 11BIOAK-0245 USA AK Cook Inlet China Poot gravel mud flat 59.57 -151.30 0 11BIOAK-0244 USA AK Cook Inlet China Poot gravel mud flat 59.57 -151.30 0 11BIOAK-0243 USA AK Cook Inlet China Poot gravel mud flat 59.57 -151.30 0 11BIOAK-0416 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0455 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0454 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0453 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0452 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0451 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0450 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0449 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0448 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0447 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0446 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0445 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0444 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0443 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0442 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0441 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0440 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0438 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0437 USA AK 
Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0436 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0435 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0434 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0433 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0432 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0431 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0430 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0429 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0428 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0427 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0426 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0425 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0423 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0422 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0421 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0420 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0419 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0418 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 11BIOAK-0417 USA AK Cook Inlet Jakalof Bay mud flats 59.47 -151.54 0 10BCMOL-00310 Canada BC Haida Gwaii


10BCMOL-00312 Canada BC Haida Gwaii 10BCMOL-00314 Canada BC Haida Gwaii 10BCMOL-00315 Canada BC Haida Gwaii 10PROBE-106310 Canada MB Churchill Churchill River dredge 58.79 -94.21 15 10PROBE-106300 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106290 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106280 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106270 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106260 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106250 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106240 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106230 Canada MB Churchill Bird Cove mud flats 58.77 -93.84 0 10PROBE-106220 Canada MB Churchill Bird Cove mud flats 58.77 -93.84 0 10PROBE-106210 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-106200 Canada MB Churchill Bird Cove intertidal 58.77 -93.87 0 10PROBE-105590.1 Canada MB Churchill Bird Cove mud flats 58.77 -93.84 10PROBE-105600.1 Canada MB Churchill Bird Cove mud flats 58.77 -93.84 10PROBE-105620.1 Canada MB Churchill Bird Cove mud flats 58.77 -93.84 10PROBE-105710.1 Canada MB Churchill Churchill River dredge 58.79 -94.21 15 10PROBE-106360 Canada MB Churchill Churchill River dredge 58.79 -94.21 15 10PROBE-106340 Canada MB Churchill Churchill River dredge 58.79 -94.21 15 10PROBE-106330 Canada MB Churchill Churchill River dredge 58.79 -94.21 15 11BFMOL-0277 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0274 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0220 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0279 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0278 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0276 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0275 Canada NB St. 
Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0273 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0271 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0270 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0269 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0268 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0266 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0265 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0264 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0263 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0262 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0261 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0260 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0259 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0258 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0257 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0256 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0255 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0254 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0253 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0252 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0251 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0250 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0248 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0247 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0245 Canada NB St. 
Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0244 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0243 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0242 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0241 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0240 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0239 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0238 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0236 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0235 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0234 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0233 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0231 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0228 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0227 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0225 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0224 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0223 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0222 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11BFMOL-0221 Canada NB St. Andrews Indian Point rocky intertidal 45.07 -67.04 0 11MMMOL-00050 Canada NFLD Bonne Bay St. Paul's 0 11MMMOL-00046 Canada NFLD Bonne Bay St. Paul's 0 11MMMOL-00045 Canada NFLD Bonne Bay DB lagoon 0 11MMMOL-00044 Canada NFLD Bonne Bay DB lagoon 0 11MMMOL-00043 Canada NFLD Bonne Bay DB lagoon 0 11MMMOL-00042 Canada NFLD Bonne Bay DB estuary 0 11MMMOL-00031 Canada NFLD Bonne Bay DB estuary 0


11ECMOL-0037 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0032 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0029 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0028 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0027 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0020 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0017 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0015 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0013 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0011 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0010 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0009 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0008 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0007 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0006 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0005 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0004 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0003 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0001 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0035 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0034 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0036 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0038 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0033 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0026 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0024 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0023 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0022 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0021 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0019 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0018 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0016 Canada PEI Pinette Pinette red 
mud 46.05 -62.91 0 11ECMOL-0014 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0012 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 11ECMOL-0002 Canada PEI Pinette Pinette red mud 46.05 -62.91 0 *HM756189.1 France Bay of Biscay Bay of Biscay *HM756187.1 France Bay of Biscay Bay of Biscay *HM756185.1 France Bay of Biscay Bay of Biscay *HM756183.1 France Bay of Biscay Bay of Biscay *HM756181.1 France Bay of Biscay Bay of Biscay *HM756179.1 France Bay of Biscay Bay of Biscay *HM756177.1 France Bay of Biscay Bay of Biscay *HM756175.1 France Bay of Biscay Bay of Biscay *HM756173.1 France Bay of Biscay Bay of Biscay *HM756171.1 France Bay of Biscay Bay of Biscay *HM756188.1 France Bay of Biscay Bay of Biscay *HM756186.1 France Bay of Biscay Bay of Biscay *HM756182.1 France Bay of Biscay Bay of Biscay *HM756180.1 France Bay of Biscay Bay of Biscay *HM756178.1 France Bay of Biscay Bay of Biscay *HM756176.1 France Bay of Biscay Bay of Biscay *HM756174.1 France Bay of Biscay Bay of Biscay *HM756172.1 France Bay of Biscay Bay of Biscay *HM756170.1 France Bay of Biscay Bay of Biscay *AF443220.1 Europe Wadden Sea Wadden Sea *AF443222.1 Europe Wadden Sea Wadden Sea *AF443219.1 Europe Wadden Sea Wadden Sea *AF443221.1 Europe Wadden Sea Wadden Sea *AY162261.1 Europe Baltic Sea Baltic Sea *AY162263.1 Europe Baltic Sea Baltic Sea *AY162260.1 Europe Baltic Sea Baltic Sea *AY162262.1 Europe Baltic Sea Baltic Sea *EF044127.1 Arctic Ocean Chukchi Sea Chukchi Sea *EF044125.1 Arctic Ocean Chukchi Sea Chukchi Sea *EF044126.1 Arctic Ocean Chukchi Sea Chukchi Sea *AF443216.1 Europe *AF443218.1 Europe *AF443217.1 Europe *EF044130.1 Europe *EF044132.1 Europe *EF044134.1 Europe *EF044131.1 Europe *EF044129.1 Europe *EF044133.1 Europe *EF044135.1 Europe


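A)

$$D = \frac{\pi - S/a_1}{\sqrt{\widehat{\mathrm{Var}}\left(\pi - S/a_1\right)}}, \qquad a_1 = \sum_{m=1}^{N-1} \frac{1}{m}$$

B)

$$F_{ST} = \frac{\pi_{\text{between}} - \pi_{\text{within}}}{\pi_{\text{between}}}$$

C)

$$\mathrm{BCD}(i,j) = \frac{\sum_{k=1}^{n} \left| y_{i,k} - y_{j,k} \right|}{\sum_{k=1}^{n} \left( y_{i,k} + y_{j,k} \right)}$$

π = mean pairwise difference

S = number of segregating sites

N = number of sequences sampled

i,j = objects

k = index of variable

y = variable

n = total number of variables

d = distance matrix

Figure D.1. Calculations for A) Tajima’s D, B) FST and C) the Bray-Curtis index.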
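The three statistics named in Figure D.1 can be sketched in base R. This is a minimal illustration rather than the code used for the thesis analyses: the function names (`tajima_d`, `fst`, `bray_curtis`) and all input values are hypothetical, and the variance term for Tajima's D uses the standard constants from Tajima (1989), which the figure itself does not spell out.

```r
# Tajima's D from summary statistics.
# n_seq: number of sequences, S: segregating sites, pi_hat: mean pairwise difference.
tajima_d <- function(n_seq, S, pi_hat) {
  i  <- seq_len(n_seq - 1)
  a1 <- sum(1 / i)
  a2 <- sum(1 / i^2)
  b1 <- (n_seq + 1) / (3 * (n_seq - 1))
  b2 <- 2 * (n_seq^2 + n_seq + 3) / (9 * n_seq * (n_seq - 1))
  c1 <- b1 - 1 / a1
  c2 <- b2 - (n_seq + 2) / (a1 * n_seq) + a2 / a1^2
  e1 <- c1 / a1
  e2 <- c2 / (a1^2 + a2)
  # Difference between the two diversity estimates, scaled by its standard deviation
  (pi_hat - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
}

# FST as the proportion of between-population diversity not found within populations
fst <- function(pi_between, pi_within) {
  (pi_between - pi_within) / pi_between
}

# Bray-Curtis dissimilarity between two vectors of counts (e.g. haplotype counts)
bray_curtis <- function(y_i, y_j) {
  sum(abs(y_i - y_j)) / sum(y_i + y_j)
}

# Hypothetical inputs, for illustration only
d_example   <- tajima_d(n_seq = 10, S = 16, pi_hat = 3)
fst_example <- fst(pi_between = 0.10, pi_within = 0.02)   # 0.8
bcd_example <- bray_curtis(c(6, 7, 4), c(10, 0, 6))       # 13/33
```

As written, `bray_curtis()` returns a dissimilarity (0 for identical populations). If the similarity values reported in the later figures were obtained as one minus this quantity, which is the usual convention but an assumption here, the corresponding value would be `1 - bcd_example`.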


Figure D.2. Bray-Curtis similarity values between populations of Hiatella arctica A) including potential cryptic species and B) excluding potential cryptic species (main lineage with shared haplotypes). Values closer to 1 indicate that two populations are more similar.


Figure D.3. Bray-Curtis similarity values between populations of Macoma balthica based on identified clusters/provisional species in the NJ tree. Values closer to 1 indicate that two populations are more similar.


[Figure D.4 panels: A) Main Lineage; B) New Brunswick Cryptic; C) NE Pacific Cryptic 1; D) NE Pacific Cryptic 2. Each panel shows the left valve (LV) and right valve (RV).]

Figure D.4. Hinge dentition in each of the four lineages of Hiatella arctica. A) The main lineage has two similar denticles on the left valve (LV) and one denticle on the right valve (RV); B) the New Brunswick cryptic lineage has two very dissimilar denticles on the LV and one denticle on the RV; C) the first northeast Pacific cryptic lineage has one very prominent tubercle in the LV as well as an elongated phalange, with a definite denticle in the RV; and D) the second northeast Pacific cryptic lineage lacks any defined dentition. Denticles in the RV are marked with blue arrows and denticles in the LV are marked with red arrows. An initial examination suggests that shell morphology may differ among the genetic clusters, but future work will focus on morphometrics and determine whether these traits show phylogenetic signal.


Appendix E R Code

Chapter 1

Pearson’s Chi-Square for Sequencing Success

# Read the input table and run Pearson's chi-square test on it
chi <- read.csv("chi Dec 5.csv")
chi
chisq.test(chi)
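Because the input file is not reproduced in the appendix, the following sketch shows the kind of table `chisq.test()` expects. The layout (one row per class, one column per sequencing outcome) and every count are assumptions made for illustration, not data from this study.

```r
# Hypothetical counts; the real layout of "chi Dec 5.csv" is not known here.
seq_counts <- matrix(
  c(420,  80,
    510, 190,
     35,  15),
  ncol = 2, byrow = TRUE,
  dimnames = list(class   = c("Bivalvia", "Gastropoda", "Polyplacophora"),
                  outcome = c("success", "failure"))
)

chi_res <- chisq.test(seq_counts)

chi_res$statistic   # chi-square statistic
chi_res$parameter   # degrees of freedom: (rows - 1) * (columns - 1)
chi_res$p.value
chi_res$expected    # expected counts under independence
```

Inspecting `chi_res$expected` is a quick check that no expected count is very small, in which case R warns that the chi-square approximation may be unreliable.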

Testing for a correlation between mean GC content and sequence divergence between congenerics

order<-read.csv("GC vs NN R input Nov 28.csv", header=TRUE, sep=",")
order
# Scatterplot with fitted regression line
plot(order, xlab="Mean GC Content (%)", ylab="Mean Nearest Neighbour Distance (K2P)",
     pch=20, col="darkgoldenrod2")
abline(lm(mean.NN ~ mean.GC, data=order), col="red")
legend("topleft", legend=c("Correlation coefficient= 0.51","p= 0.002*"),
       col="black", box.col="black")
# Correlation coefficient
library(boot)
order1<-corr(order)
order1
# Correlations with significance levels
library(Hmisc)
rcorr(order$mean.GC, order$mean.NN, type="pearson")
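The same coefficient and p-value can also be obtained in a single call with base R's `cor.test()`, which makes it possible to build the legend text from the result instead of typing the numbers by hand. The sketch below uses simulated values; the data frame `gc_demo` and its contents are hypothetical and only reuse the column names from the script above.

```r
set.seed(1)

# Simulated stand-in for the GC content vs nearest-neighbour input file
gc_demo <- data.frame(mean.GC = runif(30, min = 30, max = 45))
gc_demo$mean.NN <- 0.5 * gc_demo$mean.GC + rnorm(30, sd = 3)

# Pearson correlation with its significance test
ct <- cor.test(gc_demo$mean.GC, gc_demo$mean.NN, method = "pearson")

# Legend labels generated from the test result
legend_text <- c(
  sprintf("Correlation coefficient = %.2f", ct$estimate),
  sprintf("p = %.3f", ct$p.value)
)
legend_text
```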

Constructing rarefaction curves for Bivalvia and Gastropoda

# sample2matrix() comes from picante; specaccum() comes from vegan
Biv<-read.csv("BivAccumInput Nov 27.csv")
library(picante)
BivMatrix<-sample2matrix(Biv)
BivAccum<-specaccum(BivMatrix, "rarefaction")
plot(BivAccum, xlab="", ylab="", col="cyan4", xlim=c(0,800), ylim=c(0,150),
     ann=FALSE, ci=0, lwd=2)
title(main="", col.main=FALSE, sub="", col.sub=FALSE, xlab="Sample", ylab="Species",
      col.lab="black")
Gas<-read.csv("GasAccumInput Nov 27.csv")
GasMatrix<-sample2matrix(Gas)
GasAccum<-specaccum(GasMatrix, "rarefaction")
# Overlay the Gastropoda curve on the Bivalvia plot
plot(GasAccum, add=TRUE, xlab="", ylab="", col="darkgoldenrod2", xlim=c(0,800),
     ylim=c(0,150), ann=FALSE, ci=0, lwd=2)
title(main="", col.main=FALSE, sub="", col.sub=FALSE, xlab="Sample", ylab="Species",
      col.lab="black")
legend(5,150, c("Bivalvia","Gastropoda"), lty=c(1,1), lwd=c(2,2),
       col=c("cyan4","darkgoldenrod2"), box.lwd=0, box.col="white")
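To illustrate what a rarefaction curve estimates, the sketch below computes the expected number of species in a random subsample of n individuals using the classical individual-based formula (Hurlbert 1971). It is not the algorithm behind `specaccum()`, whose internals are not described here; the function name `rarefy_expected` and the abundance vector are hypothetical.

```r
# Expected species richness in a random subsample of n individuals.
# counts: number of individuals recorded for each species.
rarefy_expected <- function(counts, n) {
  counts <- counts[counts > 0]
  N <- sum(counts)
  stopifnot(n >= 1, n <= N)
  # A species is missed if all n individuals are drawn from the other N - counts.
  # choose() is adequate for small N; very large samples would need lchoose().
  p_missed <- choose(N - counts, n) / choose(N, n)
  sum(1 - p_missed)
}

# Hypothetical abundances for five species (20 individuals in total)
abund <- c(10, 5, 3, 1, 1)

# Expected richness at each subsample size from 1 to 20
curve_vals <- sapply(1:sum(abund), function(n) rarefy_expected(abund, n))
curve_vals
```

The curve starts at one species for a single individual and reaches the observed richness (five species here) once the full sample is included.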

Constructing rarefaction curves for Cephalopoda, Polyplacophora and Scaphopoda

Ceph<-read.csv("CephAccumInput Nov 27.csv")
library(picante)
CephMatrix<-sample2matrix(Ceph)
CephAccum<-specaccum(CephMatrix, "rarefaction")
plot(CephAccum, xlab="", ylab="", col="cornflowerblue", xlim=c(0,150), ylim=c(0,25),
     ann=FALSE, ci=0, lwd=2)
title(main="", col.main=FALSE, sub="", col.sub=FALSE, xlab="Sample", ylab="Species",
      col.lab="black")
Poly<-read.csv("PolyAccumInput Nov 27.csv")
PolyMatrix<-sample2matrix(Poly)
PolyAccum<-specaccum(PolyMatrix, "rarefaction")
# Overlay the Polyplacophora curve
plot(PolyAccum, add=TRUE, xlab="", ylab="", col="mediumvioletred", xlim=c(0,150),
     ylim=c(0,25), ann=FALSE, ci=0, lwd=2)
title(main="", col.main=FALSE, sub="", col.sub=FALSE, xlab="Sample", ylab="Species",
      col.lab="black")
Scaph<-read.csv("ScaphAccumInput Nov 27.csv")
ScaphMatrix<-sample2matrix(Scaph)
ScaphAccum<-specaccum(ScaphMatrix, "rarefaction")
# Overlay the Scaphopoda curve
plot(ScaphAccum, add=TRUE, xlab="", ylab="", col="black", xlim=c(0,150), ylim=c(0,25),
     ann=FALSE, ci=0, lwd=2)
title(main="", col.main=FALSE, sub="", col.sub=FALSE, xlab="Sample", ylab="Species",
      col.lab="black")
legend(5,25, c("Cephalopoda","Polyplacophora","Scaphopoda"), lty=c(1,1,1), lwd=c(2,2,2),
       col=c("cornflowerblue","mediumvioletred","black"), box.lwd=0, box.col="white")

Barcode gap histogram

mean<-read.csv("mean intra R input Nov 27.csv", header=TRUE, sep=",")
mean
# Mean intraspecific divergence (blue, semi-transparent)
hist(mean$Mean.Intra, xaxt='n', col=rgb(0,0,1,0.5), xlab="Sequence Divergence (%)",
     main="", ylim=c(0,150), xlim=c(0,50), breaks=seq(0,50, by=2), axes=TRUE,
     plot=TRUE, labels=FALSE)
axis(side=1, at=seq(0, 50, by=2))
NN<-read.csv("NN R input Nov 27.csv", header=TRUE, sep=",")
NN
# Nearest neighbour distance (red, semi-transparent), overlaid on the same axes
hist(NN$NN, add=TRUE, col=rgb(1,0,0,0.5), breaks=seq(0,50, by=2))
legend(30, 130, c("Mean Intra-specific", "Nearest Neighbour"), cex=1.0, bty="n",
       fill=c(rgb(0,0,1,0.5), rgb(1,0,0,0.5)), box.lwd=0, box.col="white")

ANOVA for testing whether mean nearest neighbour distance is equal between genera with ≥ 2 species

genus<-read.csv("NNgenus vs species Nov 27.csv", header=TRUE, sep=",")
genus
boxplot(genus$NN ~ genus$Species, ylab="Mean NN Distance (%)",
        xlab="Number of Species Sampled")
legend("topright", legend=c("p-value = 0.069"), box.col="black")
genus.aov<-aov(genus$NN ~ genus$Species)
summary(genus.aov)
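One detail worth checking when running this kind of ANOVA is how the grouping column is stored. If the number of species per genus is read from the CSV as a numeric column (an assumption; the input file is not shown), `aov()` fits a single linear trend with one degree of freedom rather than one mean per group. The sketch below shows the difference on simulated data; `genus_demo` and its values are hypothetical.

```r
set.seed(2)

# Simulated stand-in: 8 genera in each of three species-count groups
genus_demo <- data.frame(
  Species = rep(c(2, 3, 4), each = 8),      # number of species sampled per genus
  NN      = rnorm(24, mean = 10, sd = 3)    # mean nearest-neighbour distance (%)
)

# Numeric predictor: one slope, 1 degree of freedom
fit_numeric <- aov(NN ~ Species, data = genus_demo)

# Factor predictor: one mean per group, (groups - 1) degrees of freedom
fit_factor <- aov(NN ~ factor(Species), data = genus_demo)

df_numeric <- summary(fit_numeric)[[1]][["Df"]][1]
df_factor  <- summary(fit_factor)[[1]][["Df"]][1]
p_factor   <- summary(fit_factor)[[1]][["Pr(>F)"]][1]
```

Comparing `df_numeric` with `df_factor` in the summary output is a quick way to confirm which of the two models was actually fitted.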

Linear regression modeling the relationship between intraspecific divergence and number of individuals

MaxIndi<-read.csv("max intra vs indi Nov 27.csv", header=TRUE, sep=",")
MaxIndi
# Maximum intraspecific divergence against number of individuals
plot(MaxIndi$Max.Intra ~ MaxIndi$X..Individuals, xlim=c(0,70), ylim=c(0,35),
     col="mediumvioletred", xlab="Number of Individuals",
     ylab="Intra-specific Divergence (%)", bty="n", pch=20, cex=1.5)
Max1<-lm(MaxIndi$Max.Intra ~ MaxIndi$X..Individuals)
summary(Max1)
abline(Max1, col="mediumvioletred")
MeanIndi<-read.csv("mean intra vs indi Nov 27.csv", header=TRUE, sep=",")
MeanIndi
# Overlay mean intraspecific divergence on the same axes
par(new=TRUE)
plot(MeanIndi$Mean.Intra ~ MeanIndi$X..Individuals, xlim=c(0,70), ylim=c(0,35),
     axes=FALSE, col="darkgoldenrod2", xlab="Number of Individuals",
     ylab="Intra-specific Divergence (%)", bty="n", pch=20, cex=1.5)
Mean1<-lm(MeanIndi$Mean.Intra ~ MeanIndi$X..Individuals)
summary(Mean1)
abline(Mean1, col="darkgoldenrod2")
legend(50,35, legend=c("Maximum", "Mean"), box.col="white", text.col=c("black"),
       pch=c(20,20), col=c("mediumvioletred","darkgoldenrod2"))
text(62,4,c("R2=-0.006"),col=c("black"), cex=0.75)
text(62,1,c("R2=-0.002"),col=c("black"), cex=0.75)
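The plot labels in the script report negative values for "R2". An ordinary R² cannot be negative, so these are presumably adjusted R² values, although the script does not say so. The sketch below shows where both quantities sit in the `summary()` of a fitted model and how a label can be generated from them; the data are simulated and the object names are hypothetical.

```r
set.seed(3)

# Simulated stand-in: divergence unrelated to the number of individuals sampled
n_species <- 40
indi_demo <- data.frame(n_ind = sample(2:60, n_species, replace = TRUE))
indi_demo$max_intra <- rnorm(n_species, mean = 2, sd = 1)

fit <- lm(max_intra ~ n_ind, data = indi_demo)
fit_summary <- summary(fit)

r2     <- fit_summary$r.squared       # always between 0 and 1
adj_r2 <- fit_summary$adj.r.squared   # penalised for predictors; can fall below 0

# Label text built from the fitted model rather than typed by hand
r2_label <- sprintf("Adjusted R2 = %.3f", adj_r2)
r2_label
```

With one predictor, adjusted R² equals `1 - (1 - r2) * (n - 1) / (n - 2)`, which is why a near-zero R² yields a slightly negative adjusted value.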

Chapter 2

Paired t-test for determining whether mean haplotype and nucleotide diversity are equal between species

pop<-read.csv("diversity input POP GEN.csv", header=TRUE, sep=",")
pop
boxplot(pop$Haplotype ~ pop$Species, ylab="Haplotype Diversity", xlab="Species")
# With default arguments, t.test() on a formula runs Welch's two-sample test
t.test(Haplotype~Species, data=pop)
nuc<-read.csv("nucleotide input POP GEN.csv", header=TRUE, sep=",")
nuc
boxplot(nuc$Nucleotide ~ nuc$Species, ylab="Nucleotide Diversity", xlab="Species")
t.test(Nucleotide~Species, data=nuc)
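The section title refers to a paired test, whereas `t.test()` called with a formula and default arguments performs Welch's unpaired two-sample test. The sketch below contrasts the two on hypothetical diversity values, assuming the two species could be matched by sampling region. The region names and all numbers are illustrative only.

```r
# Hypothetical haplotype diversity per region, in the same region order for both species
regions      <- c("NE Pacific", "Churchill", "Bay of Fundy", "Nova Scotia")
hap_hiatella <- c(0.95, 0.62, 0.80, 0.71)
hap_macoma   <- c(0.90, 0.40, 0.66, 0.70)

# Unpaired comparison allowing unequal variances (R's default)
welch_res <- t.test(hap_hiatella, hap_macoma)

# Paired comparison: tests the mean of the within-region differences
paired_res <- t.test(hap_hiatella, hap_macoma, paired = TRUE)

welch_res$method
paired_res$method
paired_res$parameter   # degrees of freedom = number of pairs - 1
```

A paired design is only appropriate when each value for one species has a natural partner in the other, such as the same sampling region; otherwise the unpaired form is the safer choice.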

Histogram displaying the range of intraspecific divergence in Hiatella arctica and Macoma balthica

# Hiatella arctica
HiDiv <- read.csv("Hia divergences Nov 24.csv", header=TRUE, sep=",")
hist(HiDiv$Divergence, xlab="Sequence Divergence (%)", ylab="Frequency", main="",
     xlim=c(0,25), ylim=c(0,13000), col="red")
# Macoma balthica
MacDiv <- read.csv("Mac divergences Nov 24.csv", header=TRUE, sep=",")
hist(MacDiv$Divergence, xlab="Sequence Divergence (%)", ylab="Frequency", main="",
     xlim=c(0,15), ylim=c(0,6000), col="blue")
