
TEMPORAL AND SPATIAL VARIATION IN POPULATION STRUCTURE AND DISTRIBUTION OF A DIRECT-DEVELOPING CRYPTIC SPECIES COMPLEX, LEPTASTERIAS SPP.

A thesis submitted to the faculty of San Francisco State University in partial fulfillment of the requirements for the Degree

Master of Science

In

Biological Sciences: Ecology, Evolution and Conservation

by

Laura Michelle Melroy

San Francisco, California

January 2015

Copyright by Laura Michelle Melroy 2015

CERTIFICATION OF APPROVAL

I certify that I have read TEMPORAL AND SPATIAL VARIATION IN POPULATION STRUCTURE AND DISTRIBUTION OF A DIRECT-DEVELOPING CRYPTIC SPECIES COMPLEX, by Laura Michelle Melroy, and that in my opinion this work meets the criteria for approving a thesis submitted in partial fulfillment of the requirement for the degree Master of Science in Biology: Ecology, Evolution and Conservation Biology at San Francisco State University.

C. Sarah Cohen, Ph.D. Professor, Biology

Professor, Biology

TEMPORAL AND SPATIAL VARIATION IN POPULATION STRUCTURE AND DISTRIBUTION OF A DIRECT-DEVELOPING CRYPTIC SPECIES COMPLEX, LEPTASTERIAS SPP.

Laura Michelle Melroy San Francisco, California 2015

Temporal genetic studies of direct-developing organisms are rare. Marine invertebrates lacking a planktonic larval stage are expected to have lower dispersal, lower gene flow, and a higher potential for local adaptation than organisms with planktonic dispersal. Leptasterias is a genus of brooding sea stars that contains several cryptic species complexes. Population genetic methods were used to resolve patterns of fine-scale population structure in central California Leptasterias using three loci from nuclear and mitochondrial genomes. Historic samples were compared to contemporary samples to delineate changes in species distributions in space and time. Contemporary sampling confirmed a pattern of shared haplotypes at both mitochondrial and nuclear loci around the mouth of San Francisco Bay, with distinct, shared haplotypes bracketing populations north and south of the bay. Phylogenetic analysis confirmed the presence of a bay-localized clade and revealed the presence of an additional bay-localized and previously undescribed clade of Leptasterias. Analysis of contemporary and historic samples indicated that these two clades may be experiencing a constriction in their southern range limit and suggested a decrease in abundance at sites at which they were once prevalent. Historic sampling revealed a different distribution of diversity along the California coastline compared to contemporary sampling and illustrates the importance of temporal genetic sampling in phylogeographic studies.

I certify that the Abstract is a correct representation of the content of this thesis.

Date

ACKNOWLEDGEMENTS

I would like to thank members of the Cohen Lab, Jason Helvey and Shana Gallagher for help with sample collections and bench work. I would also like to thank John Largier, Eric Routman and Greg Spicer for helpful discussion. I would like to thank the California Academy of Sciences Invertebrate Zoology Collection for allowing me to sample their collections of Leptasterias, and specifically Kelly Markello and Chrissy Piotrowski for their help. I would like to thank David Foltz for providing samples and for helpful discussion. I would like to thank Mamie Chapman, Sara Caldwell, and Sherry Tamone for assistance with sample collection in Alaska. Thanks to Renate Eberl for comments on the manuscript. Funding was provided to LM by SFSU Graduate Student Council, SFSU Instructional Research Award, and RTC. Use of the Romberg Tiburon Center, SFSU gene lab was made possible by National Science Foundation FSML Grant 0435033 (CSC) and donations from Biolink CCSF, SFSU COSE, and the Koret Foundation.

TABLE OF CONTENTS

List of Tables

List of Figures

List of Appendices

Introduction

Methods

Sample Collection and Extraction

PCR Amplification

Sequencing Reactions

Phylogenetic Analysis

Population Analysis

Results

Phylogenetic Analysis of Contemporary and Historical Samples

Population Genetic Analysis

Historical Analysis

Discussion

Divergence Times and Phylogenetics

Patterns of Population Structure

Population Expansion and Evidence for Range Shifts

References

Appendices

LIST OF TABLES

1. Sample Collection Data

2. Genetic distances

3. Molecular Diversity Statistics

LIST OF FIGURES

1. Leptasterias mtDNA Phylogeny

2. Time Since Divergence Tree

3. Haplotype Maps

4. Clade Maps

5. Haplotype Network

6. Mismatch Distributions

7. Change in Clade Frequency Map

LIST OF APPENDICES

1. Reference Sequences

2. Population pairwise differences

3. AMOVA

4. Mismatch Statistics

5. Intron Phylogeny

6. Minimum Spanning Tree Excluding Indels


Introduction

The marine environment is heterogeneous with many instances of speciation and genetic differentiation due to adaptive and neutral processes of divergence. Geographic patterns of genetic variation in the marine environment are shaped by life history, oceanographic and transport processes, sea level and land changes, and selection. These processes influence the population structure of marine invertebrates on both temporal and spatial levels. Temporal studies of genetic structure are rare, and temporal genetic studies of direct-developing organisms are rarer still (Lee and Boulding 2009).

Marine organisms with direct development tend to have lower vagility than organisms with planktonic larvae, low levels of gene flow and high genetic structure between populations, a higher potential for local adaptation, and more frequent speciation and extinction events (Strathmann and Strathmann 1982; Jablonski 1986; Levin 2006; Kelly and Palumbi 2010). Direct developers are also expected to show higher temporal genetic stability than planktonic dispersers (Lee and Boulding 2009). However, predictions of genetic structure based on development mode are not always consistent (Sotka 2004; Becker et al. 2007; Bradbury et al. 2008). These inconsistencies have been explained by intrinsic and extrinsic factors influencing dispersal on temporal and spatial scales such as larval behavior, currents and circulation, habitat discontinuity, and geographic features (Avise 1992; Billot et al. 2003; Selkoe et al. 2010; Winston 2012; Kamel et al. 2014).

Connectivity influences the evolutionary trajectory of populations, and properly quantifying gene flow between populations depends on accurate taxonomic groupings (Pante et al. 2015). Successful management of marine protected areas depends upon understanding the levels of connectivity between populations within and outside the protected boundaries and can benefit from temporal genetic studies and proper species delineation. Assessing temporal stability of genetic diversity in a brooding species may give insight into long-term environmental and demographic processes contributing to population structure.

Leptasterias, the six-rayed sea star, is a genus of direct-developing sea stars ranging from Alaska to central California. Leptasterias occur in rocky intertidal and subtidal habitats and typically measure less than 6 cm from ray tip to ray tip (Chia 1966). Leptasterias brood their young, and fully developed young crawl away to disperse (Chia 1966; Menge 1975). Due to their direct-developing life history and small size, these sea stars have limited dispersal to new sites. Dispersal likely occurs by individuals rafting on macroalgae or other floating substrate, with long distance dispersal possible, though likely infrequent (Highsmith 1985; Parker and Tunnicliffe 1994). Due to the limited vagility of these sea stars, Leptasterias can be considered a coastal indicator species reflecting local environmental health; however, proper species identification is necessary to assess changing distributions, abundances, and population health. Assessing cryptic diversity is also important in providing baseline data for monitoring the ecological effects of mortality events and other environmental perturbations, especially as these types of events are predicted to increase with global climate change (Jurgens et al. 2015).

Many lineages within the Leptasterias genus have an unresolved taxonomic status. In broad scale analyses of the Leptasterias genus, several cryptic lineages were identified within both L. hexactis and L. aequalis using mitochondrial molecular data (Hrincevich et al. 2000; Foltz et al. 2008). These studies determined that L. hexactis comprised two clades: L. hexactis C found in Washington and L. hexactis G found in Alaska. L. aequalis was found to comprise four allopatric and sympatric clades: L. aequalis B found in Washington, L. aequalis A ranging from Washington to north of San Francisco Bay, L. aequalis D ranging from Washington to south of Monterey Bay, and L. aequalis K ranging from Point Conception to south of Monterey Bay. Morphological characters are challenging for identification within the Leptasterias genus due to high morphological variability within and between clades (Foltz et al. 1996). Taxonomic uncertainty exists for L. aequalis D, which is also referred to as L. pusilla in the literature (Foltz et al. 2008). Fine-scale genetic analysis of Leptasterias may lead to taxonomic revision and resolution in this genus.

Past studies on the broad scale distributions of Leptasterias spp. may not account for the fine scale cryptic diversity within the genus, and previous studies addressed the need for fine scale analysis (Foltz et al. 2007; 2008). Indeed, a recent study used sequence data from D-Loop to reveal the presence of a previously undescribed clade, Clade Y, localized around San Francisco Bay (Melroy et al. in prep). The current study will investigate temporal and spatial population structure using nuclear and mitochondrial sequence data with more widespread contemporary sampling and historic museum sampling. Previous range estimates of Leptasterias may be incorrect as museum samples were historically identified based on unreliable morphological characters; molecular identification in this study may contradict these early classifications. Phylogenetic clarity and temporal genetic analysis of the low dispersing Leptasterias genus will offer important insights into understanding evolutionary processes such as local adaptation, speciation, and divergence. Multilocus sequence data will be used to 1) clarify the phylogenetic relationship between central California lineages of Leptasterias spp., 2) investigate present patterns of genetic structure, and 3) determine how species ranges and abundances are shifting over time and space.

Methods

Sample Collection and Extraction

Three hundred forty-five Leptasterias individuals were collected from 17 intertidal sites on the Pacific Coast between December 2008 and July 2014 (Table 1). Ray samples were collected from individuals separated by at least one meter and tissue was stored in 95% ethanol. Samples are stored at the Romberg Tiburon Center for Environmental Studies (RTC) in Tiburon, California at 4°C.

Historical samples of Leptasterias spp. collected between 1897 and 1998 were obtained from the Invertebrate Zoology collection at the California Academy of Sciences and from D. W. Foltz (Louisiana State University). Historic samples were obtained from sites on the Pacific coast between Lonesome Cove, Washington and Diablo Canyon, California (Table 1). Tube feet were sampled from historic samples and transferred from ethanol into milliQ water and left on a shaker for two days to remove excess ethanol prior to extraction. Ray tissue was used for extraction of DNA in contemporary samples.

Samples collected in 2008 from Marshall Gulch, Bodega Bay and Rock were extracted with a phenol chloroform extraction. All remaining DNA extractions were carried out using NucleoSpin Tissue Columns (Macherey-Nagel Inc, Bethlehem, PA, USA).

Control Region Amplification

Forward primer E16Sa (Smith et al. 1993) and reverse primer Star-L (Flowers and Foltz 2001) were used for amplification of 286 bp of the putative control region and 8 bp of the conserved 3’ end of the large ribosomal subunit 16S gene (henceforth referred to as D-Loop for simplicity). For contemporary samples, PCR reactions were carried out in a 25 µL reaction with the following components: 5-500 ng DNA template, 0.2 µM of each primer, 1X PE II Buffer, 1 mM dNTPs, 2.5 mM MgCl2, 1.25 µg BSA, 1 unit of Taq DNA polymerase (New England Biolabs, NEB, Ipswich, MA, USA) and milliQ water up to the final volume. Thermal cycling conditions were: initial denaturation at 94°C for 2 minutes; 30 cycles of denaturation at 94°C for 30 seconds, annealing at 45°C for 1 minute, and extension at 72°C for 1 minute; and a final extension at 72°C for 5 minutes. For historic samples, a separate thermal cycling reaction was run with a total of 35 cycles if amplification was unsuccessful in 30 cycles.

COI Amplification

Primers were designed from published COI sequences of L. aequalis and L. hexactis (Hrincevich et al. 2000; Foltz et al. 2008). Forward primer COILF, 5’ GCA-GGA-TTT-ACC-CAC-TGA-TTT-C 3’, and reverse primer COILR, 5’ CCT-GGC-TTC-ACA-GGC-AGA-T 3’, amplified 378 bp of cytochrome oxidase subunit I, 68 bp of tRNA-Arg and 90 bp of ND4L genes (henceforth referred to as COI for simplicity). PCR reactions were carried out using the same reaction volumes as in the control region amplification. Thermal cycling conditions were: initial denaturation at 96°C for 2 minutes; 35 cycles of denaturation at 94°C for 30 seconds, annealing at 46°C for 30 seconds, and extension at 72°C for 1 minute; and a final extension at 72°C for 5 minutes. For historic samples, thermal cycling conditions included a total of 40 cycles.
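The primer details above can be sanity-checked with a few lines of code. The following is a minimal sketch, not part of the original workflow: it strips the triplet hyphens from the COILF and COILR sequences as printed, reports primer length and GC fraction, and confirms that the three amplified segments (378 + 68 + 90 bp) sum to the 536 bp COI region reported in the Results. The function names are hypothetical and chosen only for illustration.

```python
# Primer sequences as printed in the text (hyphens only group bases in triplets).
PRIMERS = {
    "COILF": "GCA-GGA-TTT-ACC-CAC-TGA-TTT-C",
    "COILR": "CCT-GGC-TTC-ACA-GGC-AGA-T",
}

# Segment lengths (bp) of the amplified fragment, as stated in the text.
COI_SEGMENTS = {"COI": 378, "tRNA-Arg": 68, "ND4L": 90}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Remove the grouping hyphens and normalize to upper case."""
    return primer.replace("-", "").upper()


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def fragment_length(segments):
    """Total length of a fragment built from the listed segments."""
    return sum(segments.values())


if __name__ == "__main__":
    for name, raw in PRIMERS.items():
        seq = clean(raw)
        print(name, seq, len(seq), "nt", f"GC={gc_fraction(seq):.2f}")
    print("COI fragment length:", fragment_length(COI_SEGMENTS), "bp")
```

The reverse complement helper is included because the reverse primer is written 5’ to 3’ on the opposite strand, so its reverse complement is what would appear in an alignment of the forward strand.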

Intron Amplification

Five Exon Primed Intron Crossing (EPIC) loci (Chenhuil et al. 2010; Gerard et al. 2013) were screened for amplification in Leptasterias based on successful amplification in other taxa: i1, i9, i39, i43, and i51. The EPIC locus that offered the highest resolution among sites and clades was chosen for population genetic analyses. The i51 primer pair amplified an intron in the gene group UDP-N-acetylglucosaminyl-transferase (Chenhuil et al. 2010). Primers were redesigned for i51 to decrease the primer degeneracy. Forward primer i51LF GAT-CGA-CCC-AGC-CAC-ATT and reverse primer i51LR TTG-AAG-CAA-CAG-GGG-AGA-AG were then exclusively used to amplify a 277 base pair region of the intron. PCR reactions were the same as D-Loop and COI, but used 0.1 µM of each primer. Thermal cycling conditions were: initial denaturation at 96°C for 1 minute; 35 cycles of denaturation at 94°C for 40 seconds, annealing at 45°C for 30 seconds, and extension at 72°C for 40 seconds; and a final extension at 72°C for 2 minutes.

Sequencing Reactions

Amplification of PCR templates was assessed with gel electrophoresis using a 1.5% agarose gel stained with ethidium bromide. PCR products were cleaned using a SAP/EXO reaction following the manufacturer's instructions (ExoSAP-IT® for PCR product cleanup, Affymetrix, Santa Clara, CA, USA). Cycle sequencing reactions were carried out in the reverse direction using the 1/8 reaction BigDye Terminator v3.1 (Applied Biosystems, ABI, Foster City, USA). Cycle sequencing products were sequenced using an ABI 3130 genetic analyzer at RTC. Cloning was used to resolve and confirm a subset of alleles for i51 in heterozygous individuals using Vector System II, pGEM-T (Promega, Madison, WI).

Phylogenetic Analysis

D-Loop and COI haplotypes were included in phylogenetic analyses and compared to published D-Loop and COI Leptasterias sequences (Table S1). L. camtschatica was chosen as an outgroup based on previous phylogenetic analysis identifying it as a sister group to L. hexactis and L. aequalis (Foltz et al. 2008).

Maximum likelihood (ML) analyses were performed in PAUP* v.4.0 (Swofford 2001) for D-Loop, COI and i51 haplotypes. Indels were treated as missing data. The automated model selection feature was used to choose the most appropriate nucleotide substitution model using the Akaike Information Criterion (AIC; Posada and Crandall 1998). TN93 with gamma site heterogeneity (Tamura and Nei 1993) was used for D-Loop and COI and HKY+G (Hasegawa et al. 1985) was used for i51. Bootstrap analyses were performed using a Jukes-Cantor neighbor-joining tree as the starting tree for a heuristic search with 1000 replicates. ML analysis was performed on the full COI data set, and then again on the dataset excluding third position changes.

Bayesian analysis was performed in MrBayes v.3.2 (Huelsenbeck and Ronquist 2001) to evaluate the results of the ML trees. Metropolis-coupled Markov Chain Monte Carlo (MCMCMC) methods were employed for all loci using the HKY+G model of nucleotide evolution. With each tree search, four parallel searches were run for 2 million generations with chains sampled every 500 generations. Trees prior to a split frequency value of 0.01 were discarded as the burn-in. Trees were constructed for D-Loop, COI, and concatenated D-Loop and COI. The i51 phased haplotype alignment was imported into Seqstate v. 1.4.1 (Muller 2005) to code indels as simple characters (Simmons and Ochoterena 2000) and complex characters (Muller 2006). A phylogenetic tree was estimated in MrBayes for i51 using indels as both missing characters and coded as informative characters.

Divergence times were measured between concatenated historical and contemporary mtDNA haplotypes to estimate the time of differentiation between previously undescribed clades within L. aequalis using BEAST v1.8.2 (Drummond et al. 2012). The TN93+G model was employed in BEAST. An uncorrelated lognormal relaxed clock with an estimated substitution rate was used with all tips set to zero. L. aequalis was constrained as monophyletic, but taxon sets within L. aequalis were not constrained as monophyletic. Using the putative mitochondrial control region and COI, Foltz et al. (2008) used a molecular clock calibrated with the crossover of L. muelleri through the Bering Strait as the prior probability. Divergence between L. hexactis and L. aequalis was estimated to have occurred 2.56 Mya and 3.08 Mya. These estimations were averaged for the combined mtDNA divergence time and set as the normal distribution calibration point in this study. Starting trees were randomly generated and the tree prior assumed a Yule speciation process. For each BEAST analysis, the MCMC was performed for 10^7 generations, sampling every 1000 generations with a burn-in of 10%. Summary statistics were generated and visualized in Tracer v1.6.0 (Drummond et al. 2012). Maximum clade credibility trees with median node heights of >50% posterior probabilities were calculated with TreeAnnotator v1.8.2 (Drummond et al. 2012) and were drawn in FigTree v1.4.0 (Rambaut 2009). The analysis was run five times to confirm convergence and the combined results are reported.

Population Analysis

COI, D-Loop and i51 sequences were edited by eye in Geneious v7.1.7 (Kearse et al. 2014) and aligned separately using the MUSCLE algorithm. The COI alignment was translated in Mesquite v3.0.4 (Maddison and Maddison 2001) to ensure the correct reading frame was used and to determine base saturation and nucleotide position changes. D-Loop and COI were analyzed both separately and as a single locus in all analyses below. The program DnaSP v5.10 (Rozas et al. 2003) was used to calculate standard diversity indices including haplotype diversity (h) and nucleotide diversity (π) for each population. DnaSP was used to calculate Tajima’s D, Fu and Li’s D, and Fu and Li’s F. Neutrality statistics were considered significant when p < 0.05. The PHASE (Stephens et al. 2001) algorithm in DnaSP was used to resolve the allelic phase for single nucleotide polymorphisms in i51 sequences. MEGA v5.2.2 (Tamura et al. 2011) was used to generate mean genetic distances between and within clades for all loci using the TN93 model for COI and D-Loop and the HKY model for i51.
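For readers unfamiliar with the two diversity indices named above, the following minimal sketch shows one standard way to compute them from an alignment. It is an illustration only and makes no claim about how DnaSP implements them: haplotype diversity is taken as h = n/(n-1) × (1 - Σ p_i²), and nucleotide diversity as the mean number of pairwise differences per site. The toy sequences are hypothetical, and the sketch assumes equal-length, gap-free sequences.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Haplotype diversity h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        return 0.0
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def pairwise_differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (assumes equal-length, gap-free sequences)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])


if __name__ == "__main__":
    # Hypothetical toy alignment: four individuals, three haplotypes.
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
    print("h  =", round(haplotype_diversity(toy), 4))
    print("pi =", round(nucleotide_diversity(toy), 4))
```

A population fixed for a single haplotype gives h = 0 and π = 0, which matches the lower bounds reported for some sites in this study.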

Arlequin v3.5 (Excoffier et al. 2005) was used to test for signatures of non-neutral evolutionary forces on all loci by calculating Fu’s Fs neutrality statistic, which was considered significant when p < 0.02. Arlequin was used to calculate fixation indices (FST and ΦST). ΦST was calculated with the TN93 model of evolution for D-Loop and COI, and HKY was used for i51. Population structure was examined using an Analysis of Molecular Variance (AMOVA) across populations grouped into three regions based on population pairwise differences (Table 1): northern (north of Point Reyes), bay-proximal (Duxbury Reef to Mussel Rock) and southern (south of Half Moon Bay). Genetic differentiation was compared between 1) northern versus southern versus bay-proximal populations, and 2) northern and southern populations versus bay-proximal populations.
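As a rough illustration of what a fixation index measures, the sketch below computes a simple heterozygosity-based value, (H_T - H_S) / H_T, from haplotype labels in two populations. This is a deliberately simplified stand-in and is not the AMOVA-based estimator that Arlequin reports: it ignores sample-size corrections and sequence distances, and it weights both populations equally. All names and the toy data are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype label in one population."""
    n = len(haplotypes)
    return {hap: count / n for hap, count in Counter(haplotypes).items()}


def gene_diversity(freqs):
    """Uncorrected gene diversity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def simple_fst(pop_a, pop_b):
    """Heterozygosity-based differentiation, (H_T - H_S) / H_T, for two populations."""
    freq_a = haplotype_frequencies(pop_a)
    freq_b = haplotype_frequencies(pop_b)
    # H_S: mean within-population diversity.
    h_s = (gene_diversity(freq_a) + gene_diversity(freq_b)) / 2
    # H_T: diversity of the pooled population, weighting both populations equally.
    pooled = {
        hap: (freq_a.get(hap, 0.0) + freq_b.get(hap, 0.0)) / 2
        for hap in set(freq_a) | set(freq_b)
    }
    h_t = gene_diversity(pooled)
    if h_t == 0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Two populations fixed for different haplotypes: complete differentiation.
    print(simple_fst(["A"] * 4, ["B"] * 4))
    # Two populations with identical frequencies: no differentiation.
    print(simple_fst(["A", "B"], ["A", "B"]))
```

Values near 1 correspond to populations that share few or no haplotypes, and values near 0 to populations with similar haplotype frequencies.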

Mismatch distributions were generated in Arlequin and DnaSP for L. aequalis K and Clade Y samples. Other clades were omitted due to low sample sizes. Haplotype connections were exported from Arlequin and imported into HapStar v0.7 (Teacher and Griffiths 2011) to build a haplotype network for all loci. The haplotype network for i51 was constructed using indels as both informative and non-informative characters.

Results

Historic samples amplified preferentially in COI and were often unsuccessful for amplification of D-Loop or i51; therefore, historical population genetic analysis is based upon COI sequence data. Of 367 total COI sequences, 306 sea stars were from contemporary samples and 61 sea stars were from historic sampling. A 536 base pair region of COI showed 85 variable sites, 60 of which were parsimony-informative. The amplified COI region comprised 74 haplotypes, 30 of which were private haplotypes in contemporary samples and seven of which were private haplotypes found only in historic samples. A 297 base pair region of D-Loop amplified in 325 individuals and showed 48 variable sites, 40 of which were parsimony informative. The amplified D-Loop region contained 82 haplotypes, with six indels present. Of the 82 total haplotypes, 22 were private. A 197 base pair intron region of i51 showed 55 variable sites, 15 of which were parsimony informative, for a total of 68 alleles. Thirty-eight of the alleles were private. Alleles consisted of 10 polymorphic sites and two indels. One indel consisted of a 13 base pair sequence repeat with alleles having between one and seven perfect repeats.
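Counting private haplotypes amounts to a set difference: a haplotype is private to a group if no other group contains it. The sketch below illustrates that logic for two sample sets. The grouping used here (contemporary versus historic) and the haplotype labels are hypothetical, and the same function would apply if the groups were individual sampling sites instead.

```python
def private_haplotypes(groups):
    """Return, for each group, the haplotypes found in that group and no other.

    groups: dict mapping a group name to an iterable of haplotype labels.
    """
    sets = {name: set(labels) for name, labels in groups.items()}
    private = {}
    for name, haps in sets.items():
        # Union of the haplotypes seen in every other group.
        others = set().union(*(s for other, s in sets.items() if other != name))
        private[name] = haps - others
    return private


if __name__ == "__main__":
    # Hypothetical haplotype labels, one entry per sequenced individual.
    toy = {
        "contemporary": ["H1", "H2", "H3", "H1"],
        "historic": ["H2", "H4"],
    }
    for group, haps in private_haplotypes(toy).items():
        print(group, sorted(haps))
```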

Phylogenetic Analysis of Contemporary and Historical Samples

Base position changes for COI included 23 first position changes, 15 second position changes, and 47 third position changes. When ML analyses were performed excluding third position changes for COI sequences, overall topology was unchanged, though nodal support values increased slightly. Phylogenetic trees built with D-Loop and COI separately had unchanged topology regardless of the method used to estimate the tree. Therefore, loci were concatenated and a phylogenetic tree was built (Fig 1).

The D-Loop and COI concatenated haplotype phylogeny resolved six monophyletic clades with statistical nodal support with the exception of four haplotypes, which are referred to as “Group 1”. The tree resolved all L. aequalis clades previously characterized by Foltz et al. (1996; 2008): L. aequalis K, D, A, and B. Phylogenetic analysis confirmed the presence of an undescribed clade, Clade Y (Smith and Cohen 2013; Coleman and Cohen unpubl data; Melroy et al. in prep) and uncovered an additional undescribed clade, here termed Clade Z. Clade Y is supported as a monophyletic clade through bootstrap support. Clade Z is supported as a monophyletic clade through both bootstrap and Bayesian posterior probabilities.

Genetic distances were calculated for concatenated D-Loop and COI using the TN93+G model (Table 2). Two main groups emerged from phylogenetic analysis: L. aequalis B and L. aequalis K with Group 1 versus L. aequalis D, L. aequalis A, Clade Y and Clade Z. Genetic distances between clades within each main group ranged from 0.9 to 2.0% within the first group and from 1.3 to 2.0% within the second. Genetic distance ranged from 2.4 to 2.9% between clades in different groups. Intra-clade genetic distances ranged from 0.10 to 1.30%, with a mean genetic distance of 0.70%. L. aequalis K had the highest within-clade genetic distance (1.30%). Mean inter-clade genetic distance was higher (1.97%), with a range of 0.9 to 2.9%.

Bayesian reconstruction in BEAST resulted in a tree with overall similar topology to the ML and Bayesian tree in Figure 1. All Leptasterias clades were supported as monophyletic with the exception of haplotypes assigned as Group 1. The estimated mean rate of evolution for mtDNA was 0.0135 ± 0.0001 substitutions/site/My. Divergence times were estimated between clades and time to most recent common ancestor (TMRCA) was estimated within each clade (Fig 2). All effective sample sizes were over 1000 with the exception of the split between L. aequalis A and Clade Z (nESS = 949), TMRCA for Clade Y (nESS = 517) and TMRCA for Clade Z (nESS = 987). The two main groupings of lineages were estimated to have diverged between 1.74 and 1.25 Mya.

ML and Bayesian analysis of i51 alleles produced trees with the same topology. Samples obtained from Alaska populations, alleles CC, G, and SS, were used as outgroup sequences, as these samples were identified as L. hexactis through D-Loop and COI barcoding. Alleles within L. aequalis were not resolved (Fig S1). The topology of the i51 phylogeny was unchanged when the 13 base pair repeat indel was used as informative. The i51 phylogenetic tree showed low resolution and did not further resolve the relationships within lineages of the putative L. aequalis.

Population Genetic Analysis

Analysis of Leptasterias populations revealed 59 D-Loop haplotypes and 46 i51 alleles across 17 sites and 58 COI haplotypes across 33 sampling sites and times. Mitochondrial and nuclear haplotypes revealed a strong contemporary geographic pattern in which there were shared haplotypes proximal to the bay bracketed by distinct, shared haplotypes north and south of the bay (Fig 3). When haplotypes were delineated into clades, a genetic disconnect was confirmed, with Clade Y found as bay-associated and bracketed by distinct populations made up predominantly of L. aequalis K (Smith and Cohen 2013; Melroy et al. in prep; Fig 4a). Clade Z was also found as bay-proximal. Northern and southern populations were composed of haplotypes that resolved into L. aequalis K and L. aequalis B. L. aequalis D was found in all regions. L. aequalis K and Clade Y were the most abundant clades found in the contemporary samples. Alaska population haplotypes were delineated as L. hexactis, and the Washington population was composed of haplotypes delineated as L. aequalis B and L. aequalis A. There were high numbers of private haplotypes for mtDNA and fewer private alleles for nuclear DNA (Table 1, Fig 3).

The haplotype network for the mtDNA revealed many low frequency haplotypes separated by a large number of mutations (Fig 5). Haplotypes in Clade Y showed a more typical pattern of one high frequency haplotype with other haplotypes separated by one or two mutations. Other clades revealed haplotypes separated by multiple mutations within the clade. L. aequalis K was composed of haplotypes found both north and south of San Francisco Bay, while Clade Y was composed almost completely of haplotypes that were bay-proximal. The haplotype network for i51 (Fig. 5) revealed four common alleles with other low frequency alleles separated by one or two mutations. Indels were included in the haplotype network and resulted in many mutations separating each cluster of alleles. When indels were excluded from the haplotype network, alleles were separated by fewer mutations; however, a geographic pattern was still not evident (Fig S2).

Haplotype diversity for all populations ranged from 0.0 to 0.947 for mtDNA and 0.104 to 0.867 for i51 (Table 3). Sites with predominant Clade Y abundance had the lowest haplotype diversity values and sites with high abundances of L. aequalis K had the highest haplotype diversity values. Within-clade haplotype diversity was calculated for each population for L. aequalis K and Clade Y as these clades had the largest sample sizes. L. aequalis K population haplotype diversities ranged from 0.5 to 0.933. Clade Y haplotype diversities ranged from 0 to 0.123. Nucleotide diversity for all clades ranged from 0.0 to 0.02494 for mtDNA and 0.00076 to 0.01048 for i51 (Table 3).

Leptasterias spp. pairwise comparisons (FST and ΦST) for mtDNA and i51 contemporary samples revealed significant population structure between most localities (Table S2). All but four mtDNA FST and ΦST values showed significant population differentiation for central California sites. The only populations not significantly different were the northern sites Twin Cove and Marshall Gulch, and the bay-proximal site pairs Muir Beach and Lands End, Muir Beach and Mussel Rock, and Lands End and Mussel Rock. Pairwise values for mitochondrial and nuclear haplotypes between northern and southern sites were low, indicating genetic similarity. The AMOVA analysis for mtDNA and i51 reflected the genetic similarity between populations north of San Francisco Bay and south of San Francisco Bay. In both analyses, between-group differences accounted for the predominant share of variation (Table S3). High within-population variation indicated the presence of sympatric clades.

Tajima’s D, Fu’s Fs, Fu and Li’s F, and Fu and Li’s D statistics revealed several individual localities with significant negative values (Table 3), specifically at Lands End, Muir Beach and Mussel Rock across both mitochondrial and nuclear loci. Negative neutrality statistics across loci from different genomes indicate that demographic factors, rather than selection, are the cause of genetic variation at Clade Y dominated bay-proximal sites.
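To make the sign of these statistics concrete, the sketch below computes Tajima’s D from a small alignment using the standard formula of Tajima (1989), which contrasts the mean number of pairwise differences with the number of segregating sites scaled by the harmonic number a1. It is an illustration under simplifying assumptions (equal-length, gap-free sequences, at least four sequences, no significance test) and is not intended to reproduce DnaSP output. The toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free sequences (requires n >= 4)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(seqs[0])

    # S: number of segregating (variable) sites.
    s = sum(1 for i in range(length) if len({seq[i] for seq in seqs}) > 1)
    if s == 0:
        return None  # D is undefined without variation

    # k: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Star-like sample: one central haplotype plus singletons (excess of rare variants).
    star = ["AAAAA", "CAAAA", "ACAAA", "AACAA", "AAACA"]
    # Two deeply divergent haplotypes at equal frequency (excess of common variants).
    split = ["AAAA", "AAAA", "TTTT", "TTTT"]
    print("star-like sample:", tajimas_d(star))
    print("divergent sample:", tajimas_d(split))
```

The star-like sample yields a negative value, the pattern described above for the bay-proximal sites, while the sample of two divergent haplotypes yields a positive value.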

Mismatch distributions for L. aequalis K and Clade Y did not differ significantly from the unimodal curves expected for a sudden demographic expansion or for a rapid spatial expansion (Fig 6). The raggedness index values were not significant for either clade, and a hypothesis of sudden demographic or spatial expansion could not be rejected (Table S4). Clade Y showed a steep peak in the distribution, consistent with a recent bottleneck or expansion event. The L. aequalis K distribution was slightly more ragged, but still consistent with a demographic expansion.
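A mismatch distribution is the frequency histogram of pairwise sequence differences, and the raggedness index summarizes how jagged that histogram is. The sketch below illustrates both under simplifying assumptions: equal-length, gap-free sequences, and raggedness taken as the sum of squared differences between adjacent frequency classes (Harpending's index), with no significance test. It is not a substitute for the Arlequin or DnaSP analyses used in this study, and the toy alignment is hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each pairwise difference count, indexed from 0."""
    counts = Counter(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(max(counts) + 1)]


def raggedness(distribution):
    """Sum of squared differences between adjacent classes of the distribution."""
    padded = list(distribution) + [0.0]  # class d+1 has frequency zero
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


if __name__ == "__main__":
    # Hypothetical star-like sample: one central haplotype plus four singletons.
    toy = ["AAAAA", "CAAAA", "ACAAA", "AACAA", "AAACA"]
    dist = mismatch_distribution(toy)
    print("mismatch distribution:", dist)
    print("raggedness:", round(raggedness(dist), 4))
```

Smooth, unimodal distributions give low raggedness values, whereas multimodal distributions, which are typical of populations at demographic equilibrium, give higher values.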

Historic Samples

Historic samples successfully amplified at COI; however, sequence data could not be obtained for D-Loop or i51 due to low DNA yield and quality. COI haplotypes for the historic samples were pooled with contemporary COI haplotypes and resolved into clades using a phylogenetic tree constructed with Bayesian analysis and ML (not shown; the topology of supported branches did not differ from the concatenated mtDNA tree). There were two dominant, widespread haplotypes in the historic samples that correspond to bay-proximal haplotypes in contemporary samples, which resolve into Clades Z and Y (Fig. 4b).

Past samples revealed that two clades, Clades Z and Y, were historically more widespread and abundant than indicated by contemporary sampling. Clade delineation of historic samples indicates Clade Z was once found across 800 km of coastline from Crescent City, CA to Diablo Canyon, CA and Clade Y was found across 750 km of coastline from Crescent City, CA to Piedras Blancas, CA (Fig. 7). In contemporary samples, Clade Y was found at sites across 100 km of coastline with one individual found at Pigeon Point as the southern range limit. Clade Z was less abundant in contemporary samples, located only at Slide Ranch in high abundance and Rodeo Beach in low abundance, two sites separated by 8 km. Both clades appear to be currently localized around San Francisco Bay. At Pacific Grove, one of six individuals sampled in 1897 was Clade Y, and in 1909 all six individuals sampled were Clade Y. In 2014, zero Clade Y individuals were found in Pacific Grove. At Pigeon Point, one, one, and four Clade Y individuals were found in 1971, 1972, and 1998, respectively. In 2014, one Clade Y individual was found at Pigeon Point with a more robust sampling scheme of 32 individuals. L. aequalis K was the most abundant clade at both Pigeon Point and Pacific Grove in contemporary samples.

Discussion

Fine-scale genetic analysis of mitochondrial and nuclear loci confirmed a strong genetic disconnect and high levels of population structure for the direct-developing Leptasterias sea stars on a regional scale. San Francisco Bay-proximal populations share common haplotypes at mitochondrial and nuclear loci that are bracketed by distinct northern and southern populations with shared haplotypes, confirming a previous study (Melroy et al. in prep). Historical sampling reveals a different pattern of clade distribution along the coastline than contemporary sampling and illustrates the value of historical data when interpreting phylogeographic history.

Divergence Times and Phylogenetics

COI coupled with D-Loop offered higher resolution of lineages within the Leptasterias genus than analysis using D-Loop alone (Melroy et al. in prep) and revealed the presence of two likely species complexes that have been historically grouped into one. Because L. aequalis D is also referred to as L. pusilla in the literature (Foltz et al. 2008), the grouping in the tree of L. aequalis D, L. aequalis A, Clade Y, and Clade Z may make up a species complex separate from the L. aequalis complex. Genetic distances between the two potential species complexes are comparable to those found in other brooding asteroid lineages (Hart et al. 2003). While higher levels of genetic divergence in the direct-developing Leptasterias might be expected, recent speciation events would not yet have accrued significant levels of divergence, which would also explain morphological stasis. In addition, the use of a smaller segment of DNA in this study could lead to underestimates of genetic distance relative to other studies (Sponer and Roy 2002). While genetic distances alone do not support the separation of clades into distinct species, more genes and behavioral or physiological analyses would help to resolve the relationship of lineages within the Leptasterias genus.

The nuclear intron locus showed lower levels of variation than the mitochondrial loci and was not able to resolve the putative L. aequalis into monophyletic lineages. The nuclear tree showed a starburst pattern, supporting the hypothesis of recent speciation with shared ancestral polymorphisms. Recurrent gene flow is also an explanation for an unresolved species tree; however, the low dispersal potential of Leptasterias makes a recent species radiation more likely. Introns tend to accumulate mutations at a higher rate than exons (Newin et al. 2011); however, i51 is a short intron and may be experiencing genetic hitchhiking through selection on the exons. This could reduce the observed variability of the locus relative to its expected variability.

Divergences between Leptasterias clades were estimated to have occurred between 1.25 and 1.74 Mya, indicating Pleistocene divergence. TMRCA of clades within Leptasterias ranged between 0.15 and 1.74 Mya, suggestive of a recent species radiation event. Given these Pleistocene divergence times, the Leptasterias clades were likely isolated due to range fragmentation caused by glaciation events (Dawson 2001), which caused a disruption in gene flow. Following isolation in the Pleistocene, range expansion in the Holocene likely occurred, as seen in other taxa (Marko 1998; Hellberg et al. 2001; Jacobs et al. 2004; Ellingson and Krug 2006; Dawson et al. 2011). Historic sampling shows a different pattern of clade distribution than contemporary sampling, with Clades Z and Y historically widespread. Neutral and adaptive processes of divergence likely contribute to contemporary patterns of population structure.

Patterns of Population Structure

Historic samples revealed a more complete pattern of phylogeography than contemporary samples alone would have. While sample sizes for historic sites are low and a formal comparison of historic and contemporary allele frequencies could not be made, direct developers are expected to have higher levels of temporal genetic stability than organisms with planktonic development (Lee and Boulding 2009). Here, there are seemingly low levels of temporal stability; sites that were once dominated by Clades Y and Z appear to be shifting to L. aequalis K dominance within a timeframe of 30 years. Even low sample sizes should detect the most common haplotype, and temporal analysis suggests the most common haplotype decreased in frequency between 1897 and the present day.

Pleistocene-mediated isolation probably contributed to the speciation of Leptasterias clades; however, the current distribution of population structure is most likely the remnant of additional processes of neutral or adaptive divergence. Historic samples revealed a widespread range of Clade Y and Clade Z, while contemporary sampling revealed restricted ranges and a localization around San Francisco Bay. The present-day distribution of Clade Y around San Francisco Bay cannot be attributed to the formation of San Francisco Bay itself in the Holocene approximately 10,000 years ago (Atwater et al. 1977; Axelrod 1981), as the speciation event of Clade Y predates the bay formation. While stochastic events such as variable hydrodynamic processes could affect low-dispersing organisms, patterns of contemporary clade distribution may be more specifically explained by several mechanisms, including: 1) colonization events during the San Francisco Bay formation, 2) ocean circulation and transport processes, 3) local adaptation of Clade Y to bay effluent conditions, or 4) competitive success of L. aequalis K.

1. Colonization events during the San Francisco Bay formation

While speciation events of Leptasterias clades predate the formation of San Francisco Bay, it is possible that Clade Y inhabited the area that eventually formed San Francisco Bay and colonized coastal areas around the bay following formation, resulting in their current pattern of distribution. Colonization events are supported by negative neutrality statistics and low haplotype diversities at Clade Y sites, and by a unimodal mismatch curve suggesting recent population expansion. The historically widespread Clade Y or Clade Z haplotypes could represent the source of founders colonizing the bay.

2. Ocean circulation and transport processes

While Leptasterias lack a planktonic dispersal stage, they may be considered epi-planktonic through long-distance dispersal on algal rafts (Highsmith 1985). While these events are likely infrequent, when they do occur, rafters could be affected by ocean circulation processes much as planktonic dispersers are. The region of coastline in this study has two potential physical barriers to dispersal for rafting organisms: San Francisco Bay and Point Reyes.

Estuarine outflow could act as a barrier to dispersal through uninhabitable effluent conditions. Young or adults rafting to bay-proximal populations may experience mortality due to conditions associated with San Francisco Bay effluent: low salinity, high temperatures, or pollutants from wastewater run-off (Luoma and Cloern 1982; Conomos et al. 1985; Nichols et al. 1986). Puritz and Toonen (2011) found reduced genetic diversity and connectivity in the planktonic disperser Patiria miniata across areas of high human impact and pollutant run-off in the Southern California Bight, likely through larval mortality. San Francisco Bay effluent could act as a similar barrier to dispersal in Leptasterias, though additional genetic assays of other intertidal organisms around the bay would help to test this hypothesis.

The Point Reyes peninsula is a prominent geographic feature in the range of Leptasterias and could act as a phylogeographic barrier to dispersal as indicated in other low-dispersing taxa (Marko 1998; Edmands 2001; Ellingson and Krug 2006) and in one higher dispersing taxon (Moberg and Burton 2000). Retention embayments occur both north and south of Point Reyes, which can retain nearshore waters and entrain non-local propagules (Wing et al. 1998; Morgan et al. 2009). These retention zones could effectively limit connectivity of Leptasterias populations north and south of Point Reyes and reduce connectivity of bay-proximal populations to populations north of Point Reyes.

The Monterey Bay region is also associated with retention zones (Graham and Largier 1997) and could facilitate connectivity between northern populations and southern populations. The California Current is driven southward during upwelling months (Huyer 1983; Largier et al. 1993; Checkley and Barth 2009). Water flowing past Point Reyes either moves offshore or is retained in the southern Point Reyes eddy, from which it would eventually move into the bifurcating Ano Nuevo jet (Vander Woude et al. 2006), which moves offshore or towards Monterey (Rosenfeld et al. 1994; Steger et al. 2000). While it is more likely that water would travel offshore rather than be pulled into the upwelling retention zone in Monterey, there is a potential for preferential transport of water north to south, which could connect northern populations and southern populations while reducing movement of water towards the bay gateway. During upwelling relaxation, water moves northward along the California coastline (Largier et al. 1993; Steger et al. 2000), though it is difficult to imagine a scenario in which migration would happen to northern sites but not to bay-proximal sites (J. Largier, personal communication). One possibility is that transport does occur, but L. aequalis K is not adapted to bay conditions and experiences mortality when rafts settle.

3. Local adaptation of Clade Y to bay effluent conditions

Rather than divergence due to phylogeographic breaks, Leptasterias divergence may be better attributed to selection causing reproductive isolation reflected in the genes used in this study. Interestingly, Leptasterias patterns of distribution appear to coincide with general regions of upwelling exposure in central California. Clade Y may be locally adapted to warm, low-salinity conditions from San Francisco Bay effluent affecting local coastal areas (Smith and Cohen 2013; Melroy et al. in prep).

Intense upwelling zones span from Point Arena to Cape Mendocino (Huyer and Kosro 1987; Bakun 1990), and occur near Ano Nuevo (Rosenfeld et al. 1994). L. aequalis K and L. aequalis B occur at upwelling-exposed regions north of Point Reyes and south of Half Moon Bay (Fig. 4). Bay-proximal populations are less exposed to the cold, high-salinity waters of upwelling and are more exposed to the warm, low-salinity effluent from San Francisco Bay. Low haplotype diversities and negative neutrality statistics at mitochondrial and nuclear loci could reflect selection at other loci favoring Clade Y individuals at bay-proximal sites. Adaptive divergence is consistent with expectations for direct developers, which lack a highly dispersive life stage and should be more likely to exhibit local adaptation (reviewed by Sanford and Kelly 2011; Sotka 2012). Indeed, the influence of sea surface temperature is indicated as a selective force in other low-dispersing taxa (Willet 2010; Eberl et al. 2013). While the genetic break of Leptasterias clades around the bay area appears to be upwelling associated, other factors associated with estuarine effluent may be causing further differentiation of Clade Y. Behavioral assays assessing the tolerance of clades to variable temperature and salinity conditions would be valuable in determining the causes of the genetic disconnect found around San Francisco Bay.

4. Competitive success of L. aequalis K

Another hypothesis for the contemporary distribution of Leptasterias clades around central California is that L. aequalis K is outcompeting Clade Y except at bay-proximal sites where Clade Y is better adapted. While historical sample sizes are low, presence of L. aequalis K was detected in only four populations between 1977 and 1998. In comparison, L. aequalis K was one of the most abundant clades in contemporary samples. It is possible L. aequalis K has increased in abundance at sites that are not bay-proximal, but that were once presumably dominated by Clades Z or Y. Low sample sizes of historic collections make this hypothesis difficult to test, though physiological assays could reveal differences in clade tolerance.

Population Expansion and Evidence for Range Shifts

Low haplotype diversities resulting from genetic bottlenecks associated with colonization have been reported in regions experiencing range expansions (Marko 1998; Nasser et al. 2002; Hess et al. 2011). Reduced haplotype diversity, negative neutrality statistics, and mismatch distributions suggest a scenario of population expansion of bay-proximal populations. The expansion indicated by the mismatch distribution of L. aequalis K likely reflects expansion following Pleistocene isolation, while Clade Y expanded more recently, perhaps to occupy bay-proximal sites.
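The diversity measures underlying this argument can be illustrated with a short Python sketch. It assumes the usual sample-size-corrected estimator of haplotype diversity, Hd = n/(n-1) × (1 − Σ p_i²), and takes nucleotide diversity as the mean proportion of sites that differ between pairs of sequences. The function names and example inputs are hypothetical, and the sketch is not the software used to produce Table 3.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Sample-size-corrected haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


if __name__ == "__main__":
    # A site where all 20 individuals carry one haplotype (hypothetical labels)
    print("single haplotype:", haplotype_diversity(["h1"] * 20))
    # A hypothetical, more diverse sample of four individuals
    print("mixed sample:", haplotype_diversity(["h1", "h1", "h2", "h3"]))
    print("nucleotide diversity:", nucleotide_diversity(["AAAA", "AAAT"]))
```

A sample in which every individual shares one haplotype, as reported for COI at Lands End in Table 1 (N = 20, Nh = 1), gives a haplotype diversity of zero, the extreme case of the reduced diversity expected after a founder event or a selective sweep.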

An alternative explanation for the reduced haplotype diversity and negative neutrality statistics of Clade Y populations is a selective event favoring Clade Y individuals under bay conditions, decreasing overall genetic diversity at the loci analyzed here. Because neutrality statistics of sites dominated by Clade Y were significant across both nuclear and mitochondrial genomes, demographic processes of divergence are more likely (e.g. Powers et al. 1991; Marko 1998), though it is possible these statistics are due to both demographic and adaptive processes.

While there is evidence for past population expansion and colonization events, historic sampling in the last 100 years indicates more recent range contraction events for Clades Z and Y. Historic sampling indicated Clade Z and Clade Y were abundant at southern California sites, whereas contemporary sampling did not detect Clade Y or Z at the southern sites sampled (Fig. 8). Alternatively, L. aequalis K may be increasing in frequency at southern sites, perhaps due to competitive success, though additional sampling at southern and northern sites would help to elucidate this pattern.

There are many examples of marine invertebrates experiencing range shifts in the poleward direction that have been attributed to global climate change (Sorte et al. 2010), including Kellet's whelk, Kelletia kelletii (Zacherl et al. 2003), the nudibranch, Phidiana hiltoni (Goddard et al. 2011; Schultz et al. 2011), and the volcano barnacle, Tetraclita rubescens (Connolly and Roughgarden 1998; Dawson et al. 2010). Range shifts of native species have the potential to heavily impact community structure and function in regions experiencing expansions (Sorte et al. 2010). Leptasterias function as important predators in the intertidal environment by preying upon snails and other herbivores. Changes in population abundance of Leptasterias could impact standing algal stocks by indirectly affecting grazing by these herbivores (Gravem 2015). Long-term changes in upwelling processes associated with climate change, such as stronger upwelling-favorable winds, colder water, and a higher frequency of upwelling occurrences (Garcia-Reyes and Largier 2010), have the potential to impact selective and demographic forces that can lead to further shifts in population dynamics. Monitoring changing species diversity and distributions over time may allow predictions of how range shifts associated with global climate change impact community ecology. This study illustrates the value of historic sampling when interpreting population genetic data in a phylogeographic context.

References

Atwater BF, Hedel CW, Helley EJ (1977) Late Quaternary depositional history, Holocene sea-level changes, and vertical crust movement, southern San Francisco Bay, California (Vol. 1014). US Govt. Print. Off.

Avise JC (1992) Molecular population structure and the biogeographic history of a regional fauna: a case history with lessons for conservation biology. Oikos, 62-76.

Axelrod DI (1981) Holocene climatic changes in relation to vegetation disjunction and speciation. American Naturalist, 847-870.

Bakun A (1990) Global climate change and intensification of coastal ocean upwelling. Science, 247, 198-201.

Becker BJ, Levin LA, Fodrie FJ, McMillan PA (2007) Complex larval connectivity patterns among marine invertebrate populations. Proceedings of the National Academy of Sciences, 104, 3267-3272.

Billot C, Engel CR, Rousvoal S, Kloareg B, Valero M (2003) Current patterns, habitat discontinuities and population genetic structure: the case of the kelp Laminaria digitata in the English Channel. Marine Ecology Progress Series, 253, 21.

Bradbury IR, Laurel B, Snelgrove PV, Bentzen P, Campana SE (2008) Global patterns in marine dispersal estimates: the influence of geography, taxonomic category and life history. Proceedings of the Royal Society of London B: Biological Sciences, 275, 1803-1809.

Checkley DM, Barth JA (2009) Patterns and processes in the California Current System. Progress in Oceanography, 83, 49-64.

Chenuil A, Hoareau TB, Egea E et al. (2010) An efficient method to find potentially universal population genetic markers, applied to metazoans. BMC Evolutionary Biology, 10, 17.

Chia FS (1966) Brooding behavior of a six-rayed starfish, Leptasterias hexactis. Biological Bulletin, 304-315.

Connolly SR, Roughgarden J (1998) A range extension for the volcano barnacle, Tetraclita rubescens. California Fish and Game, 84, 182-183.

Conomos TJ, Smith RE, Gartner JW (1985) Environmental setting of San Francisco Bay. Hydrobiologia, 129, 1-12.

Dawson MN (2001) Phylogeography in coastal marine animals: a solution from California? Journal of Biogeography, 28, 723-736.

Dawson MN, Barber PH, Gonzalez-Guzman LI, Toonen RJ, Dugan JE, Grosberg RK (2011) Phylogeography of Emerita analoga (Crustacea, Decapoda, Hippidae), an eastern Pacific Ocean sand crab with long-lived pelagic larvae. Journal of Biogeography, 38, 1600-1612.

Dawson MN, Grosberg RK, Stuart YE, Sanford E (2010) Population genetic analysis of a recent range expansion: mechanisms regulating the poleward range limit in the volcano barnacle Tetraclita rubescens. Molecular Ecology, 19, 1585-1605.

Drummond AJ, Suchard MA, Xie D, Rambaut A (2012) Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29, 1969-1973.

Edmands S (2001) Phylogeography of the intertidal copepod Tigriopus californicus reveals substantially reduced population differentiation at northern latitudes. Molecular Ecology, 10, 1743-1750.

Ellingson RA, Krug PJ (2006) Evolution of poecilogony from planktotrophy: cryptic speciation, phylogeography, and larval development in the gastropod genus Alderia. Evolution, 60, 2293-2310.

Excoffier L, Laval G, Schneider S (2005) Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online, 1, 47.

Flowers JM, Foltz DW (2001) Reconciling molecular systematics and traditional taxonomy in a species-rich clade of sea stars (Leptasterias subgenus Hexasterias). Marine Biology, 139, 475-483.

Foltz DW, Breaux JP, Campagnaro EL et al. (1996) Limited morphological differences between genetically identified cryptic species within the Leptasterias species complex (Echinodermata: Asteroidea). Canadian Journal of Zoology, 74, 1275-1283.

Foltz DW, Nguyen AT, Nguyen I, Kiger JR (2007) Primers for the amplification of nuclear introns in sea stars of the family Asteriidae. Molecular Ecology Notes, 7, 874-876.

Foltz DW, Nguyen AT, Kiger JR, Mah CL (2008) Pleistocene speciation of sister taxa in a North Pacific clade of brooding sea stars (Leptasterias). Marine Biology, 154, 593-602.

Garcia-Reyes M, Largier J (2010) Observations of increased wind-driven coastal upwelling off central California. Journal of Geophysical Research: Oceans (1978-2012), 115, doi: 10.1029/2009jc005576.

Gerard K, Guilloton E, Arnaud-Haond S et al. (2013) PCR survey of 50 introns in animals: Cross-amplification of homologous EPIC loci in eight non-bilaterian, protostome and deuterostome phyla. Marine Genomics, 12, 1-8.

Goddard JH, Gosliner TM, Pearse JS (2011) Impacts associated with the recent range shift of the aeolid nudibranch Phidiana hiltoni (Mollusca, Opisthobranchia) in California. Marine Biology, 158, 1095-1109.

Graham WM, Largier JL (1997) Upwelling shadows as nearshore retention sites: the example of northern Monterey Bay. Continental Shelf Research, 17, 509-532.

Gravem SA (2015) Linking antipredatory behavior of prey to intertidal zonation and community structure in rocky tidepools. Dissertation, Davis, CA: University of California, Davis.

Hart MW, Byrne M, Johnson SL (2003) Patiriella pseudoexigua (Asteroidea: Asterinidae): a cryptic species complex revealed by molecular and embryological analyses. Journal of the Marine Biological Association of the UK, 83, 1109-1116.

Hasegawa M, Kishino H, Yano T (1985) Dating of human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution, 22, 160-174. doi: 10.1007/BF02101694.

Hellberg ME, Balch DP, Roy K (2001) Climate-driven range expansion and morphological evolution in a marine gastropod. Science, 292, 1707-1710.

Hess JE, Vetter RD, Moran P (2011) A steep genetic cline in yellowtail rockfish, Sebastes flavidus, suggests regional isolation across the Cape Mendocino faunal break. Canadian Journal of Fisheries and Aquatic Sciences, 68, 89-104.

Highsmith RC (1985) Floating and algal rafting as potential dispersal mechanisms in brooding invertebrates. Marine Ecology Progress Series, 25, 169-179.

Hrincevich AW, Rocha-Olivares A, Foltz DW (2000) Phylogenetic analysis of molecular lineages in a species-rich subgenus of sea stars (Leptasterias subgenus Hexasterias). American Zoologist, 40, 365-374.

Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17, 754-755.

Huyer A (1983) Coastal upwelling in the California Current system. Progress in Oceanography, 12, 259-284.

Huyer A, Kosro PM (1987) Mesoscale surveys over the shelf and slope in the upwelling region near Point Arena, California. Journal of Geophysical Research: Oceans (1978-2012), 92, 1655-1681.

Jablonski D (1986) Larval ecology and macroevolution in marine invertebrates. Bulletin of Marine Science, 39, 565-587.

Jacobs DK, Haney TA, Louie KD (2004) Genes, diversity, and geologic process on the Pacific coast. Annual Review of Earth and Planetary Sciences, 32, 601-652.

Jurgens LJ, Rogers-Bennett L, Raimondi PT et al. (2015) Patterns of mass mortality among rocky shore invertebrates across 100 km of northeastern Pacific coastline. PLoS ONE, 10, e0126280.

Kamel SJ, Grosberg RK, Addison JA (2014) Multiscale patterns of genetic structure in a marine snail (Solenosteira macrospira) without pelagic dispersal. Marine Biology, 161, 1603-1614.

Kearse M, Moir R, Wilson A et al. (2012) Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28, 1647-1649.

Kelly RP, Palumbi SR (2010) Genetic structure among 50 species of the northeastern Pacific rocky intertidal community. PLoS ONE, 5, doi: 10.1371/journal.pone.0008594.

Largier JL, Magnell BA, Winant CD (1993) Subtidal circulation over the northern California shelf. Journal of Geophysical Research: Oceans (1978-2012), 98, 18147-18179.

Levin LA (2006) Recent progress in understanding larval dispersal: new directions and digressions. Integrative and Comparative Biology, 46, 282-297.

Luoma SN, Cloern JE (1982) The impact of waste-water discharge on biological communities in San Francisco Bay. San Francisco Bay: Use and Protection, Eds. WJ Kockelman, TJ Conomos, and AE Leviton, 137-160.

Maddison WP, Maddison DR (2015) Mesquite: a modular system for evolutionary analysis. Version 3.03. http://mesquiteproject.org.

Marko PB (1998) Historical allopatry and the biogeography of speciation in the prosobranch snail genus Nucella. Evolution, 52, 757-774.

Melroy LM, Smith RJ, Cohen CS. Phylogeography of direct-developing sea stars in the genus Leptasterias in relation to San Francisco Bay outflow in central California. In prep.

Menge BA (1975) Brood or broadcast? The adaptive significance of different reproductive strategies in the two intertidal sea stars Leptasterias hexactis and Pisaster ochraceus. Marine Biology, 31, 87-100.

Moberg PE, Burton RS (2000) Genetic heterogeneity among adult and recruit red sea urchins, Strongylocentrotus franciscanus. Marine Biology, 136, 773-784.

Morgan SG, Fisher JL, Miller SH, McAfee ST, Largier JL (2009) Nearshore larval retention in a region of strong upwelling and recruitment limitation. Ecology, 90, 3489-3502.

Muller K (2005) SeqState. Applied Bioinformatics, 4, 65-69.

Muller K (2006) Incorporating information from length-mutational events into phylogenetic analysis. Molecular Phylogenetics and Evolution, 38, 667-676.

Nichols FH, Cloern JE, Luoma SN, Peterson DH (1986) The modification of an estuary. Science (Washington), 231, 567-573.

Pante E, Puillandre N, Viricel A et al. (2015) Species are hypotheses: avoid connectivity assessments based on pillars of sand. Molecular Ecology, 24, 525-544.

Parker T, Tunnicliffe V (1994) Dispersal strategies of the biota on an oceanic seamount: implications for ecology and biogeography. The Biological Bulletin, 187, 336-345.

Powers DA, Lauerman T, Crawford D et al. (1991) The evolutionary significance of genetic variation at enzyme synthesizing loci in the teleost Fundulus heteroclitus. Journal of Fish Biology, 39, 169-184.

Puritz JB, Toonen RJ (2011) Coastal pollution limits pelagic larval dispersal. Nature Communications, 2, 226.

Rambaut A (2009) FigTree. Retrieved from http://tree.bio.ed.ac.uk/software/figtree/.

Rosenfeld LK, Schwing FB, Garfield N, Tracy DE (1994) Bifurcated flow from an upwelling center: a cold water source for Monterey Bay. Continental Shelf Research, 14, 931-964.

Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics, 19, 2496-2497.

Sanford E, Kelly MW (2011) Local adaptation in marine invertebrates. Annual Review of Marine Science, 3, 509-535.

Schultz ST, Goddard JH, Gosliner TM et al. (2011) Climate-index response profiling indicates larval transport is driving population fluctuations in nudibranch gastropods from the northeast Pacific Ocean. Limnology and Oceanography, 56, 749-763.

Selkoe KA, Watson JR, White C et al. (2010) Taking the chaos out of genetic patchiness: seascape genetics reveals ecological and oceanographic drivers of genetic patterns in three temperate reef species. Molecular Ecology, 19, 3708-3726.

Smith MJ, Arndt A, Gorski S, Fajber E (1993) The phylogeny of echinoderm classes based on mitochondrial gene arrangements. Journal of Molecular Evolution, 36, 545-554.

Smith RJ, Cohen CS (2013) Phylogeography of a direct-developing seastar in relation to the San Francisco Bay outflow. Poster presentation. Integrative and Comparative Biology, 53 P2.217.

Sorte CJ, Williams SL, Carlton JT (2010) Marine range shifts and species introductions: comparative spread rates and community impacts. Global Ecology and Biogeography, 19, 303-316.

Sotka EE, Wares JP, Barth JA, Grosberg RK, Palumbi S (2004) Strong genetic clines and geographical variation in gene flow in the rocky intertidal barnacle Balanus glandula. Molecular Ecology, 13, 2143-2156.

Sotka EE (2012) Natural selection, larval dispersal, and the geography of phenotype in the sea. Integrative and Comparative Biology, 52, 538-545.

Steger JM, Schwing FB, Collins CA et al. (2000) The circulation and water masses in the Gulf of the Farallones. Deep Sea Research Part II: Topical Studies in Oceanography, 47, 907-946.

Stephens M, Smith NJ, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. The American Journal of Human Genetics, 68, 978-989.

Strathmann RR, Strathmann MF (1982) The relationship between adult size and brooding in marine invertebrates. American Naturalist, 119, 91-101.

Strathmann RR (1986) What controls the type of larval development? Bulletin of Marine Science, 39, 616-622.

Sponer R, Roy MS (2002) Phylogeographic analysis of the brooding brittle star Amphipholis squamata (Echinodermata) along the coast of New Zealand reveals high cryptic genetic variation and cryptic dispersal potential. Evolution, 56, 1954-1967.

Swofford DL (2001) Paup*: Phylogenetic analysis using parsimony (and other methods) 4.0. B5.

Tamura K, Peterson D, Peterson N, et al. (2011) MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution, 28, 2731-2739.

Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution, 10, 512-526.

Teacher AGF, Griffiths DJ (2011) HapStar: Automated haplotype network layout and visualisation. Molecular Ecology Resources, 11, 151-153.

Vander Woude AJ, Largier JL, Kudela RM (2006) Nearshore retention of upwelled waters north and south of Point Reyes (northern California)—Patterns of surface temperature and chlorophyll observed in CoOP WEST. Deep Sea Research Part II: Topical Studies in Oceanography, 53, 2985-2998.

Wing SR, Botsford LW, Ralston SV, Largier JL (1998) Meroplanktonic distribution and circulation in a coastal retention zone of the northern California upwelling system. Limnology and Oceanography, 43, 1710-1721.

Winston JE (2012) Dispersal in marine organisms without a pelagic larval phase. Integrative and Comparative Biology, ics040.

Zacherl D, Gaines SD, Lonhart SI (2003) The limits to biogeographical distributions: insights from the northward range extension of the marine snail, Kelletia kelletii (Forbes, 1852). Journal of Biogeography, 30, 913-924.

List of Tables

Table 1. Summary data for each sampling locality for all loci. Year indicates year of sample collection. Estimated latitude and longitude coordinates for collection sites (Lat, Long), number of samples successfully sequenced for each locus (N), number of haplotypes found per site (Nh), number of alleles found at each site (Na), number of singletons per site (Pr) are provided for each collection site and time point. Region indicates grouping used for AMOVA analysis. Site code indicates population abbreviations found in figures. CAS ID is the specimen identification number for the California Academy of Sciences Invertebrate Zoology Collection and DF indicates samples from David Foltz (LSU).

                                                                   COI           D-Loop        i51
Location              Code  Lat, Long       Year     Region        N    Nh (Pr)  N    Nh (Pr)  N    Na (Pr)
Contemporary Sites                                                 305  74(35)   322  82(45)   289  114(27)
Auke Bay, AK          AB    58.38, -134.64  2014     Alaska        5    2(1)     5    3(1)     5    2(1)
Sage Bay, AK          SB    57.05, -135.34  2014     Alaska        5    1        5    2        5    5(1)
Griffin Bay, WA       GB    48.50, -123.02  2014     Washington    19   4(4)     19   4(4)     17   6(2)
Twin Coves, CA        TC    38.46, -123.14  2011     Northern      20   10(6)    20   7(5)     20   9(2)
Marshall Gulch, CA    MG    38.37, -123.07  2008     Northern      14   6(4)     14   5(3)     6    6
Bodega Bay, CA        BB    38.30, -123.06  2008     Northern      8    3        7    4(2)     8    9(1)
Duxbury Reef, CA      DR    37.89, -122.70  2014     Bay-proximal  16   2(2)     16   2(1)     16   4
Slide Ranch, CA       SR    37.87, -122.60  2014     Bay-proximal  21   3(2)     21   4(3)     21   4
Muir Beach, CA        MB    37.86, -122.59  2013     Bay-proximal  27   4(1)     44   4(4)     25   9(2)
Rodeo Beach, CA       RB    37.82, -122.53  2014     Bay-proximal  31   4(2)     31   7(1)     31   7(4)
Point Bonita, CA      PB    37.81, -122.53  2014     Bay-proximal  20   4        20   6(2)     20   8(2)
Lands End, CA         LE    37.78, -122.50  2013     Bay-proximal  20   1        20   2(1)     20   6(3)
Mussel Rock, CA       MR    37.50, -122.49  2008     Bay-proximal  20   2(1)     20   1        19   4
Half Moon Bay, CA     HMB   37.18, -122.39  2011-14  Southern      7    4(3)     8    5(4)     8    9(2)
Pigeon Point, CA      PP    37.18, -122.39  2013-14  Southern      32   9(1)     32   14(8)    32   12(3)
Point Pinos, CA       PN    36.64, -121.95  2014     Southern      20   7(3)     20   7(3)     20   9(4)
Carmel Point, CA      CP    36.54, -121.93  2014     Southern      20   8(5)     20   5(3)     16   5

                                                     COI
Location              Code  CAS ID          Year     N    Nh (Pr)
Historic Sites                                       61   35(8)
Crescent City, CA     CC    201227          1897     6    2(1)
Pacific Grove, CA     PG    191756          1897     9    2
Pacific Grove, CA     PG    108854          1909     6    1
San Simeon, CA        SS    115491          1916     2    2
Bodega Head, CA       BB    115521          1963     3    1
Pigeon Point, CA      PP    7676            1971     3    2
Pigeon Point, CA      PP    191755, 7642    1972     4    3(1)
Franklin Point, CA    FP    7645            1972     1    3(2)
Point Bonita, CA      PB    115524          1973     2    4(3)
Diablo Canyon, CA     DC    135003          1974     1    1
SE Farallons, CA      FI    4826            1977     3    2(1)
Piedras Blancas, CA   PE    164031, 115497  1978     4    4
Duxbury Reef, CA      DR    115493          1998     1    1
Lonesome Cove, WA     LC    DF              1998     2    2
Pigeon Point, CA      PP    DF              1998     8    8(4)
Franklin Point, CA    FP    DF              1998     2    2(1)

Table 2. Tamura-Nei mean genetic distances for concatenated 16S and COI mtDNA haplotypes between and within Leptasterias clades and species (± values are the standard error of the mean).

                 L. aequalis A  L. aequalis K  L. aequalis D  L. aequalis B  Clade Y        Group 1        Clade Z        L. hexactis
L. aequalis A    0.001 ± 0.001
L. aequalis K    0.027 ± 0.006  0.013 ± 0.002
L. aequalis D    0.015 ± 0.004  0.028 ± 0.005  0.005 ± 0.002
L. aequalis B    0.024 ± 0.005  0.009 ± 0.003  0.026 ± 0.005  0.009 ± 0.002
Clade Y          0.013 ± 0.004  0.029 ± 0.006  0.018 ± 0.004  0.025 ± 0.005  0.006 ± 0.001
Group 1          0.010 ± 0.003  0.030 ± 0.006  0.017 ± 0.004  0.027 ± 0.006  0.016 ± 0.004  0.005 ± 0.002
Clade Z          0.018 ± 0.004  0.011 ± 0.003  0.020 ± 0.005  0.009 ± 0.003  0.020 ± 0.004  0.021 ± 0.005  0.012 ± 0.003
L. hexactis      0.040 ± 0.007  0.042 ± 0.007  0.043 ± 0.007  0.040 ± 0.007  0.038 ± 0.007  0.043 ± 0.007  0.039 ± 0.007  0.004 ± 0.001
L. camtschatica  0.071 ± 0.009  0.072 ± 0.009  0.075 ± 0.009  0.070 ± 0.009  0.071 ± 0.009  0.069 ± 0.009  0.068 ± 0.009  0.052 ± 0.009

Table 3. Molecular diversity indices for Leptasterias spp. estimated per population for concatenated 16S and COI (mtDNA) and nuclear intron, i51. Bold values indicate significance. Hd = haplotype diversity and π = nucleotide diversity (p < 0.05 for Tajima’s D, Fu and Li’s D, Fu and Li’s F; p < 0.02 for Fu’s Fs).

mtDNA
Site  Hd ± SE     π ± SE          Tajima’s D  Fu’s Fs   Fu and Li’s D  Fu and Li’s F
AB    0.90±0.16   0.0020±0.001    -1.094      -1.405    -1.938         -1.113
SB    0.40±0.13   0.0005±0.000    -0.817      0.090     -0.817         -0.772
GB    0.57±0.11   0.0120±0.002    1.722       7.562     1.337          1.721
TC    0.95±0.03   0.0149±0.015    0.871       -0.740    1.053          1.117
MG    0.85±0.07   0.0102±0.002    -0.158      4.137     0.145          0.082
BB    0.71±0.13   0.0249±0.004    2.055       4.527     1.454          1.756
DR    0.33±0.13   0.0072±0.003    0.313       10.721    1.544          1.384
SR    0.49±0.13   0.0048±0.002    -0.852      2.321     1.292          0.720
MB    0.57±0.12   0.0034±0.002    -2.142      -0.560    -2.770         -3.039
RB    0.70±0.06   0.0037±0.002    -1.495      3.476     0.737          -0.006
PB    0.86±0.06   0.0134±0.003    0.060       2.141     1.090          0.912
LE    0.00        0.000           -           -         1.863          --
MR    0.20±0.12   0.0003±0.000    -1.513      2.207     -2.053         -2.188
HMB   0.79±0.15   0.0212±0.005    0.215       3.039     0.134          0.137
PP    0.90±0.03   0.0175±0.002    -0.748      1.502     -1.263         -1.228
PN    0.78±0.07   0.0112±0.001    0.262       3.506     0.523          -0.075
CP    0.83±0.06   0.0091±0.001    0.317       0.851     0.250          -0.053

i51
Site  Hd ± SE     π ± SE          Tajima’s D  Fu’s Fs   Fu and Li’s D  Fu and Li’s F
AB    0.02±0.15   0.0008±0.001    -1.112      -0.339    -1.243         -1.347
SB    0.87±0.09   0.0105±0.002    -0.432      3.379     -0.629         -0.652
GB    0.56±0.09   0.0039±0.001    -0.235      11.130    1.044          0.774
TC    0.75±0.04   0.0069±0.017    -0.234      3.174     -0.456         -0.336
MG    0.80±0.08   0.0063±0.001    -0.515      0.158     -0.135         -0.263
BB    0.74±0.08   0.0062±0.001    0.527       3.394     0.254          0.374
DR    0.24±0.20   0.0016±0.001    -1.602      2.428     -2.024         -2.207
SR    0.05±0.05   0.0004±0.000    -1.482      4.058     -2.451         -2.514
MB    0.31±0.08   0.0023±0.001    -2.054      0.956     -2.003         -2.396
RB    0.13±0.06   0.0008±0.000    -1.922      0.995     -3.902         -3.857
PB    0.55±0.08   0.0054±0.001    -1.346      9.261     -2.399         -2.424
LE    0.28±0.09   0.0014±0.001    -2.358      0.990     -3.751         -3.791
MR    0.10±0.07   0.0010±0.001    -1.883      1.000     -3.191         -3.259
HMB   0.82±0.10   0.0087±0.002    -0.677      0.8^3     1.281          -1.282
PP    0.87±0.03   0.0079±0.001    -0.673      -2.653    -0.387         -0.573
PN    0.61±0.08   0.0046±0.001    -1.179      -0.836    0.523          -0.006
CP    0.65±0.08   0.0053±0.001    -0.091      -1.022    0.250          0.173
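
The Hd column in Table 3 can be illustrated with a minimal sketch of the standard unbiased estimator of haplotype diversity, Hd = n/(n-1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sampled sequences. This is only the textbook formula; the thesis values came from dedicated population-genetics software, and the function name and example counts below are hypothetical.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype diversity from per-haplotype sample counts.

    counts: number of sampled sequences carrying each distinct haplotype
    (hypothetical input format chosen for this illustration).
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    # Sum of squared haplotype frequencies (expected homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts)
    # n / (n - 1) corrects for small sample size.
    return n / (n - 1) * (1.0 - sum_sq)


# A sample in which every sequence carries the same haplotype has Hd = 0.
print(haplotype_diversity([20]))
# A sample of 20 sequences split 10/5/5 among three haplotypes.
print(round(haplotype_diversity([10, 5, 5]), 3))
```

A monomorphic sample gives Hd = 0, which is the situation reported for the LE mtDNA row.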

List of Figures

[Figure 1 image: maximum likelihood tree. Legible tip labels include Hap 8, Hap 16, Hap 24, Hap 27, Hap 46, Hap 67, Hap 75, Hap 84, and Hap 88. Clade labels: L. aequalis K, L. aequalis B, Group 1, L. aequalis D (L. pusilla), Clade Z, L. aequalis A, Clade Y, and L. hexactis.]

Figure 1. Maximum Likelihood tree for mtDNA haplotypes. Circles represent nodal support: right side shading represents bootstrap values of 70 or greater and left side shading represents Bayesian Posterior Probabilities of 95% or greater. Colors indicate groupings of haplotypes into clades. Reference sequences for L. aequalis clades D, K, A, and B and L. hexactis were obtained from GenBank.

[Figure 2 image: BEAST consensus tree. Node labels and divergence estimates recovered from the figure, in extraction order: L. aequalis K + L. aequalis B, 1.73 (0.92-2.58); L. aequalis D/K + L. aequalis A/D (L. pusilla); 1.74 (0.92-2.59); L. aequalis A/D + Clades Z/Y; L. hexactis + L. aequalis, 1.25 (0.60-1.94), 2.51 (1.44-3.58) *.]

Figure 2. BEAST (Drummond et al. 2012) consensus tree for Leptasterias haplotypes using mtDNA reconstructed with Bayesian MCMC analysis. L. camtschatica was included as the outgroup sister taxon. The L. hexactis and L. aequalis split, marked with a red asterisk, was estimated by Foltz et al. (2008) and used for calibration. Reference sequences were included from GenBank (see text for accession numbers). Black circles represent Bayesian Posterior Probability nodal support of 95% or greater. Estimates of divergence times are shown in millions of years and numbers in parentheses are 95% highest posterior density intervals. The scale bar shows the expected number of substitutions per site and the bottom grid axis represents time in millions of years with 0.0 as present day. Colors represent clades and are consistent with colors from Figure 1.

[Figure 3 image: haplotype frequency maps around San Francisco Bay. Top panel (mtDNA) site labels: TC (20), MG (14), BB (7), DR (16), SR (21), MU (23), RB (31), PB (20), LE (20), MR (20), HMB (7), PP (33), PN (20), CP (20). Legible legend entries: Hap 1, Hap 2, Hap 7, Hap 12, Hap 14, Hap 15, Hap 17, Hap 22, Hap 36, Private Haps. Bottom panel (i51) legible site labels: TC (20), MG (6), BB (8), MU (25), RB (31), PB (20), CP (16). Legend entries: E51-A through E51-N, E51-Q, E51-R, E51-T, E51-U, E51-Y, Private Haps.]

Figure 3. Haplotype frequency maps for mtDNA (top) and nuclear i51 (bottom) for contemporary samples. Letters indicate the population code (n=sample size). Colors represent different haplotypes, white wedges represent private haplotypes and numbers within and next to the white represent number of private haplotypes within the population.

Figure 4. Clade frequencies for contemporary samples from central California to Alaska (a) and historic samples in California (b). Colors represent clades and correspond to those found in Fig. 1. Letters represent sample site (n=sample size), numbers above the circles represent the collection year, and circle size is representative of sample size.

Figure 5. Haplotype network for mtDNA haplotypes (left) and i51 haplotypes (right). Circles represent haplotypes and circle size represents the frequency of haplotypes. Black circles represent missing haplotypes. Colors indicate population regions corresponding to the map (left). Grey shading represents clade delineation of haplotypes from phylogenetic analysis.

[Figure 6 image: mismatch distribution panels (a) Clade Y and (b) L. aequalis K; x-axis: Pairwise Differences.]

Figure 6. Mismatch distribution of pairwise distances among mtDNA haplotypes for (a) Leptasterias spp. Clade Y (Harpending’s raggedness value r= 0, p-value= 1) and (b) L. aequalis K (Harpending’s raggedness value r= 0.15, p-value= 0.41) compared to expected frequencies (calculated in DnaSP v5.10 and Arlequin v3.5).
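
As an illustration of the raggedness statistic quoted in the Figure 6 caption, the sketch below implements the usual textbook definition, r = Σ (x_i − x_{i−1})² summed over i = 1 to d+1, where x_i is the relative frequency of sequence pairs differing at i sites and d is the largest observed number of differences. It is not the DnaSP or Arlequin implementation, and the function name and input counts are hypothetical.

```python
def raggedness(mismatch_counts):
    """Harpending's raggedness index from a mismatch distribution.

    mismatch_counts[i] is the number of sequence pairs that differ at
    exactly i sites, for i = 0..d (hypothetical input format).
    """
    total = sum(mismatch_counts)
    if total == 0:
        raise ValueError("mismatch distribution is empty")
    # Relative frequencies x_0..x_d, followed by x_{d+1} = 0.
    freqs = [c / total for c in mismatch_counts] + [0.0]
    # Sum of squared differences between adjacent classes.
    return sum((freqs[i] - freqs[i - 1]) ** 2 for i in range(1, len(freqs)))


# Hypothetical counts: a smooth, unimodal distribution is less ragged
# than a spiky, multimodal one with the same number of pairs.
smooth = raggedness([1, 4, 6, 4, 1])
spiky = raggedness([6, 0, 5, 0, 5])
print(smooth < spiky)
```

Low values indicate a smooth distribution of the kind expected after a population expansion, while high values indicate a ragged, multimodal distribution.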

[Figure 7 image: three map panels labeled 1897-1974, 1975-2000, and 2001-2014.]

Figure 7. Time series clade frequency map for Leptasterias spp. compiled using COI between 1897 and 2014. Letters represent site code, size of circles represents sample size (see Table 1) and colors represent clade delineation (see Fig. 1).

Appendices

Table A1. GenBank accession numbers for reference sequences used in phylogenetic analysis.

Leptasterias clade  D-loop Accession Number  COI Accession Number
L. aequalis A       AF162089                 AF162111
L. aequalis B       AF162090                 U37563
L. aequalis D       AF162092                 AF162112
L. aequalis K       AF162109                 AF162128
L. aequalis K       AF162099                 AF162118
L. aequalis K       AF162100                 AF162119
L. hexactis C       AF162091                 U37564
L. hexactis G       AF162095                 AF162115
L. camtschatica     AF162104                 AF162123

Table A2. Pairwise comparison of genetic structure for L. hexactis and L. aequalis populations. Values for i51 are shown above the diagonal and values for mtDNA are shown below the diagonal for a) ΦST and b) FST. Bold numbers indicate significant values with p < 0.05. See Table 1 for location abbreviations.

a)
     AB      SB      GB      TC      MG      BB      DR      SR      RB      MB      PB      LE      MR      HMB     PP      PN      CP
AB   -       0.190   0.730   0.673   0.792   0.724   0.860   0.946   0.812   0.917   0.676   0.868   0.900   0.636   0.647   0.804   0.764
SB   -0.087  -       0.386   0.436   0.523   0.382   0.488   0.615   0.496   0.636   0.362   0.530   0.543   0.323   0.468   0.631   0.559
GB   0.780   0.786   -       0.236   0.496   0.194   0.033   0.071   0.015   0.101   -0.017  0.137   0.081   0.098   0.308   0.530   0.460
TC   0.745   0.753   0.495   -       0.153   0.002   0.386   0.442   0.389   0.494   0.189   0.441   0.422   0.206   0.070   0.162   0.108
MG   0.836   0.844   0.602   0.068   -       0.140   0.684   0.782   0.664   0.796   0.412   0.722   0.736   0.349   0.008   0.082   0.071
BB   0.669   0.683   0.327   0.133   0.230   -       0.408   0.523   0.378   0.549   0.106   0.446   0.435   0.147   0.044   0.184   0.099
DR   0.851   0.859   0.478   0.662   0.767   0.430   -       0.022   -0.011  0.001   0.046   0.043   -0.097  0.216   0.427   0.675   0.612
SR   0.907   0.913   0.548   0.729   0.816   0.657   0.667   -       0.001   0.022   0.085   0.135   -0.027  0.357   0.461   0.733   0.678
RB   0.925   0.931   0.658   0.758   0.849   0.624   0.365   0.802   -       0.011   0.049   0.047   -0.038  0.254   0.436   0.665   0.604
MB   0.918   0.922   0.659   0.770   0.853   0.640   0.324   0.781   0.075   -       0.129   0.059   -0.052  0.408   0.513   0.760   0.707
PB   0.770   0.776   0.433   0.576   0.659   0.433   0.455   0.609   0.636   0.653   -       0.130   0.095   0.082   0.269   0.466   0.384
LE   0.992   0.998   0.733   0.797   0.895   0.714   0.543   0.881   0.024   0.144   0.706   -       -0.006  0.339   0.477   0.713   0.640
MR   0.987   0.993   0.729   0.795   0.892   0.708   0.534   0.876   0.024   0.140   0.703   -0.001  -       0.322   0.462   0.716   0.637
HMB  0.722   0.732   0.373   0.450   0.549   0.286   0.502   0.606   0.705   0.725   0.168   0.785   0.780   -       0.260   0.447   0.376
PP   0.742   0.747   0.506   0.061   0.096   0.207   0.659   0.709   0.739   0.751   0.571   0.767   0.765   0.443   -       0.057   0.043
PN   0.813   0.820   0.606   0.074   0.105   0.261   0.749   0.801   0.824   0.830   0.655   0.861   0.859   0.552   0.076   -       0.031
CP   0.841   0.848   0.622   0.230   0.325   0.354   0.768   0.818   0.843   0.847   0.675   0.884   0.881   0.572   0.240   0.243   -

b)
     AB      SB      GB      TC      MG      BB      DR      SR      RB      MB      PB      LE      MR      HMB     PP      PN      CP
AB   -       0.269   0.535   0.422   0.473   0.415   0.701   0.847   0.543   0.555   0.400   0.549   0.628   0.450   0.353   0.456   0.503
SB   -0.016  -       0.297   0.160   0.158   0.119   0.475   0.660   0.336   0.343   0.162   0.323   0.407   0.155   0.118   0.206   0.258
GB   0.269   0.438   -       0.263   0.234   0.202   0.511   0.642   0.421   0.430   0.263   0.421   0.456   0.240   0.152   0.271   0.273
TC   0.072   0.250   0.206   -       0.081   0.037   0.428   0.541   0.342   0.335   0.149   0.329   0.387   0.154   0.039   0.017   0.062
MG   0.132   0.319   0.264   0.101   -       0.033   0.436   0.616   0.311   0.306   0.112   0.329   0.413   0.156   0.005   0.079   0.081
BB   0.123   0.346   0.275   0.079   0.149   -       0.396   0.563   0.284   0.278   0.089   0.295   0.373   0.123   0.017   0.040   0.037
DR   0.490   0.654   0.511   0.348   0.424   0.475   -       0.021   0.095   0.027   0.100   0.477   0.603   0.443   0.334   0.459   0.428
SR   0.286   0.451   0.374   0.220   0.279   0.292   0.519   -       0.161   0.086   0.309   0.588   0.708   0.623   0.434   0.573   0.541
RB   0.278   0.441   0.367   0.215   0.273   0.284   0.507   0.378   -       0.011   0.065   0.233   0.366   0.343   0.267   0.368   0.342
MB   0.230   0.389   0.328   0.183   0.235   0.240   0.455   0.326   0.082   -       0.087   0.322   0.450   0.338   0.262   0.362   0.335
PB   0.122   0.298   0.249   0.095   0.145   0.139   0.391   0.262   0.233   0.205   -       0.123   0.217   0.144   0.100   0.166   0.149
LE   0.615   0.757   0.588   0.429   0.514   0.589   0.746   0.592   0.084   0.218   0.446   -       0.026   0.331   0.265   0.357   0.329
MR   0.473   0.627   0.502   0.345   0.417   0.462   0.654   0.509   0.131   0.154   0.361   0.004   -       0.399   0.314   0.416   0.387
HMB  0.104   0.318   0.256   0.077   0.133   0.125   0.448   0.273   0.266   0.223   0.124   0.558   0.437   -       0.096   0.200   0.236
PP   0.076   0.244   0.200   0.045   0.084   0.079   0.325   0.213   0.208   0.179   0.093   0.391   0.321   0.059   -       0.043   0.039
PN   0.159   0.332   0.278   0.124   0.175   0.173   0.421   0.291   0.285   0.250   0.166   0.500   0.416   0.157   0.124   -       0.017
CP   0.122   0.298   0.249   0.095   0.145   0.139   0.391   0.262   0.257   0.233   0.137   0.471   0.387   0.124   0.094   0.159   -
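
Because each matrix in Table A2 packs two data sets into one square (i51 above the diagonal, mtDNA below), a reader working with these values may want to separate them. The following is a minimal sketch of that bookkeeping; the function name and the dictionary layout are hypothetical, and the three-site example uses the AB, SB, and GB entries from matrix a).

```python
def split_triangles(labels, rows):
    """Split a combined pairwise matrix into its two triangles.

    labels: site codes in row/column order.
    rows:   one list of strings per site, with "-" on the diagonal.
    Returns (upper, lower): dicts keyed by (row_site, column_site).
    """
    upper, lower = {}, {}
    for i, row_site in enumerate(labels):
        for j, col_site in enumerate(labels):
            if i == j:
                continue  # the diagonal holds no value
            value = float(rows[i][j])
            if j > i:
                upper[(row_site, col_site)] = value  # above the diagonal
            else:
                lower[(row_site, col_site)] = value  # below the diagonal
    return upper, lower


labels = ["AB", "SB", "GB"]
rows = [
    ["-", "0.190", "0.730"],
    ["-0.087", "-", "0.386"],
    ["0.780", "0.786", "-"],
]
i51_values, mtdna_values = split_triangles(labels, rows)
print(i51_values[("AB", "SB")], mtdna_values[("SB", "AB")])
```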

Table A3. AMOVA results comparing populations grouped geographically with Northern and Southern populations either combined or separate. See Table 1 for detailed regional groupings.

Source of variation              Df   Sum of squares  Variance components  Percentage of variance  P-value

a) Two groups: Northern and Southern populations versus Bay-proximal populations
mtDNA
Among groups                     1    1238.536        8.71335              58.08                   <0.00001
Among populations within groups  12   663.46          2.70628              18.04                   <0.00001
Within populations               259  927.627         3.58157              23.88                   <0.00001
Total                            272  2829.623        15.0012
i51
Among groups                     1    655.518         2.27888              27.42                   <0.00001
Among populations within groups  12   764.783         1.61975              19.49                   <0.00001
Within populations               512  2259.698        4.41347              53.1                    0.00293
Total                            525  3679.998        8.3121

b) Three groups: Northern populations versus Bay-proximal populations versus Southern populations
mtDNA
Among groups                     2    1265.136        7.18242              52.89                   <0.00001
Among populations within groups  11   634.173         2.8228               20.78                   <0.00001
Within populations               259  926.137         3.57582              26.33                   <0.00001
Total                            272  2825.447        13.58104
i51
Among groups                     2    30.724          0.08385              18.1                    <0.00001
Among populations within groups  11   31.971          0.07054              15.22                   <0.00001
Within populations               512  158.199         0.30898              66.68                   0.00196
Total                            525  220.894         0.46337
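
The percentage-of-variance column in Table A3 follows directly from the variance components: each component is divided by their sum. The sketch below shows that arithmetic for the mtDNA rows of part a), and also derives the conventional hierarchical fixation indices from the same components. Those indices are not reported in the table; they are included here only as the standard AMOVA definitions, and the function and key names are hypothetical.

```python
def amova_summary(sigma_a, sigma_b, sigma_c):
    """Summarize a three-level AMOVA from its variance components.

    sigma_a: among groups
    sigma_b: among populations within groups
    sigma_c: within populations
    """
    total = sigma_a + sigma_b + sigma_c
    percentages = tuple(100.0 * s / total for s in (sigma_a, sigma_b, sigma_c))
    indices = {
        "phi_ct": sigma_a / total,                # groups relative to total
        "phi_sc": sigma_b / (sigma_b + sigma_c),  # populations within groups
        "phi_st": (sigma_a + sigma_b) / total,    # populations relative to total
    }
    return percentages, indices


# mtDNA variance components from Table A3, part a).
percentages, indices = amova_summary(8.71335, 2.70628, 3.58157)
print([round(p, 2) for p in percentages])
print({name: round(value, 3) for name, value in indices.items()})
```

The rounded percentages reproduce the 58.08, 18.04, and 23.88 reported in the table.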

Table A4. Parameter values for the sudden and spatial expansion models for the mismatch distribution of two Leptasterias clades.

Sudden Expansion Model
                θ0     θ1      τ       Goodness-of-fit P-value  Harpending’s Raggedness  P-value
L. aequalis K   0      37.72   13.82   0.15                     0.01                     0.41
Clade Y         0      99999   0       0                        0.06                     1

Spatial Expansion Model
                θ      M       τ       Goodness-of-fit P-value  Harpending’s Raggedness  P-value
L. aequalis K   1.63   18.23   11.96   0.08                     0.01                     0.94
Clade Y         1.76   0.07    11.28   0.74                     0.06                     0.83

[Figure A1 image: Bayesian tree. Legible tip labels include Hap_B, Hap_CC, Hap_E, Hap_FF, Hap_GG, Hap_HH, Hap_J, Hap_K, Hap_L, Hap_LL, Hap_N, Hap_NN, Hap_PP, Hap_Q, Hap_RR, Hap_S, Hap_SS, Hap_T, Hap_TT, and Hap_U.]

Figure A1. Bayesian tree for i51 haplotypes using indels coded as simple characters (Simmons and Ochoterena 2000). Numbers at the nodes indicate Bayesian Posterior Probabilities. Haplotypes CC, G, and SS belong to individuals identified as L. hexactis using mtDNA haplotypes and were used to root the tree.

[Figure A2 image: haplotype network. Region key: Alaska; Washington; North of the Bay; Bay proximal; South of the Bay. Clade labels: L. hexactis; L. aequalis.]

Figure A2. Haplotype network for i51 haplotypes excluding indels. Region key found left. Circles represent haplotypes and size of circles represents the frequency of haplotypes. Black circles represent missing haplotypes. Colors represent population regions. Grey shaded areas represent clade delineation of haplotypes.