Comparative population genetics confirms range-specific lineages of alpine stoneflies

Authors: Scott Hotaling1, Lusha M. Tronstad2, J. Joseph Giersch3, Clint C. Muhlfeld3, Debra S. Finn4, and David W. Weisrock1

Affiliations: 1 Department of Biology, University of Kentucky, Lexington, KY 40506 USA 2 Wyoming Natural Diversity Database, University of Wyoming, WY 82071, USA 3 U.S. Geological Survey, Northern Rocky Mountain Science Center, Glacier National Park, West Glacier, MT 59936, USA 4 Department of Biology, Missouri State University, Springfield, MO 65897, USA

Correspondence: Scott Hotaling, Department of Biology, University of Kentucky, Lexington, KY 40506; Fax: (859) 257-1717; Phone: (828) 507-9950; Email: [email protected]

Keywords: population genomics, Zapada glacier, Lednia, Rocky Mountains

Abstract: Zapada glacier and Lednia tumana are stoneflies (family Nemouridae) that live in cold, alpine streams draining from glaciers, snowfields, or permanent ice. Both species were petitioned for protection under the U.S. Endangered Species Act, and protection was found to be warranted, at least partially because of declining habitat (i.e., melting glaciers) as a result of climate change. Zapada glacier was previously known only from Glacier National Park (GNP), but this stonefly was recently discovered in the Absaroka-Beartooth Wilderness (ABW) and the Teton Range. The goal of this study was to estimate the degree to which Z. glacier is genetically differentiated across these three mountain ranges. For comparison, we also included L. tumana (known only from the vicinity of GNP) and L. tetonica (known only from the Teton Range). We combined sequence data for all three species from both the mitochondrial (cytochrome oxidase I barcoding) and nuclear (restriction-site associated DNA sequencing) genomes. These data were analyzed in parallel, and results were compared both within and across taxa. Taken together, this study represents the most robust systematic assessment of genetic boundaries in Z. glacier, L. tumana, and L. tetonica to date. Multiple analyses of the nuclear data robustly support three independent lineages of Z. glacier that correspond with the three mountain ranges. Results from mitochondrial DNA are less emphatic but do show Z. glacier in the Teton Range to be isolated from GNP and ABW. Both nuclear and mitochondrial evidence clearly support the existing descriptions of L. tumana and L. tetonica as robust, species-level entities. However, additional analyses (both genetic and morphological) should be performed before range-specific lineages of Z. glacier are considered for new species designations.

Introduction: Many factors, both evolutionary and environmental, dictate the contemporary distributions of species. While the criteria for defining a species have been the subject of extensive debate (De Queiroz, 2007), at their core, species concepts aim to use varied criteria to identify independently evolving lineages (Carstens et al., 2013). From an applied standpoint, the most important aspect of any systematic effort is to identify underlying lineage independence; whether or not the evidence rises to the level of a newly named species has little biological implication. However, when conservation is considered, species definitions are crucial, particularly for taxa that are either already protected under federal statutes such as the U.S. Endangered Species Act (ESA) or may be considered for ESA protection. Because of these overarching policy implications, researchers face a two-fold challenge: (1) to establish biologically accurate delimitations of independently evolving lineages from all available data; and (2) to decide whether the results cross the threshold of separate species and, if so, to describe those species accordingly. Every individual in a population has its own evolutionary history, and together, individuals combine into populations, metapopulations, and species. Underpinning these groupings is a shared genetic history that provides a vital link between past and present (Hewitt, 2000; Whiteman et al., 2007). To this end, comparative population genetic studies are particularly powerful because they highlight the degree to which species have responded to various influences, including past geological processes (e.g., glacial oscillations) and/or variance in dispersal (Lourie et al., 2005).
Exemplar studies have shown that both shared geographic distributions and overlapping ecological requirements can be important predictors of shared evolutionary trajectories (Lapointe & Rissler, 2005; Whiteman et al., 2007; Satler & Carstens, 2017). Moreover, comparisons between well-resolved species can provide valuable reference points for estimating where similar species may fall on a speciation timeline. In the North American Rockies, two alpine stoneflies, Zapada glacier and Lednia tumana, have been recommended for listing under the U.S. Endangered Species Act (ESA) due to climate change-induced habitat loss (U.S. Fish and Wildlife Service, 2016). A third stonefly endemic to the Teton Range, Lednia tetonica, was previously known only from a single stream fed by permanent subterranean ice. In GNP, L. tumana and Z. glacier co-occur, and in the Teton Range, L. tetonica and Z. glacier co-occur. All three species are highly similar in many ways: they belong to the same order and family (Plecoptera: Nemouridae), are phytophagous with short (< 30 days) winged adult stages, and are tightly linked to the hydrologic conditions associated with melting glaciers, ice, and snow (Baumann, 1975; Muhlfeld et al., 2011; Baumann & Call, 2012; Giersch et al., 2015; Giersch et al., 2016). Previous mitochondrial DNA (mtDNA) evidence has indicated genetic distinctiveness among populations of Z. glacier that generally corresponds with mountain ranges, suggesting that Z. glacier may actually represent one or more mountain range-specific species, similar to the morphology-based descriptions of L. tumana and L. tetonica (Baumann & Call, 2012; Giersch et al., 2015; Giersch et al., 2016). Moreover, addressing these population genetic questions holds significant implications for conservation in the region, as the alpine streams that Z. glacier, L. tumana, and L. tetonica inhabit, as well as those worldwide, are under significant threat as rapid warming drives substantial glacier recession (Hall & Fagre, 2003; Hansen et al., 2005; Pederson et al., 2010; Roe et al., 2016). Linked to this decline in the alpine cryosphere is the potential for loss of an entire community of meltwater-dependent alpine organisms (Muhlfeld et al., 2011; Giersch et al., 2016; Hotaling et al., 2017a; Hotaling et al., 2017b).

A lack of agreement can exist between the mtDNA and nuclear genomes of the same species (Gompert et al., 2008). This “mito-nuclear discordance” is typically a result of unique evolutionary characteristics of the mtDNA genome (e.g., matrilineal inheritance, interspecific introgression; Boratyński et al., 2011). This discordance can obscure signals of differentiation, as lineages that are distinct according to the nuclear genome may appear to be closely related according to mtDNA data. Thus, it is important for researchers to incorporate sequence data from both the mtDNA and nuclear genomes in any analyses of species histories. However, when a disconnect arises, the two sources of information should not be considered equal; rather, multiple independent nuclear loci, and the story they collectively tell, should be given greater credence. In this study, we combined mtDNA and nuclear sequence data for Z. glacier, L. tumana, and L. tetonica to assess (1) whether range-specific lineage boundaries within Z. glacier as inferred from mtDNA are supported by genome-scale nuclear data and (2) the degree to which patterns of Z. glacier differentiation correspond with those from co-occurring Lednia species. Given that our study species were described based solely upon the presence (L. tumana and L. tetonica; Baumann & Kondratieff 2010; Baumann & Call 2012) or absence (Z. glacier; Baumann 1975) of distinguishing morphological characters, and that no comparative genetic perspectives have been presented, we hypothesized that both comparisons (Z. glacier by mountain range and L. tumana vs. L. tetonica) would reveal similar levels of differentiation and gene flow, and that results would largely align between mtDNA and nuclear data. From a conservation perspective, our study sheds light on a specific, applied question: is the currently described Z. glacier species composed of several mountain range-specific species?

Materials and Methods:

Study species and field sampling
Zapada glacier (Plecoptera: Nemouridae; Figure 1A) is known to occur in three mountainous regions: GNP in northwestern Montana, the Absaroka-Beartooth Wilderness (ABW) in southern Montana, and the Teton Range in northwestern Wyoming (Figure 2; Giersch et al., 2016). Conversely, both Lednia (Plecoptera: Nemouridae) species are endemic to a single mountain range, L. tumana to GNP (Figure 1B) and L. tetonica to the Teton Range (Figure 1C), where each can co-occur with Z. glacier. Beyond Z. glacier, the Zapada genus is widely distributed, with seven recognized species in the western United States (Baumann, 1975; Baumann et al., 1977), whereas Lednia contains only two other species, both of which are also mountain range endemics: L. borealis of the Cascades and L. sierra of the Sierra Nevada (Baumann & Kondratieff, 2010). While no Lednia species overlap geographically, Zapada species overlap with one another extensively. These overlapping distributions raise a particular challenge for identifying Z. glacier in the field because only Zapada adults are morphologically distinguishable, and adults are typically hard to find and have short emergence periods (Baumann & Gaufin, 1971). To overcome this, we sequenced the cytochrome oxidase I (COI) “barcoding” locus for all newly collected Zapada specimens and compared these new sequences with those previously collected for known Zapada adults and nymphs (Giersch et al., 2015) by constructing a COI-based mtDNA gene tree. In addition to providing an efficient means of generating COI sequence data, this initial barcoding step identified Z. glacier samples for population genomic data collection (see below). Sampling for this study was conducted in alpine streams of GNP, ABW, and the Teton Range in the summers of 2015 and 2016. To ensure accuracy of barcoding identifications, we also acquired specimens of L. sierra, L. borealis, and Zapada from mountain streams in California, Washington, New Mexico, and Oregon (Figure 3). Despite efforts on multiple occasions, the genus Lednia has never been observed in the ABW. Sampling information for all samples and localities of our focal species is included in Table 1, with all samples (including outgroup species) described in Appendix 1.

mtDNA data generation (COI barcoding) and identification of Z. glacier specimens
In total, we sequenced the “DNA barcoding” portion of the mtDNA genome, a 658-bp region of COI, for 79 newly collected specimens representing Zapada sp. (n = 34), L. tetonica (n = 43), L. sierra (n = 1), and L. borealis (n = 1). COI is commonly used in DNA barcoding because it is variable across species yet retains conserved primer binding sites (Hebert et al., 2003). Barcoding was performed by the Canadian Center for DNA Barcoding (CCDB) following established protocols for extraction (Ivanova et al., 2006), polymerase chain reaction (PCR), and sequencing (Hajibabaei et al., 2005; DeWaard et al., 2008). For PCR, the primer set LCO1490/HCO2198 (Folmer et al., 1994) was used to amplify the target fragment of COI. Successful PCR amplicons were checked on a 2% agarose gel and products were cleaned using ExoSAP-IT (Affymetrix, Santa Clara, California, USA). Purified amplicons were cycle-sequenced using a Big Dye v3.1 dye termination kit, purified using Sephadex, and sequenced bidirectionally on an ABI 3730 sequencer (Applied Biosystems, Foster City, California, USA). Additional information on the methods and pipelines used for barcoding by CCDB is available at http://ccdb.ca/resources.php. After barcoding, COI sequences were visually inspected, corrected, and aligned using MUSCLE (Edgar, 2004) as implemented in Geneious version 6.1.8 (Kearse et al., 2012). To identify Z. glacier specimens and generate a complete data set for population genetic comparisons, we combined the 79 new COI sequences with data from three published studies: two focused on Zapada sp.
(Giersch et al., 2015; Giersch et al., 2016) and one on L. tumana (Jordan et al., 2016). GenBank and BOLD accession information, as well as references, for all previously published mtDNA sequence data included in this study can be found in Table 2. To limit any influence of temporal genetic change (e.g., loss of haplotypes; Jordan et al., 2016), only specimens of the focal species collected after 2010 were included, except for six specimens of Z. glacier from ABW that were collected in 2000. For Zapada, the final data set contained 460 specimens, with 256 sequences for Z. glacier and 204 sequences representing all other western Zapada species. For Lednia, the final data set contained 115 specimens, with 70 L. tumana sequences, 43 L. tetonica sequences, and one sequence each for L. borealis and L. sierra. Henceforth, this combined mtDNA COI data set will be referred to as “mtDNA”. To identify Z. glacier specimens and confirm the distinctiveness of L. tumana versus L. tetonica, we built COI gene trees for the Zapada and Lednia data sets separately, with Visoka cataractae (Plecoptera: Nemouridae) serving as the outgroup for all Zapada specimens and Z. glacier as the outgroup for Lednia. To construct trees, we first used an Akaike information criterion (AIC) test implemented in MrModeltest (Nylander, 2004) to select the best-fit model of DNA substitution (GTR+I+G). Next, we used MrBayes version 3.2.4 (Ronquist et al., 2012) to generate mtDNA gene trees for each data set using 5 chains analyzed for 10 million generations with a 1-million generation burn-in. Each analysis was run in two replicates, with samples taken every 10,000 generations. Convergence was determined by inspecting effective sample size (ESS > 200) values in Tracer v1.6.0 (Rambaut & Drummond, 2007). Retained posterior distributions for each replicate were combined to generate a majority-rule consensus tree. For Zapada, the placement of our newly barcoded samples provided the evidence for which species or lineage our 34 new specimens represented.

Population genomic data generation (ddRAD sequencing) and SNP calling
To integrate nuclear genomic perspectives into our study of lineage boundaries in Z. glacier, we collected population genomic data using a double digest restriction-site associated DNA sequencing approach (ddRAD; Peterson et al., 2012). Briefly, this approach allows for efficient generation of many orthologous genetic markers through an initial digestion of genomic DNA with two restriction enzymes, a subsequent size selection, and the integration of sample-specific barcodes in a final PCR amplification. Using the Z. glacier identifications from our COI barcoding, we generated ddRAD data for 64 specimens: Z. glacier (n = 56), L. tumana (n = 4), and L. tetonica (n = 4; Table 1). DNA was extracted using a Qiagen DNeasy Blood & Tissue Kit, gel-checked for quality, and quantified with a Qubit. Following Peterson et al. (2012), we selected the optimal restriction enzyme pair (NlaIII and MluCI) and performed digests on 900 ng of starting DNA per sample. After adapter ligation, we size-selected a window of 526-626 bp on a Pippin Prep. Next, samples were cleaned and pooled, and target fragments were PCR-amplified following the methods outlined in Nunziata et al. (2017). Finished libraries were quantified and checked for the appropriate size distribution on an Agilent Bioanalyzer. Next-generation sequencing (NGS) was performed by the University of Illinois at Urbana-Champaign Roy J. Carver Biotechnology Center on a single lane of Illumina HiSeq 4000 with 100-bp single-end chemistry. After sequencing, raw NGS reads were demultiplexed and subsequently processed through the Stacks pipeline (Catchen et al., 2011; Catchen et al., 2013) with default settings, except that a minimum stack depth (m) of 10 was required before SNP calling. Final data sets for both Z. glacier (n = 56) and L. tumana + L.
tetonica (n = 8) were exported and subsequently filtered in PLINK (Purcell et al., 2007) to create two primary NGS data sets: one that included all SNPs with no more than 33% missing data at a given locus, and a second with the same missing data criterion and all singletons removed. All NGS data sets were filtered to remove any SNP that was not at Hardy-Weinberg equilibrium. For Z. glacier, two samples were removed because their average missing data exceeded the mean for all 56 samples by more than two standard deviations. Additional filtering for specific programs or analyses is described when applicable. Hereafter, these population genomic data sets will be referred to as “NGS”, with additional details provided as necessary.

mtDNA: Population genetic analyses and demographic modeling
We constructed haplotype networks by compressing sequences into common haplotypes using the ALTER web server (Glez-Peña et al., 2010) and generating networks in POPART (Leigh & Bryant, 2015) with the TCS implementation (Clement et al., 2000). We performed an analysis of molecular variance (AMOVA) in Arlequin 3.5 (Excoffier & Lischer, 2010) to assess how genetic variation is partitioned across multiple sampling levels. The AMOVA was performed separately on the Z. glacier and L. tumana + L. tetonica data sets using mountain ranges as the highest level of structure. We assessed significance and 95% confidence intervals using 5000 bootstrap replicates. For both the L. tumana + L. tetonica and Z. glacier data sets, we tested a range of demographic models and characterized gene flow (when applicable) under a coalescent framework in the program Migrate-n v3.6 (Beerli & Felsenstein, 2001). For Z. glacier, we tested eight three-population models that were similar to those tested for Lednia (Figure 4A). For L. tumana + L. tetonica, we tested five two-population models ranging from full, bidirectional gene flow to panmixia (Figure 4B). For all Migrate-n analyses, initial parameter values were calculated using FST, and model averaging was used to estimate migration rate (M) and θ. For the two models without migration, we followed Beerli and Palczewski (2010) in specifying a very small (M = 0.01), uniform custom migration rate among groups. We estimated the transition/transversion ratio (ti/tv) for both alignments via maximum likelihood model selection in jModelTest 2.1.10 (Darriba et al., 2012). These ratios were 15.63 and 4.70 for L. tumana/L. tetonica and Z. glacier, respectively. For all runs, a static heating strategy with four short chains (temperatures of 1.0, 1.5, 3.0, and 1.0 × 10^6) and one long chain was used. We recorded 25,000 steps every 100 generations with 10,000 steps discarded as burn-in. To ensure Markov chain stationarity, we examined ESS values (> 200) for each parameter as well as their distributions. To select among models, we used the Bezier Approximation Score (BAS) to calculate log Bayes Factors (LBFs) and probabilities for each model following Beerli and Palczewski (2010). We calculated the number of migrants per generation using the equation Nm = M × θ.
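
The conversion stated at the end of this section can be sketched in a few lines of Python. This is only a minimal illustration of the relationship Nm = M × θ as written above; the function name and the input values are hypothetical and are not estimates from this study.

```python
def migrants_per_generation(M: float, theta: float) -> float:
    """Return Nm as the product of the migration rate M and theta.

    Follows the relationship stated in the text (Nm = M x theta).
    Both inputs are assumed to come from the same fitted model.
    """
    if M < 0 or theta < 0:
        raise ValueError("M and theta must be non-negative")
    return M * theta


# Hypothetical inputs, chosen only to show the arithmetic.
example_M = 50.0
example_theta = 0.01
example_Nm = migrants_per_generation(example_M, example_theta)
print(f"Nm = {example_Nm:.2f}")
```

Because Nm is a simple product, any uncertainty in either M or θ carries directly into the estimated number of migrants, which is one reason wide confidence intervals on Nm are not unusual.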

NGS: Population genetic analyses and demographic modeling
As with the mtDNA data set, all analyses of the NGS data were performed separately for Z. glacier and L. tumana + L. tetonica. For both, we assessed population structure using a model-based maximum-likelihood clustering method implemented in the program Admixture v1.3.0 (Alexander et al., 2009) and a discriminant analysis of principal components (DAPC) implemented in the R package adegenet (Jombart, 2008; Jombart et al., 2010). For both Z. glacier and L. tumana + L. tetonica, we used the full SNP data sets. Admixture analyses were performed using default settings for cluster numbers (K) from 1 to 12. For each K, we calculated and plotted the cross-validation error to select the best-fit K (lowest cross-validation error). For DAPC analyses, we first used the find.clusters function to assess the optimal number of groups, using the Bayesian information criterion (BIC) as our selection criterion, with the K that minimized BIC selected as the best-fit K. Because retaining too many discriminant functions with respect to the number of populations can lead to overfitting of the population structure model, we retained the optimal number of principal components according to the α-score (Jombart et al., 2010). Next, we performed a final DAPC analysis for each data set using the best-fit K and optimal number of discriminant functions. We calculated pairwise FST among sampling localities and tested for isolation-by-distance (Wright, 1943) using the program GenoDive (Meirmans & Van Tienderen, 2004). For pairwise FST calculations, significance was assessed using 5000 permutations (or resamplings) of the observed data. To test for a signature of isolation-by-distance among sampling localities, we estimated Euclidean distances among localities with Google Earth and tested the correlation between geographic distance and FST using a Mantel test.
We performed a hierarchical AMOVA in the program Arlequin v3.5 (Excoffier & Lischer, 2010) on our best assessment of population structure (i.e., best-fit K) for Z. glacier to quantify how genetic variation was partitioned across different levels of sampling. An AMOVA was not calculated for the Lednia NGS data set because we only collected data for two sampling locations representing one L. tumana population in GNP and one L. tetonica population in the Teton Range. We tested the same demographic models for the NGS data as we did for the mtDNA data set (Figure 4). However, because Migrate-n is computationally intensive, we were not able to include all of our SNPs in Migrate-n analyses. Instead, we randomly subsampled our data to 100 and 474 full-length loci for Z. glacier and L. tumana + L. tetonica, respectively. For all Migrate-n analyses of the NGS data, a static heating strategy with four short chains (temperatures of 1.0, 1.5, 3.0, and 1.0 × 10^6) and one long chain was used. We recorded 750,000 steps every 100 generations with 150,000 steps discarded as burn-in. As above, we ensured Markov chain stationarity by examining ESS values (> 200) for each parameter as well as their distributions. To select among models, we used the Bezier Approximation Score (BAS) to calculate log Bayes Factors (LBFs) and probabilities for each model following Beerli and Palczewski (2010). We calculated the number of migrants per generation using the equation Nm = M × θ.

NGS: Phylogenetic tree reconstruction
To assess the degree of nuclear monophyly among our three hypothesized, mountain range-specific lineages of Z. glacier, we constructed a species tree using the program SVDQuartets (Chifman & Kubatko, 2014) implemented in PAUP v4.0a146 (Swofford, 2015). Specifically, we used a three-tip species tree model that corresponded to putative Z. glacier species, with samples grouped by mountain range. For these data, we used the full 4497-locus (9268 SNPs) data set, but to avoid including linked SNPs to the greatest degree possible, only one randomly selected SNP per locus was included, for a total of 4497 SNPs. In some ways, we treated this phylogenetic analysis of the NGS data as a species delimitation test, where placement of tips into their expected clades with high bootstrap support was interpreted as evidence for possible species-level entities. For the analysis, we sampled 50,000 quartets (34.06% of all possible quartets). Branch support for the inferred trees was estimated using 100 non-parametric bootstrap replicates.
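
The thinning step described above, keeping one randomly selected SNP per locus, can be sketched as follows. This is a minimal Python illustration under assumed inputs: the function name, the (locus, position) tuple layout, and the example records are hypothetical and do not reflect the actual file formats used in the study.

```python
import random
from collections import defaultdict


def thin_to_one_snp_per_locus(snps, seed=42):
    """Keep one randomly chosen SNP from each locus.

    snps: iterable of (locus_id, position) tuples (assumed layout).
    A fixed seed makes the random choice repeatable.
    """
    rng = random.Random(seed)
    by_locus = defaultdict(list)
    for locus_id, position in snps:
        by_locus[locus_id].append((locus_id, position))
    # Sorting the locus IDs keeps the output order deterministic.
    return [rng.choice(by_locus[locus_id]) for locus_id in sorted(by_locus)]


# Hypothetical records: three loci carrying six SNPs in total.
example_snps = [
    ("locus_1", 12), ("locus_1", 57),
    ("locus_2", 8),
    ("locus_3", 3), ("locus_3", 40), ("locus_3", 88),
]
thinned = thin_to_one_snp_per_locus(example_snps)
print(f"{len(example_snps)} SNPs thinned to {len(thinned)} (one per locus)")
```

After thinning, the number of SNPs equals the number of loci, which matches the 4497 loci and 4497 SNPs reported for the species tree analysis.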

NGS: Coalescent-based species delimitation
To explicitly test whether mountain range-specific lineages of Z. glacier and Lednia amounted to distinct species, we performed coalescent-based species delimitation tests using the program BPP version 3.3 (Yang & Rannala, 2010; Yang, 2015). BPP employs reversible-jump Markov chain Monte Carlo (rjMCMC) sampling to explore the likelihood of species tree or species delimitation models under different numbers of lineages and prior conditions defined a priori. In doing so, BPP calculates posterior probabilities of nodes in the guide tree and thus the relative probability of competing delimitation models. Following the tutorial by Yang (2015), we analyzed 522 loci for four Z. glacier samples from each mountain range, for a total of 12 samples representing the three putative species. For Lednia, we analyzed 471 loci for all four samples of each species (L. tumana and L. tetonica). Because BPP analyses can be computationally demanding and quickly become intractable if too many loci and/or samples are included, we chose this subsampling to match the number of individuals in the Lednia data set, with a goal of approximately 500 loci for both. We also elected to use only loci with complete data across all sampled individuals (within genera). We analyzed our data using the BPP algorithm “A11”, which jointly estimates both a species tree topology and species delimitation model probabilities (Yang, 2015). As mentioned above, BPP requires user-specified prior combinations of θ (population size) and τ (divergence time). Following similar studies (e.g., Hotaling et al., 2016), we chose six combinations of priors that crossed three population sizes, small (Gamma distribution with α = 1, β = 1000), medium (1, 100), and large (1, 10), with two divergence times, shallow (1, 1000) and deep (1, 100). For each rjMCMC analysis, we discarded the first 25,000 generations as burn-in.
Subsequently, a sample was recorded every 100 generations for a total of 100,000 samples (10 million post-burn-in generations). For both focal groups (Z. glacier and L. tumana + L. tetonica), we performed three replicate analyses for each set of priors (six combinations in all) and averaged model support across them. To assess the degree to which delimitation probabilities were merely the product of noise in the data rather than biologically meaningful, we also conducted a single matching replicate of all prior combinations and delimitation models with samples randomized among groupings. The expectation for the randomized replicates is that any perceived support for delimitations would be lost when genetic variation was randomly distributed.
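
The prior grid and chain length described above can be laid out explicitly. The Python sketch below simply enumerates the six θ/τ combinations and the sampling arithmetic from the text; it assumes the Gamma priors use a shape-rate parameterization (mean = α/β), and the dictionary and function names are hypothetical rather than BPP settings.

```python
from itertools import product

# (alpha, beta) pairs as given in the text.
THETA_PRIORS = {"small": (1, 1000), "medium": (1, 100), "large": (1, 10)}
TAU_PRIORS = {"shallow": (1, 1000), "deep": (1, 100)}


def gamma_mean(alpha, beta):
    """Mean of a Gamma distribution, assuming shape alpha and rate beta."""
    return alpha / beta


def prior_grid():
    """Cross every population-size prior with every divergence-time prior."""
    grid = []
    for (theta_name, theta_ab), (tau_name, tau_ab) in product(
            THETA_PRIORS.items(), TAU_PRIORS.items()):
        grid.append({
            "theta": theta_name,
            "tau": tau_name,
            "theta_mean": gamma_mean(*theta_ab),
            "tau_mean": gamma_mean(*tau_ab),
        })
    return grid


# Chain settings from the text.
BURNIN_GENERATIONS = 25_000
SAMPLE_INTERVAL = 100
N_SAMPLES = 100_000
post_burnin_generations = SAMPLE_INTERVAL * N_SAMPLES

grid = prior_grid()
for combo in grid:
    print(combo)
print(f"{len(grid)} prior combinations, "
      f"{post_burnin_generations:,} post-burn-in generations each")
```

Under this parameterization, the prior means span two orders of magnitude for θ and one for τ, which gives a sense of how differently the six combinations weight small versus large populations and shallow versus deep divergences.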

Results:

mtDNA: Data summary and sample identification
Our final COI alignment for Zapada was 658 bp long with 2.49% missing data across all specimens and 1.95% missing data for Z. glacier only. Phylogenetic analyses supported the seven recognized western Nearctic Zapada species as being monophyletic with posterior probabilities (PPs) of 1.0 (Figure 5A). Of our 34 newly barcoded Zapada specimens, 18 were identified as Z. glacier. These new specimens were from four streams where Z. glacier had not previously been recorded: three in ABW and one in the Teton Range (Figure 2; Table 1), bringing the total number of streams known to contain Z. glacier to 13 (Giersch et al., 2015; Giersch et al., 2016). For Lednia (n = 115), our final COI alignment was 658 bp long with 1.27% missing data. We confirmed the presence of L. tetonica at its only previously known location, Wind Cave at the head of Darby Creek, in the Teton Range (Baumann & Call, 2012). Our field surveys expanded this known distribution to seven new sites, all within the Teton Range (Figure 1, Table 1). Phylogenetic analyses strongly supported the existing Lednia taxonomy, with PPs of 1.0 for all nodes and described species resolved as monophyletic (Figure 5B). The mtDNA gene tree placed L. tetonica and L. tumana as sister species, with L. borealis as the sister lineage to the L. tetonica + L. tumana clade, and L. sierra as the sister species to the other three (Figure 5B).

mtDNA: Haplotype networks and population genetics
A haplotype network connecting all Z. glacier specimens (n = 256) included 20 haplotypes from the three mountain ranges: GNP (n = 198 specimens; 14 haplotypes), ABW (n = 23 specimens; 2 haplotypes), and the Teton Range (n = 35 specimens; 5 haplotypes; Figure 6A). Each mountain range was generally characterized by a distinct set of haplotypes; however, haplotypes were shallowly diverged within mountain ranges and only slightly more diverged among them.
Interestingly, a single haplotype was shared between the Grinnell Glacier site in GNP (n = 1) and ABW (n = 22; Figure 6A). The maximum divergence between any two Z. glacier haplotypes was 1.22%. Among-range differentiation explained 89.5% of the variation (ΦCT = 0.89) in the data, and within-population variation explained another 10.9% (ΦST = 0.89; Table 3A). For all Lednia specimens, we identified five L. tumana haplotypes, seven L. tetonica haplotypes, and one haplotype each for the single specimens of L. borealis and L. sierra. A haplotype network connecting Lednia samples revealed strong divergence across described species (and by proxy, mountain ranges; Figure 6B). When L. tumana and L. tetonica were grouped by species (i.e., mountain range), among-species variation explained 95.3% of the variation (ΦCT = 0.89), differentiation within species explained a negligible amount (0.4%, ΦSC = 0.08), and within-population differentiation explained the remaining 4.3% (ΦST = 0.95; Table 3B).
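
To give a sense of scale for the haplotype divergences reported above, the following Python sketch computes an uncorrected pairwise distance (p-distance) between two aligned sequences. It assumes that divergence is measured as the proportion of differing sites among sites without missing data; the text does not state which distance measure was used, and the function name and toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, missing="N?-"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a missing or gap character are
    skipped, so the denominator is the number of compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy 658-bp sequences that differ at eight sites.
toy_a = "A" * 658
toy_b = "C" * 8 + "A" * 650
toy_divergence = 100 * p_distance(toy_a, toy_b)
print(f"{toy_divergence:.2f}% divergence")
```

Under this assumption, a divergence of 1.22% across the full 658-bp COI fragment would correspond to roughly eight differing sites (8/658 ≈ 1.22%).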

mtDNA: Demographic model selection and gene flow estimation
For Z. glacier, the best-supported demographic model for the mtDNA data set was the “north-to-south” model, which included gene flow from GNP into ABW and from ABW into the Teton Range (model 2, model probability ~1; Figure 4A, Table 4). All other models were strongly rejected (LBFs ≥ 12, model probabilities ≤ 2.4 × 10^-3). Interestingly, a no-migration model was one of the least supported models (model 7; LBF = 47.3, model probability = 5.5 × 10^-11). For the best-fit model, the number of migrants per generation from GNP into ABW (mean Nm = 1.02, 95% confidence interval = 0 to 5.27) was estimated at twice the rate observed for ABW into the Teton Range (mean Nm = 0.5, 95% confidence interval = 0 to 2.75; Table 5). For L. tumana and L. tetonica, the best-supported mtDNA demographic model included no migration between species (model 4, model probability ~1; Figure 4B, Table 4). All models including a gene flow parameter were rejected (LBFs ≥ 142.9, model probabilities ≤ 9.3 × 10^-32), as was the panmixia model (model 5, LBF = 529.5, model probability = 1.1 × 10^-115). Because the best-fit model for the mtDNA data set did not include a gene flow parameter, we did not estimate migration rates between L. tumana and L. tetonica.
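
The relationship between the log Bayes Factors and model probabilities reported above can be illustrated with a short Python sketch. It assumes the convention LBF = 2 × (ln mL of the best model − ln mL of the compared model), so that a model's marginal likelihood relative to the best model is exp(−LBF/2); this convention is an assumption here, although it reproduces the reported probabilities to within rounding. The function names and the example log marginal likelihoods are hypothetical.

```python
from math import exp


def relative_weight_from_lbf(lbf):
    """Marginal likelihood of a model relative to the best model.

    Assumes LBF = 2 * (lnmL_best - lnmL_model), so the ratio is
    exp(-LBF / 2).
    """
    return exp(-lbf / 2.0)


def model_probabilities(log_marginal_likelihoods):
    """Normalize log marginal likelihoods into model probabilities."""
    best = max(log_marginal_likelihoods.values())
    # Subtracting the best value first avoids numerical underflow.
    weights = {name: exp(value - best)
               for name, value in log_marginal_likelihoods.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# LBFs reported in the text for rejected mtDNA models.
for reported_lbf in (12.0, 47.3, 142.9, 529.5):
    weight = relative_weight_from_lbf(reported_lbf)
    print(f"LBF = {reported_lbf}: relative weight = {weight:.2e}")

# Hypothetical log marginal likelihoods for three competing models.
example_probs = model_probabilities(
    {"model_a": -1000.0, "model_b": -1006.0, "model_c": -1023.65})
print(example_probs)
```

When the best model has a probability near 1, as reported here, each rejected model's probability is approximately its relative weight, which is why an LBF of 47.3 corresponds to a probability on the order of 10^-11.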

NGS: Data summary and population genetics
In total, we generated 346,078,450 100-bp reads with uniformly high coverage across samples for Z. glacier (mean coverage = 80.9, min. = 34.8, max. = 155.6) and Lednia (mean coverage = 85.7, min. = 74.6, max. = 103.2). We identified more than twice as many total catalog loci (i.e., recovered RAD fragments) for Z. glacier (578,119) as for Lednia (249,292). After initial read processing, two Z. glacier samples were removed from all downstream analyses because their missing data exceeded the average by more than two standard deviations, bringing the final sampling totals to 54 for Z. glacier, 4 for L. tumana, and 4 for L. tetonica. After filtering, our Z. glacier data set contained 4497 variable loci and 9268 SNPs (2.06 SNPs per locus) and was 84.2% complete. For L. tumana + L. tetonica, our final data set contained 7034 variable loci and 10708 SNPs (1.52 SNPs per locus) and was 87.9% complete. When the two data sets were compared, just 214 SNPs were shared between Z. glacier and L. tumana + L. tetonica. For Z. glacier, both Admixture and DAPC analyses favored grouping samples into K = 3 genetic clusters that aligned with mountain range (Figure 7). For Lednia, both Admixture and DAPC analyses favored grouping samples into K = 2 genetic clusters, which aligned with both existing species descriptions and mountain range (Figure 7). For Z. glacier, when samples were analyzed by mountain range alone, no evidence for nested, within-range structure was present for either method. When all 11 Z. glacier localities were considered, average genetic differentiation (FST) among localities was 0.216 (Table 6). When grouped by mountain range, Z. glacier samples from both GNP and ABW were almost equally differentiated from the Teton Range samples (FST = 0.306 and 0.298, respectively; Table 7). Conversely, GNP and ABW samples were much more closely related (FST = 0.175).
As expected, differentiation within mountain ranges was much lower than differentiation between ranges. GNP localities were the most different from one another (mean within-range FST = 0.101), whereas localities in ABW and the Teton Range were much more closely related (mean within-range FST = 0.049 and 0.014, respectively). This pattern of isolation-by-distance was supported by a Mantel test (r² = 0.51, P < 0.001). Although only one population each of L. tumana and L. tetonica was sampled, these two populations were highly differentiated from one another (FST = 0.741; Table 7).

For the Z. glacier NGS data set, AMOVA results identified modest differentiation among mountain ranges (ΦCT = 0.279, P < 0.0001; Table 3), which accounted for 25.1% of the observed variation. Variation among populations within mountain ranges was also significant but explained only a small portion of the total variation (2.8%). The bulk of the variation (72.1%) was observed among individuals within sampling localities (ΦST = 0.251, P < 0.0001; Table 3). With only two localities sampled, no AMOVA was calculated for the Lednia NGS data set.
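The overall and within-range FST means reported above can be recomputed directly from the pairwise values in Table 6. The short Python sketch below does this; the values are transcribed from Table 6, and the variable names and the dictionary-based layout are illustrative choices rather than part of the original analysis.

```python
from statistics import mean

# Locality codes in Table 6 order, mapped to their mountain range
RANGE = {"BCP": "GNP", "APC": "GNP", "DRF": "GNP", "PGT": "GNP", "GRL": "GNP",
         "TLK": "ABW", "JLK": "ABW", "RCK": "ABW",
         "MLK": "GRTE", "SCC": "GRTE", "TMW": "GRTE"}
SITES = list(RANGE)

# Upper triangle of Table 6: row i holds FST between site i and each later site
ROWS = [
    [0.036, 0.113, 0.114, 0.112, 0.211, 0.262, 0.170, 0.282, 0.321, 0.243],
    [0.111, 0.137, 0.104, 0.217, 0.257, 0.202, 0.358, 0.359, 0.317],
    [0.119, 0.129, 0.219, 0.258, 0.178, 0.284, 0.298, 0.249],
    [0.034, 0.233, 0.252, 0.207, 0.317, 0.331, 0.275],
    [0.219, 0.284, 0.193, 0.324, 0.363, 0.275],
    [0.076, 0.028, 0.335, 0.337, 0.301],
    [0.042, 0.286, 0.300, 0.258],
    [0.315, 0.309, 0.286],
    [0.017, 0.018],
    [0.007],
]

# Build a {(site_a, site_b): FST} lookup from the triangular layout
fst = {}
for i, row in enumerate(ROWS):
    for offset, value in enumerate(row, start=1):
        fst[(SITES[i], SITES[i + offset])] = value

overall = mean(fst.values())
within = {
    r: mean(v for (a, b), v in fst.items() if RANGE[a] == RANGE[b] == r)
    for r in ("GNP", "ABW", "GRTE")
}

print(f"overall mean FST = {overall:.3f}")
for r, value in within.items():
    print(f"mean within-range FST, {r} = {value:.3f}")
```

Averaging all 55 pairwise comparisons gives the overall mean of 0.216, and restricting the average to pairs within the same range gives 0.101 (GNP), 0.049 (ABW), and 0.014 (Teton Range).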

NGS: Demographic model selection and gene flow estimation
Results of demographic model selection for the NGS data differed substantially from the mtDNA results. For both Z. glacier and L. tumana + L. tetonica, the best-supported NGS model was the full migration model, which included bidirectional gene flow among all localities (model 1 in both cases; model probability ~1 for both; Figure 4, Table 4). For Z. glacier, all other demographic models were rejected with LBFs ≥ 6259.8, and for L. tumana + L. tetonica, all other models were rejected with LBFs ≥ 6299.8 (Table 4). Under the Z. glacier full migration model, the number of migrants per generation was ~22× higher from GNP and the Teton Range into ABW (mean Nm = 121.4) than in the reciprocal directions from ABW into GNP or the Teton Range (mean Nm = 5.48; Table 5). Migration between GNP and the Teton Range was also low (mean Nm = 4.89). For Lednia, migration rates were similar to the Z. glacier rates that did not involve ABW, with an estimated 4.12 migrants per generation moving from the Teton Range to GNP and 4.18 moving in the reverse direction (Table 5).
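The migration summaries above can be checked against Table 5 with a few lines of arithmetic. The Python sketch below does two things: it compares each reported Nm with the product θ × M, and it averages the reported Z. glacier NGS Nm values by direction. The relation Nm ≈ θ × M (with no additional ploidy scaling) is an assumption inferred from the tabulated values, not a formula stated in the text, and the helper name `nm_from_theta_m` is hypothetical. Because Table 5 lists rounded values, small differences from the figures in the text are expected.

```python
from statistics import mean

# Rows transcribed from Table 5: (M, direction, theta, reported Nm)
TABLE5 = {
    "Z. glacier mtDNA": [
        (636.5, "GNP > ABW", 1.6e-3, 1.02),
        (201.1, "ABW > GRTE", 2.5e-3, 0.5),
    ],
    "Z. glacier NGS": [
        (3717.8, "GNP > ABW", 3.4e-2, 128.0),
        (3270.5, "GNP > GRTE", 1.7e-3, 5.7),
        (2223.6, "ABW > GNP", 3.1e-3, 7.0),
        (2304.4, "ABW > GRTE", 1.7e-3, 4.0),
        (1306.9, "GRTE > GNP", 3.1e-3, 4.1),
        (3333.1, "GRTE > ABW", 3.4e-2, 114.7),
    ],
    "Lednia NGS": [
        (1165.1, "L. tumana > L. tetonica", 3.6e-3, 4.2),
        (1091.7, "L. tetonica > L. tumana", 3.7e-3, 4.1),
    ],
}

def nm_from_theta_m(theta, m):
    # Assumed relation (inferred from Table 5, not stated in the text): Nm ~ theta * M
    return theta * m

for dataset, rows in TABLE5.items():
    for m, direction, theta, reported in rows:
        estimate = nm_from_theta_m(theta, m)
        print(f"{dataset:17s} {direction:24s} theta*M = {estimate:7.2f}  reported Nm = {reported}")

# Directional averages for the Z. glacier NGS estimates
ngs = {direction: reported for _, direction, _, reported in TABLE5["Z. glacier NGS"]}
into_abw = mean([ngs["GNP > ABW"], ngs["GRTE > ABW"]])
out_of_abw = mean([ngs["ABW > GNP"], ngs["ABW > GRTE"]])
gnp_grte = mean([ngs["GNP > GRTE"], ngs["GRTE > GNP"]])
ratio = into_abw / out_of_abw

print(f"mean Nm into ABW = {into_abw:.2f}, out of ABW = {out_of_abw:.2f}, ratio = {ratio:.1f}")
print(f"mean Nm between GNP and GRTE = {gnp_grte:.2f}")
```

With the rounded table values, the mean Nm into ABW is about 121.4, the mean out of ABW is 5.5, and their ratio is about 22, consistent with the ~22× difference described above; θ × M falls within a few percent of every reported Nm.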

NGS: Phylogenetic tree reconstruction and species delimitation
SVDQuartets analysis of the Z. glacier data, with samples grouped as hypothesized species, resulted in strongly supported splits among mountain ranges (Figure 8). Unlike samples from ABW and the Teton Range, Z. glacier samples from within GNP were split into four additional clades, with all samples from Grinnell Glacier, Piegan Tarn, and Dry Fork forming their own clades and a combined group of two localities from the southern portion of GNP forming the fourth (Figure 8). However, because no outgroup was included, we were unable to infer relationships among lineages from this phylogenetic reconstruction. Because only two populations of Lednia were sampled in the NGS data set (one each for L. tumana and L. tetonica), we did not infer an SVDQuartets phylogeny for these populations.

BPP analysis of the NGS data set yielded a best-fit, three-species model (model probability >0.99 across all prior combinations and replicates) for Z. glacier. This three-species Z. glacier model also uniformly supported a species tree topology that grouped samples from GNP and ABW as sister lineages, with the Teton Range having diverged earlier from both (Figure 9). Similarly, BPP analyses yielded strong support for a two-species Lednia model delimiting the already described L. tumana and L. tetonica as separate species (Figure 9). For both sets of BPP analyses, randomized tip labeling (where individual samples were randomly assigned to groups instead of being assigned according to collection location as in the non-random tests) generally supported one-species models (no delimitation) with high model probabilities. The only exceptions were two prior combinations for Z. glacier (Figure 9).

Discussion: Alpine streams and their associated biota remain understudied in North America, with distributions and habitat requirements largely unknown (Hotaling et al., 2017b). Research in GNP focused on L. tumana and Z. glacier represents perhaps the most thoroughly studied example in North America, and our field efforts in ABW and the Teton Range greatly expanded basic knowledge of Z. glacier and Lednia sp. beyond GNP. Indeed, we identified a fourth stream containing Z. glacier on the east side of the Teton Range, in the inlet to Delta Lake, and three additional streams containing Z. glacier in the ABW (Figure 2). For L. tetonica, the results were even more dramatic: known originally from a single stream in the Teton Range, the species was found in six new streams across the range. While Lednia samples from ABW would have been important comparative genetic data points, Lednia has never been observed in ABW despite what appears to be appropriate habitat.

Given the attention surrounding both Z. glacier and L. tumana in North America, and their respective recommendations for listing under the ESA, the overarching goal of this study was to sequence their DNA to estimate the degree of differentiation among mountain ranges. Lednia species from different mountain ranges can be distinguished morphologically, whereas Z. glacier, as currently described, occurs in three distinct mountain ranges of the central Rocky Mountains. Due to the absence of adult material from the Teton Range, a phylogenetic analysis incorporating morphological data has not been performed for Z. glacier. Given the strong affinity of Z. glacier for high-elevation, cold streams (Giersch et al., 2016), intervening lower-elevation habitat should act as a barrier to dispersal among ranges. This would reduce gene flow and, ultimately, promote speciation among Z. glacier lineages.
This general framework aligns with previous research in alpine streams supporting significant isolation among mountaintop-isolated stream populations, even within ranges (Finn & Adler, 2006; Finn et al., 2016; Jordan et al., 2016). Previous studies of Z. glacier population genetic structure have supported this theme with clear breaks identified between GNP, ABW, and the Teton Range (Giersch et al., 2015; Giersch et al., 2016). However, this theoretical basis stems largely from mtDNA data sets (Hotaling et al., 2017b), and while informative, mtDNA results alone must be interpreted with caution. This caution is due to several unique characteristics of the mtDNA genome beyond merely making inferences from a single genetic marker. Namely, the mtDNA genome has a smaller effective population size, no recombination, and matrilineal inheritance (Ballard & Whitlock, 2004), as well as the potential for mito-nuclear discordance (Gompert et al., 2008). By extending previous Z. glacier population genetic efforts to include both mtDNA and nuclear genome perspectives as well as comparisons to co-occurring, previously described stonefly species that share highly similar ecological patterns, we greatly increased our power to resolve and give context to patterns of Z. glacier divergence among mountain ranges. Zapada glacier, as currently described, clearly includes independent lineages that correspond with three mountain ranges: GNP, ABW, and the Teton Range. 
This result is supported by many lines of evidence from our NGS data set: clear population genetic structure supporting three clusters that correspond with mountain range across two methods (Figure 7), phylogenetic support for nuclear monophyly of the same clusters, including bootstrap values of 100 for the identified splits among ranges (Figure 8), significant explanation of genetic variation (25.07%) by mountain range groupings (Table 3), a minimum FST between samples in different ranges of 0.175 (Table 7), and BPP support for a three-species model (Figure 9). However, when mtDNA data are also considered, these patterns become less clear. Giersch et al. (2015, 2016) identified genetic breaks between Z. glacier samples that corresponded with mountain range. Our new data set, which included representatives from three additional streams in ABW, still clearly supported the isolation of the Teton Range, but our finding of one mtDNA haplotype shared by Z. glacier samples from both GNP and ABW supports a slightly closer connection between these two groups than has been previously described. Moreover, the mtDNA pattern observed for Z. glacier was drastically different from that of L. tumana and L. tetonica, which exhibited extensive differentiation at the same COI locus. The disconnect in observed differentiation between NGS and mtDNA data for Z. glacier samples over the same area clearly indicates that some degree of mito-nuclear discordance exists for Z. glacier that is not present for Lednia species. This discordance could be the product of some life history difference between the two groups (e.g., sex-biased dispersal; Elbrecht et al., 2014) or some as yet undiscovered pattern of mtDNA introgression among Zapada species (Boumans & Tierno de Figueroa, 2016).

For both Zapada and Lednia comparisons, our NGS data supported a non-zero amount of gene flow among groups, whether between described species (Lednia) or populations (Z. glacier). The rates of gene flow between GNP and the Teton Range were largely comparable for Z. glacier and L. tumana + L. tetonica, but rates were greatly elevated for Z. glacier from both GNP and the Teton Range into ABW. This evidence for gene flow does not undermine existing descriptions of L. tumana and L. tetonica, nor does it preclude similar boundaries existing in Z. glacier (Nosil, 2008; Feder et al., 2012; Martin et al., 2013). Still, the presence of gene flow, and particularly the elevated rates into ABW for Z. glacier, is an important consideration when determining whether observed patterns rise to the level of species-level biodiversity.
Ultimately, while this collective evidence clearly demonstrates that significant differentiation exists among Z. glacier populations residing in different mountain ranges, we hesitate to describe this variation as equating to independent species. While the evidence is robust, we are erring on the side of caution following the recommendations of Carstens et al. (2013), who implore the systematic community to ensure accuracy by using many species delimitation methods before making new species descriptions that could be inaccurate. Indeed, systematists have a responsibility not to diagnose new species that may ultimately be proved inaccurate, as false descriptions confuse nomenclature in the literature, reduce confidence from the broader community, and, at their worst, negatively impact conservation resource allocation and management decisions. Depending upon how a species delimitation method is defined, we have used just one (BPP) or two (BPP and SVDQuartets) to analyze these data. For BPP analyses, more thorough characterization of the influence that starting priors and run length have on delimitation results is needed before confident assessments can be made. Moreover, newer methods that leverage gene trees to test a wide range of demographic models, including parameters for gene flow, coalescence (i.e., divergence) time, and population size, will no doubt shed important light on the most likely demographic scenario Z. glacier populations have experienced and whether this best-fit scenario supports species-level divergences (Carstens et al., 2013; Hime et al., 2016; Jackson et al., 2017). Finally, as both L. tumana and L. tetonica were originally described from morphological data, a robust morphological comparison of adult Z. glacier from each mountain range would provide a highly relevant additional line of evidence for (or against) any resulting delimitations.

Acknowledgements: We acknowledge funding for this research from the Wyoming Governor’s Office, University of Wyoming-National Park Service Research Grants, and the Teton Conservation District. We thank Cayley Faurot-Daniels and Lydia Zeglin for assistance in the field. Robin Bagley and Kara Jones provided valuable technical and analytical assistance. Lynn Hotaling provided comments that improved this report.

Figures:
FIGURE 1. Photographs of adult (a) Zapada glacier, (b) Lednia tumana, and (c) Lednia tetonica. (d) Garnet Canyon in Grand Teton National Park, an exemplar alpine stream habitat where Z. glacier and L. tetonica co-occur. The stream is primarily fed by the Middle Teton Glacier above the cliff bands.

FIGURE 2. The distribution of Zapada glacier, Lednia tumana, and Lednia tetonica specimens included in this study. The study area shown includes Glacier National Park, the Absaroka-Beartooth Wilderness, and Grand Teton National Park superimposed on an elevation gradient. Detailed locality information is included in Table 1. Stars indicate localities where both mtDNA and NGS data were collected. White circles indicate 10 new populations (four of Z. glacier, six of L. tetonica) discovered during this study.

FIGURE 3. Distribution of all Zapada and Lednia specimens included in this study. Detailed locality information for each species or lineage is included in Appendix 1.

FIGURE 4. Demographic models tested in Migrate-n for (a) Zapada glacier and (b) Lednia tumana and Lednia tetonica. GNP = Glacier National Park, ABW = Absaroka-Beartooth Wilderness, GRTE = the Teton Range. Black arrows indicate the direction of gene flow.

FIGURE 5. Cytochrome c oxidase subunit I (COI) gene trees of (a) the genus Zapada, including 70 specimens from Jordan et al. (2016) and 45 newly barcoded specimens, and (b) western North American Lednia. Terminal nodes were compressed into triangles and scaled according to number of specimens. Node numbers indicate posterior probabilities.

FIGURE 6. A COI haplotype network of (a) all Zapada glacier specimens and (b) the four species recognized by the current Lednia taxonomy. Colored circles represent compressed haplotypes (with higher frequency haplotypes as larger circles) with one mutational difference between them. Hashmarks between compressed haplotypes represent one additional mutational step. Connections with no hashmarks are one mutation apart.

FIGURE 7. Population genomic structure inferred from 9268 SNPs using DAPC for (a) Lednia tumana versus Lednia tetonica, (b) Zapada glacier grouped by population in principal component space, and (c) Z. glacier grouped by mountain range. For (a) and (c), each bar represents one individual and the y-axis indicates assignment probability to a given genetic cluster. Though not shown, Admixture results were in agreement with DAPC analyses.

FIGURE 8. A hypothesized species tree generated for Z. glacier in SVDQuartets. Abbreviations include: Glacier National Park (GNP), Absaroka-Beartooth Wilderness (ABW), and Grand Teton National Park (GRTE). Node labels are bootstrap support across 100 nonparametric replicates. For clarity, only higher-level bootstrap values are shown. All other lower-level node bootstrap support values were significantly lower than 100.

FIGURE 9. Results of BPP delimitation tests for (a) Lednia tumana versus Lednia tetonica and (b) Zapada glacier across mountain ranges. Prior combinations are given in the grey box and support values correspond with this ordering. “Test” indicates support across replicates where samples were grouped into described (a) or hypothesized (b) species. “Random” indicates support when tips were randomized among groups.

Tables:
TABLE 1. Sampling information for Zapada glacier, Lednia tumana, and Lednia tetonica specimens included in this study. Range refers to the primary geographic area where specimens were collected. NmtDNA and NNGS are the sample sizes for a given locality for the mitochondrial DNA and nuclear DNA data sets, respectively. Range abbreviations: GNP = Glacier National Park, ABW = Absaroka-Beartooth Wilderness, GRTE = Grand Teton National Park/Teton Range. All lake locations refer to outlet streams unless otherwise indicated. Complete sampling information for all taxa is included in Appendix 1. Asterisks indicate populations newly identified in this study.

Species      Stream               Range  NmtDNA  NNGS  GPS coordinates
Z. glacier   Piegan Pass          GNP    16      3     48.7294, -113.6972
Z. glacier   Grinnell Lake        GNP    37      5     48.7574, -113.7248
Z. glacier   Appistoki Creek      GNP    87      12    48.4589, -113.3489
Z. glacier   Dry Fork Spring      GNP    55      2     48.5345, -113.3805
Z. glacier   Buttercup Park       GNP    3       3     48.4237, -113.3844
Z. glacier   *Jasper Lake         ABW    2       2     45.0233, -109.5785
Z. glacier   *Timberline Lake     ABW    5       6     45.1325, -109.5077
Z. glacier   Frosty Lake          ABW    6       -     45.0261, -109.5515
Z. glacier   *W. Fork Rock Ck.    ABW    10      6     45.0962, -109.6040
Z. glacier   *Delta Lake          GRTE   1       -     43.7325, -110.7750
Z. glacier   Teton Meadows        GRTE   21      10    43.7259, -110.7904
Z. glacier   S. Cascade Ck.       GRTE   6       2     43.7285, -110.8373
Z. glacier   Mica Lake            GRTE   7       5     43.7854, -110.8414
L. tumana    Lunch Creek          GNP    23      -     48.7052, -113.7046
L. tumana    Sexton Glacier       GNP    31      4     48.7003, -113.6281
L. tumana    Siyeh Bend           GNP    4       -     48.7115, -113.6751
L. tumana    Bearhat Mountain     GNP    10      -     48.6650, -113.7491
L. tumana    Heavens Peak         GNP    1       -     48.7102, -113.8427
L. tumana    Grant Glacier        GNP    1       -     48.3314, -113.7368
L. tetonica  *Alaska Basin        GRTE   6       -     43.6895, -110.8327
L. tetonica  *Sunset Lake Inlet   GRTE   6       -     43.7102, -110.8556
L. tetonica  *Schoolroom Glacier  GRTE   6       -     43.7286, -110.8440
L. tetonica  Wind Cave            GRTE   6       -     43.6657, -110.9561
L. tetonica  *Teton Meadows       GRTE   6       4     43.7258, -110.7931
L. tetonica  *N. Fork Teton Ck.   GRTE   6       -     43.7681, -110.8615
L. tetonica  *Upper Paintbrush    GRTE   7       -     43.7852, -110.7941

TABLE 2. GenBank and BOLD accession information for previously published sequence data included in this study.

Species        Database      Project name or accession ID(s)  Study                Notes
Zapada sp.     BOLD/GenBank  GNPZa / KM874110–KM874263        Giersch et al. 2015
Zapada sp.     BOLD          GNPZP                            Giersch et al. 2016
Lednia tumana  GenBank       KX212679–KX212864                Jordan et al. 2016   Samples from 2010 or later only

TABLE 3. Analysis of molecular variance (AMOVA) results for Zapada glacier (mtDNA), Z. glacier (NGS), and Lednia tumana + Lednia tetonica (mtDNA). Groupings for all comparisons are by mountain range. ns = not significant at P ≤ 0.05. Percentage of variation explained is given with the value for the respective fixation index in parentheses.

Fixation index  Source of variation                       Z. glacier, mtDNA  Z. glacier, NGS  Lednia, mtDNA
ΦCT             Among mountain ranges                     89.5% (0.89)       25.1% (0.28)     95.3% (0.96)
ΦSC             Among populations within mountain ranges  ns                 2.8% (0.04)      0.4% (0.08)
ΦST             Within populations                        10.9% (0.89)       72.1% (0.25)     4.3% (0.95)

TABLE 4. Phylogeographic model descriptions and selection results for (a) Lednia tumana and Lednia tetonica, and (b) Zapada glacier tested in Migrate-n. LBF: log Bayes factor. GNP: Glacier National Park. ABW: Absaroka-Beartooth Wilderness. LBFs and model probabilities calculated following Beerli and Palczewski (2010). Arrows (>) indicate the direction of migration for a given model. The best-fit models are highlighted in bold.

Model  Description                                                    LBF, mtDNA  Choice, mtDNA  LBF, NGS  Choice, NGS
(a) Lednia tumana and Lednia tetonica
1      Full migration                                                 255.6       4              --        1
2      Unidirectional: L. tetonica > L. tumana                        161.4       2              6299.8    2
3      Unidirectional: L. tumana > L. tetonica                        142.9       3              6346.5    3
4      No migration                                                   --          1              18464.5   4
5      Panmixia                                                       525.5       5              23264.2   5
(b) Zapada glacier
1      Full migration                                                 64.5        5              --        1
2      North to south: GNP > ABW > Teton Range                        --          1              6315.0    3
3      South to north: Teton Range > ABW > GNP                        12.1        2              6647.7    6
4      Out of GNP: GNP > ABW, GNP > Teton Range                       14.0        3              6259.8    2
5      Out of ABW: ABW > GNP, ABW > Teton Range                       31.5        4              6329.3    4
6      Out of the Teton Range: Teton Range > GNP, Teton Range > ABW   60.0        6              6371.8    5
7      No migration                                                   47.3        7              12000.6   7
8      Panmixia                                                       215.4       8              14249.7   8

TABLE 5. Rate of migration (M), direction, θ (mutation-scaled effective population size), and Nm (number of immigrants per generation) for the best-fit model identified by Migrate-n analyses for (a) Zapada glacier mtDNA, (b) Zapada glacier NGS, and (c) Lednia NGS. All values are the mean estimate and, for Nm, the 95% confidence interval is provided in parentheses. Provided θ values are for the region receiving migrants. No mtDNA estimates are given for Lednia because the best-fit model did not include a migration parameter. GNP: Glacier National Park. GRTE: Grand Teton National Park/Teton Range. ABW: Absaroka-Beartooth Wilderness.

M       Direction                θ           Nm
(a) Z. glacier mtDNA
636.5   GNP > ABW                1.6 × 10⁻³  1.02 (0–5.27)
201.1   ABW > GRTE               2.5 × 10⁻³  0.5 (0–2.75)
(b) Z. glacier NGS
3717.8  GNP > ABW                3.4 × 10⁻²  128 (40–157.7)
3270.5  GNP > GRTE               1.7 × 10⁻³  5.7 (0.2–11.7)
2223.6  ABW > GNP                3.1 × 10⁻³  7 (4.5–12.7)
2304.4  ABW > GRTE               1.7 × 10⁻³  4 (0.1–9.1)
1306.9  GRTE > GNP               3.1 × 10⁻³  4.1 (2.3–7.9)
3333.1  GRTE > ABW               3.4 × 10⁻²  114.7 (34.5–147.5)
(c) Lednia NGS
1165.1  L. tumana > L. tetonica  3.6 × 10⁻³  4.2 (1.9–6.9)
1091.7  L. tetonica > L. tumana  3.7 × 10⁻³  4.1 (2–6.7)

TABLE 6. Fixation index (FST) values for Zapada glacier populations included in this study. Glacier National Park locations: Buttercup Park (BCP), Appistoki Creek (APC), Dry Fork (DRF), Piegan Tarn (PGT), Grinnell Lake outlet (GRL). Absaroka-Beartooth Wilderness locations: Timberline Lake (TLK), Jasper Lake (JLK), and Rock Creek (RCK). Grand Teton National Park locations: Mica Lake (MLK), South Cascade Creek (SCC), and Teton Meadows (TMW). Bolded values were not significant at P < 0.05. Mean FST = 0.216.

     APC    DRF    PGT    GRL    TLK    JLK    RCK    MLK    SCC    TMW
BCP  0.036  0.113  0.114  0.112  0.211  0.262  0.17   0.282  0.321  0.243
APC         0.111  0.137  0.104  0.217  0.257  0.202  0.358  0.359  0.317
DRF                0.119  0.129  0.219  0.258  0.178  0.284  0.298  0.249
PGT                       0.034  0.233  0.252  0.207  0.317  0.331  0.275
GRL                              0.219  0.284  0.193  0.324  0.363  0.275
TLK                                     0.076  0.028  0.335  0.337  0.301
JLK                                            0.042  0.286  0.3    0.258
RCK                                                   0.315  0.309  0.286
MLK                                                          0.017  0.018
SCC                                                                 0.007

TABLE 7. Fixation index (FST) values for Zapada glacier populations grouped by mountain range (GNP = Glacier National Park; ABW = Absaroka-Beartooth Wilderness; GRTE = Grand Teton National Park/Teton Range), and for Lednia tumana versus L. tetonica. All values significant at P < 0.05.

Location                  FST
GNP v. ABW                0.175
GNP v. GRTE               0.306
ABW v. GRTE               0.298
L. tumana v. L. tetonica  0.741

References:
Alexander, D.H., Novembre, J. & Lange, K. (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19, 1655-1664.
Ballard, J.W.O. & Whitlock, M.C. (2004) The incomplete natural history of mitochondria. Molecular Ecology, 13, 729-744.
Baumann, R.W. (1975) Revision of the stonefly family Nemouridae (Plecoptera): a study of the world fauna at the generic level. Smithsonian Contributions to Zoology, 211, 1-74.
Baumann, R.W. & Gaufin, A.R. (1971) New species of Nemoura from western North America (Plecoptera: Nemouridae). Pan-Pacific Entomologist, 47, 270-278.
Baumann, R.W. & Kondratieff, B.C. (2010) The stonefly genus Lednia in North America (Plecoptera: Nemouridae). Illiesia, 6, 315-327.
Baumann, R.W. & Call, R.G. (2012) Lednia tetonica, a new species of stonefly from Wyoming (Plecoptera: Nemouridae). Illiesia, 8, 104-110.
Baumann, R.W., Gaufin, A.R. & Surdick, R.F. (1977) The stoneflies (Plecoptera) of the Rocky Mountains. Memoirs of the American Entomological Society, 31, 1-208.
Beerli, P. & Felsenstein, J. (2001) Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proceedings of the National Academy of Sciences, 98, 4563-4568.
Beerli, P. & Palczewski, M. (2010) Unified framework to evaluate panmixia and migration direction among multiple sampling locations. Genetics, 185, 313-326.
Boratyński, Z., Alves, P.C., Berto, S., Koskela, E., Mappes, T. & Melo-Ferreira, J. (2011) Introgression of mitochondrial DNA among Myodes voles: consequences for energetics? BMC Evolutionary Biology, 11, 355.
Boumans, L. & Tierno de Figueroa, J.M. (2016) Introgression and species demarcation in western European Leuctra fusca (Linnaeus, 1758) and L. digitata Kempny, 1899 (Plecoptera: Leuctridae). Aquatic Insects, 37, 115-126.
Carstens, B.C., Pelletier, T.A., Reid, N.M. & Satler, J.D. (2013) How to fail at species delimitation. Molecular Ecology, 22, 4369-4383.
Catchen, J., Hohenlohe, P.A., Bassham, S., Amores, A. & Cresko, W.A. (2013) Stacks: an analysis tool set for population genomics. Molecular Ecology, 22, 3124-3140.
Catchen, J.M., Amores, A., Hohenlohe, P., Cresko, W. & Postlethwait, J.H. (2011) Stacks: Building and Genotyping Loci De Novo From Short-Read Sequences. G3: Genes, Genomes, Genetics, 1, 171-182.
Chifman, J. & Kubatko, L. (2014) Quartet inference from SNP data under the coalescent model. Bioinformatics, 30, 3317-3324.
Clement, M., Posada, D. & Crandall, K.A. (2000) TCS: a computer program to estimate gene genealogies. Molecular Ecology, 9, 1657-1659.
Darriba, D., Taboada, G.L., Doallo, R. & Posada, D. (2012) jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9, 772-772.
De Queiroz, K. (2007) Species concepts and species delimitation. Systematic Biology, 56, 879-886.
DeWaard, J.R., Ivanova, N.V., Hajibabaei, M. & Hebert, P.D. (2008) Assembling DNA Barcodes. Environmental Genomics (ed. by C.C. Martin), pp. 275-293. Humana Press, Totowa, New Jersey.
Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792-1797.

Elbrecht, V., Feld, C.K., Gies, M., Hering, D., Sondermann, M., Tollrian, R. & Leese, F. (2014) Genetic Diversity and Dispersal Potential of the Stonefly Dinocras cephalotes in a Central European Low Mountain Range. Freshwater Science, 33, 181-192.
Excoffier, L. & Lischer, H.E. (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567.
Feder, J.L., Egan, S.P. & Nosil, P. (2012) The genomics of speciation-with-gene-flow. Trends in Genetics, 28, 342-350.
Finn, D.S. & Adler, P.H. (2006) Population genetic structure of a rare high-elevation black fly, Metacnephia coloradensis, occupying Colorado lake outlet streams. Freshwater Biology, 51, 2240-2251.
Finn, D.S., Encalada, A.C. & Hampel, H. (2016) Genetic isolation among mountains but not between stream types in a tropical high-altitude mayfly. Freshwater Biology, 61, 702-714.
Folmer, O., Black, M., Hoeh, W., Lutz, R. & Vrijenhoek, R. (1994) DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology, 3, 294-299.
Giersch, J.J., Hotaling, S., Kovach, R.P., Jones, L.A. & Muhlfeld, C.C. (2016) Climate-induced glacier and snow loss imperils alpine stream insects. Global Change Biology, 23, 2577-2589.
Giersch, J.J., Jordan, S., Luikart, G., Jones, L.A., Hauer, F.R. & Muhlfeld, C.C. (2015) Climate-induced range contraction of a rare alpine aquatic invertebrate. Freshwater Science, 34, 53-65.
Glez-Peña, D., Gomez-Blanco, D., Reboiro-Jato, M., Fdez-Riverola, F. & Posada, D. (2010) ALTER: program-oriented conversion of DNA and protein alignments. Nucleic Acids Research, 38, W14-W18.
Gompert, Z., Forister, M.L., Fordyce, J.A. & Nice, C.C. (2008) Widespread mito-nuclear discordance with evidence for introgressive hybridization and selective sweeps in Lycaeides. Molecular Ecology, 17, 5231-5244.
Hajibabaei, M., DeWaard, J.R., Ivanova, N.V., Ratnasingham, S., Dooh, R.T., Kirk, S.L., Mackie, P.M. & Hebert, P.D.N. (2005) Critical factors for assembling a high volume of DNA barcodes. Philosophical Transactions of the Royal Society B-Biological Sciences, 360, 1959-1967.
Hall, M.H.P. & Fagre, D.B. (2003) Modeled climate-induced glacier change in Glacier National Park, 1850-2100. Bioscience, 53, 131-140.
Hansen, J., Nazarenko, L., Ruedy, R., Sato, M., Willis, J., Del Genio, A., Koch, D., Lacis, A., Lo, K., Menon, S., Novakov, T., Perlwitz, J., Russell, G., Schmidt, G.A. & Tausnev, N. (2005) Earth's energy imbalance: Confirmation and implications. Science, 308, 1431-1435.
Hebert, P.D.N., Cywinska, A. & Ball, S.L. (2003) Biological identifications through DNA barcodes. Proceedings of the Royal Society B: Biological Sciences, 270, 313-321.
Hewitt, G. (2000) The genetic legacy of the Quaternary ice ages. Nature, 405, 907-913.
Hime, P.M., Hotaling, S., Grewelle, R.E., O'Neill, E.M., Voss, S.R., Shaffer, H.B. & Weisrock, D.W. (2016) The influence of locus number and information content on species delimitation: an empirical test case in an endangered Mexican salamander. Molecular Ecology, 25, 5959-5974.

Hotaling, S., Hood, E. & Hamilton, T.L. (2017a) Microbial ecology of mountain glacier ecosystems: Biodiversity, ecological connections, and implications of a warming climate. Environmental Microbiology, DOI: 10.1111/1462-2920.13766.
Hotaling, S., Finn, D.S., Giersch, J.J., Weisrock, D.W. & Jacobsen, D. (2017b) Climate change and alpine stream biology: progress, challenges, and opportunities for the future. Biological Reviews, DOI: 10.1111/brv.12319.
Ivanova, N.V., DeWaard, J.R. & Hebert, P.D.N. (2006) An inexpensive, automation-friendly protocol for recovering high-quality DNA. Molecular Ecology Notes, 6, 998-1002.
Jackson, N.D., Morales, A.E., Carstens, B.C. & O’Meara, B.C. (2017) PHRAPL: Phylogeographic Inference Using Approximate Likelihoods. Systematic Biology, syx001.
Jombart, T. (2008) adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics, 24, 1403-1405.
Jombart, T., Devillard, S. & Balloux, F. (2010) Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics, 11, 94.
Jordan, S., Giersch, J.J., Muhlfeld, C.C., Hotaling, S., Fanning, L., Tappenbeck, T.H. & Luikart, G. (2016) Loss of genetic diversity and increased subdivision in an endemic alpine stonefly threatened by climate change. PLoS ONE, 11, e0157386.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P. & Drummond, A. (2012) Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28, 1647-1649.
Lapointe, F.-J. & Rissler, L.J. (2005) Congruence, consensus, and the comparative phylogeography of codistributed species in California. The American Naturalist, 166, 290-299.
Leigh, J.W. & Bryant, D. (2015) POPART: full-feature software for haplotype network construction. Methods in Ecology and Evolution, 6, 1110-1116.
Lourie, S., Green, D. & Vincent, A. (2005) Dispersal, habitat differences, and comparative phylogeography of Southeast Asian seahorses (Syngnathidae: Hippocampus). Molecular Ecology, 14, 1073-1094.
Martin, S.H., Dasmahapatra, K.K., Nadeau, N.J., Salazar, C., Walters, J.R., Simpson, F., Blaxter, M., Manica, A., Mallet, J. & Jiggins, C.D. (2013) Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Research, 23, 1817-1828.
Meirmans, P.G. & Van Tienderen, P.H. (2004) GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms. Molecular Ecology Notes, 4, 792-794.
Muhlfeld, C.C., Giersch, J.J., Hauer, F.R., Pederson, G.T., Luikart, G., Peterson, D.P., Downs, C.C. & Fagre, D.B. (2011) Climate change links fate of glaciers and an endemic alpine invertebrate. Climatic Change, 106, 337-345.
Nosil, P. (2008) Speciation with gene flow could be common. Molecular Ecology, 17, 2103-2106.
Nunziata, S.O., Lance, S.L., Scott, D.E., Lemmon, E.M. & Weisrock, D.W. (2017) Genomic data detect corresponding signatures of population size change on an ecological time scale in two salamander species. Molecular Ecology, 26, 1060-1074.
Nylander, J.A.A. (2004) MrModeltest v2. Evolutionary Biology Centre, Uppsala University, Sweden.

Pederson, G.T., Graumlich, L.J., Fagre, D.B., Kipfer, T. & Muhlfeld, C.C. (2010) A century of climate and ecosystem change in Western Montana: what do temperature trends portend? Climatic Change, 98, 133-154.
Peterson, B.K., Weber, J.N., Kay, E.H., Fisher, H.S. & Hoekstra, H.E. (2012) Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE, 7, e37135.
Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., De Bakker, P.I. & Daly, M.J. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 81, 559-575.
Rambaut, A. & Drummond, A. (2007) Tracer v 1.4. Program distributed by the authors. Available from beast.bio.ed.ac.uk/Tracer. The University of Edinburgh.
Roe, G.H., Baker, M.B. & Herla, F. (2016) Centennial glacier retreat as categorical evidence of regional climate change. Nature Geoscience.
Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D.L., Darling, A., Hohna, S., Larget, B., Liu, L., Suchard, M.A. & Huelsenbeck, J.P. (2012) MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology, 61, 539-542.
Satler, J.D. & Carstens, B.C. (2017) Do ecological communities disperse across biogeographic barriers as a unit? Molecular Ecology, DOI: 10.1111/mec.14137.
Swofford, D. (2015) PAUP* Version 4.0a146. Phylogenetic Analysis Using Parsimony (*and Other Methods), Sinauer Associates, Sunderland, Massachusetts.
U.S. Fish and Wildlife Service (2016) Endangered and threatened wildlife and plants; 12-month finding on a petition to list the western glacier stonefly as an endangered or threatened species; proposed threatened species status for meltwater lednian stonefly and western glacier stonefly. Federal Register, 81, 68379-68397.
Whiteman, N.K., Kimball, R.T. & Parker, P.G. (2007) Co-phylogeography and comparative population genetics of the threatened Galápagos hawk and three ectoparasite species: ecology shapes population histories within parasite communities. Molecular Ecology, 16, 4759-4773.
Wright, S. (1943) Isolation by distance. Genetics, 28, 114.
Yang, Z. (2015) The BPP program for species tree estimation and species delimitation. Current Zoology, 61, 854-865.
Yang, Z. & Rannala, B. (2010) Bayesian species delimitation using multilocus sequence data. Proceedings of the National Academy of Sciences, 107, 9264-9269.

Appendix:

APPENDIX 1. Locality information for all samples included in this study. Species names in quotation marks denote hypothesized species lineages, identified by monophyly in mtDNA gene trees and described in Giersch et al. (2015, 2016).

Species              Location                 State  Latitude   Longitude
L. sierra            Sky Meadows              CA     37.571604  -118.987500
L. borealis          Snow Lake                WA     46.757046  -121.698900
L. tetonica          Alaska Basin             WY     43.689457  -110.832700
L. tetonica          N. Fork Teton Creek      WY     43.768084  -110.861500
L. tetonica          Schoolroom Glacier       WY     43.728578  -110.844000
L. tetonica          Sunset Lake              WY     43.710189  -110.855600
L. tetonica          Teton Meadows            WY     43.725804  -110.793100
L. tetonica          Upper Paintbrush         WY     43.785213  -110.794100
L. tetonica          Wind Cave                WY     43.665728  -110.956100
L. tumana            Bearhat Mtn./Hidden Lk.  MT     48.665010  -113.749060
L. tumana            Grant Glacier            MT     48.331410  -113.736870
L. tumana            Heavens Peak             MT     48.710220  -113.842660
L. tumana            Lunch Creek              MT     48.705240  -113.704550
L. tumana            Sexton Glacier           MT     48.700300  -113.628080
L. tumana            Siyeh Bend               MT     48.711490  -113.675120
“Zapada Sexton”      Basin Lakes              MT     43.692800  -110.858310
“Zapada Sexton”      Black Butte              MT     44.902558  -111.844196
“Zapada Sexton”      Burnt Cr. Headwaters     MT     44.937410  -111.837370
“Zapada Sexton”      S. Fork Darby Creek      WY     43.683544  -110.956600
“Zapada Sexton”      S. Fork Teton Creek      WY     43.692870  -110.858540
“Zapada Sexton”      Sexton Glacier           MT     48.700330  -113.619230
“Zapada Sexton”      South Cascade Creek      WY     43.690776  -110.843355
V. cataractae        Cataract Creek           MT     48.737981  -113.699007
“Zapada WY-NM”       Alaska Basin             WY     43.692870  -110.858540
“Zapada WY-NM”       Wheeler Peak             NM     36.564893  -105.406999
Z. cinctipes         Cataract Creek           MT     48.766600  -113.698480
Z. cinctipes         Flathead River           MT     48.499740  -113.969710
Z. cinctipes         McDonald Creek           MT     48.638740  -113.864520
Z. cinctipes         Snyder Lake              MT     48.625970  -113.804710
Z. columbiana        Alaska Basin             WY     43.692870  -110.858540
Z. columbiana        Appistoki Creek          MT     48.458690  -113.353020
Z. columbiana        Cataract Creek           MT     48.766600  -113.698480
Z. columbiana        Cataract Peak            MT     48.729417  -113.685395
Z. columbiana        Iceberg Creek            MT     48.820180  -113.740120
Z. columbiana        Lower Shepard            MT     48.868380  -113.850360
Z. columbiana        Lunch Creek              MT     48.699940  -113.703670
Z. columbiana        Piegan Pass              MT     48.729412  -113.697169
Z. columbiana        Preston Park             MT     48.717380  -113.641420
Z. columbiana        Reynolds Creek           MT     48.687290  -113.733020
Z. columbiana        Sexton Glacier           MT     48.700330  -113.619230
Z. columbiana        Shadow Lake              MT     43.732504  -110.775000
Z. columbiana        Shangri-La Outlet        MT     48.809272  -113.720659
Z. columbiana        Skalkaho Pass            MT     46.256100  -113.787900
Z. columbiana        Wind Cave                WY     43.665728  -110.956100
“Z. columbiana PNW”  Blue Lake                WA     46.405750  -121.739000
“Z. columbiana PNW”  Colchuck Lake            WA     47.485133  -120.826709
“Z. columbiana PNW”  Devil's Lake             OR     44.040182  -121.775770
“Z. columbiana PNW”  Divide Camp Spring       WA     46.244180  -121.558580
“Z. columbiana PNW”  Goat Rocks               WA     46.514000  -121.474560
Z. cordillera        Cerulean Stream          MT     48.842630  -114.142440
Z. cordillera        Lake McDonald Trib.      MT     48.535890  -113.969100
Z. cordillera        North Fork               MT     48.573951  -114.014895
Z. cordillera        Upper Lost Basin         MT     48.396198  -113.417350
Z. frigida           Apikuni Creek Basin      MT     48.822250  -113.654790
Z. frigida           Iceberg Creek            MT     48.820180  -113.740120
Z. frigida           Swiftcurrent Pass        MT     48.781790  -113.758030
Z. frigida           Wilbur Creek             MT     48.800310  -113.681060
Z. glacier           Appistoki Creek          MT     48.458775  -113.348869
Z. glacier           Buttercup Park           MT     48.423732  -113.384444
Z. glacier           Delta Lake               WY     43.732504  -110.775000
Z. glacier           Dry Fork Spring          MT     48.534545  -113.380525
Z. glacier           Frosty Lake              MT     45.026079  -109.551534
Z. glacier           W. Fork Rock Creek       MT     45.096220  -109.604000
Z. glacier           Grinnell Outlet          MT     48.757364  -113.724798
Z. glacier           Jasper Lake              MT     45.023313  -109.578500
Z. glacier           Mica Lake                WY     43.785354  -110.841346
Z. glacier           Piegan Pass              MT     48.729412  -113.697169
Z. glacier           South Cascade Creek      WY     43.728490  -110.837297
Z. glacier           Teton Meadows            WY     43.725912  -110.790375
Z. glacier           Timberline Lake          MT     45.132528  -109.507700
Z. haysi             Appistoki Creek          MT     48.462469  -113.343448
Z. haysi             Black Butte              MT     44.902558  -111.844196
Z. haysi             Burnt Creek              MT     44.937410  -111.837370
Z. haysi             Cataract Creek           MT     48.766600  -113.698480
Z. haysi             Clements Creek           MT     48.688130  -113.729350
Z. haysi             Delta Lake               WY     43.732504  -110.775000
Z. haysi             Grinnell Outlet          MT     48.764580  -113.714790
Z. haysi             Iceberg Creek            MT     48.817810  -113.743710
Z. haysi             Lower Shepard            MT     48.871030  -113.850360
Z. haysi             N. Fork Teton Creek      WY     43.770831  -110.861436
Z. haysi             Ole Creek                MT     48.384313  -113.390840
Z. haysi             Ptarmigan Creek          MT     48.841590  -113.711820
Z. haysi             Reynolds Creek           MT     48.688760  -113.723580
Z. haysi             S. Fork Darby Creek      WY     43.683544  -110.956600
Z. haysi             Sexton Glacier           MT     48.700330  -113.619230
Z. haysi             Tumalo Creek             OR     44.073151  -121.382885
Z. oregonensis       Grinnell Outlet          MT     48.759100  -113.724820
Z. oregonensis       Iceberg Creek            MT     48.821240  -113.737830
Z. oregonensis       Lower Shephard           MT     48.871030  -113.850360
Z. oregonensis       Mill Creek               MT     45.515320  -111.990370
Z. oregonensis       N. Fork Teton Creek      WY     43.770831  -110.861436
Z. oregonensis       Shangri-La Outlet        MT     48.809272  -113.720659
Z. oregonensis       Siyeh Creek              MT     48.704200  -113.668950
Z. oregonensis       Skalkaho Pass            MT     46.266100  -113.765600
“Z. oregonensis WA”  Goat Creek               WA     46.467100  -121.513480
