MAPPING THE HYPOTELMINORHEIC HABITAT BY STUDYING THE POPULATION

STRUCTURE OF AMPHIPODS IN SEEPS

By

Karen Kavanaugh

Submitted to the

Faculty of the College of Arts and Sciences

of American University

in Partial Fulfillment of

the Requirements for the Degree of

Master of Science

In

Dean of the College of Arts and Sciences

2009

American University

Washington, D.C. 20016

MAPPING THE HYPOTELMINORHEIC HABITAT BY STUDYING THE

POPULATION STRUCTURE OF AMPHIPODS IN SEEPS

BY

Karen Kavanaugh

ABSTRACT

The hypotelminorheic is a type of perched aquifer with an unknown geographic extent. Depending on the local topography, groundwater from one hypotelminorheic may flow to the surface forming a single seep, or to several surface locations forming multiple seeps. In order to infer the boundaries of the hypotelminorheic, I analyzed the population structure of a subterranean seep amphipod species. I analyzed a 628 bp region of mtDNA corresponding to the cytochrome c oxidase subunit I gene from 118 specimens of Stygobromus tenuis potomacus, collected from 9 seeps along the George Washington Memorial Parkway. Pairwise comparisons among sites (uncorrected "p" = 0.7-12.7%) and the nested clade phylogeographic analysis suggested that hypotelminorheic habitats are fragmented. However, population subdivision (within-site uncorrected "p" = 0-13.06%) found at many of the sites suggests that hypotelminorheic habitats have a dynamic extent that fluctuates with the water table, forming temporary corridors between hypotelminorheics.

ACKNOWLEDGEMENTS

Funding for this project was provided by American University and the U.S. National Park Service. Many thanks to my advisor, Dr. Dan Fong, and my committee, Dr. Dave Carlini and Dr. Dave Culver, for their patience, guidance, and support. Thanks to Ben Hutchins, who mentored me both in the field and in the genetics laboratory.

Special thanks to Katie O'Neill and Bryan Adams for providing me with their friendship and much-needed help collecting samples in the field.

TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF ILLUSTRATIONS

Chapter

1. INTRODUCTION

The Hypotelminorheic Habitat

Amphipods

COI: A Useful Molecular Marker for Studying Population Structure

Studying Gene Flow

Population Structure: The Island Model, Isolation by Distance, and Habitat Fragmentation

Hypothesis and Predictions

2. METHODS

Sample Collection

DNA Extraction

PCR

Gel Purification of the COI Gene

Gene Sequencing

Analyses

Sequence Characterization

Phylogenetic Analysis

Population Structure

Nested Clade Phylogeographic Analysis

3. RESULTS

Collection and Sequencing

Sequence Characterization

Phylogenetic Analysis

Population Structure

Nested Clade Phylogeographic Analysis

4. DISCUSSION

Lack of an Overall Isolation by Distance Pattern Among the Ingroup Seeps Suggests that the Hypotelminorheic is Discontinuous

Evidence that Hypotelminorheic Habitats are Fragmented

The Extent of the Hypotelminorheic is Dynamic

Implications for the Conservation of Rare Species of in Seeps

Conclusions

APPENDIX

REFERENCES

LIST OF TABLES

1. Number of Specimens Collected and Sequenced from Each Site

2. Number of Sequences from Each Site Where Stop Codons Were Present in All 6 Reading Frames

3. Number of Sequences from Each Site that Were Used in the Analyses

4. Results of the Tajima's D Test for Selective Neutrality

5. Distribution of Haplotypes

6. Average Uncorrected Percent Pairwise Distances Within Each S. tenuis potomacus Site, Within Each Cluster, and Among Clusters

7. Average Uncorrected Percent Pairwise Distances Within Outgroup Sites and Between Outgroup and Ingroup Sites

8. Average Uncorrected Percent Pairwise Distances Between Each Site

9. Results of an AMOVA Constructed for S. tenuis potomacus

A1. Significant Results of the Nested Clade Phylogeographic Analysis

LIST OF ILLUSTRATIONS

1. Map of Collection Sites for Stygobromus tenuis potomacus, S. pizzinii, and Crangonyx shoemakeri

2. Map of S. pizzinii Collection Sites in C&O Geographic Cluster

3. Map of S. tenuis potomacus Collection Sites in Northern GWMP Cluster

4. Map of S. tenuis potomacus Collection Sites in Central GWMP Cluster

5. Map of S. tenuis potomacus Collection Sites in Southern GWMP Cluster

6. Map of the Geographic Distribution of Haplotypes Shared Among Multiple S. tenuis potomacus Sites

7. Phylogram of S. tenuis potomacus Estimated Using the Maximum Likelihood Criterion with C. shoemakeri as the Root of the Tree

8. Phylogram of S. tenuis potomacus Estimated Using the Maximum Likelihood Criterion with S. pizzinii as the Root of the Tree

9. Graph of FST Values, an Estimate of Genetic Diversity, vs. the Geographic Distance Between Sites

10. Unrooted Haplotype Network Estimated for S. tenuis potomacus Using the Maximum Parsimony Criterion

CHAPTER 1

INTRODUCTION

The Hypotelminorheic Habitat

In 1962, the Croatian biologist Milan Mestrov defined a new freshwater habitat that he called the hypotelminorheic (Mestrov 1962). The hypotelminorheic is a shallow, subterranean aquifer that is perched above a water-impermeable clay layer, usually in close proximity to the surface. Where the water table of the hypotelminorheic intersects the ground surface in an area with a slight depression or gradual slope, groundwater from the hypotelminorheic exits to the surface, forming a seep (Culver, Pipan, and Gottstein 2006). Depending on the local topography, the groundwater from one hypotelminorheic habitat may exit to the surface at a single seep, or at several surface locations as multiple seeps. In this scenario, a seep is a point of exit of the hypotelminorheic groundwater to the surface.

The surface habitat associated with a seep may be a localized wet spot or a shallow pool if the seep is situated in a depression, or a small rill if the seep is located on a slope. Depending on the extent of seasonal or other temporal fluctuation of the water table in the hypotelminorheic, a particular seep and its associated surface habitat may persist year-round or may dry up and cease to exist for a period, although the habitat, especially the clay layer, retains moisture. Mestrov (1962) originally observed that the dominant species found in seeps associated with the hypotelminorheic habitat shared troglomorphic characteristics with subterranean organisms usually found in caves, such as reduced or absent pigment and eyes, elongated appendages, and enhanced extraoptic sensory structures.

In a recent study, Culver et al. (2006) compared the characteristics of a group of seeps in three countries: in the George Washington Memorial Parkway in the United States, on Medvednica Mountain in Croatia, and on Nanos Mountain in Slovenia. They found that the seeps were very similar in physical and chemical properties, as well as in their fauna. There were six physical features shared by sites with fauna that are groundwater obligates, or stygobionts. The hypotelminorheic and its associated seep habitat had a clay layer, approximately 5 to 50 cm beneath the surface, which allowed the seep to exist as a wet spot that persisted through time. In addition, at each site, the subsurface water of the hypotelminorheic exited into a depression or a gentle slope and seeped through to an area with a small to medium-sized slope. Furthermore, the drainage area of the hypotelminorheic was estimated to be 10,000 m² or less. Finally, the associated seep habitat consistently provided a plentiful source of organic matter and was characterized by a dark color resulting from the presence of decaying leaves that were not skeletonized (Culver, Pipan, and Gottstein 2006).

The fauna of each of these seeps was dominated by species of amphipod. The study areas in each of the three countries hosted at least two stygobiotic amphipod species. The temperature of the water from the hypotelminorheic sites sampled was close to the long-term means for the area. This may make hypotelminorheic habitats suitable for stygobiotic organisms, which are known to require a stable habitat similar to the groundwater in subterranean habitats. In the U.S. locations, seeps inhabited by stygobiotic amphipods in the genus Stygobromus had a lower temperature, higher conductivity, higher dissolved oxygen, lower pH, and lower nitrate levels than seeps without Stygobromus (Culver, Pipan, and Gottstein 2006).

The hypotelminorheic habitat shares some common features with other superficial subterranean habitats such as epikarst and talus slopes (milieu souterrain superficiel). Although all of these habitats lack light, which may serve as a barrier to colonization, there are advantages to being able to adapt to life without light. The stable, predator-free environment of the hypotelminorheic habitat may provide refuge for subterranean amphipods. Although subterranean environments are usually nutrient poor, amphipods in a hypotelminorheic may migrate to the surface habitat associated with its seeps because that habitat is rich in organic matter (Culver and Pipan 2008).

A recent mark-recapture study provided evidence that subterranean amphipods undergo directed migrations between hypotelminorheic habitats and the associated seep at the surface. Sixteen amphipods were collected from a seep in the George Washington Memorial Parkway, near Chain Bridge, in Virginia. The specimens were marked with insoluble red dye and returned to the collection site. Each week the same area of the seep was monitored for 15 minutes, and all amphipods were counted and examined for markings. Although over 20 amphipods were counted each week, only one marked amphipod was recaptured, four weeks after its initial release. This demonstrates that seep amphipods actively move back and forth between the hypotelminorheic and its associated surface habitat (Kavanaugh and Fong unpublished data).

While the mark-recapture data illustrate that amphipods actively move where there are no barriers to dispersal, the true geographic extent of the hypotelminorheic habitat is unknown. Previous study has shown that the ranges of some hypotelminorheic species seem to be especially restricted. In the lower Potomac River drainage, in the Washington, D.C. area, and S. kenki are known from fewer than five seeps. The ranges of these species seem to be less than 5 km in linear extent (Culver, Pipan, and Gottstein 2006). The small ranges of these particular amphipod species make them vulnerable to extirpation due to human activities or stochastic events.

Krejca (2005) used the evolutionary patterns of stygobiotic isopods as a biological tool to infer how aquifers in central Texas and northern Mexico evolved through hydrogeological processes. In another study, Hutchins (2007) examined the population structure of a federally threatened species of stygobiotic isopod that inhabits the phreatic waters of the Shenandoah Valley to infer the presence of physical barriers to its dispersal. Similarly, although on a smaller scale, studying the gene flow of aquatic subterranean amphipods that inhabit seeps may reveal the extent of hypotelminorheic habitats. Studying the evolution of gene flow in stygobiotic amphipods may allow inferences about whether the groundwater from a given hypotelminorheic habitat feeds into one isolated seep or multiple connected seeps.

Amphipods

An especially high number of subterranean amphipod species are endemic to the eastern United States (Vainola et al. 2008). In North America, the highest concentration of amphipods in subterranean habitats is found in the Coastal Plain of the Potomac River (Culver, Pipan, and Gottstein 2006). This makes the Washington, D.C. area distinct from anywhere else on the continent with regard to amphipod species diversity.

Subterranean amphipods are well-represented in karst areas (Hutchins and Culver 2007). Yet, non-karstic areas, such as interstitial habitats, also host a rich array of species. Although it has not been the subject of much research, it has been suggested that the seep is an interstitial habitat that hosts a great diversity of subterranean fauna (Collier and Smith 2006; Culver, Pipan, and Gottstein 2006). Hypotelminorheic habitats are unique in their closeness to the surface. This provides access to an unusually high amount of organic material for a groundwater habitat (Culver, Pipan, and Gottstein 2006). As a result, seeps may be an ideal habitat for subterranean amphipods whose main nutrient source is organic debris (Vainola et al. 2008).

All of the over 200 species in the genus Stygobromus are obligate subterranean organisms (Holsinger 1986). Seven species of Stygobromus have been identified in the Coastal Plain of the Potomac River. S. hayi, a federally endangered species, is found in only a few seeps or seep-like springs in Rock Creek Park and a spring in the National Zoological Park in Washington, D.C. The U.S. Fish and Wildlife Service has recently been petitioned to classify Stygobromus kenki as endangered because it is found in just four seeps on the east side of Rock Creek (Culver, Pipan, and Gottstein 2006).

My study compared the sequence divergence of Stygobromus tenuis potomacus within and among sites in order to infer the extent of dispersal among seeps and the extent of the hypotelminorheic habitats where the species resides. S. tenuis potomacus was chosen as the subject of the study because it is a common inhabitant of seeps in the George Washington Memorial Parkway (GWMP) along the Virginia shoreline of the Potomac River. It is the most widely distributed stygobiont in the lower Potomac drainage. S. tenuis potomacus primarily inhabits seeps, but is also known from shallow wells and springs (Hutchins and Culver 2007).

For comparison, I also analyzed the sequences of two other amphipod species: Stygobromus pizzinii and Crangonyx shoemakeri. S. pizzinii is a subterranean amphipod that occurs in seeps in the Chesapeake and Ohio Canal National Historical Park (C&O) along the Maryland shoreline of the Potomac River. S. pizzinii is less common and more patchily distributed than S. tenuis potomacus. The species is known to inhabit seeps, shallow wells, springs, and caves (Hutchins and Culver 2007). S. pizzinii served as an outgroup for analyzing the genetic divergence of the COI sequences of S. tenuis potomacus. Samples of C. shoemakeri collected in seeps alongside S. tenuis potomacus (in the GWMP) and S. pizzinii (in the C&O) were also sequenced to be used as an outgroup to increase the resolution of the phylogenetic analyses. Crangonyx and Stygobromus are members of the same taxonomic family, Crangonyctidae. C. shoemakeri is a stygophile, a species that is common in, but not limited to, groundwater habitats. C. shoemakeri is the most common species of Crangonyx found in seeps, but also inhabits springs, bogs, ponds, small streams, and temporary pools (Hutchins and Culver 2007).

COI: A Useful Molecular Marker for Studying Population Structure

In order to study the population structure of S. tenuis potomacus to infer the extent of hypotelminorheic habitats along the GWMP, the mitochondrial DNA gene that codes for cytochrome c oxidase subunit I (COI) was chosen as the molecular marker for comparison. In the last thirty years, numerous studies of population genetics have used mitochondrial DNA (mtDNA). This is because of a number of characteristics of animal mtDNA that make it a practical molecular marker for such studies. Similar mtDNA genes are present in a wide variety of species, which allows for interesting comparisons. MtDNA is easy to isolate and assay. It also usually lacks many of the features that complicate nuclear DNA analysis, such as transposable elements, pseudogenes, and repetitive regions of DNA. Furthermore, mtDNA does not usually undergo recombination or other genetic rearrangements because it is maternally inherited. This is important because genetic rearrangements could potentially confound estimates of the rate of genetic variation beyond what was due to the rate of random mutations. MtDNA also evolves at a relatively rapid rate so that new character states may develop within a short time period (Avise et al. 1987).

In order to compare the genetic variation that exists between populations, it is important to analyze a selectively neutral marker that accumulates random mutations at a steady rate. The majority of the mutations in the mitochondrial genome are selectively neutral, consisting of silent mutations or additions or deletions of bases in the untranscribed D-loop region (Avise et al. 1987). For this project, I chose to use a fragment of the mtDNA COI gene because it is one of the most commonly used molecular markers for population studies involving amphipods and other invertebrates. COI is one of the most highly conserved protein-coding genes in the mitochondrial genome of animals. This makes it ideal for phylogenetic studies seeking to compare diverse organisms. Folmer et al. (1994) developed universal DNA primers for the amplification of a fragment of the COI gene to enable phylogenetic studies of newly discovered marine invertebrates.

Since then, COI has become one of the main tools of DNA barcoding projects because it is easy to recover and provides good taxonomic resolution. Hebert et al. (2004) sampled the COI sequences of 260 North American bird species and demonstrated that it is easy to distinguish between taxonomic species based on the COI sequence. The authors found that the average differences between closely related species were 18 times higher than the differences within species (Hebert et al. 2004). When combined with traditional taxonomic practices, DNA barcoding has the potential to help identify unknown individuals to species and enhance the discovery of new species (Moritz and Cicero 2004). Analysis of COI is also useful for phylogeographic studies because it shows variation even among populations that have only been separated relatively recently on the evolutionary timeline (Hebert et al. 2004).
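The within-group and between-group sequence differences discussed above are typically summarized as uncorrected "p" distances, the proportion of aligned nucleotide sites at which two sequences differ. The following is a minimal sketch of that calculation, not the software used in this study. It assumes the sequences are already aligned to equal length and that sites with a gap or ambiguous base in either sequence are skipped; the function names are hypothetical.

```python
from itertools import combinations, product

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: proportion of compared sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: skip sites with a gap or ambiguity code in either sequence.
        if a not in VALID_BASES or b not in VALID_BASES:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within(seqs):
    """Average p-distance over all pairs of sequences from one group."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(seqs_x, seqs_y):
    """Average p-distance over all pairs drawn from two different groups."""
    pairs = list(product(seqs_x, seqs_y))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

Multiplying the returned proportion by 100 gives the percent divergence in the form reported for COI comparisons.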

Studying Gene Flow

In order to infer the extent of the hypotelminorheic habitat, this study examined patterns of gene flow between populations of S. tenuis potomacus collected from seeps along the GWMP. Gene flow includes the various mechanisms that result in the movement of one set of genes from one population to another. These mechanisms include the migration of individuals to a new population or the extinction and recolonization of populations. Gene flow limits the genetic divergence of a local population. However, the degree to which divergence is limited depends on the selection pressures acting on the population and the number of migrants from the source population (Slatkin 1985).

Migration is one of the most important processes that influence population and community dynamics. Animal movement patterns are difficult to observe. Studying spatial population structure is important for understanding major threats to flora and fauna such as habitat fragmentation and habitat loss (Leblois, Estoup, and Streiff 2006). It is also essential for understanding evolutionary concepts like allopatric speciation (Kelly, MacIsaac, and Heath 2006). Analyzing spatial population genetic structure can be a vital tool when the movements of organisms lead to gene flow. Selectively neutral genetic markers, such as COI, can be used to compare the relative levels of dispersal within and among spatially structured groups of populations. Phylogeographic principles can be used to identify barriers to dispersal in the landscape (Finn et al. 2006).

Population Structure: The Island Model, Isolation by Distance, and Habitat Fragmentation

Population structure results from various barriers to dispersal that restrict gene flow. The pattern of population structure detected in the specimens of S. tenuis potomacus examined illustrates the path of gene flow and/or restrictions to gene flow within the hypotelminorheic habitat. From this, the extent of the habitat can be inferred.

According to Sewall Wright (1943) there are two basic ways of understanding population structure: the island model and the isolation by distance model. The simplest model is the island model. As is roughly the case in a group of islands, in this model the total population is divided into subgroups. Random mating occurs within each of these subgroups. There is also gene flow between subgroups via migrants that are drawn randomly from the total population (Wright 1943). A slightly more complicated version of this is the hierarchical island model. Under this model, the total population is still divided into subgroups, in this case, called neighborhoods. However, significantly higher gene flow is said to occur within each neighborhood than between neighborhoods (Slatkin and Voelm 1991).

Sometimes a total population is completely contiguous, but individuals only have the ability to disperse over short distances. This means that interbreeding may be restricted to small ranges (Wright 1943). According to the isolation by distance model, individuals born in close proximity to each other are expected to have a higher probability of mating with each other than with individuals further away (Leblois, Estoup, and Streiff 2006). As a result, remote regions of the population may become differentiated because of isolation by distance. The model is complicated by factors such as mutation and selection. Selection may help to increase such differentiation since an allele may have different selection pressures acting upon it under different local conditions (Wright 1943). Wright's mathematical isolation by distance model is supported by a number of simulations, as well as field studies (Finn et al. 2006; Kelly, MacIsaac, and Heath 2006; Slatkin 1993).
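Under isolation by distance, pairwise genetic distance is expected to rise with pairwise geographic distance. One common way to check for this pattern is a Mantel-type permutation test, which correlates the two distance matrices and then shuffles the site labels to judge significance. The sketch below illustrates the idea only and is not necessarily the analysis used in this study. The function names, the default of 999 permutations, and the one-sided test are assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, p) for two square, symmetric distance matrices.

    r is the correlation over all site pairs; p is the one-sided proportion
    of label permutations with a correlation at least as large as observed.
    """
    n = len(genetic)
    pairs = list(combinations(range(n), 2))
    geo = [geographic[i][j] for i, j in pairs]
    observed = pearson([genetic[i][j] for i, j in pairs], geo)

    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # relabel sites in the genetic matrix only
        permuted = [genetic[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo) >= observed - 1e-12:
            hits += 1
    return observed, hits / (permutations + 1)
```

A positive correlation with a small p-value would be consistent with isolation by distance, while the absence of one across all sites would be consistent with a discontinuous habitat.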

Aquatic habitats have island-like characteristics because many freshwater invertebrates are not able to disperse over land. Terrestrial barriers may restrict dispersal even when organisms have a non-aquatic stage in their life cycle. This is the case in Prosimulium neomacropyga, a species of black fly native to alpine tundra streams in the U.S. Southern Rockies. Most of this species' life cycle is spent in an aquatic juvenile stage; individuals emerge briefly as winged adults to find a mate. The female then lays her eggs in a stream and the cycle repeats. The aquatic range of P. neomacropyga is restricted to the headwaters of the streams it inhabits. Finn et al. (2006) found evidence of very limited gene flow between the streams studied, and a strong effect of isolation by distance within each stream. Weaker, but significant support for isolation by distance between each stream was also found, which provides evidence for dispersal at a broader level, as well (Finn et al. 2006).

The dispersal of freshwater amphipods may be even more limited because their entire life cycle is aquatic, without a dispersal stage. In many amphipod species this may result in genetic divergence, even when there is no apparent morphological differentiation. Witt and Hebert (2000) found this to be the case in Hyalella azteca, a widespread species of aquatic amphipod found from the Atlantic to the Pacific and from Panama to beyond the Arctic Circle. Hyalella azteca inhabits ponds, streams, and lakes in central glaciated North America, but demonstrates no morphological variation. Witt and Hebert (2000) found evidence from analysis of both allozymes and the mitochondrial cytochrome c oxidase I gene (COI) that there was significant genetic divergence among the populations of Hyalella azteca sampled, with pairwise COI nucleotide sequence divergence between haplotypes in different clusters ranging from 8.7 to 27.6%. The authors concluded that Hyalella azteca is a cryptic species complex (Witt and Hebert 2000). Further analysis by Witt et al. (2006) used a species screening threshold that considered the species definition to be sequence divergence greater than 10 times the average intrapopulation COI haplotype divergence, 0.375%. Although the authors considered this to be a conservative threshold, the results of the study suggested that Hyalella azteca is composed of at least 33 provisional species with sequence divergences ranging between 4.4 and 29.9% (Witt, Threloff, and Hebert 2006).
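The species screening threshold described above is simple arithmetic. Reading 0.375% as the average intrapopulation divergence (an interpretation of the sentence above, not a value checked against the original paper), ten times that figure gives a threshold of 3.75%, which the reported 4.4 to 29.9% divergences all exceed. A minimal sketch, with hypothetical function names:

```python
def screening_threshold(mean_intrapop_divergence_pct, multiplier=10.0):
    """Threshold = multiplier x average intrapopulation divergence (in %)."""
    return multiplier * mean_intrapop_divergence_pct


def exceeds_threshold(divergence_pct, threshold_pct):
    """True when a pairwise divergence is above the screening threshold."""
    return divergence_pct > threshold_pct


# Assumption: 0.375% is the average intrapopulation COI divergence.
threshold = screening_threshold(0.375)
reported_range = (4.4, 29.9)  # divergences among provisional species, in %
all_above = all(exceeds_threshold(d, threshold) for d in reported_range)
```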

Gene flow is inhibited by more than just distance and limited dispersal ability. Some habitats, such as mountaintops, caves, and streams, are naturally isolated from one another. Natural populations can also become fragmented due to events such as glaciation or human activities and changes to the land (Leblois, Estoup, and Streiff 2006; Templeton, Routman, and Phillips 1995). Physical barriers may separate a once continuous population into fragments. This allopatric fragmentation eliminates gene flow between fragmented populations and increases genetic drift. As a result, the fragmented populations also may accumulate different mutations (Templeton, Routman, and Phillips 1995).

Hogg et al. (2006) attributed the formation of a cryptic species complex in a species of amphipod to historical fragmentation. The authors collected samples of the amphipod Paracalliope fluviatilis from 14 freshwater fluvial habitats on both the North and South Islands of New Zealand. Despite the fact that analysis showed that the individuals were morphologically identical, the authors found considerable genetic differentiation among 28 geographic locations. Data from the analysis of eight allozyme loci and COI sequences resulted in an average Wright's FST value that exceeded 0.68 (p < 0.001). This suggested a great amount of genetic differentiation among the sites sampled, and minimal levels of gene flow at present. COI sequence divergences were between 12 and 26%. From this, the authors suggested that P. fluviatilis is actually a cryptic species complex consisting of four or more genetically distinct species (Hogg et al. 2006).

Sometimes the phylogeographic patterns that have resulted from historical fragmentation appear similar to those reflected by isolation by distance. This emphasizes the need for adequate geographic sampling from intermediate, as well as distant, sampling locales, to help distinguish between patterns of fragmentation and limited gene dispersal (Templeton, Routman, and Phillips 1995). It is important to incorporate information about the geographic distribution of haplotypes into the genetic analysis. For example, in populations that are isolated by distance, the outskirts of the population's geographic range are more geographically restricted than the interiors of that range. In addition, the haplotypes present in the outskirts of the range are usually scattered throughout the interior range of the population. Some statistics traditionally used to estimate population structure do not make use of this type of spatial information. For example, F statistics, which measure the amount of heterozygosity in a population, are commonly used in population genetics. While they rely on measures of allele frequencies, F statistics do not incorporate geographic data into the analysis.

In a 1998 paper, Templeton makes a strong case for the analysis of haplotype trees, in addition to the use of F statistics and other statistics used for studying population structure. Templeton describes a case study where the analysis of the fixation index (FST) alone led to a very different conclusion than the analysis of the haplotype network. FST is a comparison of the genetic variability within and between populations. In a phylogenetic study, the FST values of buffalo and impala were not found to differ significantly from each other. From these data alone, it seemed that both species had undergone similar evolutionary processes and experienced comparable rates of gene flow between the populations sampled. However, when Templeton and Georgiadis (1995) constructed a haplotype network, they found evidence that the two species had undergone very different evolutionary processes. The spatial/temporal pattern indicated the presence of recurring gene flow between all of the buffalo populations sampled. Meanwhile, the pattern exhibited by the impalas sampled in the same locations provided evidence of either a past fragmentation event or the presence of isolation by distance (Templeton 1998). This emphasizes the usefulness of haplotype tree analysis.
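To make concrete how FST compares variability within and between populations, the sketch below computes a simple frequency-based version, FST = (HT - HS) / HT, where HS is the mean gene diversity within populations and HT is the gene diversity of the pooled population. This is only one of several estimators and is an assumption for illustration. It is not necessarily the estimator used in this study, and it weights populations equally regardless of sample size. Note that it uses haplotype frequencies alone, with no geographic information.

```python
from collections import Counter


def haplotype_freqs(haplotypes):
    """Relative frequency of each haplotype label in one population sample."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return {hap: count / n for hap, count in counts.items()}


def gene_diversity(freqs):
    """Gene diversity H = 1 - sum of squared haplotype frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst(populations):
    """Frequency-based FST = (HT - HS) / HT with equal population weights."""
    freqs = [haplotype_freqs(pop) for pop in populations]
    # HS: average diversity within populations.
    h_s = sum(gene_diversity(f) for f in freqs) / len(freqs)
    # HT: diversity of the pooled population, using mean frequencies.
    all_haps = set().union(*freqs)
    mean_freqs = {
        hap: sum(f.get(hap, 0.0) for f in freqs) / len(freqs) for hap in all_haps
    }
    h_t = gene_diversity(mean_freqs)
    if h_t == 0:
        return 0.0  # no variation at all, so no differentiation to measure
    return (h_t - h_s) / h_t
```

Two populations fixed for different haplotypes give a value of 1, and two populations with identical haplotype frequencies give 0.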

The analysis of haplotype trees can be coupled with significance testing to increase the objectivity of inferences about the historical processes that influenced the observed distribution of haplotypes. In this study, I chose to use nested clade phylogeographic analysis (NCPA), in addition to traditional population structure statistics, including FST. Templeton et al. (1995) proposed this nested design approach to interpreting phylogeographic data. NCPA makes use of geographic data by calculating two main statistics. The clade distance, Dc, measures the geographic range of a given clade. The nested clade distance, Dn, measures the geographic distribution of a given clade compared to its closest evolutionary sister clades. Statistically significant patterns are interpreted biologically to determine the role of restricted gene flow, past fragmentation events, or range expansion (Templeton, Routman, and Phillips 1995).
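The two NCPA distances can be illustrated with a short sketch. In one common formulation, Dc is the average distance of individuals in a clade from that clade's geographic center, and Dn is their average distance from the geographic center of the nesting clade, which contains the clade and its sister clades. The code below assumes that formulation, uses flat x, y coordinates in kilometers rather than latitude and longitude, and omits the permutation test for significance. The function names are hypothetical.

```python
from math import hypot


def center(points):
    """Geographic center (mean x, mean y) of a list of (x, y) points in km."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mean_distance(points, origin):
    """Average straight-line distance from each point to origin."""
    ox, oy = origin
    return sum(hypot(x - ox, y - oy) for x, y in points) / len(points)


def clade_distance(clade_points):
    """Dc: how far a clade's individuals are spread around their own center."""
    return mean_distance(clade_points, center(clade_points))


def nested_clade_distance(clade_points, nesting_points):
    """Dn: how far a clade's individuals lie from the nesting clade's center.

    nesting_points holds one (x, y) entry per individual in the nesting
    clade, including the individuals of the clade itself.
    """
    return mean_distance(clade_points, center(nesting_points))
```

For example, a clade sampled at (0, 0) and (2, 0) whose sister clade was sampled at (10, 0) and (12, 0) has a small Dc but a much larger Dn, meaning a narrow range that sits far from the center of the nesting clade.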

Hypothesis and Predictions

This project seeks to use a species of amphipod, S. tenuis potomacus, as a biological indicator of the geographic extent of the hypotelminorheic habitat. Two closely related amphipod species, S. pizzinii and C. shoemakeri, were compared as outgroups. A sequence of mtDNA commonly used in population genetics studies, COI, was analyzed for each of the amphipod species. I hypothesized that the S. tenuis potomacus specimens collected from seeps along the GWMP are not part of one contiguous population and that their population structure would reflect existing barriers to gene flow. Even if the source of the seeps along the GWMP is one continuous habitat, as a result of the limited dispersal of amphipod species, I expected to observe a pattern characteristic of isolation by distance where genetic divergence among sequences increases with increasing geographic distance. The hypotelminorheic may be discontinuous, like other subterranean habitats. If so, I expected the observed distribution of haplotypes to exhibit a pattern characteristic of habitat fragmentation.

CHAPTER 2

METHODS

Sample Collection

The general distribution of seeps in the Washington, D.C. area has been demonstrated by previous research (Hutchins and Culver 2007). I chose to focus the majority of my sampling efforts within the confines of the George Washington Memorial

Parkway (GWMP) for two main reasons. First, the GWMP is a national park that encompasses a large geographic range with a north to south orientation. Second, seeps inhabited by Stygobromus tenuis potomacus are relatively common within the GWMP across the park's geographic area.

I chose to divide the known distribution of seeps within the GWMP into three geographic clusters: a northern, central, and southern cluster (see Figures 1, 3-5). Each cluster consisted of three seeps known to contain S. tenuis potomacus. Samples of

Crangonyx shoemakeri were also collected when encountered. Furthermore, samples of

Stygobromus pizzinii and additional samples of C. shoemakeri were collected from another geographic cluster consisting of three seeps located in the Chesapeake & Ohio

Canal National Historical Park (C&O) (see Figures 1 and 2). Specimens of S. tenuis potomacus collected from a seep in Manassas, VA were also included in this study. The

Manassas site is located over 30 km from the GWMP sites, and the Manassas population served as the geographic outgroup for the study (see Figure 1).


The goal was to collect 20 S. tenuis potomacus specimens from each seep. Similar studies have demonstrated that this sample size is large enough to capture the variation present in a population (Hutchins 2007). It was also feasible to collect this many animals at most of the sites, though return trips were usually necessary.

All amphipods were collected via visual searches that lasted for approximately 30-60 minutes at each site. Specimens were hand collected using a turkey baster or spoon to minimize damage to the habitat. C. shoemakeri, S. tenuis potomacus, and S. pizzinii were identified on site.

[Figure 1: map of Maryland and Virginia. Legend: geographic clusters North, Central, South, MNSS, SP-COL7, and SP-OAI. Labeled distances: 34.4 km, 22.3 km, 5.5 km, and 43.4 km. Scale bar in kilometers.]

Figure 1. Map of Collection Sites for Stygobromus tenuis potomacus, S. pizzinii, and Crangonyx shoemakeri. This map depicts the five geographic clusters of seeps where specimens were collected: S. tenuis potomacus: Northern (N), Central (C), Southern (S), and Manassas (MNSS); S. pizzinii: C&O Canal Lock 7 (COL7) and OAI; C. shoemakeri: COL7, Northern, and Central. The estimated distances between each cluster are shown to the nearest tenth of a kilometer (Gis data 2009; Whitler 2009).

[Figure 2: map of the C&O Canal seeps in Maryland. Labeled distance: 0.1 km. Scale bar in kilometers.]

Figure 2. Map of S. pizzinii Collection Sites in C&O Geographic Cluster. Distances between the seeps are shown to the nearest tenth of a kilometer (Gis data 2009; Whitler 2009).

[Figure 3: map of the Northern GWMP seeps along the Maryland and Virginia border. Labeled distance: 8.4 km. Scale bar in kilometers.]

Figure 3. Map of S. tenuis potomacus Collection Sites in Northern GWMP Cluster. Distances between the seeps are shown to the nearest tenth of a kilometer (Gis data 2009; Whitler 2009).

[Figure 4: map of the Central GWMP seeps along the Potomac River. Labeled distances: 0.5 km, 0.7 km, and 0.3 km. Scale bar in kilometers.]

Figure 4. Map of S. tenuis potomacus Collection Sites in Central GWMP Cluster. Distances between the seeps are shown to the nearest tenth of a kilometer (Gis data 2009; Whitler 2009).

[Figure 5: map of the Southern GWMP seeps along the Potomac River. Labeled distances: 0.4 km, 0.3 km, and 0.4 km. Scale bar in kilometers.]

Figure 5. Map of S. tenuis potomacus Collection Sites in Southern GWMP Cluster. Distances between the seeps are shown to the nearest tenth of a kilometer (Gis data 2009; Whitler 2009).

When Stygobromus specimens were collected in areas potentially inhabited by both S.

tenuis potomacus and S. pizzinii (Northern cluster and C&O sites), the identity of each

specimen was confirmed in the lab using a dissecting scope. Once in the lab, specimens

were placed in 100% ethanol.

DNA Extraction

Total DNA was extracted from the abdomen of each amphipod specimen. A region

of tissue, approximately 1 cm long, was removed using a razor blade and forceps, which were

washed with Alconox and flame sterilized. DNA was extracted using DNeasy® Tissue

Kits (Qiagen), using the included protocol entitled "Purification of Total DNA from

Animal Tissues" with the following modifications as described by Hutchins (2007) and

Carlini et al. (2009):

Step 1: 150 µL Buffer ATL + 30 µL Phosphate Buffered Saline (pH 7.2) was substituted for 180 µL Buffer ATL.

Step 8: 100 µL Buffer AE was pipetted directly onto the DNeasy® membrane and allowed to incubate at room temperature for 5 minutes instead of 1 minute.

Step 9: Step 8 was not repeated.

Extracted DNA was stored at -20°C in 1.5 mL microcentrifuge tubes.

Polymerase chain reaction (PCR) was used to isolate and amplify the mitochondrial cytochrome c oxidase subunit I (COI) gene. PuReTaq™ Ready-To-Go™ PCR beads (GE Healthcare) were used for the PCR reactions. Each PCR reaction consisted of 1.5 µL genomic DNA, 23.5 µL ddH2O, and 1 µL of a 10 µmol concentration of the following two primers: LCO1490 (5'-GGTCAACAAATCATAAAGATATTGG-3') with an M13R tail sequence (5'-GGATAACAATTTCACACAGG-3') and HCO2198 (5'-TAAACTTCAGGGTGACCAAAAAATCA-3') with a T7 tail sequence (5'-TAATACGACTCACTATAGGG-3'). Tailed primers become linked to both the 5' and 3' ends of the forward and reverse sequence of the gene to be amplified during PCR, in this case COI. This serves to increase the readability of the DNA sequences of the PCR products. Previous studies conducted at American University successfully utilized these primers to amplify the COI gene of a species of isopod, Antrolana lira, and species of amphipod, including Stygobromus emarginatus (Hutchins 2007) and Gammarus minus (Carlini et al. 2009).

A Mastercycler Gradient 5331 thermocycler (Eppendorf®) was used to conduct PCR. The reaction was performed with the following thermal profile: initial denaturation at 95°C for 2 min; 40 cycles of 95°C for 1 min, 40.1°C for 1 min, and 72°C for 1.5 min; and a final elongation step at 72°C for 7 min.

Gel Purification of the COI Gene

The products of PCR were run on a 0.8% agarose gel in order to purify the sample. The gels were run at 70-90 V with a 1 Kb+ DNA ladder. DNA fragments were compared to the DNA ladder to determine their approximate length. All bands that corresponded with the approximate known length of COI, 700 bp, were excised from the gel using a sterile razor blade. Then, the gel purified PCR products were extracted from the gel using a MinElute™ Gel Extraction Kit (QIAGEN). The "MinElute Gel Extraction Kit" protocol was used with the following adjustments as outlined by Hutchins (2007) and Carlini et al. (2009):

Step 11: Upon the addition of Buffer PE to the MinElute Spin Column, the sample was left to stand for 10 min.

Step 14: DNA was eluted in 15 µL of Buffer EB, instead of the recommended 10 µL. Upon the addition of Buffer EB to the MinElute Spin Column, the sample was left to stand for 10 min.

The DNA concentration in gel purified PCR products was quantified by running 2 µL of each sample on a gel and comparing the resulting bands to a Low Mass DNA Ladder (Invitrogen). Gel purified PCR products were found to range in DNA concentration from 10 ng/µL to 100 ng/µL.

Gene Sequencing

10 µL of each sample of gel purified PCR product were loaded into individual wells of 96-well non-skirted PCR reaction plates (USA Scientific, Inc.) and submitted for sequencing to High-Throughput Sequencing Solutions, a non-profit company administered by the University of Washington, Department of Genome Sciences.

Cycle sequencing of samples was performed using the universal vector primer, T7 (5'-TAATACGACTCACTATAGG-3'). Samples were also sequenced using the primer we provided, M13R (5'-GGATAACAATTTCACACAGG-3'). The use of both the T7 and M13R primers allowed the sequencing of samples in both the forward and reverse directions. This provided a way of detecting sequencing errors, in which apparent variation in the sequence of a sample is attributable to low quality DNA at the site and not to a mutation. Any variation that occurred at the same position in both reads of a bidirectionally sequenced sample was most likely due to a genuine polymorphism and not simply due to an error in amplification or analysis.

Separation was performed using a high-throughput capillary sequencer. Sequence results were made available for download from a secure website, and I inspected all chromatograms for ambiguities and ensured only high quality sequences were used in further analyses.
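
The logic of the bidirectional check can be sketched in a few lines. The example assumes, for illustration only, that the forward and reverse reads have already been trimmed to cover exactly the same region; real reads would first have to be aligned. The function names are hypothetical, and this is not the software used by the sequencing facility.

```python
# Base-pairing table; N (unknown base) is left unchanged.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def strand_disagreements(forward_read, reverse_read):
    """Return the 0-based positions at which the two reads call different bases.

    The reverse read is first re-oriented so that both reads describe the same
    strand. An empty list means the two reads agree at every position.
    """
    reoriented = reverse_complement(reverse_read)
    if len(reoriented) != len(forward_read):
        raise ValueError("reads must cover the same region")
    return [i for i, (f, r) in enumerate(zip(forward_read.upper(), reoriented))
            if f != r]
```

A site that differs from other specimens but is called identically in both reads (an empty disagreement list) is a candidate polymorphism; a site where the two reads disagree points to a sequencing error.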

Analyses

Sequence Characterization

In order to confirm the identity of the sequences as amphipod COI, the sequences were compared to sequences in the BLAST database (http://www.ncbi.nlm.nih.gov).

All nucleotide sequences were translated into amino acid sequences using the program CLC Main Workbench 5.0.2 according to the invertebrate mitochondrial genetic code. For each sequence, the six possible reading frames were examined for the presence of stop codons. Since COI sequences should have no stop codons present, if stop codons were found in all reading frames, the sequence was removed from further analysis. The presence of such stop codons could indicate an error in sequencing or that a pseudogene was amplified instead of the true mitochondrial DNA sequence.
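
The six-frame screen described above can be sketched as follows. This is an illustrative stand-in for the check performed in CLC Main Workbench, not that program's code. It relies on the fact that the invertebrate mitochondrial code (NCBI translation table 5) has only two stop codons, TAA and TAG; the function names are hypothetical.

```python
# Stop codons under NCBI translation table 5 (invertebrate mitochondrial).
# TGA codes for tryptophan and AGA/AGG for serine in this table.
STOP_CODONS = {"TAA", "TAG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]

def frame_has_stop(seq, offset):
    """True if any complete codon in the frame starting at offset is a stop."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(offset, len(seq) - 2, 3))

def frames_with_stops(seq):
    """Number of the six reading frames that contain at least one stop codon."""
    strands = (seq, reverse_complement(seq))
    return sum(frame_has_stop(strand, offset)
               for strand in strands for offset in range(3))

def stops_in_all_frames(seq):
    """Exclusion rule from the text: a stop codon in every one of six frames."""
    return frames_with_stops(seq) == 6
```

A sequence is flagged for removal only when `stops_in_all_frames` returns True, that is, when no open reading frame remains on either strand.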

Arlequin 3.0 was used to calculate Tajima's D scores, which test for selective neutrality among a series of DNA sequences or restriction enzymes (Excoffier, Laval, and Schneider 2005). Specifically, Tajima's D consists of the scaled difference between two estimates of θ: θW and θπ (Nielsen 2001). For haploid data, θ is equal to 2Neμ, where Ne is the effective population size and μ is the mutation rate. θW is based on the number of segregating sites (S) in the sample, while θπ is based on the average number of polymorphisms determined through pairwise comparisons (k) of sequences in the sample. Since θW and θπ should be relatively similar in value because they both estimate θ, under the null hypothesis, D should not be significantly different from zero.

However, under selection, the difference between these two estimates yields either a significantly negative or a significantly positive value of D. This is because, unlike θπ, θW is affected by the presence of low frequency, non-neutral mutations, since S is independent of the frequency of polymorphisms (Tajima 1989). A significant negative D value indicates the presence of strong selection against a deleterious allele, a population bottleneck, or population expansion, while a significant positive D value suggests balancing selection or population subdivision (Rand 1996).
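
For readers who want to see the statistic itself, the following sketch computes D from a set of aligned sequences using the standard formulas of Tajima (1989). It is not Arlequin's implementation and does not compute p values. It assumes equal-length sequences without gaps, and returning 0.0 for a sample with no segregating sites is only a convention of this sketch, since D is undefined in that case.

```python
import math
from itertools import combinations

def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (no gaps assumed)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(len(set(column)) > 1 for column in zip(*sequences))
    if s == 0:
        return 0.0  # convention of this sketch; D is undefined when S = 0
    # k: mean number of pairwise differences (the estimate behind theta-pi).
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: theta-pi minus theta-W (S / a1); denominator: its standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

A sample dominated by rare variants (for example, several singletons) gives a negative D, while a sample split into two equally common, divergent haplotypes gives a positive D, in line with the interpretations cited above.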

Phylogenetic Analysis

PAUP v4.0 was used to calculate pairwise uncorrected "p" distances between sequences (Swofford 2002). These genetic distances are calculated from the number of differences between each sequence with no adjustment for multiple mutations occurring at a single site.
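
As a minimal illustration of the uncorrected "p" distance, the sketch below counts the proportion of sites that differ between two aligned sequences. How PAUP treats gaps, ambiguity codes, or sequences of unequal length is not described here, so skipping any site that is not A, C, G, or T in both sequences is an assumption of this example, as is the function name.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p: proportion of compared sites at which two aligned
    sequences differ, with no correction for multiple hits."""
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a in valid and b in valid:
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

Multiplying the result by 100 gives the percent divergence reported in the tables, for example `p_distance("ACGT", "ACGA")` is 0.25, or 25%.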

Modeltest 3.7 was used to select the best nucleotide substitution model given the sequence data, which according to the hierarchical likelihood ratio tests (hLRTs) was the transversional model of nucleotide substitution, TVM+I+G, where I represents the proportion of invariable sites and G the gamma distribution of rate variation among sites (Posada and Crandall 1998). Under this model, the two transition rates (A-G and C-T) are equal to each other, while the four transversion rates (A-C, A-T, C-G, and G-T) are free to differ.

PAUP was used to conduct heuristic searches for unique haplotype trees using the maximum-likelihood (ML) criterion under the TVM+I+G model (Swofford 2002).

Branch support for the resulting ML consensus tree was assessed with 1,000 bootstrap replicates.

Population Structure

In order to statistically analyze how genetic variation was distributed within and among geographic locations and within and among the lineages indicated by phylogenetic analysis, an analysis of molecular variance (AMOVA) was conducted in

Arlequin 3.0 (Excoffier, Laval, and Schneider 2005). In addition, to test for a correlation between pairwise genetic differences and geographic distances that would be indicative of isolation by distance, Arlequin 3.0 was used to perform a Mantel test with 1000 permutations (Excoffier, Laval, and Schneider 2005). The Mantel test was conducted both including and excluding the geographic outgroup samples collected from Manassas.
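
The Mantel test can be sketched as a simple permutation procedure. The version below is a generic illustration rather than Arlequin's routine: it correlates the upper triangles of two symmetric distance matrices, then shuffles the site labels of one matrix to build a null distribution. The function names, the one-sided p value, and the fixed random seed are choices made for this example.

```python
import math
import random

def _upper_triangle(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabeling sites."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(genetic, geographic, permutations=1000, seed=0):
    """Return (r, p): the matrix correlation and a one-sided permutation
    p value for a positive association, as expected under isolation by distance."""
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = _upper_triangle(geographic, identity)
    r_observed = _pearson(_upper_triangle(genetic, identity), geo)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the sites of the genetic matrix
        if _pearson(_upper_triangle(genetic, order), geo) >= r_observed:
            hits += 1
    # The observed arrangement counts as one member of the null distribution.
    return r_observed, (hits + 1) / (permutations + 1)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual cells would overstate the significance.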

Nested Clade Phylogeographic Analysis

It is difficult to draw consistent conclusions about the probable historic events that may have affected the distribution of haplotypes as depicted in a phylogenetic tree.

Nested clade phylogeographic analysis (NCPA) is now widely used in phylogeographic studies as a procedure to test hypotheses about gene flow and population history. The method generates summary statistics, which are then used by an inference key recommended by Templeton et al. (1998). This results in a list of probable processes that may have shaped the geographic distribution of the haplotypes in a population, such as range expansions and past fragmentation events.

NCPA offers a relatively simple method of interpreting phylogeographic data, but its effectiveness is currently the subject of some controversy. Critics of NCPA claim that the method produces a high rate of Type I and Type II errors (Knowles 2008; Knowles and Maddison 2002; Panchal and Beaumont 2007). Knowles (2008) also questioned the degree to which NCPA was tested and validated. Templeton (2009) defends NCPA by arguing that it has indeed been rigorously tested and applied successfully to 150 cases of positive controls. Panchal and Beaumont (2007) used a simulation to test the accuracy of the NCPA process. Although the study found a large proportion of Type I and Type II errors, Templeton (2009) attributed this to inappropriate assumptions made in the simulations.

Despite the present controversy regarding NCPA, I chose to use the method to aid my interpretation of my haplotype data for two reasons. First, it is still a commonly used procedure. Second, it is one of the simplest methods for this type of analysis. That said,

I was still cautious about drawing conclusions from the results. I chose to use ANECA, which is a software program developed by Panchal (2007) to automate the NCPA process. I checked the results manually using the inferences described by Templeton

(1998).

ANECA makes use of two software programs that are already commonly used when conducting an NCPA: TCS v1.21 and GeoDis v2.5. First, I generated a maximum parsimony haplotype network using TCS v1.21 in ANECA (Clement, Posada, and Crandall 2000). Before proceeding, I resolved several loops in the network using the criteria described by Crandall and Templeton (1993). Then, I used ANECA to nest the tree. Next, I used GeoDis v2.5 in ANECA, which utilizes the geographic coordinates of each population to calculate two main statistics: the nested clade distance (Dn) and the clade distance (Dc) (Posada, Crandall, and Templeton 2000). Dn measures the geographic distribution of a clade relative to the other clades in the same higher-level nesting category. Dc is a measure of the geographic spread of a clade (Posada, Crandall, and Templeton 2000). After that, I used ANECA to generate a list of inferences. Finally, I applied the inference key manually and ensured that all inferences were concordant.

CHAPTER 3

RESULTS

Collection and Sequencing

A total of 222 S. tenuis potomacus specimens were collected. At least twenty specimens were obtained from each site, with two exceptions. Due to fluctuations in the surface water depth of two seeps, only 17 specimens were collected from the Turkey Run seep (N II) and 19 specimens were collected from the Fort Hunt Lot D seep (S I).

Of the 24 C. shoemakeri, 33 S. pizzinii, and 222 S. tenuis potomacus specimens collected, bidirectional COI sequences were successfully obtained from only 56 samples. Each of these 56 bidirectional COI sequences was identical in the forward and reverse directions. A total of 188 sequences were obtained from the T7 primer with an average length of 449 bp. Only 100 sequences were obtained from the M13R primer with an average length of 144 bp. Table 1 describes the number of samples collected and sequenced from each site. The chromatograms of all sequences were manually checked for ambiguities, but none were found. Sequences were manually edited to ensure that only base calls with quality values of greater than 20 were included in further analysis.

Stop codons were found in all reading frames of 20 T7 sequences and 6 M13R sequences, which were excluded from further analysis (see Table 2). Most of these sequences were of low quality, and sequence alignment revealed that the stop codons were not located in the same position in each sequence. This indicates that the stop


codons were most likely the result of amplification or sequencing errors, and not the products of genetically inherited pseudogenes.

Sequences less than 250 bp were not used in further analysis because they were unable to align with the other sequences and further comparisons would be severely limited. This meant eliminating the majority of the M13R sequences, and so only T7 sequences, at least 405 bp long, were used in subsequent analyses. Table 3 shows their distribution among geographic locations.

Sequence Characterization

None of the COI sequences obtained for this study had insertions, deletions, or ambiguities. In order to confirm the identity of the sequences as amphipod COI, the sequences were compared to sequences in the BLAST database (http://www.ncbi.nlm.nih.gov/). These nucleotide sequences showed 77-81% alignment identity with Gammarus minus COI (E = 4e-142 to 5e-81) (GenBank Accession # EF570326.1). The low E-value (E) indicated that the probability that this alignment identity could have been found by chance alone was extremely low.

In order to test the sequences for selective neutrality, the sequences from each site were analyzed using Tajima's D test. Table 4 shows the results of the analysis for each sampling location. There was no significant deviation from selective neutrality at 13 of the 14 sites (Tajima's D = -0.610 to 2.849; p = 0.296 to 1.000). However, the S. tenuis potomacus sequences from the Turkey Run seep (N II) did deviate significantly from selective neutrality (Tajima's D = -2.238; p = 0.002).

Table 1. Number of Specimens Collected and Sequenced from Each Site.

Site Name                               Abbrev.  # specimens  Date(s) Collected    # T7 sequences       # M13R sequences
                                                 collected                         > 200 bp             > 200 bp
C. shoemakeri - Tree Stump              CS       24           12/21/07             3 (avg. = 336 bp)    0
S. pizzinii - C&O Footbridge            SP I     17           2/8/08               5 (avg. = 309 bp)    [count illegible] (avg. = 547 bp)
S. pizzinii - C&O Southern Seep         SP II    7            4/11/07 - 2/8/08     5 (avg. = 517 bp)    [count illegible] (avg. = 305 bp)
S. pizzinii - C&O Northern Seep         SP III   9            2/8/08               8 (avg. = 453 bp)    2 (avg. = 282 bp)
S. tenuis potomacus - Scott's Run       N I      31           2/2/08               19 (avg. = 653 bp)   6 (avg. = 500 bp)
S. tenuis potomacus - Turkey Run        N II     17           4/7/08               17 (avg. = 636 bp)   8 (avg. = 317 bp)
S. tenuis potomacus - Chain Bridge      N III    24           2/2/08 - 1/9/09      17 (avg. = 626 bp)   12 (avg. = 494 bp)
S. tenuis potomacus - Tree Stump        C I      21           12/21/07 - 2/18/08   11 (avg. = 501 bp)   4 (avg. = 338 bp)
S. tenuis potomacus - Roadside          C II     24           2/8/08 - 2/18/08     13 (avg. = 547 bp)   2 (avg. = 461 bp)
S. tenuis potomacus - Pipe              C III    21           12/21/07 - 3/13/08   13 (avg. = 509 bp)   5 (avg. = 536 bp)
S. tenuis potomacus - Fort Hunt Lot D   S I      19           3/20/08 - 3/24/08    16 (avg. = 627 bp)   2 (avg. = 399 bp)
S. tenuis potomacus - Fort Hunt Lot E   S II     24           2/14/08 - 3/9/08     19 (avg. = 609 bp)   8 (avg. = 559 bp)
S. tenuis potomacus - River Farm Dr.    S III    35           3/5/08 - 3/17/08     16 (avg. = 609 bp)   3 (avg. = 389 bp)
S. tenuis potomacus - Manassas          MNSS     6            4/17/07              3 (avg. = 701 bp)    1 (avg. = 675 bp)

Table 2. Number of Sequences from Each Site Where Stop Codons Were Present in All 6 Reading Frames. Sequences with stop codons present were not included in further analyses.

Site Location   # T7 Sequences    # M13R Sequences
                w/ Stop Codons    w/ Stop Codons
CS              0                 0
SP I            0                 0
SP II           0                 0
SP III          0                 0
N I             2                 0
N II            2                 1
N III           2                 2
C I             2                 1
C II            4                 0
C III           1                 1
S I             2                 0
S II            5                 1
S III           0                 0
MNSS            0                 0

Table 3. Number of Sequences from Each Site that Were Used in the Analyses. Sequences that were less than 250 bp in length were excluded from further analyses.

Site Location   # T7 Sequences   Avg. Sequence Length (bp)
CS              3                284
SP I            4                280
SP II           5                458
SP III          8                400
N I             17               547
N II            15               551
N III           14               564
C I             8                405
C II            9                405
C III           13               425
S I             14               586
S II            12               525
S III           13               527
MNSS            3                627

Table 4. Results of the Tajima's D Test for Selective Neutrality. The p values are from simulated data.

Site Location   Tajima's D   p
CS              2.012        0.954
SP I            2.080        0.981
SP II           0.000        1.000
SP III          -0.431       0.382
N I             -0.161       0.453
N II            -2.238       0.002
N III           1.728        0.982
C I             0.043        0.576
C II            1.934        0.990
C III           0.708        0.753
S I             0.000        1.000
S II            2.849        1.000
S III           -0.610       0.296
MNSS            0.000        0.831

Phylogenetic Analysis

Of the 118 S. tenuis potomacus sequences examined, 34 haplotypes were found

(see Table 5). In addition, the three sequences of C. shoemakeri were separated into 2 haplotypes. Analysis also showed that the 17 S. pizzinii were distributed into 4 haplotypes.

Of the 34 S. tenuis potomacus haplotypes, 4 were shared between two or more seeps. Figure 6 illustrates the geographic distribution of the shared haplotypes.

Haplotype 4 was shared only among the three seeps in the Central cluster. The other three shared haplotypes were found in seeps located in more than one geographic cluster. Haplotype 2 was shared between N II and C II. Haplotype 3 was shared between C I, C III, S II, and S III. Most surprisingly, haplotype 1 was found across all three geographic clusters: N II, C II, and S III.

The remaining 30 haplotypes were each isolated to a single seep. Most of the isolated haplotypes were found in the northern cluster. Both Scott's Run (N I) and Chain

Bridge (N III) hosted 5 isolated haplotypes each. Fort Hunt Lot D (S I) was fixed for one haplotype. Not surprisingly, individuals from Manassas (MNSS) did not share haplotypes with the other sites.

[Figure 6: map of Maryland and Virginia with a pie chart at each collection site. Legend: shared haplotypes 1, 2, 3, and 4; geographic clusters North, Central, South, MNSS, and SP-COL7. Scale bar in kilometers.]

Figure 6. Map of the Geographic Distribution of Haplotypes Shared Among Multiple S. tenuis potomacus Sites. Pie charts depict the number of individuals found with a given shared haplotype at each site. Sites with no shared haplotypes are denoted by a white circle with an x in the center. Specimens from Manassas did not share haplotypes with specimens from any other site, as expected due to its geographic distance from the ingroup sites (Gis data 2009; Whitler 2009).

Table 5. Distribution of Haplotypes. The total sample size (#) and the number of individuals with each haplotype (haplotypes shown in columns). The haplotypes that were shared between at least two sites are depicted on the left, while the haplotypes isolated to one site are shown on the right.

Haplotype columns: Shared 1-4; Isolated 5-34

Site    #    Individuals per haplotype (nonzero entries, in column order)
N I     17   2 4 5 4 2
N II    15   7 1 1 5 1
N III   14   7 4 1 1 1
C I     8    5 1 1 1
C II    9    1 1 5 1 1
C III   13   4 6 2 1
S I     14   14
S II    12   5 1 1 2 3
S III   13   1 2 1 1 7 1
MNSS    3    2 1


As demonstrated in Table 6, the average uncorrected percent pairwise distances (p) within each seep were relatively small. Five sites had an average uncorrected pairwise distance (p) of less than 2% within the seep. Four seeps (N III, C II, S II, and S III) had a slightly higher average p that ranged between 3.9 and 5.6%.

Table 6. Average Uncorrected Percent Pairwise Distances Within Each S. tenuis potomacus Site, Within Each Cluster, and Among Clusters.

Comparisons        Average % Divergence

Within Sites
N I                0.20 ± 0.19 (0 - 0.96) [17]
N II               1.50 ± 3.17 (0 - 10.87) [15]
N III              3.90 ± 4.27 (0 - 9.16) [14]
C I                0.55 ± 0.54 (0 - 1.83) [8]
C II               5.56 ± 4.92 (0 - 11.24) [9]
C III              0.54 ± 0.52 (0 - 1.79) [13]
S I                0.03 ± 0.12 (0 - 0.48) [14]
S II               5.41 ± 4.93 (0 - 11.41) [12]
S III              4.46 ± 5.10 (0 - 13.06) [13]

Within Clusters    Including S I               Excluding S I
N                  7.25 ± 4.64 (0 - 12.23)
C                  2.77 ± 4.03 (0 - 12.89)
S                  12.21 ± 7.95 (0 - 22.40)    8.27 ± 5.18 (0 - 17.77)

Between Clusters   Including S I               Excluding S I
N and C            11.41 ± 2.69 (0 - 15.32)
N and S            13.58 ± 4.72 (0 - 21.69)    10.39 ± 2.45 (0 - 14.88)
C and S            12.15 ± 6.95 (0 - 23.06)    7.80 ± 4.68 (0 - 13.98)

Note: The within-site average % divergence was not 0 for S I due to the presence of 3 sequences that were of different lengths.

The average uncorrected percent pairwise distances (p) between each seep are also shown in Table 6. The average uncorrected pairwise distance among the seeps within the

ingroup S. tenuis potomacus was 9.4% with a range between 0.7% and 12.7%. This excludes comparisons with the samples collected from Fort Hunt Lot D (S I). This is because unexpectedly high uncorrected pairwise distances were observed in pairwise comparisons between S I and other samples of S. tenuis potomacus. The pairwise distances ranged between 19% and 20.2%. The average uncorrected pairwise distance between the samples from S I and S. pizzinii samples was 16.6%. This suggests that the

S I specimens may be another species of Stygobromus.

In addition, Table 6 shows the average uncorrected percent pairwise distances (p) within each geographic cluster and between geographic clusters, focusing on comparisons of the ingroup S. tenuis potomacus samples.

Pairwise comparisons between the ingroups and outgroups for both geographic location and species can be found in Table 7. The greatest average uncorrected distances were observed between the ingroup, S. tenuis potomacus, and the outgroups, C. shoemakeri and S. pizzinii. The average uncorrected distances between sequences of C. shoemakeri and S. tenuis potomacus ranged between 23.1% and 28.4%. Unsurprisingly, the sequences of S. pizzinii and S. tenuis potomacus were slightly more similar with the average uncorrected distances ranging between 15.7% and 19.5%. Comparisons of the uncorrected pairwise distances between samples of S. tenuis potomacus from the

Manassas seep and all other samples of S. tenuis potomacus ranged between 14.4% and

16.9%. This may be due to the large geographic distance that separates the Manassas outgroup from the other S. tenuis potomacus sites.

Table 7. Average Uncorrected Percent Pairwise Distances Within Outgroup Sites and Between Outgroup and Ingroup Sites. Comparisons were conducted within sites, within clusters, within the Stygobromus genus, and between Crangonyx and Stygobromus specimens.

Comparisons       Average % Divergence

Within Sites
CS                0.34 ± 0.60 (0 - 1.03) [3]
SP I              0.33 ± 0.26 (0 - 0.52) [4]
SP II             0.00 ± 0.00 (0 - 0.00) [5]
SP III            0.83 ± 0.71 (0 - 2.00) [8]
MNSS              0.11 ± 0.09 (0 - 0.16) [3]

Within Cluster
SP                0.68 ± 0.01 (0 - 2.00)

Within Genus      Including S I                   Excluding S I
MNSS and N        15.16 ± 0.90 (13.72 - 17.13)
MNSS and C        16.80 ± 0.68 (15.80 - 19.09)
MNSS and S        17.40 ± 1.90 (14.81 - 21.13)    16.17 ± 1.11 (14.81 - 18.51)
SP and MNSS       17.00 ± 1.43 (15.00 - 20.14)
SP and N          18.03 ± 1.63 (14.i3 - 20.91)
SP and C          18.48 ± 1.13 (14.43 - 21.06)
SP and S          17.30 ± 1.46 (14.40 - 20.99)    17.66 ± 1.58 (14.40 - 20.99)
SP and S I        16.67 ± 0.92 (14.00 - 18.31)

Between Taxa      Including S I                   Excluding S I
CS and SP         24.59 ± 0.66 (23.00 - 25.98)
CS and MNSS       17.00 ± 0.85 (15.23 - 20.00)
CS and N          26.28 ± 1.95 (23.87 - 29.69)
CS and C          23.35 ± 0.74 (21.85 - 25.01)
CS and S          24.11 ± 1.27 (21.73 - 27.04)    23.45 ± 0.62 (21.73 - 24.63)

Table 8 shows the average uncorrected percent divergence between each site.

Table 8. Average Uncorrected Percent Pairwise Distances Between Each Site.

        CS    SP I  SP II SP III N I   N II  N III C I   C II  C III S I   S II  S III MNSS
CS      -
SP I    24.5  -
SP II   24.5  0.3   -
SP III  24.7  0.8   0.9   -
N I     28.4  19.2  19.3  19.5   -
N II    24.4  16.1  17.9  17.8   10.9  -
N III   25.7  15.7  17.6  17.4   10.0  9.9   -
C I     23.8  18.6  18.7  18.8   12.3  10.5  12.2  -
C II    23.4  17.5  18.1  18.0   12.6  6.8   11.6  7.7   -
C III   23.1  18.9  18.9  19.0   12.7  10.7  12.3  0.7   4.5   -
S I     25.3  16.4  16.7  16.8   19.3  19.5  19.0  20.2  19.6  19.9  -
S II    23.6  17.8  18.7  18.7   12.4  10.9  11.8  4.9   7.4   5.0   20.1  -
S III   23.3  15.8  16.9  17.4   9.4   9.3   8.7   9.4   9.8   9.8   19.1  9.6   -
MNSS    26.6  16.0  17.2  17.4   15.1  15.9  14.4  16.9  16.7  16.8  19.6  16.9  15.5  -


A maximum likelihood (ML) phylogram was constructed using the TVM+I+G model that best fit the nucleotide data according to the hierarchical likelihood ratio tests (hLRTs) in Modeltest 3.7 (Posada and Crandall 1998). The ML consensus tree is shown in Figure 7. Branch support was calculated with 1,000 bootstrap replicates. The

ML tree showed that, as expected, there were strongly supported differences between the outgroups, C. shoemakeri and S. pizzinii, and the ingroup, S. tenuis potomacus. In addition, haplotypes from Manassas were divergent from the other specimens considered to be S. tenuis potomacus. Individuals collected from S I appeared to be more similar to

S. pizzinii than specimens of S. tenuis potomacus. However, the ML tree strongly suggested that the S I haplotype was still divergent from the S. pizzinii haplotypes collected.

The consensus ML tree revealed a complicated relationship between the haplotypes collected from the ingroup site locations. For better resolution, a second ML tree was constructed using the same criterion as above, but with only S. pizzinii as a root for the tree (Figure 8). All haplotypes collected from N I formed a clade with strong bootstrap support (95%). Three haplotypes collected from S II also formed a strongly supported clade (99%), while two others were grouped into another strongly supported clade (97%) and were shared between sites in the central cluster and S III. Two N II haplotypes were divergent from those two clades, but with weak bootstrap support (54%). A well-supported clade existed that contained four haplotypes from S III (88%). A haplotype of N III was grouped with those haplotypes from S III, but with weak support (68%). The ML tree grouped a haplotype of N II with haplotypes from N III in the same clade, but without strong support (62%). In order to minimize the chance that a sequencing error

was responsible for this grouping, the individual from N II was sequenced on two independent occasions. Upon comparison, both sequences were identical.

Population Structure

An AMOVA was conducted on sequences from the three ingroup geographic clusters of S. tenuis potomacus: Northern, Central, and Southern. Due to their genetic divergence from ingroup samples, sequences from S I and the geographic outgroup,

Manassas, were included in the analysis and treated as an additional two separate groups.

The results demonstrated that approximately 34% of the genetic variation existed within each seep population (see Table 9). Approximately 32% of the genetic variation was found among seep populations within geographic clusters. Genetic variation among geographic clusters accounted for the remaining 34% or so of the variation.

Table 9. Results of an AMOVA Constructed for S. tenuis potomacus. Samples were included from: Northern, Central, Southern, Manassas, and Fort Hunt Lot D (S I). Analysis based upon 10,100 permutations.

Source of Variation               d.f.   Sum of Squares   Variance Components   Percentage of Variation   p
Among groups                      4      1850.9           13.24                 33.98                     0.005
Among populations within groups   5      920.01           12.36                 31.73                     <0.00001
Within populations                119    1589.8           13.36                 34.29                     <0.00001
Total                             128    4360.7           38.97
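
The percentage column of Table 9 follows directly from the variance components: each component is divided by their sum. The short check below uses the rounded values printed in the table, so its results can differ from the published percentages in the second decimal place. The function and dictionary names are chosen for this example only.

```python
def percent_of_variation(components):
    """Express each variance component as a percentage of their total."""
    total = sum(components.values())
    return {source: 100 * value / total for source, value in components.items()}

# Variance components as printed in Table 9 (already rounded to two decimals).
components = {
    "among groups": 13.24,
    "among populations within groups": 12.36,
    "within populations": 13.36,
}
percentages = percent_of_variation(components)
for source, pct in percentages.items():
    print(f"{source}: {pct:.2f}%")
```

The three hierarchical levels each account for roughly one third of the total variation, which is the pattern described in the text.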

The FST value calculated for all S. tenuis potomacus populations was 0.592. FST values were plotted against the geographic distances between each site where samples of

S. tenuis potomacus were collected (see Figure 9).

[Figure 7: maximum likelihood phylogram. Tip labels give each haplotype and the number of individuals carrying it. Scale bar: 0.05 substitutions/site.]

Figure 7. Phylogram of S. tenuis potomacus Estimated Using the Maximum Likelihood Criterion with C. shoemakeri as the Root of the Tree. Constructed using TVM+I+G. Bootstrap support values above 70% are indicated in bold.

Figure 8. Phylogram of S. tenuis potomacus Estimated Using the Maximum Likelihood Criterion with S. pizzinii as the Root of the Tree. Constructed using TVM+I+G. Bootstrap support values above 70% are indicated in bold. Scale bar: 0.01 substitutions/site.

As Figure 9 demonstrates, FST values seem to increase with increasing distance, as would be expected. However, a closer examination revealed a wide range in genetic divergence among seeps found within 5 km of each other (FST = 0.075 to 0.931). In particular, the three seeps in the northern cluster had very high FST values despite their close proximity to one another. The Scott's Run seep (N I) was located closest to the Turkey Run seep (N II). Although the two seeps were only 3.44 km apart, the estimated FST value was 0.931. The Chain Bridge seep (N III) was 5.76 km from the Turkey Run seep (N II), and the FST value between the two sites was 0.729. The seep at Scott's Run was 8.39 km from Chain Bridge, and the FST value between the two seeps was 0.838.

Seeps located between 20 and 30 km apart exhibited a similar range of genetic divergence. N II was located 25.2 km away from C II, and the FST value between those seeps was 0.325. Even though C III was only 0.19 km farther from N II than C II was, the FST value between the two sites was more than twice as high (0.688). Comparisons to N I yielded especially high FST values. For instance, N I and C III were located 26.8 km apart, and the FST value between them was 0.801. N I and C II were only slightly closer (26.66 km apart) but were more genetically similar, with an FST value of 0.646.

Although Figure 9 shows that the genetic divergence between sites seemed to increase with increasing geographic distance, there was also variation in FST values over short distances. This suggests that the genetic variation among these sites cannot be attributed to isolation by distance alone.

Figure 9. Graph of FST Values, an Estimate of Genetic Diversity, vs. the Geographic Distance Between Sites. A high amount of FST variation was found between S. tenuis potomacus sites both within and among clusters. Fort Hunt Lot D (S I) samples were excluded. [Plot: FST against Geographic Distance (km); legend: Within Cluster, Among Clusters, Manassas, N I vs. N II, N I vs. N III, N II vs. N III.]

The samples of S. tenuis potomacus were analyzed for patterns of isolation by distance by using a Mantel test to analyze the relationship between the matrix of pairwise geographic distances and the matrix of pairwise genetic differences. The Mantel test demonstrated that there was no statistically significant correlation between geographic and genetic distance among S I, Manassas, and ingroup seeps. Tests were conducted including both S I and Manassas in the analysis (after 10,000 randomizations, Z = 113.01, r = 0.121, one-tailed p = 0.2115), excluding S I (after 10,000 randomizations, Z = 86.60, r = 0.42, one-tailed p = 0.0586), and excluding S I and Manassas (after 10,000 randomizations, Z = 41.25, r = 0.007, one-tailed p = 0.55).
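The logic of a permutation-based Mantel test can be sketched as follows. This is an illustrative implementation under stated assumptions, not the program used for the analysis: the function name `mantel_test`, the site coordinates, and the toy distance matrices are all hypothetical. The sketch computes the Z statistic as the sum of cross-products of the off-diagonal elements, the correlation r between the two matrices, and a one-tailed p-value obtained by jointly permuting the rows and columns of one matrix.

```python
import numpy as np


def mantel_test(geo, gen, n_perm=10_000, seed=0):
    """One-tailed permutation Mantel test for two symmetric distance matrices.

    Returns (Z, r, p): the cross-product statistic, the Pearson correlation
    between the upper triangles, and the proportion of permutations whose
    correlation is at least as large as the observed one.
    """
    geo = np.asarray(geo, dtype=float)
    gen = np.asarray(gen, dtype=float)
    n = geo.shape[0]
    upper = np.triu_indices(n, k=1)          # off-diagonal upper triangle
    x = geo[upper]
    z_obs = float(np.sum(x * gen[upper]))
    r_obs = float(np.corrcoef(x, gen[upper])[0, 1])

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(n)
        # Permute rows and columns together so the matrix stays symmetric.
        y = gen[np.ix_(order, order)][upper]
        if np.corrcoef(x, y)[0, 1] >= r_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)      # observed value counts once
    return z_obs, r_obs, p_value


# Hypothetical example: six sites along a line (positions in km), with
# genetic distance made exactly proportional to geographic distance.
coords_km = np.array([0.0, 3.4, 9.2, 25.0, 26.8, 35.0])
geo = np.abs(coords_km[:, None] - coords_km[None, :])
gen = 0.02 * geo
z_stat, r_stat, p_value = mantel_test(geo, gen, n_perm=999)
print(f"Z = {z_stat:.2f}, r = {r_stat:.3f}, one-tailed p = {p_value:.4f}")
```

In the toy data the two matrices are perfectly correlated, so the test should report a strong isolation-by-distance signal; the results reported above for S. tenuis potomacus correspond to the opposite situation, where permuted correlations are frequently as large as the observed one.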

Nested Clade Phylogeographic Analysis

Cladogram estimation using the 90% connection limit revealed four distinct haplotype networks constructed using the maximum parsimony criterion (see Figure 10).

As expected, individuals from S I were isolated from the rest of the network. In addition, the two haplotypes from the Manassas site (MNSS) were disconnected. The five haplotypes from the Scott's Run seep (N I) also formed their own haplotype network, disconnected from the other sequences. The remaining sequences formed a haplotype network consisting of clades that correspond with those found by the phylogenetic analysis. A list of significant tip and interior distances, as well as the chain of inference for clades with geographic and genetic variation is provided in the Appendix.

NCPA revealed the presence of allopatric fragmentation separating the four main clades: Scott's Run seep (N I), Fort Hunt Lot D seep (S I), Manassas seep (MNSS), and all other sites in the northern, central, and southern clusters. There was evidence of contiguous range expansion in the Central and Southern clusters, specifically between the C I, C III, and S II sites. A pattern indicative of restricted gene flow with isolation by distance was suggested for the N II, C II, and S II sites. Between sites N II, N III, C I, C II, C III, and S II the data suggested the presence of restricted gene flow, with the possibility of some long-distance dispersal over intermediate areas not occupied by the species. It is also possible that the pattern was the result of past gene flow followed by the extinction of intermediate populations. Analysis showed that N III, C II, and S III were separated by allopatric fragmentation.

Figure 10. Unrooted Haplotype Network Estimated for S. tenuis potomacus Using the Maximum Parsimony Criterion. Rectangles represent the probable source haplotypes. All other haplotypes are represented by ovals whose size is proportional to the frequency of the haplotype. The number of steps between each pair of haplotypes is shown within squares.

CHAPTER 4

DISCUSSION

Lack of an Overall Isolation by Distance Pattern Among the Ingroup Seeps Suggests that the Hypotelminorheic is Discontinuous

One expectation of this study was that even if the George Washington Memorial Parkway was underlain by one continuous hypotelminorheic habitat, phylogenetic analysis would still reveal population structure among S. tenuis potomacus. A pattern of isolation by distance was predicted due to the limited dispersal ability of subterranean amphipods. Isolation by distance has been detected in other species in similar habitats. Gouws et al. (2005) detected a significant pattern of isolation by distance in two species of phreatoicidean isopods collected from first-order streams, seepage areas, and springs in South Africa. Wilson (2009) also showed that the genetic divergence of phreatoicidean isopods in Northern Australia, collected from seeps, springs, and perched aquifers, was significantly correlated with the geographic distance separating the collection sites.

The results of this study did not support the pattern of isolation by distance that was expected if all of the seeps sampled were connected to one hypotelminorheic. There was no significant evidence of an overall pattern of isolation by distance found among the S. tenuis potomacus sites. The results of the Mantel test revealed that there was no significant correlation between the genetic divergence of the COI sequences of S. tenuis potomacus and the geographic distance between sites. Pairwise FST values plotted against geographic distance showed no significant relationship, and the range of FST values was approximately the same regardless of the geographic distance between sites.

These results suggest that the seeps are not connected to a single hypotelminorheic habitat.

Evidence that Hypotelminorheic Habitats are Fragmented

If the hypotelminorheic was a discontinuous habitat, I expected phylogeographic analysis to detect a pattern of habitat fragmentation. This prediction was strongly supported by the results of the study. Although the sites chosen for the study were separated by no more than 35 km, the genetic divergence detected among sites was comparable to divergences found by studies of freshwater crustaceans collected from localities separated by over 100 km. The average uncorrected COI nucleotide divergence among specimens of S. tenuis potomacus collected from different seeps was 9.4%, with a range between 0.7% (comparison between C I and C III) and 12.7% (comparison between N I and C III). This was similar to the genetic divergence found in a phylogeographic study of phreatoicidean isopods in the mountainous southwestern region of South Africa. Gouws et al. (2005) collected specimens of Mesamphisopus abbreviatus and M. depressus from first-order streams, seepage areas, or springs in 15 localities. Comparisons of COI sequences revealed uncorrected genetic distances among the populations to range between 0 and 7.83% with a mean of 4.66% (Gouws, Stewart, and Matthee 2005). If we can assume that Stygobromus has similar dispersal abilities to isopods of the genus Mesamphisopus, then the level of genetic divergence that I found over small geographic distances suggests that the hypotelminorheic poses additional barriers to the dispersal of subterranean invertebrates.
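The uncorrected "p" distances quoted above are simply the proportion of aligned nucleotide sites at which two sequences differ. A minimal sketch of that calculation is given below; the function name `p_distance` and the short example sequences are hypothetical, and skipping sites with gaps or ambiguity codes is an assumption of this sketch rather than a documented setting of the analysis.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: differing sites / compared sites.

    Sites where either sequence has a gap or an ambiguity code are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 10-site fragments differing at one site.
example = p_distance("ACGTACGTAC", "ACGTACGTAT")
print(f"uncorrected p = {example:.1%}")
```

On this scale, a divergence of 12.7% over the 628 bp COI fragment corresponds to roughly 80 differing sites.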

The nested clade phylogeographic analysis and maximum likelihood tree both provided further evidence that the hypotelminorheic is a fragmented habitat. NCPA revealed four distinct haplotype networks: Manassas, N I, S I, and the remaining ingroup seeps. This pattern was also supported by the ML tree. NCPA suggested that these areas were separated by allopatric fragmentation. Given its geographic location, approximately 35 km away from the other S. tenuis potomacus sites, it makes sense that the Manassas seep was so genetically divergent from the other sites. However, geological history and other historical processes may explain the divergence of the N I and S I clades.

N I is the northernmost ingroup site, but it is only about 3.5 km from N II. The geological history of the northern cluster may explain why N I is distinct from the other ingroup sites. Scott's Run (N I), Turkey Run (N II), and Chain Bridge (N III) are underlain by rock that formed during the Cambrian period (Sykesville Formation, Diamictite and Mather Gorge Formation, Migmatitic metagraywacke). Yet N II is located on an alluvial flood plain of the Potomac River. Therefore, in addition to the bedrock from the Cambrian, N II is underlain by much younger alluvium (Qa) from the Holocene epoch (Southworth and Denenny 2006). C I, C II, and S III are also underlain by alluvium from the same geological time period. This indicates that the N I and N III seeps may have formed longer ago than the other ingroup seeps, which is also supported by the ML tree. The elevation of the two sites may have changed as the bedrock of the Potomac River was incised. Meanwhile, N II, C I, C II, and S III may have been formed much more recently due to periodic flooding of the Potomac River in the last ten thousand years.

While S I is located less than 1 km from the other two seeps in the southern cluster, the habitat turned out to be genetically unique. The sequences of specimens collected from S I were extremely divergent from other S. tenuis potomacus sequences (between 19 and 20.2%). The individuals collected from S I were also fixed for a single haplotype. This suggests that the site was fragmented from all other seeps in the area. The maximum likelihood tree demonstrated that the S I sequences were more similar to sequences of S. pizzinii, but there was still a high genetic divergence between the sequences (between 14 and 18.31%). The specimens collected from S I may be S. pizzinii, and this genetic divergence may be attributed to the approximately 30 km that separates the S I and C+O sites. Similar genetic variation (between 13.72 and 19.09%) exists between MNSS and other S. tenuis potomacus sites located approximately 35 km away. If the specimens collected from S I are not S. pizzinii, they may be a cryptic species of Stygobromus. The morphology of the specimens should be examined to determine whether they resemble S. pizzinii, S. tenuis potomacus, or another species of Stygobromus.

Unfortunately, since there were no accounts of S. pizzinii being found in the Fort Hunt area, S I specimens were not examined morphologically prior to DNA extraction. In order to ascertain the taxonomic identity of individuals found at S I, new collections are needed for morphological examination, but so far I have not been able to obtain them. When I returned to the site, the area had been disturbed by construction work, and the seep was completely dry. The National Park Service (NPS) revealed that the S I seep had actually been artificially created: the area was underlain by cement, and a water main break had formed a seep-like habitat.

Even though the S I seep was artificially created, the site still provides insight into how a seep is colonized and yields further evidence that the hypotelminorheic habitat is fragmented. The Fort Hunt Park area is generally characterized by poor drainage. Stygobromus may maintain a patchy distribution throughout the interstitial layer. When artificial seep conditions were created by the water main break, S I was probably quickly colonized by Stygobromus from the surrounding interstitial layer, resulting in an increase in population density. Once the NPS repaired the water main, the S I seep dried up, and no specimens could be found at the site. In the future, the area surrounding the S I site should be thoroughly examined for seeps that may host other Stygobromus specimens belonging to the S I population. Until more specimens are located, the taxonomic identity of the S I population cannot be confirmed. However, the fact that the S I site was so genetically distinct from the other two southern seeps demonstrates that fragmentation has prevented the migration of Stygobromus from the S I area to S II or S III and vice versa.

The Extent of the Hypotelminorheic is Dynamic

Although the population structure of S. tenuis potomacus does indicate that the hypotelminorheic is subject to fragmentation, the geological history of the area and phylogeographic analysis both indicate that the habitat seems to be very dynamic as well. The results suggest that there is genetic subdivision among many of the seep populations of S. tenuis potomacus. Population subdivision could be the result of cycling between prolonged periods of gene flow between sites followed by periods of isolation. As the water table fluctuates, the extent of a hypotelminorheic may also change. Periods of increased precipitation or flooding of the Potomac River may result in an increased range of the hypotelminorheic. This could create a temporary corridor between two hypotelminorheic habitats, which could allow for the dispersal of aquatic organisms and subsequent gene flow. If the region experienced a prolonged period of drought, the water table would respond and the extent of the hypotelminorheic would decrease. Increases in the elevation of a site due to events such as glacial uplift might have the same influence on the hypotelminorheic. An end in the flow of water in either scenario would prevent the dispersal of organisms between hypotelminorheic habitats, creating fragmentation.

The D.C. area has gone through periods of dramatic environmental change that may have influenced the formation of hypotelminorheic habitats and the distribution of S. tenuis potomacus. During the late Paleozoic, the Iapetus Ocean was destroyed as the North American and African continental plates collided, forming the Appalachian mountain belt. Following this event, during the Mesozoic, the deformed rocks of the joined continents began to break apart. This crustal fracturing led to the formation of the Atlantic Ocean. Large alluvial fans and streams carried debris shed from the earlier uplifted Blue Ridge and Piedmont provinces eastward (Southworth et al. 2001).

Stygobromus is thought to have evolved during the Mesozoic, although it may be older. The genus is widely distributed throughout the Appalachians (Holsinger 1978, 1994). Since Stygobromus species are known to move through shallow groundwater, it is possible that Stygobromus extended eastward towards Manassas, followed by the D.C. area, dispersing along the path of newly forming streams and shallow groundwater habitats as the ocean water receded during the late Mesozoic.

The Potomac River valley was formed much more recently as the result of erosion and deposition from about the mid-part of the Cenozoic Era to the present (or at least the last 5 million years) (Southworth et al. 2001). The George Washington Memorial Parkway transects both the eastern Piedmont and the Atlantic Coastal Plain. GWMP is underlain by bedrock that consists of metamorphosed sedimentary rocks of the Sykesville Formation. These were intruded by Ordovician mafic and felsic igneous plutonic rocks that were metamorphosed and deformed by several Paleozoic tectonic episodes. The rocks became eroded after being uplifted and were eventually overlain by Cretaceous and Tertiary deposits of the Atlantic Coastal Plain (Southworth and Denenny 2006).

During the Quaternary period, the Potomac River eroded the bedrock of the area, producing fluvial terrace deposits. The central and southern clusters are both located on the Atlantic Coastal Plain deposits. The sediment in this region originated from the erosion of the mountains to the west. The central cluster sits on an alluvial plain. C III is underlain by the oldest deposit, the Early Cretaceous Potomac Formation. The southern cluster sits on lowland terrace alluvium and estuarine deposits (Qte). In addition, the older layer of Qte from the Pleistocene is incised by alluvium from the Holocene (Southworth and Denenny 2006).

These alluvial layers may fluctuate, in turn affecting the connections between hypotelminorheic habitats. For instance, along the shores and islands of the Potomac River, the silt layer has been measured to be as much as 20 to 22 feet deep. Radiocarbon dating suggests that this layer is about ten thousand years old, and it was probably deposited as the climate warmed following glaciation. Floods tend to remove this material and deposit it elsewhere, as was recorded in a flood in 1996 (Southworth and Denenny 2006).

Although the Potomac River remained free of glaciers throughout the last glacial cycle, climatic changes significantly increased the incision rate of the river in the late Pleistocene. Snowmelt floods increased in number, severity, and duration due to warming temperatures. In addition, meltwater generated by the retreat of glaciers increased the amount of discharge during this period until the climate stabilized during the Holocene (Reusser et al. 2004). These events could have raised the water table, creating connections between hypotelminorheic habitats. Glacial forebulge also may have raised Mather Gorge, leading to an increase in the incision rate of the river (Southworth and Denenny 2006). Since N I and N III are underlain by bedrock from the Mather Gorge, glacial forebulge and the increased incision of rock by the river may have separated the sites from the rest of the Stygobromus population.

Changes in the discharge and incision rate of the river, flooding events, and climate fluctuations may all be responsible for transforming hypotelminorheic habitats and the subterranean amphipod populations that inhabit them. The population structure of S. tenuis potomacus suggests that there are currently several hypotelminorheic habitats located along the GWMP. Although the habitats are fragmented, the extent of these habitats is dynamic, changing along with the water table.

The population structure of S. tenuis potomacus suggests that the seeps of the George Washington Memorial Parkway were most likely once connected to the same hypotelminorheic or another groundwater habitat. Phylogenetic analysis suggested that N I became isolated from the other hypotelminorheic habitats first, which is consistent with geological evidence. Meanwhile, the NCPA provided strong evidence that the seeps of the northern, central, and southern clusters were once part of one contiguous hypotelminorheic habitat. N II, N III, C I, C II, C III, and S II demonstrated a pattern of past gene flow with intermediate populations having gone extinct. Each one of these sites hosted some haplotypes that were similar to those found in other sites, while also hosting haplotypes that were only similar to those found in one specific site.

Wilson et al. (2009) suggested that co-occurrences of genetically distinct haplotypes could result from the successful colonization of locations by new founding populations during periods of connectivity. The high degree of genetic divergence observed among the ingroup sites may suggest that the seeps have been colonized more than once. Each ingroup seep had between 4 and 6 haplotypes. Of the thirty-four haplotypes found, nineteen were found in more than a single individual, but only four haplotypes were shared between individuals collected from two or more seeps.

Comparisons of the genetic divergence within a seep had a range that overlapped with the among-seep comparisons. The uncorrected COI nucleotide divergence within a seep ranged between 0 and 13.06%. Within-site comparisons among four ingroup seeps (N III, C II, S II, and S III) had an uncorrected COI nucleotide divergence that ranged between 3.9 and 5.6%.

The AMOVA revealed that the amount of genetic variation within each seep population (34%) was roughly equal to the amount of genetic variation among seeps within the same geographic cluster (32%). The remaining 34% of the genetic variation was among populations in different geographic clusters. These results were very different from those found by a phylogenetic study of freshwater shrimp in groundwater-fed wetlands in Australia. Gouws et al. (2007) estimated that approximately 96% of the genetic variation existed among the 23 collection localities, while only 4% of the genetic variation was within each location. This indicates that at least some of the seeps along the GWMP are not as isolated as the groundwater-fed habitats studied by Gouws et al. (2007).

Further evidence that changes in the connectivity between hypotelminorheic habitats have created population subdivision was provided by the analysis of the genetic divergence at several of the sites. For instance, the average uncorrected COI nucleotide divergence within N II was 1.5% with a range between 0 and 10.87%. This range can probably be attributed to subdivision within N II. Most N II specimens were genetically very similar or identical to those in C II or S III. Seven individuals from N II shared a haplotype with an individual from C II and an individual from S III. One specimen from N II shared another haplotype with an individual from C II. In addition, N II had three isolated haplotypes, exhibited by seven individuals. However, the maximum likelihood tree and the maximum parsimony haplotype network both demonstrated that the haplotype of one of those specimens was more similar to haplotypes observed in N III than haplotypes from any other seep, including others from N II.

Currently, C I and C III are probably connected to the same hypotelminorheic. Their haplotypes were grouped into the same clade with strong bootstrap support. In addition, C I and C III had the lowest average uncorrected COI nucleotide divergences, 0.55% (range between 0 and 1.83%) and 0.54% (range between 0 and 1.79%), respectively. The average uncorrected COI nucleotide divergence between the two sites is 0.7%. The low genetic divergence between the two sites is probably due to existing gene flow.

Although C II is located in between C I and C III, it is slightly more genetically divergent. There were two distinct groups of C II haplotypes. The first group was similar or identical to haplotypes found in N II. The second group was more similar to other haplotypes in the central and southern clusters. This suggests that C II may become, or is currently, disconnected from C I and C III. The water from C II exits directly onto the road of the GWMP. The construction of the parkway itself in 1932 may have altered the hypotelminorheic of the central cluster by artificially creating the C II seep or modifying it as alterations were made to reinforce the road (Southworth and Denenny 2006).

While the central cluster is probably fed by groundwater from one hypotelminorheic, the extent of other hypotelminorheic habitats may currently be smaller. S II and S III shared some similar haplotypes, while also hosting haplotypes that were distinct from each other. This suggests that the two seeps are connected to the same hypotelminorheic intermittently. When the water table rises, the two seeps may become connected. The corridor between the seeps might dry as the water table decreases.

There is evidence of the existence of a similar intermittent connection between the central and southern clusters. The NCPA suggested that there was a pattern of contiguous range expansion between C I, C III, and S II. The maximum likelihood tree organized haplotypes from S II, S III, and the central cluster into one strongly supported clade. Given the similarity of haplotypes that occur in C I, C II, C III, S II, and S III, the central cluster hypotelminorheic probably does become connected to the southern cluster hypotelminorheic when the water table reaches a certain level.

According to the NCPA, N III, C II, and S III were separated by allopatric fragmentation. This provides an explanation for why some haplotypes from N III are similar to those from S III, but the sites do not share identical haplotypes. The haplotypes found in C II were more similar to those found in N II. Perhaps the N III population became fragmented from the central and southern clusters before N II became separated. Since two S. tenuis potomacus samples collected from C II and S III shared the same haplotype, the sites were probably not fragmented for very long. This provides further support for an intermittent connection that forms between the central and southern hypotelminorheic depending on the level of the underlying aquifer.

NCPA also found a pattern of restricted gene flow with isolation by distance between the N II, C II, and S II sites based on their shared haplotypes. Given their geographic separation, it is difficult to believe that a connection still exists between N II and the central and southern clusters. However, a population expansion from N II is supported by the results of Tajima's D test for selective neutrality (D = -2.238, p = 0.002) (Rand 1996).
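Tajima's D compares two estimates of genetic diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. A strongly negative value reflects an excess of rare variants, which is the signature expected after a population expansion (although purifying selection can produce it as well). The sketch below implements the standard formula for a set of aligned sequences. The function name `tajimas_d` and the toy alignment are hypothetical, the sketch assumes an alignment without gaps or ambiguity codes, and it does not compute the significance value reported above.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least 4 sequences are needed")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (variable) sites in the alignment.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    # Sample-size constants from the standard formulation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical alignment in which every variable site is a singleton,
# mimicking the excess of rare variants left by an expansion.
toy_alignment = ["AAAAAA", "AAAAAT", "AAAATA", "AAATAA"]
d_toy = tajimas_d(toy_alignment)
print(f"Tajima's D = {d_toy:.3f}")
```

In the toy alignment all variants are rare, so the statistic is negative; an alignment in which variants sit at intermediate frequencies would instead give a positive value.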

N II, C II, and S II are all underlain by alluvium deposited by the Potomac River in the last ten thousand years, which suggests that the seeps may have been formed at the same time. An intermittent connection could exist between seeps that are intermediate between N II and the central and southern clusters. The area between N II and the central cluster should be thoroughly searched for seeps that might host intermediate populations of S. tenuis potomacus.

Implications for the Conservation of Rare Species of Stygobromus in Seeps

The susceptibility of hypotelminorheic habitats to fragmentation may have important conservation implications for rare subterranean seep species that are closely related to S. tenuis potomacus. The patterns of habitat fragmentation characteristic of the hypotelminorheic may be responsible for the high endemism exhibited by S. kenki and S. hayi. S. hayi is a rare seep specialist that is listed as a federally endangered species. It is known from its type locality, a small spring in the National Zoological Park, and six seeps and small springs in Rock Creek Park, Washington, D.C. S. kenki is also a rare seep specialist, known from only four seeps or small, seep-like springs in Rock Creek Park (Hutchins and Culver 2007).

The linear extent of the range of each of these species is less than 5 km (Culver, Pipan, and Gottstein 2006). However, this study of S. tenuis potomacus demonstrates that the actual range of S. kenki and S. hayi may be even more restricted than was previously thought. Similar to the hypotelminorheic habitats of the GWMP, the Rock Creek hypotelminorheic(s) inhabited by these species may go through periods of fragmentation as the water table fluctuates. If the water table rises, then connectivity may be restored, allowing gene flow to occur again between hypotelminorheic habitats.

S. tenuis potomacus seems to be a resilient species, able to survive the periodic fluctuations in the extent of a hypotelminorheic habitat. It is also a widespread species, so even if a particular population became extirpated from a hypotelminorheic habitat, the area could potentially be recolonized by a neighboring population once connectivity was restored. Recolonization of an area might not be as easy for S. kenki and S. hayi, since they may be limited to only five seeps and perhaps even fewer hypotelminorheic habitats. As a result of being restricted to a small habitat range in the highly urbanized area of Washington, D.C., S. kenki and S. hayi are threatened by a variety of anthropogenic impacts. These threats include contamination of the aquifer; impervious urban surfaces that may redirect rainwater, reduce the recharge of the aquifer, and fragment the hypotelminorheic habitat; and the compaction of soil around a seep (Hutchins and Culver 2007). Further study is necessary to assess the best way to protect the known populations of these species.

Conclusions

Examination of the population structure of S. tenuis potomacus in seeps in the George Washington Memorial Parkway revealed that, although all of the sites were probably part of one panmictic population in the past, changes in the geology of the region as the Potomac River formed resulted in the fragmentation of the hypotelminorheic. However, phylogenetic analysis also indicated that hypotelminorheic habitats are dynamic. The extent of the hypotelminorheic probably varies with fluctuations in the water table and other hydrogeologic changes.

From examining the apparent size of the area drained by a seep, Culver et al. (2006) suggested that the drainage area of hypotelminorheic habitats is usually less than 10,000 m². This study provides strong evidence that the extent of hypotelminorheic habitats is probably usually less than 5,000 m². Some hypotelminorheic habitats may range only from less than 500 m² to 1,000 m².

Hypotelminorheic habitats may be small and isolated, providing groundwater for only one seep. This is currently the case for N I and N III. Larger hypotelminorheic habitats may also exist. For example, the seeps of the central cluster are fed by a hypotelminorheic that spans approximately 0.7 km. Changes in the water table may cause intermittent connections to form between two smaller hypotelminorheic habitats, such as the 0.4 km corridor between S II and S III. Intermittent connections may form that span even longer distances, such as the approximately 5 km corridor that allows gene flow to occur between the central and southern clusters. Another variable connection may also allow occasional gene flow to occur between N II and the central cluster. However, because N II is located approximately 22 km from the central cluster, it is more likely that gene flow occurs between intermediate populations that exist between the sites and are connected by intermittent corridors. Future research should be conducted to locate these possible intermediate populations.

In addition, the area surrounding S I should be surveyed in order to locate Stygobromus specimens that may belong to the same population as the individuals collected from S I. Morphological examination in conjunction with genetic analysis is needed to determine whether the S I site was colonized by S. pizzinii or another species of Stygobromus, perhaps not yet described.

Furthermore, the fact that Stygobromus was able to quickly colonize and persist in the S I site, despite its being an artificially created seep, demonstrates the ability of the genus to disperse to new groundwater habitats. This may allow populations to persist despite environmental changes. The subdivision present in many of the S. tenuis potomacus populations demonstrates that fluctuations in the extent of the hypotelminorheic result in a cycle between genetic isolation and gene flow. Even if a seep becomes dry on the surface, the population may still survive in the groundwater of the hypotelminorheic until connectivity to other seeps is restored.

While some species of Stygobromus, such as S. tenuis potomacus, seem to be resilient, this might not be the case for all species in the genus. As a result of their small ranges, the rare seep species S. kenki and S. hayi may be more vulnerable to the periodic fluctuations and the fragmentation of the hypotelminorheic habitat.

APPENDIX 1

NESTED CLADE PHYLOGEOGRAPHIC ANALYSIS

Table A1. Significant Results of the Nested Clade Phylogeographic Analysis. Only nested clades or interior-tip (I-T) distances with a significant within clade distance (WC) or nested distance (NC) are included. Chain of inference and interpretation are given online at http://bioag.byu.edu/zoology/Crandall_lab/programs.htm.

Clade            Nested clade or I-T    WC       NC       Inference chain
3-4              2-4                     2.47     2.31    1-2-11-12-No: Contiguous range expansion
                 2-15                    0.17     0.98
                 I-T                    -2.32    -1.33
4-4              3-4                     1.74     4.69    1-2-3-4-No: Restricted gene flow with isolation by distance
                 3-9                     0.00     8.19
                 3-6                    11.93    15.43
                 I-T                    10.47    10.18
5-4              4-6                     0.90    14.50    Restricted gene flow/dispersal but with some long-distance dispersal over intermediate areas not occupied by the species; or past gene flow followed by extinction of intermediate populations
                 4-4                     8.19     9.58
                 I-T                     7.30    -4.92
5-5              4-5                     0.00    18.12    Allopatric fragmentation
                 4-7                     0.00     6.73
Total Cladogram  6-1                     0.00    18.20    Allopatric fragmentation
                 6-2                     0.00    12.82
                 6-3                     0.00    34.42
                 6-4                    10.22    12.18

REFERENCES

Avise, J. C., J. Arnold, R. M. Ball, E. Bermingham, T. Lamb, J. E. Neigel, C. A. Reeb, and N. C. Saunders. 1987. Intraspecific phylogeography: The mitochondrial-DNA bridge between population-genetics and systematics. Annual Review of Ecology and Systematics 18: 489-522.

Carlini, David B., John Manning, Patrick G. Sullivan, and Daniel W. Fong. 2009. Molecular genetic variation and population structure in morphologically differentiated cave and surface populations of the freshwater amphipod Gammarus minus. Molecular Ecology 18: 1932-1945.

Clement, M., D. Posada, and K. A. Crandall. 2000. TCS: A computer program to estimate gene genealogies. Molecular Ecology 9, no. 10: 1657-1660.

Collier, K. J. and B. J. Smith. 2006. Distinctive invertebrate assemblages in rockface seepages enhance lotic biodiversity in northern New Zealand. Biodiversity and Conservation 15, no. 11: 3591-3616.

Crandall, K. A. and A. R. Templeton. 1993. Empirical tests of some predictions from coalescent theory with applications to intraspecific phylogeny reconstruction. Genetics 134, no. 3: 959-969.

Culver, David C. and Tanja Pipan. 2008. Superficial subterranean habitats - gateway to the subterranean realm? Cave and Karst Science.

Culver, David C., Tanja Pipan, and Sanja Gottstein. 2006. Hypotelminorheic - a unique freshwater habitat. Subterranean Biology 4: 1-7.

Excoffier, L., G. Laval, and S. Schneider. Arlequin ver. 3.0.

Finn, Debra S., David M. Theobald, William C. Black IV, and N. Leroy Poff. 2006. Spatial population genetic structure and limited dispersal in a Rocky Mountain alpine stream insect. Molecular Ecology 15, no. 12: 3553-3566.


Folmer, O., M. Black, W. Hoeh, R. Lutz, and R. Vrijenhoek. 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology 3, no. 5: 294-299.

GIS data. http://data.geocomm.com/catalog/US/61079/sublist.html (accessed 3/12/09).

Gouws, G. and B. A. Stewart. 2007. From genetic structure to wetland conservation: A freshwater isopod Paramphisopus palustris (Phreatoicidea: Amphisopidae) from the Swan Coastal Plain, Western Australia. Hydrobiologia 589: 249-263.

Gouws, G., B. A. Stewart, and C. A. Matthee. 2005. Lack of taxonomic differentiation in an apparently widespread freshwater isopod morphotype (Phreatoicidea: Mesamphisopidae: Mesamphisopus) from South Africa. Molecular Phylogenetics and Evolution 37, no. 1: 289-305.

Hebert, Paul D. N., Mark Y. Stoeckle, Tyler S. Zemlak, and Charles M. Francis. 2004. Identification of birds through DNA barcodes. PLoS Biol 2, no. 10: e312.

Hogg, I. D., M. I. Stevens, K. E. Schnabel, and M. A. Chapman. 2006. Deeply divergent lineages of the widespread New Zealand amphipod Paracalliope fluviatilis revealed using allozyme and mitochondrial DNA analyses. Freshwater Biology 51, no. 2: 236-248.

Holsinger, J. R. 1978. Systematics of the subterranean amphipod genus Stygobromus (), part II: Species of the eastern United States. Smithsonian Contributions to Zoology. Washington, D.C.: Smithsonian Institution Press.

____. 1994. Pattern and process in the biogeography of subterranean amphipods. Hydrobiologia 287, no. 1: 131-145.

____, ed. 1986. Zoogeographic patterns of North American subterranean amphipod crustaceans. Edited by Robert H. Gore and Kenneth L. Heck. Crustacean issues 4: Crustacean biogeography. Boston: A.A. Balkema.

Hutchins, Benjamin. 2007. Genetic divergence among populations of the Madison Cave isopod, Antrolana lira. American University.

Hutchins, Benjamin and D. C. Culver. 2007. Investigating rare and endemic pollution-sensitive subterranean fauna of vulnerable habitats in the NCR: U.S. National Park Service.

Kavanaugh, K. and D. Fong. 2009. Unpublished data. Washington, D.C.: American University.

Kelly, D. W., H. J. MacIsaac, and D. D. Heath. 2006. Vicariance and dispersal effects on phylogeographic structure and speciation in a widespread estuarine invertebrate. Evolution 60, no. 2: 257-267.

Knowles, L. L. 2008. Why does a method that fails continue to be used? Evolution 62, no. 11: 2713-2717.

Knowles, L. L. and W. P. Maddison. 2002. Statistical phylogeography. Molecular Ecology 11, no. 12: 2623-2636.

Krejca, Jean Kathleen. 2005. Stygobite phylogenetics as a tool for determining aquifer evolution. PhD Dissertation, University of Texas at Austin.

Leblois, Raphael, Arnaud Estoup, and Rejane Streiff. 2006. Genetics of recent habitat contraction and reduction in population size: Does isolation by distance matter? Molecular Ecology 15, no. 12: 3601-3615.

Mestrov, M. 1962. Un nouveau milieu aquatique souterrain: Le biotope hypotelminorheique. CR Acad. Sci. Paris 254: 2677-2679.

Moritz, Craig and Carla Cicero. 2004. DNA barcoding: Promise and pitfalls. PLoS Biol 2, no. 10: e354.

Nielsen, Rasmus. 2001. Statistical tests of selective neutrality in the age of genomics. Heredity 86: 641-647.

Panchal, M. 2007. The automation of nested clade phylogeographic analysis. Bioinformatics 23, no. 4: 509-510.

Panchal, M. and M. A. Beaumont. 2007. The automation and evaluation of nested clade phylogeographic analysis. Evolution 61, no. 6: 1466-1480.

Posada, D. and K. A. Crandall. 1998. Modeltest: Testing the model of DNA substitution. Bioinformatics 14, no. 9: 817-818.

Posada, D., K. A. Crandall, and A. R. Templeton. 2000. GeoDis: A program for the cladistic nested analysis of the geographical distribution of genetic haplotypes. Molecular Ecology 9: 487-488.

Rand, David M. 1996. Neutrality tests of molecular markers and the connection between DNA polymorphism, demography, and conservation biology. Conservation Biology 10, no. 2: 665-671.

Reusser, L. J., P. R. Bierman, M. J. Pavich, E. A. Zen, J. Larsen, and R. Finkel. 2004. Rapid late Pleistocene incision of Atlantic passive-margin river gorges. Science 305, no. 5683: 499-502.

Slatkin, M. 1985. Gene flow in natural populations. Annual Review of Ecology and Systematics 16, no. 1: 393-430.

Slatkin, M. and L. Voelm. 1991. F(st) in a hierarchical island model. Genetics 127, no. 3: 627-629.

Slatkin, Montgomery. 1993. Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47, no. 1: 264-279.

U.S. Geological Survey. 2001. Geology of the Chesapeake and Ohio Canal National Historical Park and Potomac River corridor, District of Columbia, Maryland, West Virginia, and Virginia, by Southworth, C. S., D. K. Brezinski, R. C. Orndorff, P. G. Chirico, and K. M. Lagueux. Washington, D.C.

Southworth, Scott and Danielle Denenny. 2006. Geologic map of the national parks in the National Capital Region, Washington, D.C., Virginia, Maryland, and West Virginia.

Swofford, D. L. PAUP*. Phylogenetic analysis using parsimony (and other methods), version 4.0b10. Sinauer Associates, Sunderland, MA.

Tajima, Fumio. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.

Templeton, A. R. 1998. Nested clade analyses of phylogeographic data: Testing hypotheses about gene flow and population history. Molecular Ecology 7, no. 4: 381-397.

____. 2009. Why does a method that fails continue to be used? The answer. Evolution 63, no. 4: 807-812.

Templeton, A. R. and N. J. Georgiadis. 1995. A landscape approach to conservation genetics: Conserving evolutionary processes in African bovids. In Conservation genetics: Case histories from nature, ed. J. Avise and J. Hamrick. New York: Chapman and Hall.

Templeton, A. R., E. Routman, and C. A. Phillips. 1995. Separating population structure from population history: A cladistic analysis of the geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum. Genetics 140, no. 2: 767-782.

Vainola, R., J. D. S. Witt, M. Grabowski, J. H. Bradbury, K. Jazdzewski, and B. Sket. 2008. Global diversity of amphipods (; Crustacea) in freshwater. Hydrobiologia 595: 241-255.

Whitler, Jeannie. 2009. George Washington Memorial Parkway tract and boundary data. http://science.nature.nps.gov/nrdata/index.cfm.

Wilson, G. D. F., C. L. Humphrey, D. J. Colgan, K. A. Gray, and R. N. Johnson. 2009. Monsoon-influenced speciation patterns in a species flock of Eophreatoicus Nicholls (Isopoda; Crustacea). Molecular Phylogenetics and Evolution 51, no. 2: 349-364.

Witt, J. D. S. and P. D. N. Hebert. 2000. Cryptic species diversity and evolution in the amphipod genus Hyalella within central glaciated North America: A molecular phylogenetic approach. Canadian Journal of Fisheries and Aquatic Sciences 57, no. 4: 687-698.

Witt, J. D. S., D. L. Threloff, and P. D. N. Hebert. 2006. DNA barcoding reveals extraordinary cryptic diversity in an amphipod genus: Implications for desert spring conservation. Molecular Ecology 15, no. 10: 3073-3082.

Wright, S. 1943. Isolation by distance. Genetics 28: 114-138.