Western Kentucky University TopSCHOLAR®

Masters Theses & Specialist Projects Graduate School

Summer 2021

Predictive Distributional Modeling of Rare and Uncommon Stoneflies (Insecta: ) of the Central Appalachian Mountains Using Maximum Entropy

Phillip Nathaniel Hogan

Follow this and additional works at: https://digitalcommons.wku.edu/theses

Part of the Biodiversity Commons, Entomology Commons, and the Terrestrial and Aquatic Ecology Commons

This Thesis is brought to you for free and open access by TopSCHOLAR®. It has been accepted for inclusion in Masters Theses & Specialist Projects by an authorized administrator of TopSCHOLAR®. For more information, please contact [email protected]. PREDICTIVE DISTRIBUTIONAL MODELING OF RARE AND UNCOMMON STONEFLIES (INSECTA: PLECOPTERA) OF THE CENTRAL APPALACHIAN MOUNTAINS USING MAXIMUM ENTROPY

A Thesis Presented to The Faculty in the Department of Biology Western Kentucky University Bowling Green, Kentucky

In Partial Fulfillment Of the Requirements for the Degree Master of Science

By Phillip Nathaniel Hogan

August 2021 PREDICTIVE DISTRIBUTIONAL MODELING OF RARE AND UNCOMMON STONEFLIES (INSECTA: PLECOPTERA) OF THE CENTRAL APPALACHIAN MOUNTAINS USING MAXIMUM ENTROPY

______Associate Provost for Research and Graduate Education ACKNOWLEDGEMENTS

In its entirety, a full list of people who helped bring this thesis to reality would fill more pages than this manuscript itself. Too many individuals to list aided or assisted in some way to make this thesis what it is today. Above all, I would like to thank Dr. Scott

Grubbs for continuously providing guidance, offering overwhelming support, and willing to go the extra mile while mentoring me. Dr. Jarrett Johnson was beyond helpful in extracting and amplifying DNA sequences, in addition to offering guidance and constructive feedback. Jarrett was more than generous with sharing his lab space and materials. I would also like to thank Dr. John Andersland for helping me use the SEM and teaching me the ropes of microscope photography. I cannot forget to mention Dr.

Keith Philips who helped present new directions for this thesis work.

This project would never have gotten far if it had not been for the help from my good friend, Zach Talbert, who provided housing throughout the duration of my collecting trips in western Maryland and for keeping me company throughout this study, both of which I am extremely grateful for. I must thank both of my parents, Edward and

Rochelle Hogan, who helped support my research and by giving me a place to crash when my collecting trips led me home.

Finally, it would be a mistake to not mention the Maryland Department of Natural

Resources for preserving a piece of the natural beauty of the Appalachians and protecting the regions that I had the joy to collect in. Often while collecting, I would find myself stopping in wonderment, curiosity, or awe at a waterfall or rock formation that was hidden far from trails and roads.

III

TABLE OF CONTENTS

INTRODUCTION ...... 1 METHODS ...... 7 RESULTS ...... 17 DISCUSSION ...... 31 TABLES ...... 41 FIGURES ...... 50 LITERATURE CITED ...... 102

IV

LIST OF TABLES

Table 1. Number of valid presence records prior to 2020 for Appalachian stonefly species targeted with distribution modeling with state Species of Greatest Conservation Need

(SGCN) status listed. Delaware = DE, Maryland = MD, Pennsylvania = PA, Virginia =

VA, West Virginia = WV...... 41

Table 2. Description of variables prepared for use in MaxEnt for each target species.

Bioclimatic variables from WorldClim are averaged values from 1970 ̶ 2000...... 42

Table 3. Evaluation metrics for each the MaxEnt model generated for each target species using only pre-2020 collection data. Mean AUC is averaged over 15 repetitions when creating the model and the standard deviation is shown. Regularized Training Gain

(RTG) and Test Gain (TG) are presented for each model along with the Maximum

Training Specificity-Sensitivity (MaxTSS) used as a threshold for producing binary maps of suitable habitat...... 43

Table 4. Percent contribution and permutation importance of selected environmental variables used to implement MaxEnt modeling with pre-2020 collection data...... 44

Table 5. Evaluation metrics for each the MaxEnt model generated for each target species using both pre-2020 and 2020 collection data. Mean AUC is averaged over 15 repetitions when creating the model and the standard deviation is shown. Regularized Training Gain

(RTG) and Test Gain (TG) are presented for each model along with the Maximum

V

Training Specificity-Sensitivity (MaxTSS) used as a threshold for producing binary maps of suitable habitat...... 47

Table 6. Percent contribution and permutation importance of selected environmental variables used to implement MaxEnt modeling with pre-2020 and 2020 collection data.

...... 49

VI

LIST OF FIGURES

Figure 1. Heatmap of Pearson’s Correlation Coefficient between variables considered for

MaxEnt modeling. Warmer colors (e.g., yellow) represent higher correlation and cooler colors (e.g., blue) represent lower correlation...... 50

Figure 2. Location of 2020 collection sites in Maryland used to field test distributional models...... 51

Figure 3. Binary probability of habitat suitability for Allocapnia frumi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training

Sensitivity – Specificity threshold...... 52

Figure 4. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia frumi using pre-2020 records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 53

Figure 5. Binary probability of habitat suitability for Allocapnia harperi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 54

Figure 6. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia harperi using pre-2020 collection data. Each curve shows

VII the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 55

Figure 7. Binary probability of habitat suitability for Allocapnia simmonsi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 56

Figure 8. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia simmonsi using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 57

Figure 9. Binary probability of habitat suitability for Allocapnia simmonsi generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold...... 58

Figure 10. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia simmonsi using both pre-2020 and 2020 collection data.

Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions. ....59

Figure 11. Binary probability of habitat suitability for Alloperla biserrata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 60

VIII

Figure 12. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla biserrata using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 61

Figure 13. Binary probability of habitat suitability for Alloperla biserrata generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold...... 62

Figure 14. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla biserrata using both pre-2020 and 2020 collection data.

Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions. ....63

Figure 15. Binary probability of habitat suitability for Sweltsa palearata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 64

Figure 16. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa palearata using pre-2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 65

IX

Figure 17. Binary probability of habitat suitability for Sweltsa palearata generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold...... 66

Figure 18. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa palearata using pre-2020 and 2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 67

Figure 19. Binary probability of habitat suitability for Sweltsa pocahontas generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland...... 68

Figure 20. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa pocahontas using pre-2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 69

Figure 21. Binary probability of habitat suitability for Sweltsa pocahontas generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland...... 70

X

Figure 22. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa pocahontas using pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 71

Figure 23. Binary probability of habitat suitability for Alloperla aracoma generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland...... 72

Figure 24. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla aracoma using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 73

Figure 25. Binary probability of habitat suitability for Alloperla aracoma generated from the logistic output from MaxEnt using both pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies new collections from within Garrett County, Maryland.

...... 74

Figure 26. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla aracoma using both pre-2020 and 2020 collection data.

Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions. ....75

XI

Figure 27. Binary probability of habitat suitability for Rasvena terna generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training

Sensitivity – Specificity threshold...... 76

Figure 28. MaxEnt response curves for each environmental variable used to identify suitable habitat for Rasvena terna using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 77

Figure 29. Binary probability of habitat suitability for Utaperla gaspesiana generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold. The record of U. gaspesiana from West

Virginia was not included in analyses due to the location uncertainty...... 78

Figure 30. MaxEnt response curves for each environmental variable used to identify suitable habitat for Utaperla gaspesiana using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 79

Figure 31. Binary probability of habitat suitability for Megaleuctra flinti generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 80

Figure 32. MaxEnt response curves for each environmental variable used to identify suitable habitat for Megaleuctra flinti using pre-2020 collection data. Each curve shows

XII the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 81

Figure 33. Binary probability of habitat suitability for Megaleuctra flinti generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold...... 82

Figure 34. MaxEnt response curves for each environmental variable used to identify suitable habitat for Megaleuctra flinti using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 83

Figure 35. Binary probability of habitat suitability for Ostrocerca complexa generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland...... 84

Figure 36. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca complexa using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 85

Figure 37. Binary probability of habitat suitability for Ostrocerca complexa generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold. The inset map location is

XIII highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland...... 86

Figure 38. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca complexa using both pre-2020 and 2020 collection data.

Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions... ..87

Figure 39. Binary probability of habitat suitability for Ostrocerca prolongata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 88

Figure 40. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca prolongata using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 89

Figure 41. Binary probability of habitat suitability for Soyedina kondratieffi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland...... 90

Figure 42. MaxEnt response curves for each environmental variable used to identify suitable habitat for Soyedina kondratieffi using pre-2020 records. Each curve shows the

XIV mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 91

Figure 43. Binary probability of habitat suitability for Soyedina kondratieffi generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a

Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland...... 92

Figure 44. MaxEnt response curves for each environmental variable used to identify suitable habitat for Soyedina kondratieffi using pre-2020 and 2020 collection records.

Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions... ..93

Figure 45. Binary probability of habitat suitability for Hansonoperla appalachia generated from the logistic output from MaxEnt using pre-2020 collection data with a

Maximum Training Sensitivity – Specificity threshold...... 94

Figure 46. MaxEnt response curves for each environmental variable used to identify suitable habitat for Hansonoperla appalachia using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 95

XV

Figure 47. Binary probability of habitat suitability for Bolotoperla rossi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum

Training Sensitivity – Specificity threshold...... 96

Figure 48. MaxEnt response curves for each environmental variable used to identify suitable habitat for Bolotoperla rossi using pre-2020 records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions...... 97

Figure 49. Scatterplot of MaxEnt model performance for each of the target species using collection data prior to 2020 against the amount of suitable habitat identified by the

MaxEnt algorithm. The black line is a linear regression, with the equation and R2 value shown in the top right corner...... 98

Figure 50. Scatterplot of MaxEnt model performance for each of the target species with collection data from 2020 against the amount of suitable habitat identified by the MaxEnt algorithm. The black line is a linear regression, with the equation and R2 value shown in the top right corner...... 99

Figure 51. Average percent permutation importance of seven environmental variables used to generate distributional models in MaxEnt for 15 stonefly species using pre-2020 collection data. Land cover (NLCD) was only used in eight of the 15 models...... 100

Figure 52. Average percent permutation importance of seven environmental variables used to generate distributional models in MaxEnt for eight stonefly species using pre-

XVI

2020 and 2020 collection data. Land cover (NLCD) was only used in four of the eight models...... 101

XVII

PREDICTIVE DISTRIBUTIONAL MODELING OF RARE AND UNCOMMON STONEFLIES (INSECTA: PLECOPTERA) OF THE CENTRAL APPALACHIAN MOUNTAINS USING MAXIMUM ENTROPY

Phillip Hogan August 2021 118 Pages

Directed by: Scott Grubbs, Jarrett Johnson, and Keith Philips

Department of Biology Western Kentucky University

Predicting where rare species may be found is important in addressing and directing conservation efforts. Knowledge of the distribution for many of these taxa is often lacking or unknown altogether. The use of species distributional modeling fills gaps in this knowledge by predicting where a species may be present by taking a correlative approach between presence/pseudoabsences and environmental data. The aim of this study was to describe the distribution of several rare and uncommon aquatic using

Maximum Entropy (MaxEnt) modeling as human influences within the central

Appalachian Mountains are increasing and isolating pockets of biodiversity. Species distribution modeling of 15 central Appalachian stoneflies (Insecta: Plecoptera) resulted in the identification of potentially suitable habitat that was subsequently field-tested with adult collections in Maryland during the emergence period of each target species. This method yielded 29 new collections of seven target species. Locations from these collections of targeted species were used to generate refined models of species distribution following an iterative process. Final models now function as a guide for future collecting events.

XVIII INTRODUCTION

Conservation biology has largely focused on protecting and studying charismatic, vertebrate megafauna (Collier et al. 2016), however, recent assessments of biomass have indicated that rapid biodiversity loss is threatening ecosystems with collapse (Yang and Gratton 2014; Hallmann et al. 2017; Stepanian et al. 2020). Among the most threatened fauna are those in freshwater systems because these habitats are intricately connected with human presence in the terrestrial area of the drainage basin (McCluney et al. 2014). The absence of freshwater invertebrate species offers an important insight into how impacted an aquatic system is by anthropogenic threats (Harding et al. 1998). These threats include chemical and thermal water pollution (Clark 1969), climate change

(Woodward et al. 2010), habitat degradation (Newcombe and Macdonald 1991), eutrophication (Alexander and Smith 2006), introduction of invasive species

(McCormick et al. 2010), and a combination thereof (Polhemus 1993; Strayer and

Dudgeon 2010; Collier et al. 2016).

Aquatic insect taxa in the orders Ephemeroptera (mayflies), Plecoptera

(stoneflies), Odonata (damselflies and dragonflies), and Trichoptera (caddisflies) are at an elevated risk of imperilment compared to other aquatic insects (Sanchéz-Bayo and

Wyckhuys 2019). Of these four orders, stoneflies have been documented as having the highest percentage of imperiled species (Master et al. 2000; DeWalt et al. 2005). Wilcove and Master (2005) estimated that 21% of Nearctic stoneflies are imperiled, while Master et al. (2000) estimated that 43% of all stonefly species are in need of conservation efforts.

In regions such as Illinois, where large-scale agricultural practices have significantly

1 altered the landscape, greater than 70% of the stonefly fauna can be considered imperiled, extirpated, or extinct (DeWalt et al. 2005).

Stoneflies are small-bodied aquatic insects characteristic of highly oxygenated lotic systems found across a wide range of stream sizes, and some also inhabit wave- swept shorelines of mountain lakes and large, cold lakes (Helešic 2001). Larval stoneflies are commonly used as biological indicators by state, federal, and private conservation agencies (Hilsenhoff 1982; Eaton and Lenat 1991; Lenat 1993; Rosenberg and Ross

1993; Barbour et al. 1999). Stoneflies are temperature and pollution sensitive with an increased need for conservation from anthropogenic impacts, as minute changes in water quality may result in local extirpations and possible extinction (Baumann 1979;

Malmqvist and Rundle 2002). DeWalt et al. (2005) noted that the majority of recently extirpated stonefly species in Illinois had similar life history strategies. The largest losses of species were found in semivoltine or slower univoltine species indicative of severe ecological disturbances such as groundwater withdrawal and loss of permanent stream flow. Bojková et al. (2012) identified large river habitat specialists and habitat generalists as being extirpated from the Czech Republic or facing the most threats from increased agricultural and stream modifications.

The loss of large-bodied, potamal species is exacerbated by the low capacity of stonefly species to disperse to new, unimpacted habitats (Zwick 1992). Following a disturbance and subsequent extirpation, stonefly species are often the last group of aquatic insects to recolonize and are found in lower densities than during pre-disturbance surveys (Wallace et al. 1986). Poor lateral dispersal mechanisms of stoneflies, which are largely inadequate to cross large terrestrial habitats and lentic bodies of water that

2 function as barriers to connectivity of suitable habitat, can be attributed to the reduced capability of stoneflies for recolonization post-disturbance (Griffith et al. 1998; Briers et al. 2002). Limited dispersal capacities combined with habitat degradation across large scales has yielded high levels of endemism with lesser likelihood for the reestablishment of local populations (Dudgeon et al. 2006; Strayer 2006; Strayer and Dudgeon 2010;

Pond 2012; Grubbs 2021). Pond (2012) found that long-term disturbances (i.e.,, residential development, surface mining) within a catchment leads to localized extinctions of stoneflies within headwater streams with a lessened chance for recolonization. DeWalt and South (2015) noted that smaller-bodied species of stoneflies may be more aerodynamically suitable to dispersing longer distances as adults than larger species, providing further evidence that it is the larger-bodied stonefly taxa (i.e., those in the families Perlidae, Perlodidae, and Pteronarcyidae) that appear most at risk for imperilment (DeWalt et al. 2005). In addition, poor dispersal mechanisms of these groups restrict stonefly populations to habitats of inadequate quality that may be further degraded by human activities (Zwick 1992; Griffith et al. 1998; Petersen et al. 1999).

One key element in attempting to employ effective conservation methods on insect taxa that are at risk is to study regional biogeographical and distributional patterns

(Tronstad et al. 2018). For many species, geographic distributions are not well-defined or are wholly unknown, indicative of inadequate collecting efforts. The lack of information on a species distribution creates limitations for conservation studies, termed as the

Wallacean Shortfall (Lomolino and Heaney 2004; Cardoso et al. 2011). Addressing the

Wallacean Shortfall through Species Distribution Models (SDM) allow for the identification of potential suitable habitat of a species within distributional gaps and

3 under sampled regions. SDM uses statistical methods to generate predictions of species distribution ranges by combining environmental characteristics found in locations where the targeted species is present and applying these characteristics to find similar areas across a study region (Warton et al. 2013; El-Gabbas and Dormann 2018). Modeling distributions requires a diverse range of environmental data, such as elevation, land cover, stream order, mean annual temperature, and water chemistry, to provide the most accurate predictions of suitable habitats for aquatic invertebrates (Cao et al. 2013).

Utilization of SDM allows targeted, efficient sampling of potential localities of a target species by identifying suitable habitat where the species may be present within unsampled regions in a broad landscape (Tronstad et al. 2018; Young et al. 2019A). The products of SDM are hypotheses that are reliant upon ground truthing for validation with the potential for refined model construction.

As collection data are often limited for rare and uncommon species, distributional modeling is reliant upon methods that can work well with low sample sizes. One such method, Maximum Entropy modeling, or MaxEnt (Phillips et al. 2006; Phillips and

Dudík 2008), has been demonstrated to effectively determine species distributions with samples as low as three for narrowly endemic species and 13 for widespread species, making this an effective tool in the conservation of rare stoneflies (van Proosdij et al.

2016). MaxEnt is a machine-learning algorithm that combines environmental variables and presence-only data to predict suitable habitat for a target species across a landscape.

Several advantages of MaxEnt compared to other SDM methods (e.g., general linear models, general additive models, boosted regression trees, etc.) is the inclusion of presence-only data as opposed to presence-absence data (Elith et al. 2010), accurate

4 predictions with low sample sizes and rare species (Pearson et al. 2006; Young et al.

2019A), and the ability to utilize both continuous and categorical environmental datasets

(Phillips et al. 2006). The ability to use presence-only data, or data where the true absences are not recorded, is important as true absences are often not feasible to accurately determine as time and resource limitations constrain diversity and conservation research (Elith et al. 2010). Additionally, presence-only data is also often the only form of data provided by natural history museum collections and literature records (Rondinini et al. 2006).

SDM have been applied to stoneflies successfully in a variety of applications.

Predictions of Acroneuria frisoni Stark & Brown, 1991 using SDM indicated that the historical range of this species expanded following retreating glaciers during the

Wisconsinan and Illinoian glaciation episodes (Pessino et al. 2014). The application of

SDM in conjunction with a phylogeographic approach, permitted divergence dating between populations supporting historic range expansions and contractions corresponding to glaciation. Likewise, historic distribution trends were recreated for the western

Nearctic stonefly, Doroneuria baumanni Stark & Gaufin, 1974, using similar methods to identify ranges at differing climates episodes during the Pleistocene and Holocene

(Schultheis et al. 2012).

Predictions of current distributions have been applied to rare species to forecast potential impacts that climate change or other anthropogenic impacts may have to aid in informing conservation efforts. SDM was recently implemented to find overlap between ranges of newly described species of congeners of the genus Remenus Ricker, 1952

(Verdone 2018). The application of SDM allowed for the comparison of predicted current

5 distribution of each species in relation to other congeners, guiding which species could be prioritized for future conservation efforts. Muhlfield et al. (2011) presented distribution models for Lednia tumana (Ricker, 1952), as this species has been petitioned for federal protection and listing as a federally endangered species by the US Fish and Wildlife

Service (USFWS 2019). One model predicted drastic range reductions under future climate scenarios in which glaciers feeding the tributaries that this species is dependent upon melt entirely and reduce the suitable habitat of the species to less than five km2

(Muhlfield et al. 2011). Muhlfield et al. (2011) helped define and prioritize this species as a target for conservation by the predicted sharp decreases in the range and the currently observed low abundance throughout its range. Young et al. (2019A) presented distributional models for Arsapnia arapahoe (Nelson & Kondratieff, 1988) to define the geographic range as a guide for future collection attempts, as this species was also previously petitioned for federal protection (Young et al. 2019B). Models were generated iteratively by adding presence data following new collections to continually refine SDM of this stonefly in hopes of improving descriptions of this species distribution (Young et al. 2019A). Generating models using SDM allows for refined hypotheses of the distribution of a target species, both contingent upon, and guiding future collections.

Models of rare and uncommon species using small samples sizes are sufficient to direct attempts to find new localities of a species (van Proosdij et al. 2016; Tronstad et al.

2018; Young et al. 2019A; Konowalik and Nosol 2021). The United States Geological

Survey (USGS) compiles a list of rare, threatened, or endangered species listed under each State Wildlife Action Plan (SWAP). This list of Species of Greatest Conservation

Need (SGCN) includes multiple species of stoneflies from the Appalachian region in

6 general. The objectives of this study were to (1) estimate the distribution of SGCN, rare, or otherwise uncommon stoneflies from the central Appalachians to guide future sampling attempts, (2) locate additional populations of SGCNs, rare, or uncommon stoneflies with collections within regions identified by MaxEnt, and (3) generate updated models refining the predicted suitable habitat with location data from recent collections.

METHODS

Target species

Central Appalachian stonefly species were selected for modeling based on SGCN status in Maryland and neighboring states, expert opinion on the potential for each species range to intersect Maryland’s borders, and those with enough documented records to generate a distribution model. The latter of the three requirements eliminated rare species with very limited distributional information (e.g., at or near the type localities only), namely Soyedina merritti Baumann & Grubbs, 1996, Acroneuria kosztarabi

Kondratieff & Kirchner, 1993, A. yuchi Stark & Kondratieff, 2004, and Taeniopteryx nelsoni Kondratieff & Kirchner, 1982 for modeling efforts. A total of 13 species listed as

SGCNs and two additional species that are not listed, but are considered rare compared to congeners, were targeted in this study for SDM (Table 1).

Data mining for documented occurrence records of targeted species predating the beginning of this study (i.e., published prior to January 2020) included an extensive literature search and database queries (Table 1). Records without coordinate pairs were assigned GPS data to the closest degree of precision obtainable from provided location

7 descriptions using a combination of Google Earth 7.3 and Acme Mapper 2.2

(https://mapper.acme.com). Elevation data were added to each record for the corresponding coordinate pair from the National Elevation Dataset (NED) using GPS

Visualizer (https://gpsvisualizer.com).

A total of fifteen species were selected as target species for modeling. Of which, three Allocapnia Claassen, 1928 species met the aforementioned requirements. Two, A. frumi Kirchner, 1982 and A. harperi Kirchner, 1980, are known from relatively restricted ranges along the central Appalachians. The first, A. frumi is known only from central

West Virginia and western Maryland (Kirchner 1982; Grubbs 1997; DeWalt et al. 2020).

Collections of A. harperi span from southwestern Pennsylvania to western North

Carolina (Kirchner 1980; 1982; Grubbs 1996; 1997; DeWalt et al. 2020; Metzger and

Grubbs, in preparation). The sole record of A. harperi from North Carolina was unavailable at the time of this study and was not included. The third species, A. simmonsi

Kondratieff & Voshell, 1979 is known from southeastern Pennsylvania south to southwestern Virginia (Kondratieff and Voshell 1979; Hood et al. 2008; DeWalt et al.

2020). Both A. frumi and A. simmonsi are known from 10 or fewer collections while A. harperi is known from 26 collections (Table 1).

Six species in the family were selected for modeling, five of which have SGCN status. The sixth, Rasvena terna (Frison, 1942), was not listed as a

SGCN, but was selected as this species is uncommonly collected throughout its known range from southern Quebec south to east Tennessee and western North Carolina (Grubbs and Singai 2018; DeWalt et al. 2020). A total of 34 collections of R. terna were identified for modeling. Two species of Alloperla Banks, 1906, two species of Sweltsa Ricker,

8

1943, and Utaperla gaspesiana Harper & Roy, 1975 were the SGCN-listed chloroperlid species selected for modeling. Alloperla aracoma Harper & Kirchner, 1978 has been collected sparsely from Kentucky to southwestern Pennsylvania (Grubbs 1996; Griffith and Perry 1992; Tarter et al. 2015; DeWalt et al. 2020). A total of 16 collections are available for A. aracoma throughout its range. Alloperla biserrata Nelson & Kondratieff,

1980, has been collected from a total of 24 unique locations from southwestern

Pennsylvania to southwest Virginia (Nelson and Kondratieff 1980; Surdick 2004; DeWalt et al. 2020). Sweltsa palearata Surdick, 2004 shares a similar distribution as A. biserrata, with collections ranging from southwestern Pennsylvania to southwestern Virginia

(Surdick et al. 2004; Earle 2009; Grubbs and Baumann 2019A; DeWalt et al. 2020). Most of the 38 collections of S. palearata are clustered within western Maryland (Surdick

2004; Stark et al. 2011). Sweltsa pocahontas Kirchner & Kondratieff, 1988 is known from the holotype location in central West Virginia to western Maryland (Kirchner and

Kondratieff 1988; Grubbs 1997; Stark et al. 2011; DeWalt et al. 2020). This species had the lowest sample size for this family (n=10; Table 1). Utaperla gaspesiana is typically uncommon throughout its range, spanning from Quebec and Ontario south to West

Virginia, following the Appalachians highlands and central Appalachian Plateau (Harper and Roy 1975; Tarter and Kirchner 1980; Surdick 2004; Tarter and Nelson 2006; DeWalt et al. 2020). Only 13 literature records were available for U. gaspesiana (Table 1).

Three species from the family were selected, including one without

SGCN status. The two with SGCN listings, Ostrocerca complexa (Claassen, 1937) and

O. prolongata (Claassen, 1923) are Appalachian species found in springs and headwater streams (Young et al. 1989). Ostrocerca complexa is known from Quebec and Ontario

9 south to Virginia and West Virginia (Young et al. 1989; Tarter and Nelson 2006; DeWalt et al. 2020), including a single collection from northern Delaware (Lake 1980). This species is comparatively rarer than its congener O. complexa, as the total records collected for O. complexa was 23 and the total for O. prolongata was seven (Table 1).

The third nemourid species selected for modeling was Soyedina kondratieffi Baumann &

Grubbs, 1996. This species ranges from Maryland south to Tennessee and North Carolina

(Baumann and Grubbs 1996; Grubbs and Baumann 2019B). Collections of this species are uncommon (n = 21) throughout its range.

A single species, Bolotoperla rossi (Frison, 1942), from the family

Taeniopterygidae was selected for modeling. This species is listed on Virginia’s SWAP, but is known from a broad Appalachian distribution with records known from Georgia

(Grubbs unpublished data) north to Quebec (Stark et al. 2016; DeWalt et al. 2020). This species represents the highest samples size of all species collected (n = 58). Megaleuctra flinti Baumann, 1973 and Hansonoperla appalachia Nelson, 1979 are also single representatives of their families, Leuctridae and Perlidae, respectively. Megaleuctra flinti known from a narrow range with published collections from southwestern Pennsylvania to Virginia (Baumann 1973; Grubbs 1996; Baumann and Stark 2013; DeWalt et al.

2020). A total of 27 literature records were available for this species (Table 1).

Hansonoperla appalachia is collected infrequently throughout its known range, which spans from New Hampshire south to North Carolina (Nelson 1979; Kondratieff and

Kirchner 1988; 1996; Earle 2009; DeWalt et al. 2020). A total of 15 literature records of

H. appalachia were available.

10

Pre-collection Species Distribution Modeling

All geographic data were processed using a combination of ArcMap 10.7 (ESRI

2016) and QGIS version 3.16 (QGIS Development Team 2020). Twenty-two environmental variables constituted the initial set of parameters used for modeling (Table

2). A total of 19 environmental variables were downloaded from the WorldClim website

(https://worldclim.org/) at 30 arcsecond resolution and were used as surrogates for in- stream habitat characteristics since fine scale datasets over a broad landscape for hydrologic characters were unavailable (Carlisle et al. 2010; Fick and Hijmans 2017). A

Digital Elevation Model (DEM) raster was downloaded from the NED at a 1/3 arcsecond resolution (USGS 2017). A raster file containing slope values was calculated using the

‘Slope’ feature in the Spatial Analyst Toolbox in ArcMap from the DEM. Slope was added to the list of environmental variables as the slope of a stream or river influences the velocity of water flow (Hallema et al. 2016). Solar radiation raster files were generated for each season using the ‘Solar Radiation’ tool in ArcMap. A land cover raster file was downloaded from the GAP/LANDFIRE National Terrestrial Ecosystems dataset at a

30m2 resolution (USGS 2011). The land cover data is derived from 2011 satellite imagery by the USGS National Center for Earth Resources Observation and Sciences. These land cover data summarize land use in terms of human development and influences and vegetation types across the conterminous United States (USGS 2011). The extent of each raster file was limited to the US Environmental Protection Agency (EPA) Level III

Ecoregions where pre-2020 collections of each target species have been reported. With the landcover dataset limited to the US borders, distribution models were also delimited to this extent and did not include records or environmental data outside the geopolitical

11 borders of the conterminous United States. Each raster file was resampled to match the resolution of the elevation file (1/3 arcsecond). Pearson’s Correlation Coefficient (r) was calculated between each environmental raster file using the R package ENMTools

(Figure 1; Warren et al. 2019). Highly correlated environmental variables (r > 0.70) were removed prior to modeling to avoid collinearity and reduce overfitting of the model leaving seven variables (Table 2).

One consequence of using presence-only data is that estimates of a species distribution may include a level of spatial bias as a remnant of collections limited to easily accessed areas. To avoid spatial bias when generating SDMs for each target stonefly species, 10,000 background points (coordinate pair data overlaid on environmental variable raster files to draw comparative data against the presence data of a target species) were randomly drawn from USGS Hydrologic Unit Code (HUC) 12 watersheds with collection records from the same family and limited to the same spatial extent as the environmental variables. Background points allow the MaxEnt algorithm to sample environmental data available within the landscape for the target species, providing a better estimation of the species distribution and tolerance to environmental values while also minimizing omission and commission errors (Elith et al. 2010; Kramer-

Schadt et al. 2013). Termed a “bias file,” this method delimits where background points are drawn from, thus providing a distribution similar to that of sampling effort and reduces over-representation of regions that are extensively surveyed (Kramer-Schadt et al. 2013). Only selecting HUC 12 watersheds with collections from confamilial species limited the bias file to regions that were sampled in similar seasons and habitat to the target species. Duplicate records, plus records within the same cell area as another record,

12 were removed to further limit spatial bias. All files were converted to ASCII files using the ‘Conversion Tools’ within the Toolbox feature in ArcMap as this is the default input file type used by the MaxEnt program.

Models were generated using the default settings in MaxEnt 3.4.3 (Phillips et al.

2006) with minor modifications. The algorithm parameters were changed so that only linear and quadratic features were used. These features constrain the prediction of suitable habitat to the variance and the expectation from the sample data (Elith et al.

2010; Merow et al. 2013). The number of iterations per replication, was changed from

500 to 5,000 to allow time for the model to meet the default convergence threshold of

0.00001. The remaining altered setting was the replicated run type, which was changed to

“Cross-validate.” This setting instructs the program to use a randomized percentage of the presence data to train the model and the remainder is withheld to test the model. Each model was replicated a maximum of 15 times, using different randomized partitions of the presence data in each replication. For species with sample sizes < 15, a single individual was removed from the model training data and used to test the model. This was repeated using a different individual for each replicate until all individuals had been used to test the model. For models with < 15 samples, the total number of replications was equal to the sample size. In both scenarios, models with < 15 samples and models with > 15 samples, all of the replicates were averaged to create the final distribution model.

MaxEnt produces a continuous logistic prediction with values ranging from 0 to 1 as a probability of habitat suitability, with values closer to 1 representing higher probability of suitable habitat (Merow et al. 2013). A threshold was applied to the logistic

13 output of each target species to provide a binary map of habitat suitability where only habitat identified as being suitable is shown. The Maximum Training Sensitivity plus

Specificity (MaxTSS) threshold was applied to utilize the number of accurately predicted presences (sensitivity) among tested presences and accurately predicted absences among background points (specificity) (Liu et al. 2016). MaxTSS is the rate of the omission and commission errors identified by MaxEnt during model training using the background and presence-only data provided. Use of MaxTSS as a threshold in distribution models has been shown to have increased accuracy compared to other thresholds (Cao et al. 2013).

The binary output is representative of the potential distribution based on the correlations between the environmental variables provided and the presence data of a species rather than the true realized distribution (Liu et al. 2013).

Adult specimen collection

Field work for adult stoneflies in 2020 were targeted in regions identified as suitable habitat by SDM within the Appalachian counties of Maryland (Garrett, Allegany,

Washington, and Frederick counties) with Maryland DNR Scientific Collecting Permit

(Number: SCP202064) (Figure 2). Adult collections spanned from January to December

2020 with the use of a beating sheet from riparian vegetation, UV light traps, and hand- picking with forceps from bridges, culverts, streamside rocks, and vegetation. Sites were accessed from roads, bridges, trail crossings, and backcountry hiking mainly within public land (i.e., Maryland state forests, parks, wildlife management areas, and angler- access sites). At each site of collection, GPS coordinates in decimal degrees were taken using a Garmin eTrex 10 handheld unit. Collected individuals were preserved in 95%

14 ethanol for later identification. On occasion, battery-powered UV florescent light traps were set up to collect adults of the family Perlidae at night. Identifications were completed using an Olympus SZ61 stereo microscope in the Western Kentucky

University Biodiversity Center. Voucher specimens have been deposited into the Western

Kentucky University Collection.

Post-collection Maxent Models

Following the 2020 collection period, models were again generated for target species collected from new locations with the same parameters, environmental raster files, and extents as the pre-2020 models described above. Thresholds were applied for each model using the MaxTSS value calculated by the MaxEnt program to make a binary map of habitat suitability. Both the continuous probability and binary outputs from

MaxEnt were mapped using ArcMap.

Model Evaluation

Several metrics were used to assess model performance. One of the most commonly used methods for estimating model performance is the area under the receiver- operator curve (AUC), since this test statistic quantifies a model’s ability to differentiate between randomly selected presence data and a randomly selected background point.

Scores range from 0 ̶ 1 with an AUC > 0.5 indicating that the model performed better than random. Although it remains popular, the use of AUC as the defining metric for model performance has been criticized by several authors for being used without

15 accounting for the target species biology and as being unfit for model comparison (Lobo et al. 2008; Yackulic et al. 2012). Lobo et al. (2008) found that AUC is dependent upon a species prevalence or how widespread it is throughout the study area. As the true presence of a single species within a landscape is unknown and this is the objective of

SDM, comparisons of mean AUC of models across different species is misleading (van

Proosdij et al. 2016). Suggested alternatives to AUC (e.g., True Skill Statistic, Mean

Absolute Error, Sum of Squared Errors, Maximum Calibration Error, unweighted Kappa statistic, Detection Rate, etc.) are either dependent upon species prevalence, the geographic size of the study area, or not as representative of model performance

(Konowalik and Nosol 2021). Without changing the model parameters, background points, and environmental variables between the model generated from pre-2020 collection records model and the model using both pre-2020 and 2020 collection records,

AUC is an appropriate test statistic to evaluate model performance changes between models of the same species (Merow et al. 2013). The use of AUC was supplemented by the Regularized Training Gain (RTG), a standard output of the MaxEnt program. This test statistic is a measure of the distance between the distribution of values of environmental variables at background points, assuming a uniform distribution, and the distribution of values from training locations with known occurrences and provides an estimate of the range of environmental conditions (Gormley et al. 2011). Once models are converted from the output files into a geographic processing program, model success can be visually evaluated by determining whether the presence points are located within areas of high probability of habitat suitability. Models with presence points found in

16 regions of predicted low probability are likely to have performed poorly (Yost et al.

2008).

RESULTS

The number of documented collection records used for distribution modeling of target species ranged from 7 to 58 (mean = 21.9). The two highest sample sizes used in generating species distribution models using solely pre-2020 collection records were for

S. palearata (n = 38) and B. rossi (n = 58). The mean AUC of MaxEnt models using solely pre-2020 records ranged from 0.606 ̶ 0.941 (mean = 0.761). The SDM for A. simmonsi was the lowest performing model in terms of mean AUC (Table 3). The highest mean AUC was calculated for model of A. frumi.

A total of 1,318 records were accumulated from 333 unique collection locations in the westernmost four Appalachian counties in Maryland during 2020. Eight of the 15 modeled species were collected at least once from a previously undocumented location, resulting in 29 new locality records for the target species. The highest number of new records was for A. aracoma (n = 9), with additional records collected for S. palearata (n

= 5), M. flinti (n = 4), A. biserrata (n = 3), and only two new records each were collected for A. simmonsi, O. complexa, S. kondratieffi, and S. pocahontas (Table 3). The remaining seven targeted species were not recollected, and updated models using 2020 collection data were not created for these species.

Models were generated incorporating the new presence records for species that were collected in 2020. MaxEnt models using both pre-2020 records and the 2020

17 collection data had a mean AUC range from 0.657 ̶ 0.898 (mean = 0.769; Table 5). The lowest performing model was for S. kondratieffi and the highest performing model was for A. biserrata. Model performance measured in AUC deceases as the landscape of interest and the amount of identified suitable habitat increases. Detailed results of each model are presented below in alphabetical order of each family.

Capniidae

Both A. frumi and A. harperi were targeted with collections from springs and headwater streams in western Maryland in March 2020, but these collections attempts yielded no new records. Collections were not attempted at the previously recorded locations documented by Grubbs (1997). The binary map of habitat suitability for A. frumi indicates that the majority of suitable habitat is found in central West Virginia, extending northwards into southwestern Pennsylvania (Figure 3). Approximately 17,000 km2 of suitable habitat at the MaxTSS threshold were identified throughout the landscape of interest for A. frumi. Four variables used to generate the model had negligible influences, with only annual precipitation, slope, and land cover shown to drive habitat selection. Regions with a high habitat suitability had high annual precipitations and low slope. Two land cover classes; mixed forests and developed open areas, were selected as having the greatest influence on the identification of suitable habitat (Figure 4).

The binary habitat suitability map for A. harperi indicates a more expansive range than A. frumi with 29,000 km2 of potentially suitable habitat predicted throughout the

Allegheny Mountains in central West Virginia and western Maryland and in the Blue

18

Ridge Mountains of Virginia (Figure 5). Much of the identified regions are high elevation with cool temperatures, with highlighted regions approaching the edges of the delineated study region often being ridgelines. The most influential variable in this model was annual temperature (61.7%) and the second highest was elevation (12.2%). Annual precipitation was found to decrease with increasing habitat suitability, similar to slope, and annual temperature (Figure 6). Both the minimum temperature of the coldest month and the maximum temperatures of the warmest month had minor influences on habitat selection. Land cover was found to be influential with 11.4% contribution to the model.

Within Maryland, A. harperi has been collected from high elevation springs along Big

Savage Mountain, similar habitat to that predicted by the MaxEnt algorithm.

The habitat suitability model for A. simmonsi using only pre-2020 records performed poorly in terms of AUC, likely due to a combination of small sample size and the large landscape chosen for modeling (Table 3). Approximately 26,500 km2 was identified as potentially suitable habitat for A. simmonsi in Pennsylvania south to Virginia using the pre-2020 collection data models (Figure 7). Following initial modeling using only pre-2020 collection data, two new locations of A. simmonsi were recorded from western Maryland. Both records for A. simmonsi were collected in regions identified as potentially suitable habitat by the pre-2020 model and exceeded the MaxTSS threshold

(Table 3). These two records were used to generate updated predictions of habitat suitability. Mean AUC increased between the model using solely pre-2020 collection data and the model using both pre-2020 and 2020 collection data (Table 3; Table 5).

Predictions of potentially suitable habitat for A. simmonsi increased between the pre-2020 records model and the model including both pre-2020 records and the 2020 collections as

19

28,000 km2 of habitat was identified ranging from southwestern Virginia to northeastern

Pennsylvania, following the eastern flank of the Appalachian Mountains (Figure 9).

Sizeable portions of the Susquehanna River Basin in central Pennsylvania were identified in both models (Figure 9). The majority of identified habitat was located in the Ridge and

Valley Level III Ecoregion in western Maryland, eastern West Virginia, and south-central

Pennsylvania. Elevation, annual precipitation, and land cover were the three most influential variables used in model creation for both pre-2020 and updated datasets (Table

4; Table 6). Response curves between models show much similarity with little influence of annual mean temperature, minimum temperature of the coldest month and maximum temperature of the warmest month. Suitable habitat was consistently identified in areas of low elevation and with low annual precipitations. The response of the logistic prediction to slope was less pronounced in the model using both pre-2020 and 2020 collection data than in the model using only pre-2020 data (Figure 8; Figure 10). Habitat suitability was found along medium intensity developed land, likely caused by sampling bias near populated areas and the high human population density in these regions.

Chloroperlidae

Modeling of A. aracoma using documented records provided the second-worst performing model with regard to mean AUC (Table 3). Nine out of 16 pre-2020 collections of A. aracoma were not located within the 24,000 km2 of predicted suitable habitat as instead MaxEnt selected regions along the western flank of the Central

Appalachians (69) Level III Ecoregion as potentially suitable habitat (Figure 23). The remaining ecoregion used to delimit the study area for this species, the Ridge and Valley

20

(67), did not have any area identified as suitable habitat (Figure 23). Total annual precipitation and elevation had the highest contributions to model creation. The four other variables used (mean annual temperature, minimum temperature of the coldest month, maximum temperature of the warmest month, and slope) had little influence on the model creation with variable importance ranked < 1%. New collections in western

Maryland resulted in nine new records for A. aracoma from eight new locations. All 2020 collections of A. aracoma in Maryland were in regions ranked below the MaxTSS threshold used to identify habitat as potentially suitable from the pre-2020 records

MaxEnt model. These new collection data were used to generate a refined MaxEnt model

(Figure 25). The updated model was less heavily influenced by annual precipitation

(21.0% contribution) and elevation (35.0%). The minimum temperature of the coldest month became the most influential variable, contributing the most to habitat selection

(38.8%; Table 4). The logistic output for the updated model increased with total annual precipitation and slope and decreased with elevation and minimum temperature of the coldest month (Figure 26). Identified suitable habitat increased to 43,000 km2 when using both pre-2020 and 2020 collection data with the suitable habitat including the records in western Maryland, Pennsylvania, and one of the records from eastern West Virginia

(Figure 25). A single record from eastern West Virginia was not included in regions identified as suitable habitat. This model updated with new collection data performed better than the model relying solely on pre-2020 records (Table 3). Both models predict suitable habitat to be on the western flank of the Appalachian Mountains along the

Appalachian Plateau. Expanding the study area for A. aracoma may provide a more

21 detailed range as current models were limited to the extent of HUC 12 catchments in the

Ridge and Valley (67) and Central Appalachians (69) Level III Ecoregions.

The model generated from valid pre-2020 records of A. biserrata performed well with a mean AUC of 0.877 (Table 3). A total of 18,500 km2 of suitable habitat was identified from the model using the pre-2020 model data (Figure 11). Identified habitat was distributed throughout the northerly and southerly extent of this species range, with suitable habitat particularly found in low elevation valleys, similar to the elevation range identified by Grubbs and Bauman (2019; their figure 41). The response curves for this model indicate that habitat suitability is at its highest as annual mean temperature approaches 13 ⁰C, decreasing as maximum temperature of the warmest month, annual precipitation, and as slope increases (Figure 12). Low intensity development and deciduous forests were the two land cover characters with the highest predicted habitat suitability. Reponses to the minimum temperature of the coldest month and elevation were minimal. The low influence elevation had on habitat selection is surprising as this species is known from low elevation streams compared to congeners (Grubbs and

Baumann, 2019). Annual precipitation, slope, and land cover were the most influential variables used in this model (Table 4). Sizeable portions of several of the counties listed by Surdick (2004) as having positive collections were selected as being potentially suitable habitat for A. biserrata.

Collections made during the 2020 field season resulted in three new collections of

A. biserrata from western Maryland. The MaxEnt model incorporating 2020 collection locations, in addition to pre-2020 collection data, responded similarly to environmental parameters as the model using solely pre-2020 collections records. Model performance

22 increased as a result of increased sample size between the pre-2020 records model and the model using both pre-2020 records and 2020 collections (Table 3; Table 5). This model continued to identify regions of West Virginia and Virginia where positive records are reported from, but high levels of geographic uncertainty exist in the referenced locations (Surdick, 2004) (Figure 13). The model using updated collection data and pre-

2020 collection data reduced the region identified as suitable habitat using the MaxTSS threshold by only 141 km2 throughout the entire study area but increased mean AUC by

0.021 (Table 3).

Initial modeling of S. palearata identified habitat spanning from southwestern

Pennsylvania to eastern West Virginia (Figure 15). Collections in western Maryland during 2020 resulted in five new localities of S. palearata in localities exceeding the

MaxTSS threshold, indicating that successful sampling attempts occurred in predicted potentially suitable habitat. Models using both pre-2020 and 2020 collection data identified similar habitat without noticeable differences (Figure 15; Figure 17). Both the model using pre-2020 collection data and the model using pre-2020 and 2020 collection data performed similarly with a mean AUC of 0.861. Only regularized training gain and testing gain decreased between the pre-2020 collections model and the model using both pre-2020 and 2020 collection data (Table 3). The response curves showed a decreased logistic prediction in elevation, mean annual temperature, maximum temperature of the coldest month, annual precipitation, and slope increase (Figure 16; Figure 18). The three land cover characters with the highest logistic predictions were open water, developed, open spaces, and deciduous forests. Annual precipitation, land cover, and elevation were the three most influential variables in both models (Table 4; Table 6).

23

Modeling using pre-2020 data identified nearly 40,500 km2 of potentially suitable habitat for S. pocahontas, ranging across much of the higher elevation peaks in eastern

West Virginia and western Maryland and extending northwards into central Pennsylvania with cooler temperatures associated with northern latitudes (Figure 19). Mean annual temperature was the most influential parameter for this model, contributing 77.0% to model generation. The other two variables influencing habitat selection were land cover

(16.3%) and elevation (5.4%) (Table 4). The land cover character that was predicted to have the highest suitable habitat was grassland/herbaceous with deciduous forests having the second highest logistic output. This model did not perform well (mean AUC = 0.672).

Two additional collections were made during the spring of 2020 from two previously unrecorded locations in western Maryland. These two collections were recorded from regions ranked greater than the MaxTSS threshold of the MaxEnt model using pre-2020 collection records only. The SDM for S. pocahontas generated from the 2020 collection data performed better (mean AUC of 0.712) than the pre-2020 collection data model

(Table 3). Response curves between both models showed little difference with the added

2020 collection data (Figure 20; Figure 22). The highest logistic output was found in regions with the lowest mean annual temperatures (ca. 6 ⁰C) and potentially suitable habitat decreasing with increasing temperature. Minimal increased logistic outputs were noted as a response to maximum temperature of the warmest month, minimum temperature of the coolest month, and slope. The logistic output also increased in response to both increased elevation and slope (Figure 21). A total of 33,500 km2 of suitable habitat was identified using this model, a decrease of nearly 7,000km2 over the entire delimited extent between both models.

24

The MaxEnt prediction for R. terna identified northern and southern extremes of the range and underpredicted more central portions of the species range. A total of

258,000 km2 of potentially suitable habitat was identified from Georgia north to Maine

(Figure 27). Low elevation, low slope, and low maximum temperature of the warmest month were all selected to have the highest influence on habitat suitability. Additionally, regions with elevated amounts of annual precipitation were also selected (Figure 28). The coldest temperature of the warmest month had a considerably high contribution to the creation of the model (34.0%) but low variable importance (Table 4). This model did not identify regions of the central Appalachians where pre-2020 records are known.

Modeling did identify habitat throughout northwestern Maine, near the border with

Quebec where pre-2020 records have been reported (Grubbs and Singai 2018). This is a positive indication that habitat in Quebec may have been selected if the model was not limited to the conterminous U.S. borders.

MaxEnt modeling identified a large swath of suitable habitat ranging from Maine and Vermont south to Pennsylvania for U. gaspesiana (Figure 29). Regions south of

Pennsylvania were not selected with only isolated patches of habitat identified, suggesting that U. gaspesiana is at the southern terminus of its range in West Virginia and Maryland. Response curves for the pre-2020 model of U. gaspesiana showed a decreased habitat suitability response to elevation and to annual temperature, with a preference for mixed hardwood forests and developed open land (Figure 30). The other variables used in modeling had little to no contribution to the model (Table 4). This model using pre-2020 collection data performed well in terms of AUC (AUC = 0.704) given the large geographic range (386,000 km2) of the model and few presence locations.

25

The identification of suitable habitat within northern US states can be evaluated with the pre-2020 records that lacked the geographic specificity necessary to be included in model generation. Kondratieff and Baumann (1994) include a record of U. gaspesiana from western Maine, a region identified by this MaxEnt model as being potentially suitable habitat for this species.

Leuctridae

The distribution model of M. flinti using pre-2020 records performed well (mean

AUC = 0.829). Approximately 24,500 km2 of potentially suitable habitat was predicted from the central Appalachians, particularly from southeastern West Virginia and southwestern Virginia north to south central Pennsylvania and eastwards to the Blue

Ridge Mountains (Figure 31). One record of M. flinti from southwestern Pennsylvania was not found within suitable habitat in the binary map of habitat suitability (Figure 31).

The most influential variables in identifying suitable habitat in this model were maximum temperature of the warmest month, mean annual temperature, and total annual precipitation (Table 4). The logistic output response to mean annual temperature decreased as temperature increased past 6 ⁰C, decreased as the maximum temperature of the warmest month increased, and increased as total annual precipitation increased

(Figure 32). Elevation, slope, and minimum temperature of the coldest month had little to no influence on model creation. Following modeling, four new collections of M. flinti were recorded during the spring 2020, of which only two were recorded in regions ranked greater than the MaxTSS threshold. All of the 2020 collections were subsequently added to a model which further exaggerated several of the response curves and increased the 26 influence of the maximum temperature of the warmest month had on model creation

(Table 4; Figure 34). Suitable habitat (27,500 km2) identified using both the pre-2020 and

2020 collection does not noticeably differ from the model using solely pre-2020 collection data (Figure 30; Figure 33).

Nemouridae

Suitable habitat for O. complexa was predicted to be found from the high elevations along the Tennessee and North Carolina border plus portions of the

Cumberland Mountains in southeastern Kentucky and Tennessee northward along the

Appalachian Mountains into Vermont, New Hampshire, and Maine (Figure 35). Three new records for O. complexa were recorded during the 2020 field season with two collections in new stream reaches consisting of one collection from two 1st-order tributaries in western Maryland (Figure 37). These two collections were both from habitat identified as potentially suitable, meeting and exceeding the MaxTSS threshold produced by the first iterative MaxEnt model. The mean AUC increased from 0.788 to 0.805 between the two models with the addition of 2020 collection data. Regularized training gain decreased from 0.379 to 0.415 (Table 3). The response curves generated by each model appear to be in consensus, both supporting the decreased logistic prediction in reaction to increasing annual temperature and increased predictions at increased minimum temperature of the coldest month and annual precipitation. Both annual temperature and the minimum temperature of the coldest month had the highest influence on model creation (Table 4). Elevation showed that the peak habitat suitability was near

600 m, with habitat suitability decreasing with increasing distance from this peak (Figure

27

38). Slope was the least influential variable used in modeling (Table 4). The response curves for O. complexa show a much sharper negative response to increasing slope when using the dataset with 2020 collection records compared to using pre-2020 records only

(Figure 36; Figure 38). The amount of suitable habitat meeting the MaxTSS threshold increased from 107,000 km2 in the model using only pre-2020 collection data to 115,250 km2 in the model using both pre-2020 and 2020 collection data.

No new collections of O. prolongata were obtained during 2020. Using pre-2020 collection data, the distribution maps identify 348,000 km2 of potentially suitable habitat from eastern West Virginia and the Blue Ridge Mountains in Virginia northwards to New

York and Maine (Figure 39). Western Delaware was not identified as suitable habitat by

MaxEnt in the binary habitat suitability maps although a single record occurs in the northern portion of the state (Lake 1980). Modeling occurred with both the Delaware record included and removed, with negligible influence on the regions identified or model performance. Although this model including Delaware had a very low sample size

(n = 7) spread over a large geographic range, this model performed well, with a mean

AUC of 0.759. All documented locations excluding the Delaware record were found within potentially suitable habitat meeting the MaxTSS threshold. Of the environmental parameters used to create this model, mean annual temperature had the highest contribution (91.3%) with the highest temperature of the warmest month contributing the second highest percentage (8.2%). All other variables had little influence on habitat selection (> 0.3% contribution) (Table 4). The response curves also indicated the logistic output reacted negatively to increasing annual temperature and maximum temperature of the warmest month, rather, favoring cooler regions for suitable habitat (Figure 40). This

28 model suggests that the records of O. prolongata in Virginia and West Virginia are near the southern terminus of this species’ range.

The SDM for S. kondratieffi using only pre-2020 collection records predicted

71,000 km2 of potentially suitable habitat from north Georgia extending northwards though the Allegheny Mountains, Blue Ridge Mountains, and Cumberland Mountains to southeastern New York (Figure 41). This model predicts suitable habitat in regions of

West Virginia, Tennessee, Pennsylvania, Kentucky, and Georgia where this species has not been collected. Annual precipitation, slope, and the maximum temperature of the warmest month had the highest contributions to model creation (Table 4). Suitable habitat increased with regions of increased annual precipitation. Similar responses are noted with slope and with the minimum temperature of the coldest month (Figure 42). Regions with higher predicted suitable habitat had lower predicted maximum temperatures of the warmest month. The addition of two new collection records from Maryland resulted in similar response curves with the three most influential variables remaining annual precipitation, slope, and the maximum temperature of the warmest month (Table 4). The updated model using pre-2020 and 2020 collection data from Maryland identified broader regions in West Virginia, Kentucky, and Pennsylvania as potentially suitable habitat totaling 97,500 km2 (Figure 43). Only one of the collections from 2020 was recorded from regions ranked greater than the MaxTSS threshold.

Perlidae

29

The model for H. appalachia using pre-2020 collection data performed marginally. Potentially suitable habitat was predicted from western North Carolina northwards through the Appalachians into New England with a band of predicted suitable habitat stretching into eastern Ohio (Figure 45). Predicted suitable habitat meeting the

MaxTSS threshold included all presence locations in addition to regions in New

Hampshire where larvae have been reported but were not used in model generation as the records provided by Kondratieff & Kirchner (1988) did not include enough detail to identify a specific location. This visual assessment may be indicative of over prediction of habitat in the northerly latitudes of this species range as documented collections in

North Carolina and Tennessee were not as strongly supported by suitable habitat in the continuous prediction provided by the MaxEnt algorithm (Figure 45). This model of H. appalachia identified northerly latitudes of this species range as being higher in predicted habitat suitability than the southerly latitudes. A total of 359,000 km2 of potentially suitable habitat met or exceeded the MaxTSS threshold. Habitat that was selected was influenced strongly by land cover and the maximum temperature of the warmest month

(Table 4). The highest habitat suitability was found at the lowest maximum temperature

(19 ⁰C) and in open developed areas, deciduous forests, and mixed forests (Figure 46).

Taeniopterygidae

The SDM for B. rossi performed poorly with a low mean AUC (= 0.671).

Predicted potentially suitable habitat identified by the pre-2020 collection data for B. rossi model was clustered around documented collection records within the northerly and southerly longitudes of the known range (Figure 47). Within the central portion (i.e., 30

West Virginia, Maryland, and Pennsylvania), suitable habitat was identified outside of areas with documented collections. Although visually, the model did not identify several collection locations as suitable habitat, according to mean AUC, the model performed better than random (mean AUC > 0.5) at separating both random presence points and random background points (Table 3). The response curves indicate that habitat suitability for B. rossi is higher in regions with higher annual precipitation and minimum temperature of the coldest month with the highest habitat suitability found at 2200 mm and 2 ⁰C respectively (Figure 48). Habitat suitability at its highest at the lower end of the maximum temperature of the warmest month, elevation, and slope. Annual precipitation was the highest contributing variable to this model (57.5%) and the maximum temperature of the warmest month had the second highest contribution (25.3%). Annual mean temperature had no influence on model creation, which is reflected in the response curve (Figure 48). The other three variables; elevation, slope, and the minimum temperature of the coldest month had a low contribution to the model (< 10%) (Table 4).

This model identified 242,500 km2 of suitable habitat meeting or exceeding the MaxTSS threshold.

DISCUSSION

The use of MaxEnt in modeling the distribution of aquatic insects with low sample sizes in combination with field validation of models is shown here with varied success. For models with contemporaneous (i.e., 2020) collections from field validation,

MaxEnt modeling using pre-2020 collection records accurately predicted regions with

31 newly confirmed presences as having potentially suitable habitat for all but one of the target species. The use of these distribution models can be categorized into four categories: (1) pre-2020 records with no new collections, (2) pre-2020 records with new collections in habitat previously identified as unsuitable, (3) pre-2020 records with new collections in habitat previously identified as suitable, and (4) pre-2020 records with new collections from both habitat identified as suitable and unsuitable. Maps created from these models do not represent occurrences of target species but rather habitat where these species are likely to be found based on similarities of environmental and climatological characteristics selected a priori between the landscape of interest and known presences.

Model Performance

Seven of the 15 models were not updated with new collection data although habitat was identified in the western portion of the state and sampling occurred during the known adult presence of each species. The lack of new collections from suitable habitat elucidates how rare or uncommon these species are across their range. Without updated collection data, models using only pre-2020 records could not be further refined and are reliant upon future collections efforts. The remaining eight models showed varied success in identifying suitable habitat when field validated. Collections during 2020 of A. simmonsi, A. biserrata, O. complexa, S. palearata, and S. pocahontas were all from habitat exceeding the MaxTSS threshold applied to define potentially suitable habitat from models built from documented pre-2020 records.

32

For most of the species with new records, models incorporating 2020 collection data showed signs of increased performance. Models for S. kondratieffi and M. flinti had a decrease in mean AUC with 2020 collection data (Table 3). These two species also had a decrease in RTG, indicative of a narrowed environmental range provided with new collection data. All other species with refined models using 2020 collection data showed increases in both RTG and AUC (Table 3). Four of these eight models performed poorly.

MaxEnt modeling of A. aracoma did not identify any suitable habitat within Maryland although three documented records from prior to 2020 existed from within the state. An additional nine records of A. aracoma were collected from Maryland during the 2020 collection season from streams similar to the previously reported state records (Surdick

2004). Models of S. kondratieffi and M. flinti only identified a portion of the newly collected records as being from suitable habitat and both had a decrease in mean AUC incorporating the 2020 data. Models that were able to accurately predict suitable habitat in only a portion of the areas where 2020 collections occurred are indicative of those performing poorly, likely due to the lessened predictive power associated with small sample size. The difficulty in producing distribution models for M. flinti, S. kondratieffi, and to some degree, S. pocahontas, relative to other sample size limited models, likely originates from the stenophilic nature of each species. All three species have been collected only from small seeps, springheads, and small headwater tributaries (Baumann and Grubbs 1996; Baumann and Stark 2013; Verdone et al. 2017; Grubbs and Baumann

2019B). The use of SDM using climate variables in highly fragmented habitats and impacted areas, such as the central Appalachians, is still shown to be effective in identifying suitable habitat (McCune 2016). There is a general inability of the low-

33 resolution climatic and topographical variables to accurately predict groundwater characters such as temperature and flow (Becker 2006; Jha et al. 2006). High-resolution, environmental parameters specific to the hydrologic regime, flow dynamics, and water quality of each waterbody across the landscape delimited for modeling are crucial to providing accurate predictions of these species distribution, all of which are known affect larval stoneflies (Lenat 1993; Stewart and Stark 2002). Additionally, historical land use within each drainage basin may be an important factor influencing central Appalachian aquatic insects, as much of the region has historically been afflicted by extractive industries such as mountaintop removal mining and clear cutting for timber (Buckley

1998; Harding et al. 1998; Yarnell 1998; Webster and Jenkins 2010).

The addition of more variables may come with a drawback as overparameterization with small sample sizes will underpredict suitable habitat of a species (Warren and Seifert 2011; Cao et al. 2013). Underprediction of habitat by overparameterization and subsequent overfitting of the model in combination with the use of small sample sizes will result in maps that may be superficially appealing but will poorly describe species distributions (Warren and Seifert 2011). Overfitting of a model results in potential omission errors as true suitable habitat of a target species is overlooked (Rondinini et al. 2006).

The SDM for O. prolongata includes a single record from Delaware that is not within any regions of identified suitable habitat. This record stands out since it is geographically distant from other collections and does not match the montane habitat similar to the other known localities, presenting an issue on using literature sources of questionable quality. Unless the material in question has been examined, an assumption is

34 made about the accuracy of specimen identifications and location data for each literature record included in distribution modeling. While this presents a cautionary tale in the production of distribution models using literature sources, predicting the distribution of rare and uncommon species when excluding unreviewed records may be detrimental to the predictive power of models with already low sample sizes. In the case of O. prolongata, there are several possibilities for why this model did not predict habitat outside of the range of the other, montane presence points. One, this record is a misidentification, but this is unlikely as O. prolongata is unique among congeners in the adult stage (Young et al. 1989). Additionally, Lake (1980) acknowledges several eminent stonefly taxonomists that assisted in identifications of the adult stoneflies collected during that study. Two, this is a true record for O. prolongata, but without records in similar habitats, this record was treated herein as an outlier, had little influence over habitat selection, and as a result is an omission error of the model. The possibility of this record being treated as an outlier does suggest that either this species may be more widely distributed throughout the northeast and is found in a much broader range of environments that are not often sampled for stoneflies (e.g., streams of the Coastal Plain), or that there is a remnant of high-quality habitat similar to that found in more montane areas that at the time of reported collection was able to sustain this species.

Likewise, models generated from small sample sizes across a large landscape should be used with caution since they may extrapolate outside of the range of data used to train the model and then overpredict the distribution of the target species (van Proosdij et al. 2016). Even if models do not extrapolate, species often use a much smaller fraction of the habitat than predicted as a result of competitive interactions, differences in

35 microclimate, extirpation and subsequent failed colonization, and other factors that may restrict a species presence from an area. The factors that delimit a species’ realized niche have conservation implications for SDM as the realized niche is often more limited than the regions of identified potentially suitable habitat. The broad selection of habitat across the landscape of interest decreases the probability of suitable habitat being overlooked.

When a threshold is applied to define regions as being suitable versus unsuitable, there is a probability that some suitable habitat may be missed. However, regions identified by meeting the threshold are more likely to be suitable. Overprediction of potentially suitable habitat is acceptable when the objective is to drive targeted sampling efforts rather than to delineate the true distribution of a species (Tronstad et al., 2018). Models presented are likely to have overestimated habitat of each target species; however, this is acceptable as the purpose of these models was not describe the true distribution of these species but rather to act as a guide for future sampling attempts. SDM, in conjunction with expert opinion of sampling locations, has resulted in identification of new locations of rare and uncommon Appalachian stoneflies in western Maryland. Implementing SDM with a target species with small sample sizes produces models that should be used conservatively when attempting to direct sampling efforts.

Model performance measured by mean AUC has had several criticisms, namely the equal weighting of commission and omission errors that may be untrue in a natural scenario and are influenced by the area of the landscape of interest selected for modeling

(Lobo et al. 2008). Proponents of the use of mean AUC argue of its independence from species prevalence in a region (i.e., proportion of sampling sites where the species was found to be present; McPherson et al. 2004). There is a slight negative correlation

36 between increasing potential suitable habitat and mean AUC for both models using solely pre-2020 data (Figure 49) and models using pre-2020 and 2020 collection data (Figure

50). The inverse relationship between the landscape extent and AUC occurs since AUC measures a model’s ability to differentiate between presences and pseudoabsences/absences within the provided landscape extent (Smith and Santos 2020).

If the extent broadens as sample size remains constant, the MaxEnt algorithm has more pseudoabsences/absences to distinguish from presences as well as additional background points covering a wider range of environmental variation (VanDerWal et al. 2009;

Anderson and Raza 2010). Poor performance of distribution models may be an ecological artifact where species with narrow distributions exhibit tolerances to a range of environmental variables not expressed at a broader geographic scale used for modeling, while species with a broader distribution may have a diversity of tolerances with local variations not expressed within models (Fielding and Haworth 1995; McPherson et al.

2004; Lobo et al. 2008).

Geographic Resolution

Checklists of aquatic insects within geopolitical units (e.g., states, counties) have been and remain a common method of presenting species assemblages for regions, yet by simplifying presence data to broader geographic resolutions, these publications lack the ability to provide definitive locations necessary for directing conservation efforts

(Robinson et al. 2016). The use of checklists and documented records from literature sources presents difficulty in generating and testing models by obscuring fine-scale environmental variations that may be important in describing a species distribution. For

37 example, several records of A. biserrata were withheld from model creation since location descriptions lacked the geographic resolution necessary to be accurately placed without influencing the model and introducing bias (Surdick 2004). Similarly, several records for A. harperi are listed as occurring at the same location as A. frumi, but no exact reference to the locations is given beyond the county names (Kirchner 1982). Presence data at low geographic resolution can still be used to test model performance and determine whether or not the model has accurately predicted suitable habitat of the target species in an approach similar to the cross-validation technique used by MaxEnt in testing each iteration of a model (West et al. 2016). When geographic resolution is low, the ability to test high-resolution models using documented records is handicapped. One solution is to reduce the resolution of the model to coarser-scale raster files, but the loss of fine-scale data also comes at the loss of identifying specific habitat where a species can be found, further exacerbating issues that limit research on these taxa. Additionally, the lowest resolution available when using county-level geographic data would not present new information for identifying suitable habitat for narrowly endemic species, particularly within a topographically and climatologically heterogenous region such as the Appalachian Mountains (Connor et al. 2017). A second solution is to not use the low- geographic resolution records to reduce bias created through random placement of GPS data, as the coordinate data associated with the georeferenced location description may not correspond to the actual collection location. In the case of species with small sample sizes, the loss of one or more collection records can be detrimental to the predictive capabilities of SDM (Wisz et al. 2008; Bean et al. 2012).

38

Variable Importance

The inclusion of relevant environmental variables that influences the life history of each species allows for refined estimation of each species realized niche, which in the case of rare, uncommon, or otherwise reclusive species can be difficult to determine solely using in-situ observations. The amount that each variable influence model generation differed between all models in this study. In models using pre-2020 collection data, the most influential variable used was annual precipitation (Figure 51). Annual precipitation was followed by land cover and mean annual temperature (Figure 51). Land cover data was not used in every model as models with this variable were overfit to the land cover categories for developed and urban areas. Urban land cover variables were highly represented in models for some species (e.g., B. rossi and H. appalachia) with collections from urban areas (Table 4). Records of stoneflies near and the selection by

MaxEnt of these land cover categories is likely a relic of sampling bias with collections in close proximity of regions with high human population densities. In models using both pre-2020 and 2020 collection data, annual precipitation was again the most influential variable (Figure 52). Land cover was the second most influential factor for the four models that included this environmental variable. Mean annual temperature was not as influential as in models using solely pre-2020 collection records. The influence of annual precipitation was prominent (>20% permutation importance) for five of the eight species modeled with both pre-2020 and 2020 collection data. The three species in which annual precipitation was not influential in models using both pre-2020 and 2020 collections were

M. flinti, O. complexa, and S. pocahontas. Although these species have varied life histories, these species are dependent upon groundwater recharge in springheads and

39 spring seeps (Baumann 1973; Stewart and Stark 2002; Baumann and Stark 2013). Strong groundwater influences for these lotic habitats may be the underlying reason behind the low influence of annual precipitation of model generation.

Conclusions

Rare and uncommon stoneflies, plus other aquatic insects, are often targeted using a “shotgun” approach, in which a researcher selects sampling sites based on the proximity to access points (roads, trails, etc.) within a region of interest or near known locations of previous collections. Beyond these factors, however, site selection is largely uniformed.

Application of SDM to SGCN-listed taxa, particularly those with small sample sizes, is shown herein to be effective in the identification of suitable habitat and the identification of new localities. Providing distribution models for a target species generated using relevant environmental parameters that influence the natural history of said species and continuously refining these models by incorporating new collection data in an iterative process presents a cost beneficial strategy on which to direct future sampling attempts.

Concentrating sampling efforts in regions identified as being potentially suitable habitat, particularly along riparian areas, are likely to increase collections of rare and uncommon stoneflies.

40

TABLES

Table 1. Number of valid presence records prior to 2020 for Appalachian stonefly species targeted with distribution modeling with state Species of Greatest Conservation Need (SGCN) status listed. Delaware = DE, Maryland = MD, Pennsylvania = PA, Virginia = VA, West Virginia = WV.

States with SGCN Family Species Documented Records Status Capniidae Allocapnia frumi 7 WV Allocapnia harperi 26 PA Allocapnia simmonsi 10 PA, VA Chloroperlidae Alloperla aracoma 16 MD, PA, WV Alloperla biserrata 24 MD, PA, VA, WV Rasvena terna 34 Not an SGCN Sweltsa palearata 38 MD, PA Sweltsa pocahontas 10 MD, WV Utaperla gaspesiana 13 MD, PA, WV Leuctridae Megaleuctra flinti 27 MD, PA, VA, WV Nemouridae Ostrocerca complexa 23 VA, WV Ostrocerca prolongata 7 DE, PA, VA, WV Soyedina kondratieffi 21 Not an SGCN Perlidae Hansonoperla appalachia 15 PA, VA, WV Taeniopterygidae Bolotoperla rossi 58 VA

41

Table 2. Description of variables prepared for use in MaxEnt for each target species. Bioclimatic variables from WorldClim are averaged values from 1970 ̶ 2000. An asterisk (*) represents final variable selection

Variable Explanation Source Bio1* Annual mean temperature (⁰C) WorldClim Mean diurnal range mean of monthly (max temperature - min Bio2 temperature) (⁰C) WorldClim Bio3 Isothermality ((Bio2/Bio7)x100) (%) WorldClim Bio4 Temperature seasonality (standard deviation x 100) (⁰C x 100) WorldClim Bio5* Maximum temperature of the warmest month (⁰C) WorldClim Bio6* Minimum temperature of the coldest month (⁰C) WorldClim Bio7 Temperature annual range (Bio5 - Bio6) (⁰C) WorldClim Bio8 Mean temperature of wettest quarter (⁰C) WorldClim Bio9 Mean temperature of the driest quarter (⁰C) WorldClim Bio10 Mean temperature of the warmest quarter (⁰C) WorldClim Bio11 Mean temperature of the coldest temperature (⁰C) WorldClim Bio12* Annual precipitation (mm) WorldClim Bio13 Precipitation of the wettest month (mm) WorldClim Bio14 Precipitation of the driest month (mm) WorldClim Bio15 Precipitation seasonality (Coefficient of Variation) (%) WorldClim Bio16 Precipitation of the wettest quarter (mm) WorldClim Bio17 Precipitation of the driest quarter (mm) WorldClim Bio18 Precipitation of the warmest quarter (mm) WorldClim Bio19 Precipitation of the coldest quarter (mm) WorldClim Elevation* Average altitude of surface topology USGS NED Percent change between neighboring pixels derived from Slope* elevation raster USGS NED Land Cover* Classification of the land use characteristics USGS NLCD

42

Table 3. Evaluation metrics for each the MaxEnt model generated for each target species using only pre- 2020 collection data. Mean AUC is averaged over 15 repetitions when creating the model and the standard deviation is shown. Regularized Training Gain (RTG) and Test Gain (TG) are presented for each model along with the Maximum Training Specificity-Sensitivity (MaxTSS) used as a threshold for producing binary maps of suitable habitat. Sample AUC (mean ± Family Species RTG TG MaxTSS Size St.Dev) Capniidae Allocapnia frumi 7 0.941 ± 0.122 2.318 2.399 0.125 Allocapnia harperi 26 0.832 ± 0.131 0.987 0.675 0.253 Allocapnia simmonsi 10 0.606 ± 0.266 0.269 -0.162 0.509 Chloroperlidae Alloperla aracoma 16 0.643 ± 0.327 0.258 0.150 0.651 Alloperla biserrata 24 0.877 ± 0.087 1.186 0.959 0.237 Rasvena terna 34 0.698 ± 0.199 0.293 0.120 0.489 Sweltsa palearata 38 0.861 ± 0.066 1.103 0.991 0.373 Sweltsa pocahontas 10 0.672 ± 0.226 0.283 -0.029 0.478 Utaperla gaspesiana 13 0.704 ± 0.249 0.464 0.089 0.318 Leuctridae Megaleuctra flinti 27 0.829 ± 0.104 0.751 0.654 0.366 Nemouridae Ostrocerca complexa 23 0.788 ± 0.146 0.379 0.478 0.448 Ostrocerca prolongata 7 0.759 ± 0.188 0.311 0.338 0.391 Soyedina kondratieffi 21 0.700 ± 0.217 0.294 0.215 0.394 Perlidae Hansonoperla appalachia 15 0.674 ± 0.208 0.244 0.110 0.418 Taeniopterygidae Bolotoperla rossi 58 0.671 ± 0.104 0.181 0.157 0.473

43

Table 4. Percent contribution and permutation importance of selected environmental variables used to implement MaxEnt modeling with pre-2020 collection data. See Table 2 for full variable descriptions. Variable Permutation Family Species Variable Contribution (%) Importance Capniidae Allocapnia frumi Bio1 0.5 0 Bio5 0 0 Bio6 0 0 Bio12 60.9 45.3 Elevation 0 0 Slope 30.4 53.6 NLCD 8.1 1 Allocapnia harperi Bio1 61.7 48.8 Bio5 1.7 0.2 Bio6 0 0.1 Bio12 2 2.9 Elevation 12.2 24.8 Slope 11 19.8 NLCD 11.4 3.4 Allocapnia simmonsi Bio1 0.1 0 Bio5 0 0 Bio6 0 0 Bio12 43.4 18 Elevation 13.3 2.5 Slope 3.3 18.5 NLCD 39.8 61 Chloroperlidae Alloperla aracoma Bio1 0 0.3 Bio5 0 0 Bio6 1.1 6.9 Bio12 58.4 42.2 Elevation 40.3 50 Slope 0.2 0.7 Alloperla biserrata Bio1 1.9 5.3 Bio5 1.2 3.9 Bio6 0 0.1 Bio12 56.1 41.4 Elevation 0 0 Slope 20.4 32.6 NLCD 20.4 16.7 Rasvena terna Bio1 0 1.1

44

Table 4. cont. Variable Permutation Family Species Variable Contribution (%) Importance Bio5 5 25 Bio6 34 0.1 Bio12 32 41.2 Elevation 14 23 Slope 15 9.6 Sweltsa palearata Bio1 0.4 0.7 Bio5 1.6 12 Bio6 4.3 0.2 Bio12 73.9 62.5 Elevation 4 17.7 Slope 1 2.2 NLCD 14.6 4.7 Sweltsa pocahontas Bio1 77 40 Bio5 0 4.1 Bio6 0 0 Bio12 0.6 16.1 Elevation 5.4 7.2 Slope 0.6 10.8 NLCD 16.3 21.8 Utaperla gaspesiana Bio1 14.9 44.9 Bio5 0.7 11.3 Bio6 40.8 0 Bio12 0 0.4 Elevation 25.6 25.6 Slope 0.4 0 NLCD 17.5 17.8 Leuctridae Megaleuctra flinti Bio1 25.2 36.2 Bio5 55 62.1 Bio6 3 0.4 Bio12 8.6 1.1 Elevation 8.1 0.1 Slope 0.1 0.1 Nemouridae Ostrocerca complexa Bio1 21 60.6 Bio5 19.9 1 Bio6 17.1 31.1 Bio12 1.1 2 Elevation 39.9 5.1 Slope 1 0.1

45

Table 4. cont. Variable Permutation Family Species Variable Contribution (%) Importance Ostrocerca prolongata Bio1 91.3 26.8 Bio5 8.2 0 Bio6 0 0 Bio12 0 1.9 Elevation 0.3 14.3 Slope 0.2 14.2 Soyedina kondratieffi Bio1 0 0 Bio5 13.3 58.7 Bio6 2.2 16.8 Bio12 66.4 5.8 Elevation 0.6 2.4 Slope 17.5 16.4 Hansonoperla Perlidae appalachia Bio1 0 0 Bio5 23.4 26.2 Bio6 0 0 Bio12 0.6 0 Elevation 0.9 5.2 Slope 1.7 1.9 NLCD 73.4 66.7 Taeniopterygidae Bolotoperla rossi Bio1 0 0 Bio5 25.3 34.5 Bio6 4.7 32.4 Bio12 57.5 19.1 Elevation 6.8 7.1 Slope 5.7 6.7

46

Table 5. Evaluation metrics for each the MaxEnt model generated for each target species using both pre-2020 and 2020 collection data. Mean AUC is averaged over 15 repetitions when creating the model and the standard deviation is shown. Regularized Training Gain (RTG) and Test Gain (TG) are presented for each model along with the Maximum Training Specificity-Sensitivity (MaxTSS) used as a threshold for producing binary maps of suitable habitat. Sample AUC (mean ± Family Species RTG TG MaxTSS Size St.Dev) Capniidae Allocapnia simmonsi 12 0.705 ± 0.251 0.450 0.144 0.460 Chloroperlidae Alloperla aracoma 25 0.712 ± 0.115 0.263 0.207 0.473 Alloperla biserrata 27 0.898 ± 0.075 1.257 1.187 0.210 Sweltsa palearata 43 0.861 ± 0.062 1.115 0.949 0.415 Sweltsa pocahontas 12 0.712 ± 0.208 0.348 0.087 0.497 Leuctridae Megaleuctra flinti 31 0.800 ± 0.088 0.667 0.556 0.361 Nemouridae Ostrocerca complexa 25 0.805 ± 0.137 0.415 0.546 0.425 Soyedina kondratieffi 23 0.657 ± 0.156 0.280 0.075 0.361

47

Table 6. Percent contribution and permutation importance of selected environmental variables used to implement MaxEnt modeling with pre-2020 and 2020 collection data. See Table 2 for full variable descriptions. Variable Permutation Family Species Variable Contribution Importance (%) Allocapnia Capniidae simmonsi Bio1 0.0 0.0 Bio5 0.1 0.6 Bio6 0.0 0.0 Bio12 49.3 49.8 Elevation 13.3 13.9 Slope 0.9 3.8 NLCD 36.5 31.9 Chloroperlidae Alloperla aracoma Bio1 0.0 0.1 Bio5 0.0 0.0 Bio6 38.8 37.2 Bio12 21.0 26.9 Elevation 35.0 34.3 Slope 5.2 1.6 Alloperla biserrata Bio1 1.1 9.7 Bio5 0.5 5.2 Bio6 0.0 0.3 Bio12 56.8 39.1 Elevation 0.0 0.0 Slope 21.6 16.5 NLCD 19.9 29.3 Sweltsa palearata Bio1 0.5 1.1 Bio5 1.8 10.4 Bio6 3.6 0.1 Bio12 75.0 54.3 Elevation 5.4 26.5 Slope 1.1 0.3 NLCD 12.5 7.2 Sweltsa pocahontas Bio1 74.6 66.8 Bio5 0.1 0.0 Bio6 1.3 0.6 Bio12 0.2 1.4 Elevation 6.7 12.1 Slope 0.1 1.1 NLCD 17.0 18.0 Leuctridae Megaleuctra flinti Bio1 20.3 18.7 Bio5 68.4 79.0 48

Table 6 cont. Variable Permutation Family Species Variable Contribution Importance (%) Bio6 6.9 0.3 Bio12 1.0 0.2 Elevation 2.3 0.0 Slope 1.1 1.9 Ostrocerca Nemouridae complexa Bio1 19.8 56.6 Bio5 15.9 1.1 Bio6 18.4 35.7 Bio12 1.2 1.0 Elevation 42.7 5.1 Slope 2.0 0.4 Soyedina kondratieffi Bio1 0.1 0.0 Bio5 20.2 44.6 Bio6 0.9 8.4 Bio12 51.9 13.2 Elevation 1.1 31.5 Slope 25.8 2.4

49

FIGURES

Figure 1. Heatmap of Pearson’s Correlation Coefficient between variables considered for MaxEnt modeling. Warmer colors (e.g., yellow) represent higher correlation and cooler colors (e.g., blue) represent lower correlation.

50

distributionalmodels. Figure 2. Location of 2020 collection sites in Maryland used to field test 2020 Location 2. test sites of field to Maryland in used collection Figure

51

Figure 3. Binary probability of habitat suitability for Allocapnia frumi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

52

Figure 4. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia frumi using pre-2020 records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

53

Figure 5. Binary probability of habitat suitability for Allocapnia harperi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

54

Figure 6. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia harperi using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

55

Figure 7. Binary probability of habitat suitability for Allocapnia simmonsi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

56

Figure 8. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia simmonsi using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

57

Figure 9. Binary probability of habitat suitability for Allocapnia simmonsi generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

58

Figure 10. MaxEnt response curves for each environmental variable used to identify suitable habitat for Allocapnia simmonsi using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

59

Figure 11. Binary probability of habitat suitability for Alloperla biserrata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

60

Figure 12. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla biserrata using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

61

Figure 13. Binary probability of habitat suitability for Alloperla biserrata generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

62

Figure 14. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla biserrata using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

63

Figure 15. Binary probability of habitat suitability for Sweltsa palearata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

64

Figure 16. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa palearata using pre-2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

65

Figure 17. Binary probability of habitat suitability for Sweltsa palearata generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

66

Figure 18. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa palearata using pre-2020 and 2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

67

Figure 19. Binary probability of habitat suitability for Sweltsa pocahontas generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland.

68

Figure 20. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa pocahontas using pre-2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

69

Figure 21. Binary probability of habitat suitability for Sweltsa pocahontas generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland.

70

Figure 22. MaxEnt response curves for each environmental variable used to identify suitable habitat for Sweltsa pocahontas using pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

71

Figure 23. Binary probability of habitat suitability for Alloperla aracoma generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland.

72

Figure 24. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla aracoma using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

73

Figure 25. Binary probability of habitat suitability for Alloperla aracoma generated from the logistic output from MaxEnt using both pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies new collections from within Garrett County, Maryland.

74

Figure 26. MaxEnt response curves for each environmental variable used to identify suitable habitat for Alloperla aracoma using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

75

Figure 27. Binary probability of habitat suitability for Rasvena terna generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

76

Figure 28. MaxEnt response curves for each environmental variable used to identify suitable habitat for Rasvena terna using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

77

Figure 29. Binary probability of habitat suitability for Utaperla gaspesiana generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The record of U. gaspesiana from West Virginia was not included in analyses due to the location uncertainty.

78

Figure 30. MaxEnt response curves for each environmental variable used to identify suitable habitat for Utaperla gaspesiana using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

79

Figure 31. Binary probability of habitat suitability for Megaleuctra flinti generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

80

Figure 32. MaxEnt response curves for each environmental variable used to identify suitable habitat for Megaleuctra flinti using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

81

Figure 33. Binary probability of habitat suitability for Megaleuctra flinti generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

82

Figure 34. MaxEnt response curves for each environmental variable used to identify suitable habitat for Megaleuctra flinti using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

83

Figure 35. Binary probability of habitat suitability for Ostrocerca complexa generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland.

84

Figure 36. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca complexa using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

85

Figure 37. Binary probability of habitat suitability for Ostrocerca complexa generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland.

86

Figure 38. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca complexa using both pre-2020 and 2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

87

Figure 39. Binary probability of habitat suitability for Ostrocerca prolongata generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

88

Figure 40. MaxEnt response curves for each environmental variable used to identify suitable habitat for Ostrocerca prolongata using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

89

Figure 41. Binary probability of habitat suitability for Soyedina kondratieffi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented collections from within Garrett County, Maryland.

90

Figure 42. MaxEnt response curves for each environmental variable used to identify suitable habitat for Soyedina kondratieffi using pre-2020 records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

91

Figure 43. Binary probability of habitat suitability for Soyedina kondratieffi generated from the logistic output from MaxEnt using pre-2020 and 2020 collection data with a Maximum Training Sensitivity – Specificity threshold. The inset map location is highlighted in blue and identifies documented and contemporaneous collections from within Garrett County, Maryland.

92

Figure 44. MaxEnt response curves for each environmental variable used to identify suitable habitat for Soyedina kondratieffi using pre-2020 and 2020 collection records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

93

Figure 45. Binary probability of habitat suitability for Hansonoperla appalachia generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

94

Figure 46. MaxEnt response curves for each environmental variable used to identify suitable habitat for Hansonoperla appalachia using pre-2020 collection data. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

95

Figure 47. Binary probability of habitat suitability for Bolotoperla rossi generated from the logistic output from MaxEnt using pre-2020 collection data with a Maximum Training Sensitivity – Specificity threshold.

96

Figure 48. MaxEnt response curves for each environmental variable used to identify suitable habitat for Bolotoperla rossi using pre-2020 records. Each curve shows the mean response (in red) from the 15 repetitions. Blue coloration on each plot represents the mean ± one standard deviation averaged from 15 repetitions.

97

hm. The black line is a linear linear a is line black The hm.

Figure 49. Scatterplot of MaxEnt model performance for target to Scatterplot 49. collection using the of for prior of each performance data model species MaxEnt Figure identified algorit MaxEnt habitat the by suitable of amount the against 2020 equation the with right top the in corner. shown R2 value regression, and

98

amount of suitable habitat identified by the MaxEnt algorithm. The black line is a linear regression, with with linear a is regression, line black The algorithm. MaxEnt by the identified habitat suitable of amount

against the the against Figure 50. Scatterplot of MaxEnt model performance for each of the target species with collection data from 2020 2020 data target from Scatterplot 50. collection with the of for of each performance model species MaxEnt Figure R2 value right top and corner. the in shown equation the

99

Figure 51. Average percent permutation importance of seven environmental variables used to generate distributional models in MaxEnt for 15 stonefly species using pre-2020 collection data. Land cover (NLCD) was only used in eight of the 15 models.

100

Figure 52. Average percent permutation importance of seven environmental variables used to generate distributional models in MaxEnt for eight stonefly species using pre- 2020 and 2020 collection data. Land cover (NLCD) was only used in four of the eight models.

101

LITERATURE CITED

Alexander R. B. & R. A. Smith, 2006. Trends in the nutrient enrichment of U.S. rivers

during the late 20th century and their relation to changes in probable stream

trophic conditions. Limnology and Oceanography 51: 639 ̶ 654.

Anderson R. P. & A. Raza, 2010. The effect of the extent of the study region on GIS

models of species geographic distributions and estimates of niche evolution:

Preliminary tests with montane rodents (genus Nephelomys) in Venezuela. Journal

of Biogeography 37: 1378 ̶ 1393.

Barbour M. T., J. Gerritsen, B. D. Snyder, & J. B. Stribling, 1999. Rapid bioassessment

protocols for use in streams and wadeable rivers: periphyton, benthic

macroinvertebrates, and fish. U.S. Environmental Protection Agency. Washington

D.C.

Baumann R. W. 1973. New Megaleuctra from the Eastern United States (Plecoptera:

Leuctridae). Entomological News 84: 247 ̶ 250.

Baumann R. W., 1979. Rare aquatic insects, or how valuable are bugs? Great Basin

Naturalist Memoirs 3: 65 ̶ 67.

Baumann R. W. & S. A. Grubbs, 1996. Two new species of Soyedina (Plecoptera:

Nemouridae) from the Appalachian Mountains. Entomological News 107: 220 ̶

224.

Baumann R. W. & B. P. Stark, 2013. The genus Megaleuctra Neave (Plecoptera:

Leuctridae) in North America. Illiesia 9: 65 ̶ 93.

102

Bean W. T., R. Swafford & J. S. Brashares, 2012. The effects of small sample size and

sample bias on threshold selection and accuracy estimation of species distribution

models. Ecography 35: 250 ̶ 258.

Becker M. W., 2006. Potential for satellite remote sensing of ground water. Ground

Water 44: 306 ̶ 318.

Bojková J., K. Komprodová, T. Soldán, & S. Zahradkova, 2012. Species loss of

stoneflies (Plecoptera) in the Czech Republic during the 20th Century. Freshwater

Biology 57: 2550 ̶ 2567.

Briers R. A., H. M. Cariss, & J. H. Gee, 2002. Dispersal of adult stoneflies (Plecoptera)

from upland streams draining catchments with contrasting land-use. Archiv fur

Hydrobiologie 155: 627 ̶ 644.

Buckley, G. L., 1998. The environmental transformation of an Appalachian valley, 1850 ̶

1906. Geographical Review 88: 175 ̶ 198.

Cao Y., R. E. DeWalt, J. L. Robinson, T. Tweddale, L. Hinz, & M. Pessino, 2013. Using

Maxent to model the historic distributions of stonefly species in Illinois streams:

The effects of regularization and threshold selections. Ecological Modeling 259:

30 ̶ 39.

Cardoso P., T. L. Erwin, P. A. V. Borges, & T. R. New, 2011. The seven impediments in

invertebrate conservation and how to overcome them. Biological Conservation

144: 2647 ̶ 2655.

103

Carlisle D. M., J. Falcone, D. M. Wolock, M. R. Meador, R. H. Norris, 2010. Predicting

the natural flow regime; modeling for assessing hydrological alternation streams.

River Research and Applications 26: 118 ̶ 136.

Clark J. R., 1969. Thermal pollution and aquatic life. Scientific American 220: 18 ̶ 27.

Collier K. J., P. K. Probert, & M. Jefferies, 2016. Conservation of aquatic invertebrates:

concerns, challenges, and conundrums. Aquatic Conservation 26: 817 ̶ 837.

Connor T., V. Hull, A. Viña, A. Shortridge, Y. Tang, J. Zhang, F. Wang, J. Liu, 2017.

Effects of grain size and niche breadth on species distribution modeling.

Ecography 40: 1 ̶ 12.

DeWalt R. E., C. Favret & D. W. Webb, 2005. Just how imperiled are aquatic insects? A

case study of Stoneflies (Plecoptera) in Illinois. Annals of the Entomological

Society of America 98: 941 ̶ 950.

DeWalt R. E. & E. J. South, 2015. Ephemeroptera, Plecoptera, and Trichoptera on Isle

Royale National Park, USA compared to mainland species pool and size

distribution. ZooKeys 532: 137 ̶ 158.

DeWalt R. E., M. D. Maehr, H. Hopkins, U. Neu-Becker, & G. Stueber, 2020. Plecoptera

Species File Online. Version 5.0/5.0. Retrieved February 2020.

http://plecoptera.species.file.org.

Dudgeon D., A. H. Arthington, M. O. Gessner, Z. Kawabata, D. J. Knowler, C. Leveque,

C. A. Sullivan, 2006. Freshwater biodiversity: importance, threats, status, and

104

conservation challenges. Biological Reviews of the Cambridge Philosophical

Society 81: 163 ̶ 182.

Earle J. I., 2009. New state stonefly (Plecoptera) records for Pennsylvania, with

additional records and information on rare species. Illiesia 5: 169 ̶ 181.

Eaton L. E. & D. R. Lenat, 1991. Comparison of a rapid bioassessment method with

North Carolina’s qualitative macroinvertebrate collection method. Journal of the

North American Benthological Society 10: 335 ̶ 338.

Elith J., S. J. Phillips, T. Hastie, M. Dudík, Y. E. Chee, & C. J. Yates, 2010. A statistical

explanation of MaxEnt for ecologists. Diversity and Distributions. 17: 43 ̶ 57.

El-Gabbas A. & C. F. Dormann, 2018. Wrong, but useful: regional species distribution

models may not be improved by range wide data under biased sampling.

Evolution and Ecology 8: 2196 ̶ 2206.

ESRI, 2016. ArcGIS Release 10.4.1. Redlands, California.

Fick S. E., & R. J. Hijmans, 2017. Worldclim 2: New 1-km spatial resolution climate

surfaces for global land areas. International Journal of Climatology 37: 4302 ̶

4315.

Fielding A. H. & P. F. Haworth, 1995. Testing the generality of bird-habitat models.

Conservation Biology 9: 1466 ̶ 1481.

Gormley A. M., D. M. Forsyth, P. Griffioen, M. Lindeman, D. S. L. Ramsey, M. P.

Scroggie, L. Woodford, 2011. Using presence-only and presence-absence data to

105

estimate the current and potential distributions of established invasive species.

Journal of Applied Ecology 48: 25 ̶ 34.

Griffith M. B. & S. A. Perry, 1992. Plecoptera of headwater catchments in the Fernow

Experimental Forest, Monongahela National Forest, West Virginia. Proceedings

of the Entomological Society of Washington 94: 282 ̶ 287.

Griffith M. B., E. M. Barrows, S. A. Perry, 1998. Lateral dispersal of adult aquatic

insects (Plecoptera, Trichoptera) following emergence from headwater streams in

forested Appalachian catchments. Annals of the Entomological Society of

America 91: 195 ̶ 201.

Grubbs S. A., 1996. Stoneflies (Plecoptera) of the Powdermill Nature Reserve,

Southwestern Pennsylvania. Entomological News 107: 255 ̶ 260.

Grubbs S. A., 1997. New records, zoogeographic notes, and a revised checklist of

Stoneflies (Plecoptera) from Maryland. Transactions of the American

Entomological Society 123: 71 ̶ 84.

Grubbs S. A., 2021. Evidence of range contraction and increased metapopulation

patchiness of the rare eastern Nearctic Karst Snowfly Allocpania cunninghami

Ross & Ricker, 1971. Journal of Insect Conservation 1 ̶ 11.

Grubbs S. A. & R. W. Baumann, 2019A. Alloperla clarki sp. nov. (Plecoptera:

Chloroperldiae), a new species from the eastern Nearctic with discussion of a new

species group. Zootaxa 4624: 241 ̶ 255.

106

Grubbs S. A. & R. W. Baumann, 2019B. Soyedina Ricker, 1952 (Plecoptera:

Nemouridae) in the eastern Nearctic: review of species concepts, proposed

morphology-based species groups, and description of a new species from North

Carolina. Zootaxa 4658: 223 ̶ 250.

Grubbs S. A. & M. M. Singai, 2018. The eastern Nearctic species Rasvena terna (Frison,

1942) (Plecoptera, Chloroperlidae). Check List 14: 657 ̶ 663.

Hallema D. W., R. Moussa, G. Sun, & S. G. McNulty, 2016. Surface storm flow

prediction on hillslopes based on topography and hydrologic connectivity.

Ecological Processes 5: 1 ̶ 13.

Hallmann C. A., M. Sorg, E. Jongejans, H. Siepel, N. Hofland, H. Schwan, & H. de

Kroon, 2017. More than 75 percent decline over 27 years in total flying insect

biomass in protected areas. PLoS ONE, 12: e0185809.

Harding J. S., E. F. Benfield, P. V. Bolstad, G. S. Helfman, & E. B. Jones III, 1998.

Stream biodiversity: the ghost of land use past. Proceedings of the Natural

Academy of Science 95: 14843 ̶ 14847.

Harper P. P. & D. Roy, 1975. Utaperla gaspesiana sp. nov. le premier Plécoptére

Paraperlinaé de l’Est canadien. Canadian Journal of Zoology 53: 1185 ̶ 1187.

Helešic J., 2001. Nonparametric evaluation of environmental parameters determining the

occurrence of stonefly larvae (Plecoptera) in streams. Aquatic Sciences 63: 490 ̶

501.

107

Hilsenhoff W. L., 1982. Using a biotic index to evaluate water quality in streams.

Madison, WI: Wisconsin Department of Natural Resources.

Hood R. W., R. W. Kirchner, & J. I. Earle, 2008. A new state record of Allocapnia

simmonsi Kondratieff and Voshell (Plecoptera: Capniidae) from West Virginia,

U.S.A., with additional records from Pennsylvania and Virginia. Entomological

News 119: 303 ̶ 305.

Jha M. K., A. Chowdhury, V. M. Chowdary, & S. Peiffer, 2006. Groundwater

management and development by integrated remote sensing and geographic

information systems: prospects and constraints. Water Resources Management

21: 427 ̶ 467.

Kirchner R. F., 1980. A new Allocapnia from Virginia (Plecoptera: Capniidae).

Entomological News 91: 19 ̶ 21.

Kirchner, R. F., 1982. A new Allocapnia from West Virginia (Plecoptera: Capniidae).

Proceedings of the Entomological Society of Washington 84: 786 ̶ 790.

Kirchner R. F. & B. C. Kondratieff, 1988. A new species of Sweltsa from West Virginia

(Plecoptera: Chloroperlidae). Entomological News 99: 233 ̶ 236.

Kondratieff B. C. & R. W. Baumann, 1994. Assault on Atlantic Canada: A stonefly

collecting foray to the Canadian maritime provinces. Perla 12: 16 ̶ 18.

Kondratieff B. C. & R. F. Kirchner, 1988. A new species of Acroneuria from Kentucky

(Plecoptera: Perlidae) and new records of stoneflies form eastern North America.

Journal of the Kansas Entomological Society 61: 201 ̶ 207.

108

Kondratieff B. C. & R. F. Kirchner, 1996. Two new species of Hansonoperla

(Plecoptera: Perlidae) from Eastern North America. Annals of the Entomological

Society of America 89: 501 ̶ 509.

Kondratieff B. C. & J. R. Voshell Jr., 1981. Allocapnia simmonsi, a new species of

Winter Stonefly (Plecoptera: Capniidae). Annals of the Entomological Society of

America 74: 58 ̶ 59.

Konowalik K. & A. Nosol, 2021. Evaluation metrics and validation of presence-only

species distribution models based on distributional maps with varying coverage.

Scientific Reports 11: 1482.

Kramer-Schadt S, J. Niedballa, J. D. Pilgrim, B. Schröder, J. Lindenborn, V. Reinfelder,

M. Stillfried, I. Heckman…A. Wilting, 2013. The importance of correcting for

sampling bias in MaxEnt species distribution models. Diversity and Distributions

19: 1366 ̶ 1379.

Lake R. W., 1980. Distribution of the Stoneflies (Plecoptera) of Delaware. Entomological

News 91: 43 ̶ 48.

Lenat D. R., 1993. A biotic index for the southeastern United States: derivation and list of

tolerance values, with criteria for assigning water-quality ratings. Journal of the

North American Benthological Society 12: 279 ̶ 290.

Liu C., M. White, & G. Newell, 2013. Selecting thresholds for the prediction of species

occurrence with presence-only data. Journal of Biogeography 40: 778 ̶ 789.

109

Liu C., G. Newell, M. White, 2016. On the selection of thresholds for predicting species

occurrence with presence-only data. Ecology and Evolution 6: 337 ̶ 348.

Lobo J. M., A. Jiménez-Valverde, & R. Real, 2008. AUC: a misleading measure of the

performance of predictive distribution models. Global Ecology and Biogeography

17: 145 ̶ 151.

Lomolino M. & L. Heaney, 2004. Frontiers of Biogeography: new directions in the

geography of nature. Sunderland, Massachusetts.

Malmqvist B. & S. Rundle, 2002. Threats to the running water ecosystems of the world.

Environmental Conservation 29: 134 ̶ 153.

Master L. L., B. A. Stein, L. S. Kutner, G. A. Hammerson, 2000. Vanishing assets.

Precious heritage: The status of biodiversity in the United States. Oxford

University Press.

McCluney K. E., N. L. Poff, M. A. Palmer, J. H. Thorp, G. C. Poole, B. S. Williams, & J.

S. Baron, 2014. Riverine macrosystems ecology: sensitivity, resistance, and

resilience of whole river basins with human alterations. Frontiers in Ecology and

the Environment 12: 48 ̶ 58.

McCormick F. H., M. J. Smith, & S. L. Johnson, 2010. Effects of nonindigenous invasive

species on water quality and quantity. Washington D. C., U.S. Department of

Agriculture, Forest Service. Research and Development.

110

McCune J. L., 2016. Species distribution models predict rare species occurrences despite

significant effects of landscape context. Journal of Applied Ecology 53: 1871 ̶

1879.

McPherson J. M., W. Jetz, & D. J. Rodgers, 2004. The effects of species range sizes on

the accuracy of distribution models: ecological phenomenon or statistical artefact?

Journal of Applied Ecology 41: 811 ̶ 823.

Merow C., M. J. Smith, & J. A. Silander Jr., 2013. A practical guide to MaxEnt for

modeling species’ distributions: what it does, and why inputs and settings matter.

Ecography 361: 1058 ̶ 1069.

Muhlfield C. C., J. J. Giersch, F. R. Hauer, G. T. Pederson, G. Luikart, D. P. Douglas, C.

C. Downs, & D. B. Forge, 2011. Climate change links fate of glaciers and an

endemic alpine invertebrate. Clim. Change 106: 337 ̶ 345.

Nelson, C. H., 1979. Hansonoperla appalachia, a new genus and a new species of eastern

Nearctic Acroneuriini (Plecoptera: Perlidae), with a phenetic analysis of the

genera of the tribe. Annals of the Entomological Society of America 72: 735 ̶ 739.

Nelson C. H. & B. C. Kondratieff, 1980. Description of a new species of Alloperla

(Plecoptera: Chloroperlidae) from Virginia. Journal of the Kansas Entomological

Society 53: 801 ̶ 804.

Newcombe C. P. & D. D. Macdonald, 1991. Effects of suspended sediments on aquatic

ecosystems. North American Journal of Fisheries Management 11: 72 ̶ 82.

111

Pearson R. G., C. J. Raxworthy, M. Nakamura, A. T. Pearson, 2006. Predicting species

distributions from small numbers of occurrence records: a test case using cryptic

geckos in Madagascar. Journal of Biogeography 34: 102 ̶ 117.

Pessino M., E. T. Chabot, R. Giordano, R. E. DeWalt, 2014. Refugia and postglacial

expansion of Acroneuria frisoni Stark & Brown (Plecoptera: Perlidae) in North

America. Freshw. Sci. 33: 232 ̶ 249.

Petersen I., J. H. Winterbottom, S. Orton, N. Friberg, A. G. Hildrew, D. C. Spiers, W.

Gurney, 1999. Emergence and lateral dispersal of adult Plecoptera and

Trichoptera from Broadstone Stream, U. K. Freshwater Biology 42: 401 ̶ 416.

Phillips S. J. & M. Dudík, 2008. Modeling of species distributions with Maxent: new

extensions and a comprehensive evaluation. Ecography 31: 161 ̶ 175.

Phillips S. J., R. P. Anderson, R. E. Schapire, 2006. Maximum entropy modeling of

species geographic distributions. Ecological Modelling 190: 231 ̶ 259.

Polhemus D. A., 1993. Conservation of aquatic insects: worldwide crisis or localized

threats? Am. Zool. 33: 588 ̶ 598.

Pond G. J., 2012. Biodiversity loss in Appalachian headwater streams (Kentucky, USA):

Plecoptera and Trichoptera communities. Hydrobiologia 679: 97 ̶ 117. van Proosdij A. S. J., M. S. Sosef, J. J. Wiering, N. Raes, 2016. Minimum required

number of specimen records to develop accurate species distribution models.

Ecography 39: 542 ̶ 552.

112

QGIS Development Team (2020). QGIS Geographic Information System. Open Source

Geospatial Foundation Project. http://qgis.osgeo.org.

Robinson J. L., J. A. Fordyce, & C. R. Parker, 2016. Conservation of aquatic insect

species across a protected area network: null model reveals shortfalls of

biogeographical knowledge. Journal of Insect Conservation 20: 565 ̶ 581.

Rondinini C., K. A. Wilson, L. Boitani, H. Grantham, & H. P. Possingham, 2006.

Tradeoffs of different types of species occurrence data for use in systematic

conservation planning. Ecology Letters 9: 1136 ̶ 1145.

Rosenberg D. M. & V. H. Ross, 1993. Freshwater biomonitoring and benthic

macroinvertebrates. Chapman and Hall, New York.

Sanchez-Bayo F. & K. A. Wyckhuys, 2019. Worldwide decline of the entomofauna: a

review of its drivers. Biological Conservations 232: 8 ̶ 27.

Schultheis A. S., J. Y. Booth, L. R. Perlmutter, J. E. Bond, & A. L. Sheldon, 2012.

Phylogeography and species biogeography of montane Great Basin stoneflies.

Mol. Ecol. 21: 3325 ̶ 3340.

Sheldon A. L., 2012. Possible climate induced shift of stoneflies in a southern

Appalachian catchment. Freshwater Science 31: 765 ̶ 774.

Smith A. B., & M. J. Santos, 2020. Testing the ability of species distribution models to

infer variable importance. Ecography 43: 1801 ̶ 1813.

Stark B. P., B. C. Kondratieff, R. F. Kirchner, & K. W. Stewart, 2011. Larvae of eight

eastern North American Sweltsa (Plecoptera: Chloroperlidae). Illiesia 7: 51 ̶ 64.

113

Stark B. P., A. B. Harrison, B. C. Kondratieff, R. W. Baumann & K. C. Nye, 2016.

Distribution of the Smoky Willowfly, Bolotoperla rossi (Frison) (Plecoptera:

Taeniopterygidae: Brachypterainae) in Eastern North America. Illiesia 12: 15 ̶ 20.

Stepanian P. M., S. A. Entrekin, C. E. Wainwright, D. Mirkovic, J. L. Tank, & J. F.

Kelly, 2020. Declines in an abundant aquatic insect, the burrowing mayfly, across

major North American waterways. Proceedings of the North American Academy

of Science 117: 2987 ̶ 2992

Stewart K. W., B. P. Stark, 2002. Nymphs of the North American stonefly genera

(Plecoptera). Second edition. The Caddis Press, Columbus, Ohio, 510 pages.

Strayer D. L., 2006. Challenges for freshwater invertebrate conservation. Journal of the

North American Benthological Society 25: 271 ̶ 287.

Strayer D. L. & D. Dudgeon, 2010. Freshwater biodiversity conservation: recent progress

and future challenges. Freshwater Biology 29: 344 ̶ 358.

Surdick R. F., 2004. Chloroperlidae (The Sallflies). In: Stark B. P., B. J. Armitage (eds.).

The stoneflies (Plecoptera) of eastern North America. Volume II. Chloroperlidae,

Perlidae, and Perlodidae (Perlodinae). Ohio Biological Survey Bulletin New

Series Volume 14, Number 4. Ohio Biological Survey, Columbus, Ohio.

Tarter D. C. & R. F. Kirchner, 1980. List of the stoneflies (Plecoptera) of West Virginia.

Entomological News 91: 49 ̶ 53.

114

Tarter D. C. & C. H. Nelson, 2006. A revised checklist of the stoneflies (Plecoptera) of

West Virginia (USA). Proceedings of the Entomological Society of Washington

89: 429 ̶ 442.

Tarter D. C., D. L. Chaffee, S. A. Grubbs, & R.E. DeWalt, 2015. New state records of

Kentucky (USA) stoneflies (Plecoptera). Illiesia 11: 167 ̶ 174.

Tronstad L. M., M. D. Anderson, & K. M. Brown, 2018. Using species distribution

models to guide field surveys for an apparently rare aquatic beetle. Journal of Fish

and Wildlife Management 9: 330 ̶ 339.

United States Fish and Wildlife Service (USFWS), 2019. Endangered and Threatened

Wildlife and Plants; Threatened species status for Meltwater Lednian Stonefly

and Western Glacier Stonefly with a section 4(d) rule. Federal Register 84: 64210 ̶

64227.

United States Geological Survey (USGS), 2011. Gap Analysis Program,

GAP/LANDFIRE National Terrestrial Ecosystems;

https://doi.org/10.5066/F7ZS2TM0.

United State Geological Survey (USGS), 2017. 1/3rd arc-second Digital Elevation Models

(DEMs). USGS National Map 3DEP Downloadable Data Collection.

VanDerWal J., L. P. Shoo, C. Graham, & S. E. Williams, 2009. Selecting pseudo-absence

data for presence-only distribution modeling: How far should you stray from what

you know? Ecological Modeling 220: 589 ̶ 594.

115

Verdone C. J., 2018. Holomorphology and systematics of the eastern Nearctic stonefly

genus Remenus Ricker (Plecoptera: Perlodidae). Master’s thesis, Department of

Bioagricultural Sciences and Pest Management, Colorado State University, Fort

Collins, Colorado.

Verdone C. J., B. C. Kondratieff, R. E. DeWalt, & E. J. South, 2017. Studies on the

stoneflies of Georgia with the description of a new species of Soyedina Ricker,

new state records and an annotated checklist. Illiesia 13: 30 ̶ 49.

Wallace J. B., D. S. Vogel, & T. F. Cuffney, 1986. Recovery of a headwater stream from

an insecticide-induced community disturbance. Journal of the North American

Benthological Society 5: 115 ̶ 126.

Warren D.L., N. Matzke, M. Cardillo, J. Baumgartner, L. Beaumont, N. Huron, M.

Simões, Teresa L. Iglesias, & R. Dinnage, 2019. ENMTools (Software Package).

URL: https://github.com/danlwarren/ENMTools. doi:10.5281/zenodo.3268814

Warren D. L. & S. N. Steifert, 2011. Ecological niche modeling in MaxEnt: the

importance of model complexity and the performance of model selection criteria.

Ecological Applications 21: 335 ̶ 342.

Warton D. I., I. W. Renner, & D. Ramp, 2013. Model-based control of observer bias for

the analysis of presence-only data in ecology. PLoS ONE 8:e79168.

Webster C. R. & M. A. Jenkins, 2010. Coarse woody debris dynamics in the southern

Appalachians as affected by topographic position and anthropogenic disturbance

history. Forest Ecology and Management 217: 319 ̶ 330.

116

West A. M., S. Kumar, C. S. Brown, T. J. Stohlgren, & J. Bromberg, 2016. Field

validation of an invasive species Maxent model. Ecological Informatics 36: 126 ̶

134.

Wilcove D. S. & L. L. Master, 2005. How many endangered species are there in the

United States. Frontiers in Ecology and Environment 3: 414 ̶ 420.

Wisz M. S., R. J. Hijmans, J. Li, A. T. Peterson, C. H. Graham, & A. Guisan, 2008.

Effects of sample size on the performance of species distribution models.

Diversity and Distributions 14: 763 ̶ 773.

Woodward G., D. M. Perkins, & L. E. Brown, 2010. Climate change and freshwater

ecosystems: impacts across multiple levels of organization. Philosophical

Transactions of the Royal Society of Biological Sciences 365: 2093 ̶ 2106.

Yackulic C. B., R. Chandler, E. F. Zipkin, J. A. Royle, J. D. Nichols, E. H. C. Grant, & S.

Veran, 2012. Presence-only modelling using MAXENT: when can we trust the

inferences? Methods in Ecology and Evolution 4: 236 ̶ 243.

Yang L. H. & C. Gratton, 2014. Insects as drivers of ecosystem processes. Current

Opinion in Insect Science 2: 26 ̶ 32.

Yarnell S. L., 1998. The southern Appalachians: a history of the landscape. U.S.

Department of Agriculture, Forest Service, Southern Research Station.

Yost A. C., S. L. Petersen, M. Gregg, & R. Miller, 2008. Predictive modeling and

mapping sage grouse (Centrocercus urophasianus) nesting habitat using

117

Maximum Entropy and a long-term dataset from Southern Oregon. Ecological

Informatics 3: 375 ̶ 386.

Young D. C., B. C. Kondratieff & R. F. Kirchner, 1989. Description of male Ostrocerca

Ricker (Plecoptera: Nemouridae) using the scanning electron microscope.

Proceedings of the Entomological Society of Washington 91: 257 ̶ 268.

Young N. E., M. P. Fairchild, T. Belcher, P. Evangelista, C. J. Verdone, & T. J.

Stohlgren, 2019A. Finding the needle in the haystack: iterative sampling and

modeling for rare taxa. Journal of Insect Conservation 23: 589 ̶ 595.

Young M. K., R. J. Smith, K. L. Pilgrim, M. P. Fairchild, & M. K. Schwartz, 2019B.

Integrative refutes a species hypothesis: The asymmetric hybrid origin

of Arsapnia arapahoe (Plecoptera, Capniidae). Ecology and Evolution 9: 1364 ̶

1377.

Zwick P, 1992. Stream habitat fragmentation ̶ a threat to biodiversity. Biodiversity and

Conservation 1: 80 ̶ 97.

118