Clemson University TigerPrints

All Theses Theses

December 2020

Using Environmental DNA and Macroinvertebrate Biotic Integrity to Inform Conservation Efforts for the

Benjamin Schmidt Clemson University, [email protected]

Follow this and additional works at: https://tigerprints.clemson.edu/all_theses

Recommended Citation Schmidt, Benjamin, "Using Environmental DNA and Macroinvertebrate Biotic Integrity to Inform Conservation Efforts for the Carolina Heelsplitter" (2020). All Theses. 3441. https://tigerprints.clemson.edu/all_theses/3441

This Thesis is brought to you for free and open access by the Theses at TigerPrints. It has been accepted for inclusion in All Theses by an authorized administrator of TigerPrints. For more information, please contact [email protected].

USING ENVIRONMENTAL DNA AND MACROINVERTEBRATE BIOTIC INTEGRITY TO INFORM CONSERVATION EFFORTS FOR THE CAROLINA HEELSPLITTER

A Thesis Presented to the Graduate School of Clemson University

In Partial Fulfillment of the Requirements for the Degree Master of Science Wildlife and Fisheries Biology

by Benjamin Schmidt December 2020

Accepted by: Dr. Catherine M. Bodinof Jachowski, Committee Chair Dr. Stephen F. Spear Dr. Robert F. Baldwin

ABSTRACT

The Carolina Heelsplitter ( decorata) is a federally endangered freshwater mussel endemic to North and South Carolina, USA. The has experienced dramatic range-wide declines as a result of habitat fragmentation and water quality deterioration, and the remaining populations are isolated and extremely small. Conservation efforts for the Carolina Heelsplitter have been limited by a lack of knowledge regarding distribution, life history traits, and habitat requirements. Our objectives during this project were to 1. Evaluate the efficacy of an environmental DNA

(eDNA) assay to detect the Carolina Heelsplitter and a known host fish, the Bluehead

Chub, from stream water samples and 2. Develop a biotic integrity index to assess habitat suitability within three ecologically important watersheds.

To determine if eDNA techniques can be used to detect the Carolina Heelsplitter and the Bluehead Chub from stream water samples, we developed species specific primer pairs and probes for the target taxa and applied our assay to water samples collected from 100 randomly selected sites. While our assay validation successfully detected the Carolina Heelsplitter in stream segments where the species was known to occur, our results were highly variable, and we failed to detect the species at any new locations within the study area. Detection rates for the Bluehead Chub were high, indicating that host fish availability is relatively widespread within the study area and likely not a limiting factor in Carolina Heelsplitter recruitment. Our results suggest that

ii current eDNA sampling methods may be ineffective for some extremely rare unionid taxa.

To assess habitat suitability, we developed a predictive model of macroinvertebrate biotic integrity based on samples collected across 49 spatially balanced sites. We used multiple linear regression and model ranking criteria to evaluate potential drivers of biotic integrity at multiple spatial scales. We found that local stream conditions were influenced by a relatively large spatial extent, and that land use immediately adjacent to the stream edge plays a larger role in determining biological condition than land use across the entire upstream watershed. Our results can be used to identify areas of high and low habitat quality, evaluate connectivity between populations, and prioritize locations for reintroducing propagated mussels.

iii ACKNOWLEDGEMENTS

First and foremost, I would like to thank my advisor, Dr. Cathy Jachowski, for her mentorship, patience, and encouragement during my time at Clemson. I am incredibly grateful for your unwavering support through all the twists and turns of my graduate school journey. I would also like to thank my graduate committee, Dr. Stephen Spear and Dr. Robert Baldwin, for their guidance, advice, and insight during the formation of this thesis.

I would like to extend my greatest appreciation to all those who provided support during data collection and analysis. Thank you to Jonathan Wardell at the

Orangeburg National Fish Hatchery for teaching me how to identify freshwater mussels and for helping me collect control samples. Thank you to Morgan Kern at the S.C.

Department of Natural Resources and Morgan Wolf at the U.S. Fish and Wildlife Service for providing expertise and technical assistance. Thank you to Rachael Hoch at the N.C.

Wildlife Resources Commission for providing supplies and assistance in the pilot study.

Thank you to the technicians at the Wilds Conservation Center, Amelia Tomi, Sophie

Aller, Molly Garrett, and Kristina Black for spending countless hours processing water samples. Thank you to Dr. John Morse, Matthew Green, and Nathan Arey for collecting, sorting, and identifying macroinvertebrate samples.

I would like to thank my fellow graduate students whose friendship became the highlight of my time spent at Clemson. To Ellie Johnson in particular, thank you for believing in me and helping me stay focused whenever I was discouraged. Finally, I

iv would like to thank my parents, Anne and John Schmidt, for their love, continuous encouragement and emotional support. Thank you for inspiring me to follow my dreams.

This research was funded by the National Fish and Wildlife Foundation through the Southeast Aquatics Fund (project ID: 1906.17.058597), with additional support provided by Clemson University and The Wilds Conservation Center.

v TABLE OF CONTENTS

PAGE

TITLE PAGE ...... i

ABSTRACT ...... ii

ACKNOWLEDGEMENTS ...... iv

LIST OF TABLES ...... viii

LIST OF FIGURES ...... ix

CHAPTER

I. EVALUATING THE EFFICACY OF ENVIRONMENTAL DNA (eDNA) TO DETECT AN ENDANGERED FRESHWATER MUSSEL LASMIGONA DECORATA (: ) ...... 1

Abstract ...... 1 Introduction ...... 2 Methods ...... 5 Results ...... 15 Discussion ...... 17 References ...... 23

II. USING ENVIRONMENTAL DNA (eDNA) TO PREDICT HOST FISH DISTRIBUTION FOR AN ENDANGERED FRESHWATER MUSSEL LASMIGONA DECORATA ...... 38

Abstract ...... 38 Introduction ...... 39 Methods ...... 41 Results ...... 50 Discussion ...... 52 References ...... 55

vi TABLE OF CONTENTS (CONTINUED)

PAGE

III. PREDICTING BIOLOGICAL INTEGRITY IN STREAMS: THE IMPORTANCE OF SCALE AND VARIABLE SELECTION ...... 70

Abstract ...... 70 Introduction ...... 71 Methods ...... 74 Results ...... 83 Discussion ...... 85 References ...... 93

vii LIST OF TABLES

TABLE PAGE

1.1 Estimated detection probability for Lasmigona decorata based on sample replicates and volume ...... 32

1.2 List of covariates used to model sample inhibition in the upper Lynches River Sub-basin ...... 33

1.3 Candidate models used to investigate the probability of sample inhibition in stream water samples ...... 34

1.4 Coefficient estimates for the top-ranked models used to describe sample inhibition in stream water samples ...... 35

2.1 List of covariates used to model occupancy and detection probability of Nocomis leptocephalus ...... 62

2.2 Candidate models used to investigate occupancy and detection probability of Nocomis leptocephalus ...... 65

2.3 Coefficient estimates for the top-ranked models used to describe occupancy and detection probability of Nocomis leptocephalus ...... 66

3.1 Stratification of sampling locations based on stream order and percent forest cover within local catchments ...... 104

3.2 List of covariates used to model macroinvertebrate biotic integrity in the Upper Lynches River Sub-basin ...... 105

3.3 Candidate models used to investigate macroinvertebrate biotic integrity ...... 107

3.4 Coefficient estimates for the top-ranked model used to describe macroinvertebrate biotic integrity ...... 109

viii LIST OF FIGURES

FIGURE PAGE

1.1 Map of study area showing where environmental DNA water samples were collected ...... 36

1.2 Relationship between pH and the estimated probability of inhibition in environmental DNA samples ...... 37

2.1 Life cycle of Lasmigona decorata ...... 67

2.2 Map of study area showing where environmental DNA water samples were collected ...... 68

2.3 Spatial extents used to define landcover variables ...... 69

3.1 Map of study area showing where macroinvertebrate samples were collected ...... 110

3.2 Spatial extents used to define land cover variables ...... 111

3.3 Relationships between variables in the top-ranked model and predicted biotic integrity ...... 112

3.4 Comparison of predicted and observed biotic integrity scores after k-fold cross validation ...... 113

3.5 Predicted conditional biotic integrity scores for target streams based on parameter estimates in the top-ranked model ...... 114

ix CHAPTER ONE: EVALUATING THE EFFICACY OF ENVIRONMENTAL DNA (eDNA) TO DETECT AN ENDANGERED FRESHWATER MUSSEL LASMIGONA DECORATA (BIVALVIA: UNIONIDAE)

Benjamin C. Schmidt, Stephen F. Spear, and Catherine M. Bodinof Jachowski

ABSTRACT

Environmental DNA (eDNA) sampling is a powerful tool for monitoring aquatic organisms and has the potential to make significant contributions to the conservation of freshwater mussels. Conventional survey methods for mussels can be expensive, disruptive to habitat, and may be ineffective for detecting cryptic and rare species.

Environmental DNA can provide a non-invasive and cost efficient alternative, but its utility is highly dependent on environmental conditions and target taxa. In this study, we evaluated the efficacy of eDNA for detecting a federally endangered unionid mussel, the

Carolina Heelsplitter (Lasmigona decorata), in the Lynches River Sub-basin of North and

South Carolina, USA. We developed a qPCR assay for the detection of L. decorata from stream water samples and applied our assay to water samples collected from 100 randomly selected sites while using an internal positive control to monitor PCR inhibition in samples. We used logistic regression and model ranking criteria to assess the relationships between sample inhibition and environmental variables. While our assay validation successfully detected L. decorata in stream segments where the species was known to occur, detectability was highly variable, and we failed to detect the species at any new locations within the study area. We observed extensive inhibition in our samples, and our model results indicated that pH was a strong predictor of the

1 presence of inhibitory compounds. Our results highlight the importance of considering sample inhibition in eDNA surveys and suggest that current eDNA methods may be ineffective for some extremely rare unionid taxa.

INTRODUCTION

Freshwater mussels are among the most endangered taxa in North America with over 70% of species considered endangered, threatened, or listed as special concern

(Strayer et al. 2004, Williams et al. 1993). Freshwater mussels are long-lived, sessile filter feeders, with a unique life cycle in which larvae develop and disperse as obligate ectoparasites on host fish (Barnhart et al. 2008, Haag 2012). These life history characteristics make them particularly susceptible to anthropogenic disturbances and habitat degradation (Haag 2012). Within the Southeastern United States, mussel declines have been attributed to siltation, wastewater discharge, urban and agricultural runoff, stream impoundments, and the introduction of invasive species (Bogan 1993,

Haag and Williams 2014, Gillis et al. 2017, Vaughn and Taylor 1999). Despite their imperiled status, basic knowledge of habitat occupancy and species distributions are still lacking for many taxa (Strayer 2008). Understanding the current distribution of freshwater mussels and the status of remaining populations is necessary for establishing habitat requirements and directing conservation and restoration efforts.

Traditional survey methods for freshwater mussels, which primarily involve tactile and visual searches, depend on many factors including water clarity, time constraints, and surveyor skill, and may be susceptible to observer biases associated

2 with shell size or positioning in the substrate (Amyot and Downing 1991, Carlson et al.

2008, Hornbach and Deneka 1996). Intensive survey methods can be very expensive, cause habitat disruption, and may not be effective for species present at low densities

(Stoeckle et al. 2016). In addition, field identification of freshwater mussels is quite difficult, and both qualitative and quantitative survey methods require taxonomic expertise. In the context of widespread species declines and extinction risk, new survey methods are required to effectively assess mussel populations across broad spatial extents, especially when funding and logistical constraints are prohibitive for traditional survey methods.

Environmental DNA (eDNA) sampling is an emerging survey tool for monitoring aquatic organisms with the potential to greatly improve the efficiency and accuracy of aquatic sampling (Goldberg et al. 2016). This method is based on the collection, extraction, and amplification of genetic material that is shed from aquatic organisms into the water column (Jerde et al. 2011). The use of eDNA as a survey tool has been successful for multiple taxa, including amphibians, fishes, and invertebrates from both lentic and lotic systems (see Thomson and Willerslev 2015 for a review). While studies focusing on freshwater mussels are less common (Belle et al. 2019), previous work has indicated that eDNA may be a viable survey technique for mussels as well (Diener and

Altermatt 2014, Shogren et al. 2019, Stoeckle et al. 2016). In theory, the detection of target species using eDNA requires less time and effort than traditional survey methods and may also reduce observer error (Thomson and Willerslev 2015, Wilcox et al. 2013).

3 The use of eDNA sampling can be particularly beneficial for rare and endangered species when minimal disturbance of organisms and habitat is crucial (Rees et al. 2014).

Despite the potential benefits of eDNA sampling, there are a number of variables associated with sample collection and processing which can lead to errors and inaccurate results if not properly accounted for (Barnes et al. 2014, Jane et al. 2015).

One such variable is the presence of co-extracted compounds which can inhibit the amplification of target DNA in samples, leading to false negatives (McKee et al. 2015,

Wilcox et al. 2018). Previous studies have identified a variety of co-extracted compounds that can inhibit PCR amplification including tannic and humic acids, heavy metals, polysaccharides, and excess salts (Schrader et al. 2012). These compounds can result in sample inhibition through a variety of mechanisms including interfering with cell lysis during extraction (Wilson 1997), binding with DNA templates, primers or polymerases (Opel et al. 2010), and quenching the fluorescence of probes during qPCR

(Sidstedt et al. 2015). Depending on the environment and the conditions in which samples were collected, the number of inhibited water samples can be extensive.

Understanding the spatial distribution of inhibitory compounds in an environment may provide valuable insights into the dynamics of co-extracted compounds and potentially yield predictive models for determining the likelihood that a given sample will be inhibited.

The goal of this study was to evaluate the efficacy of eDNA as a survey tool for a critically imperiled freshwater mussel, the Carolina Heelsplitter (Lasmigona decorata

4 [Lea 1852]) in the Upper Lynches River Sub-basin in North and South Carolina. While the complete historic range of L. decorata is unknown, available records indicate that it was once fairly widely distributed in the Pee Dee, Catawba, Savannah, and Saluda River systems throughout North and South Carolina (USFWS 1996). Presently, the species is restricted to a few isolated streams and remnant populations are at risk from human induced habitat alterations and deteriorating water quality (USFWS 2012). Based on the most recent survey data, the extant populations are extremely small, with less than 200 individual mussels found in all of the surviving populations combined (USFWS 2012).

Recovery efforts for L. decorata would benefit from an eDNA assay, which could be used as a non-invasive survey tool to monitor existing populations and search for unknown populations. Our primary objective was to develop and validate an eDNA assay for L. decorata and assess whether detection was possible despite low densities. After observing a high level of sample inhibition during our initial sampling efforts, we added an additional objective to investigate patterns of sample inhibition in relation to environmental and methodological variables.

METHODS

Primer design

Existing sequence data for L. decorata, which included only the cytochrome C oxidase subunit 1 (CO1), was not variable enough to develop species-specific primers

(King et al. 1999). Thus, in order to create a primer set and probe for our assay, we extracted whole genomic DNA from tissue swabs of L. decorata captured within the

5 study area. We used the Qiagen DNeasy blood and tissue kit (Qiagen) following the standard protocol with swabs immersed in lysis buffer reagents. We amplified the complete NADH dehydrogenase (ND1) mitochondrial gene (~1000 base pairs) using the following primers (Serb et al. 2003):

Forward Primer (Leu-uurF) 5’-TGGCAGAAAAGTGCATCAGATTAAAGC-3’

Reverse Primer (LoGlyR) 5’-CCTGCTTGGAAGGCAAGTGTACT-3’

We ran extracts in 25 μl reactions consisting of 2.5 μl DNA extract, 12.5 μl 2X

Qiagen multiplex PCR master mix (Qiagen), and 0.2 μm of each forward and reverse primer on a Bio-Rad T100 thermocycler (Bio-Rad). The thermal cycling protocol consisted of an initial denaturing at 95°C for 15 minutes followed by 34 cycles of 94°C for 30 seconds (denaturing), 57°C for 90 seconds (annealing), and 72°C for 90 seconds

(elongation), followed by a 10-minute final elongation at 72°C. Extraction and initial PCR was done at The Wilds Conservation Science Training Center in Cumberland, Ohio. Prior to sequencing, we purified PCR product with Exo-SAP IT reagent (Thermo-Fisher

Scientific). We submitted PCR product to the Ohio University Genomics Facility for bi- directional Sanger sequencing using the BigDye Terminator cycle sequencing kit (v 3.1,

Thermo-Fisher Scientific) on an Applied Biosystems 3130xl capillary sequencer (Thermo-

Fisher Scientific). We inspected and aligned sequence data using Geneious Prime (v.

6 2019.0, Kearse et al. 2012) and submitted sequence data of the ND1 gene of L. decorata to GenBank.

Assay development

We designed a primer set and probe for L. decorata using Primer3Plus

(Untergasser et al. 2012). The primer pair amplifies a 109bp region of the NADH dehydrogenase subunit 1 (ND1) mitochondrial gene. The sequence of the forward primer, reverse primer, and probe are as follows:

Forward Primer 5’ CATGTCTGCCTGAGCAGTTA 3’

Reverse Primer 5’ TGACTCTCCTTCTGCGAAATC 3’

Probe 5’ 6FAM – TGCAAGAATGACTGCTAACCATATAGCCC – ZEN-IBFQ 3’

We assessed specificity (i.e., the probability of amplifying non-target DNA) using

Primer-BLAST (Ye et al. 2012) by comparing primers to all available sequence data including both related species and non-target species known to occur within the region.

In addition, we assessed the potential for cross amplification with sympatric taxa using tissue-derived DNA from Strophitus undulatus from individuals collected in the study area.

Assay validation

In order to validate the selected primers, we collected positive controls from L. decorata propagation tanks at the Orangeburg Mussel Conservation Center in South

7 Carolina. Once validated, we established a pilot study to investigate the role of water sample volume on assay sensitivity (i.e., the probability of detecting the target species when present). In March 2019, we collected 6 replicate water samples at 3 volumes (1 L,

2 L, and 3 L) from a stream locality with an extant population of L. decorata and calculated the probability of detection in relation to sample volume and the number of replicate samples (Table 1). Based on the results of this pilot study and the logistical constraints of field sampling, we chose to collect two 2 L samples from each site. Based on a mean estimated detection probability of 0.36 (95% CI = 0.19–0.56), we estimated that this approach would facilitate a 59% (i.e., 1-(1-0.36)2 = 0.59) chance of detecting our focal species in at least one of our replicate samples from a truly occupied site.

Study area and sampling design

Our study took place in three watersheds within the Lynches River sub-basin, representing the upper Lynches river and its primary tributaries (Figure 1). The study area is a heterogenous landscape and marks the transition between the Piedmont and

Southeastern Plains ecoregions in North and South Carolina (Omernik and Griffith 2014,

USEPA 2013). The Lynches River sub-basin (HUC 10: 03040202) is recognized as a conservation priority in South Carolina and contains 67 species of fish, 10 species of crayfish, and 15 species of freshwater mussel, many of which are considered imperilled

(Elkins et al. 2016). Furthermore, the largest known extant L. decorata populations occur in our study area and represent the most stable of the six critical habitat units designated by the United States Fish and Wildlife Service (USFWS 2002).

8 Using ArcMap 10.7 (Esri) and the National Hydrology Dataset Plus (NHDPlus v2.1)

(USGS 2019), we identified potential sampling sites as all locations where a navigable roadway intersected a stream as defined by the National US Primary and Secondary

Roads TIGER/line dataset (US Census Bureau 2017). From the pool of 496 sites, we selected a total of 100 sampling sites using a generalized random tessellation stratified sampling design with the ‘spsurvey’ package (Kincaid et al. 2019) in R (version 3.6.1, R

Core Team 2019). We stratified sites by stream order (five levels: 1st, 2nd, 3rd, 4th, and 5th) to ensure sampling across a wide range of stream sizes and established a minimum stream distance of 1 km between sites to reduce spatial autocorrelation. Due to a limited number of fourth and fifth order stream sites within the study area, we sampled all available sites within those orders and distributed the remaining sites across the lower order streams. In total, we sampled 26 first order streams, 33 second order streams, 24 third order streams, 13 fourth order streams, and four fifth order streams.

Field sampling

At each site, we collected two consecutive 2 L surface water samples from the stream edge using new polyethylene bottles. To monitor potential contamination between sites, we paired water samples from each site with 1 L negative controls consisting of distilled or tap water poured on site and transported back to the lab alongside the stream water samples. All water samples were transported in individual sealed plastic bags on ice and processed within 12 hours of collection. Following sample collection, we recorded water quality data from the point of collection using a YSI Pro

9 Plus (Xylem Analytics) and a portable turbidity meter (Sper Scientific, Ltd.). We collected samples between May 7 and July 4 of 2019 during three separate sampling periods. We collected samples from 19 sites between May 7–11, 46 sites between May 28–June 8, and 35 sites between July 1–4. We collected positive controls during each sampling period from a fourth order stream locality with an extant population of L. decorata

(Figure 1).

Based on the low number of detections from the initial sample of 100 sites (see results), we sampled an additional 16 sites in October 2019 as we suspected that detection of L. decorata may increase in the fall as a result of spawning (e.g., Spear et al.

2015). While the timing of fall spawning for L. decorata remains largely unknown, state and federal malacologists suspect that it occurs between August and November based on water temperature cues (M. Kern and M. Wolf, personal communication, 2019). We selected these 16 sites non-randomly along a 42 km stream network that collectively encompassed the designated critical habitat for the species within the study area

(USFWS 2002). We suspected that this network would include the portion of our study area most likely to harbor previously undetected L. decorata populations.

We filtered water samples through 5 μm cellulose nitrate filters (Whatman

International, Ltd.) using an electric vacuum pump assembly. When filters became clogged, we used multiple (up to 4) filters to reach our target volume (2L) for each sample. We stored filters in microcentrifuge tubes containing 95% ethanol at -80 °C until

DNA extraction.

10 Laboratory methods

We extracted DNA from the filters using the DNeasy Blood and Tissue Kit

(Qiagen) with the addition of a Qiashredder spin column (Qiagen) after the initial lysis buffer step (Goldberg et al. 2011). Prior to extraction, we cut filters in half, storing one half in ethanol as a backup and allowing the other half to dry overnight. For samples that required multiple filters, we processed each filter separately until the spin column step, at which point we loaded each filter extract onto the same column and processed the combined extract as a single sample for the remainder of the extraction process.

We used DNA extractions from tissue swabs of L. decorata collected within the study area to create standards for the qPCR assay. Each plate included standards at four dilution points (10-3, 10-4, 10-5, and 10-6) in addition to an extraction negative and a no template control. We conducted all extractions and PCR setup within lab space dedicated to low-copy DNA extractions at The Wilds Conservation Science Center in

Cumberland, Ohio.

We ran real time qPCR assays in triplicate using an Applied Biosystems 7500 Fast

Real-Time PCR System (Thermo-Fisher Scientific) (Heid et al. 1996). Samples were run in

20 ul reactions consisting of 7.5 μl of Quantitect© Multiplex PCR mastermix (Qiagen), 0.4 uM of the forward and reverse primers, 0.2 uM of probe, 0.6 μl of 10x TaqMan

Exogenous Internal Positive Control (IPC) Mix (Thermo-Fisher Scientific), 0.3 μl of 50x

TaqMan Exogenous Internal Positive Control DNA (Thermo-Fisher Scientific), 3.0 ul

RNase free water, and 2.85 ul of sample extract. After failing to detect L. decorata in

11 positive field controls from the first sampling period (see results), we modified the protocol by replacing the RNase free water with an additional 3 ul of sample extract in an attempt to increase the amount of target DNA and thus detection for the remaining samples. We amplified a total of 59 samples using 2.85 ul of sample extract and 139 samples using 5.85 of sample extract. The qPCR cycling protocol consisted of an initial denaturing step at 95°C for 15 minutes, followed by 50 cycles of 94°C for 1 minute

(denaturation) and 60°C for 1 minute (annealing and extension).

We used Applied Biosystems Software v2.06 (Thermo-Fisher Scientific) to analyze the amplification results. Manual thresholds for amplification were set using the

Ct value at which all standards amplified. We considered samples to be positive, and indicative of species presence, if they amplified the IPC and target DNA in at least one of the three PCR technical replicates. To reduce the risk of false positives due to contamination, we only considered samples to be positive if there was no observed amplification in their associated negative field control, extraction control, and no template control. We considered samples to be inhibited if the IPC failed to amplify during qPCR (Hartman et al. 2005). We ran a subset of the inhibited samples (n = 52) through a commercial inhibitor removal spin column (Zymo Research) to investigate the potential for removing inhibitory compounds from samples collected in the study area.

Model development and selection

We used logistic regression to investigate the effects of environmental variables and lab methods on the probability of inhibition in samples (Table 2). We considered

12 sample inhibition as a binary response variable based on whether the IPC crossed the amplification threshold set by the standards. We screened variables based on Pearson’s correlation coefficients using package ‘Hmisc’ (Harrell 2020) in R (version 3.6.1, R Core

Team 2019) to avoid coefficient estimation issues when predictors were collinear (|r| <

60%). Physiography and pH were highly correlated (r = 0.69, df = 196, p = < 0.001). We retained pH over physiography after comparing AICc scores of univariate models with either pH or physiography.

We developed a candidate set of seven models, each of which corresponded to an a priori hypothesis concerning drivers of inhibition (Table 3). We used a laboratory methods model to represent our hypothesis that a higher sample extract volume, especially from combined filters, would yield higher concentrations of co-extracted substances, some of which may act as inhibitory compounds during qPCR. We used a model with seasonal variables and sample period to represent our hypothesis that sample inhibition would become more prevalent in mid to late summer when water levels are typically lower and thus inhibitory compounds may be more concentrated in the water column (Jane et al. 2015, Wachob et al. 2009). We used a univariate model with stream order to represent our hypothesis that the probability of inhibition will be lower in samples from larger streams as inhibitory compounds are likely more diluted in the water column. We used a water chemistry model to represent our hypothesis that the concentration of inhibitory compounds (e.g., humic acids) would be higher in streams with degraded water quality (e.g., high conductivity, low dissolved oxygen (DO),

13 and low pH). In addition, we used a univariate model with pH to represent our hypothesis that inhibitory compounds would be most prevalent in highly acidic streams

(Tan et al. 1990). We used a univariate model with turbidity to represent our hypothesis that suspended particulate is a primary driver of sample inhibition (Harper et al. 2018,

Williams 2017). Finally, we included a null model that assumed a constant rate of inhibition across all samples. We included site ID as a random effect in every model to account for the similarities between samples collected from the same stream locality.

We fit models using maximum likelihood methods with the package ‘lme4’

(Bates et al. 2015) and ranked models using the Akaike Information Criterion adjusted for small sample size (AICc) (Akaike 1974) with the package ‘AICmodavg’ (Mazerolle

2019) in R (version 3.6.1, R Core Team 2019). We selected the top ranked models based on a 95% confidence set (Burnham and Anderson 2002). We based inference on well supported parameters in our top ranked models, where we considered parameters to be well supported if the 95% confidence interval for the effect size did not overlap 0. To understand the variability in our data explained by covariates, we calculated marginal

(fixed effects only) and conditional (fixed and random effects) R2 values for the top ranked model using package ‘MuMIn’ in R (Bartoń 2020, Nakagawa et al. 2017).

While AIC model ranking identifies the most supported model within the candidate suite, it does not provide a metric for model performance. To assess performance and predictive ability for the top ranked models, we used K fold cross validation (Boyce et al. 2002). We ran five iterations using an 80:20 split of training to

14 testing data, refitting the top ranked models for each replicate of training data. We then used the newly fitted models to make predictions for the corresponding testing data.

We pooled the results of the testing data and used a receiver operating characteristic curve (ROC) to evaluate the predictive ability of each model (i.e., the ability to distinguish inhibited samples from uninhibited samples).

Briefly, a ROC curve plots the true positive rate (sensitivity) and false positive rate (1-specificity) of a model at all possible threshold settings. The area under the curve

(AUC) is an assessment of a model’s predictive power and ranges from 0.5 to 1.0. When the AUC = 1, the model has perfect predictive ability and when the AUC is 0.5, the predictive ability of a model is no better than random chance. A model with good predictive ability has an AUC of greater than 0.7 (Boyce et al. 2002).

RESULTS

Detection of L. decorata

We collected a total of 200 samples from 100 sites. All negative field controls, extraction controls, and no template controls performed as expected with no indication of target DNA amplification. However, we removed one site from analysis due to inhibition in the negative control. Aside from our initial positive controls used in assay validation, we did not detect L. decorata from any stream water samples, including later positive controls and samples collected in October during the presumed fall spawning.

15 Patterns of inhibition

We used a total of 198 samples collected from 99 sites to investigate patterns of inhibition. In total, 38% of our samples (78 of 198) were inhibited. Thirty-four sites were inhibited across all samples and an additional 10 sites were inhibited in at least one of the samples (Figure 2). In a post-hoc exploration of our results, we noted that sample inhibition was strongly correlated with physiographic province and sites with catchments located primarily in the Southeast Plains ecoregion were much more likely to have inhibited samples than sites draining lands in the Piedmont ecoregion (75% of samples from the SE Plains vs. 15% of samples from the Piedmont). After running sample extract from 52 inhibited samples through the commercial inhibition removal spin kit (Zymo Research), 23% of processed samples amplified the IPC during the second round of qPCR (12 of 52), though none of the cleared samples amplified target DNA.

Two models fell within our 95% confidence set and collectively accounted for

100% of the AICc model weight (Table 3). The top ranked model included pH as a univariate parameter (wi = 0.58), and the second ranked model included pH as well as

DO and conductivity (wi = 0.42). Model weights indicated that the pH model was 1.4 times more likely to be the best fitting model in our candidate set than the second ranked model. Both models indicated that pH was a strong predictor of sample inhibition. (Table 4). In addition, the second ranked model indicated that sample inhibition was positively associated with DO concentrations and conductivity. However,

95% confidence intervals for both parameters overlapped 0, suggesting considerable

16 uncertainty in the effects of DO and conductivity (Table 4). For simplicity, we do not discuss the effects of DO or conductivity in our second ranked model further.

The top ranked model explained 98% of observed variation in sample inhibition

(conditional R2 = 0.98), with fixed effects accounting for the majority of the variation

(marginal R2 = 0.66) (Nakagawa et al. 2017). Both top-ranked models indicated a strong threshold relationship between pH and inhibition. Below a pH of 6, 95% of samples were inhibited, whereas above a pH of 6, only 16% of samples were inhibited (Figure 3). Our five-fold model cross validation suggested that the pH model performed well when used to predict whether water samples would be inhibited (AUC = 0.89).

DISCUSSION

Detection of L. decorata

Environmental DNA is quickly becoming an important tool for conservation biology and the detection of rare and endangered taxa (Thompson and Willerslev 2015).

The number of studies using eDNA techniques has increased exponentially in the last decade, coupled with a similar increase in the number of reviews, opinion papers, and published protocols (Belle eta l. 2019, Coble et al. 2019). Many of these studies have indicated that eDNA may be more sensitive than traditional survey methods for detecting cryptic or rare taxa, with new eDNA detections often confirmed by intensive follow-up surveys using traditional techniques (Dejean et al. 2012, Jerde et al. 2011,

Pilliod et al. 2013). In contrast to these studies, our eDNA assay was less sensitive than previous visual and tactile surveys implemented within the study area. While the

17 occupancy status of our sampling sites was unknown, we repeatedly failed to detect L. decorata at our positive control site which was located immediately downstream from a known population (M. Wolf, Personal Communication, 2019). Our results indicate that eDNA sampling may not currently be a viable survey for all freshwater mussels, especially for endangered species that occur at low densities.

Many aspects of eDNA detection in lotic systems remain poorly understood

(Roussel et al. 2015). Our inability to amplify target DNA from stream water samples may be the result of true absence, low density populations, rates of mussel eDNA shedding, rates of eDNA degradation or dilution, or processes affecting suspension and downstream transport (Harrison et al. 2019, Rees et al. 2014, Sansom and Sassoubre

2017). Previous studies have indicated that detection of freshwater mussels using eDNA may depend on mussel density and overall biomass. For example, Wacker et al. (2019) found that detection of Margaritifera margaritifera was highly dependent on the size of the upstream population and observed low detection rates immediately downstream from a population of 100 individuals. Similarly, Gasparini et al. (2020) was unable to detect unionid DNA at distances of more than 10 m from a small caged population of

Lampsilis fasciola. While some freshwater mussel species exist in large populations that can be detected at distances of 1 km or more (Deiner and Altermatt 2014, Shogren et al.

2019), many imperiled species exist at low densities in fragmented populations (Strayer et al. 2004). Our results indicate that detection of these low-density mussel populations using eDNA may be difficult without additional methods to improve sensitivity.

18 In addition to density, seasonal changes in environmental conditions and organism behavior can affect the concentration of eDNA in the water column by altering both the rate of eDNA production and the time frame in which it can be detected. For example, water temperature has a strong effect on the rates of DNA degradation in lotic systems (Eichmiller et al. 2015, Strickler et al. 2015, Tsuji et al. 2017) and seasonal changes in hydrology impact rates of DNA suspension and dilution (Jane et al. 2015). For freshwater mussels, reproductive events such as spawning or the release of glochidia may increase the concentration of genetic material in the water column. For example,

Wacker et al. (2019) observed a 20-fold increase in unionid eDNA concentrations during late summer, coinciding with spawning of M. margaritifera. Similar increases of aquatic eDNA during periods of reproduction have been observed for amphibians (Buxton et al.

2017, Spear et al. 2015), fish (Milhau et al. 2019), and arthropods (Dunn et al. 2017).

Importantly, we were able to detect L. decorata at our positive control site in March of

2019 during assay validation. However, we failed to detect the species in subsequent control samples collected from the same locality in May, June, July, and October of

2019. Within our study area, early spring conditions include cooler water temperatures and higher flow rates than in the summer and fall, potentially resulting in slower degradation rates and allowing genetic material to remain suspended in the water column for longer distances from the source (Jane et al. 2015). In addition, the spring release of glochids by gravid female L. decorata may have increased the concentration of eDNA in the water column during spring compared to later sampling periods. Similar

19 increases in eDNA concentration may also occur during fall spawning, when male L. decorata broadcast sperm into the water column. Our inability to detect L. decorata during the fall sampling period may indicate that the spawning period occurs before or after mid-October. In some unionid species, spawning is highly synchronous and may last only a few weeks (Haggarty et al. 2011, Haggarty and Garner 2000). We recommend that future sampling be conducted at set intervals during both spring and fall to better understand how reproductive events affect the seasonal detection of L. decorata.

Patterns of inhibition

We observed inhibition in a considerable number of samples, highlighting the importance of including an internal positive control (IPC) in all samples during qPCR to reduce the risk of false negatives resulting from PCR inhibition. While previous studies have investigated the temporal variation in co-extracted compounds (Gasparini et al.

2020, Jane et al. 2015), few studies have considered the spatial distribution of inhibitory compounds in a given environment. Our results indicate that inhibitory compounds are heterogeneously distributed in our study area, with higher concentrations occurring in streams with low pH. This may be due to the presence of dissolved organic matter, such as humic acids and tannins, which can lower the pH of stream water at high concentrations (Hemmond 1994, Tan et al. 1990). The strong correlation between pH and physiography indicates that eDNA sampling may be difficult areas where streams are naturally acidic, such as the Southeast Plains ecoregion. Measuring the pH of sampling sites could potentially be an efficient and low-cost way to estimate the

20 probability of sample inhibition before investing in eDNA sample collection and processing. In addition, acidic samples may contain less eDNA than neutral of alkaline samples because acidic conditions catalyze hydrolytic processes that degrade DNA

(Seymour et al. 2018).

A variety of methods have been implemented to reduce the effects of inhibitory compounds including sample dilution, additive compounds such as bovine serum, or the use of inhibition resistant reagents such as Taqman Environmental Master Mix (Thermo-

Fisher Scientific) (Jane et al. 2015, Kreader 1996, McKee et al. 2015). In addition, methods to actively remove co-extracted compounds from sample extract using inhibitor removal kits, such as the Zymo OneStep (Zymo Research) spin column, have been successful in some cases (McKee et al. 2015, Niemiller et al. 2018, Williams et al.

2017). However, after processing a subset of our inhibited samples using the Zymo spin columns, the majority (77%) remained fully inhibited during the second round of amplification. These results indicate that post extraction inhibitor removal kits may not be an effective method for processing inhibited samples in our study area.

Management implications

Our study adds to the growing body of literature on the utility of eDNA sampling to detect and map the distribution of freshwater mussels and it is among the first to attempt watershed-wide detection of an endangered mussel. Our results provide insight into the potential limitations of eDNA sampling and highlight the need for high- resolution sampling designs for rare and endangered mussels. Several methods for

21 increasing eDNA detection have been identified including the use of nested primers

(Stoeckle et al. 2016), multi-filter assemblies (Hunter et al. 2019), and droplet digital PCR technology (Doi et al. 2015, Hunter et al. 2017). In addition, a recent study by

Schabacker et al. (2020) utilized large pore filter membranes and a mesh tow net to collect over 3,000 L of water per eDNA sample in lentic environments; a similar method could be employed in lotic environments to collect significantly larger volumes of water, and potentially larger quantities of eDNA (Wilcox et al. 2018).

While we were unable to detect L. decorata from sampling sites, our eDNA assay was effective in the detection of target DNA from propagation tanks, where population densities were high. This indicates that our primer and probe design may be useful for identifying specimen with ambiguous morphology or for the molecular ID of glochidia in host specificity studies (e.g., Kneeland and Rhymer 2008). In addition, our sample extracts can be repurposed to detect other taxa within the study area (Dysthe et al.

2018), which is beneficial considering the number of imperilled taxa in the Upper

Lynches River sub-basin and the costs associated with eDNA sample collection and extraction.

22 REFERENCES

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19(6): 716–723.

Amyot, J. P. and J. A. Downing. 1991. Endo and epibenthic distribution of unionid mollusc Elliptio complanata. North American Benthological Society 10: 280–285.

Barnes, M. A., C. R. Turner, C. L. Jerde, M. A. Renshaw, W. L. Chadderton, and D. M. Lodge. 2014. Environmental conditions influence eDNA persistence in aquatic systems. Environmental Science and Technology 48(3): 1819–1827.

Barnhart, M. C., Haag, W. R., and W. N. Roston. 2008. Adaptations to host infection and larval parasitism in Unionidae. Journal of the North American Benthological Society 27(2): 370–394.

Bartoń, K. 2020. MuMIn: Multi-model inference. R package version 1.43.17. https: // CRAN.R project.org/package = MuMIn.

Bates, D., M. Mächler, B. Bolker, and S. Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1): 1–48.

Belle, C. C., B. C. Stoeckle, and J. Geist. 2019. Taxonomic and geographical representation of freshwater environmental DNA research in aquatic conservation. Aquatic Conservation Marine and Freshwater Ecosystems 29: 1996–2009.

Bogan, A. E. 1993. Freshwater bivalve extinctions (: ): A search for causes. American Zoologist 33(6): 599–609.

Boyce, M. S., P. R. Vernier, S. E. Nielson, and F. K. Schmiegelow. 2002. Evaluating resource selection functions. Ecological Modelling 157: 281–300.

Burnham, K. and D. Anderson. 2002. Model selection and multimodel inference, Second Ed. Springer-Verlag, New York, NY.

Buxton, A. S., J. J. Groombridge, N. B. Zakaria, and R. A. Griffiths. 2017. Seasonal variation in environmental DNA in relation to population size and environmental factors. Scientific Reports 7: 46294.

23 Carlson, S., A. Lawrence, H. Blalock-Herod, K. McCafferty, and S. Abbott. 2008. Freshwater mussel survey protocol for the southeastern Atlantic slope northeastern gulf drainages in Florida and Georgia. United States Fish and Wildlife Service: Ecological Services and Fisheries Resources Offices.

Coble, A. A., C. A. Flinders, J. A. Homyack, B. E. Penaluna, R. C. Cronn, and K. Weitemier. 2019. eDNA as a tool for identifying freshwater species in sustainable forestry: A critical review and potential future applications. Science of the Total Environment 649: 1157–1170.

Deiner, K. and F. Altermatt. 2014. Transport distance of invertebrate environmental DNA in a natural river. PLoS One 92: e88786.

Dejean, T. A. Valentini, C. Miquel, P. Taberlet, E. Bellemain, and C. Miaud. 2012. Improved detection of an alien invasive species through environmental DNA barcoding: The example of the American Bullfrog Lithobates catesbeianus. Journal of Applied Ecology 49: 953–959.

Doi, H., T. Takahara, T. Minamoto, S. Matsuhashi, K. Uchii, and H. Yamanaka. 2015. Droplet digital polymerase chain reaction (PCR) outperforms real-time PCR in the detection of environmental DNA from an invasive fish species. Environmental Science and Technology 49(9): 5601–5608.

Dysthe, J. C., T. Rodgers, T. W. Franklin, K. J. Carim, M. K. Young, K. S. McKelvey, K. E. Mock, and M. K. Schwartz. 2018. Repurposing environmental DNA samples: Detecting the western pearlshell (Margaritifera falcata) as a proof of concept. Ecology and Evolution 8(5): 2659–2670.

Dunn, N., V. Priestley, A. Herraiz, R. Arnold, and V. Savolainen. 2017. Behavior and season affect crayfish detection and density inference using environmental DNA. Ecology and Evolution 7: 7777–7785.

Eichmiller, J. J., S. E. Best, and P. W. Sorensen. 2016. Effects of temperature and trophic state on degradation of environmental DNA in lake water. Environmental Science and Technology 50(4): 1859–1867.

Elkins, D. C., S. C. Sweat, K. S. Hill, B. R. Kuhajda, A. L. George, and S. L. Wenger. 2016. The Southeastern aquatic biodiversity conservation strategy. Final Report. University of Georgic River Basin Center, Athens, GA.

24 Gasparini, L. S. Crookes, R. S. Prosser, and R. Hanner. 2020. Detection of freshwater mussels (Unionidae) using environmental DNA in riverine systems. Environmental DNA 2: 321–329.

Gillis, P. L., R. McInnis, J. Salerno, S. R. de Solla, M. R. Servos, and E. M. Leonard. 2017. Freshwater mussels in an urban watershed: Impacts of anthropogenic inputs and habitat alterations on populations. Science of the Total Environment 574: 671– 679.

Goldberg, C. S., C. R. Turner, C. R. Deiner, K. Klymus, K. E. Thomsen, P. F. Murphy, S. F. Spear, A. McKee, S. J. Oyler-McCance, R. S. Cornman, M. B. Laramie, A. R. Mahon, R. F. Lance, D. S. Pilliod, K. M. Strickler, L. P. Waits, A. K. Fremier, T. Takahara, J. E. Herder, and P. Taberlet. 2016. Critical considerations for the application of environmental DNA methods to detect aquatic species. Methods in Ecology and Evolution 7: 1299–1307.

Goldberg, C. S., D. S. Pilliod, R. S. Arkle, and L. P. Waits. 2011. Molecular detection of vertebrates in stream water: A demonstration using Rocky Mountain Tailed Frogs and Idaho Giant Salamanders. PLoS One 6: e22746.

Haag, W. R. 2012. North American freshwater mussels: Natural history, ecology, and conservation. Cambridge University Press, Cambridge, UK.

Haag, W. R. and J. D. Williams. 2014. Biodiversity on the brink: An assessment of conservation strategies for North American freshwater mussels. Hydrobiologia 735: 45–60.

Haggerty, T. M. and Garner, J. T. 2000. Seasonal timing of gametogenesis, spawning, brooding, and glochidial discharge in Potamilus alatus (Bivalvia: Unionidae) in Wheeler Reservoir, Tennessee River, Alabama, USA. Invertebrate Reproduction and Development 38: 35–41.

Haggerty, T. M., J. T. Garner, A. E. Crews, and R. Kawamura. 2011. Reproductive seasonality and annual fecundity in Arcidens confragosus (Unionidae: Unioninae: Anodontini) from Tennessee River, Alabama, USA. Invertebrate Reproduction and Development 55: 230–235.

25 Harper, L. R., A. S. Buxton, H. C. Rees, K. Bruce, R. Brys, D. Halfmaerten, D. S. Read, H. V. Watson, C. D. Sayer, E. P. Jones, V. Priestley, E. Mächler, C. Múrria, S. Garcés- Pastor, C. Medupin, K. Burgess, G. Benson, N. Boonham, R. A. Griffiths, L. L. Handley, and B. Hänfling. 2018. Prospects and Challenges of environmental DNA (eDNA) monitoring in freshwater ponds. Hydrobiologia 826: 25–41.

Harrell, F. E. 2020. Hmisc: Harrell miscellaneous. R Package version 4.4. https: //CRAN.R- project.org/package = Hmisc.

Harrison, J. B., J. M. Sunday, and S. M. Rogers. 2019. Predicting the fate of eDNA in the environment and implications for studying biodiversity. Proceedings of the Royal Society B 286: 20191409.

Hartman, L. J., S. R. Coyne, and D. A. Norwood. 2005. Development of a novel internal positive control for Taqman based assays. Molecular and Cellular Probes 19(1): 51–59.

Heid, C. A., J. Stevens, K. J. Livak, and P. M. Williams. 1996. Real time quantitative PCR. Genomics Research 6: 986–994.

Hemmond, H. F. 1994. The role of organic acids in acidification of freshwaters. In C. W. Steinberg and R. F. Wright (Eds.) Acidification of freshwater ecosystems: Implications for the future. John Wiley and Sons Ltd., New York, NY.

Hornbach, D. J. and T. Deneka. 1996. A comparison of a qualitative and quantitative collection method for examining freshwater mussel assemblages. North American Benthological Society 15: 587–596.

Hunter, M. E., J. A. Ferrante, G. Meigs-Friend, and A. Ulmer. 2019. Improving eDNA yield and inhibitor reduction through increased water volumes and multi-filter isolation techniques. Scientific Reports 9: 5259.

Hunter, M. E., R. M. Dorazio, J. S. Butterfield, G. Meigs-Friend, L. G. Nico, and J. A. Ferrante. 2017. Detection limits of quantitative and digital PCR assays and their influence in presence-absence surveys of environmental DNA. Molecular Ecology Resources 17: 221–229.

Jane, S. F., T. M. Wilcox, K. S. McKelvey, M. K. Young, M. K. Schwartz, W. H. Lowe, B. H. Letcher, and A. R. Whiteley. 2015. Distance, flow, and PCR inhibition: eDNA dynamics in two headwater streams. Molecular Ecology Resources 15: 216–227.

26 Jerde, C. L., A. R. Mahon, W. L. Chadderton, and D. M. Lodge. 2011. “Sight-unseen” detection of rare aquatic species using environmental DNA. Conservation Letters 4: 150–157.

Kearse, M., R. Moir, A. Wilson, S. Stones-Havas, M. Cheung, S. Sturrock, S. Buxton, A. Cooper, S. Markowitz, C. Duran, T. Thierer, B. Ashton, P. Meintjes, and A. Drummond. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28(12): 1647–1649.

Kincaid, T. M., A. R. Olsen, and M. H. Weber. 2019. Spsurvey: Spatial survey design and analysis. R Package version 4.1.0

King, T. L., M. S. Eackles, B. Gjetvaj, and W. R. Hoeh. 1999. Intraspecific phylogeography of Lasmigona subviridis (Bivalvia: Unionidae): Conservation implications of range discontinuity. Molecular Ecology 8: 65-78.

Kneeland, S. and J. Rhymer. 2008. Determination of fish host use by wild populations of rare freshwater mussels using a molecular identification key to identify glochidia. Journal of the North American Benthological Society 27: 150–160.

Kreader, C. A. 1996. Relief of amplification inhibition in PCR with bovine serum albumin or T4 gene 32 protein. Applied and Environmental Microbiology 62: 1102–1106.

Lea, I. 1852. Descriptions of new species in the family Unionidae. Transactions of the American Philosophical Society 10: 253–294.

Mazerolle, M. J. 2019. AICmodavg: Model selection and multimodel inference based on (Q)AIC(c). R Package version 2.2.2. https: //CRAN.R-project.org/package = AICmodavg.

McKee, A. M., S. F. Spear, and T. W. Pierson. 2015. The effect of dilution and the use of a post extraction nucleic acid purification column on the accuracy, precision, and inhibition of environmental DNA samples. Biological Conservation 183: 70–76.

Milhau, T., A. Valentini, N. Poulet, N. Roset, P. Jean, C. Gaboriaud, and T. Dejean. 2019. Seasonal dynamics of riverine fish communities using eDNA. Journal of Fish Biology 2019: 1–12.

27 Nakagawa, S., P. D. Johnson, and H. Schielzeth. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed effects models revisited and expanded. Journal of the Royal Society Interface 14(134): 20170213

Niemiller, M. L., M. L. Porter, J. Keany, H. Gilbert, D. W. Fong, D. C. Culver, C. S. Hobson, K. D. Kendall, M. A. Davis, and S. J. Taylor. 2018. Evaluation of eDNA for groundwater invertebrate detection and monitoring: A case study with endangered Stygobromus (Amphipoda: Crangonyctidae). Conservation Genetics Resources 10: 247–257.

Omernik, J. M. and G. E. Griffith. 2014. Ecoregions of the conterminous United States: Evolution of a hierarchical spatial framework. Environmental Management 54(6): 1249–1266.

Opel, K. L., D. Chung, and B. R. McCord. 2010. A study of PCR inhibition mechanisms using real time PCR. Journal of Forensic Sciences 55(1): 25–33.

Pilliod, D. S., C. S. Goldberg, R. S. Arkle, and L. P. Waits. 2013. Estimating occupancy and abundance of stream amphibians using environmental DNA from filtered water samples. Canadian Journal of Fisheries and Aquatic Sciences 70: 1123–1130.

R Core Team. 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Rees, H. C., B. C. Maddison, D. J. Middleditch, J. M. Patmore, and K. C. Gough. 2014. The detection of aquatic species using environmental DNA: A review of eDNA as a survey tool in ecology. Journal of Applied Ecology 51: 1450–1459.

Roussel, J. M., J. M. Paillisson, A. Tréguier, and E. Petit. 2015. The downside of eDNA as a survey tool in water bodies. Journal of Applied Ecology 52: 823–826.

Sansom, B. J. and L. M. Sassoubre. 2017. Environmental DNA (eDNA) shedding and decay rates to model freshwater mussel eDNA transport in a river. Environmental Science and Technology 51: 14244–14253.

Schabacker, J. C., S. J. Amish, B. K. Ellis, B. Gardener, D. L. Miller, E. A. Rutledge, A. J. Sepulveda, and G. Luikart. 2020. Increased eDNA detection sensitivity using a novel high-volume water sampling method. Environmental DNA 2: 244–251.

28 Schrader, C., A. Schielke, L. Ellerbroek, and R. Johne. 2012. PCR inhibitors: Occurrence, properties, and removal. Journal of Applied Microbiology 113(5): 1014–1026.

Serb, J. M., J. E. Buhay, and C. Lydeard. 2003. Molecular systematics of the North American freshwater bivalve genus Quadrula (Unionidae: Ambleminae) based on mitochondrial ND1 sequences. Molecular Phylogenetics and Evolution 28(1): 1– 11.

Seymour, M., I. Durance, B. J. Cosby, E. Ransom-Jones, K. Deiner, S. J. Ormerod, J. K. Colbourne, G. Wilgar, G. R. Carvalho, M. de Bruyn, F. Edwards, B. A. Emmett, H. M. Bik, and S. Creer. 2018. Acidity promotes degradation of multi-species environmental DNA in lotic mesocosms. Communications Biology 1: 4.

Shogren, A. J., J. L. Tank, S. P. Egan, D. Bolster, and T. Riis. 2019. Riverine distribution of mussel environmental DNA reflects a balance among density, transport, and removal processes. Freshwater Biology 64: 1467–1479.

Sidstedt, M., L. Jansson, E. Nilsson, L. Noppa, M. Forsman, P. Rådström, and J. Hedman. 2015. Humic substances cause fluorescence inhibition in real-time polymerase chain reaction. Analytical Biochemistry 487: 30–37.

Spear, S., J. Groves, L. Williams, and L. Waits. 2015. Using environmental DNA methods to improve detectability in a hellbender (Crptobranchus alleganiensis) monitoring program. Biological Conservation 183: 38–45.

Stoeckle, B. C., R. Kuehn, and J. Geist. 2016. Environmental DNA as a monitoring tool for the endangered freshwater pearl mussel (Margaritifera margaritifera L.): A substitute for classical monitoring approaches? Aquatic Conservation Marine and Freshwater Ecosystems 26(6): 1120–1129.

Strayer, D. L. 2008. Freshwater mussel ecology: A multifactor approach to distribution and abundance. University of California Press, Oakland, CA.

Strayer, D. L., J. A. Downing, W. R. Haag, T. L. King, J. B. Layzer, T. J. Newton, and J. S. Nichols. 2004. Changing perspectives on pearly mussels, North America’s most imperilled . Bioscience 54(5): 429–439.

Strickler, K. M., A. K. Fremier, and C. S. Goldberg. 2015. Quantifying effects of UV-B, temperature, and pH on eDNA degradation in aquatic microcosms. Biological Conservation 183: 85–92.

29 Tan, K. H., R. A. Leonard, L. E. Asmussen, J. C. Lobartini, and A. R. Gingle. 1990. The geochemistry of blackwater in selected coastal streams of the Southeastern United States. Communications in Soil Science and Plant Analysis 21: 1999–2016.

Thomsen, P. F. and E. Willerslev. 2015. Environmental DNA- An emerging tool in conservation for monitoring past and present biodiversity. Biological Conservation 183: 4–18.

Tsuji, S., M. Ushio, S. Sakurai, T. Minamoto, and H. Yamanaka. 2017. Water temperature-dependent degradation of environmental DNA and its relation to bacterial abundance. PLoS ONE 12(4): e0176608.

United States Census Bureau. 2017. TIGER/Line shapefile dataset. United States Census Bureau: Spatial Data Collection and Products Branch, Washington, DC.

United States Environmental Protection Agency (USEPA). 2013. Level III ecoregions of the United States. United States Environmental Protection Agency: National Health and Environmental Effects Research Laboratory, Corvallis, OR.

United States Fish and Wildlife Service (USFWS). 1996. Carolina Heelsplitter (Lasmigona decorata) recovery plan. United States Fish and Wildlife Service, Atlanta, GA.

United States Fish and Wildlife Service (USFWS). 2002. Endangered and threatened wildlife and plants; designation of critical habitat for the Carolina Heelsplitter; final rule. Federal Register 67: 127.

United States Fish and Wildlife Service (USFWS). 2012. Carolina Heelsplitter (Lasmigona decorata) 5-year review: Summary and evaluation. United States Fish and Wildlife Service, Ashville, NC.

United States Geological Survey (USGS). 2019. National hydrography dataset plus – NHDPlus v2.1. United States Geological Survey, Reston, VA.

Untergasser, A., I. Cutcutache, T. Koressaar, J. Ye, B. C. Faircloth, M. Remm, and S. G. Rozen. 2012. Primer3: New capabilities and interfaces. Nucleic Acids Research 40(15): e115.

Vaughn, C. C. and C. M. Taylor. 1999. Impoundments and the decline of freshwater mussels: A case study of an extinction gradient. Conservation Biology 13: 912– 920.

30 Wachob, A., A. D. Park, and R. Newcome (Eds.). 2009. South Carolina state water assessment, second ed. South Carolina Department of Natural Resources: Land Water, and Conservation Division, Columbia, SC.

Wacker, S., F. Fossøy, B. M. Larsen, H. Brandsegg, R. Sivertsgård, and S. Karlsson. 2019. Downstream transport and seasonal variation in freshwater pearl mussel (Margaritifera margaritifera) eDNA concentration. Environmental DNA 1: 64–73.

Wilcox, T. M., K. S. McKelvey, M. K. Young, S. F. Jane, W. H. Lowe, A. R. Whiteley, and M. K. Schwartz. 2013. Robust detection of rare species using environmental DNA: The importance of primer specificity. PLoS One 8(3): e59520.

Wilcox, T. M., K. J. Carim, M. K. Young, K. S. McKelvey, T. W. Franklin, and M. K. Schwartz. 2018. Comment: The importance of sound methodology in environmental DNA sampling. North American Journal of Fisheries Management 38(3): 592–596.

Williams, J. D., M. L. Warren, K. S. Cummings, J. L. Harris, and R. J. Neves. 1993. Conservation status of freshwater mussels of the United States and Canada. Fisheries 18(9): 6–22.

Williams, K. E., K. P. Huyvaert, and A. J. Piaggio. 2017. Clearing muddied waters: Capture of environmental DNA from turbid waters. PLoS One 12(7): e0179282.

Wilson, I. G. 1997. Inhibition and facilitation of nucleic acid amplification. Applied and Environmental Microbiology 63(10): 3741–3751.

Ye, J., G. Coulouris, I. Zaretskaya, I. Cutcutache, S. Rozan, and T. L. Madden. 2012. Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 13: 134.

31 TABLES

Table 1. Estimated cumulative probability of detection for Lasmigona decorata (probability of detecting the target species in at least one replicate sample) and associated 95% confidence intervals based on sample volume and the number of sample replicates.

Volume (L) 1 Sample 2 Samples 3 Samples

1 0.27 (0.09-0.57) 0.47 (0.17-0.82) 0.61 (0.25-0.92)

2 0.36 (0.19-0.56) 0.59 (0.34-0.80) 0.74 (0.47-0.92)

3 0.46 (0.20-0.73) 0.71 (0.36-0.93) 0.84 (0.49-0.98)

32 Table 2. List of covariates used to model sample inhibition in the Upper Lynches River Sub-basin in North and South Carolina and the observed range of values across sampled sites (n = 99).

Variable Description Type Mean (Range)

Filter Categorical variable for whether filters were combined during Lab Method NA extraction process

Volume Volume of stream water (L) filtered through each filter Lab Method 1.4 (0.5-2.0)

Extract Volume of sample extract used in amplification protocol (either 2.85 Lab Method NA μl or 5.85 μl)

DO Dissolved oxygen (mg/l) at sampling location during time of Water Chemistry 6.9 (0.6-9.0) collection (Scaled as x/10)

Temp Water temperature (°C) at sampling location during time of Water Chemistry 21.4 (17.5-29.8) collection

Conduct Conductivity (μS/cm) at sampling location during time of collection Water Chemistry 8.8(1.0-74.4) (Scaled as x/10)

Turbid Turbidity (NTU) at sampling location during time of collection Water Chemistry 10.5 (0.1-24.1)

pH pH at sampling location during time of collection Water Chemistry 5.9 (2.9-7.3)

Season Categorical variable for collection date in 2019 (grouped into 3 Temporal NA sampling periods: 1 = May 7-11, 2 = May 28-June 8, and 3 = July 1-4)

Order Strahler stream order at the target segment based on the NHD Plus Hydrological 3 (1-5) v2.1 (USGS 2019)

Site Random effect representing the stream site where samples were collected

33 Table 3. Candidate models used to investigate the effects of environmental variables and laboratory methods on the probability of inhibition in stream water samples. All models included a random effect representing site ID.

a b c d Model Structure K AICC ΔAIC Likelihood Wi AUC

pH pH + Site 3 128.34 0.00 1.00 0.58 0.89

Water Chemistry DO + Conduct + pH + Site 5 129.00 0.66 0.72 0.42 0.88

Lab Methods Filter + Volume + Extract + Site 5 183.48 55.14 0.00 0.00

Hydrology Order + Site 3 191.95 63.61 0.00 0.00

Null Site 2 197.64 69.30 0.00 0.00

Turbidity Turbid + Site 3 199.46 71.12 0.00 0.00

Seasonality Season + Temp + Conduct + Site 6 202.28 73.94 0.00 0.00

a Number of estimated parameters in the model b Akaike’s information criterion, adjusted for small sample size c Relative model weight d Area under the receiver operating characteristic (ROC) curve from model cross validation

34 Table 4. Coefficients with 95% confidence intervals and odds ratios for the confidence set models describing the probability of sample inhibition in the Upper Lynches River Sub Basin in North and South Carolina. Parameters with an * indicate effect sizes that did not overlap 0 in 95% confidence intervals.

Model Parameter Estimate Std. Error 95% Confidence Intervals

pH model (Intercept) 55.93 21.47 13.84 98.02

pH * -9.73 3.68 -16.94 -2.51

Water Chemistry model (Intercept) 61.94 21.32 14.29 109.60

pH * -11.91 4.71 -21.14 -2.68

Dissolved oxygen 0.78 0.71 -0.59 2.16

Conductivity 0.23 0.16 -0.10 0.55

35 FIGURES

Figure 1. Map of study area where water samples were collected, consisting of three HUC 10 watersheds within the Lynches River sub-basin of the Pee Dee River Basin. Two water samples were collected from each site.

36

Figure 2. Relationship between pH and the estimated probability of inhibition in environmental DNA samples. Solid line represents mean estimated effects and dashed lines represent 95% confidence intervals of the fitted regression line based on our top-ranked model.

37 CHAPTER TWO: USING ENVIRONMENTAL DNA (eDNA) TO PREDICT HOST FISH DISTRIBUTION FOR AN ENDANGERED FRESHWATER MUSSEL LASMIGONA DECORATA

Benjamin C. Schmidt, Stephen F. Spear, and Catherine M. Bodinof Jachowski

ABSTRACT

The larvae of Unionid freshwater mussels are obligate ectoparasites on host fish for both dispersal and development into free-living juveniles. This host-parasite relationship plays a critical role in determining the distribution, demography, and survival of populations. Understanding spatial patterns of host fish occupancy can help managers identify suitable mussel habitat, evaluate connectivity between populations, and prioritize locations for reintroducing propagated mussels. In this study, we developed a qPCR assay to assess host fish availability for a federally endangered freshwater mussel, the Carolina Heelsplitter (Lasmigona decorata) in the Lynches River

Sub-basin of North and South Carolina. We collected replicate water samples across 100 spatially balanced sites and used single season occupancy modelling to develop predictive models of host fish occupancy as a function of remotely sensed and in-stream environmental variables. Our results indicate that streams within the study area have a

59% (Ψ = 0.59 [0.44-0.73]) chance of being occupied by suitable hosts, although we were unable to find support for in-stream or landscape variables as drivers of host fish occupancy. This study suggests that suitable hosts are relatively widespread within the

Lynches River sub-basin, and that host availability is likely not a limiting factor for

Carolina Heelsplitter recruitment.

38 INTRODUCTION

Freshwater mussels are among the most imperiled aquatic groups in North

America (Strayer et al. 2004) with over 70% of recognized taxa considered endangered, threatened, or of special concern (Haag 2009, Williams et al. 1993). Many species have undergone severe declines in the last century and persist only in small fragmented populations with a low probability of survival (Haag 2012). Within the Southeastern

United States, freshwater mussel declines have been attributed to impoundments and channelization, poor land use practices, water quality degradation, and the introduction of invasive mollusks (Haag 2012, Neves et al. 1997). The relative influence of these factors, however, are poorly understood, and many mussel populations remain vulnerable to multiple enigmatic stressors (Downing et al 2010).

Agencies tasked with the conservation of freshwater mussels must consider several characteristics of their biology, including a unique life cycle in which larvae

(glochidium) are obligate ectoparasites on host fish for both development and dispersal

(Barnhart et al. 2008) (Figure 1). This dependency on host fish plays a critical role in determining the distribution, demography, and survival of populations (Haag 2012,

Modesto et al. 2018, Schwalb et al. 2013, Strayer 2008). Many factors influence the distribution and movement of fish in lotic environments including dispersal ability, physiological tolerance, life history characteristics, and biotic interactions (Angermeier et al. 2002). In addition, the natural processes shaping stream fish distributions have been significantly altered by anthropogenic impacts, including habitat loss and

39 fragmentation (Dudgeon et al. 2006), pollution (Lloyd 1992), climate change

(Xenopoulos et al. 2005), and the introduction of invasive species (Lepriuer et al. 2008).

Models that predict occupancy patterns for fish in relation to environmental features and anthropogenic impacts have been developed for a variety of species (Anderson et al. 2012, Dextrase et al. 2014, Kennard et al. 2006, Oberdorff et al. 2001). These predictive models could be used to efficiently assess host fish availability across broad spatial extents, especially when developed using remotely sensed and GIS-based parameters (Guisan and Zimmermann 2000). Models of host fish distribution could be used to help identify suitable habitat for freshwater mussels, evaluate connectivity between populations, and prioritize locations for reintroducing propagated stock.

The goal of our study was to assess host fish occupancy patterns for the federally endangered Carolina Heelsplitter (Lasmigona decorata [Lea 1852]) within the Upper

Lynches River Sub-basin in North and South Carolina. While the complete historic range of the species is unknown, available records indicate that it was once fairly widely distributed in the Pee Dee, Catawba, Savannah, and Saluda River systems (USFWS 1996).

Presently, the species is restricted to a few isolated streams and the remaining populations are at risk from habitat alterations and deteriorating water quality (USFWS

2012). Understanding habitat requirements and identifying suitable reintroduction sites for propagated mussels are both listed as important conservation goals within the recovery plan (USFWS 1996). Previous research has identified several minnow species

(Cyprinidae) as suitable hosts for L. decorata, and propagation is underway at multiple

40 facilities in North and South Carolina (Eads et al. 2010). Based on results from Eads et al.

(2010) we used occupancy patterns of the Bluehead Chub (Nocomis leptocephalus) as a proxy for host fish availability and developed an environmental DNA (eDNA) assay for detecting the species from stream water samples. Our primary objective was to investigate N. leptocephalus occupancy in relation to in-stream and GIS-based variables in order to develop predictive maps of host fish distribution within the study area.

METHODS

Assay development and validation

In order to identify N. leptocephalus DNA from water samples, we designed a primer set and probe using Primer3Plus (Untergasser et al. 2012) with sequence data from Genbank. The primer pair amplifies a 120 bp region of the cytochrome b mitochondrial gene. The sequence of the forward primer, reverse primer, and probe are as follows:

Forward Primer 5’ TGAACACTGGTAGCAGACATAC 3’

Reverse Primer 5’ GAGAACAAGGAATAGCGCAAAG 3’

Probe 5’ CAL Fluor Red 610 – TCGGCCAAATCGCATCGGTTCTAT – BHQ-2 3’

We assessed specificity (the probability of amplifying non-target DNA) using

Primer-BLAST (Ye et al. 2012) by comparing primers to all available sequence data including both related species and non-target species known to occur in the study area.

41 In addition, we assessed the potential for cross amplification with sympatric taxa using tissue-derived DNA from Semotilus lumbee and Semotilus atromacultaus, using fin clips from individuals collected within the study area. We validated our assay using positive control samples collected from stream sites with known N. leptocephalus populations.

We confirmed occupancy status at each positive control site using data from electrofishing surveys (B. Peoples, personal communication, 2019).

Study area and sampling design

Our study took place in three watersheds within the Lynches River sub-basin, representing the upper Lynches river and its primary tributaries (Figure 2). The study area is a heterogenous landscape and marks the transition between the Piedmont and

Southeastern Plains ecoregions in North and South Carolina (Omernik and Griffith 2014,

USEPA 2013). The Lynches River sub-basin (HUC 10: 03040202) is recognized as a conservation priority in South Carolina and contains 67 species of fish, 10 species of crayfish, and 15 species of freshwater mussel, many of which are considered imperilled

(Elkins et al. 2016). Furthermore, our study area contains the largest known extant L. decorata populations and the most stable of the six critical habitat units designated by the USFWS (USFWS 2002).

Using ArcMap 10.7 (Esri) and the National Hydrology Dataset Plus (NHDPlus v2.1)

(USGS 2019), we identified potential sampling sites as all locations where a navigable roadway intersected a stream, as defined by the National US Primary and Secondary

Roads TIGER/line dataset (US Census Bureau 2017). From the pool of 496 sites, we

42 selected a total of 100 sampling sites using a generalized random tessellation stratified sampling design with the ‘spsurvey’ package (Kincaid et al. 2019) in R (v 3.6.1; R Core

Team 2019). We stratified sites by stream order (five levels: 1st, 2nd, 3rd, 4th, and 5th) to ensure sampling across the full range of stream orders within the study area and established a minimum stream distance of 1 km between sites to reduce spatial autocorrelation. Due to a limited number of fourth and fifth order stream sites within the study area, we sampled all available sites within those orders and distributed the remaining sites across the lower order streams. In total, we sampled 26 first order streams, 32 second order streams, 24 third order streams, 13 fourth order streams, and

4 fifth order streams.

Field sampling

At each site, we collected two consecutive 2 L surface water samples from the stream edge using new polyethylene bottles. To monitor potential contamination between sites, we paired water samples from each site with 1 L negative controls consisting of distilled or tap water poured on site and transported back to the lab alongside the stream water samples. All water samples were transported in individual sealed plastic bags on ice and processed within 12 hours of collection. Following sample collection, we recorded water quality data from the point of collection using a YSI Pro

Plus (Xylem Analytics) and a portable turbidity meter (Sper Scientific). We collected samples between May 7 and July 4 of 2019 during 3 separate sampling periods. We

43 collected samples from 19 sites between May 7–11, 46 sites between May 28–June 8, and 35 sites between July 1–4.

We filtered water samples through 5 μm cellulose nitrate filters (Whatman

International, Ltd.) using an electric vacuum pump assembly. When filters became clogged, we used multiple (up to 4) filters to reach our target volume (2L) for each sample. We stored filters in microcentrifuge tubes containing 95% ethanol at -80 °C until ready for DNA extraction.

Laboratory methods

We extracted DNA from the filters using the DNeasy Blood and Tissue Kit

(Qiagen) with the addition of a Qiashredder spin column (Qiagen) after the initial lysis buffer step (Goldberg et al. 2011). Prior to extraction, we cut filters in half, storing one half in ethanol as a backup and allowing the other half to dry overnight. For samples that required multiple filters, we processed each filter separately until the spin column step, at which point we loaded each filter extract onto the same column and processed the combined extract as a single sample for the remainder of the extraction process.

We used DNA extractions from fin clips of N. leptocephalus collected within the study area to create standards for the qPCR analysis. Each plate included standards at 4 dilution points (10-3, 10-4, 10-5, and 10-6) in addition to an extraction negative and a no template control. We conducted all extractions and PCR setup within lab space dedicated to low-copy DNA extractions at The Wilds Conservation Center in

Cumberland, Ohio.

44 We ran real time qPCR assays in triplicate using an Applied Biosystems 7500 Fast

Real-Time PCR System (Thermo Fisher Scientific). Samples were run in 20 ul reactions consisting of 7.5 μl of Quantitect© Multiplex PCR mastermix (Qiagen), 0.4 uM of the forward and reverse primers, 0.2 uM of probe, 0.6 μl of 10x TaqMan Exogenous Internal

Positive Control Mix (Thermo Fisher Scientific), and 0.3 μl of 50x TaqMan Exogenous

Internal Positive Control DNA (Thermo Fisher Scientific). To investigate the potential effects of sample extract volume on detection, we amplified 59 samples using 2.85 ul of sample extract and 139 samples using 5.85 μl of sample extract. The qPCR cycling protocol consisted of an initial denaturing step at 95°C for 15 minutes, followed by 50 cycles of 94°C for 1 minute (denaturation) and 60°C for 1 minute (annealing and extension).

We used Applied Biosystems Software v2.06 (Life Technologies) to analyze the amplification results. Manual thresholds for amplification were set using the Ct value at which all standards amplify. We considered samples to be positive, and indicative of species presence, if they amplified the IPC and target DNA in at least one of the three

PCR technical replicates. To reduce the risk of false positives due to contamination, we only considered samples to be positive if there was no observed amplification in their associated negative field control, extraction control, or no template control. We considered samples to be inhibited if the internal positive control (IPC) failed to amplify during qPCR and we removed all inhibited samples from further analysis (Hartman et al.

2005).

45 Quantifying environmental covariates

We used geographic information system (GIS) based and in-stream variables to quantify hydrologic, chemical, physiographic, and land cover variables for each site

(Table 1). We considered land use variables that corresponded with three separate spatial scales: the local catchment, the watershed-wide riparian area, and the entire upstream watershed (Figure 3). Local catchments included the contributing area surrounding a stream segment as defined by the NHD Plus v2.1 (USGS 2019) (Figure 3A).

Watershed-wide riparian areas included the collective extent of riparian area surrounding all upstream tributaries to a distance of 50 m from each side of the stream

(figure 3B). Upstream watersheds included all land within the contributing area upstream of the sampling location (Figure 3C).

We used ArcMap 10.7 (ESRI) to delineate catchments and quantify environmental variables at each spatial scale. We quantified land use variables using the

2016 National Land Cover Dataset (NLCD 2018, see Yang et al. 2018) at a spatial resolution of 30 m. We combined deciduous, evergreen, and mixed forest types into a single forest classification and included separate variables for the proportion of forest and the percent canopy cover within catchments. We combined all development classes

(low, medium, high, and open space) into a single development classification and combined the hay/pasture and cultivated crops into a single agricultural classification. In total, we quantified 4 land use classifications: % forest, % developed, % agricultural, and

% canopy cover for each spatial scale. We used the physiography dataset from the EPA

46 (USEPA 2013) to assign values to catchments at each spatial scale based on the proportion of the catchment located within the Piedmont physiographic province. In addition, we used the National Hydrology Dataset v2.1 (USGS 2019) to assign hydrological variables to each stream segment including slope and Strahler stream order. Finally, we used the EPA StreamCAT dataset (Hill et al. 2015) to assign special indices of catchment and watershed integrity to each stream segment.

We assessed collinearity between variables using multivariate correlation analysis based on Pearson’s correlation coefficients using package ’Hmisc’ (Harrell 2020) in R (v 3.6.1; R Core Team 2019). For non-land use variables, we retained one variable from each correlated pair (|r|> 70%) based on comparative range and correlation to the response variable. We removed slope from the set of variables because it was highly correlated with stream order (r = 0.73). Due to significant correlation between forest cover and canopy cover within spatial scales, we separated canopy cover from the other land use classes (agriculture, development, and forest) during model development.

Data analysis

We used single season occupancy models (MacKenzie et al. 2002) to investigate factors associated with N. leptocephalus occurrence while accounting for imperfect detection through the collection of multiple samples from each site. We defined sites as a 1 km stream reach above the sampling locality to account for estimated transport distances of eDNA (Shogren et al. 2017). We defined occupancy as the probability that a site was occupied by at least one individual during the sampling season and detection as

47 the probability of detecting the target species at a site that was occupied in a single sample. To maximize parsimony in the candidate set of models used to investigate patterns of N. leptocephalus occurrence and detection, we used a two-step approach to model fitting and selection (Mackenzie et al. 2006). In step one, we examined support for covariates of detection while holding occupancy constant. In step two, we carried over covariates of detection that were supported in step one, while examining support for factors hypothesized to influence occupancy.

In step one, we considered four detection models, each of which corresponded to an a priori hypothesis concerning factors that may influence N. leptocephalus detection probability. We used a laboratory methods model to represent our hypothesis that the probability of detection would be correlated with the process of combining filters during extraction and the volume of sample extract used during amplification. We hypothesized that a higher extract volume may increase detection due to the potential increase in target DNA during amplification. In contrast, we hypothesized that combining filters may reduce detection by reducing the efficiency of DNA retention during the extraction process. We used a seasonality model to represent our hypothesis that the probability of detection would be lower when water temperatures were high and water levels were reduced (Wachob et al. 2009). We used a water chemistry model to represent our hypothesis that detection is correlated to local water chemistry variables at the time of sampling. We hypothesized that detection would be lower in conditions that accelerate the degradation of DNA, such as high acidity and high

48 turbidity (Seymour et al. 2018, Stoeckle et al. 2017). Finally, we included a null model to represent our hypothesis that detection would be constant throughout the study area regardless of laboratory methods.

In step two, we evaluated support for 10 alternative hypotheses concerning factors driving N. leptocephalus occupancy. We hypothesized positive effects of canopy cover on occupancy and used 3 univariate models, corresponding to each spatial scale, to determine whether one spatial scale might be a better predictor of occupancy than another. We combined the remaining land use variables (agriculture, development, and forest cover) into 3 additive models, corresponding to each spatial scale; and hypothesized positive effects of forest cover, and negative effects of agriculture and development. We used a univariate model with stream order to represent our hypothesis that the probability of occupancy would be higher in larger streams that were less susceptible to intermittent flows (Wachob et al. 2009). We used an additive model with indices of catchment and watershed integrity from the StreamCAT dataset to represent our hypothesis that occupancy probability would be higher at sites with high catchment and watershed integrity. We used a univariate model with physiography to represent our hypothesis that occupancy is correlated to the underlaying geology and topography associated with physiographic ecoregion. We hypothesized that the probability of occupancy would be higher in catchments located primarily within the

Piedmont ecoregion compared to catchments draining lands in the Southeast Plains ecoregion (Page and Burr 2011). Because N. leptocephalus is known as a host generalist

49 species, we also included a null model to represent our hypothesis that occupancy probability would be constant throughout the study area, independent of environmental variables (Albanese et al. 2004).

In both steps, we ranked models using the Akaike Information Criterion adjusted for small sample size (AICc) (Akaike 1974) with package ‘AICmodavg’ (Mazerolle 2019) in

R (v 3.6.1; R Core Team 2019). We considered all models within 2 ΔAIC units from the top ranked model (Burnham and Anderson 2002). We considered parameters from top ranked models to be well supported if the 95% confidence interval for the effect sizes did not overlap 0.

RESULTS

We collected a total of 200 samples from 100 sites. All negative field controls, extraction controls, and no-template controls performed as expected with no indication of target DNA amplification. However, we removed one site from analysis due to inhibition in the negative control. We observed extensive inhibition during qPCR amplification, accounting for 30% of our samples (78 of 198). Within the study area, inhibition affected both samples from 34 sites, and one of two samples from an additional 10 sites. We observed a strong correlation between sample inhibition and physiographic ecoregion. Sites with catchments located primarily in the Southeast Plains ecoregion were much more likely to have inhibited samples than sites draining lands in the Piedmont ecoregion (75% of samples from the SE Plains vs. 15% of samples from the

Piedmont) (Figure 2). After removing inhibited samples, we based our analysis on 65

50 sites with 1-2 samples per site. We detected N. leptocephalus in at least one sample at

32 of 65 sites, indicating a naïve occupancy of 49%.

In step one (detection), the laboratory methods model carried 100% of the model weight (w1 = 1.00) (Table 2). Detection probabilities were approximately 100% regardless of sample extract volume or filter type (p > 0.99). However, the effect sizes for both extract volume and filter type had 95% confidence intervals which overlapped zero, indicating a great deal of uncertainty regarding the effects of laboratory methods on the detection of target DNA. Because of this uncertainty, we do not discuss detection variables further. We carried over all parameters in the laboratory methods model to define detection in step two.

We observed considerable model selection uncertainty in step two of our analysis. The top ranked model included a null model of occupancy (w1 = 0.29), and model weights indicated that it was more than twice as likely to be the best fitting model in our candidate set than the second rank model (Table 2). The null model indicated that any randomly selected 1 km stream reach in our study area had a 59% (Ψ

= 0.59 [0.44-0.73]) chance of being occupied by N. leptocephalus.

Two additional models fell within 2 ΔAIC units of the null model, including the hydrology model (w2 = 0.12) and the physiography model (w3 = 0.11). The hydrology model indicated a positive association between occupancy and stream order (βorder =

0.23, SE = 0.29, p = 0.43) and the physiography model indicated a very weak negative association between occupancy and the Piedmont ecoregion (βphysiography = -0.06, SE =

51 0.09, p = 0.54). However, the effect sizes for both parameters had 95% confidence intervals which overlapped zero (Table 3).

DISCUSSION

Host fish distribution

Collectively, our results indicate both widespread distribution of N. leptocephalus within the study area, and an inability to define occupancy as a function of the remotely sensed or in-stream variables that we measured herein. Previous research has indicated that species distribution models may be less accurate for habitat generalists and common species (Segurado and Araújo 2004, Tsoar et al. 2007). We suspect that the widespread distribution of N. leptocephalus within our study area and their ability to persist in highly degraded streams (Peoples et al. 2011) may have limited our ability to identify strong support for any environmental predictors of occupancy.

There are a number of additional variables that were not included in this study that may be important in determining distributional patterns of N. leptocephalus within the study area. One factor that is known to have considerable effects on the patterns of fish distribution is anthropogenic barriers, such as impoundments and road culverts

(Gardner et al. 2011, Norman et al. 2009). While the locations of large barriers can be identified using available GIS datasets such as the National Anthropogenic Barrier

Dataset (NABD; Ostroff et al. 2013), identifying small barriers is much more difficult.

These small barriers can impede fish passage by creating physical obstacles or changing local conditions such that fish are unable to pass through impacted stream reaches

52 (Warren and Pardew 1998). For habitat generalist and highly tolerant host fish species, barriers may be the most important factor in determining distributional patterns

(Dynesius and Nilsso 1994). In addition to environmental variables, mechanisms such as site fidelity, interspecific and intraspecific competition, predator avoidance, and spawning behaviour may affect the spatial distribution of N. leptocephalus, possibly to a large extent (Planque et al. 2011).

Patterns of inhibition

The removal of inhibited samples from analysis, which accounted for approximately 1/3 of all processed samples, may have influenced our results.

Samples collected from streams in the Southeast Plains ecoregion were much more likely to be inhibited than sites in the Piedmont ecoregion. While our sampling efforts were evenly distributed between the two ecoregions, the removal of inhibited samples resulted in a disproportionate number of samples from the Piedmont ecoregion being used in analysis. Because N. leptocephalus is uncommon below the Atlantic fall line

(Page and Burr 2011), we suspect the actual probability of occupancy in the study area is lower than indicated by our null model. In addition, the large number of inhibited samples from the Southeast Plains ecoregion may have obscured the relationship between N. leptocephalus and physiography.

These results indicate that inhibitory compounds may be heterogeneously distributed in our study area, with higher concentrations occurring in streams with low pH (see chapter 1). We suspect this is due to the presence of dissolved organic matter,

53 such as humic acids and tannins, which can lower the pH of stream water at high concentrations (Hemmond 1994, Tan et al. 1990). The strong correlation between pH and physiography indicates that eDNA sampling may be difficult areas where streams are naturally acidic, such as the Southeast Plains ecoregion. Due to the spatial pattern of inhibition we observed, we recommend that these results be interpreted with caution.

Management Implications

Understanding the distribution of host fish is an important component of L. decorata conservation, as host fish availability likely plays a major role in juvenile recruitment and overall population trends. While we were unable to develop predictive models of host fish distribution within the study area, we did observe N. leptocephalus across a variety of habitats including many sites within the critical habitat designation for L. decorata. These results suggest that host fish availability is likely not a limiting factor in L. decorata persistence and indicates that many sites within the study area may provide suitable habitat. However, no studies have been conducted regarding host fish specificity for L. decorata in the wild, and species deemed suitable in laboratory conditions may play a limited role in natural populations (Levine et al. 2012). While Eads et al. (2010) identified N. leptocephalus as yielding the highest rates of successful metamorphosis for artificial propagation, more research is needed to determine if the species is biologically important under natural conditions.

54 REFERENCES

Anderson, G. B., M. C. Freeman, M. M. Hagler, and B. J. Freeman. 2012. Occupancy modeling and estimation of the holiday darter species complex within the Etowah River System. Transactions of the American Fisheries Society 141(1): 34- 45.

Angermeier, P. L., K. L. Krueger, and C. A. Dolloff. 2002. Discontinuity in stream-fish distributions: Implications for assessing and predicting species occurrences. In: Scott, J. M., P. J. Heglund, F. Samson, J. Haufler, M. Morrison, M. Raphael, and B. Wall (Eds.), Predicting species occurrences: Issues of accuracy and scale. Island Press, Covelo, CA.

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19(6): 716-723.

Albanese, B., P. L. Angermeier, and S. Dorai-Raj. 2004. Ecological correlates of fish movement in a network of Virginia streams. Canadian Journal of Fisheries and Aquatic Sciences 61(6): 857-869.

Barnhart, M. C., Haag, W. R., and W. N. Roston. 2008. Adaptations to host infection and larval parasitism in Unionidae. Journal of the North American Benthological Society 27(2): 370-394.

Burnham, K. and D. Anderson. 2002. Model selection and multimodel inference, Second Ed. Springer Verlag, New York, NY.

Dextrase, A. J., N. E. Mandrak, and J. A. Schaefer. 2014. Modelling occupancy of an imperilled stream fish at multiple scales while accounting for imperfect detection: Implications for conservation. Freshwater Biology 59: 1799-1815.

Downing, J. A., P. Van Meter, and D. A. Woolnough. 2010. Suspects and evidence: A review of the causes of decline and extirpation in freshwater mussels. Animal Biodiversity and Conservation 33: 151-185.

Dudgeon, D., A. Arthington, M. Gessner, Z. Kawabata, D. Knowler, C. Lévêque, and C. Sullivan. 2006. Freshwater biodiversity: Importance, threats, status, and conservation challenges. Biological Reviews 81(2): 163-182.

Dynesius, M. and Nilsso, C. 1994. Fragmentation and flow regulation of river systems in the northern third of the world. Science 266(5186): 753-762.

55 Eads, C. B., R. B. Bringolf, R. D. Greiner, A. E. Bogan, and J. F. Levine. 2010. Fish hosts of the Carolina Heelsplitter (Lasmigona decorata), a federally endangered freshwater mussel (Bivalvia: Unionidae). American Malacological Bulletin 28: 151-158.

Elkins, D. C., S. C. Sweat, K. S. Hill, B. R. Kuhajda, A. L. George, and S. L. Wenger. 2016. The Southeastern aquatic biodiversity conservation strategy. Final Report. University of Georgia Basin Center, Athens, GA.

Gardner, C., S. M. Coghlan, J. Zydlewski, and R. Saunders. 2013. Distribution and abundance of stream fishes in relation to barriers: Implications for monitoring stream recovery after barrier removal. River Research and Applications 29: 65- 78.

Goldberg, C. S., D. S. Pilliod, R. S. Arkle, and L. P. Waits. 2011. Molecular detection of vertebrates in stream water: A demonstration using Rocky Mountain Tailed Frogs and Idaho Giant Salamanders. PLoS One 6: e22746.

Guisan, A. and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135: 147-186.

Haag, W. R. 2009. Past and future patterns of freshwater mussel extinctions in North America during the Holocene. In S. T. Turvey (Ed.), Past and Present Holocene Extinctions. Oxford University Press, London, UK.

Haag, W. R. 2012. North American freshwater mussels: Natural history, ecology, and conservation. Cambridge University Press, Cambridge, UK.

Haag, W. R. and J. D. Williams. 2014. Biodiversity on the brink: An assessment of conservation strategies for North American freshwater mussels. Hydrobiologia 735: 45-60

Harrell, F. E. 2020. Hmisc: Harrell miscellaneous. R Package version 4.4. https: //CRAN.R- project.org/package = Hmisc.

Hartman, L. J., S. R. Coyne, and D. A. Norwood. 2005. Development of a novel internal positive control for Taqman based assays. Molecular and Cellular Probes 19(1): 51-59.

56 Hemmond, H. F. 1994. The role of organic acids in acidification of freshwaters. In C. W. Steinberg and R. F. Wright (Eds.) Acidification of freshwater ecosystems: Implications for the future. John Wiley and Sons Ltd., New York, NY.

Hill, R. A., M. H. Weber, S. G. Liebowitz, A. R. Olsen, and D. J. Thornbrugh. 2015. The stream catchment (StreamCAT) dataset: A database of watershed metrics for the conterminous United States. Journal of the American Water Resources Association 52(1): 120-128.

Jenkins, R. E. and N. M. Burkhead 1993. Freshwater fishes of Virginia. American Fisheries Society, Bethesda, MD.

Kennnard, M. J., J. B. Pusey, A. H. Arthington, B. D. Harch, and S. J. Mackay. 2006. Development and application of a predictive model of freshwater fish composition to evaluate river health in eastern Australia. Hydrobiologia 572: 33- 57.

Kincaid, T. M., A. R. Olsen, and M. H. Weber. 2019. Spsurvey: Spatial survey design and analysis. R Package version 4.1.0

Lea, I. 1852. Descriptions of new species in the family Unionidae. Transactions of the American Philosophical Society 10: 253–294. Leprieur, F., O. Beauchard, S. Blanchet, T. Oberdorff, and S. Brosse. 2008. Fish invasions in the world’s river systems: When natural processes are blurred by human activities. PeerJ 6: e28.

Levine, T. D., B. K. Lang, and D. J. Berg. 2012. Physiological and ecological hosts of Popenaias popeii (Bivalvia: Unionidae): Laboratory studies identify more hosts than field studies. Freshwater Biology 57: 1854-1864.

Lloyd, R. 1992. Pollution and freshwater fish. Fishing News Books Ltd., Oxford, UK.

MacKenzie, D. I., J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle, and C. A. Langtimm. 2002. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83(8): 2248-2255.

Mackenzie, D. I., J. D. Nichols, J. A. Royle, K. H. Pollock, L. L. Bailey, and J. E. Hines. 2006. Occupancy estimation and modeling: Inferring patterns and dynamics of species occurrence. Elsevier, New York, NY.

57 Mazerolle, M. J. 2019. AICmodavg: Model selection and multimodel inference based on (Q)AIC(c). R Package version 2.2.2. https: //CRAN.R-project.org/package = AICmodavg.

Modesto, V., M. Ilarri, A. T. Souza, M. Lopes-Lima, K. Douda, M. Clavero, and R. Sousa. 2018. Fish and mussels: Importance of fish for freshwater mussel conservation. Fish and Fisheries 19(2): 244-259.

National Land Cover Database (NLCD). 2018. NLCD 2016 Land Cover (CONUS) and NLCD 2016 USFS tree canopy cover (CONUS). Multi-Resolution Land Characteristics Consortium, Sioux Falls, SD.

Neves, R. J., A. E. Bogan, J. D. Williams, S. A. Ahlstedt, and P. W. Hartfield. 1997. Status of aquatic mollusks in the southeastern United States: A downward spiral of diversity. In Benz, G. W. and D. E. Collins (Eds.), Aquatic fauna in peril: The southeastern perspective. Southeast Aquatic Research Institute, Decatur, GA.

Norman, J. R., M. M. Hagler, M. J. Freeman, and B. J. Freeman. 2009. Application of a multistate model to estimate culvert effects on movement of small fishes. Transactions of the American Fisheries Society 138: 826-838.

Oberdorff, T., D. Pont, B. Hugueny, and D. Chessel. 2001. A probabilistic model characterizing fish assemblages of French rivers: A framework for environmental assessment. Freshwater Biology 46: 399-415.

Omernik, J. M. and G. E. Griffith. 2014. Ecoregions of the conterminous United States: Evolution of a hierarchical spatial framework. Environmental Management 54(6): 1249-1266.

Ostroff, A., D. Wieferich, A. Cooper, and D. Infante. 2013. 2012 national anthropogenic barrier dataset (NABD). United States Geological Survey: Aquatic GAP Program, Denver, CO.

Page, L. M. and B. M. Burr. 2011. Peterson field guide to freshwater fishes of North America north of Mexico. Houghton Mifflin Harcourt, Boston, MA.

Peoples, B. K., M. B. Tainer, and E. A. Frimpong. 2011. Bluehead chub nesting activity: A potential mechanism of population persistence in degraded stream habitats. Environmental Biology of Fishes 90: 379-391.

58 Planque, B., C. Loots, P. Petitgas, U. Lindstrøm, and S. Vaz. 2011. Understanding what controls the spatial distribution of fish populations using a multi-model approach. Fisheries Oceanography 20: 1-17.

R Core Team. 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Schwalb, A. N., T. J. Morris, N. E. Mandrak, and K. Cottenie. 2013. Distribution of unionid freshwater mussels depends on the distribution of host fishes on a regional scale. Diversity and Distribution 19: 446-454.

Segurado, P. and M. B. Araújo. 2004. An evaluation of methods for modelling species distributions. Journal of Biogeography 31: 1555-1568.

Seymour, M., I. Durance, B. J. Cosby, E. Ransom-Jones, K. Deiner, S. J. Ormerod, J. K. Colbourne, G. Wilgar, G. R. Carvalho, M. de Bruyn, F. Edwards, B. A. Emmett, H. M. Bik, and S. Creer. 2018. Acidity promotes degradation of multi-species environmental DNA in lotic mesocosms. Communications Biology 1: 4.

Shogren, A. J., J. L. Tank, E. Andruszkiewicz, B. Olds, A. R. Mahon, C. L. Jerde, and D. Bolster. 2017. Controls on eDNA movement in streams: Transport, retention, and resuspension. Scientific Reports 7: 5065.

Stoeckle, B. C., S. Beggel, A. F. Cerwenka, E. Motivans, R. Kuehn, and J. Geist. 2017. A systematic approach to evaluate the influence of environmental conditions on eDNA detection success in aquatic ecosystems. PLoS ONE 12(12): e0189119.

Strayer, D. L., J. A. Downing, W. R. Haag, T. L. King, J. B. Layzer, T. J. Newton, and J. S. Nichols. 2004. Changing perspectives on pearly mussels, North America’s most imperiled animals. Bioscience 54(5): 429-439.

Strayer, D. L. 2008. Freshwater mussel ecology: A multifactor approach to distribution and abundance. University of California Press, Berkley, CA.

Tan, K. H., R. A. Leonard, L. E. Asmussen, J. C. Lobartini, and A. R. Gingle. 1990. The geochemistry of blackwater in selected coastal streams of the Southeastern United States. Communications in Soil Science and Plant Analysis 21: 1999–2016.

Tsoar, A., O. Allouche, O. Steinitz, D. Rotem, and R. Kadmon. 2007. A comparative evaluation of presence-only methods for modelling species distribution. Diversity and Distributions 13: 397-405.

59 United States Census Bureau. 2017. TIGER/Line shapefile dataset. United States Census Bureau: Spatial Data Collection and Products Branch, Washington, DC.

United States Environmental Protection Agency (USEPA). 2013. Level III ecoregions of the United States. United States Environmental Protection Agency: National Health and Environmental Effects Research Laboratory, Corvallis, OR.

United States Fish and Wildlife Service (USFWS). 1996. Carolina Heelsplitter (Lasmigona decorata) recovery plan. United States Fish and Wildlife Service, Atlanta, GA.

United States Fish and Wildlife Service (USFWS). 2002. Endangered and threatened wildlife and plants; designation of critical habitat for the Carolina Heelsplitter; final rule. Federal Register 67: 127.

United States Fish and Wildlife Service (USFWS). 2012. Carolina Heelsplitter (Lasmigona decorata) 5-year review: Summary and evaluation. United States Fish and Wildlife Service, Ashville, NC.

United States Geological Survey (USGS). 2019. National hydrography dataset plus – NHDPlus v2.1. United States Geological Survey, Reston, VA.

Untergasser, A., I. Cutcutache, T. Koressaar, J. Ye, B. C. Faircloth, M. Remm, and S. G. Rozen. 2012. Primer3: New capabilities and interfaces. Nucleic Acids Research 40(15): e115.

Warren, M. L. and M. G. Pardew. 1998. Road crossings as barriers to small stream fish movement. Transactions of the American Fisheries Society 127: 637-644.

Wachob, A., A. D. Park, and R. Newcome (Eds.). 2009. South Carolina state water assessment, second ed. South Carolina Department of Natural Resources: Land, Water, and Conservation Division, Columbia, SC.

Williams, J. D., M. L. Warren, K. S. Cummings, J. L. Harris, and R. J. Neves. 1993. Conservation status of freshwater mussels of the United States and Canada. Fisheries 18(9): 6-22.

Xenopoulos, M. A., D. M. Lodge, J. Alcamo, M. Märker, K. Schulze, and D. P. Van Vuuren. 2005. Scenarios of freshwater fish extinctions from climate change and water withdrawal. Global Change Biology 11: 1557-1564.

60 Yang, L., S. Jin, P. Danielson, C. Homer, L. Gass, S. M. Bender, A. Case, C. Costello, J. Dewitz, J. Fry, M. Funk, B. Granneman, G. C. Liknes, M. Rigge, and G. Xian. 2018. A new generation of the United States national land cover database: Requirements, research, priorities, design, and implementation strategies. ISPRS Journal of Photogrammetry and Remote Sensing 146: 108-123.

Ye, J., G. Coulouris, I. Zaretskaya, I. Cutcutache, S. Rozan, and T. L. Madden. 2012. Primer BLAST: A tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 13: 134.

61 TABLES

Table 1. List of covariates used to model occupancy and detection probability for Nocomis leptocephalus in the Upper Lynches River Sub Basin in North and South Carolina and the observed range of values across sampled sites (n = 65). Variable Description Type Scale Mean (Range)

Filter Categorical variable for whether filters were combined during Lab Method NA nsingle = 65,

extraction process ncombined = 55

Extract Volume of sample extract used in amplification protocol (either Lab Method NA n2.85 = 51,

2.85 μl or 5.85 μl) n5.85 = 69

Season Categorical variable for collection date in 2019 (grouped into 3 Temporal NA NA sampling periods: 1 = May 7–11, 2 = May 28–June 11, and 3 = July 1–4)

Stream Order Strahler stream order at the target segment based on the Hydrological Catchment 3 (1–5) National Hydrology Dataset

Index of Index of catchment integrity from the StreamCAT Dataset based Calculated Index Catchment 5.8 (2.6–8.2) Catchment on hydrology, water chemistry, sediment, connectivity, Integrity (ICI) temperature, and habitat provision (Scaled as x/10).

Index of Index of upstream watershed integrity based from the Calculated Index Watershed 5.0 (2.8–7.1) Watershed StreamCAT Dataset based on hydrology, water chemistry, Integrity (IWI) sediment, connectivity, temperature, and habitat provision (Scaled as x/10)

Catchment % canopy cover within local catchment based on the 2016 Land Cover Catchment 5.8 (1.6–8.5) Canopy National Land Cover Dataset (Scaled as x/10)

62 Catchment % agriculture (hay/pasture, cultivated crops) within local Land Cover Catchment 1.3 (0.0–6.2) Agriculture catchment based on the 2016 National Land Cover Dataset (Scaled as x/10)

Catchment % development (open, low, medium, high intensity) within local Land Cover Catchment 0.2 (0.0–4.1) Development catchment based on the 2016 National Land Cover Dataset (Scaled as x/10)

Catchment % forest (mixed, evergreen, deciduous, woody wetlands) within Land Cover Catchment 4.3 (0.0–9.1) Forest local catchment based on the 2016 National Land Cover Dataset (Scaled as x/10)

Riparian % canopy cover within upstream riparian area based on the Land Cover Riparian 7.3 (2.2–9.1) Canopy 2016 National Land Cover Dataset (Scaled as x/10)

Riparian % agriculture (hay/pasture, cultivated crops) within upstream Land Cover Riparian 0.9 (0.0–5.7) Agriculture riparian area based on the 2016 National Land Cover Dataset (Scaled as x/10)

Riparian % development (open, low, medium, high intensity) within Land Cover Riparian 0.1 (0.0–0.5) Development upstream riparian area based on the 2016 National Land Cover Dataset (Scaled as x/10)

Riparian % forest (mixed, evergreen, deciduous, woody wetlands) within Land Cover Riparian 4.3 (0.0–9.1) Forest upstream riparian area based on the 2016 National Land Cover Dataset (Scaled as x/10

Watershed % canopy cover within full upstream watershed based on the Land Cover Watershed 5.4 (1.4–7.3) Canopy 2016 National Land Cover Dataset (Scaled as x/10)

63 Watershed % agriculture (hay/pasture, cultivated crops) within full Land Cover Watershed 2.4 (0.0–7.3) Agriculture upstream watershed based on the 2016 National Land Cover Dataset (Scaled as x/10)

Watershed % development (open, low, medium, high intensity) within full Land Cover Watershed 0.7 (0.0–3.0) Development upstream watershed based on the 2016 National Land Cover Dataset (Scaled as x/10)

Watershed % forest (mixed, evergreen, deciduous, woody wetlands) within Land Cover Watershed 5.6 (0.0–7.9) Forest full upstream watershed based on the 2016 National Land Cover Dataset (Scaled as x/10)

Watershed % Piedmont (L3) within full upstream watershed (Scaled as Physiographical Watershed 8.0 (0.0–10.0) Physiography x/10)

Temperature Water temperature (°C) at sampling location during time of Water Catchment 22.1 (18.0–29.8) collection Chemistry

Dissolved Dissolved oxygen (mg/l) at sampling location during time of Water Catchment 6.5 (0.6–9.0) Oxygen (DO) collection (Scaled as x/10) Chemistry

Conductivity Conductivity (μS/cm) at sampling location during time of Water Catchment 10.7 (3.7–38.1) collection (Scaled as x/10) Chemistry pH pH at sampling location during time of collection Water Catchment 6.5 (5.4–7.3) Chemistry

Turbidity Turbidity (NTU) at sampling location during time of collection Water Catchment 9.7 (0.2–24.1) Chemistry

64 Table 2. Candidate models used to evaluate support for factors influencing detection (step 1) and occupancy (step 2) of Nocomis leptocephalus in the Upper Lynches River sub-basin in North and South Carolina. K = Number of estimated parameters in the model, AICc = Akaike’s information criterion for small sample size, and Wi = Relative model weight.

Model Structure K AICC ΔAIC Likelihood Wi

1. Lab Methods Ψ (.) p (Filter + Extract) 4 118.58 0.00 1.00 1.00

Seasonal Ψ (.) p (Season + Temp + Conduct) 5 130.76 12.18 0.00 0.00

Null Ψ (.) p (.) 2 131.50 12.92 0.00 0.00

Chemistry Ψ (.) p (Temp + DO + Conduct + pH + Turbid) 7 134.90 16.31 0.00 0.00

2. Null Ψ (.) p (Filter + Extract) 4 118.58 0.00 1.00 0.29

Hydrology Ψ (Order) p (Filter + Extract) 5 120.27 1.69 0.43 0.12

Physiography Ψ (Wat.Pd) p (Filter + Extract) 5 120.53 1.94 0.38 0.11

Wat.Land Ψ (Wat.Dev + Wat.Ag + Wat.For) p (Filter + Extract) 7 120.67 2.09 0.35 0.10

Cat.Can Ψ (Cat.Can) p (Filter + Extract) 5 120.88 2.30 0.32 0.09

Wat.Can Ψ (Wat.Can) p (Filter + Extract) 5 120.89 2.31 0.32 0.09

Rip.Can Ψ (Rip.Can) p (Filter + Extract) 5 120.93 2.35 0.31 0.08

Rip.Land Ψ (Rip.Dev + Rip.Ag + Rip.For) p (Filter + Extract) 7 121.78 3.20 0.20 0.06

Indices Ψ (ICI + IWI) p (Filter + Extract) 6 122.26 3.68 0.16 0.05

Cat.Land Ψ (Cat.Dev + Cat.Ag + Cat.For) p (Filter + Extract) 7 125.26 6.68 0.03 0.01

65 Table 3. Coefficients and 95% confidence intervals from the null, hydrology, and physiography models used to investigate detection and occupancy patterns of Nocomis leptocephalus in the Upper Lynches River Sub-basin in North and South Carolina. Super script denotes AICc model ranking.

Parameter Null Model 1 Hydrology Model 2 Physiography Model 3

(Occupancy Intercept) 0.36 (-0.26, 0.98) -0.24 (-1.82, 1.33) 0.83 (-0.85, 2.51)

Order NA 0.23 (-0.34, 0.81) NA

Wat.Pd NA NA -0.06 (-0.24, 0.13)

(Detection Intercept) 18.16 (-74.30, 111.05) 19.90 (-124.88, 164.66) 18.27 (-77.00, 113.55)

Extract (5.85) -3.17 (-19.06, 12.71) -3.47 (-28.22, 21.28) -3.19 (-19.48, 13.10)

Filter (single) 1.62 (-0.15, 3.40) 1.64 (-0.12, 3.40) 1.58 (-0.21, 3.38)

66 FIGURES

Figure 1. Carolina Heelsplitter (Lasmigona decorata) life cycle.

67

Figure 2. Map of study area where water samples were collected, consisting of 3 HUC 10 watersheds within the Lynches River sub-basin of the Pee Dee River Basin.

68

Figure 3. The three spatial extents used to define landcover variables while investigating patterns of occupancy and detection of Nocomis leptocephalus in the Upper Lynches River Sub-basin of North and South Carolina. Extents are defined as the local catchment (3A), the watershed-wide 100 m riparian area (3B), and the entire upstream watershed (3C).

69 CHAPTER THREE: PREDICTING BIOLOGICAL INTEGRITY IN STREAMS: THE IMPORTANCE OF SCALE AND VARIABLE SELECTION WITHIN A MULTI-MODEL FRAMEWORK

Benjamin C. Schmidt, Matthew W. Green, John. C. Morse, and Catherine M. Bodinof Jachowski

ABSTRACT

Characterizing areas of impairment within watersheds is critical to habitat conservation and restoration projects. In recent decades, biologists have taken advantage of the increasing accessibility and accuracy of GIS-based tools to develop predictive models of the physical, chemical, and biological conditions of streams at regional scales. However, the selection of variables and the scales at which they are quantified can have considerable effects on model performance. In this study, we assessed potential habitat impairment in an ecologically important sub-basin in North and South Carolina (Southwestern USA) using a predictive model of macroinvertebrate biological integrity. We collected macroinvertebrates using a standardized multi-habitat protocol developed by the state of South Carolina across 49 spatially balanced sites. We collected in-stream variables at each sampling locality and considered land use and remotely sensed variables at three spatial scales within the study area including the local catchment, upstream riparian area, and upstream watershed. We used multiple linear regression and model ranking criteria to evaluate the relative support for multiple a priori hypotheses regarding drivers of biotic integrity. We found that total upstream riparian land cover was more important for predicting biotic integrity than land cover at the local catchment and watershed scales. We also found that pairing in-stream water

70 chemistry parameters with upstream riparian land cover greatly improved our ability to predict stream biotic integrity. Our model validation results demonstrated good model performance, suggesting that our top-ranked model could be used to predict biological condition of unsampled streams within the study area. Predictive models of biotic integrity that utilize macroinvertebrate data can provide federal and state agencies with powerful tools for prioritizing conservation and restoration efforts across multiple watersheds.

INTRODUCTION

Freshwater ecosystems are among the most threatened in the world and are losing biodiversity much faster than terrestrial and marine ecosystems (Dudgeon et al.

2006, Sala et al. 2000). Anthropogenic activities such as agriculture, deforestation, and urbanization are primary drivers of this decline, and result in the chemical, physical, and hydrological degradation of aquatic systems (Lambin et al. 2001, Sponseller et al. 2001,

Tong and Chen 2002). In the context of widespread stream impairment, watershed assessment and monitoring programs have been established within the United States at the local, state, and federal levels (Buss et al. 2014, USEPA 2018). However, managers are often constrained by a lack of data and limited budgets, making it difficult to objectively determine the most effective management approach when faced with multiple drivers of stream impairment (Munns 2006, Strobl and Robillard 2008). To ensure success of these projects, watershed managers require tools that can help them

71 rapidly identify stream sites or regions for conservation and impaired habitat for restoration projects (Linke et al. 2010).

Agencies tasked with protecting and restoring aquatic habitats commonly rely on macroinvertebrate assemblages to assess levels of impairment and direct restoration efforts (Álvarez-Cabria et al. 2010). As biological indicators, macroinvertebrates reflect both current and historical environmental conditions, including single event disturbances and the cumulative effects of chronic stressors (Li et al. 2010, Rosenberg and Resh 1993). However, the scale of most watershed monitoring programs limits the number of streams reaches that can be sampled. As a result, agencies typically employ spatially balanced probabilistic sampling designs to assess impairment across regional scales (Stein and Bernstein 2008). Recent studies have developed models to predict the biological condition of streams using landscape attributes and other spatially continuous

GIS-based variables (e.g., land use, physiography) (Villeneuve et al. 2015, Waite and

Metre 2017). Such predictive modelling allows for the evaluation of unsampled stream reaches and creates a strong incentive for managers to apply GIS tools to assess, monitor, and ultimately target streams for conservation, restoration, and other management activities. While GIS-based predictive models have the potential to improve the efficiency of stream management activities greatly, they should be considered carefully as inaccuracies in data collection and interpretation can yield misleading evaluations of biological condition.

72 The development of effective predictive models is dependent on the identification of variables that are good predictors of biological condition (Cushman et al. 2008). The physicochemical properties of streams (e.g., specific conductance, pH, dissolve oxygen) and their associated catchments are known to be important determinants of invertebrate diversity and abundance (Heino et al. 2003, Allen 2004).

However, the relative influence of landscape and in-stream parameters on biological condition are often not well understood. Individual variables may have interactive or synergistic effects and biotic response may be complex and context-dependent

(Townsend et al. 2008). In addition, the importance of a given variable may change depending on the scale in which it is quantified (Mykrä et al. 2007). While GIS-based assessment tools allow for broad scale evaluations to be completed using existing spatial data (Kristensen et al. 2012, Villeneuve et al. 2015), models that incorporate both

GIS and in-stream parameters may provide better predictive ability for assessing stream impairment (Gergel et al. 2002).

The goal of our study was to improve our understanding of habitat impairment within the Upper Lynches River sub-basin in North and South Carolina. We chose this study area because of its conservation value for a number of imperiled taxa, including the only federally endangered freshwater mussel in South Carolina, the Carolina

Heelsplitter (Lasmigona decorata [Lea 1852]) (Elkins et al. 2016, USFWS 1993). Our primary objective was to develop and evaluate an empirical model that uses geospatial and/or in-stream parameters to predict biological condition for streams within the study

73 area. Our second objective was to develop a predictive map of biotic integrity within the study area for use as a conservation planning tool.

METHODS

Study area and sampling design

Our study area consisted of three watersheds within the Lynches River sub-basin, representing the upper Lynches river and its primary tributaries (Figure 1). The study area is a heterogenous landscape and marks the transition between the Piedmont and

Southeastern Plains ecoregions in North and South Carolina (Omernik and Griffith 2014,

USEPA 2013). The Lynches River sub-basin (HUC 10: 03040202) is an area of high conservation priority in South Carolina and contains 67 species of fish, 10 species of crayfish, and 15 species of freshwater mussel, many of which are considered imperiled

(Elkins et al. 2016).

We identified potential sampling sites using the local catchment polygons from the National Hydrography Dataset (NHD) Plus version 2.1 (USGS 2019) in ArcMap 10.7

(Esri). Within the dataset, catchment polygons represent an elevation-based catchment area for each individual stream segment in the flowline network. In order to make sampling sites more accessible, we removed catchments that did not bisect roadways as defined by the National US Primary and Secondary Roads TIGER/line dataset (US Census

Bureau 2017).

We selected 49 catchments from the remaining pool of 637 potential catchments using a generalized random tessellation stratified sampling design with the

74 ‘spsurvey’ package (Kincaid et al. 2019) in R (version 3.6.1, R Core Team 2019). We stratified catchments by both percent forest cover (five levels: 0–20, 20–40, 40–60, 60–

80, 80–100) and stream order (five levels: 1st, 2nd, 3rd, 4th, and 5th), resulting in 25 strata, each of which corresponded to a unique combination of stream order and forest cover.

When there were fewer than 2 catchments in a stratum, we oversampled a neighboring stratum (i.e., same stream order but at a higher or lower level of forest cover) (Table 1).

We chose this study design to ensure even sampling across the full range of stream orders and land cover characteristics represented, which was of particular value given our interest in using our final model to develop a predictive layer of biotic integrity throughout our study area. We defined sampling sites as 200 m stream reaches (100 m upstream and downstream of a bridge or central access point) within each selected catchment polygon. We selected sites based on accessibility during our first visit to target catchments. When pre-selected catchments were inaccessible, we replaced them with the closest available overdraw catchments of similar forest cover and stream order.

Field sampling

We sampled macroinvertebrates during March and April of 2018 and March and

May of 2019 using a standardized multi-habitat sampling protocol (MHSP) provided by the South Carolina Department of Health and Environmental Control (SCDHEC; SCDHEC

2017). The MHSP is a semi-quantitative sampling protocol that facilitates the collection of macroinvertebrate taxa from a broad range of meso-habitats; allowing for a more

75 accurate representation of diversity than quantitative protocols which are normally restricted to specific habitat types (Barbour et al. 1999, Marchant et al. 1997).

As per the MHSP, we sampled each site for three hours using 300 μm fine mesh samplers, d-nets, leaf pack samplers, seines (two, 3-minute riffle kicks per site), and timed visual sampling. Each of these sampling protocols was limited to 30 minutes except for visual sampling, which was limited to one hour. We sorted macroinvertebrates in the field from white larval sorting trays into 150 ml storage vials of 80% ethanol and transported them to the lab for identification. We recorded water quality data (water temperature (°C), dissolved oxygen (mg/l), specific conductance

(μS/cm), and pH) from each site prior to collecting samples using a YSI Pro Plus (Xylem

Analytics).

Laboratory methods

In the lab, we enumerated macroinvertebrates using a stereomicroscope

(Olympus SZ61 zoom, 90´ total magnification) and identified individuals to the lowest possible taxonomic unit using national and regional keys (e.g., Merritt et al. 2008; Morse et al. 2017). We slide mounted all chironomids on glass slides under cover slips with

CMC-10 mounting media and identified individuals to the lowest possible taxonomic unit using a compound microscope (Wild Heerbrugg M20, 1,000´ total magnification) and regional keys (Epler 2001).

76 Calculating biotic integrity

We calculated a biotic index score for each site according to the SCDHEC protocol (SCDHEC 2017) to represent the average pollution tolerance for all collected taxa. In order to reduce sampling bias, we quantified relative macroinvertebrate abundance within taxa by establishing 3 abundance categories: rare (rare [1–2 individuals], common [3–9 individuals], and abundant [>10 individuals]) and assigned these categories a value of 1, 3, or 10 respectively (SCDHEC 2017). We used the following formula:

∑(#$ )(& ) Biotic Index = ί ί '

th where TVi is the tolerance value for the i taxon, ni is the relative abundance value for the ith taxon, and N is the sum of all relative abundance values for all taxa. The resulting index scores were based on a scale from 0 to 10, with 10 representing the most impaired stream conditions. Taxa with no assigned tolerance values were excluded from analysis. In addition, we calculated EPT abundance as the total number of distinct

Ephemeroptera, Plecoptera, and Trichoptera taxa collected for each site.

Because macroinvertebrate communities change in relation to physiography

(Hughes and Larson 1988), we weighted both the biotic index and the EPT scores by ecoregion (Piedmont or Southeastern Plains) using the formulas provided in the SCDHEC protocol (2017):

77 EPT Scores:

Piedmont Score = 0.1247 x EPT abundance + 0.568

Southeastern Plains Score = 0.1431 x EPT abundance + 0.568

Biotic Index Scores:

Piedmont Score = -1.3241 x biotic index + 11.264

Southeastern Plains Score = -1.3599 x biotic index + 11.832

For sites within watersheds that overlapped the ecoregion boundary, we used the weighting formula associated with the ecoregion that contained the majority of the watershed upstream from the sampling location. We averaged the weighted biotic index and EPT scores to produce a combined score for each site representing overall biotic integrity. This metric (hereafter, biotic integrity) ranged from 0–6 with the following grades: 0–2 = poor, 2–3 = fair, 3–4 = good-fair, 4–5 = good, 5–6 = excellent.

Quantifying environmental covariates

We used geographic information system datasets and in stream measurements to quantify hydrologic, chemical, physiographic, and land use variables for each site

(Table 2). We considered land use variables that corresponded with 3 separate spatial scales: the local catchment, the watershed-wide riparian area, and the entire upstream watershed (Figure 2). Local catchments included the contributing area surrounding a stream segment as defined by the NHD Plus v2.1 (USGS 2019) (Figure 2A). Watershed- wide riparian areas included the collective extent of riparian area surrounding all

78 upstream tributaries to a distance of 50 m from each side of the stream (figure 2B).

Upstream watersheds included all land within the contributing area upstream of the sampling location (Figure 2C).

We used ArcMap 10.7 (Esri) to delineate catchments and quantify environmental variables at each spatial scale. We quantified land use variables using the 2016 National

Land Cover Dataset (NLCD 2018, see Yang et al. 2018) at a spatial resolution of 30 m. We quantified percent cover for four land use classifications: % forest, % developed, % agricultural, and % canopy cover for each spatial scale considered. We combined deciduous, evergreen, and mixed forest types into a single forest classification and included separate variables for the proportion of forest and the percent canopy cover within catchments, combined all development classes (low, medium, high, and open space) into a single development classification, and combined the hay/pasture and cultivated crops into a single agricultural classification. We used the physiography dataset from the EPA (USEPA 2013) to assign values to catchments at each spatial scale based on the proportion of the catchment located within the Piedmont physiographic province. In addition, we used the National Hydrology Dataset v2.1 (USGS 2019) to assign hydrological variables to each stream segment including slope, Strahler stream order (Strahler 1954), and basal flow index. Finally, we used the EPA StreamCAT dataset

(Hill et al. 2015) to obtain indices of catchment and watershed integrity for each stream segment. Briefly, the indices of integrity are special metrics in the StreamCAT dataset based on relationships between stressor variables and watershed functions in individual

79 NHD Plus v2 catchments (ICI) and contributing upstream watersheds (IWI) (Thornbrugh

2018).

We assessed collinearity between variables using multivariate correlation analysis based on Pearson’s correlation coefficients. For each correlated pair (|r|> 70%) of variables, we retained the variable with the largest range and strongest correlation to our response variable. We removed water temperature from the set of variables because it was highly correlated with dissolved oxygen (r = 0.71). Due to significant correlation between land use class percentages within spatial scales (|r|> 70%), we retained canopy cover as a univariate proxy for land use within each spatial scale (King et al. 2005). Our remaining in-stream and non-land use variables were not significantly correlated with canopy cover at any spatial scale.

Model development and selection

We used linear mixed models to investigate the effects of abiotic and land use variables on macroinvertebrate biotic integrity. We used biotic integrity scores as a continuous response and developed a candidate set of 19 models, each of which corresponded to an a priori hypothesis concerning drivers of biotic integrity. We hypothesized positive effects of canopy cover and used three univariate models, corresponding to each alternative spatial scale (Figure 2), to determine which spatial scale would best predict biotic integrity. We considered the interactive effects of canopy cover and contributing area at each spatial scale to represent our hypothesis that the effects of canopy cover would vary depending on the size of the upstream watershed

80 (Strayer et al. 2003). We also considered the interactive effects of canopy cover and physiography at each spatial scale to represent our hypothesis that the effects of canopy cover would vary depending on the underlaying geology and topography associated with physiographic ecoregion (Utz et al. 2009). Finally, we considered the additive effects of canopy cover and in-stream variables (dissolved oxygen, specific conductivity, and pH) at each spatial scale to represent our hypothesis that biotic integrity is driven by both land use and local conditions. We hypothesized that in-stream parameters may provide additional predictive power by capturing local disturbance or conditions not reflected in remotely sensed data.

In addition to land use models, we used an additive model with only water quality variables to represent our hypothesis that biotic integrity is driven primarily by local in-stream conditions. We used a univariate model with physiography to represent our hypothesis that biotic integrity is correlated with underlaying ecoregion. We used a univariate model with stream order to represent our hypothesis that biotic integrity is correlated to the size of the stream and its location within the watershed. We used an additive model and an interactive model with indices of catchment and watershed integrity from the StreamCAT dataset to represent our hypothesis that biotic integrity is correlated with metrics of overall stream integrity. We used a model with basal flow and slope to represent our hypothesis that biotic integrity is correlated to hydrologic conditions. Finally, we used a null model to represent our hypothesis that biotic integrity did not vary throughout the study area. We included stream ID as a random effect in

81 every model to account for the hierarchical nature of stream systems and similarities between sites located on the same stream.

We fit models using maximum likelihood methods with package ‘lme4’ (Bates et al. 2015) and ranked models using the Akaike Information Criterion adjusted for small sample size (AICc; Akaike 1974) with package ‘AICmodavg’ (Mazerolle 2019) in R

(version 3.6.1, R Core Team 2019). We selected the top-ranked model based on relative model weight in a 95% confidence set (Burnham and Anderson 2002). To understand the variability in our data explained by covariates, we calculated marginal (fixed effects only) and conditional (fixed and random effects) R2 values for the top ranked model using package ‘MuMIn’ (Bartoń 2020, Nakagawa et al. 2017).

Model validation

While AIC model ranking identifies the most supported model within the candidate suite, it does not provide a metric for model performance. To assess performance and predictive ability for the top-ranked model, we used K-fold cross validation (Boyce et al. 2002). We randomly split our data into training and testing sets using an 80:20 ratio and repeated the process 5 times, refitting the top-ranked model for each replicate of training data. We then used the newly fitted models to estimate biotic integrity scores based on the variables in each complementary test data set. We assessed model performance by examining the correlation between the predicted biotic integrity scores and the observed biotic integrity scores in test sets using Pearson’s correlation coefficient. If our model performed well, we expected to see a strong (r >

82 0.5) and statistically significant (α = 0.05) positive correlation between observed and predicted biotic integrity scores.

Predictive mapping

We used the top-ranked model to generate a spatially continuous predictive map of conditional biotic integrity for each uniquely identified NHD Plus v2.1 stream segment that occurred in our study area. We use the term conditional to reflect the fact that our predictions were generated using only GIS-based parameters in our top-ranked model, while assuming that the value of remotely sensed variables were similar throughout each stream segment, regardless of segment length. Because in-stream (i.e., water quality) variables were not available across our entire study area, we generated predictions while holding all in-stream parameters at their mean observed values based on all sampling localities. Thus, the actual estimated biotic integrity scores for any stream segment could be higher or lower than indicated by our predictive layer, depending on in-stream conditions.

RESULTS

We collected a total of 11,742 organisms representing 482 taxa across 49 sites.

Taxon richness ranged from 13 to 72 (x ̄ = 34) while EPT abundance ranged from 1 to 38

(x ̄ = 14). Biotic integrity scores for sampled sites ranged from 0.4 to 5.4 (x ̄ = 3.6). Based on the bioclassification categories established by the EPA Clean Water Act (aquatic life use support; section 305b), 26 streams were classified as fully supporting (not impaired),

83 22 streams as partially supporting (marginally impaired), and 1 stream as not supporting

(severely impaired) (USEPA 2018).

Model ranking

The top-ranked model within the candidate suite included canopy cover within the watershed-wide riparian area and 3 in-stream variables: dissolved oxygen, specific conductance, and pH (Table 3). Model selection uncertainty was very low, with the top- ranked model carrying 95% of the model weight (w1 = 0.95). The top-ranked model explained a total of 73% of the observed variation in biotic integrity scores (conditional

R2 = 0.73), with fixed effects accounting for the majority of variation (marginal R2 = 0.63)

(Nakagawa et al. 2017). The GIS-based parameter in the top-ranked model, riparian canopy cover, was positively correlated with biotic integrity (Table 4). Within the range of riparian canopy cover observed in the study area (range = 0-96%, x ̄ = 69%), the model indicated that a 10% increase in canopy cover yielded a 0.42 increase in predicted biotic integrity (Figure 3A). Dissolved oxygen and pH were also positively correlated with biotic integrity (Figures 3B and 3C). Specific conductance was negatively correlated with biotic integrity; however, the effect size was very small, and the 95% confidence intervals overlapped zero (βConduct = -0.01, SE = 0.02) (Figure 3D).

Comparatively, evidence ratios based on model weights indicated that the top ranked model was 31 times more likely to be the best approximating model in our set than the model containing in-stream variables and canopy cover at the full upstream watershed scale (w2 = 0.03 , ΔAICc = 6.8), and 95 times more likely than the model

84 containing in-stream variables and canopy cover at the local catchment scale (w3 = 0.01,

ΔAICc = 8.4). The model containing riparian canopy cover without the additional in- stream parameters ranked lower than all models containing in-stream variables (w5 =

0.00, ΔAICc = 22.5). We found no evidence to support the interactive effects of canopy cover and watershed size or physiography.

Model validation

In our model validation, predicted biotic integrity scores were strongly correlated with observed biotic integrity scores (r = 0.71, df = 48, p < 0.001) (Figure 4). When using predicted biotic integrity scores to assign sites to discrete levels of impairment according to the EPA bioclassification categories (i.e., impaired, marginally impaired, or severely impaired) the model correctly classified 38 of 49 sites (78%) (USEPA 2018).

After extrapolating model predictions across our entire study area, we found that conditional biotic integrity within the Upper Lynches River sub-basin ranged from 0.3 to

4.3, with an average of 3.2 (n = 1045 stream segments) (Figure 5).

DISCUSSION

The ability to predict biotic integrity at unsampled stream sites can improve efficiency in stream assessment and provide stream ecologists and resources managers with a valuable planning tool for conservation and restoration projects. Despite the complex ecological relationships that exist between abiotic mechanisms and biotic responses, we were able to predict the biotic condition of streams effectively using canopy cover and water chemistry parameters. We found that models containing both

85 remotely sensed and in-stream variables performed better than models based solely on remotely sensed variables. Our results suggest that macroinvertebrate biotic integrity is more strongly associated with canopy cover at the riparian scale than at the catchment or watershed scales, while also being highly influenced by local water chemistry (which was not strongly correlated with canopy cover in our system). Our results are consistent with a number of previous studies linking canopy cover loss with declines in biotic integrity (Maloney et al. 2018, Miller et al. 2019) and highlight the importance of considering spatial scale and in-stream parameters when developing predictive models of biotic integrity.

The loss of canopy cover can lead to habitat impairment through a variety of intermediate pathways (Nadaï-Monoury et a. 2014, Stone and Wallace 1998, Stout et al.

1993). Within our study area, canopy cover is altered by a variety of anthropogenic activities including mining, forestry, agriculture, and urban development (Macie and

Hermansen 2002). At the local catchment scale, a forested canopy provides shading, woody debris, and leaf litter input (Allen et al. 1997), which influences primary productivity and allochthonous nutrient contribution (Noel et al. 1986). At the riparian scale, canopy cover moderates stream temperature (Rishel et al. 1982), provides bank stability through root structure (Abernathy and Rutherford 2000), and serves as a sink to moderate nutrient pollution (Osborne and Kovacic 1993), sedimentation (Waters 1995), and other contamination (Schulz 2004) from entering the stream. At the watershed scale, vegetative cover influences aspects of stream hydrology through rates of

86 evapotranspiration, infiltration, and runoff transport (Allen 2004). Notably, canopy cover was correlated with several other land use variables in our study area, at each spatial scale considered. For example, proportional cover of agricultural land was negatively correlated with canopy cover at the riparian scale (r = -0.89, df = 47, p = <

0.0001). Collinearity among land use classes is a result of the inverse relationships between proportional representation of alternative classes (i.e., all land use classes within a catchment sum to 100%). Significant relationships between a response variable and a single land cover variable may be accompanied by significant relationships in other land use classes (King et al. 2005). Therefore, while canopy cover represented an effective predictor variable of biotic integrity in our study, it is difficult to interpret whether canopy cover or one of its correlates was the ultimate driver of biotic condition without a clear understanding of the correlative structure among land use variables and the intermediary mechanisms between land use and biotic integrity (MacNally 2000).

Similar to previous studies, we found that macroinvertebrate assemblages were highly responsive to water chemistry parameters, including dissolved oxygen and pH

(Berger et al. 2017, Hale et al. 2015). Exposure to low concentrations of dissolved oxygen (DO) leads to a number of lethal and sublethal effects in macroinvertebrates including direct mortality, delayed growth and emergence suppression (Lowell and Culp

1999, Nebeker 1972). Streams with low DO typically exhibit low diversity (Lenat 1988) and are dominated by organisms with high tolerance levels (Fore et al. 1996). While DO concentrations vary spatially and temporally through natural processes (i.e.,

87 atmospheric exchange, temperature and pressure changes, groundwater inflows, and rates of photosynthesis), anthropogenic impacts, such as nutrient enrichment, thermal pollution, and changes to stream morphology and hydrology reduce DO and increase the frequency, duration, and intensity of hypoxic events (Allen 1995, Dodds 2002,

Pearson and Penridge 1987). Similarly, anthropogenic acidification of stream systems results in a loss of species diversity and the extirpation of sensitive taxa through direct mortality and physiologic stress (Courtney and Clements 1998, Guérold et al. 2000).

However, it is important to note that biotic integrity in acidic streams is not necessarily impaired where pH is naturally low, as some taxa are adapted to acidic conditions

(Dangles et al. 2004, Zlatko et al. 2007).

Although conductivity is considered an important variable for determining the composition of macroinvertebrate assemblages (Allen and Castillo 2007), we did not detect a strong relationship in our study area. This finding contrasts similar studies which suggest that specific conductance is a good predictor of macroinvertebrate assemblages (Cormier et al. 2013, Morgan et al. 2007, Pond et al. 2017). Increased conductivity in streams can result from a variety of anthropogenic factors including wastewater discharge, application of road salts, mining, and increased sedimentation

(Cañedo-Argüelles et al. 2013, Pond et al. 2008, Stalter et al. 2013). High levels of conductivity induce physiological stress for aquatic organisms adapted to low molarities, primarily by disrupting osmoregulatory functions (Cañedo-Argüelles et al. 2013). In addition, high conductivity is often correlated with toxin concentrations, which may

88 account for some of the association between conductivity and biological integrity

(Berger et al 2017). The lack of correlation in our study may be due to relatively low variability of specific conductance within our study area (12.2 to 263.0 μS/cm). Previous studies have found a strong correlation between conductivity and the percentage of developed land within upstream watersheds, indicating that the low variation in our study area may be the result of relatively little urban development (Pond et al. 2017,

Wenger et al. 2009).

Comparing land use spatial scales

Understanding the scale at which the surrounding landscape influences stream condition within a specified reach is essential to adapting scale appropriate monitoring strategies and enacting effective conservation and restoration management (Mwaijengo et al. 2020). Our results indicate that local stream conditions are influenced by a relatively large spatial extent, and that land use immediately adjacent to the stream edge plays a larger role in determining biological condition than land use across the entire upstream watershed. Our findings are consistent with previous work that demonstrates the importance of riparian corridors as critical areas of the landscape for influencing stream processes and biotic integrity (Correll 2003). Our results suggest that current restoration strategies involving riparian buffer restoration may be effective for improving biological conditions in target streams, which is important considering that riparian improvements account for the majority of restoration project costs in the

Southeastern United States (Sudduth et al. 2007). Our results have direct implications

89 for stream restoration projects, where biological response to enhancements, such as tree planting and the removal of impervious surfaces, are likely scale dependent.

Our findings complement a number of similar studies in identifying the upstream riparian area as the most important spatial scale in the relationship between land use and biotic integrity (Booth et al. 2004, Jachowski and Hopkins 2018, Tran et al. 2010,

Wang et al. 2001). However, our findings contradict Sponseller et al. (2001) who found the local catchment scale to be more informative, along with others who showed that the effects of land use are most important at the watershed scale (Magierowski et al.

2012, Roth et al. 1996). The inconsistency among these findings may be due to differences in study design, different metrics of biotic integrity, or may indicate that the effects of land use variables and intermediate processes are regional and possibly dependent on underlaying physiography (Cuffney et al. 2011).

The relative importance of in-stream parameters

Models that incorporate both remotely sensed and in-stream measures are likely to have greater predictive power than models containing only one or the other. In our study, the addition of in-stream variables dramatically enhanced the fit of our top- ranked model to observed data. These results are consistent with a number of other studies comparing the relative strength of multimetric models to explain biotic integrity

(Bailey et al. 2007, Lammert and Allen 1999, Macedo et al. 2014). However, the degree to which in-stream parameters enhance the predictive power of models varies considerably (Kristensen et al. 2012), and data collection of in-stream variables

90 significantly increase the costs of a monitoring project (Pond et al. 2017). The influence of in-stream variables on biotic integrity may depend on the degree of watershed disturbance. For example, Wang et al. (2003) suggested that the relative importance of in-stream and local habitat variables on biotic condition decrease as watersheds become more degraded by anthropogenic activities. The value of including in-stream variables in predictive models lies primarily in their ability to predict fine scale variation that is not detected by remotely sensed variables. Because of this, their predictive power may change depending on the spatial scale and resolution of accompanying landscape variables. We suggest that in-stream variables be considered carefully in predictive modelling and incorporated whenever it is cost effective to do so. However, in the context of limited resources, GIS-based models can still provide agencies with additional tools to assess stream impairment and overall watershed condition, especially as remotely sensed data becomes more accurate and accessible.

Management implications

The ability to predict biotic integrity at unsampled sites provides managers and policy makers with an easily understood metric for evaluating impairment within a watershed. Our model validation highlights the potential value of predictive mapping within the Lynches River Sub-basin and elsewhere as a powerful management tool to identify and justify areas of conservation priority. We recommend that the spatially continuous map of conditional GIS-based biotic integrity be used as a preliminary tool for determining areas of probable high habitat quality and areas of probable impairment

91 within the study area, and potentially elsewhere in the Southeastern region. This can be followed by strategic collection of in-stream parameters to further define impairment on a site-by-site basis (e.g., Walters et al. 2009). For example, the map of conditional biotic integrity could be used to identify probable impairment among sites or identify areas of high risk for sensitive or endemic species. As project goals increase in complexity and specificity, in-stream variables could be incorporated into models to improve predictive ability. The in-stream variables in our top ranked model are all capable of being measured using a portable water quality meter, providing an efficient and relatively low-cost method for collecting in-stream variables during a single visit or repeated sampling. For assessments using in-stream variables in the Lynches River Sub- basin, we recommend using our top-ranked model to predict overall biotic integrity:

!" = −4.1 + (0.04 × % ./012/13 413506) + (1.60 × 95:(;/<<5=>?@ AB6:?3 )*/,)

+ (0.21 × 0D) − (0.01 × E0?F/G/F 453@HFI/>/I6 -.//))

This model can be used alongside the maps of conditional biotic integrity to quickly characterize probable impairment in a large number of streams while maximizing financial and labor resources.

92 REFERENCES

Abernathy, B. and I. D. Rutherford. 2000. The effect of riparian tree roots on the mass- stability of riverbanks. Earth Surface Processes and Landforms 25(9): 921–937.

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19(6): 716–723.

Allan, J. D. 1995. Stream ecology: Structure and function of running waters. Academic publishers, London, UK.

Allan, J. D. 2004. Landscapes and Riverscapes: The influence of land use on stream ecosystems. Annual Review of Ecology, Evolution, and Systematics 35: 257–284.

Allan, J. D. and M. M. Castillo. 2007. Stream ecology: Structure and function of running waters. Springer Netherlands, Dordrecht, Netherlands.

Allan, J. D., D. L. Erickson, and J. Fay. 1997. The influence of catchment land use on stream integrity across multiple spatial scales. Freshwater Biology 37: 149–161.

Álvarez-Cabria, M., J. Barquín, and J. A. Juanes. 2010. Spatial and seasonal variability of macroinvertebrate metrics: Do macroinvertebrate communities track river health? Ecological Indicators 10: 370–379.

Bailey, R. C., T. B. Reynoldson, A. G. Yates, J. Bailey, and S. Linke. 2007. Integrating stream bioassessment and landscape ecology as a tool for land use planning. Freshwater Biology 52: 908–917.

Barbour, M. T., J. Gerritsen, B. D. Snyder, and J. B. Stribling. 1999. Rapid bioassessment protocols for use in wadable streams and rivers. Periphyton, benthic macroinvertebrates, and fish. Second edition. EPA 841-B-99-002. United States Environmental Protection Agency: Office of Water, Washington, DC.

Bartoń, K. 2020. MuMIn: Multi-model inference. R package version 1.43.17. https: // CRAN.R project.org/package = MuMIn.

Bates, D., M. Mächler, B. Bolker, and S. Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1): 1–48.

93 Berger, E., P. Haase, M. Kuemmerlen, M. Leps, R. B. Schäfer, and A. Sundermann. 2017. Water quality variables and pollution sources shaping stream macroinvertebrate communities. Science of the Total Environment 587: 1–10.

Booth, D. B., J. R. Karr, S. Schauman, C. P. Konrad, S. A. Morley, M. G. Larson, and S. J. Burger. 2004. Reviving urban streams: Land use, hydrology, biology, and human behavior. Journal of the American Resources Association 40(5): 1351–1364.

Boyce, M. S., P. R. Vernier, S. E. Nielson, and F. K. Schmiegelow. 2002. Evaluating resource selection functions. Ecological Modelling 157: 281–300.

Burnham, K. and D. Anderson. 2002. Model selection and multimodel inference. Second Ed. Springer-Verlag, New York, NY.

Buss, D. F., D. M. Carlisle, T. Chon, J. Culp, J. S. Harding, H. E. Keizer-Vlek, W. A. Robinson, S. Strachan, C. Thirion, and R. M. Hughes. 2014. Stream biomonitoring using macroinvertebrates around the globe: A comparison of large-scale programs. Environmental Monitoring and Assessment 187: 4132.

Cañedo-Argüelles, M., B. J. Kefford, C. Piscart, N. Prat, R. B. Schäfer, and C. J. Schulz. Salinisation of rivers: An urgent ecological issue. Environmental Pollution 173: 157–167.

Cormier, S. M., G. W. Suter, L. Zheng, an G. J. Pond. 2013. Assessing causation of the extirpation of stream macroinvertebrates by a mixture of ions. Environmental Toxicology and Chemistry 32: 277–287.

Correll, D. 2003. Vegetated stream riparian zones: Their effects on stream nutrients, sediments, and toxic substances. An annotated and index bibliography of the world literature, including buffer strips and interactions with hyporheic zones and floodplains, thirteenth ed. Crystal River, FL.

Courtney, L.A. and W. H. Clements. 1998. Effects of acidic pH on benthic macroinvertebrate communities in stream microcosms. Hydrobiologia 379: 135– 145.

Cuffney, T. F., R. Kashuba, S. S. Qian, I. Alameddine, Y. K. Cha, B. Lee, J. F. Coles, and G. McMahon. Multilevel regression models describing regional patterns of invertebrate and algal responses to urbanization across the USS. Journal of the North American Benthological Society 30(3): 797–819.

94 Cushman, S. A., K. McGarigal, and M. C. Neel. 2008. Parsimony in landscape metrics: Strength, universality, and consistency. Ecological Indicators 8(5): 691–703.

Dangles, O., B. Malmqvist, and H. Laudon. 2004. Naturally acid freshwater ecosystems are diverse and functional: Evidence from boreal streams. Oikos 104: 149–155.

Dodds, W. K. 2002. Freshwater ecology: Concepts and environmental applications. Academic Press, San Diego, CA.

Dudgeon, D., A. Arthington, M. Gessner, Z. Kawabata, D. Knowler, C. Lévêque, and C. Sullivan. 2006. Freshwater biodiversity: Importance, threats, status, and conservation challenges. Biological Reviews 81(2): 163–182.

Elkins, D. C., S. C. Sweat, K. S. Hill, B. R. Kuhajda, A. L. George, and S. L. Wenger. 2016. The Southeastern aquatic biodiversity conservation strategy. Final Report. University of Georgia River Basin Center, Athens, GA.

Epler, J. H. 2001. Identification manual for the larval Chironomidae (Diptera) of North and South Carolina: A guide to the of the midges of the southeastern United States, including Florida. Special Publication SJ2001-SP13. North Carolina Department of Environment and Natural Resources, Raleigh, NC.

Fore, L. S., J. R. Karr, and R. W. Wisseman. 1996. Assessing invertebrate response to human activities: Evaluating alternative approaches. Journal of the North American Benthological Society 15: 212–231.

Gergel, S. E., M. G. Turner, J. R. Miller, J. M. Melack, and E. H. Stanley. 2002. Landscape indicators of human impacts to riverine systems. Aquatic Science 64: 118–128.

Guérold, F., J. P. Boudot, G. Jacquemin, D. Vein, D. Merlet, and J. Rouiller. 2000. Macroinvertebrate community loss as a result of headwater stream acidification in the Vosges Mountains (NE France). Biodiversity and Conservation 9: 767–783.

Hale, A. N., G. Noble, K. Piper, K. Garmire, and S. J. Tonsor. 2016. Controlling for hydrologic connectivity to assess the importance of catchment and reach scale factors on macroinvertebrate community structure. Hydrobiologia 763: 285–299.

Heino, J., T. Muotka, and R. Paavola. 2003. Determinants of macroinvertebrate diversity in headwater streams: Regional and local influences. Journal of Animal Ecology 72(3): 425–434.

95 Hill, R., M. Weber, S. Leibowitz, T. Olsen, and D. Thornbrugh. 2015. The stream- catchment (StreamCat) dataset: A database of watershed metrics for the conterminous USA. Journal of the American Water Resources Association 52(1): 120–128.

Hughes, R. M., and D. P. Larsen. 1988. Ecoregions: An approach to surface water protection. Journal of the Water Pollution Control Federation 60: 486–493.

Jachowski, C. M. B. and W. A. Hopkins. 2018. Loss of catchment-wide riparian forest cover is associated with reduced recruitment in a long-lived amphibian. Biological Conservation 220: 215–227.

Kincaid, T. M., A. R. Olsen, and M. H. Weber. 2019. Spsurvey: Spatial survey design and analysis. R Package version 4.1.0

King, R. S., M. E. Baker, D. F. Whigham, D. E. Weller, T. E. Jordan, P. F. Kazyak, and M. K. Hurd. 2005. Spatial considerations for linking watershed land cover to ecological indicators in streams. Ecological Applications 15: 137–153.

Kristensen, E. A., A. Baattrup-Pedersen, and H. E. Anderson. 2012. Prediction of stream fish assemblages from land use characteristics: Implications for cost-effective design of monitoring programs. Environmental Monitoring Assessment 184: 1435–1448

Lammert, M. and J. D. Allan. 1999. Assessing biotic integrity of streams: Effects of scale in measuring the influence of land use/cover and habitat structure on fish and macroinvertebrates. Environmental Management 23: 257–270.

Lambin, E. F., B. L. Turner, H. J. Geist, S. B. Agbola, A. Angelsen, J. W. Bruce, O. T. Coomes, R. Dirzo, G. Fischer, C. Folke, P. S. George, K. Homewood, J. Imbernon, R. Leemans, X. Li, E. F. Moran, M. Mortimore, P. S. Ramakrishnan, J. F. Richards, H. Skanes, W. Steffen, G. D. Stone, U. Svedin, T. A. Veldkamp, C. Vogel, and J. Xu. 2001. The cause of land-use and land-cover change: Moving beyond the myths. Global Environmental Change 11: 261–269.

Lea, I. 1852. Descriptions of new species in the family Unionidae. Transactions of the American Philosophical Society 10: 253–294.

Lenat, D. R. 1988. Water quality assessment of streams using a qualitative collection method for benthic macroinvertebrates. Journal of the North American Benthological Society 7: 222–233.

96 Li, L., B. Zheng, and L. Liu. 2010. Biomonitoring and bioindicators used for river ecosystems: Definitions, approaches, and trends. Procedia Environmental Sciences 2: 1510–1524.

Linke, S., E. Turak, and J. Nel. 2010. Freshwater conservation planning: The case for systematic approaches. Freshwater Biology 56(1): 6–20.

Lowell, R. B. and J. M. Culp. 1999. Cumulative effects of multiple effluent and low dissolved oxygen stressors on mayflies at cold temperatures. Canadian Journal of Fisheries and Aquatic Sciences 56: 1624–1630.

MacNally, R. 2000. Regression and model-building in conservation biology, biogeography, and ecology: The distinction between-and reconciliation of- “predictive” and “explanatory” models. Biodiversity and Conservation 9: 655– 671.

Macedo, D. R., R. M. Hughes, R. Ligeiro, W. R. Ferreira, M. A. Castro, N. T. Junqueira, D. R. Oliveira, K. R. Firmiano, P. R. Kaufmann, P. S. Pompeu, and M. Castillo. 2014. The relative influence of catchment and site variables on fish and macroinvertebrate richness in cerrado biome streams. Landscape Ecology 29: 1001–1016.

Macie, E. A. and L. A. Hermansen. 2002. Human influences of forest ecosystems: The southern wildland-urban interface assessment. United States Forest Service: Southern Research Station, Ashville, NC.

Magierowski, R. H., P. E. Davies, S. M. Read, and N. Horrigan. 2012. Impacts of land use on the structure of river macroinvertebrate communities across Tasmania, Australia: Spatial scales and thresholds. Marine and Freshwater Research 63: 762–776.

Maloney, K. O., Z. M. Smith, C. Buchanan, A. Nagel, and J. A. Young. 2018. Predicting biological conditions for small headwater streams in the Chesapeake Bay watershed. Freshwater Science 27(4): 795–809.

Marchant, R., A. Hirst, R. H. Norris, R. Butcher, L. Metzeling, and D. Tiller. 1997. Classification and prediction of macroinvertebrate assemblages from running waters in Victoria, Australia. Journal of the North American Benthological Society 16: 664–681.

97 Mazerolle, M. J. 2019. AICmodavg: Model selection and multimodel inference based on (Q)AIC(c). R Package version 2.2.2. https: //CRAN.R-project.org/package = AICmodavg.

Merritt, R. W., K. W. Cummins, and M. B. Berg. (eds.). 2019. An introduction to the aquatic insects of North America, fifth Ed. Kendall Hunt Publishing Company, Dubuque, IA.

Miller, J. W., M. J. Paul, and D. R. Obenour. 2019. Assessing potential anthropogenic drivers of ecological health in piedmont streams through hierarchical modeling. Freshwater Science 38(4): 771–789.

Morgan, R. P. II., K. M. Kline, and S. F. Cushman. 2007. Relationships among nutrients, chloride and biological indices in urban Maryland streams. Urban Ecosystems 10: 153–166.

Morse, J. C., W. P. McCafferty, B. P. Stark, and L. M. Jacobus. 2017. Larvae of the southeastern USA mayfly, stonefly, and caddisfly species. Clemson University: Public Service Publishing ,Clemson, SC.

Munns, W. 2006. Assessing risks to wildlife populations from multiple stressors: Overview of the problem and research needs. Ecology and Society 11(1): 23.

Mwaijengo, G. N., A. Msigwa, K. N. Njau, L. Brendonck, and B. Vanschoenwinkel. 2020. Where does land use matter most? Contrasting land use effects on river quality at different spatial scales. Science of the Total Environment 715: 134825.

Mykrä, H., J. Heino, and T. Muotka. 2007. Scale-related patterns in the spatial and environmental components of stream macroinvertebrate assemblage variation. Global Ecology and Biogeography 16(2): 149–159.

Nadaï-Monoury, E., F. Gilbert, and A. Lecerf. 2014. Forest canopy cover determines invertebrate diversity and ecosystem process rates in depositional zones of headwater streams. Freshwater Biology 59(7): 1532–1545.

Nakagawa, S., P. D. Johnson, and H. Schielzeth. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed effects models revisited and expanded. Journal of the Royal Society Interface 14(134): 20170213

98 National Land Cover Database (NLCD). 2018. NLCD 2016 Land Cover (CONUS) and NLCD 2016 USFS tree canopy cover (CONUS). Multi-Resolution Land Characteristics Consortium, Sioux Falls, SD.

Nebeker, A. V. 1972. Effect of low oxygen concentration on survival and emergence of aquatic insects. Transactions of the American Fisheries Society 4: 675–679.

Noel, D. S., C. W. Martin, and C. A. Federer. 1986. Effects of forest clearcutting in New England on stream macroinvertebrates and periphyton. Environmental Management 10(5): 661–670.

Omernik, J. M. and G. E. Griffith. 2014. Ecoregions of the conterminous United States: Evolution of a hierarchical spatial framework. Environmental Management 54(6): 1249–1266.

Osborne, L. L. and D. A. Kovacic. 1993. Riparian vegetated buffer strips in water-quality restoration and stream management. Freshwater Biology 29(2): 243–258.

Pearson, R. G. and L. K. Penridge. 1987. The effects of pollution by organic sugar mill effluent on the macro-invertebrates of a stream in tropical Queensland, Australia. Journal of Environmental Management 24: 205–215.

Pond, G. J., M. E. Passmore, F. A. Borsuk, L. Reynolds, and C. J. Rose. 2008. Downstream effects of mountaintop coal mining: Comparing biological conditions using family-and genus-level macroinvertebrate bioassessment tools. Journal of the North American Benthological Society 27: 717–737.

Pond, G. J., K. G. Krock, J. V. Cruz, and L. F. Ettema. 2017. Effort-based predictors of headwater stream conditions: Comparing the proximity of land use pressures and instream stressors on macroinvertebrate assemblages. Aquatic Sciences 79: 765–781.

R Core Team. 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Rishel, G. B., J. A. Lynch, and E. S. Corbett. 1982. Seasonal stream temperature changes following forest harvesting. Journal of Environmental Quality 11(1): 112–116.

Rosenberg, D. M. and V. H. Resh (Eds.). 1993. Freshwater biomonitoring and benthic macroinvertebrates. Chapman and Hall Inc, New York, NY.

99 Roth, N. E., J. D. Allen, and D. L. Erikson. 1996. Landscape influences on stream biotic integrity assessed at multiple spatial scales. Landscape Ecology 11: 141–156.

Schulz. R. 2004. Field studies on exposure, effects, and risk mitigation of aquatic nonpoint source insecticide pollution: A review. Journal of Environmental Quality 33: 419–448.

Sponseller, R. A., E. F. Benfield, and H. M. Valett. 2001. Relationships between land use, spatial scale, and stream macroinvertebrate communities. Freshwater Biology 46(10): 1409–1424.

South Carolina Department of Health and Environmental Control (SCDHEC). 2017. Standard operating and quality control procedures for macroinvertebrate sampling. Technical report number 0914-17. South Carolina Department of Health and Environmental Control: Bureau of Water, Columbia, SC.

Stalter, D., A. Magdeburg, K. Quednow, A. Botzat, and J. Oehlmann. 2013. Do contaminants originating from state-of-the-art treated wastewater impact the ecological quality of surface waters? PLoS One 8: e60616.

Stein, E. D. and B. Bernstein. 2008. Integrating probabilistic and targeted compliance monitoring for comprehensive freshwater assessment. Environmental Monitoring and Assessment 144: 117–129.

Stone, M. K. and J. B. Wallace. 1998. Long-term recovery of a mountain stream from clear-cut logging: The effects of forest succession on benthic invertebrate community structure. Freshwater Biology 39: 151–169.

Stout, B. M., E. F. Benfield, and J. R. Webster. 1993. Effects of a forest disturbance on shredder production in southern Appalachian headwater streams. Freshwater Biology 29: 59–69.

Strahler, A. N. 1957. Quantitative analysis of watershed geomorphology. Transactions of the American Geophysical Union 38(6): 913–920.

Strayer, D. L., R. E. Beighley, L. C. Thompson, S. Brooks, C. Nilsson, G. Pinay, and R. J. Naiman. 2003. Effects of land cover on stream ecosystems: Roles of empirical models and scaling issues. Ecosystems 6: 407–423.

100 Strobl, R. O. and P. D. Robillard. 2008. Network design for water quality monitoring of surface freshwaters: A review. Journal of Environmental Management 87(4): 639–648.

Sudduth E. B., J. L. Meyer, and E. S. Bernhardt. 2007. Stream restoration practices in the Southeast United States. Restoration Ecology 15(3): 573–583.

Thornbrugh, D. J., S. G. Leibowitz, R. A. Hill, M. H. Weber, Z. C. Johnson, A. R. Olsen, J. E. Flotemersch, J. L. Stoddard, and D. V. Peck. 2018. Mapping watershed integrity for the conterminous United States. Ecological Indicators 85: 1133–1148.

Tong, S. Y. and W. Chen. 2002. Modelling the relationship between land use and surface water quality. Journal of Environmental Management 66(4): 377–393.

Townsend, C. R., S. S. Uhlmann, and C. D. Matthaei. 2008. Individual and Combined responses of stream ecosystems to multiple stressors. Journal of Applied Ecology 45(6): 1810–1819.

Tran, C. P., R. W. Bode, A. J. Smith, and G. S. Kleppel. 2010. Land use proximity as a basis for assessing stream water quality in New York State (USA). Ecological Indicators 10: 727–733.

United States Census Bureau. 2017. TIGER/Line shapefile dataset. United States Census Bureau: Spatial Data Collection and Products Branch, Washington, DC.

United States Environmental Protection Agency (USEPA). 2013. Level III ecoregions of the United States. United States Environmental Protection Agency: National Health and Environmental Effects Research Laboratory, Corvallis, OR.

United States Environmental Protection Agency (USEPA). 2018. Guidance for 2018 assessment, listing, and reporting requirements pursuant to sections 303(d), 305(b), and 314 of the Clean Water Act. United States Environmental Protection Agency: Office of Water, Washington, DC.

United States Fish and Wildlife Service (USFWS). 1993. Endangered and threatened wildlife and plants; determination of endangered status for the Carolina Heelsplitter (Lasmigona decorata). Federal Register 58: 34926–34932.

United States Geological Survey (USGS). 2019. National hydrography dataset plus – NHDPlus v2.1. United States Geological Survey, Reston, VA.

101 Utz, R. M., R. H. Hilderbrand, and D. M. Boward. 2009. Identifying regional differences in threshold responses of aquatic invertebrates to land cover gradients. Ecological Indicators 9(3): 556–567

Villeneuve, B., Y. Souchon, P. Usseglio-Polatera, M. Ferréol, and L. Valette. 2015. Can we predict biological condition of stream ecosystems? A multi-stressors approach linking three biological indices to physio-chemistry, hydromorphology and land use. Ecological Indices 48: 88–98.

Waite, I. R. and P. C. Van Metre. 2017. Multistressor predictive models of invertebrate condition in the Corn Belt, USA. Freshwater Science 36: 901–914.

Walters, D. M., A. H. Roy, and D. S. Leigh. 2009. Environmental indicators of macroinvertebrate and fish assemblage integrity in urbanizing watersheds. Ecological Indicators 9(6): 1222–1233.

Wang, L., J. Lyons, and P. Kanehl. 2001. Impacts of urbanization on stream habitat and fish across multiple spatial scales. Environmental Management 28(2): 255–266.

Wang, L., J. Lyons, P. Rasmussen, P. Seelbach, T. Simon, M. Wiley, P. Kanehl, E. Baker, S. Niemela, and P. Stewart. 2003. Watershed, reach, and riparian influences on stream fish assemblages in the northern lakes and forest ecoregion, U.S.A. Canadian Journal of Fisheries and Aquatic Sciences 60(5): 491–505.

Waters, T. F. (Ed). 1995. Sediment in Streams. American Fisheries Society, Bethesda, MD.

Wenger, S. J., A. H. Roy, C. R. Jackson, E. S. Bernhardt, T. L. Carter, S. Filoso, C. A. Gibson, W. C. Hession, S. S. Kaushal, E. Martí, J. L. Meyer, M. A. Palmer, M. J. Paul, A. H. Purcell, A. Ramírez, A. D. Rosemond, K. A. Schofield, E. B. Sudduth, and C. J. Walsh. 2009. Twenty-six key research questions in urban stream ecology: An assessment of the state of the science. Journal of the North American Benthological Society 28(4): 1080–1098.

Yang, L., S. Jin, P. Danielson, C. Homer, L. Gass, S. M. Bender, A. Case, C. Costello, J. Dewitz, J. Fry, M. Funk, B. Granneman, G. C. Liknes, M. Rigge, and G. Xian. 2018. A new generation of the United States national land cover database: Requirements, research, priorities, design, and implementation strategies. ISPRS Journal of Photogrammetry and Remote Sensing 146: 108–123.

102 Zlatko, P., H. Laudon, and B. Malmqvist. 2007. Does freshwater macroinvertebrate diversity along a pH-gradient reflect adaptation to low pH? Freshwater Biology 52: 2172–2183.

103 TABLES

Table 1. Stratification of sampling locations based on stream order and percent forest cover within local catchments. Strata contain one to three sampling locations depending on availability during selection and access during the sampling time frame. No 5th order stream sections in catchments containing 0-20% forest cover were available for sampling within our study area.

Forest Bin Stream Order

1st 2nd 3rd 4th 5th Total

0-20% 1 2 3 1 NA 7

20-40% 2 3 3 2 2 12

40-60% 2 3 2 3 2 12

60-80% 2 3 2 3 1 11

80-100% 1 2 2 1 1 7

Total 8 13 12 10 6 49

104 Table 2. List of covariates used to model macroinvertebrate biotic integrity within the Upper Lynches River Sub Basin in North and South Carolina, and the observed range of values across sampled sites (n = 49).

Variable Description Type Scale Mean (Range) Source

Slope Slope of target segment (m/m) Hydrological Catchment 4.3 (0.0-15.3) National Hydrology based on smoothed elevations Dataset (USGS (Scaled by 1000) 2019)

Stream Order Stream order at the target segment Hydrological Catchment 3 (1-5) National Hydrology Dataset (USGS 2019)

Basal Flow Index % of basal flow to total flow (% Hydrological Watershed 38.6 (28.0-45.7) StreamCAT Dataset (BFI) attributed to ground water (Hill et al. 2015) discharge) for upstream watershed (Scaled by 10)

Index of Index of catchment integrity based Calculated Index Catchment 6.1 (3.0-8.9) StreamCAT Dataset Catchment on hydrology, water chemistry, (Hill et al. 2015) Integrity (ICI) sediment, connectivity, temperature, and habitat provision (Scaled by 10)

Index of Index of upstream watershed Calculated Index Watershed 5.5 (2.8-7.4) StreamCAT Dataset Watershed integrity based on hydrology, water (Hill et al. 2015) Integrity (IWI) chemistry, sediment, connectivity, temperature, and habitat provision

Catchment % canopy cover within local Land Cover Catchment 5.6 (2.3-8.5) 2016 National Land Canopy catchment (Scaled by 10) Cover Dataset (Yang et al. 2018)

105 Riparian Canopy % canopy cover within upstream Land Cover Riparian 7.2 (2.2-8.4) 2016 National Land riparian area (Scaled by 10) Cover Dataset (Yang et al. 2018)

Watershed % canopy cover within full upstream Land Cover Watershed 5.1 (1.4-7.2) 2016 National Land Canopy watershed (Scaled by 10) Cover Dataset (Yang et al. 2018)

Watershed Area Upstream watershed catchment Topological Watershed 120.1 (1.3-998.8) StreamCAT Dataset area in Km2 (Hill et al. 2015)

Watershed % Piedmont (L3) within full upstream Physiographical Watershed 6.0 (0.0-10.0) U.S. Environmental Physiography watershed (Scaled by 10) Protection Agency USEPA 2013)

Stream ID Random effect variable with 24 Hydrological Watershed NA National Hydrology levels representing the stream in Dataset (USGS which the sampling site was located 2019)

Temperature Water temperature at sampling site Water Catchment 18.9 (10.4-27.7) In Stream during time of collection (°C) Chemistry

Dissolved Dissolved oxygen at sampling site Water Catchment 8.9 (2.8-14.6) In Stream Oxygen (DO) during time of collection (mg/l) Chemistry pH pH at sampling site during time of Water Catchment 6.2 (3.9-8.5) In Stream collection Chemistry

Conductance Specific conductance at sampling site Water Catchment 76.0 (12.2-263.0) In Stream during time of collection (μS/m) Chemistry

106 Table 3. Ranking of candidate models used to investigate associations between land use and macroinvertebrate biotic integrity. All models included a random effect representing stream ID. BFI = basal flow index, Cat.Can = catchment canopy, Conduct = specific conductance, ICI = index of catchment integrity, IWI = index of watershed integrity, Rip.Can = riparian canopy, Wat.Area = watershed area, Wat.Can = watershed canopy, Wat.Pd = watershed physiography.

a b c Model Structure K AICc ΔAICc Likelihood Wi

Riparian Canopy + Stream Rip.Can + DO + Conduct + pH + (Stream ID) 6 104.63 0.00 1.00 0.95

Watershed Canopy + Stream Wat.Can + DO + Conduct + pH + (Stream ID) 6 111.45 6.82 0.03 0.03

Catchment Canopy + Stream Cat.Can + DO + Conduct + pH + (Stream ID) 6 113.05 8.42 0.01 0.01

Stream DO + Conduct + pH + (Stream ID) 5 119.20 14.57 0.00 0.01

Riparian Canopy Rip.Can + (Stream ID) 3 127.11 22.48 0.00 0.00

Watershed Canopy Wat.Can + (Stream ID) 3 129.14 24.51 0.00 0.00

Riparian Canopy x Area Rip.Can x Wat.Area + (Stream ID) 5 130.13 25.50 0.00 0.00

Watershed Canopy x Physiography Wat.Can x Wat.Pd + (Stream ID) 5 130.74 26.11 0.00 0.00

Watershed Canopy x Area Wat.Can x Wat.Area + (Stream ID) 5 131.06 26.43 0.00 0.00

Riparian Canopy x Physiography Rip.Can x Wat.Pd + (Stream ID) 5 132.13 27.50 0.00 0.00

Stream Order Order + (Stream ID) 3 133.27 28.64 0.00 0.00

Catchment Canopy x Area Cat.Can x Wat.Area + (Stream ID) 5 133.42 28.79 0.00 0.00

Additive indices ICI + IWI + (Stream ID) 4 134.63 30.00 0.00 0.00

Catchment Canopy Cat.Can + (Stream ID) 3 135.12 30.49 0.00 0.00

Catchment Canopy x Physiography Cat.Can x Wat.Pd + (Stream ID) 5 136.25 31.62 0.00 0.00

107 Interactive indices ICI x IWI + (Stream ID) 5 136.60 31.97 0.00 0.00

Hydrology Slope + BFI + (Stream ID) 4 140.67 36.04 0.00 0.00

Null (Stream ID) 2 143.66 39.03 0.00 0.00

Physiography Wat.Pd + (Stream ID) 3 145.71 41.08 0.00 0.00

a Number of estimated parameters in the model b Akaike’s information criterion, adjusted for small sample size c Relative model weight

108 Table 4. Coefficient estimates and 95% confidence intervals for the top ranked model predicting stream biotic integrity from macroinvertebrate collections made in the Upper Lynches River Sub-basin of North and South Carolina.

Model Parameter Estimate Std. Error Lower 95% CI Upper 95% CI

Riparian Canopy + Stream (Intercept) -4.08 1.01 -6.09 -2.05

Riparian Canopy Cover 0.42 0.09 0.23 0.61

Dissolved Oxygen 1.56 0.31 0.95 2.18

pH 0.21 0.11 0.02 0.44

Specific Conductance -0.01 0.02 -0.03 0.01

109 FIGURES

Figure 1. Study area where macroinvertebrate samples were collected, consisting of three HUC 10 watersheds within the Lynches River sub-basin of the Pee Dee River Basin.

110

Figure 2. The three spatial extents used to define landcover variables while investigating patterns of biotic integrity within the Upper Lynches River Sub-basin of North and South Carolina. Extents are defined as the local catchment (2A), the watershed-wide 100 m riparian area (2B), and the entire upstream watershed (2C).

111

Figure 3. Relationships between variables in the top-ranked model and predicted biotic integrity using the top ranked model. Plots include riparian canopy cover (3A), pH (3B), dissolved oxygen (3C), and specific conductance (3D). Covariate effects were predicted while holding other variables in the model at the mean observed value. Dashed lines represent 95% confidence intervals of the fitted regression line.

112

Figure 4. Comparison of predicted and observed biotic integrity scores after k-fold cross validation (n = 5). Shaded area represents 95% confidence interval.

113

Figure 5. Predicted conditional biotic integrity scores for target streams in the Upper Lynches River sub-basin based on parameter estimates within the top-ranked model. Predicted scores were calculated using percent riparian canopy cover while holding in-stream variables in the model at the mean observed values (dissolved oxygen = 8.9 mg/l, specific conductance = 75.9 μS/cm, pH = 6.2).

114