Microbial Community Responses to Environmental Perturbation

by

Raven Lee Bier

University Program in Ecology Duke University

Date:______Approved:

______Emily S. Bernhardt, Supervisor

______Edward K. Hall

______Heileen Hsu-Kim

______Dean L. Urban

______Jennifer J. Wernegreen

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the University Program in Ecology in the Graduate School of Duke University

2016

ABSTRACT

Microbial Community Responses to Environmental Perturbation

by

Raven Lee Bier

University Program in Ecology Duke University

Date:______Approved:

______Emily S. Bernhardt, Supervisor

______Edward K. Hall

______Heileen Hsu-Kim

______Dean L. Urban

______Jennifer J. Wernegreen

An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the University Program in Ecology in the Graduate School of Duke University

2016

Copyright by Raven Lee Bier 2016

Abstract

Microorganisms mediate many biogeochemical processes critical to the functioning of ecosystems, which places them as an intermediate between environmental change and the resulting response. Yet, we have an incomplete understanding of these relationships, how to predict them, and when they are influential. Understanding these dynamics will inform ecological principles developed for macroorganisms and aid expectations for microbial responses to new gradients. To address this research goal, I used two studies of environmental gradients and a literature synthesis.

With the gradient studies, I assessed microbial community composition in stream biofilms across a gradient of alkaline mine drainage. I used multivariate approaches to examine changes in the non-eukaryote microbial community composition of taxa (chapter 2) and functional genes (chapter 3). I found that stream biofilms at sites receiving alkaline mine drainage had distinct community composition and also differed in the composition of functional gene groups compared with unmined reference sites.

Compositional shifts were not dominated by groups that could benefit from mining-associated increases of terminal electron acceptors; two-thirds of responsive taxa and functional gene groups were negatively associated with mining. The majority of subsidies and stressors (nitrate, sulfate, conductivity) had no consistent relationships with taxa or gene abundances. However, methane metabolism genes were less abundant at mined sites and there was a strong, positive correlation between selenate reductase gene abundance and mining-associated selenium. These results highlighted the potential for indirect factors to also play an important role in explaining compositional shifts.

In the fourth chapter, I synthesized studies that use environmental perturbations to explore microbial community structure and microbial process connections. I examined nine journals (2009–13) and found that many qualifying papers (112 of 148) documented structure and process responses, but few (38 of 112 papers) reported statistically testing for a link. Of these tested links, 75% were significant. No particular approach for characterizing structure or processes was more likely to produce significant links. Process responses were detected earlier on average than responses in structure. Together, the findings suggested that few publications report statistically testing structure-process links; but when tested, links often occurred yet shared few commonalities in linked processes or structures and the techniques used for measuring them.

Although the research community has made progress, much work remains to ensure that the vast and growing wealth of microbial informatics data is translated into useful ecological information. In part, this challenge can be approached through using hypotheses to guide analyses, but also by being open to opportunities for hypothesis generation. The results from my dissertation work advise that it is important to carefully interpret shifts in community composition in relation to abiotic characteristics and recommend considering ecological, thermodynamic, and kinetic principles to understand the properties governing community responses to environmental perturbation.


Dedication

To my father, sibling, friends, and mentors. And in loving memory of my mother, Terry.


Contents

Abstract

List of Tables

List of Figures

Acknowledgements

1. Introduction

1.1 Motivation

1.2 Microbial Responses to Chemical Perturbation

1.3 Synthesis of Perturbation Studies Informing Microbial Community Structure-process Links

1.4 Summary

2. Bacterial Community Responses to a Gradient of Alkaline Mountaintop Mine Drainage in Central Appalachian Streams

2.1 Introduction

2.2 Methods

2.2.1 Study Site

2.2.2 Water

2.2.3 Stream Biofilms

2.2.4 Biofilm Community 16S rRNA Gene Analysis

2.2.5 Bacterial Community Analyses

2.2.6 Bacterial Taxa and Environmental Analysis

2.2.7 Predicted Functional Profiles

2.3 Results

2.3.1 Environmental Characterization of Streams

2.3.2 Sequencing and Taxa Identification

2.3.3 Overall Bacterial Community Structure

2.3.4 Bacterial Diversity Along the Mining Gradient

2.3.5 Indicator Taxa and Predicted Functions

2.4 Discussion

3. Subsidies or Stressors? How Alkaline Mine Drainage Shapes Microbial Communities

3.1 Introduction

3.2 Methods

3.2.1 Study Site

3.2.2 Sample Collection

3.2.3 Water Chemistry

3.2.4 Microbial Biofilm

3.2.5 Nucleic Acid Preparation and Sequencing

3.2.6 Bioinformatics Analysis

3.2.7 Relative Abundance of Protein Encoding Genes Using qPCR

3.2.8 Activity Assays

3.2.8.1 Substrate-Induced Respiration (SIR)

3.2.8.2 Nitrification Potential

3.2.9 Statistical Analyses

3.2.9.1 Analysis of Microbial Indicators and Positive or Negative Responders

3.3 Results

3.3.1 Characteristics of Stream Water and Microbial Biofilm Chemistry

3.3.2 Characteristics of Taxa, Functional Genes, and Pathway Composition

3.3.3 Environmental Correlates of Variation in Composition

3.3.4 Shifts in Microbial Characteristics with Mining

3.3.5 Relative Responses of Subsidy-stress Genes

3.3.6 Quantified Gene Abundances Across the AlkMD Gradient

3.3.7 Nitrification Potential Assay

3.4 Discussion

4. Linking Microbial Community Structure and Microbial Processes: An Empirical and Conceptual Overview

4.1 Introduction

4.2 Methods

4.3 Results

4.3.1 How Frequently Do Publications Report That an Experimental Manipulation Led to Changes in Microbial Community Structure or Microbially Mediated Ecosystem Processes? And How Often Do These Changes Co-occur?

4.3.2 Do Particular Experimental Conditions or Techniques More Often Associate with Observed Links Between Community Structure and Microbial Processes?

4.3.2.1 Experimental Design

4.3.2.2 Ecosystem Processes and Community Structure in the Link-tested Dataset

4.3.2.3 Molecular Techniques Used in the Link-tested Dataset

4.3.3 How are Researchers Attempting to Identify Links Between Measures of Microbial Community Structure and Process?

4.4 Discussion

5. Conclusion

Appendix A

Appendix B

Appendix C

References

Biography

List of Tables

Table 1. Average water chemistry values

Table 2. Overall and pair-wise comparisons of Mud River bacteria community composition

Table 3. The α-diversity using 1543 sequences

Table 4. Taxa identified at the order or family level as indicators

Table 5. Energy metabolism functions

Table 6. Summary of hypotheses and predictions

Table 7. Expectations of subsidized and stressed taxa, functions, and genes

Table 8. Alpha diversity metrics

Table 9. Environmental variables from water chemistry and biofilms

Table 10. Environmental variables from water chemistry and biofilms continued

Table 11. Multiplex Identifier Adapters

Table 12. Environmental variables from PCA

Table 13. Correlations of environmental variables and relative abundances of bacteria

Table 14. Primers and standards used for quantitative PCR analysis of biofilms

Table 15. Variables used in Principal Components Analysis of dissolved water chemistry constituents

Table 16. Variables used in Principal Components Analysis of biofilm constituents

Table 17. Phyla excluded from Fig. 10 due to <1% relative abundance

Table 18. KEGG orthogroups excluded from Fig. 11 due to <1% relative abundance

Table 19. Key of Level B KEGG orthogroups for Fig. 11 and Tables 21-24

Table 20. OTUs with a relative abundance significantly less (-) or greater (+) at sites with alkaline mine drainage (AlkMD)

Table 21. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD)

Table 22. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD) continued

Table 23. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD) continued

Table 24. KEGG orthogroups with a relative abundance significantly greater (positive) at sites with alkaline mine drainage (AlkMD)

Table 25. MetaCyc pathways with a relative abundance significantly greater (positive) at sites with alkaline mine drainage (AlkMD)

Table 26. Results from linear regression using KEGG orthogroups from site metagenomes and subsidies

Table 27. Results from linear regression using KEGG orthogroups from site metagenomes and stressors

Table 28. Results from qPCR of biofilms

Table 29. Distribution of studies from 2009-2013 by journal

Table 30. Number of incidences associated with each type of manipulation

Table 31. Additional microbial processes tested for a link with microbial community structure not included in Fig. 22

Table 32. Additional genes and tested for a link with microbial processes not included in Fig. 23

List of Figures

Figure 1. Sample sites on Mud River and Left Fork Mud River

Figure 2. Principal components analysis (PCA) of selected environmental variables

Figure 3. Non-metric multidimensional scaling ordination

Figure 4. Water and biofilm chemistry variables

Figure 5. Alpha diversity

Figure 6. Response of genera within class Betaproteobacteria

Figure 7. Response of genera within class Alphaproteobacteria

Figure 8. Sample sites on Mud River and Left Fork Mud River

Figure 9. Principal components analysis of selected environmental variables

Figure 10. Relative abundances of phyla of Bacteria and Archaea

Figure 11. Relative abundances of KEGG orthogroups

Figure 12. Relative abundances of MetaCyc pathways

Figure 13. Composition of mined and unmined site a) taxa, b) KEGG orthogroups, and c) MetaCyc pathways using Nonmetric Multidimensional Scaling

Figure 14. OTUs that differed significantly between mined and unmined sites

Figure 15. KEGG orthogroups that differed significantly between mined and unmined sites

Figure 16. MetaCyc pathways that differed significantly between mined and unmined sites

Figure 17. Graphs of gene copy relative abundance within each metabolism or stress category

Figure 18. Linear regressions of absolute abundance of selenate reductase gene, serA, with a) biofilm selenium concentration and b) dissolved selenium concentration

Figure 19. Distribution of the literature synthesis results

Figure 20. Distribution of incidences among different experimental attributes

Figure 21. Duration of experiments

Figure 22. Distribution of incidences among microbial processes

Figure 23. Distribution of incidences among structural measurements

Figure 24. Flowchart of guidelines for research involving microbial community structure and ecosystem processes

Figure 25. Scatterplot of environmental variables

Figure 26. Post hoc site groupings

Figure 27. Rarefaction curves from rRNA using MG-RAST M5RNA database

Figure 28. NMDS of OTUs (Fig. 13a) with environmental variable vectors

Figure 29. NMDS of KEGG orthogroups (Fig. 13b) with environmental variable vectors

Figure 30. Filtering process for papers

Figure 31. Distribution by journal of 1,189 total papers from 2009-2013

Figure 32. Examples of the distribution of incidences associated with a universal gene (bacterial 16S (A)) and a specific gene (bacterial amoA (B))

Figure 33. Comparison between the number of structure and process metrics used in each paper

Acknowledgements

I acknowledge my advisor, Emily Bernhardt, for her incredible mentorship, patience, and support of my scientific development. I thank my dissertation committee for their guidance and advice both related to this research and broader scientific questions. I would also like to recognize members of the Bernhardt Lab and Duke River Center for assistance with chemical analyses, statistics, coding, and field work. In particular, stellar contributions were made by Brooke Hassett, Anna Fedders, Kris Voss, Alison Appling, Si-Yi Wang, Matt Ross, and Joanna Blaszczak. I thank members of the Powell Center for Analysis and Synthesis Next-Generation of Ecological Indicators working group who provided guidance and comments for the third chapter. The Miller family, including Anita Miller, were essential contacts for field work and site selection. Chris Ellis assisted with bioinformatics in the second chapter. Grace Schwartz and Sarah Diringer of the Hsu-Kim lab group instructed me on sulfide analysis. Mariah Arnold and Ty Lindberg of the Di Giulio lab group assisted with field work. Nick Huffman and Nick Goltfelty of the Department of Natural Resources provided advice and transportation on the reservoirs.

Research from the first chapter was supported by the Foundation for the Carolinas. The second chapter was supported by the United States Geological Survey Powell Center for Analysis and Synthesis. The third chapter has been supported by the Duke University Genomics of Microbial Systems pilot project and a grant from the U.S. Environmental Protection Agency's Science to Achieve Results (STAR) program.

Disclaimer statement:

Chapter 3 of this dissertation was developed under Assistance Agreement No. FP-91764401-0 awarded by the U.S. Environmental Protection Agency to Raven Bier. It has not been formally reviewed by EPA. The views expressed in this document are solely those of Raven Bier and co-authors and do not necessarily reflect those of the Agency. EPA does not endorse any products or commercial services mentioned in this publication.


1. Introduction

1.1 Motivation

Microorganisms perform many biogeochemical processes (e.g., decomposition, N₂-fixation, sulfate reduction) that place them as an intermediate between environmental change and the resulting ecosystem response (Falkowski et al. 2008). Historically, the microbial component of a system has been treated as a black box, a static placeholder that connects the inputs and outputs but does not affect their relationship. The more we study microbial communities, however, the more we realize how dynamic an intermediate they are. Physiology, phenotypic plasticity, population growth and turnover, evolutionary adaptation and shifts in community composition are all mechanisms of change in microbial communities (Treseder et al. 2011). A combination of these mechanisms comprises their response to environmental perturbation and suggests that they play a greater role in mediating the connection between those environmental inputs and the abiotic consequences.

In this dissertation, I seek to understand microbial community responses to environmental perturbation and their role in mediating ecosystem responses. To address this goal, I pose the following research questions:

1) How do taxonomic diversity and composition of microbial communities change across a strong chemical gradient of alkaline mine drainage?

2) How do the composition and relative abundance of microbial functional genes change across a strong chemical gradient of alkaline mine drainage?

3) How and to what extent are perturbation studies revealing connections between microbial community structure and microbial processes?

In the following chapters, I take two approaches to these questions: first, I examine taxonomic and functional gene responses to a chemical perturbation gradient (Chapters 2 and 3), and second, I determine how often and under what conditions researchers are identifying connections between microbial community structure and microbially mediated processes following environmental perturbation (Chapter 4).

1.2 Microbial Responses to Chemical Perturbation

Chemical shifts in an ecosystem comprise a major source of environmental perturbation and in many instances are a ubiquitous consequence of activity by our planet’s 7 billion people. We apply fertilizers and pesticides to agricultural fields, increase rock weathering via mining, and release effluent from chemical processing plants (Klee and Graedel 2004). Due to the microbial role in mediation of biogeochemical processes, understanding how microorganisms respond to chemical additions could help us explore how ecosystems respond to chemical change. The incredible diversity of metabolic capabilities and life history traits allows microbes to flourish in systems that are chemically inhospitable to other organisms (e.g., hydrothermal vents, hypersaline salt flats, and acid mine drainage). Perhaps because of this flexibility, we have spent little time considering how contamination by novel chemical compounds and mixtures may alter microbial communities and the potential effect on microbially mediated ecosystem processes.

Given the immense genetic diversity of microorganisms, we assume that metabolic capabilities are typically widespread and flexible at the microbial community level, yet these capabilities vary on an individual basis. The diversity in metabolisms could create different “functional” susceptibility to contamination because any single metabolic capability occurs within a phylogenetic range that could be constrained or broad (Schimel 1995). Carbon mineralization, for example, is a nearly universal metabolism while nitrite-oxidation is performed by only four genera (Abeliovich 2007).

In cases where a metabolic capability is shared over many phyla, chemical contamination could have little effect: a phylogenetic branch of sensitive taxa could be eradicated, while tolerant taxa on other branches compensate for that loss of function (Johnsen et al. 2001, Allison and Martiny 2008). Alternatively, if the metabolism is not redundant across phyla (e.g., methanogenesis, nitrification, anammox), we might expect heightened sensitivity of the biogeochemical process they control (Widenfalk et al. 2008).

Contaminants should not, however, be viewed only as stressors. At some level it must be true that any given contaminant represents a stress to some member of the microbial community. But because of the immense variation in microbial life history strategies it also seems likely that the same contaminant could represent a resource subsidy to a different subset of taxa or to the same set of taxa at a different concentration. I posit that in cases where a contaminant represents a significant stress or subsidy to a restricted set of taxa possessing unique metabolic capabilities we may observe substantial (and otherwise unpredictable) shifts in the composition of the microbial community.

Predicting the consequences of chemical alterations on microbial communities is extraordinarily difficult given our current state of knowledge about the variety of effects on microbial processes as well as the complexity of the chemical inputs. In many systems chemical alterations simultaneously add nutrient subsidies and potential toxins while significantly altering carbon quality. However, unpacking microbial community responses with respect to phylogenetic breadth and functional groups that should directly benefit or be harmed by changes in the environmental variables can improve our expectations of how a suite of chemicals will affect microbial communities beyond our current thermodynamic and kinetic predictions.

The first aim of this dissertation is to understand the responses of microbial community composition to a chemical perturbation gradient from both a taxonomic and functional gene approach. In Chapter 2, I investigate these relationships by studying bacterial communities sampled across a chemical gradient established by alkaline coal mine drainage. In Chapter 3, I examine how the functional potential of the community responds to the same gradient by using relative abundances of functional genes (metagenomics) and absolute quantities of a subset of targeted functional genes (quantitative PCR). The expected responses are guided by a subsidy-stress framework in the context of phylogenetic breadth. Thus, I anticipated that subsidized taxa would respond positively, but that subsidized taxa with narrow phylogenetic distribution would show no aggregate change or could respond negatively due to the stressor effect dwarfing a subsidy effect. Alternatively, broadly distributed guilds of microbes could have a more varied response that reflects a greater genetic variability within that guild.

1.3 Synthesis of Perturbation Studies Informing Microbial Community Structure-process Links

Another major goal of microbial ecology is to identify informative connections between the microbial community and resulting processes (i.e., links between microbial community structure and microbial functions). Although this objective seems straightforward, there are conceptual and methodological challenges to designing studies that explicitly evaluate this link. For example, the multiple reaction mechanisms integrated by the microbial community can obscure detection and interpretation of the response. Further, disturbances are multi-faceted, often adjusting multiple, covarying factors. Yet, environmental perturbations that affect community structure provide an ecologically-relevant approach to identifying meaningful connections.

Substantial changes in the chemical and physical environment ought to exert strong selection pressure on microbial communities as has been observed in studies of community assembly (Lindstrom and Langenheder 2012). Yet, if many microbes are resistant to stress (i.e., capable of “weathering” ill conditions for long periods) through dormancy (Jones and Lennon 2010, Lennon and Jones 2011) and some microbial communities are resilient to perturbations (Shade et al. 2012a, Shade et al. 2012b), environmental changes may often not lead to measurable changes in community composition or collective microbial community function. In a meta-analysis, Graham et al. (2016) found that 56% of variation in ecosystem process rates could be explained by environmental variables alone, but that microbial community data could improve 29% of the analyzed studies. While this may be true mostly for narrowly distributed guilds as concluded by Graham et al. (2016), processes controlled by phylogenetically broad guilds may also reflect the contribution of those guilds to the microbial community.

With the second aim, I sought to better understand the conditions under which linked compositional and functional changes are observed in response to experimental manipulation of environmental conditions. Thus, in the fourth chapter, I synthesize environmental perturbation studies that measure microbial community structure and microbially-mediated processes to determine the frequency and conditions with which environmental perturbations lead to a detectable link between microbial community structure and microbial processes. I anticipated that processes controlled by more narrowly-distributed guilds would show a greater percentage of links to microbial community structure.


This synthesis involved contributions from multiple authors (Bier et al. 2015a), but I led the primary research effort related to designing, compiling, analyzing, and writing the project. My co-authors provided critiques and guidance on the overall project design and edits on the manuscript.

1.4 Summary

With this dissertation I examined the connections between microbial communities and environmental perturbation. In Chapters 2 and 3, I used an existing chemical gradient of alkaline mine drainage to look for large-scale relationships between subsidies and stressors introduced by the gradient and the taxonomic and functional gene abundances of the microbial communities. Determining the fidelity of these relationships can help us understand how effective gradients are in structuring microorganisms as has been observed for macroorganisms. Further, this approach can inform when and how thermodynamic and kinetic principles and stressors apply to microbial community composition shifts.

The final literature synthesis chapter addresses a major research question investigated by both microbial and ecosystem ecologists that asks under what circumstances are there links between microbial community structure and the ecosystem processes they mediate. Synthesizing the progress made on this question to date helps target an ecological question of when and whether microbial community information matters for interpreting ecosystem processes while also serving to focus our research energy and resources.


2. Bacterial Community Responses to a Gradient of Alkaline Mountaintop Mine Drainage in Central Appalachian Streams

2.1 Introduction

Microbial diversity, abundance and composition change along environmental gradients. Understanding the nature of these changes may help us identify a set of environmental conditions characteristic to particular groups of taxa (Feris et al. 2009, Logares et al. 2013). Ultimately, these relationships could be used to develop microbial indicators (Feris et al. 2009) and help predict microbial community responses to environmental conditions (Paerl et al. 2003, Sims et al. 2013). Communities of other organisms including macroinvertebrates and periphyton are widely and routinely used as bioindicators of localized chemical and physical conditions (Barbour et al. 1999, Hering et al. 2003, Solimini et al. 2006). Microbial communities, with their great genetic diversity and now rapid identification process, also hold promise as bioindicators. There is reason to believe that microbial community indexes or indicator taxa could be developed, as a variety of studies demonstrate that diversity and composition shift considerably along gradients of pH (Fierer and Jackson 2006, Fierer et al. 2007, Lauber et al. 2009, Lear et al. 2009, Rousk et al. 2010, Griffiths et al. 2011), trace metal concentration (Baker and Banfield 2003, Feris et al. 2003, Giller et al. 2009, Lami et al. 2013), salinity (Lozupone and Knight 2007, Auguet et al. 2010), and substrate carbon-to-nitrogen ratio (Bates et al. 2011).

Because microbial composition can be affected strongly by pH, salinity and metal concentrations, we speculated that exposure to alkaline mine drainage (AlkMD) would drive important shifts in stream microbial assemblages. AlkMD results from surface coal mining, the dominant form of land cover change in Central Appalachia (Townsend et al. 2009). Effluent is produced during rock weathering of surface coal mines that contain carbonate rock strata in addition to coal layers (Palmer et al. 2010, Bernhardt and Palmer 2011). The carbonate matrix buffers sulfuric acid produced from weathered pyrite minerals, increasing base cation (Ca²⁺, Mg²⁺), HCO₃⁻ and SO₄²⁻ concentrations in receiving waters (Rose and Cravotta 1998, Kirby and Cravotta 2005). AlkMD is thus characterized by increased alkalinity, ionic strength and pH and often has elevated metals that reflect parent geology (Lindberg et al. 2011, USEPA 2011, Griffith et al. 2012). Recent regional analyses suggest that AlkMD generated from surface mines led to significant chemical and biological degradation of at least 22% of southern West Virginia rivers (Bernhardt et al. 2012).
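As an illustration of the buffering described above, the overall reactions can be sketched as follows. These are commonly cited textbook stoichiometries for pyrite oxidation and calcite dissolution, given here as an assumption for illustration; they are not taken from the analyses in this chapter, and the carbonate strata at a given mine may include other minerals such as dolomite.

```latex
\begin{align}
\mathrm{FeS_2} + \tfrac{15}{4}\,\mathrm{O_2} + \tfrac{7}{2}\,\mathrm{H_2O}
  &\rightarrow \mathrm{Fe(OH)_3} + 2\,\mathrm{SO_4^{2-}} + 4\,\mathrm{H^+} \\
\mathrm{CaCO_3} + \mathrm{H^+}
  &\rightarrow \mathrm{Ca^{2+}} + \mathrm{HCO_3^-}
\end{align}
```

Read together, the acidity generated in the first reaction is consumed by the second, which releases Ca²⁺ and HCO₃⁻ while SO₄²⁻ remains in solution.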

AlkMD offers an interesting contrast to the large body of literature exploring responses to acid mine drainage, as it increases salinity and trace metals but enhances alkalinity rather than reducing pH. In this study, we compared stream bacterial communities between mined and unmined catchments within the largest surface coal mine complex in Central Appalachia. We asked the following questions. (1) Does AlkMD significantly alter community composition and do these changes manifest themselves as changes in α-diversity? (2) What bacterial taxa are responsible for changes in community composition and can these taxa be used as indicators of AlkMD? (3) What insight regarding functional response can be gained by examining taxa lost and gained because of the influence of AlkMD?

We expected bacteria’s compositional response to AlkMD would mirror that observed for macroorganisms (Pond 2010, Bernhardt et al. 2012). Specifically, compositional shifts would be due to both decreased α-diversity and decreased evenness across the AlkMD gradient. We hypothesized that many taxa found in the low solute waters of unmined sites would be absent or rare in sites downstream of surface mines and that taxa known to metabolize ions released from mining would increase downstream of mining. We anticipated that taxa with distinct metabolic repertoires could indicate sites exposed to AlkMD.
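To make the diversity terms in these expectations concrete, the following minimal Python sketch computes taxon richness, Shannon diversity, and Pielou's evenness from a vector of taxon counts. The function name and the example counts are hypothetical, and the choice of these particular indices is an assumption for illustration; the metrics used in this study may differ.

```python
import math


def alpha_diversity(counts):
    """Return (richness, Shannon H', Pielou's J) for one sample.

    `counts` is a sequence of non-negative sequence counts per taxon.
    Illustrative helper only; not the analysis pipeline of this study.
    """
    observed = [c for c in counts if c > 0]
    if not observed:
        raise ValueError("sample contains no observed taxa")
    total = sum(observed)
    richness = len(observed)
    # Shannon index: H' = -sum(p_i * ln p_i) over observed taxa
    shannon = -sum((c / total) * math.log(c / total) for c in observed)
    # Pielou's evenness: J = H' / ln(richness); defined as 0 for a single taxon
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Hypothetical samples: an even community and one dominated by a single taxon
even_sample = [10, 10, 10, 10]
dominated_sample = [97, 1, 1, 1]
print(alpha_diversity(even_sample))
print(alpha_diversity(dominated_sample))
```

Both hypothetical samples have the same richness, but the dominated sample has lower Shannon diversity and lower evenness, which is the kind of pattern the hypothesis above anticipates at sites exposed to AlkMD.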

2.2 Methods

2.2.1 Study Site

Sampling sites are in Mud River, a Central Appalachian surface coal mining region that lies in West Virginia’s Lower Guyandotte watershed (Fig. 1). Mud River has two forks. Upper Mud River passes through the Hobet Mine complex, the largest surface coal mine in Central Appalachia, which includes active and reclaimed mines within 40 km² of permitted mines. Left Fork Mud River is unmined but has similar geology and low-density residential housing. Sampling locations spanned a gradient of AlkMD contamination. Sites affected by mining included 9 along Upper Mud River’s mainstem and 8 within tributaries draining mines (6 active and 2 reclaimed). Unmined sites comprised one site in Upper Mud River upstream of surface mines, one unimpaired tributary and four Left Fork Mud River sites.

[Figure 1: map of sample sites. Map labels: Left Fork, Right Fork. Legend: Unmined, Mainstem Mined, Actively Mined Trib., Reclaimed Mined Trib. Scale bar: 0, 2.5, 5, 10 kilometers.]

Figure 1. Sample sites on Mud River and Left Fork Mud River in Boone and Lincoln Counties, West Virginia. HUCs 12-050701020301 and 12-050701020104 outlined in grey. Grey tributary streams run through mined areas, while black tributaries are unmined. The mainstem of Mud River and Left Fork Mud River shown in bold black from headwaters to confluence. Arrows show flow direction. Inset of US mid-Atlantic states shows Appalachian Coalfield Region as grey shaded area with relative location of study site in WV in red (not to scale).

West Virginia GIS Technical Center (http://wvgis.wvu.edu/) provided spatial data, which we loaded into ArcMap 10.0. We delineated catchments for sampling locations from a 30-meter DEM (digital elevation model; US Geological Survey National Elevation Dataset) with ArcMap 10.0 Spatial Analyst hydrology tools. Mines and reclaimed mines were identified for land cover classification using 1-meter color orthophotos from the USDA’s National Agriculture Imagery Program.

2.2.2 Water Chemistry

Water chemistry and temperature at each site were measured during deployment and collection of biofilm substrates and 1 month later (December 2010 and

April and May 2011). We measured in-stream conductivity and pH and analyzed samples for a suite of major and trace elements (Appendix A Tables 9 & 10, Fig. 25).

Water sampling followed USGS protocols (USGS variously dated). See Lindberg et al. (2011) for details.

2.2.3 Stream Biofilms

Biofilms were grown on substrates suspended under water near the shaded streambank at each sampling site. To minimize variability (De Beer and Stoodley 2006,

Sabater et al. 2007) and use environmentally relevant substrates, we used wood veneers cut from the same tree (Acer saccharum) and enclosed veneers in mesh aquaculture bags

(Pentair Aquatic Eco-Systems, Apopka, FL, USA). Four sterilized veneers were deployed under water at each site and incubated for 4 months. Veneers were removed in April

2011; two were transported to the lab on dry ice and stored at -80 °C until DNA extraction. Two remaining veneers were used for metal analysis and carbon content.

Biofilm scraped from two veneers was oven-dried at 78 °C, homogenized, digested with trace metal grade HNO3 and heated at 80 °C. Samples were analyzed for metal content with inductively coupled plasma mass spectrometry as detailed in Lindberg et al. (2011). Remaining biofilm was used for ash-free dry mass via combustion at 500 °C. Then, 53% of the difference between pre- and post-combustion dry mass was calculated as carbon content (Wetzel 1983).
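The carbon calculation described above can be illustrated with a minimal sketch. The function name and the example masses are hypothetical and are not values from this study; only the 53% conversion factor comes from the text.

```python
def biofilm_carbon_content(dry_mass_g, ash_mass_g, carbon_fraction=0.53):
    """Estimate carbon mass (g) as a fixed fraction of ash-free dry mass.

    dry_mass_g: oven-dried mass before combustion.
    ash_mass_g: mass remaining after combustion at 500 °C.
    carbon_fraction: 0.53, the conversion factor stated in the text.
    """
    if ash_mass_g > dry_mass_g:
        raise ValueError("ash mass cannot exceed pre-combustion dry mass")
    ash_free_dry_mass = dry_mass_g - ash_mass_g  # mass lost on combustion
    return carbon_fraction * ash_free_dry_mass


# Hypothetical example masses, for illustration only.
carbon_g = biofilm_carbon_content(dry_mass_g=1.20, ash_mass_g=0.40)
print(round(carbon_g, 3))
```

Dividing the result by the scraped veneer area would give carbon per unit surface area, as reported in the Results.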

2.2.4 Biofilm Community 16S rRNA Gene Analysis

We homogenized biofilm from 2 of the veneers and subsampled 0.25 g wet mass for DNA extraction. Community DNA was extracted from the biofilm using the PowerBiofilm™ DNA isolation kit (MO BIO, Carlsbad, CA, USA) and stored at -20 °C.

Bacterial community DNA was amplified at the 27 to 338 region of the 16S (small-subunit ribosomal) RNA gene (regions V1 and V2 using the Escherichia coli genome numbering system). The forward 27F primer had a Roche Titanium Fusion Primer Adapter A, followed by a 4-nucleotide key, then a 454-specific 10-nucleotide MID barcode (Appendix A Table 11) for each sample site and, finally, the 16S template-specific sequence (5’ CCA TCT CAT CCC TGC GTG TCT CCG AC TCAG NNN NNN NNN N AG AGT TTG ATC CTG GCT CAG 3’). Reverse primer 338R (5’ GCT GCC TCC CGT AGG AGT 3’) had no barcode; thus, sequencing was unidirectional using the Roche 454 Lib-L kit. The polymerase chain reaction recipe was 2.5 µL (10 mM) of each forward and reverse primer, 1 µL of template DNA, 4 µL of dNTPs (1 mM each), 2.5 µL BSA (10 mg/mL), 0.75 µL MgCl2 (50 mM), 2.5 µL 10x buffer, 1 µL Platinum Taq polymerase and 9.1 µL ddH2O. MgCl2, 10x buffer and Taq polymerase were all from the Invitrogen Platinum kit.


The PCR program was 5 min initial denaturation at 95 °C followed by 25 cycles of amplification: 95 °C for 60 s, 52 °C for 60 s, and 72 °C for 105 s. After the last cycle, amplification was extended at 72 °C for 7 min. Samples, including negative controls (no template added), were amplified in triplicate. All PCR work was completed in a laminar flow hood. Replicate PCR samples were pooled and purified with the QIAquick PCR Purification Kit (Qiagen, Valencia, CA). Purified PCR products were normalized with the SequalPrep™ Normalization Plate Kit (Applied Biosystems®, Life Technologies, Grand Island, NY, USA). Equimolar purified PCR amplicons were combined in 3 microcentrifuge tubes, each with a set of barcodes MID1-10. Samples were sent to the Genome Sequencing and Analysis Core Resource at Duke University (Durham, NC, USA) for pyrosequencing with a Roche 454 Life Sciences Genome Sequencer FLX Titanium instrument (Branford, CT, USA).

2.2.5 Bacterial Community Analyses

QIIME 1.6.0 software pipeline (Caporaso et al. 2010) was used for downstream sequence processing: reverse primer and chimera removal, phylotype binning, operational taxonomic unit (OTU) assignment and 10-base MID (multiplex identifier) sample grouping. USEARCH was used to filter noisy sequences, check for chimeras and pick OTUs from demultiplexed sequences (Edgar 2010). OTUs containing < 3 sequences were removed. Remaining OTUs were picked at 97% sequence similarity and identified using the RDP (Ribosomal Database Project) classifier retrained with Greengenes. The NAST algorithm was used for alignment and Greengenes (http://greengenes.secondgenome.com) supplied core representative sequences (version October 2012). Sequences were quality filtered and rarefied to 1543 sequences per sample. To select 1543 sequences for analysis, each OTU’s abundance at a site was divided by the total sequence count for that site, and then multiplied by 1543 to retain the relative abundance of that OTU out of 1543 sequences. The resulting decimals were floored and the remaining sequences needed for the site to contain 1543 sequences were selected using the distribution of OTUs at each site (Beevers 2006). One unmined site (MRUl2) had only 825 sequences. Data from this site were used in environmental data correlations, but excluded from diversity calculations. Rarefaction curves were generated for Chao1 richness (Chao 1984), Margalef’s index, Shannon diversity, Simpson’s index for evenness and evenness.
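The scaling step described above (proportional scaling to 1543 sequences, flooring, then filling the shortfall) can be sketched as follows. This is an illustration under stated assumptions, not the procedure's actual code: the function name and example counts are hypothetical, and the fill step is assumed here to be a random draw weighted by each OTU's original abundance, since the text does not specify the selection in more detail.

```python
import random


def scale_to_depth(counts, depth=1543, seed=0):
    """Scale one site's OTU counts to a fixed depth, preserving proportions.

    counts: dict mapping OTU id -> sequence count at the site.
    Returns a dict of integer counts that sum exactly to `depth`.
    """
    total = sum(counts.values())
    if total < depth:
        # e.g. a site with only 825 sequences cannot be scaled to 1543
        raise ValueError("site has fewer sequences than the target depth")

    # Proportional share of `depth`, floored (integer arithmetic avoids
    # floating-point rounding surprises).
    scaled = {otu: (n * depth) // total for otu, n in counts.items()}

    # ASSUMPTION: the remaining sequences are drawn at random, weighted by
    # the OTU distribution at the site.
    shortfall = depth - sum(scaled.values())
    rng = random.Random(seed)
    otus = list(counts)
    weights = [counts[otu] for otu in otus]
    for otu in rng.choices(otus, weights=weights, k=shortfall):
        scaled[otu] += 1
    return scaled


# Hypothetical site with four OTUs.
site = {"OTU_a": 3000, "OTU_b": 1500, "OTU_c": 500, "OTU_d": 7}
rarefied = scale_to_depth(site)
print(sum(rarefied.values()))
```

Unlike classical rarefaction by random subsampling, this approach fixes most of each OTU's count deterministically and leaves only the small remainder to chance.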

Multivariate analysis was guided by Anderson and Willis (2003), who advocate the following approaches: (1) an ordination (robust and unconstrained), (2) statistical testing of the hypothesis and (3) identification of taxa driving the observed patterns. We visualized differences in OTU-based community composition with nonmetric multidimensional scaling (NMDS) ordinations based on Bray–Curtis (Bray and Curtis 1957) and generalized UniFrac (GUniFrac) distance matrices (item 1). GUniFrac distances measure community phylogenetic relatedness, but cover a series of distances from weighted to unweighted by adjusting the weight of the branches in the UPGMA tree (Chen et al. 2012). Alpha controls the weight on lineages with common taxa and was set to 0.5 to provide the best overall power (Chen et al. 2012). For analysis, we grouped mined sites in two different ways: a priori (mainstem mined, active fill and reclaimed valley fill) and post hoc (Ward’s method cluster analysis of Bray–Curtis distances separated sites into group A and group B; Appendix A Fig. 26). Finally, we partitioned variation in our community distance matrix among these a priori and post hoc groupings with permutational multivariate analysis of variance (item 2) (Anderson 2001).
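For readers unfamiliar with the Bray–Curtis measure used for the ordinations, the following minimal sketch shows how the dissimilarity between two sites is computed from OTU counts. The function names and the example counts are hypothetical; the formula is the standard one (sum of absolute differences divided by the sum of all counts).

```python
def bray_curtis(site_a, site_b):
    """Bray-Curtis dissimilarity between two sites (0 = identical, 1 = no shared OTUs).

    site_a, site_b: dicts mapping OTU id -> count (absent OTUs count as 0).
    """
    otus = set(site_a) | set(site_b)
    difference = sum(abs(site_a.get(o, 0) - site_b.get(o, 0)) for o in otus)
    total = sum(site_a.get(o, 0) + site_b.get(o, 0) for o in otus)
    if total == 0:
        raise ValueError("both sites are empty")
    return difference / total


def distance_matrix(sites):
    """Square matrix of pairwise Bray-Curtis dissimilarities for a list of sites."""
    return [[bray_curtis(a, b) for b in sites] for a in sites]


# Hypothetical counts for two sites.
unmined = {"OTU_a": 10, "OTU_c": 5}
mined = {"OTU_a": 5, "OTU_b": 5, "OTU_c": 5}
print(bray_curtis(unmined, mined))  # (5 + 5 + 0) / 30
```

Because only counts enter the formula, two sites that swap one OTU for a close relative are treated the same as two sites that swap it for a distant one, which is the gap that GUniFrac addresses.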

NMDS ordinations using counts relativized to 1543 sequences were created using 200 runs started at random configurations with the nmds function in ecodist (Goslee and Urban 2007). The final solutions for both the Bray–Curtis distance and GUniFrac distance NMDS ordinations were created using a stepdown procedure. GUniFrac distances were based on a UPGMA phylogenetic tree (Chen et al. 2012). The final solution used three axes to relieve stress from the configuration, but the third axis did not increase explanatory power of the ordination. We rotated the ordination result to achieve highest variance on Axis 1. We partitioned variation in our community distance matrices (both Bray–Curtis and GUniFrac) among our a priori and post hoc groupings with permutational multivariate analysis of variance (perMANOVA; Anderson 2001).

For the a priori grouping, we decomposed total variation among three orthogonal contrasts: mined vs. unmined, reclaimed vs. active, and mainstem vs. valley fill. For the post hoc grouping, we decomposed total variation among two orthogonal contrasts: mined vs. unmined, and group A vs. group B. We tested the significance of each contrast using a pseudo F-statistic generated from random permutation of the site categories. This test was carried out using 999 permutations in the adonis function of the vegan package (Oksanen et al. 2011) in R version 2.14.1 (R Development Core Team 2011).
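The logic of a permutation test on a pseudo F-statistic can be sketched in a few lines. This is a simplified one-way illustration of the approach described by Anderson (2001), written in Python for clarity; it is not the adonis implementation and does not handle the orthogonal contrasts used in this study. All names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """One-way pseudo F-statistic computed directly from a distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation P-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign site categories at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example: six sites placed on a line, two well-separated groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
f_stat, p_value = permanova(dist, ["A", "A", "A", "B", "B", "B"])
print(f_stat, p_value)
```

With only six sites there are few distinct relabelings, so the smallest attainable P-value is coarse; the 999 permutations used in the study are meaningful because the real data set has many more sites.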

2.2.6 Bacterial Taxa and Environmental Analysis

To understand community composition and environmental variable associations, we examined correlations between NMDS ordination scores and the first two component scores derived from a principal components analysis (PCA) of transformed environmental variables. We used correlation to look for trends between relative abundance of all phyla and classes and PCA axes, pH gradient and percent of watershed area mined. Because examining linear relationships between taxonomic groups at high hierarchical levels can obscure taxa-specific patterns at lower levels, we also used generalized linear models (Quasi-Poisson regression; McCullagh and Nelder 1989) to identify genera with positive (slope>0, P<0.10), negative (slope<0, P<0.10) and no response (P>0.10) across the gradient of area mined.
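The decision rule for sorting genera into response classes can be written out explicitly. The sketch below assumes that a slope and P-value have already been obtained for each genus from a fitted Quasi-Poisson model; the fitting itself is not shown. The genus names and numbers are hypothetical, and treating P exactly equal to 0.10 as "no response" is an assumption, because the text leaves that boundary case open.

```python
from collections import Counter


def classify_response(slope, p_value, alpha=0.10):
    """Classify a genus by the sign and significance of its regression slope."""
    if p_value < alpha:
        if slope > 0:
            return "positive"
        if slope < 0:
            return "negative"
    # ASSUMPTION: P >= alpha (including P == alpha) counts as no response.
    return "no response"


# Hypothetical (slope, P-value) pairs for three genera.
fits = {
    "Genus_1": (0.021, 0.01),
    "Genus_2": (-0.015, 0.08),
    "Genus_3": (0.004, 0.55),
}
tally = Counter(classify_response(s, p) for s, p in fits.values())
print(dict(tally))
```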

Environmental variables used in the PCA were those that differed significantly (P≤0.05) between mined and unmined sites or between post hoc A and B sites (described below) using a two-tailed Student’s t-test. They included average water chemistry that differed between mined and unmined sites (Table 1) as well as biofilm Ca, Cd, Mg, Mn, Ni, Sr, Th and Zn. Chemistry variables, biofilm biomass, and biofilm C content that failed the Shapiro-Wilk normality test were transformed using log, square root, or inverse functions as needed.

We characterized taxa driving the multivariate patterns using indicator species analysis (Dufrêne and Legendre 1997, De Cáceres and Legendre 2009, De Cáceres et al. 2010) with PC-ORD software (McCune and Mefford, V6). This technique analyzed the association between relative abundance and frequency of taxa and their designated sample site groups and identified which species had the greatest indicator value (percent of perfect indication, where 100% signifies perfect indication) in a designated site group. The index is greatest for an OTU when it is found at all of the sites comprising one group and in no other group. Monte Carlo randomizations with 4999 permutations were used to test the statistical significance of each OTU indicator. OTUs selected as good indicators were those with indicator values >0.3 and P<0.05 as recommended by Dufrêne and Legendre (1997).
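The indicator value combines two quantities: specificity (how concentrated an OTU's abundance is in one group) and fidelity (how consistently it occurs at sites in that group). A minimal sketch of the Dufrêne and Legendre (1997) calculation follows. It is written independently of PC-ORD, the function name and data are hypothetical, and the Monte Carlo significance test is omitted.

```python
def indicator_values(abundances, groups):
    """Indicator value (0-100) of one OTU for each site group.

    abundances: abundance of the OTU at each site.
    groups: group label of each site, in the same order.
    """
    by_group = {}
    for value, label in zip(abundances, groups):
        by_group.setdefault(label, []).append(value)

    mean_abundance = {g: sum(v) / len(v) for g, v in by_group.items()}
    summed_means = sum(mean_abundance.values())

    result = {}
    for g, values in by_group.items():
        # Specificity: share of the OTU's mean abundance found in this group.
        specificity = mean_abundance[g] / summed_means if summed_means else 0.0
        # Fidelity: fraction of sites in this group where the OTU occurs.
        fidelity = sum(1 for v in values if v > 0) / len(values)
        result[g] = 100.0 * specificity * fidelity
    return result


# Hypothetical OTU that is abundant at unmined sites and rare at mined sites.
abund = [10, 8, 12, 0, 1, 0]
labels = ["unmined", "unmined", "unmined", "mined", "mined", "mined"]
print(indicator_values(abund, labels))
```

An OTU present at every site of one group and absent everywhere else scores 100 for that group and 0 for the others.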

Chao1 richness estimator (Chao 1984) is a useful index for phylotype richness of uncultured microbial communities that include many rare taxa because it uses a nonparametric estimation that accounts for datasets skewed towards low-abundance classes, which is typical for microbial community datasets. Shannon diversity index is a commonly used estimate of macroorganism diversity that accounts for group abundance and evenness, but focuses on the diversity of common taxa (Hill 1973).
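The indices discussed above are simple to compute from a site's OTU counts. The sketch below uses the standard formulas: Chao1 as observed richness plus F1²/(2·F2), where F1 and F2 are the numbers of singleton and doubleton OTUs (with the bias-corrected form when no doubletons exist), Shannon H' as −Σ p ln p, and Margalef's index as (S − 1)/ln N. Function names and the example counts are hypothetical, and the exact variants used by QIIME may differ in detail.

```python
import math


def chao1(counts):
    """Chao1 richness estimate from a list of per-OTU counts at one site."""
    present = [c for c in counts if c > 0]
    s_obs = len(present)
    singletons = sum(1 for c in present if c == 1)
    doubletons = sum(1 for c in present if c == 2)
    if doubletons > 0:
        return s_obs + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form, used when there are no doubletons.
    return s_obs + singletons * (singletons - 1) / 2


def shannon(counts):
    """Shannon diversity H' using natural logarithms."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def margalef(counts):
    """Margalef's richness index, (S - 1) / ln(N)."""
    s_obs = sum(1 for c in counts if c > 0)
    return (s_obs - 1) / math.log(sum(counts))


# Hypothetical site: 7 OTUs, three singletons and two doubletons.
site_counts = [10, 5, 2, 1, 1, 1, 2]
print(chao1(site_counts), shannon(site_counts), margalef(site_counts))
```

The example makes the contrast in the text concrete: the three singletons raise Chao1 well above the observed richness of 7, while contributing little to Shannon diversity, which is dominated by the common OTUs.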


2.2.7 Predicted Functional Profiles

To predict functional responses to the mining gradient, we used PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States; http://picrust.github.com; Langille et al. 2013) to generate a functional profile using our 16S rRNA data. We followed the suggested methods for OTU picking with Greengenes 13.5 using Galaxy (http://huttenhower.sph.harvard.edu/galaxy/). Predicted gene family abundances were rarefied to 1078 sequences per site, analyzed at KEGG (Kyoto Encyclopedia of Genes and Genomes) Orthology group levels 2 and 3 and used in correlation analysis (Pearson’s) with percent watershed mined and the PCA axes. The mean nearest sequenced taxon index was lower (0.14±0.002) than that reported for soil communities (0.17±0.02) (Langille et al. 2013).
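Pearson's correlation, used here to relate predicted gene family abundances to percent watershed mined, can be computed as in the following minimal sketch. The function name and the example values are hypothetical and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: percent watershed mined vs. relative abundance (%)
# of one predicted gene family category.
percent_mined = [0, 25, 50, 100]
relative_abundance = [4.0, 3.0, 2.0, 0.0]
r = pearson_r(percent_mined, relative_abundance)
print(r, r ** 2)  # r and the proportion of variance explained
```

Squaring r gives the proportion of variance in one variable that is linearly accounted for by the other, which is a useful check when a correlation coefficient and a percent of variance explained are reported together.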

2.3 Results

2.3.1 Environmental Characterization of Streams

Our data set included sites that were unmined (n=5), Mud River tributaries draining active (n=6) and reclaimed (n=2) mines and sites within the mainstem of Mud River both upstream and downstream of mined tributaries (n=9). All mined sites had higher concentrations of typical AlkMD constituents (SO₄²⁻, Ca²⁺, Mg²⁺, Se and Mn) than unmined sites (Table 1 and Appendix A Fig. 25), leading to a distinct chemical composition in a PCA (Fig. 2). The majority of environmental variables strongly correlated with component 1 are classically associated with AlkMD, with SO₄²⁻, Ca²⁺ and Mg²⁺ all highly correlated with this first axis (Appendix A Table 12). Component 1 also strongly correlated with land cover such that the percentage of watershed mined that drained to a sampling location explained 96% of site variance (r=0.96, P<0.001). We also observed unexpected increases in non-purgeable organic carbon (r=0.79, P<0.001) and mean nitrate (r=0.72, P<0.001) along the component 1, or AlkMD, axis (Appendix A Table 12). Component 2 positively correlated with increasing biofilm Cd, Mn, Ni and Zn concentrations (r=0.6, P<0.01).

Table 1. Average water chemistry values from December 2010, April and May 2011 that differed significantly between sites without AlkMD (unmined, n=6) and sites draining mines (n=19). pH did not differ significantly. Horizontal bar chart displays percent of change from unmined sites.

Site group                Unmined (n=6)           All Mined (n=17)        Moderately Mined (n=8)  Heavily Mined (n=9)
Environmental Variable    Mean        SE          Mean        SE          Mean        SE          Mean        SE
U (μg/L)*                 BDL         --          2.01        0.35        1.84        0.41        2.34        0.63
Li (μg/L)                 0.43        0.02        18.07       2.01        15.72       2.98        21.52       2.79
Se (μg/L)                 0.27        0.07        9.13        1.53        6.43        1.40        12.32       2.57
Mg (mg/L)                 3.18        0.21        88.40       9.10        80.71       12.28       101.04      14.42
Sr (μg/L)                 46.33       4.23        660.14      134.74      472.59      104.50      884.13      250.66
Ca (mg/L)                 9.04        1.43        113.81      13.58       97.09       15.89       136.72      22.29
SO₄²⁻ (mg/L)              46.35       32.64       536.60      52.73       472.48      70.32       629.70      78.27
Ni (μg/L)                 0.69        0.06        6.48        1.98        3.41        0.48        9.74        3.93
Conductivity (μS cm⁻¹)    126.80      17.55       1065.56     95.44       958.18      130.92      1224.93     142.29
TN (mg/L)                 0.84        0.26        3.21        0.56        2.37        0.37        4.19        1.28
NO₃⁻ (mg/L)               1.03        0.36        2.36        0.57        1.60        0.32        3.23        1.12
B (μg/L)                  12.59       1.91        26.28       2.63        24.17       2.69        28.90       5.04
NPOC (mg/L)               2.01        0.23        3.04        0.16        2.88        0.15        3.00        0.23
% Watershed Mined         0.00        0.00        50.40       5.61        39.70       4.80        63.24       9.03
pH                        8.15        0.32        7.80        0.07        7.79        0.14        7.83        0.09
Fe (μg/L)                 176.48      43.28       34.83       5.66        39.44       12.43       27.54       3.46
Si (mg/L)                 3.15        0.13        2.43        0.09        2.48        0.16        2.33        0.10
Zn (μg/L)                 17.11       3.68        10.23       0.94        10.81       1.84        9.82        1.24
V (μg/L)                  0.24        0.02        0.11        0.01        0.12        0.02        0.09        0.02
Cl (mg/L)                 7.43        1.33        2.41        0.38        1.79        0.41        2.99        0.69

* Percent change from unmined sites was 2E+7 % for U.

[Horizontal bar chart of % change from unmined sites; axis ticks at -2000, 0, 2000 and 4000.]


[Figure 2 plot. x-axis: Component 1 (56.8%); y-axis: Component 2 (16.9%). Legend: Unmined, Mainstem Mined, Actively Mined Trib., Reclaimed Mined Trib.]

Figure 2. Principal components analysis (PCA) of selected environmental variables. Component 1 and Component 2 explain 56.8% ± 3.8 and 16.9% ± 2.1 of variance, respectively. Sites with mining split into Group A (heavily mined; symbols outlined) and Group B (moderately mined; symbols not outlined).

Biofilm biomass as DNA per unit surface area ranged from 6 to 274 mg m⁻² (mean=57±12.07 mg m⁻²) and did not differ between mined and unmined sites (log-transformed, Student’s t-test, P=0.25). Biofilm C content ranged from 10 to 127 g C m⁻² (mean=42±6.17 g C m⁻²) and was not different between mined and unmined sites (log-transformed, Student’s t-test, P=0.24).


2.3.2 Sequencing and Taxa Identification

The sequencing run of 16S rRNA amplicons yielded 145 138 raw reads. Filtering and removing nontarget sites left 23 sites with 102 772 sequences. Maximum reads per site was 7555 with mean 3543 (s.e. 393). Final sequence clustering gave 1846 OTUs. Each sample had a mean of 391 OTUs (s.e. 9). Identification of OTUs at different taxonomic levels yielded 304 species, 298 genera, 203 families, 128 orders, 72 classes and 25 phyla.

Raw sequences are available at MG-RAST (accession numbers 4498070.3–4498093.3, http://metagenomics.anl.gov/linkin.cgi?project=1572; Meyer et al. 2008).

Across all sites, Proteobacteria was the dominant phylum (66.7%), followed by

Bacteroidetes (20.8%), Acidobacteria (4.7%) and Actinobacteria (2.0%). All other phyla had abundances of <1%. At the phylum level, <0.02% of reads were unclassified. The most common classes were Alphaproteobacteria (39.0%), Betaproteobacteria (19.3%) and

Sphingobacteria (12.5%). At the class level, 2% of reads were unclassified. The most abundant genera were Flavobacterium (6.7%) and Novosphingobium (4.9%). Of the

OTUs, 1.1% were assigned to known species.

2.3.3 Overall Bacterial Community Structure

We determined compositional differences among the following site categories: unmined, within valley filled tributaries and in Mud River’s mainstem downstream of valley filled tributaries using Bray–Curtis distance and GUniFrac NMDS (Table 2). With the Bray–Curtis distance matrix, mined sites separated into two groups using hierarchical cluster analysis (Ward’s method), leading us to reclassify mined sites into two post hoc groupings: group A and group B (Fig. 3a). Community composition of these two groups differed significantly (permutational multivariate analysis of variance, F2,19=4.61, P<0.001, Table 2). Group A was characterized by higher concentrations of biofilm Ca, Cd, Mn, Ni, Sr, Th and Zn, and water column Ca, Ni, Se, SO₄²⁻ and TN (Student’s t-test, P≤0.05; Fig. 4). Group A sites occurred in stream reaches draining watersheds with 25–96% of watershed area occupied by mines, whereas sites in group B had 16–51% of their watershed mined (Student’s t-test, P=0.03).

Table 2. Overall and pair-wise comparisons of Mud River bacteria community composition analyzed with perMANOVA using Bray-Curtis and GUniFrac distances for a priori groups and Bray-Curtis distance for post hoc groups.


[Figure 3 plots. Panel a): x-axis NMDS Axis 1 (25.9%), y-axis NMDS Axis 2 (26.4%). Panel b): x-axis NMDS Axis 1 (31.0%), y-axis NMDS Axis 2 (21.3%). Labels A, B, PC1 and PC2 appear in the plot area. Legend: Unmined, Mainstem Mined, Actively Mined Trib., Reclaimed Mined Trib.]

Figure 3. Non-metric multidimensional scaling ordination a) using a Bray-Curtis distance matrix and b) using a GUniFrac distance matrix (with α=0.5). Mined sites categorized as Group A (symbols outlined) and Group B (symbols not outlined) resulting from cluster analysis of Bray-Curtis distance NMDS. Distance matrices based on 16S rRNA pyrosequences. R² values in parentheses. Stress: a) 0.192 b) 0.190. Rarefied to 1543 sequences per site.


Figure 4. Water and biofilm chemistry variables that differ significantly between groups A and B shown as proportion of change from average concentration in unmined sites (P≤0.05).

Based on Bray–Curtis distances, which do not incorporate phylogenetic relatedness in community differences, bacteria community composition differed significantly overall (F3,18=1.94, P=0.002) and between sites with and without AlkMD (F3,18=3.18, P<0.001; Fig. 3a and Table 2). There were no significant differences in community composition between streams draining active and reclaimed valley fills (F1,18=1.04, P=0.42). NMDS axes 1 and 2 had similar degrees of explanatory power (25.9% and 26.4%, respectively). Configuration stress was 0.192.

We also used GUniFrac to compare composition across sites. GUniFrac analyses include the phylogenetic relatedness of taxa. Results were similar to the Bray–Curtis analysis, although separation between sites in ordination space was less distinct (F3,18=1.55, P=0.02; Fig. 3b and Table 2). Contrasts between mined and unmined sites showed significant differences in bacterial community composition (F1,18=2.36, P=0.002), but again we found no difference between communities in streams below reclaimed and active valley fills (F1,18=0.72, P=0.75). Axis 1 in the GUniFrac distance NMDS ordination explained the most compositional differences between sites (31.0%), whereas axis 2 explained 21.3% and stress was 0.190.

2.3.4 Bacterial Diversity Along the Mining Gradient

We examined correlations between α-diversity and environmental variables. Component 2, a PCA axis capturing variation in biofilm metals, was the single strongest correlate of richness estimated by multiple α-diversity metrics. Chao1 richness, Margalef’s index and evenness negatively correlated with component 2 (Chao1: P=0.002, r=-0.63; Margalef’s index: P=0.006, r=-0.57; evenness: P=0.006, r=-0.57). Across all sites we did not observe significant correlations between percent of watershed mined and α-diversity metrics (Fig. 5, all P>0.05). However, Chao1 richness estimator and Margalef’s index of only mined sites were significantly negatively correlated with the percent watershed mined (both P=0.004, r=-0.66).


[Figure 5 plots. Panels: Observed Richness, Chao1 Richness Estimator, Shannon Diversity Index. x-axis: Percent Watershed Mined. Key: Group A, Group B, Reference.]

Figure 5. Alpha diversity (Chao1 richness and Shannon diversity index (H’) shown here) across a range of watersheds with different percentages of their area that had been mined (Observed P=0.44, Chao1 P=0.74, Shannon P=0.37). Post hoc designations of sites (Group A, Group B, Reference) indicated in key.

We examined diversity variation among site categories using common biotic indices (Table 3). The α-diversity of bacteria OTUs did not differ between a priori designated site types using any of these indices (Kruskal–Wallis, P>0.05). However, post hoc categories did differ in α-diversity for Chao1 richness, Margalef’s index and evenness (Kruskal–Wallis, P=0.04, P=0.04, P=0.002), which were lower in group A (heavily mined) than group B (moderately mined) (Table 3).


Table 3. The α-diversity using 1543 sequences.

2.3.5 Indicator Taxa and Predicted Functions

We performed indicator species analysis of OTUs, orders and families to determine which taxa reliably indicated particular environmental conditions (Table 4).

Comparing heavily mined, moderately mined and unmined sites, we found 174 OTU-based taxa strongly associated with one of these three groups. Most OTU indicators (n=156) closely associated with unmined sites, only 1 strongly associated with the moderately affected group B sites and 17 associated with the heavily affected group A. Of the OTUs assigned to a taxa identifier, we found 20 orders (of 128 total), 34 families (of 203 total) and 28 genera (of 255 total) that were indicator taxa for one of the three post hoc groups.


Table 4. Taxa identified at the order or family level as indicators using Indicator Taxa Analysis.

Out of all described taxa, percent of watershed mined explained significant linear trends in abundance for 9 of 72 classes, 18 of 128 orders and 12 of 203 families (Appendix A Table 13). Whereas the Acidobacteria-5 and Betaproteobacteria classes, YCC11 and Ellin329 orders, and Bacteriovoracaceae and EB1003 families correlated negatively with percent of watershed mined (all r=0.6), the Acidimicrobiia class, Acidimicrobiales, SBR1032, and Rhodobacterales orders, and Phyllobacteriaceae, Methylophilaceae and Desulfobacteraceae families increased in relative abundance in streams of more heavily mined watersheds (all r=0.5).

Because cross-gradient analyses of taxa are often conducted at coarse levels of taxonomic resolution, we explored genera responses within the two most abundant classes: Betaproteobacteria, which negatively correlated with percent of watershed mined, and Alphaproteobacteria, which did not correlate with percent of watershed mined. For each genus within the class, we assessed abundance patterns across the gradient of percent mining using Quasi-Poisson regression (McCullagh and Nelder 1989). The negative correlation between Betaproteobacteria relative abundance and percent of watershed mined did not hold for all genera within the class (Fig. 6). Whereas 9 genera did show a negative response, 6 responded positively and 18 had no response. In Alphaproteobacteria, which showed no response to mining at the class level, 13 genera responded positively, 11 responded negatively and 28 showed no significant response (Fig. 7). We referenced each responding genus with KEGG modules and Bergey’s Manual (Garrity 2005) to identify energy metabolisms that might respond to AlkMD constituents (Table 5). The majority of sulfur and nitrogen metabolism pathways were shared by both positive and negative responders. However, the only responsive genus reported to include a denitrifier increased with mining, whereas assimilatory sulfate reduction pathways were identified only for genera that responded negatively to mining.


Figure 6. Response of genera within class Betaproteobacteria to percent area of watershed mined using Quasi-Poisson regression.


Figure 7. Response of genera within class Alphaproteobacteria to percent area of watershed mined using Quasi-Poisson regression.


Table 5. Energy metabolism functions for Alpha- and Beta- Proteobacteria that respond to the mining gradient.

In the functional profile predicted with PICRUSt, gene family relative abundances ranged from 11% to <1%. Percent watershed mined correlated negatively with gene families in three level-2 functional categories: ‘Signaling Molecules and Interaction’ (Environmental Information Processing category) (P=0.005, r=-0.59, 4% abundance), ‘Xenobiotics Biodegradation and Metabolism’ (Metabolism category) (P=0.01, r=-0.53, 0.4% abundance) and ‘Transport and Catabolism’ (Cellular Processes category) (P=0.02, r=-0.50, 0.2% abundance).

2.4 Discussion

Despite receiving extensive AlkMD contamination from the largest surface coal mine in Appalachia, microbial communities exposed to exceptionally high levels of


AlkMD constituents in Mud River are no less diverse than nearby reference communities. Diversity in these streams best correlates with a multivariate factor that incorporates elevated biofilm Cd, Mn, Zn and Ni. Contrary to our predictions, overall bacterial diversity was not strongly correlated with the extent of upstream mining, although within mined sites, mining intensity correlated negatively with taxonomic richness. Despite only modest changes in α-diversity, we detected significant compositional differences between microbial communities of unmined and mining-affected streams. These compositional shifts resulted from changes in relative abundance rather than turnover between species at mined and unmined sites, as only 9% of OTUs differed enough between mined and unmined sites to serve as indicator taxa. There was limited evidence that these compositional shifts were driven by responses to nitrate and sulfate availability for use in energy metabolism. Rather, taxa shifts may result from stressors that affect cellular processes and signaling.

Although our data suggest that bacterial community composition shifted with

AlkMD exposure, we do not observe linear trends in α-diversity along the mining gradient because α-diversity varies widely at unexposed sites. However, within mined sites, richness decreased as more watershed area was mined. When an ecosystem undergoes extreme environmental alteration, such as mountaintop mining, we expect organisms favored by the changes to flourish and sensitive taxa to fail. This subsidy–stress response (Odum et al. 1979) can shift community composition across


environmental gradients as well as increase diversity at intermediate exposure levels where sensitive and tolerant (or subsidized) taxa overlap (Niyogi et al. 2007). This is a possible explanation for microbial taxa richness and diversity responses to the AlkMD gradient. It is also consistent with a transplant experiment in the Clark Fork River drainage containing Butte copper mine where Feris et al. (2009) found bacterial taxa richness was greatest at low and moderate levels of metal contamination (As, Cd, Cu, Pb and Zn) and lowest in uncontaminated and highly contaminated sediments.

Many studies reporting microbial community responses to environmental contaminants occur in acid mine drainage systems (Baker and Banfield 2003).

Communities in these very acidic, metallic waters typically are less diverse than bacteria from neutral or alkaline streams (Lear et al. 2009, Kuang et al. 2013) and pH strongly correlates with phylogenetic diversity, richness and UniFrac distance (Kuang et al. 2013).

Water column pH during our study period was not statistically distinct between mined and unmined sites and had no correlation with percent watershed mined, although it spanned nearly two orders of magnitude (6.9–8.6). AlkMD characteristically elevates pH

(Griffith et al. 2012), and previous work at this field site found a strong positive correlation between percent watershed mined and pH (Lindberg et al. 2011). Unlike that earlier study of Mud River, this study occurred during winter/early spring at which time high flows dilute pH effects of mine drainage. In contrast to prior studies (Fierer and

Jackson 2006, Lauber et al. 2009, Rousk et al. 2010, Griffiths et al. 2011), pH was not an


important correlate of composition or diversity metrics during this sampling period. This may be explained by the specific pH values we examined or the limited pH range in our study (2 orders of magnitude) relative to prior studies (≥4 orders of magnitude). Yet, our results also contrast with a study of diversity in streams of Hubbard Brook Experimental Forest in New Hampshire, USA, in which a similar range in pH (4–6.3) was shown to be the best correlate of microbial taxa richness across streams (Fierer et al. 2007). The lack of a pH–diversity correlation in Mud River suggests that other chemical constituents were stronger determinants of bacterial community structure than pH, and this is perhaps unsurprising given the large difference in alkalinity, conductivity and numerous trace elements associated with surface mining.

Community composition differed significantly between unmined sites and sites downstream of surface mines. In mining-affected sites, we detected two distinct post hoc groups associated with different levels of AlkMD exposure. Bacterial communities in moderate AlkMD exposure sites were more diverse than communities with high contaminant exposure. The shift in composition between unmined and mined sites was best explained by elevated AlkMD constituents Ca²⁺, Li, SO₄²⁻, Se and Mg²⁺ and overall ionic strength. Composition differences between high and moderate AlkMD groups were best explained by greater biofilm Cd, Mn and Zn concentrations in the heavily mining-affected sites. Earlier work shows that salinity and trace metals generate significant changes in microbial community composition (Baker and Banfield 2003, Feris


et al. 2003, Lozupone and Knight 2007, Giller et al. 2009, Auguet et al. 2010, Lami et al.

2013). A number of studies have found that Cd and Zn in particular alter microbial community composition (Ganguly and Jana 2002, Sverdrup et al. 2006, Bouskill et al.

2010, Xie et al. 2011).

As previous studies recognize (Lozupone et al. 2007, Kuczynski et al. 2010), analysis methods influence conclusions about composition differences. Although both distance matrices that we used for creating NMDS ordinations yielded the same pattern of differences between site types, they also revealed unique patterns of community composition within site types. GUniFrac incorporates phylogenetic relatedness into the distance matrix by more heavily weighting closely related taxa (Kuczynski et al. 2010, Chen et al. 2012). Because site-type differences in composition were weaker when using a GUniFrac rather than Bray–Curtis distance matrix, shifts in composition may be due to closely related taxa responding quite differently to AlkMD. This point is bolstered by the genera-level analyses that revealed that genera within the same class had varied responses to the mining gradient.

Ultimately, as our knowledge of microbial life history and physiology grows, we hope to map individual microbial traits onto phylogenies. Such knowledge would improve chemical pollution monitoring and predictions of microbial responses to ecosystem degradation. At present, mapping microbial traits to identities is highly limited, obscuring causes or consequences of specific compositional shifts in bacterial


communities. Yet, examining taxa that respond strongly to AlkMD and comparing associated compositional changes with those across other environmental gradients may illuminate important microbial indicator taxa.

Taxonomic data sets are informative, but poorly resolved at genus and species levels (only 1.1% of OTUs were assigned to known species). Thus, it is a challenge to select appropriate taxonomic levels for best understanding microbial responses. At the class level, the strongest AlkMD responders were within Proteobacteria, Acidobacteria and Actinobacteria phyla. Similar to Feris et al. (2003), we found that Betaproteobacteria relative abundance decreased across a contamination gradient. Betaproteobacteria indicator taxa also correlate negatively with river and estuary water organic carbon content (Fortunato et al. 2013), and this may play a role in structuring AlkMD communities as dissolved organic carbon had a strong positive correlation with mining.

In contrast, Feris et al. (2009) also found that in hyporheic sediments sourced from alkaline streams (pH 7.9–8.3), Alpha- and Gammaproteobacteria relative abundance increased with a metal contamination index. Although we used surface biofilms, not hyporheic sediments, we observed no such relationship. In our study, the strongest positive correlation with mining occurred in Actinobacteria that responded linearly rather than in a threshold manner.

After identifying taxa responsive to this gradient, the next step is investigating mechanisms and corresponding ecological implications. We anticipated that two ions


that substantially increased with mining, sulfate and nitrate, would link to changes in taxa composition as these can be used in energy metabolism. Indeed, AlkMD-tolerant taxa included Gamma- and Deltaproteobacteria, Nitrospira, Bacilli and

Sphingobacteria, several of which perform biogeochemical transformations involving nitrogen cycling. These include nitrite oxidation to nitrate by Nitrospirae (Wakelin et al.

2008), methanol oxidation linked with denitrification by Methylophilaceae

(Kalyuzhnaya et al. 2009) and nitrate reduction and aromatic compound degradation by

Rhodocyclales (Hesselsoe et al. 2009). However, KEGG Organism modules revealed similar nitrogen pathways for species in genera that increased and decreased with mining. Moreover, predicted gene family abundances for nitrogen metabolism were not positively correlated with mining. Sulfur metabolism had a similar outcome: predicted sulfur metabolism gene family abundances were not correlated with mining (and sulfate), yet the sulfate-reducing Desulfobacteraceae family increased in relative abundance in streams of more heavily mined watersheds. Because of the regime shift in many AlkMD-associated chemicals, it is likely that no single mechanism is responsible for the taxa patterns we observe. Rather, the multivariate nature of AlkMD is best represented by the percentage of watershed mined. This chemical regime shift may affect cellular processes and signaling as the predicted functional profile suggests. At the ecosystem level, it is possible that these effects could alter energy requirements, thus influencing carbon use efficiency and carbon cycling if energy is shunted toward cellular


processes and away from growth. Nonetheless, it seems that the majority of functional categories are predicted to be redundant between bacterial communities spanning the mining gradient.

In conclusion, our study shows that stream biofilm bacterial composition in the

Mud River system significantly differed between sites receiving AlkMD and unexposed sites. Average taxonomic richness in sites receiving moderate levels of AlkMD constituents exceeded that for unexposed or heavily exposed sites, creating a nonlinear relationship between exposure and diversity. At most taxonomic levels, few taxa were statistically dissimilar enough between exposure categories to indicate habitat specialization. The small number of strongly responding taxa and the disparity in compositional similarity between GUniFrac and Bray–Curtis ordinations suggest that community shifts occur through families, genera and species rather than further up the hierarchy. Such results contrast with macrofaunal responses to AlkMD exposure, where entire orders of aquatic insects are lost from heavily AlkMD-affected streams (Pond et al. 2008,

Pond 2010, 2012). Testing microbial community functional responses is the next logical step toward understanding ecologically relevant links between compositional shifts and the strong chemical gradient AlkMD establishes in Central Appalachian streams.


3. Subsidies or Stressors? How Alkaline Mine Drainage Shapes Microbial Communities

3.1 Introduction

Communities of microorganisms in soils and sediments are collectively responsible for driving Earth’s biogeochemical cycles and consequently providing a variety of ecosystem services including the detoxification of pollutants and decomposition of wastes (Kremen 2005). While these communities are hyperdiverse and generally poorly described, access to their genomic information should enable us to link environmental contamination with microbial responses, including population changes of taxa and key protein-encoding genes used for relevant ecosystem functions.

Previous work in this area has shown that microbial communities can respond rapidly to environmental calamities that drastically alter abiotic variables. One such example is the Deepwater Horizon oil spill, which led to a significant increase in abundances of taxa capable of degrading substantial quantities of aliphatic and aromatic hydrocarbon molecules (reviewed by Kimes et al. 2014). Population changes also co-occur in environments with chronic contamination such as heavy metals (Hemme et al.

2010, Yin et al. 2015) and agriculture-sourced nitrate pollution (Xu et al. 2014, Kim et al.

2015). These examples support the idea that microbial community members and activities reflect changes in their environment, demonstrate the capacity to adapt to these perturbations, and consequently also affect abiotic conditions. There are exceptions to this pattern, however, for example in saltwater intrusion experiments

where community composition did not change (Edmonds et al. 2008, Edmonds et al.

2009).

Ecologists often assume that high abundance and genetic diversity are associated with greater resilience to ecosystem disturbance or pollution, thus avoiding a shift to a new or amended collection of dominant functions. This is due in part to microbes’ extensive array of capabilities for manipulating stressor compounds to prevent cellular damage (Welch 1993), such as altering oxidation states (Williams and Silver 1984), operating efflux pumps (Marquez 2005, Sun et al. 2014), and protein maintenance and repair (Visick and Clarke 1995). Recent studies of heavily polluted environments, however, have begun to question this assumption and show that microbial communities subjected to strong chemical alterations may in fact be unable to perform certain key functions previously thought to be universal among microbes. Examples include microbial communities exposed to acid mine drainage (Baker and Banfield 2003, Baker et al. 2009) and human gut microbiomes exposed to high doses of xenobiotics (Maurice et al. 2013).

The concept of resilience largely focuses on the ability of communities to recover from the stressors introduced by an environmental change. Yet, numerous cases of change also provide a battery of subsidies, or stressors that operate as subsidies. In response to alterations of substrate supply, microbial functions can shift in thermodynamically and kinetically predictable ways. For heterotrophic metabolism,


success in microbial competition for DOC is ordered by energy yield from thermodynamic expectations where denitrification, followed by sulfate reduction, then methane oxidation occurs. This has been observed in stream-soil interfaces and groundwater flow paths (Hedin et al. 1998). While some metabolic approaches yield more energy than others, the concentration of substrates (electron donors (DOC) and acceptors (NO3- and SO42-)) also influences the profitability of using a particular energy metabolism. Thus, in a simplistic view, an increase in one substrate may lead to a detectable change in the associated microbial function.

Contamination resulting from land-use change affects multiple environmental variables, thus creating a highly complex suite of abiotic stressors and subsidies. Given the extensive genetic diversity of many microbial communities, we anticipate that in a subsidy-rich contaminant source, the role of subsidies will best explain the system’s potential functional capability such that, instead of finding a resilient or resistant system, we will see evidence of a functional shift.

To pursue this question, we conducted a metagenomic survey and quantified specific genes with known function using qPCR to explore the microbiome of aquatic sediments in Mud River, WV, a river exposed to alkaline mine drainage (AlkMD) produced by Hobet Mine, the largest surface coal mine in Appalachia (Lindberg et al.

2011). AlkMD is generated from pyrite mineral dissolution in coal residues exposed to air during the mining process. In Central Appalachian surface mines, these minerals


release sulfuric acid within a matrix of carbonate bedrock that neutralizes the acid and releases coal-derived sulfate ions (SO42-) accompanied by elevated concentrations of calcium, magnesium, and bicarbonate ions (Ca2+, Mg2+, HCO3–) (Bernhardt and Palmer

2011). AlkMD is characterized by elevated pH, alkalinity, and ionic strength relative to receiving streams, as well as manganese (Mn) and selenium (Se) levels that frequently exceed toxicity standards (Palmer et al. 2010, Lindberg et al. 2011). Oxidized forms of Se, which dominate at alkaline pH, can be toxic to microbial cells (Kramer and Ames 1988,

Yee 2011), but can also serve as a terminal electron acceptor (Oremland et al. 1989,

Oremland et al. 1999). Bier et al. (2015b) have documented significant shifts in eubacteria taxonomic composition along this chemical gradient using pyrosequencing of 16S rRNA genes.

Our objective for this study was to evaluate patterns of microbial taxa and functional genes along a gradient of alkaline mine drainage that elevates compounds serving as microbial subsidies, stressors, or both. Given the increase in sulfate, nitrate, and selenium with alkaline mine drainage in Mud River, WV, we anticipated that microbial genes that use these substrates for energy metabolism (Table 6) would increase in relative abundance corresponding to the subsidy gradient. For organisms responding to mining-associated stressors (e.g., metals and osmotic stress, Table 7), we expected a negative correlation between sensitive taxa abundances and a positive


correlation with relative abundance of stress-associated genes across the AlkMD gradient.

Table 6. Summary of hypotheses and predictions for relationships between microbial communities and neutral/alkaline coal mine drainage.

Hypothesis H1: Microbial community composition reflects environmental subsidy and stressor variables.
    Prediction P1a: Taxa composition differs between exposed and unexposed communities.
    Prediction P1b: Association between subsidized or stressed taxa and the concentration of the specific compound.

Hypothesis H2: Abundance of microbial functional genes will reflect the availability of subsidies and stressors related to the gene product.
    Prediction P2a: Shift in functional gene composition of exposed and unexposed communities.
    Prediction P2b: Shift in functional pathway composition of exposed and unexposed communities.
    Prediction P2c: Positive association between functional gene abundance or pathway abundance and the related subsidy or stressor.

Table 7. Expectations of subsidized and stressed taxa, functions, and genes.

              Taxa                 Functions (genes)
Subsidized:   Sulfate reducers     Sulfate reduction (dsrA, dsrB, aprA)
              Denitrifiers         Denitrification (nirK, nirS, nosZ)
              Selenate reducers    Selenate reduction (serA)
Stressed:     Nitrifiers           Nitrification (amoA, norB)
              Methanogens          Methanogenesis (mcrA)
              Methanotrophs        Methanotrophy (pmoA)
                                   Osmotic regulation (opu)

3.2 Methods

3.2.1 Study Site

Samples were gathered from tributaries and mainstem sites in Mud River, which is located in the Lower Guyandotte watershed of Southwestern West Virginia, U.S.A.


and receives a gradient of mountaintop coal mining effluent as determined previously

(Lindberg et al. 2011, Bier et al. 2015b) (Fig. 8). Mud River contains two main forks:

Upper Mud River begins upstream of the Hobet Mine Complex and passes through the

Complex while Left Fork Mud River drains a similar, but unmined watershed. Left Fork

Mud River contains low-density housing, but is unaffected by mountaintop mining.

[Figure 8: map of sample sites. Legend: percent of watershed mined (0, 25, 50, 75, 100); site type (mined, unmined). Scale bar in kilometers (0, 2.5, 5, 10).]

Figure 8. Sample sites on Mud River and Left Fork Mud River in Boone and Lincoln Counties, West Virginia (WV). Hydrologic unit codes (HUCs) 12-050701020301 and 12-050701020104 are outlined. Point color and size reflect mining; point size is scaled to the percent of the area draining to the sampling point that was mined. Gray tributary streams run through mined areas; black tributaries are


unmined. The mainstem of Mud River and Left Fork Mud River shown in bold black from headwaters to confluence. Arrow shows flow direction. Inset of US mid-Atlantic states shows Appalachian Coalfield Region as gray-shaded area with relative location of study site in WV in red (not to scale).

3.2.2 Sample Collection

We gathered water samples and stream microbial biofilms from 15 sites on

November 14-15, 2013. Sites were selected based on previous water quality and mining impact assessments that partitioned locations into five unmined sites, five moderately impacted mined sites, and five heavily impacted mined sites (Bier et al. 2015b). The five unmined sites included two sites upstream of the Hobet Mine Complex on the mainstem of Mud River, and three sites in the unmined Left Fork of Mud River. Moderately impacted mined sites included one mined tributary and four mainstem sites in Mud

River within the Hobet Mine Complex and drained surface areas with 16-50% mining.

Heavily impacted mined sites had four mined tributaries and one mainstem site within the mine complex and drained surface areas with 40-90% mining.

3.2.3 Water Chemistry

Water samples were collected and analyzed from each site in mid-November,

2013 for temperature, pH, and chemistry as described in Lindberg et al. (2011). Briefly, we measured in-stream conductivity and pH and collected grab samples of stream water for analysis of major anions, total carbon, total nitrogen, and a suite of metals. Water sampling followed USGS protocols (USGS, variously dated).


3.2.4 Microbial Biofilm

At each site, we selected five sub-sites and filled two sterile, 50 mL centrifuge tubes with 10 mL microbial biofilm material from each sub-site collected to a depth of 0.5 cm. Microbial biofilm material was loosely structured depositional flocculent. Material in each centrifuge tube was mixed thoroughly to homogenize and 2 mL aliquots were transferred to each of 20 cryovials. Ten cryovials designated for nucleic acid extraction were flash frozen in liquid nitrogen and stored on dry ice during transport to the lab, where they were immediately transferred to a -80 °C freezer and stored until nucleic acid extraction. The surplus homogenized microbial material was transported to the lab on ice and stored in a 4 °C cold room prior to analysis for biofilm chemistry variables.

Specifically, we measured biofilm metals (acid digestion) and ash-free dry mass analysis for conversion to carbon content (as detailed in Bier et al. 2015b) and substrate-induced respiration (SIR) for biomass estimation and nitrification potential (see below).

3.2.5 Nucleic Acid Preparation and Sequencing

Nucleic acids were extracted in duplicate from 5 g flash-frozen microbial biofilm using the Mo-Bio Alternative PowerMax DNA Isolation kit with the Protocol for

DNA from Low Biomass Soil with Low Humics (MoBio, Carlsbad, CA, USA). This modified protocol is available from Mo-Bio. Extracted nucleic acids were treated with

RNase A enzyme (RNase A, MoBio, Carlsbad, CA, USA) to yield DNA-only samples.


Libraries were prepared for metagenomic analysis from 12 of the 15 DNA samples using Illumina TruSeq DNA-Seq kit (Kapa Hyper Prep Kit (average default insert size 200bp)). We sequenced 12 samples rather than all 15 to increase the sequencing depth per sample. Libraries were sequenced at the

Sequencing and Genomic Technologies Shared Resource at Duke University (Durham,

NC, USA) using Illumina HiSeq 2000/2500 with a 100bp paired-end Rapid Run using a full flowcell. The DNA sequences are available at MG-RAST (accession numbers:

4589537.3 to 4589548.3).

3.2.6 Bioinformatics Analysis

The Illumina HiSeq run yielded 320 727 676 raw clusters that passed filtering with 1.8% reads having undetermined barcodes. The average quality score was 34.4 for

Lane 1 and 33.6 for Lane 2. Best hit classification in MG-RAST v3.6 (Meyer et al. 2008) was used to annotate raw, unassembled reads for organism samples with M5RNA (non- redundant multi-source ribosomal RNA annotation database), max e value cutoff 1e-5, min percent identity cutoff 60%, min alignment length cutoff 15. Reads were dereplicated (Gomez-Alvarez et al. 2009) and low-quality sequences were removed using a modified DynamicTrim (Cox et al. 2010) with 15 as the lowest phred score for designating high-quality bases and <6 bases per sequence with phred score below 15.

After removing 797 singletons, the combined metagenomes contained 1 776 Bacteria OTUs and 11 Archaea OTUs, which were rarefied to an abundance of 930 for each site. To select


930 individuals per site for taxonomic analysis, each OTU’s abundance at a site was divided by the total sequence count for that site, and then multiplied by 930 to retain the relative abundance of that OTU out of 930 individuals. The resulting decimals were floored and remaining individuals needed for the site to contain 930 sequences were selected using the distribution of OTUs at each site (Beevers 2006).
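
The scaling step described above can be sketched in Python. This is a minimal illustration, not the code used in this study: the function name `scale_to_depth`, the example counts, and the rule for allocating leftover individuals (a random draw weighted by each OTU's original abundance) are assumptions, since the text states only that the remainder was selected using the distribution of OTUs at each site.

```python
import random


def scale_to_depth(counts, depth=930, seed=0):
    """Scale one site's OTU counts to a fixed total of `depth` individuals."""
    total = sum(counts.values())
    # Proportional share of each OTU out of `depth`, floored.
    # Integer arithmetic gives floor(n / total * depth) without rounding error.
    result = {otu: n * depth // total for otu, n in counts.items()}
    # Individuals still needed for the site to contain `depth` sequences.
    shortfall = depth - sum(result.values())
    # ASSUMPTION: fill the shortfall by random draws weighted by the
    # original OTU abundances at the site.
    rng = random.Random(seed)
    otus = list(counts)
    weights = [counts[o] for o in otus]
    for otu in rng.choices(otus, weights=weights, k=shortfall):
        result[otu] += 1
    return result


# Hypothetical counts for a single site.
site = {"OTU_a": 5200, "OTU_b": 3100, "OTU_c": 1700, "OTU_d": 7}
rarefied = scale_to_depth(site)
```

Every site processed this way ends with exactly 930 individuals, and rare OTUs whose scaled abundance floors to zero are retained only if they are drawn in the remainder step.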

To identify functional genes and pathways, we fed unassembled reads to the

HUMAnN2 (HMP Unified Metabolic Analysis Network) pipeline

(http://huttenhower.sph.harvard.edu/humann2), which includes Bowtie 2 for nucleotide-level pangenome mapping to the functionally annotated species pangenomes

(ChocoPhlAn). UniRef50 (Suzek et al. 2015) was used for protein mapping and Diamond was used for the translated alignment. MetaCyc pathways were identified using

MinPath. Functional gene abundances were assigned as reads per kilobase to normalize by gene length. Pathway abundance reflects the number of complete pathway copies. To obtain evenness across sites, functional genes and pathways were converted to copies per million.
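
The two normalizations named above (reads per kilobase, then copies per million) amount to the following arithmetic. This is a hedged sketch for illustration only: the function names, gene labels, read counts, and gene lengths are hypothetical, and it does not reproduce the pipeline's internal handling of multi-mapped reads or alignment weighting.

```python
def reads_per_kilobase(read_counts, gene_lengths_bp):
    """Normalize read counts by gene length expressed in kilobases."""
    return {g: read_counts[g] / (gene_lengths_bp[g] / 1000.0) for g in read_counts}


def copies_per_million(rpk):
    """Rescale length-normalized abundances so each sample sums to one million."""
    total = sum(rpk.values())
    return {g: v / total * 1e6 for g, v in rpk.items()}


# Hypothetical sample: mapped reads and gene lengths in base pairs.
reads = {"geneA": 200, "geneB": 50, "geneC": 300}
lengths = {"geneA": 1000, "geneB": 500, "geneC": 3000}

rpk = reads_per_kilobase(reads, lengths)   # geneA 200, geneB 100, geneC 100
cpm = copies_per_million(rpk)              # geneA 500000, geneB 250000, geneC 250000
```

Although geneC has the most reads, its length-normalized abundance equals that of geneB, which is the reason for dividing by gene length before comparing genes within a sample.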

3.2.7 Relative Abundance of Protein Encoding Genes Using qPCR

We determined the absolute abundance of a targeted set of functional genes to assess the microbial community’s capacity for sulfur, nitrogen, selenium, and carbon metabolism using quantitative PCR (qPCR) (Appendix B Table 14). Standards and samples were normalized and quantified with Nanodrop. Standard curves were


generated using amplicons from specified taxa (16S, aprA, dsrB) or gBlocks (Integrated

DNA Technologies, Inc., IA, USA) of genomic sequences from the specified organisms

(Appendix B Table 14). qPCR reactions were made using iTaq Universal SYBR® Green

Supermix (Bio-Rad Laboratories, Inc., CA, USA). Samples were quantified in 25 µL reactions using a LightCycler® 96 Real-Time PCR system (Roche Diagnostics

Corporation, IN, USA). All samples were run in triplicate. Negative controls were run in each plate. Specific qPCR programs were followed as previously published (Appendix B

Table 14). Amplification efficiencies, melting curve Tm value, and R2 of the standard curve for each qPCR assay are reported in Appendix B. Using manufacturer-provided software, Ct values were calculated. Gene quantifications were tested for normality and log transformed.
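
For readers unfamiliar with how a standard curve yields amplification efficiency, R2, and gene copy numbers, the calculation can be sketched as follows. The standard concentrations and Ct values are invented for illustration, and the function names are hypothetical; the instrument software used in this study may apply additional corrections.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept, with R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical ten-fold dilution series: log10(gene copies) and measured Ct.
std_log10_copies = [3.0, 4.0, 5.0, 6.0, 7.0]
std_ct = [30.0, 26.68, 23.36, 20.04, 16.72]

slope, intercept, r2 = fit_line(std_log10_copies, std_ct)

# Amplification efficiency from the slope; 1.0 corresponds to perfect doubling.
efficiency = 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct):
    """Invert the standard curve to estimate gene copies in a sample."""
    return 10 ** ((ct - intercept) / slope)


sample_copies = copies_from_ct(23.36)  # about 1e5 copies for this made-up curve
```

A slope near -3.32 corresponds to an efficiency near 100%, which is why both the slope-derived efficiency and R2 are reported for each assay.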

3.2.8 Activity Assays

Activity assays were conducted in triplicate within 11 days of sample collection.

3.2.8.1 Substrate-Induced Respiration (SIR)

Active heterotrophic microbial biomass and potential activity were estimated using substrate-induced respiration, in which autolyzed yeast extract was added to samples and CO2 evolution was measured over 4 hours. Briefly, 5 g microbial biofilm wet mass and 10 mL autolyzed yeast extract were added to 20 mL borosilicate glass vials and capped with 0.125 cm Teflon-silicone septa. Vials were incubated in dark conditions at room temperature. Headspace CO2 was measured at 0, 2, and 4 h intervals using manual


headspace injections into a LI-COR 6262 CO2 gas analyzer (LI-COR, Lincoln, Nebraska,

USA). The protocol followed Fierer et al. (2003), as modified from West and Sparling (1986), but used 40 mL amber borosilicate glass vials with 0.125 cm Teflon-silicone septa.

3.2.8.2 Nitrification Potential

Microbial biofilms were assessed for nitrification potential using a chlorate inhibition assay (Belser and Mays 1980) on 4 g wet mass. NO2- was measured at 2.5 h intervals over a 6.5 h duration using a colorimetric approach on a spectrophotometer with absorbance at 543 nm.

3.2.9 Statistical Analyses

We characterized site environments and composition of microbial taxa, KEGG orthogroups, and MetaCyc pathways using multivariate statistical techniques. To examine site environments, we used principal components analysis (PCA) of log transformed water and biofilm chemistry variables. We assessed DNA-based microbial community taxa, functional groups, and functional pathways using nonmetric multidimensional scaling (NMDS) ordination based on a Bray–Curtis distance matrix (Bray and Curtis

1957). Community taxonomic composition of Bacteria and Archaea was assessed at the

OTU level. Functional groups were assigned from KEGG orthogroups and examined at the Level D KEGG orthology number. Data were square root transformed and then standardized using Wisconsin double standardization. The significance of composition differences between mined and unmined sites was analyzed using perMANOVA with 999 permutations.
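
The transformation and distance steps described above can be illustrated with a short Python sketch. It assumes a sites-by-taxa abundance matrix and the usual definition of Wisconsin double standardization (divide each taxon by its maximum, then each site by its total). The function names and example matrix are hypothetical, and the ordination and perMANOVA steps are not reproduced here.

```python
import math


def sqrt_wisconsin(matrix):
    """Square-root transform, then Wisconsin double standardization.

    `matrix` is a list of rows (sites), each a list of taxon abundances.
    """
    rooted = [[math.sqrt(v) for v in row] for row in matrix]
    # Step 1: divide each column (taxon) by its maximum across sites.
    col_max = [max(col) for col in zip(*rooted)]
    by_col = [[v / m if m > 0 else 0.0 for v, m in zip(row, col_max)]
              for row in rooted]
    # Step 2: divide each row (site) by its total.
    out = []
    for row in by_col:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two sites."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den > 0 else 0.0


# Hypothetical abundance matrix: three sites by three taxa.
abund = [[4, 0, 16],
         [4, 9, 0],
         [4, 0, 16]]

std = sqrt_wisconsin(abund)
d01 = bray_curtis(std[0], std[1])  # 0.5 for this example
d02 = bray_curtis(std[0], std[2])  # 0.0, identical sites
```

The double standardization gives rare and abundant taxa comparable weight, so the resulting dissimilarities are not dominated by the few most abundant groups.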


3.2.9.1 Analysis of Microbial Indicators and Positive or Negative Responders

To identify functional genes (KEGG orthogroups), pathways (MetaCyc), and known OTUs (M5NR) that associated with mined or unmined sites, we ran indicator species analysis (Dufrêne and Legendre 1997) using R package indicspecies version 1.7.1. The analysis was run with 999 permutations.
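
As a rough guide to what an indicator value measures, the sketch below computes the original Dufrêne and Legendre index (specificity multiplied by fidelity) for a single taxon and estimates a permutation P value. It is written in Python for illustration and is an assumption-laden simplification: the function names and data are hypothetical, and the indicspecies package offers several variants of the statistic, so values from this sketch may not match the package output.

```python
import random


def indval(abundances, groups):
    """Indicator value per group for one taxon, scaled from 0 to 1."""
    by_group = {}
    for a, g in zip(abundances, groups):
        by_group.setdefault(g, []).append(a)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    total = sum(means.values())
    out = {}
    for g, v in by_group.items():
        # Specificity: share of the taxon's mean abundance found in this group.
        specificity = means[g] / total if total > 0 else 0.0
        # Fidelity: fraction of sites in this group where the taxon occurs.
        fidelity = sum(1 for a in v if a > 0) / len(v)
        out[g] = specificity * fidelity
    return out


def perm_pvalue(abundances, groups, target, n_perm=999, seed=1):
    """Permutation P value for the indicator value of `target`."""
    rng = random.Random(seed)
    observed = indval(abundances, groups)[target]
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign site labels at random
        if indval(abundances, labels)[target] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical taxon across four mined and three unmined sites.
abund = [8, 6, 10, 0, 0, 2, 0]
groups = ["mined"] * 4 + ["unmined"] * 3

iv = indval(abund, groups)               # mined: 0.9 * 0.75 = 0.675
p = perm_pvalue(abund, groups, "mined")
```

A taxon scores highly only when it is both concentrated in one site category and present at most sites within it, which is why few taxa qualify as indicators.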

To determine differences between relative abundance of taxa, functional genes, and pathways occurring at mined and unmined site categories, we used Mann–Whitney U nonparametric tests. We used linear models to examine relationships between substrates or stressors with relative gene abundance from metagenomes or with absolute gene abundance from qPCR.
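
The rank-based statistic behind the Mann–Whitney U test can be computed as in the following Python sketch. The data and function names are hypothetical, and only the U statistics are calculated; obtaining a P value requires the exact or normal-approximation distribution of U, which a statistics package would supply.

```python
def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y) from the pooled ranks of both samples."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical relative abundances of one gene at unmined and mined sites.
unmined = [1, 2, 3]
mined = [4, 5, 6, 7]
u_unmined, u_mined = mann_whitney_u(unmined, mined)  # (0.0, 12.0)
```

Because the test uses ranks only, it makes no assumption that relative abundances are normally distributed, which suits the small and unequal group sizes in this study.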

3.3 Results

3.3.1 Characteristics of Stream Water and Microbial Biofilm Chemistry

We characterized environmental variables from 15 sites: 5 with no alkaline mine drainage (unmined) and 10 receiving AlkMD from catchments with ~15-90% of their surface area in active or reclaimed surface mines and valley fills (mined). AlkMD comprised a mixture of major ions and trace elements. To create a composite score of these constituents, and to understand additional non-AlkMD related variation in water chemistry, we used PCA. Variables were separated into stream water and biofilm chemistry groups to examine differences in how well water column (Fig. 9a) or biofilm mass constituents (Fig. 9b) related to microbial community composition. Similar


to previous work (Lindberg et al. 2011, Bier et al. 2015b), we measured large increases in a suite of chemical solutes associated with AlkMD between unmined and mined sites.

The variability in both stream water column chemistry and microbial biofilm chemistry was best explained by the Component 1 PCA axis (water column, 58.6%; biofilm, 44.4%).

Component 1 of both water and biofilm chemistry showed a strong positive correlation with the percent of the watershed that had been mined (water: r=0.86, P<0.001; biofilm: r=0.91, P<0.001). For the water chemistry PCA, conductivity and other dissolved constituents commonly associated with alkaline mine drainage (Se, Ca, NO3-, and SO42-) were strongly positively correlated with Component 1 (all r>0.95 and P<0.001, Appendix

B Table 15). We thus consider the water chemistry PCA Component 1 axis as our composite measure of water column AlkMD pollution (hereafter Dissolved AlkMD).

Similarly, Component 1 of the biofilm chemistry PCA had strong positive correlations with the major AlkMD constituents Se, Mg, and Ca, and strong negative correlations with Cr, As, and Fe (all |r|>0.8 and P<0.001, Appendix B Table 16); hereafter we refer to this biofilm chemistry Component 1 score as Biofilm AlkMD. Component 2 of the dissolved solute PCA was strongly positively correlated with Al and Zn (all r>0.8 and

P<0.001, Appendix B Table 15). Component 2 of biofilm chemistry was most strongly correlated with Co (r=0.85, P<0.001).


[Figure 9 panels: a) Environmental Variables - Water, Component 1 (58.6%, AlkMD dissolved constituents) versus Component 2 (10.3%); b) Environmental Variables - Biofilm, Component 1 (44.4%, biofilm constituents) versus Component 2 (20.5%). Legend: percent watershed mined (0, 25, 50, 75); site type (mined, unmined).]

Figure 9. Principal components analysis of selected environmental variables. Environmental variables separated into a) water chemistry and b) biofilm chemistry. Water chemistry component 1 and component 2 explain 58.6±3.8% and 10.3±1.6% variance, respectively. Biofilm chemistry component 1 and component 2 explain 44.4±3.4% and 20.5±2.3% variance, respectively. Sites colored by mining and scaled to percent of the watershed mined. See Appendix B Tables 15 & 16 for specific variable loadings and correlations.

The organic matter content of biofilm samples ranged from 0.8-2.3% of dry mass

(mean=1.4%, s.e. 0.001). Biofilm microbial heterotrophic biomass, measured as substrate induced respiration, was significantly higher at mined sites (399.1 µg C-CO2/g sediment

C/h (s.e. 29)) than unmined sites (248.9 µg C-CO2/g sediment C/h (s.e. 53)) (Student’s t- test, P=0.045). Heterotrophic biomass per mass of DNA was also higher at mined sites

(mined mean 1.09 µg C-CO2/ µg DNA/h s.e. 0.13, unmined mean 0.63 µg C-CO2/ µg

DNA/h, s.e. 0.09, Student’s t-test, P=0.01). The DNA content of samples was highly variable across sites but did not differ consistently between site categories (mean=211 µg

DNA g OM-1 s.e. 16).


3.3.2 Characteristics of Taxa, Functional Genes, and Pathway Composition

Identification of Bacteria and Archaea OTUs at different taxonomic levels resulted in a total of 11 384 OTUs when clustered with 97% similarity. The final rarefied composition of taxa was dominated by Bacteria, with Archaea comprising less than 1% of OTUs with similar compositions at mined and unmined sites (Fig. 10a,c). Unclassified

Bacteria comprised 23% of OTUs. Over half of all identified OTUs were in Phylum

Proteobacteria (56-58%) followed by Bacteroidetes (15-16%) and Verrucomicrobia (10-

11%). Actinobacteria, Cyanobacteria, Firmicutes, Acidobacteria, and Planctomycetes ranged between 1-5% relative abundance. All other phyla had relative abundances of less than 1%. The most common classes were Betaproteobacteria (21.4%),

Alphaproteobacteria (18.7%), and Gammaproteobacteria (9.7%).


[Figure 10 panels: a) Mined taxa, n=7 sites; b) Indicator taxa, mined, n=6 taxa; c) Unmined taxa, n=5 sites; d) Indicator taxa, unmined, n=18 taxa. Phyla shown: Acidobacteria, Actinobacteria, Bacteroidetes, Cyanobacteria, Firmicutes, Planctomycetes, Proteobacteria, Thaumarchaeota, Verrucomicrobia.]

Figure 10. Relative abundances of phyla of Bacteria and Archaea with >1% relative abundance from a) mined and c) unmined sites. Relative abundances of indicator taxa for b) mined and d) unmined sites. See Appendix B Table 17 for phyla with <1% relative abundance.

Functional gene hierarchies using KEGG orthogroups resulted in 9 412 functional gene groups (Fig. 11a,c). Relative abundances of both Level A and Level B functional categories were similar between mined and unmined sites. The majority of functional genes occurred in the Level A metabolism category (72%), followed by genetic information processing at 5-6%, and environmental information processing at 4%.

Cellular processes comprised only 1.5% of functional gene abundance. A large proportion of functional genes (16-17%) were uncategorized. The major functional gene


categories at the next level, Level B, were carbohydrate metabolism (20-21%), amino acid metabolism (18%) and energy metabolism (12%).

[Figure 11 panels: a) Mined functional genes, n=7 sites; b) Mined indicator KO, n=10 KO; c) Unmined functional genes, n=5 sites; d) Unmined indicator KO, n=244 KO. Segments carry abbreviated category labels.]

Figure 11. Relative abundances of KEGG orthogroups with >1% relative abundance at a) mined and c) unmined sites. Proportions of indicator KO groups for b) mined and d) unmined sites. See Appendix B Table 18 for KO groups with <1% relative abundance.

Major top level pathways identified with MetaCyc were similar at mined and unmined sites and comprised biosynthesis (61-62%), degradation/utilization/assimilation (22%) and generation of precursor metabolites and energy (15%) (Fig. 12a,b). In the second tier categories, pathways most commonly occurred in biosynthesis of nucleosides and nucleotides (14-15%), amino acids (12-13%), and carbohydrates (8%).

[Figure 12 panels: a) Mined pathways, n=7 sites; b) Unmined pathways, n=5 sites. Segment labels: Biosynthesis; Degradation/Utilization/Assimilation; Generation of Precursor Metabolites and Energy; Detoxification; Activation/Inactivation/Interconversion; Superpathway. Indicator pathways: mined, n=0 pathways; unmined, n=1 pathway (amino acid degradation).]

Figure 12. Relative abundances of MetaCyc pathways at a) mined and b) unmined sites. Only one indicator pathway identified.

Shannon diversity of pathways was lower at mined sites (Table 8, Student’s t-test, P=0.045). Alpha diversity of taxa and functional gene groups, and pathway richness, did not differ between mined and unmined sites and had no linear relationship with the percent of watershed mined (Table 8, Student’s t-test, linear regression, all P>0.05). Alpha diversity was based on actual and estimated richness (Chao1 richness estimator), as well as Shannon Index. Rarefaction curves also showed no difference across sites (Appendix B Fig. 27).
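As an illustration of the three alpha diversity metrics named above, the following Python sketch computes observed richness, a Chao1 estimate, and the Shannon Index from a single vector of counts. It is not the analysis code used in this study. The function names are hypothetical, the bias-corrected form of Chao1 is an assumption (the text does not state which form was used), and the natural logarithm is assumed for the Shannon Index.

```python
import math


def observed_richness(counts):
    """Number of OTUs (or gene groups) with at least one observation."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton and doubleton counts.

    Assumption: the bias-corrected form is used here because it stays
    defined when no doubletons are present.
    """
    s_obs = observed_richness(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts):
    """Shannon Index H' = -sum(p_i * ln p_i) over observed groups."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical count vector for one site (not data from this study).
site_counts = [10, 5, 1, 1, 2, 0]
print(observed_richness(site_counts), chao1(site_counts), shannon(site_counts))
```

Site-level values computed this way could then be compared between site types with a t-test, as in Table 8.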

Table 8. Alpha diversity metrics at unmined (n=5) and mined (n=7) sites for taxa, KEGG orthogroups, and MetaCyc pathways.

Group                         Metric     Unmined (n=5)   Mined (n=7)
Taxa (Bacteria and Archaea)   Richness   262.20          289.71
                              S.E.       5.22            13.48
                              Chao1      370.46          445.69
                              S.E.       17.64           41.04
                              Shannon    4.52            4.66
                              S.E.       0.05            0.05
KEGG Orthogroups              Richness   7057.60         6298.14
                              S.E.       125.89          351.26
                              Chao1      7074.53         6301.79
                              S.E.       126.56          351.24
                              Shannon    7.43            7.42
                              S.E.       0.01            0.01
MetaCyc Pathways              Richness   771.00          761.71
                              S.E.       17.73           13.74
                              Chao1      771.00          761.71
                              S.E.       17.73           13.74
                              Shannon    5.53            5.51
                              S.E.       0.005           0.003

The composition of microbial taxa and of functional genes both differed significantly between mined and unmined sites, but these compositional shifts did not lead to a change in MetaCyc metabolic pathways (Fig. 13a,b,c). In ordination space, NMDS of Bray-Curtis distance matrices showed distinct separation between mined and unmined sites for Bacteria and Archaea OTUs (F1,10=1.4, P<0.001) as well as for functional genes at the most refined level of KEGG orthogroups (Level D) (F1,10=1.67, P=0.014). Pathway abundance and composition did not shift with mining (F1,10=0.86, P=0.58).
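The site-type contrast reported here (permutational MANOVA on Bray-Curtis distances, per the Figure 13 caption) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this study: the function names, the number of permutations, the random seed, and the toy abundance table are all assumptions, and the pseudo-F follows the standard one-factor sums-of-squares partition on a distance matrix.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def pseudo_f(dist, labels):
    """One-factor pseudo-F computed from squared pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    # Undefined if every within-group distance is zero.
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return (pseudo-F, permutation P value) for a grouping of sites."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy abundance table (hypothetical): 7 mined and 5 unmined sites.
mined = [[20 + i, 5, 1] for i in range(7)]
unmined = [[1, 5, 20 + i] for i in range(5)]
labels = ["mined"] * 7 + ["unmined"] * 5
f_stat, p_value = permanova(distance_matrix(mined + unmined), labels)
print(f_stat, p_value)
```

With 12 sites split 7 and 5, the degrees of freedom are 1 and 10, matching the F1,10 notation used in the text.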

[Figure 13 panels: NMDS1 versus NMDS2 ordinations with points coded by site type (mined, unmined) and by percent watershed mined. a) Stress = 0.16, F1,10 = 1.4, P < 0.001; b) Stress = 0.04, F1,10 = 1.67, P = 0.014; c) Stress = 0.07, F1,10 = 0.80, P = 0.73.]

Figure 13. Composition of mined and unmined site a) taxa, b) KEGG orthogroups, and c) MetaCyc pathways using Nonmetric Multidimensional Scaling of Bray-Curtis distance matrices. Site categories compared using permutational MANOVA.

3.3.3 Environmental Correlates of Variation in Composition

Major differences between mined and unmined sites were along Axis 1 of the NMDS, an axis that was highly correlated with the gradient in percent of watershed mined as well as with the composite chemistry scores for both Dissolved AlkMD and Biofilm PCA axis 2 (Appendix B Figs. 28 & 29). Dissolved AlkMD composite scores explained the largest proportion of the variation in microbial community composition across NMDS Axis 1, with increasing values in increasingly mined sites (R²=0.75, P=0.003). There was significant variation along Axis 2 in our ordination that was unrelated to the mining gradient but was correlated with the second composite axis scores for both water chemistry (Al and Zn, R²=0.54, P=0.043) and biofilm chemistry (Co, R²=0.50, P=0.046).

Results for the composition of functional genes were also significantly related to AlkMD dissolved constituents and the dissolved Al-Zn axis, but not related to biofilm chemistry. AlkMD dissolved constituent scores increased towards the mined sites while the Al-Zn axis scores increased towards the unmined sites (PC1: R²=0.49, P=0.033; PC2: R²=0.54, P=0.019). Microbial pathway assemblages were not related to environmental vectors.

3.3.4 Shifts in Microbial Characteristics with Mining

We assessed site type differences in microbial taxa, functional genes, and pathways using two approaches: mean relative abundance comparisons with nonparametric Mann Whitney U tests and Indicator Species Analysis.

To determine microbial community characteristics that positively and negatively responded to mining, we conducted Mann-Whitney non-parametric tests between groups of taxa, functional genes, and pathways at mined and unmined sites using a significance level of 0.05 to identify responders. Roughly two-thirds of the taxa and functional genes and roughly half of pathways that responded to the mining gradient decreased in relative abundance at mined sites (Figs. 14-16, Appendix B Tables 20-25). Only 2.7% of Bacteria and Archaea OTUs were responders (i.e., significantly different in the frequency with which they occurred in either mined or unmined sites) (27 out of 987 OTUs). However, 63% of the responders were less abundant at mined sites, while 37% of responders had a greater relative abundance at mined sites. This suggests that mining-derived chemical change is enriching one-third of the taxa while selecting against two-thirds of the taxa that respond to altered chemistry. Functional genes showed a similar pattern in which only 9.6% (901 of 9412) of the total community of functional gene groups differed between mined and unmined sites, yet 64% of the responders were negatively influenced by mining. Functional pathways had a nearly equal proportion of negative and positive responses, with 55% of the 29 significantly responding pathways (out of 586 pathways) decreasing at mined sites.
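The responder screen described above can be sketched as follows. This is an illustrative example only, not the code used in this study: the function names and toy abundances are hypothetical, and the P value is obtained by exact enumeration of all group assignments, which is an assumption that is feasible here because the design has only 7 mined and 5 unmined sites (792 possible assignments). Statistical software may instead use a normal approximation or treat ties differently.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U for x: pairs where x exceeds y, with ties counted as 0.5."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


def mann_whitney_exact(x, y):
    """Two-sided exact P value by enumerating every assignment of sites to groups."""
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    mean_u = n_x * n_y / 2
    observed = abs(u_statistic(x, y) - mean_u)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(group_x, group_y) - mean_u) >= observed:
            extreme += 1
    return u_statistic(x, y), extreme / total


def find_responders(mined, unmined, alpha=0.05):
    """Label each significant feature as a positive or negative responder to mining."""
    responders = {}
    for feature in mined:
        _, p_value = mann_whitney_exact(mined[feature], unmined[feature])
        if p_value < alpha:
            mean_mined = sum(mined[feature]) / len(mined[feature])
            mean_unmined = sum(unmined[feature]) / len(unmined[feature])
            responders[feature] = "positive" if mean_mined > mean_unmined else "negative"
    return responders


# Hypothetical relative abundances for two features at 7 mined and 5 unmined sites.
mined = {"otu_a": [9, 10, 11, 12, 13, 14, 15], "otu_b": [1, 2, 3, 4, 5, 6, 7]}
unmined = {"otu_a": [1, 2, 3, 4, 5], "otu_b": [3, 4, 5, 2, 6]}
print(find_responders(mined, unmined))
```

Counting the "negative" and "positive" labels returned for a full table gives proportions comparable to the two-thirds and one-third split reported above.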

[Figure 14 panels: horizontal bar charts of relative abundance (x-axis) by OTU ID (y-axis), with bars for mined and unmined site types; asterisks mark indicator taxa.]

Figure 14. OTUs that differed significantly between mined and unmined sites (Mann Whitney U tests). Negative responders in left panel and positive responders in right panel. Indicator taxa noted by asterisks. See Appendix B Table 20 for taxonomic identification. Error bars represent standard error.

[Figure 15 panels: horizontal bar charts of copies per million (x-axis) by KEGG orthogroup ID (y-axis), with bars for mined and unmined site types; asterisks mark indicator groups.]

Figure 15. KEGG orthogroups that differed significantly between mined and unmined sites (Mann Whitney U tests). Only orthogroups with the 10% largest effect size (Hedges’s g) shown. Negative responders in left panel and positive responders in right panel. Indicator groups noted by asterisks. See Appendix B Tables 21-24 for KO identification. Error bars represent standard error.

[Figure 16 panels: horizontal bar charts of copies per million (x-axis) by pathway ID (y-axis), with bars for mined and unmined site types.]

Figure 16. MetaCyc pathways that differed significantly between mined and unmined sites (Mann Whitney U tests). Negative responders in left panel and positive responders in right panel. No indicator group present. See Appendix B Table 25 for pathway identification. Error bars represent standard error.

We used indicator analysis to identify microbial characteristics with high fidelity and exclusivity to mined or unmined sites. Indicators were identified using significance level <0.05 and indicator statistic > 0.70. Groups indicative of unmined sites occurred more commonly than mined site indicators (Figs. 10c,d; 11c,d; 12). Of the significant indicator taxa identified, 75% (18 of 24) were indicators of unmined sites. This trend was stronger for functional genes, where 96% (244 of 254) of functional genes were unmined site indicators. Only one pathway was a significant indicator and it was found more abundantly in unmined sites. Unmined indicator taxa reflected the overall relative phyla abundances while mined indicators were heavily skewed towards Archaea, which overall represented <1% of OTUs, but comprised one-third of mining indicator taxa. Functional genes indicative of unmined sites were frequently unclassified, but of the categorized high-level groups, indicators were largely proportional to the entire dataset. Two categories of unmined functional gene indicators that did shift were signal transduction, which was higher than the proportion in the entire community, and amino acid metabolism, which was lower. Functional genes indicating mining were higher in the signal transduction category. The single indicator pathway was involved in amino acid degradation.
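To make the idea of fidelity and exclusivity concrete, the sketch below computes one common form of indicator statistic for a single feature. The text does not state which statistic or software was used, so the formula here is an assumption: the square root of specificity (the share of group-mean abundance found in the target group) times fidelity (the fraction of target-group sites where the feature is present). The function name and toy values are hypothetical, and the permutation test normally used to assign significance is omitted.

```python
import math


def indicator_value(target, other):
    """Indicator statistic for a feature in the target group versus the other group.

    Assumed form: sqrt(specificity * fidelity), bounded between 0 and 1.
    """
    mean_target = sum(target) / len(target)
    mean_other = sum(other) / len(other)
    if mean_target + mean_other == 0:
        return 0.0
    # Specificity (exclusivity): share of mean abundance found in the target group.
    specificity = mean_target / (mean_target + mean_other)
    # Fidelity: fraction of target-group sites where the feature occurs.
    fidelity = sum(1 for value in target if value > 0) / len(target)
    return math.sqrt(specificity * fidelity)


# Hypothetical abundances of one OTU at 5 unmined and 7 mined sites.
unmined_sites = [4, 4, 4, 4, 0]
mined_sites = [1, 1, 1, 1, 1, 1, 1]
print(indicator_value(unmined_sites, mined_sites))
```

Under this assumed form, a feature found at every site of one group and at no site of the other scores 1.0, and the 0.70 threshold used above would retain only features that are both largely exclusive to, and widespread within, one site type.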

3.3.5 Relative Responses of Subsidy-stress Genes

We examined relative gene abundances for a suite of functional gene categories expected to be strongly influenced by the AlkMD gradient. Both nitrate and sulfate are important electron acceptors and increase with mining (NO₃⁻ increased by 3.68 mg/L and SO₄²⁻ by 959 mg/L between the unmined and most heavily mined sites). In contrast to our expectations, we saw no change in relative gene abundances of nitrogen and sulfur metabolism categories. There was, however, a lower total abundance of genes involved in methane metabolism at mined sites (Mann Whitney U test, P=0.002, Fig. 17). We also anticipated greater metabolism of selenium (which increased 12.34 µg/L between unmined and heavily mined sites) and an osmotic stress response to elevated stream water conductivity (1534 µS/cm increase at heavily mined sites). Yet, relative gene abundances within these categories differed little between mined and unmined sites (all P>0.05, Fig. 17).

Using linear regressions, we also examined genes within these coarse categories for relationships between the relative abundance of each KEGG orthogroup and the substrate or stressor we anticipated would affect that relative abundance. About 20% of methane and sulfur metabolism genes had significant linear relationships with the gradient of NPOC or SO₄²⁻ (Fig. 17, Appendix B Table 26). Unexpectedly, the majority of these relationships for both methane metabolism and sulfur metabolism were negative (73% for methane, 76% for sulfur). Only 10% of nitrogen metabolism genes changed across the dissolved NO₃⁻ gradient and were evenly divided among positive and negative slopes. The few selenium metabolism genes identified were all for selenate reductases and had no relationship with water column or biofilm selenium (P>0.05). We examined linear regressions between stressors and genes regulating osmotic stress and methane metabolism (Fig. 17, Appendix B Table 27). Only one osmoprotectant gene responded and had a negative relationship with water column conductivity (R²=0.43). We examined sulfate as a potential stressor for methane metabolism as it has been shown to inhibit methanogenesis (Oremland and Polcin 1982). Over half of methane metabolism genes had a significant relationship with dissolved SO₄²⁻ and 86% of those relationships were negative.
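The per-gene regressions summarized above amount to fitting an ordinary least-squares line for each gene against its predictor and recording the sign of the slope. A minimal Python sketch of that step follows. It is not the code used in this study: the function name and toy values are hypothetical, and the significance test for the slope is omitted, so the sketch only shows how the slope, intercept, and R² are obtained.

```python
def linear_regression(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, R squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical predictor (e.g., sulfate in mg/L) and gene relative abundance
# (copies per million) at five sites; not data from this study.
sulfate = [10, 200, 400, 700, 950]
gene_abundance = [120, 110, 90, 70, 40]
slope, intercept, r_squared = linear_regression(sulfate, gene_abundance)
direction = "negative" if slope < 0 else "positive"
print(slope, intercept, r_squared, direction)
```

Tallying the direction across all genes with significant fits in a category gives the positive and negative slope counts reported with Figure 17.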

[Figure 17 panels: gene copy relative abundance (copies per million) for the methane, nitrogen, sulfur, selenium, and stress categories, with mined site points scaled by percent watershed mined.]

Category            Methane   Nitrogen   Sulfur   Selenium   Stress         Methane
Predictor           NPOC      NO3        SO4      Se         Conductivity   SO4
Total Tested        67        58         124      3          12             67
Regression P<0.05   15        6          25       0          1              35
Positive Slope      4         3          6        NA         0              5
Negative Slope      11        3          19       NA         1              30

Figure 17. Graphs of gene copy relative abundance within each metabolism or stress category. Unmined sites combined to show average with standard error bars. Mined sites show individual site points sized by percent watershed mined. Table indicates regression between each gene in the category and the predictor listed. All R² > 0.3. See Appendix B Tables 26 & 27 for individual regression statistics.

3.3.6 Quantified Gene Abundances Across the AlkMD Gradient

Although there was no difference in relative abundance of selenium metabolism genes, absolute selenate reductase (serA) gene abundance increased with both biofilm selenium (Fig. 18a) and water column selenium (Fig. 18b). The serA gene was not included among the genes within the selenium metabolism category returned from the metagenomics analysis.

[Figure 18 panels: a) log10 serA (copies/g DW) versus biofilm selenium (μg/g OM): y = 0.836 + 0.019x, R² = 0.56, p = 0.001. b) log10 serA (copies/g DW) versus dissolved selenium (μg/L): y = 0.931 + 0.076x, R² = 0.65, p < 0.001.]

Figure 18. Linear regressions of absolute abundance of selenate reductase gene, serA, with a) biofilm selenium concentration and b) dissolved selenium concentration.

No other functional genes, including nirK, dsrB, aprA, mcrA, and pmoA, had abundances with significant linear relationships with either the percent of area mined or the substrate concentration (P > 0.05). To examine the potential for sulfate suppression of methanogenesis, we also tested the abundance of methane metabolism genes (pmoA and mcrA) against water column dissolved SO₄²⁻ concentration and found no relationship.

3.3.7 Nitrification Potential Assay

Nitrification potential had no relationship with the percent of watershed mined or dissolved NO₃⁻ (linear regression, P>0.05). Mean nitrification potential was 9.6 µg NO₂-N/g OM/hr (s.e. 1.11).

3.4 Discussion

The composition of microbial communities of Mud River was altered by alkaline mine drainage (AlkMD) delivered from upstream mountaintop removal coal mines. These shifts in community composition were accompanied by shifts in the frequency distribution of functional gene abundances. Though these differences are statistically significant, compositional changes did not appear to be sufficiently strong to lead to major losses of potential microbial functional capacity, as the composition of microbial functional pathways that were measured did not change along this strong chemical gradient. We saw no evidence that AlkMD associated subsidies of important electron acceptors (SO4) or limiting nutrients (NO3) were leading to an enrichment of microbial taxa or functional genes associated with using these resources. Though we anticipated that more osmotic or oxidative stress along the AlkMD gradient might lead to fewer intolerant taxa, lower taxa richness or a greater abundance of functional genes associated with these stress responses, we found no evidence in support of these

expectations. Community and functional gene abundance responses to the mining gradient provided subtler indications that AlkMD is shifting community membership and potential activity. Though the majority of individual taxa and functional genes did not change in relative abundance along the gradient, of those that did, two-thirds of strongly responsive taxa and functional gene groups were negatively associated with mining. Gene abundances related to two specific metabolisms did shift along the mining gradient: the relative abundance of methane metabolism genes, which was lower at mined sites, and the absolute quantity of a selenate reductase gene, which increased with dissolved and biofilm selenium concentrations.

Environmental parameters have previously informed our expectations for community structure, as highlighted by Whittaker’s ‘gradient analysis’ curves, which showed gradual changes in the abundance of individual plant taxa along gradients (Whittaker 1956). These shifts have been primarily related to climatic factors (Whittaker et al. 2001).

In aquatic environments, the variables that most influence microbial diversity are often metals, organic matter, and temperature (reviewed by Zeglin 2015). Compared with other environments, microbial community composition in streams is highly variable

(Portillo et al. 2012, Shade et al. 2013) and our ability to connect those environmental variables to the wealth of data generated about microbial taxa and functional genes remains a challenge. Here, we identified shifts in the composition of taxa and functional genes that were partially explained by stream water and biofilm chemistry though there

were not strong relationships between specific variables expected to influence particular taxa and functional genes from an energetics perspective. While this result questions the predictive power of coarse measurements of microbial properties at the landscape scale, in other aquatic ecosystems, DNA data has provided an accurate fingerprint of both current and prior environmental conditions. For example, Smith et al. (2015) used machine-learning models of 16S rRNA genes to predict uranium and nitrate concentrations of contaminated groundwater with 88% and 73% accuracy, respectively.

Although a collection of environmental variables strongly correlated with the main water chemistry axis, our predictions that elevated nitrate and sulfate in particular would influence functional gene abundance were guided by metabolic expectations.

Free energy yield from dissimilatory nitrate reduction is higher than that from sulfate reduction. Yet even with this increase in substrate supply, nitrogen metabolism gene abundance did not differ between mined and unmined sites in general. One gene used in denitrification, nosZ, did show a strong positive correlation with dissolved nitrate, yet it was the only denitrification gene with a significant relationship. An increase in nitrate-reducers has been documented in response to large inputs of nitrate to soils around oil field pipelines (105 g/L) and sulfidogenic wastewater (0.62 g/L) (Telang et al. 1997, Mohanakrishnan et al. 2011).

Similar to our analysis of 16S rRNA gene amplicons (Bier et al. 2015b), we did not find any evidence that sulfate reducing bacteria (SRB) genera and sulfate reduction genes were greater at sites with mining, and in fact some taxa capable of sulfate reduction had negative associations with AlkMD. Losses of SRB in nitrate-amended sulfidogenic wastewater were genus-specific, with reductions in Desulfobacter and Desulfobulbus abundances while Desulfovibrio and Desulfomicrobium were unaffected (Mohanakrishnan et al. 2011). These genera did not overlap with any of the 11% of taxa capable of sulfur metabolism that responded negatively to AlkMD in the current study. Other mine drainage studies in which sulfate increased without a pH decrease have also seldom detected strong SRB responses. In drainage ditches at a copper mine site with elevated sulfur, few known sulfur-cycling taxa were significantly correlated with sulfur content (Pereira et al. 2014, 2015). Lindsay et al. (2009b) also identified highly variable populations of neutrophilic SRB in sulfidic mine tailing sediment cores at neutral/alkaline pH that corresponded with local hydrogeochemical conditions.

Regarding habitat preferences, pH and oxygen status of the immediate environment are known to affect nitrate and sulfate reduction. Many SRB are neutrophilic (Widdel 1992) and denitrifying bacteria communities are shown to have compromised growth at acid pHs (Brenzinger et al. 2015) suggesting that our study sites provide preferred growth conditions with respect to pH. In soils with fertilizer additions, which can reduce pH, denitrification gene abundances decreased despite the prevalence of substrate (Wallenstein et al. 2006). Oxygen gradients also influence dissimilatory metabolisms. The majority of anaerobic reduction processes occur under

suboxic or anoxic reducing conditions due to differences in energy yield as well as oxygen toxicity.

Despite the numerous genes targeted with qPCR, only one energy-yielding process gene was positively correlated with coarse substrate abundance: selenate reductase had a significant, positive relationship with selenium. Although we measured total selenium concentrations, rather than selenate/selenite, WVDEP (2009) found that 95% of dissolved selenium in Upper Mud River below Hobet mining complex was in the form of selenate. Selenate and selenite were also found to comprise ~40% of selenium in composite insect samples collected from these sites (Arnold et al. 2014). Our observation that selenium concentration of biofilm and water was strongly correlated with selenate reductase gene abundance relied on primers for the membrane-bound selenate reductase gene serA, which specifically target selenate-reducing bacteria (SeRB) while avoiding denitrifying bacteria that can be captured with less-specific primers (Wen et al. 2016). Selenate can be reduced using selenate reductase (SerABC) (Schroeder et al. 1997) or nitrate reductase (Nar or Nap) (Sabaty et al. 2001, Gates et al. 2011), though the latter approach may contribute little to selenate reduction (Sabaty et al. 2001). Phylogenetic analysis of the serA primers identified organisms most closely related to two genera: Dechloromonas sp. and Thauera selenatis (Wen et al. 2016).

Because nitrate and sulfate terminal electron acceptors are positively associated with mining, we had hypothesized that methane metabolism could be negatively affected at mined sites due to energetic suppression. The free-energy yield from methanogenesis is lower than that gained from sulfate reduction, suggesting that methanogen heterotrophs could be outcompeted, as has been observed when acetate or hydrogen substrates are used (Oremland and Polcin 1982) or with sulfate additions (Raskin et al. 1996). Although NPOC increased across the mining gradient, we saw little evidence that it provided substrates to support a greater population of methane metabolism genes, as 73% of the 15 significant relationships between methane metabolism genes and NPOC concentration were negative. An even greater proportion of the 35 significant relationships between methane metabolism genes and sulfate concentration were negative (86%).

AlkMD increased a number of potential stressors such as metals and conductivity. We saw little evidence of a greater proportion of osmotic stress response genes, though 50% of the taxa that increased in relative abundance at mined sites are described from high salinity environments. Only one osmoprotectant gene (opuA), which is known to synthesize glycine betaine (Kempf and Bremer 1995), responded to AlkMD and the response was negative. This gene also had a significant negative relationship with conductivity, which explained 48% of the gene relative abundance. This may suggest a change away from osmoprotectant strategies that synthesize molecules rather than take up solutes from the surrounding environment, though there were no relationships for genes known to regulate uptake. Metal toxicity from Ni, Cd, Cu and less so from Zn, Cr, and Pb has been described for SRB (Hao et al. 1994), though we saw little evidence of metal toxicity genes.

We hypothesized that taxa and processes with narrow phylogenetic guilds and previously known to be sensitive to stressors would decrease with AlkMD. Nitrifiers or nitrification potential did not seem to be sensitive to AlkMD, though nitrifiers are often thought to be intolerant of metals (Tchobanoglous et al. 2003, Kapoor et al. 2015) and sensitive to osmotic stress (Jin et al. 2007). Two archaeal ammonia oxidizers

(Nitrosopumilus spp.) were positively associated with AlkMD, yet they are taxa that have previously been isolated from high salinity environments (Konneke et al. 2005,

Matsutani et al. 2011).

Despite the shifts in relative abundances of taxa and functional genes in exposed communities, relative pathway abundance was unaffected. While not all functional genes comprise pathways, the similarity between sites suggests that the functional potential of many pathways remains stable. As multiple functional genes can be used in part of a pathway, this could also suggest a portfolio effect (Doak et al. 1998) or functional redundancy (Allison and Martiny 2008), buffering a community from environmental perturbation due to its genetic diversity (Schimel 1995).

While we approached this study with a set of hypotheses and predictions, this work also generated post hoc hypotheses that deserve further exploration in this system.

In particular, below we discuss factors related to microsite heterogeneity, community

assembly, carbon requirements, and calcium effects that may have influenced the relationships between environmental variables and taxa or functional gene abundances.

Discrepancies between taxa and measured environmental variables in part may stem from microsite habitat variability. Smith et al. (2015) found variable success in the types of communities that provided predictive insight: free-living bacteria were stronger predictors of uranium than particle-associated communities, but there was no difference between the two lifestyles for predicting nitrate concentrations. In our study, water chemistry was more closely correlated with the proportion of mined watershed than biofilm chemistry and explained more variability in the community shifts. Previously, water chemistry has separated the composition of mined and unmined site biofilms while biofilm chemistry was the dominant factor in differentiating composition between sites with high and intermediate degrees of mining (Bier et al. 2015b). A large proportion of biofilm communities may derive from bacterioplankton inputs by free-living and suspended particle communities that in headwater river networks are known to be heavily dominated by terrestrial taxa (Ruiz-González et al. 2015) and highly diverse

(Besemer et al. 2013). Although taxa within biofilms are not typically reflective of pelagic taxa (Besemer et al. 2012), the biofilm depositional communities we sampled are likely to have greater inputs of particle-associated taxa from the water column. While these inputs could create a stronger connection between biofilm community composition and water chemistry due to greater exposure of bacterioplankton cells to water column

parameters prior to arrival in the biofilm community, they are likely to confound associations between resident biofilm taxa and subsidies.

The variable response of sulfate reducer abundance and sulfate reduction genes could be dependent on heterotrophic carbon requirements. In field experiments of carbon additions to mine tailings, sulfate reduction and abundance of sulfate reducing bacteria increased with 0.6 wt. % organic C additions (Lindsay et al. 2009a, Lindsay et al. 2011). Indeed, sulfate reduction bioremediation strategies routinely involve carbon additions to stimulate heterotrophic reactions (Doshi 2006). As sulfate reduction requires simple carbon compounds (or H2), degradation of complex carbon substrates is an additional factor to consider in the growth requirements of SRB (Logan et al. 2005). Despite elevated concentrations of NPOC at the AlkMD sites, biofilms had low carbon content (1-2.3 dry wt. %). Heterotrophic respiration of biofilm was elevated at mined sites, which is similar to findings for urban streams, which may also have reduced carbon use efficiency (Sudduth 2011). Using substrate induced respiration per mass of biofilm DNA does not directly translate to carbon use efficiency, but our findings that SIR per mass of community DNA was elevated in mined streams suggests the need to examine carbon use efficiency in sites exposed to AlkMD. This could be due to greater physiological stress from conditions at the mined sites or a higher proportion of CO2-producing heterotrophs in the entire community.

Influxes of calcium, which was strongly associated with AlkMD, may have influenced response patterns of both taxa and functional gene groups. Over half of the negative responders to AlkMD were in Proteobacteria, which has previously been negatively correlated with calcium content in neutral copper mine soil (Pereira et al. 2014). The major functional gene categories serving as indicators of mined sites were within the “Signal transduction” category (67% of KO indicator abundance), which included calcium signaling. Both mined and unmined sites had functional indicators involved in the calcium signaling pathway. Further, twice as many functional genes comprising the calcium signaling pathway responded positively to AlkMD (n=6). However, the entire calcium signaling pathway did not differ between site types. These results should be considered with caution as calcium binding is confirmed in non-eukaryotes, but signaling pathways are still under study (Dominguez et al. 2015).

Ultimately, as the research community seeks to define the connections between environmental perturbations and the resulting microbial community composition, we should use caution in assuming that subsidy availability will translate into increased abundances of taxa we anticipate to benefit from these changes. While there has been a positive relationship between subsidies and subsidized taxa for many instances of environmental contamination, this work encourages us to consider both thermodynamic and ecological principles to interpret shifts in community composition.

4. Linking Microbial Community Structure and Microbial Processes: An Empirical and Conceptual Overview

4.1 Introduction

Microorganisms dominate Earth’s by virtue of both their numbers and metabolic capabilities (Falkowski et al. 2008). Modern-day molecular technologies allow us to identify the myriad microbes that exist, to identify the genes they carry, and even to determine whether those genes are being transcribed and translated into functional proteins. What remains an open question is whether all of this information will enable us to better understand, predict and model the ecosystem processes that microbes perform (e.g., Carney and Matson 2005, van der Heijden et al. 2008, Todd-Brown et al. 2011, Wallenstein and Hall 2011, Petersen et al. 2012, Graham et al. 2014).

Microbial and ecosystem ecologists approach the question above with both optimism and caution. On one hand, our capacity to extract, amplify and assess microbial nucleic acids and proteins from environmental samples is staggering and is improving rapidly; we can evaluate the community composition of microbes present within nearly any environmental sample. Yet, this technological progress has repeatedly demonstrated that the phylogenetic identities and metabolic capabilities of microbes within any environmental sample are far more diverse than we had imagined (Prosser 2012) and the variety of metabolic states (growing, active, dormant, deceased) means that ‘who is present’ is not a proxy for ‘who is active’ (Jones and Lennon 2010, Lennon and Jones 2011, Blagodatskaya and Kuzyakov 2013, Blazewicz et al. 2013). Perhaps even more importantly, we are increasingly aware that the presence or abundance of particular organisms, genes or gene transcripts may not be well connected to the rates at which the associated biochemical reactions are occurring (Schimel and Schaeffer 2012, Rocca et al. 2015).

Despite these challenges, many recent influential papers and reports have called for incorporating information about microbial communities into assessments of ecosystem functions and improvements of ecosystem models (e.g. Moorhead and Sinsabaugh 2006, Konopka 2009, Allison 2012, Bouskill et al. 2012, Wieder et al. 2013). Some reports distinctly acknowledge that microbial communities temper the influence of natural and anthropogenic disturbances on ecosystem functioning (e.g. Krause et al. 2014). Others suggest that we continue exploring how to use microbes for improving mechanistic predictions of ecosystem processes so that in cases where information about microbial communities ‘is’ relevant, we will have more accurate predictions (e.g. McGuire and Treseder 2010). A substantial increase in the ease and affordability of acquiring and analyzing microbial community data has spawned significant efforts to study structure-process connections (e.g. Prosser 2012), and we can now examine these connections across spatial, temporal and taxonomic scales (e.g. Walters and Knight 2014). Yet, some researchers continue to challenge the generality of many studies and encourage us to determine where genetic and ecosystem studies overlap (e.g. Fuhrman 2009). Is information resulting from our numerous structure-process studies consistently filling a knowledge gap, or is there little return for our investment (Graham et al. 2014)?

While some studies have identified empirical links between microbial communities and ecosystem processes (defined in Appendix C), this body of literature is also replete with studies where structure and process appear uncoupled. Such uncoupling could occur when the ultimate rate-limiting step is abiotic, such as desorption of clay-bound organics or the breakup of aggregates and the release of labile materials; in such cases, the composition of the decomposer community would be unlikely to visibly influence the rate at which the materials are processed (Schimel and Schaeffer 2012). However, even when the rate-limiting steps are biotic, the lack of a relationship might be due to factors including microbial dormancy (Jones and Lennon 2010, Lennon and Jones 2011), horizontal gene transfer (Smets and Barkay 2005), functional redundancy (Allison and Martiny 2008), priority effects (Fukami et al. 2005), and neutral assembly processes (Nemergut et al. 2013, Nemergut et al. 2014). When links do occur, the success of identifying them is also likely dependent on the conditions and techniques used in each study (e.g. Shade et al. 2012a), or the time-scale over which measurements occur. Yet, it is not clear how often and with which techniques researchers have identified explicit links between microbial community structure and process, and examining the differences between such studies could guide the direction of future studies.

In this paper, we seek to describe the state of recent efforts to characterize microbial community structure and function relationships. We focused this literature synthesis on manipulative experiments because such studies may offer the best opportunity to establish a link between changes in both microbial identity and microbial processes in response to a known (and controlled) experimental driver. We evaluate the frequency with which authors of recent publications (2009–13) have simultaneously investigated microbial community structure and microbial process responses to an experimental manipulation and the time that lapsed before they detected changes in structure, process or both structure and process. To guide our evaluation, we focus on five questions: (1) How frequently do publications report that an experimental manipulation leads to changes in either microbial community structure or microbially mediated ecosystem processes? (2) How often do researchers measure simultaneous changes in both microbial community structure and process? (3) Are particular experimental conditions or techniques more often associated with links between structure and process? (4) Do structure and process respond to disturbance at different rates? (5) How are researchers attempting to evaluate inferential or empirical links between measures of microbial community structure and process?

4.2 Methods

We synthesized recent literature that contained experimental manipulations of environmental factors to induce stress on microbial communities. We excluded field-based observational studies, such as environmental gradients, out of concern that relationships observed between microbial community structure and ecosystem processes within such studies might result from unobserved drivers (not associated with the gradient of interest) or reverse relationships where ecosystem function affects community composition (as discussed by Krause et al. 2014). By contrast, experimental manipulations allow researchers to determine whether and how structure and process metrics respond to a well-constrained change in the environment.

We used a set of structural terms and a set of process terms to search the ISI Web of Science literature database for papers published between 2009 and 2013 (Appendix C Fig. 30). We required that papers include at least one of the process terms and one of the structural terms. To achieve this, we searched for papers containing processes where Topic = ‘decomp OR methan OR sulfate red OR denitrif OR dnf OR nitrif’ and structures where Topic = ‘commun OR gene OR physiolog’. The processes indicated by these terms are commonly explored in the ecological literature and the results from this search yielded more papers than a search for ‘funct’ alone. These structure-search terms were selected to return experiments involving microbial communities rather than culture isolates. The output from the structure and process search terms in the ISI Database yielded 199 749 papers. We refined this search by Topic = ‘microb’ to exclude papers focusing exclusively on macroorganisms; this narrowed the total to 32 386 papers (Appendix C Fig. 30). We then restricted these results to ‘Environmental Sciences Ecology’ as the Research Area and used ‘Topic = ecology’. From this output, we selected four of the top five journals with the greatest number of paper results (FEMS Microbiology Ecology, Soil Biology & Biochemistry, The ISME Journal and Microbial Ecology; we excluded the fifth, PLOS One, as a general journal), two general ecology journals (Ecology and Ecology Letters) and two major full-spectrum journals (Nature and Science). Following a review of this list of journals by experts in the field as part of the Powell Center working group, we added a leading general aquatic science journal (Limnology and Oceanography). Limiting our search to this subset of journals reduced our results to 1189 papers.

We examined the abstract of each paper for the following criteria: (1) the study was experimental, (2) at least one process and one structural metric were measured simultaneously and (3) the study altered at least one chemical or physical condition. We included papers that manipulated biological conditions such as tree girdling if a chemical or physical change to the environment was documented. In total, we obtained 148 papers (12.4% of the original 1189) that comprised the ‘full dataset’ used in this synthesis. Datasets are defined in Appendix C.

For each of these 148 papers, we recorded the type of manipulation, the test location (laboratory or field), the duration of the experiment, whether or not immigration or emigration was possible based on open or closed experimental units, and which groups of organisms were examined (microeukaryotes, archaea and/or bacteria). We recorded each experimental treatment-process combination separately, such that each paper could have multiple ‘incidences’ if multiple processes were measured or if multiple treatments were applied within a single study. For example, a research project that used two treatments (e.g. elevated temperature or a fertilizer addition) and measured both N2O flux and CO2 flux in each treatment plot would result in four separate incidences: one for N2O flux in the elevated temperature plots, one for CO2 flux in those same elevated temperature plots, and one each for the N2O flux and CO2 flux in the fertilizer plots. Each of these incidences could contain multiple structure metrics if more than one aspect of the community was measured (e.g. 16S rRNA and nirK genes). For each process measured, we denoted whether the authors measured ambient or potential (i.e. rate measured with substrate enrichment) microbial processes, as well as the technique used to assess microbial community structure and the type of metric reported: relative abundance, absolute abundance (per gram of soil) or presence-absence.
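
The incidence bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to build the dataset: the treatment and process labels come from the example in the text, and the function name `enumerate_incidences` is hypothetical.

```python
from itertools import product

def enumerate_incidences(treatments, processes):
    """Return one (treatment, process) incidence per combination."""
    return list(product(treatments, processes))

# Hypothetical study mirroring the worked example in the text.
treatments = ["elevated temperature", "fertilizer addition"]
processes = ["N2O flux", "CO2 flux"]

incidences = enumerate_incidences(treatments, processes)
for treatment, process in incidences:
    print(f"{process} under {treatment}")
```

Two treatments crossed with two processes give the four incidences described in the example; each incidence could then carry one or more structure metrics.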

To examine the connections between structural and process measures, we investigated whether or not each experimental treatment resulted in (1) no change in either structure or process metrics, (2) a process change only, (3) a structural change only or (4) a change in both structure and process. For those that reported simultaneous change in both structural and process attributes, we further tallied whether or not the authors had statistically tested for a relationship between these attributes and whether a statistical relationship, or link, was found. This statistically tested dataset (referred to as the ‘link-tested dataset’ hereafter) contained 38 papers (26% of the full dataset) with 96 incidences that found a link and 32 incidences that found no link. For this link-tested dataset, we determined which genes or taxonomic groups of organisms were tested with a process and which metric of community structure had been used to measure them (e.g. qPCR, DGGE and T-RFLP). For detailed information on the generation of the link-tested dataset, see Appendix C.
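
The four outcome categories can be expressed as a simple classification rule. The sketch below is illustrative only: the function name `classify_outcome` and the example records are hypothetical and are not drawn from the synthesis data.

```python
from collections import Counter

def classify_outcome(structure_changed, process_changed):
    """Assign an incidence to one of the four outcome categories."""
    if structure_changed and process_changed:
        return "both"
    if structure_changed:
        return "structure only"
    if process_changed:
        return "process only"
    return "no change"

# Hypothetical incidences: (structure changed?, process changed?)
example_incidences = [
    (True, True),
    (True, False),
    (False, True),
    (False, False),
    (True, True),
]

tally = Counter(classify_outcome(s, p) for s, p in example_incidences)
print(tally)
```

Only incidences in the "both" category would be candidates for a statistical test of a structure-process link.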

Because measures of microbial community structure and process were often taken at multiple time points following a disturbance, we examined whether the time since the experimental manipulation affected the likelihood of detecting either a structural or process microbial response to experimental treatments. To investigate this, we examined the duration of experiments in both the full set of experimental papers (148 papers) and the link-tested dataset (38 papers). Using the full set of experimental papers, we compared the duration of incidences in which there was a change in structure only, process only, both, or neither. For this analysis, duration was defined as the length of time from the first treatment application to a time at which structure and process were both measured. We included repeated measures of process or structure over the completion of the study. For example, if community structure and a process were measured on the 5th and 30th day of the experiment, a separate incidence was made for each date, so the treatment would have two durations. Using this approach, we were able to capture temporal changes in structure that might occur on a different day from changes in process. Secondly, we used our link-tested dataset to compare the duration of studies in which a statistically significant link was present or absent. Because some studies combined time points in their analysis of a link, the link-tested dataset contained only one time point per incidence; the duration recorded for each incidence ended on the day that both a structural and a process change were measured.

4.3 Results

4.3.1 How Frequently Do Publications Report That an Experimental Manipulation Led to Changes in Microbial Community Structure or Microbially Mediated Ecosystem Processes? And How Often Do These Changes Co-occur?

The set of 1189 papers matching our search terms comprised less than 3% of the total number of papers published in the targeted journals between 2009 and 2013 (Appendix C Table 29). The majority of these papers were published in Soil Biology & Biochemistry, FEMS Microbiology Ecology, Microbial Ecology and The ISME Journal, in that order (Appendix C Fig. 31). Moreover, only 12.4% (n=148) of these papers contained experiments that measured both microbial community structure and microbial process in response to an environmental manipulation (Appendix C Fig. 30).

For 19% of incidences (from 52 papers, 236 of 1082 incidences), authors reported no change in either a structure or process metric, while 24% (68 papers, 219 incidences) reported only structural shifts, and 17% (46 papers, 139 incidences) reported only process changes. In the remaining 40% of incidences (112 papers, 488 incidences), authors detected both a process change and a structural change in response to an experimental manipulation (Fig. 19). This subset was our ‘changed dataset’. Within each paper, we examined whether the authors did or did not test for statistical links between microbial structural and process responses to experimental manipulations. We found that only 38 of the 112 papers from the changed dataset were included in the link-tested dataset because they specifically tested for a statistical link between structure and process metrics (Fig. 19). Many of these papers measured multiple structure-process incidences so that our dataset included a total of 128 tested links. We found that in 75% of incidences (from 28 papers, 96 incidences) the authors identified a statistically significant relationship between structural and process responses to their experimental manipulation and in 25% of incidences (from 16 papers, 32 incidences) the tested link was not statistically significant. The set of statistically significant linked incidences comprises our linked dataset.
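
Several of the proportions reported above follow directly from the stated counts. The short sketch below recomputes them as an arithmetic check; the counts are those given in the text, while the helper name `share` is hypothetical.

```python
def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

# Counts reported in the text.
searched_papers = 1189
full_dataset_papers = 148
link_tested_papers = 38
linked_incidences = 96
unlinked_incidences = 32
incidences_by_outcome = {
    "no change": 236,
    "structure only": 219,
    "process only": 139,
    "both": 488,
}

tested_links = linked_incidences + unlinked_incidences
total_incidences = sum(incidences_by_outcome.values())

print(f"Full dataset as share of searched papers: {share(full_dataset_papers, searched_papers):.1f}%")
print(f"Link-tested papers as share of full dataset: {share(link_tested_papers, full_dataset_papers):.0f}%")
print(f"Tested links: {tested_links}; linked share: {share(linked_incidences, tested_links):.0f}%")
print(f"Total incidences across outcomes: {total_incidences}")
```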

Figure 19. Distribution of the literature synthesis results expressed in proportion of papers (A and B) or incidences (C) derived from the 148 papers that matched selection criteria (see Methods). Papers contained multiple incidences; e.g. the 40% of structure-process change incidences from A occurred within 112 papers.

4.3.2 Do Particular Experimental Conditions or Techniques More Often Associate with Observed Links Between Community Structure and Microbial Processes?

4.3.2.1 Experimental Design

Although studies manipulated different variables, statistical links between structure and process were most commonly tested in studies that manipulated fertilizers and climate change drivers (e.g. temperature and CO2) (Appendix C Table 30). Among the incidences associated with fertilization treatments (n=36), such as the addition of ammonium nitrate or urea, 78% resulted in a significant link between structure and process. In climate change manipulations (n=27), which comprised the next most common type of manipulation, 63% (17 incidences) showed a significant statistical relationship. Treatments that exhibited significant relationships between structure and process included warming (53% of incidences), elevated CO2 (35% of incidences) and a combination of both elevated temperature and CO2 (12% of incidences).

The proportion of significant links varied depending on the techniques used for measuring community structure (Fig. 20a), but not according to the group of organisms targeted (Fig. 20b), the location of the experiment (laboratory or field, Fig. 20c) or whether experimental design prevented dispersal into or out of experimental units (Fig. 20d). Moreover, those experiments using presence-absence measurements of community membership reported a smaller proportion of significant links between structure and process (36%) in comparison to those experiments assessing relative (67%) or absolute abundances (58%) of the present taxa. However, the number of incidences reporting presence-absence values was small (n=11) compared to either of the other two categories (n=62 and 85 for absolute and relative abundances, respectively) and included ordination techniques, measures of diversity and specific taxonomic groups.

Figure 20. Distribution of incidences among different experimental attributes from the 38 papers that tested for a link between microbial community structure and process. The number of incidences is indicated within each bar. (A) Type of quantification of community structure; (B) major taxonomic groups targeted; (C) laboratory versus field experiments; (D) allowance of microbial dispersal during the experiment.

When examining our full dataset (148 papers), we found that the duration of the experiment significantly affected some of the observed responses (Fig. 21a, Kruskal–Wallis test, df=3, P<0.01). Using incidence medians, changes in process alone occurred after shorter periods (27 days) than changes in structure alone (61 days), or than concurrent changes in structure and process (56 days) (Mann–Whitney pairwise comparisons with Bonferroni corrections, P<0.02). However, there was no difference in the duration of experiments that produced structure changes or concurrent structure-process changes (P=1.0). When we compared durations with and without a statistically significant link using the link-tested dataset, though, there was no significant difference between the duration of linked and unlinked experiments (Mann–Whitney U test, df=1, Z=-0.33, P=0.74, Fig. 21b). Using the entire datasets, experimental duration for studies in the link-tested dataset was longer than the mean experimental duration of the full dataset (Mann–Whitney U test, df=1, Z=3.23, P<0.01).

Figure 21. (A) Duration of experiments from the 148 papers in which process changes [Proc] (n=140), structural changes [Struc] (n=221), simultaneous structure-process changes [Both] (n=488), or no changes [No effect] (n=237) were reported. Letters reflect Kruskal–Wallis rank sum test (df=3, P<0.01) followed by Mann–Whitney pairwise comparisons with Bonferroni corrections. Different letters indicate significant differences (P < 0.02) between categories. (B) Duration of experiments from 38 papers where both structure and process changed and a link was statistically tested (Mann–Whitney U test, df=1, Z=-0.33, P=0.74).
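
For readers unfamiliar with the tests named above, the following sketch shows how a Kruskal–Wallis H statistic and a Bonferroni adjustment for pairwise comparisons can be computed. It is a simplified illustration, not the analysis code used here: the durations are invented for demonstration, the H statistic omits the tie correction factor, and the function names are hypothetical. With four outcome categories there are six pairwise comparisons, so a Bonferroni adjustment multiplies each raw P value by six and caps it at 1.

```python
from itertools import combinations

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a dict of samples."""
    pooled = [value for sample in groups.values() for value in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_sum, start = 0.0, 0
    for sample in groups.values():
        rank_sum = sum(ranks[start:start + len(sample)])
        weighted_sum += rank_sum ** 2 / len(sample)
        start += len(sample)
    return 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)

def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_comparisons)

# Hypothetical experiment durations (days); NOT data from this synthesis.
durations = {
    "Proc": [7, 14, 21, 27, 30],
    "Struc": [45, 56, 61, 90, 120],
    "Both": [40, 50, 58, 75, 100],
    "No effect": [10, 28, 35, 60, 150],
}

CHI2_CRITICAL_DF3_ALPHA_01 = 11.345  # chi-square critical value, df=3, alpha=0.01
h_statistic = kruskal_wallis_h(durations)
pairs = list(combinations(durations, 2))

print(f"H = {h_statistic:.2f} (critical value {CHI2_CRITICAL_DF3_ALPHA_01})")
print(f"Pairwise comparisons to correct for: {len(pairs)}")
```

In practice a statistics package would supply exact P values and the tie correction; the sketch only makes the structure of the procedure explicit.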

4.3.2.2 Ecosystem Processes and Community Structure in the Link-tested Dataset

Within the linked dataset, CO2 fluxes and nitrification were the most frequently measured ecosystem processes (Fig. 22, Appendix C Table 31). These processes were each significantly linked to a microbial community structure attribute in ca. 80% of incidences where a statistical test was performed. Nitrification was the most commonly measured process (18 incidences, 10 papers), followed by CO2 flux (17 incidences, 9 papers), N2O flux (17 incidences, 10 papers) and CH4 flux (15 incidences, 5 papers). A link between community structure and microbial process was present in 100% of experiments that attempted to link CH4 flux to a microbial community attribute. The same was true for ammonia oxidation, though we came across only four such incidences, all within a single paper. On the other end of the spectrum, there were no significant links with community structure in experiments that measured organic nitrogen decomposition, ammonification and total activity from a suite of nine enzymes, though these conclusions were supported by only two or three incidences for each process.

Figure 22. Distribution of incidences among microbial processes from the 38 papers that tested for statistical links between structure and process (link-tested dataset). Different colors indicate the relative proportion of linked incidences associated with the different process metrics used. The numbers above each bar indicate the number of papers from which the incidences were extracted and the total number of incidences. See Appendix C Table 31 for additional processes not included in the figure due to fewer than two incidences.

Overall, the greatest number of incidences that tested for a significant link with process targeted the 16S rRNA gene (51 incidences, 12 papers), the denitrification gene nosZ (39 incidences, 12 papers) or the bacterial ammonia monooxygenase gene amoA (37 incidences, 14 papers) (Fig. 23, Appendix C Table 32). The proportion of incidences that were linked varied by the metric of community structure that was applied and included both phylogenetically specific and universal genes or phylogenetically broad markers.

All of the incidences in which community structure measurements were defined using most probable number of methanogens (n=8) and methanotrophs (n=8), 16S rRNA genes (archaea) (n=2) or cbh1 (cellobiohydrolase gene) (n=6) resulted in a statistical link with process, although these incidences were drawn from only one or two papers. Nitrogen cycling genes nosZ and nirS as well as the methane monooxygenase gene pmoA had the next highest percentage of links with 64–70% of incidences linked (n=39, 30 and 6 total incidences from 12, 8 and 1 paper(s), respectively). When tested for, the remaining genes and organisms were linked with a process in <55% of the incidences. Categories in which no link with process was detected also included phylogenetically broad and narrow groups. Broad categories included Gram-positive and -negative bacteria (n=4 and 12), often assessed using phospholipid fatty acid (PLFA) techniques, as well as genes in fungi associated with the internal transcribed spacer region (n=5). More specific genes not linked to any process included the nitrogen cycling nifH gene and the sulfate-reducing dsrAB genes (n=8 and 10).

Figure 23. Distribution of incidences among structural measurements from the 38 papers that tested for a statistical link between microbial community structure and microbial process (link-tested dataset). Different colors indicate the relative proportion of positive incidences associated with the different compositional metrics used. The numbers above each bar indicate the number of papers from which the incidences were extracted and the total number of incidences. Additional metrics with fewer than two papers are listed in Appendix C Table 32.

Researchers attempted to link non-specialized processes with genes that were narrowly distributed and vice versa. Occasionally, these genes were also not directly related to the process. To explore whether there were differences in the degree of metabolic specialization of processes tested with a universal or specific gene, we examined incidences measuring either the universal 16S rRNA gene or bacterial amoA gene (Appendix C Fig. 32). These genes were both common in our link-tested dataset. Multiple studies attempted to link bacterial 16S rRNA genes with both CO2 flux, a broad process performed by all heterotrophs, and with N2O flux, a more narrowly distributed process derived from both nitrification and denitrification. Community structure assessed by the 16S rRNA gene was linked to CO2 flux in 83% of the experiments where it was tested (n=5 of 6) but was never statistically associated with N2O flux (n=9). Specific functional genes were consistently well correlated with their associated process, e.g. nitrification rates were linked with bacterial amoA gene frequencies in 100% of tests (n=14). Conversely, specific genes were not significantly related to the broad processes with which researchers attempted to link them, e.g. amoA with CO2 flux (n=3).

4.3.2.3 Molecular Techniques Used in the Link-tested Dataset

Community structure metrics used in the link-tested dataset (38 papers) were dominated by four techniques: qPCR, T-RFLP, DGGE and PLFA. Approaches using DNA resulted in the highest percent of linked structure-process incidences. Quantitative PCR (qPCR) of functional genes was used in 8 out of the 10 most commonly linked processes and was the only technique that had a higher occurrence of links present (50 incidences) than absent (33 incidences). Primarily, these analyses used relative abundances (copy number per ng DNA) (66%), though absolute abundances (copy number per g soil) were associated with about one third of the incidences (33%). Roughly two-thirds (68%) of structure-process pairs using relative abundance with qPCR were statistically linked. This was a less common occurrence for links tested using absolute abundance (56% of pairs linked). The next most commonly used techniques were DNA-based Terminal Restriction Fragment Length Polymorphism (T-RFLP) and DNA-based Denaturing Gradient Gel Electrophoresis (DGGE), which were each linked in 43% of incidences and used for the 16S rRNA gene and functional genes (Fig. 23). DNA-based T-RFLP was more commonly used (24 incidences linked: 20 relative abundance, 4 presence-absence) than DNA-based DGGE (15 incidences linked: 11 relative abundance, 4 presence-absence). Methane flux was statistically linked to community structure in 100% of tested incidences in which structure was characterized using five different techniques from five different papers, while the other 100% linked process (ammonia oxidation) used only qPCR and was drawn from a single paper. RNA-based techniques (reverse transcription qPCR and T-RFLP or DGGE using cDNA) were used solely for exploring links with CH4 flux or oxidation, denitrification, nitrification and decomposition. These were associated with a small percentage of the linked incidences in CH4 flux (n=3), nitrification (n=2) and decomposition (n=1) (Fig. 22). There was no apparent connection between the number of different techniques used and the likelihood of detecting a community structure-process link (Appendix C Fig. 33).

4.3.3 How are Researchers Attempting to Identify Links Between Measures of Microbial Community Structure and Process?

Abundance, diversity and presence-absence were all used to measure community structure, though abundance was used most frequently and contributed to more links than presence-absence or diversity measures. Three of the four community structure metrics that yielded 100% of linked incidences were made from copy numbers or organism counts of methanogens, methanotrophs or 16S rRNA genes (archaea) (Fig. 23, Appendix C Table 32). Microorganisms with the cbh1 (cellobiohydrolase) gene were the other group with 100% of linked incidences and had incidences evenly split between qPCR abundance (n=4) and T-RFLP-based diversity indices (n=4). Each of these fully linked metrics of community structure, however, relied on results from only one or two papers. Nitrogen cycling genes nosZ and nirS (12 and 8 papers, respectively) as well as the methane monooxygenase gene pmoA (1 paper) had the next highest percentage of links present (64–70%). The majority of these links were also obtained using abundance instead of diversity metrics. With the nosZ gene, 15 out of the 25 linked incidences (60%) used abundance data while the remaining 40% used diversity metrics. In nirS, 19 incidences used abundance (83%) and four used diversity (17%). Using pmoA, links with diversity and abundance were evenly split with two of four incidences in each category. The most commonly used metric of community structure targeted the universal segment of the 16S rRNA gene (51 incidences) and yielded links nearly evenly split between abundance (12 incidences) and diversity metrics (13 incidences). The second most common metric, the bacterial ammonia monooxygenase gene amoA (41 incidences), had nine incidences linked through diversity measures and 13 linked through abundance.

The majority of structure-function links were tested using correlation analysis. Likely reflecting the prevalence of abundance metrics based on qPCR, the techniques used to test for links were dominated by Spearman or Pearson’s correlation analyses (68% of incidences, details not shown). Roughly 77% of incidences that were tested using correlation analysis yielded links, which mirrored the percentage of links present in the total dataset (75% of total incidences had a link present, Fig. 19). Canonical correspondence analysis was the second most frequently used technique and represented 11% of tests, but only 44% of those yielded a link. Incidences based on redundancy or co-inertia analysis had 100% of incidences linked to process. Regression analysis was associated with only 5% of incidences, 75% of which had a link present.
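
As an illustration of the most common approach, the sketch below computes a Spearman rank correlation between a structure metric and a process rate by ranking both variables and taking the Pearson correlation of the ranks. The gene copy numbers and process rates are invented for demonstration, and the function names are hypothetical; a study would also need a significance test, which is omitted here.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values: gene copies per ng DNA and a process rate per plot.
gene_copies = [10, 20, 30, 40, 50]
process_rate = [1.0, 1.5, 1.4, 3.0, 9.0]

print(f"Spearman rho = {spearman(gene_copies, process_rate):.2f}")
print(f"Pearson r = {pearson(gene_copies, process_rate):.2f}")
```

Because Spearman's coefficient depends only on rank order, it is less sensitive than Pearson's to the skewed distributions typical of gene abundance data.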

4.4 Discussion

The literature synthesis presented here revealed that researchers explicitly tested for a statistical link between microbial community structure and process in only one third of incidences from the experimental studies detecting structure and process rate changes in response to experimental manipulations. Yet, when authors reported testing for links they were commonly found; three-quarters of tested incidences were statistically linked. Ideally, theories involving structure-process links would generate publications with a stated hypothesis that was statistically tested and fed back to theory development (Fig. 24). However, this flowpath occurred in only 17% of papers. This suggests that many datasets may be available to explore for structure-process linkages or may support hypothesis generation. It is possible that due to publication bias toward statistically significant results, these datasets were tested previously and only significant results were reported. Regardless of whether or not there was a greater number of unlinked structure-process pairs than we report here, our analyses identify many challenges to consider when designing and conducting experiments to investigate microbial community structure and process links.

Figure 24. Flowchart of guidelines for research involving microbial community structure and ecosystem processes overlain with ‘yes/no’ data from this literature synthesis (n=number of papers). Research decisions lead to hypothesis testing or hypothesis generating paths.

One major challenge in identifying and examining relationships between microbial community composition and process responses is that we have little understanding of the temporal scales at which changes in community structure and related functional attributes occur. Microbial enzymes can be modified by chemistry and biology of the surrounding environment before they are relevant for an ecosystem process, thus creating a temporal disconnect between structure and process. For example, phenotypic plasticity can lead to differences in activity over short periods without an apparent change in taxonomic composition. This is illustrated by experiments incorporating single-cell techniques such as microautoradiography combined with fluorescence in situ hybridization, which have shown that bacterial activity can change greatly without an apparent change in community membership (e.g. Ruiz-González et al. 2012). This suggests that transformations in both microbial structure and processes sometimes can be decoupled in time, potentially misleading interpretations about links between them. Within our link-tested dataset of incidences reporting that both structure and process had changed, there was no difference between the median duration of experiments where significant links were detected and those where there were none. This indicates that when experimental data capture structure and process responses, the timescale of the experiment does not affect the likelihood of detecting a link. In our full dataset (148 papers), however, the median duration in which structure changed (61 days) or both structure and process changed (56 days) was approximately twice as long as the median duration of studies reporting a process change alone (27 days) (Fig. 21). This supports the idea that physiological responses precede, and perhaps do not even require, community shifts (Comte et al. 2013). Therefore, if processes are changing prior to structure, researchers making early or infrequent measurements may not capture data necessary to support a connection between these two parameters. For longer experiments, there is evidence that researchers sample less frequently: in their literature exploration of microbial responses to disturbance, Shade et al. (2012a) identified a negative relationship between sampling frequency and experiment duration. Thus, changes in process from our full dataset (148 papers) may reflect higher frequency sampling whereas changes in structure could result from less frequent sampling over a longer duration. Because this elevates the difficulty of identifying a connection, explicit consideration of temporal factors in study design may decrease these discrepancies.

Considerable experimental and empirical evidence has shown that alteration of environmental variables such as temperature, salinity, pH and nutrient concentrations often coincides with shifts in structure and/or processes of microbial communities across a variety of ecosystems (e.g. Lozupone and Knight 2007, Braker et al. 2010, Vishnivetskaya et al. 2011, Herold et al. 2012, Shade et al. 2012b, Wertz et al. 2012, Reed and Martiny 2013). In our compilation of experiments within single study systems, most often either only one or neither attribute responds to environmental disturbance. Many studies have shown that pH can drive multiple types of compositional and process changes (e.g. Liu et al. 2010, Rousk et al. 2010, Meron et al. 2012, Cheng et al. 2013). Therefore, it is surprising that the studies in our link-tested dataset rarely manipulated pH directly (two incidences), though other manipulations such as N additions often indirectly alter pH. The addition of fertilizers was among the most commonly used disturbances. Our finding that fertilization treatments such as urea or ammonium nitrate additions were most likely to yield a link between structure and process may have been a consequence of fertilizer serving as a microbial resource, increasing plant-derived carbon availability, and altering pH (Pierre 1928, Geisseler and Scow 2014).

Alternatively, nitrogen could inhibit microbial growth and activity. Meta-analyses of nitrogen enrichment studies found that under elevated nitrogen, microbial biomass and CO2 flux may decline (Treseder 2008) and organic matter decomposition may be impeded (Janssens et al. 2010) or the recalcitrant soil carbon pool may be less effectively decomposed (Ramirez et al. 2010).

Of all the techniques used to characterize microbial community structure, links with microbial processes were most commonly detected with qPCR. This suggests that a microbial process is more likely to coincide with the relative or absolute abundance of the responsible organisms than with indirect metrics such as composition or diversity of the community. Diversity metrics may be more representative of within-community dynamics than functional potential, yet in processes catalyzed by organisms with specialized metabolisms, diversity can also adequately predict process stability and magnitude (Levine et al. 2011). Often our ‘snapshot’ analysis of microbial communities also attempts to link structure to ecosystem processes without identifying the influence of underlying ecological conditions such as competition, assembly, tradeoffs and feedbacks (Prosser et al. 2007, Prosser 2012), but arguably, this is a challenging task for any field of ecology, not just microbial ecology.

In our link-tested dataset, both relative and absolute abundance yielded a similar percentage of links. While this result suggests that neither approach had an advantage in terms of uncovering links, the contributing studies may have had little variation in total biomass between the control and treatment samples, resulting in similar relative and actual population sizes. Presence-absence characterizations of communities, however, provided less frequent links to process, likely because presence-absence data provide little information about the dominant organisms within a community that might be responsible for the process. This highlights that commonness or rarity attributes of microbial communities may be important for understanding function (Aanderud et al. 2015).
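The reasoning about total biomass can be made concrete with a small numerical sketch. The Python below is illustrative only: the taxon names and counts are invented, and `relative_abundance` is a hypothetical helper, not a method used in the studies we compiled. It shows that relative and absolute abundance report the same fold change when total community size is constant, and diverge when total size changes.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into proportions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical absolute abundances (e.g. gene copies per gram); all values invented.
control = {"taxon_A": 100, "taxon_B": 300}
# Case 1: total stays at 400; taxon_A doubles at the expense of taxon_B.
same_total = {"taxon_A": 200, "taxon_B": 200}
# Case 2: total doubles to 800; every taxon doubles.
doubled_total = {"taxon_A": 200, "taxon_B": 600}

for label, sample in [("same total", same_total), ("doubled total", doubled_total)]:
    abs_change = sample["taxon_A"] / control["taxon_A"]
    rel_change = (relative_abundance(sample)["taxon_A"]
                  / relative_abundance(control)["taxon_A"])
    print(f"{label}: absolute x{abs_change:.1f}, relative x{rel_change:.1f}")
```

In the first case both metrics report a doubling of taxon_A; in the second, the absolute abundance doubles while the relative abundance is unchanged.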

Given the large information output, decreasing costs of sequencing and suggested ecological applications (Poisot et al. 2013), we anticipated that next generation sequencing (NGS) would be a frequently used technique, but in fact, none of the studies used NGS as a method for examining links between structure and process. This may be due to the use of NGS in many observational studies instead of experiments, which would have excluded them from our datasets, or because of prohibitive costs, which increase quickly when striving to fulfill replication requirements (Prosser 2010). While these costs continue to decrease, the time span we used for our literature search may have captured studies that were completed while NGS costs were still high. The use of NGS data enhances our ability to characterize the diversity and composition of microbial communities at different levels of resolution, which could influence the detectability of relationships between structure and process. For instance, if two taxa within a family respond oppositely to an environmental perturbation, coarser-scale taxonomic resolution such as order could obscure the actual change in structure. This inconsistency in the response of closely related taxa has been documented across complex environmental gradients (e.g. Bier et al. 2015b), but may be more the exception than the rule (Philippot et al. 2010, Lennon et al. 2012). A further consideration for NGS information is that it provides relative abundance data, and can result in substantial data mining and type II error. Thus, it is critical for experiments investigating structure-process relationships with NGS to be hypothesis motivated.
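The effect of taxonomic resolution on detecting a structural change can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the genus and family names, the counts, and the `aggregate` helper are all hypothetical and do not come from any dataset in this synthesis. Two genera in one family respond in opposite directions, and summing to the family level hides the shift.

```python
from collections import defaultdict

def aggregate(counts, taxonomy):
    """Sum per-taxon counts up to a coarser rank using a taxon -> group lookup."""
    totals = defaultdict(int)
    for taxon, n in counts.items():
        totals[taxonomy[taxon]] += n
    return dict(totals)

# Hypothetical taxonomy and read counts; all names and values are invented.
genus_to_family = {"genus_1": "family_X", "genus_2": "family_X"}
before = {"genus_1": 80, "genus_2": 20}
after = {"genus_1": 20, "genus_2": 80}

# Change per genus: the two genera respond in opposite directions.
genus_shift = {g: after[g] - before[g] for g in before}

# Change per family: the opposite responses cancel after aggregation.
family_before = aggregate(before, genus_to_family)
family_after = aggregate(after, genus_to_family)
family_shift = {f: family_after[f] - family_before[f] for f in family_before}

print("genus-level shift:", genus_shift)
print("family-level shift:", family_shift)
```

At the genus level each taxon changes by 60 reads, whereas the family total is unchanged, so an analysis at the coarser rank would report no change in structure.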

Guided by earlier hypotheses (Schimel 1995, Schimel et al. 2005), we expected that a confined guild of microbes would contain a more similar overall genetic makeup and would be more likely to respond to a perturbation using similar mechanisms, whereas a guild containing a much more diverse genetic toolbox would respond to the same perturbation in a greater variety of ways. Thus, there would be less variation in process output from the confined guild and hence a greater likelihood of a statistical link between guild structure and the process measured. However, we found that links were not only identified for processes governed by microbes with narrow phylogenetic distributions such as methane oxidizers, methanogens and sulfate reducers, but also for processes performed by a wide diversity of taxa, for example, carbon substrate utilization (Martiny et al. 2013). These findings support the idea that for some processes, such as those related to soil moisture adaptation, organisms at coarse taxonomic levels have ecological coherence (Philippot et al. 2010, Lennon et al. 2012).

An important consideration in ‘linkage studies’ should be whether the linkage is direct and causal, or incidental, being driven by a master variable. This synthesis led us to reflect on the use of universal genes for linking with specific, phylogenetically narrow processes. For instance, the 16S rRNA gene was used in correlation with methane flux and nitrification (Appendix C Fig. 32). While employing the 16S rRNA gene for sequencing would allow researchers to reduce costs for testing connections between structure and multiple processes, this broad approach may be more appropriate for hypothesis generation or incidental associations than targeted research questions (see Prosser 2013). Moreover, statistical links were tested between indirectly related structure-process pairs. For instance, archaeal ammonia oxidation genes were linked with denitrification processes (Appendix C Fig. 32). This may stem from a temptation to test every structure-process pair merely because the data are available. Given that the probability of finding a correlation increases with the number of variables, studies with a small sample size run the risk of coincidental discoveries. In this synthesis, though, the number of different structure or process metrics used per study did not exceed 10 and had no influence on the likelihood of detecting a structure-process link (Appendix C Fig. 33).
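The way the chance of a coincidental correlation grows with the number of tests can be illustrated with a short calculation. The Python sketch below assumes that the tests are independent and that each uses the same significance level, neither of which is likely to hold exactly for structure and process metrics measured on the same samples; the function names are hypothetical and the calculation was not part of this synthesis.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests

# Up to 10 metrics per study were reported in this synthesis; 25 is an
# arbitrary larger value included only for comparison.
for n_tests in (1, 5, 10, 25):
    fwer = familywise_error(0.05, n_tests)
    per_test = bonferroni_alpha(0.05, n_tests)
    print(f"{n_tests:>2} tests: P(>=1 false positive) = {fwer:.3f}, "
          f"Bonferroni per-test alpha = {per_test:.4f}")
```

Under these assumptions, ten tests at a level of 0.05 carry roughly a 40% chance of at least one spurious link, which is one reason to decide in advance which structure-process pairs to test.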

Although for this synthesis we used literature reporting experimental manipulations of environmental variables, linkage studies can also result from another class of experiments that directly manipulate microbial community structure. This structure-manipulation approach would potentially allow one to assess the strength of structure-process links in a highly controlled environment. For example, altering decomposer richness may increase rates of CO2 mineralization (Bell et al. 2005), while microbial evenness can affect responses to salt stress (Wittebolle et al. 2009). These studies unequivocally demonstrate that composition affects function under some conditions and can complement environmental manipulations to aid our understanding of microbial process responses.

We found that Spearman's and Pearson's correlations dominated the statistical techniques used to assess structure-process links, but these types of assessments may not be the most effective when considering structure changes in a community of microbes. Correlations are useful for linear quantitative analysis with specific genes, but do not establish causality. Microbes both influence and respond to variations in the environment, and these are difficult to separate with correlations. Further, when assessing the community as a whole, correlation analyses may obscure microbial interactions and non-linear responses to manipulations if researchers have not designed a study with the intent of exploring non-linearities. For community analyses, multivariate statistics may yield more appropriate approaches to testing these connections, but the multitude of options can be difficult to assess. In response to this challenge, some researchers have attempted to make the appropriate techniques more accessible. For example, the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME) is a web resource that guides users through new and accepted methods of analysis (Buttigieg and Ramette 2014). Structural equation modeling is another important statistical technique that moves beyond the univariate analysis of microbial communities and examines relationships among all the interacting biotic and abiotic variables within an ecosystem (Grace et al. 2010). This technique has been used successfully to determine, for example, that amoA abundance information is important for understanding N cycling rates in soils (Petersen et al. 2012).
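The difference between the two correlation coefficients for a non-linear response can be shown with a minimal Python sketch. The data are invented (a process rate that doubles with each unit of gene abundance), and the functions are plain re-implementations written for illustration rather than the routines used in the reviewed papers. Pearson's r measures linear association, while Spearman's coefficient is Pearson's r computed on ranks and therefore captures any monotonic relationship.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: monotonic but strongly non-linear response (invented).
gene_abundance = [1, 2, 3, 4, 5, 6, 7, 8]
process_rate = [2 ** g for g in gene_abundance]

print("Pearson r:   ", round(pearson(gene_abundance, process_rate), 3))
print("Spearman rho:", round(spearman(gene_abundance, process_rate), 3))
```

Here the rank-based coefficient is at its maximum because the relationship is perfectly monotonic, while Pearson's r is noticeably lower because the relationship is not linear. Neither coefficient would detect a hump-shaped response, which is one motivation for the multivariate and structural equation approaches discussed above.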

By compiling recent literature where environmental manipulations were conducted, we show that 36% of the papers collecting microbial community structure and ecosystem process data specified objectives or hypotheses regarding structure-process links, yet fewer than half of these papers specifically reported checking for the presence of a direct link between the two properties. In addition, 17% of papers without structure-process hypotheses tested for a link post hoc. Certainly, there are different objectives specific to each study, but over and above the biological complexity of these studies, our conclusions are likely complicated by biases toward reporting positive results in academic culture. Given this bias, the low frequency with which links were statistically explored and the even lower frequency with which they were reported should encourage us to think critically about the contribution of our data to hypothesis testing as we reflect on the return for our investment. While hypothesis generation is not without merit, papers supporting hypotheses (n=16) and papers generating hypotheses (n=13) contributed nearly equally (Fig. 24). As the structure-process knowledge base develops, we look forward to a greater portion of studies testing hypotheses that may contribute to theories involving structure and process links.

In addition, we had anticipated more links associated with phylogenetically narrow groups, yet the prevalence of detected structure-process links did not seem to follow any particular type of perturbation, experimental design or analytical metric aside from qPCR that would target these groups, thus limiting our ability to elucidate the strength and ubiquity of connections between microbial community structure and microbially mediated processes. Our potential for identifying connections is improving as our fields implement standard methods (e.g. Earth Microbiome Project, http://www.earthmicrobiome.org/) and metadata requirements (e.g. for uploading data to the Joint Genome Institute and Metagenomic Rapid Annotations using Subsystems Technology server). As we move forward, building on collaborative, targeted efforts that yield experimental designs with empirical associations among communities and ecosystems may aid in our discovery of broader conclusions about the relationships between structure and microbial processes.


5. Conclusion

Environmental changes that occur at both coarse and fine spatial scales are known to affect the characteristics of ecological communities. Yet, for communities of microorganisms, we often have little or conflicting information to guide our expectations of how variation in abiotic characteristics will affect their collective composition and function (de Vries and Shade 2013, Logue et al. 2015). Microbial communities exposed to changes in environmental conditions are expected to have sufficient genetic diversity to support growth of specialists that benefit from the altered conditions as well as generalists with high plasticity that are unaffected by the changes (Logue et al. 2015). The majority of large-scale associations between environmental variables and microorganisms are related to changes in pH, soil moisture, metals, salinity, and temperature (Lauber et al. 2009, Herlemann et al. 2011, Zeglin 2015). Gradients of abiotic change provide us with a unique tool to investigate associations between taxa and environmental conditions in ecologically relevant settings (Whittaker 1956).

Relationships between environmental variables and microbial communities across kilometer spatial scales in ecosystems have been less thoroughly investigated when climatic factors, salinity, and pH have small magnitudes of change across the gradient.

While pH is often recognized as a master variable in structuring community composition of microbial taxa across complex environmental gradients such as acid mine drainage, where many abiotic variables are affected (Baker and Banfield 2003, Lear et al. 2009), the consequences of complex environmental gradients on microbial communities in the absence of a strong pH change have been less clear. Thus, I used an environmental gradient with strong correlations to metals, anions, and conductivity, but where pH was a weakly correlated factor, to address the question: How does taxonomic composition of bacterial communities change across an environmental gradient?

Within this same gradient, there are major shifts in macroinvertebrate taxa with losses of sensitive taxa and reduced diversity (Voss 2015). While we may expect that the vast genetic diversity within microbial communities and the possibility of horizontal gene transfer could compensate for the susceptibility of some taxa, I found that the composition of bacterial taxa in biofilms did differ significantly at sites exposed to alkaline coal mine drainage. Further, the greater the AlkMD loading, the greater the dissimilarity in community composition, suggesting that AlkMD structured the composition of bacterial biofilm communities in a concentration-dependent manner. Within mined sites, community richness was negatively correlated with the extent of mining. Although the bacterial diversity of mined sites was not outside the richness range identified at reference sites, this linear correlation could indicate that the AlkMD gradient is a factor in bacterial richness. While the suite of AlkMD dissolved constituents was the dominant factor for shifts in composition between mined and unmined sites, biofilm chemistry was most responsible for changes in composition among mined sites.


Microbial taxa have unique sets of capabilities as dictated by their genomes, yet also share functional genes across phylogenetic branches. Some microbial communities may share 70% of functional genes, yet only 15% of species (Burke et al. 2011). This opens the potential for perturbation responses to reveal a stronger relationship between environmental variables and functional genes found within a collection of taxa. Alternatively, for functions controlled by phylogenetically broad guilds, individual taxa could be quite dissimilar in their response, leading to a weaker or no direct relationship. Moreover, a shift in community composition may be unrelated to functional gene frequencies due to portfolio effects and redundancy of those genes in different taxa (Allison and Martiny 2008). Thus, I wanted to know if the taxonomic differences identified in Chapter 2 translated into changes in functional gene composition and, therefore, addressed the following question: How do the composition and relative abundance of microbial functional genes change across a strong chemical gradient? In particular, I expected that functional gene frequency would reflect the presence of subsidies and stressors in the system that correspond with particular functions.

I used metagenomic sequencing of floc communities across this same environmental gradient to target overall composition of functional genes as well as specific metabolisms expected to change with the gradient as informed by concentration increases of subsidies and stressors related to AlkMD. This analysis showed that the composition, but not the diversity, of functional genes changed between exposed and unexposed communities, with the majority of indicators occurring at unmined reference sites. The majority (90.4%) of functional genes were unresponsive, but of those functional gene groups that differed between mined and unmined sites, two-thirds had lower relative abundance at mined sites. These reduced abundances could be due to detrimental conditions for taxa with those genes, but there may also be a dilution of those functional gene groups due to genetic differences in taxa that immigrate to mined biofilms. Despite AlkMD-associated increases in sulfate and nitrate, methane metabolism, rather than sulfur or nitrogen metabolism, differed the most between site types and was lower at mined sites. Thus, functional gene abundances did not unify taxa responses. Further, sulfate and nitrate were not independently influential in functional gene disparities; rather, shifts in gene composition were related to the entire suite of AlkMD constituents that changed along the gradient.

This survey of microbial communities exposed to chemical gradients established by alkaline coal mine drainage has given new insight into the taxonomic and functional gene shifts that occur at these sites, but it also raises questions about the utility of using coarse measurements of subsidies and stressors in perturbation gradients to anticipate microbial community responses. This research suggests that for only some environmental variables can the bacterial and archaeal community's responses be streamlined based on our thermodynamic and kinetic expectations. Lotic environments are complex from the perspective of a microorganism because, unlike soils, there is a constant, rapid delivery of upstream constituents that includes both chemicals and biota, yet the community of microorganisms typically establishes a microenvironment within the biofilm (Battin et al. 2016). This microenvironment can be defined by isolated physical and chemical properties, but may have similar taxa compositions in biofilms from different catchments (Besemer et al. 2012). Thus, coarse spatial measurements of stream water and sediment chemistry can inform us of some specific responses across a gradient, but may not be well-suited for a broader phylogenetic group. Further, using metagenomic approaches without understanding site chemistry may preclude us from recognizing some important characteristics of the community, such as the relationship observed between selenate reduction genes and selenium concentration that was imperceptible in the metagenomic data.

As I conducted these observational studies, I was interested in the perturbation-derived consequences of community composition change on ecosystem processes that are mediated by microorganisms. This connection has been sought by microbial ecologists with increasing intensity, yet the conditions under which we are identifying these connections are unclear. Thus I asked: How and to what extent are perturbation studies revealing connections between microbial community structure and microbial processes? Substantial effort is put into this research question, but often the analysis of the results was not hypothesis-driven, and studies did not always report testing for the connection. I found that when connections were tested, functional responses occurred earlier than changes in microbial structure, which is a conclusion shared by multiple other studies (e.g. Reed and Martiny 2007, Shade et al. 2013). This encourages us to consider the timing of our studies when attempting to identify connections, but also suggests that using structure to predict functional changes may be appropriate only for short-term considerations, while predicting functional change over a longer period is unlikely to reflect the original functional results. Further, quantitative PCR of functional genes was the only technique more often linked with function than not. This suggests that while using mRNA may be attractive for detecting activity, DNA studies do not bar us from identifying structure-function responses.

Although we have made progress, there is still much work remaining to ensure that our vast and growing wealth of microbial informatics data can be translated into useful ecological information. In part, this challenge can be approached through using hypotheses to guide analyses, but also by being open to opportunities for hypothesis generation. These chapters collectively encourage us to carefully interpret shifts in community composition in relation to fluctuations in specific abiotic characteristics and suggest we consider both ecological and thermodynamic and kinetic principles as we seek to understand the properties governing community responses to environmental perturbation. Further, they suggest that this varies not only by the collective chemical parameters that change, but also by the physical properties of the environment and may be more difficult to identify in lotic systems than in soils. Humans are constantly creating new gradients or modifying existing ones, and understanding these relationships is useful both for developing ecological principles and for teasing apart the biotic consequences of anthropogenic activity.


Appendix A

[Figure 25 plot: conductivity (uS/cm), dissolved selenium (ppb), and sulfate (mg/L) versus percent watershed mined.]

Figure 25. Scatterplot of environmental variables (sulfate, dissolved selenium, and conductivity) sampled at time of biofilm collection (April 2011).


Figure 26. Post hoc site groupings based on HAC 1 analysis of Bray-Curtis distance NMDS using Ward’s method. Agglomerative coefficient=0.67, average silhouette width=0.23.


Table 9. Environmental variables from water chemistry and biofilms. BDL= Below Detection Level, which was determined as three times the standard deviation of blank measurements divided by the external standard’s slope. NA= sample not available. Site labels are MRU (unmined), MRM (mined mainstem), AVF (actively mined valley fill), and RVF (restored valley fill).
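
The detection level described in the caption of Table 9 can be written as a short calculation. The Python below is a sketch under stated assumptions: the blank readings and calibration slope are invented, the function names are hypothetical, and the sample standard deviation is assumed because the caption does not specify which estimator was used.

```python
from statistics import stdev

def detection_limit(blank_signals, calibration_slope):
    """Three times the standard deviation of blank measurements divided by
    the slope of the external standard curve (sample SD is an assumption)."""
    return 3.0 * stdev(blank_signals) / calibration_slope

def report(concentration, limit):
    """Return 'BDL' for values below the detection level, else the value."""
    return "BDL" if concentration < limit else concentration

# Hypothetical instrument readings for blanks and a hypothetical slope
# (signal per unit concentration); all values are invented.
blanks = [0.10, 0.12, 0.08, 0.11, 0.09]
slope = 2.0

limit = detection_limit(blanks, slope)
print("detection level:", round(limit, 4))
print(report(0.01, limit), report(0.5, limit))
```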


Table 10. Environmental variables from water chemistry and biofilms. See Table 9 for details.


Table 11. Multiplex Identifier Adapters for GS FLX Titanium Chemistry (TCB No. 005-2009).


Table 12. Environmental variables from PCA with Pearson correlation coefficients (r) and loadings for each axis.


Table 13. Correlations of environmental variables and relative abundances of bacteria from four different taxonomic levels: phylum, class, order, or family. Only taxa with r>|0.5| and P<0.05 shown. Table organized by 1) environmental variable, 2) taxonomic level at which the analysis was done, and 3) correlation coefficient.


Appendix B

Table 14. Primers and standards used for quantitative PCR analysis of biofilms.

Function Gene Primers Standards Citation Amplicons Sulfate aprA aps3F, aps2R Desulfovibrio Christophersen et al. Reduction vulgaris 2011 Hildenborough ATCC 29579 dsrB DSRp2060F, DSR4R “ ” Wagner et al. 1998, Oakley et al. 2011 Denitrification nirK nirKC1F, nirKC1R Ochrobactrum Wei et al. 2015 nirKC2F, nirKC2R anthropi nirKC3F, nirKC3R JCM 21032 nirKC4F, nirKC4R Azospirillum brasilense Sp245 Actinosynnema mirum NBRC-10460 Nitrobacter winogradskyi NBRC-14297 Methanogenesis mcrA mlasF, mcrAR Methanococcus Steinberg and vannielii Regan 2008 JCM 13029 Methane pmoA A189F, Mb601R Kolb et al. 2003 oxidation A189F, Mc468R Selenate serA Thauera Wen et al. 2016 reduction selenatis AJ007744 srdB Bacteria 16S Bact1369F, Desulfovibrio Suzuki et al. 2001 Prok1541R vulgaris Hildenborough ATCC 29579

(Christophersen et al. 2011), (Wagner et al. 1998, Steinberg and Regan 2008, Oakley et al. 2011), (Wei et al. 2015), (Suzuki et al. 2001, Kolb et al. 2003)


Table 15. Variables used in Principal Components Analysis of dissolved water chemistry constituents.

Environmental Variables - Water

Component 1
Variable | r | p-value | loading
Sr (μg/L) | 0.996 | <0.001 | 0.260
Ca (μg/L) | 0.985 | <0.001 | 0.257
K (μg/L) | 0.970 | <0.001 | 0.253
Cond. (uS/cm) | 0.970 | <0.001 | 0.253
Se (μg/L) | 0.965 | <0.001 | 0.252
SO4 2- (mg/L) | 0.955 | <0.001 | 0.249
Mg (μg/L) log | 0.953 | <0.001 | 0.249
Ni (μg/L) log | 0.950 | <0.001 | 0.248
NPOC (mg/L) | 0.912 | <0.001 | 0.238
Co (μg/L) log | 0.903 | <0.001 | 0.236
TN (mg/L) log | 0.896 | <0.001 | 0.234
U (μg/L) | 0.863 | <0.001 | 0.225
NO3 - (mg/L) log | 0.830 | <0.001 | 0.217
pH | 0.634 | 0.010 | 0.166
Pb (μg/L) | 0.616 | 0.014 | 0.161
Ag (μg/L) | 0.544 | 0.036 | 0.142
Fe (μg/L) log | -0.762 | 0.001 | -0.199
Cl (mg/L) | -0.828 | <0.001 | -0.216

Component 2
Variable | r | p-value | loading
Al (μg/L) log | 0.880 | <0.001 | 0.547
Zn (μg/L) | 0.816 | <0.001 | 0.507
Na (μg/L) log | 0.689 | 0.004 | 0.428


Table 16. Variables used in Principal Components Analysis of biofilm constituents.

Environmental Variables - Floc

Component 1
Variable | r | p-value | loading
Se (μg/g OM) | 0.9309 | <0.001 | 0.274
Mg (μg/g OM) | 0.8694 | <0.001 | 0.2559
Ca (μg/g OM) log | 0.8192 | <0.001 | 0.2411
Sr (μg/g OM) log | 0.7204 | 0.002 | 0.212
Mn (μg/g OM) log | 0.7165 | 0.003 | 0.2109
Ni (μg/g OM) log | 0.6491 | 0.009 | 0.1911
Be (μg/g OM) | -0.5148 | 0.049 | -0.1515
Ba (μg/g OM) | -0.6294 | 0.010 | -0.1853
Th (μg/g OM) | -0.7498 | 0.001 | -0.2207
Cu (μg/g OM) | -0.8525 | <0.001 | -0.2509
Ce (μg/g OM) | -0.882 | <0.001 | -0.2596
Pb (μg/g OM) log | -0.8919 | <0.001 | -0.2625
V (μg/g OM) | -0.8947 | <0.001 | -0.2633
Cr (μg/g OM) | -0.9012 | <0.001 | -0.2653
As (μg/g OM) | -0.9025 | <0.001 | -0.2657
Fe (μg/g OM) | -0.9095 | <0.001 | -0.2677

Component 2
Variable | r | p-value | loading
Co (μg/g OM) log | 0.8482 | <0.001 | 0.3673
Tl (μg/g OM) | 0.7947 | 0.0004 | 0.3441
Cd (μg/g OM) log | 0.7671 | 0.0008 | 0.3322
Zn (μg/g OM) | 0.6892 | 0.0045 | 0.2985
Ni (μg/g OM) log | 0.6397 | 0.0102 | 0.277
Be (μg/g OM) | 0.6308 | 0.0117 | 0.2731
Ba (μg/g OM) | 0.5903 | 0.0205 | 0.2556
U (μg/g OM) log | 0.5858 | 0.0218 | 0.2537


Table 17. Phyla excluded from Fig. 10 due to <1% relative abundance.

Phylum (Relative Abundance <1%) | Mined (# of OTUs) | Unmined (# of OTUs)
Aquificae | 1 | 1
Chlamydiae | 14 | 8
Chlorobi | 3 | 2
Chloroflexi | 14 | 10
Deferribacteres | 2 | 2
Deinococcus-Thermus | 1 | 0
Elusimicrobia | 1 | 1
Fibrobacteres | 2 | 1
Fusobacteria | 3 | 2
Gemmatimonadetes | 1 | 1
Lentisphaerae | 2 | 2
Nitrospirae | 4 | 6
Spirochaetes | 8 | 9
Synergistetes | 2 | 2
Tenericutes | 7 | 3
Thaumarchaeota | 4 | 3
Thermotogae | 2 | 3

Table 18. KEGG orthogroups excluded from Fig. 11 due to <1% relative abundance.

KEGG Orthogroup Level B (Relative Abundance <1%) | Mined (# of Level D) | Unmined (# of Level D)
Cell growth and death | 297 | 309
Cell motility | 91 | 92
Cellular community | 193 | 201
Signaling molecules and interaction | 68 | 63
Transcription | 134 | 137
Transport and catabolism | 238 | 234


Table 19. Key of Level B KEGG orthogroups for Fig. 11 and Tables 21-24.

Level | Key | ID
A | CP | Cellular Processes
A | EIP | Environmental Information Processing
A | GIP | Genetic Information Processing
A | H | Hydrolases
A | LI | Ligases
A | M | Metabolism
A | O | Oxidoreductases
A | SS | Secretion system
A | T | Transferases
A | TPT | Transporters
B | Aam | Amino acid metabolism
B | Bm | Biosynthesis of other secondary metabolites
B | Bss | Bacterial secretion system
B | Cc | Cellular community
B | Cgd | Cell growth and death
B | Cm | Carbohydrate metabolism
B | Cmo | Cell motility
B | Em | Energy metabolism
B | Fsd | Folding sorting and degradation
B | Gbm | Glycan biosynthesis and metabolism
B | Lm | Lipid metabolism
B | Mcats | Metallic cation, iron-siderophore and vitamin B12 transport system
B | Mcv | Metabolism of cofactors and vitamins
B | Moaa | Metabolism of other amino acids
B | Mots | Mineral and organic ion transport system
B | Mt | Membrane transport
B | Mtp | Metabolism of terpenoids and polyketides
B | Nm | Nucleotide metabolism
B | O | Overview
B | Pnt | Peptide nickel transport system
B | Rr | Replication and repair
B | Sclts | Saccharide, polyol, and lipid transport system
B | Smi | Signaling molecules and interaction
B | St | Signal transduction
B | Tc | Transport and catabolism
B | Tln | Translation
B | Tpcg | Transferring phosphorus-containing groups
B | Txn | Transcription
B | Xbm | Xenobiotics biodegradation and metabolism


[Figure 27 plot: species count versus number of reads for samples MM1-MM4, MV1-MV3, and UM1-UM5.]

Figure 27. Rarefaction curves from rRNA using MG-RAST M5RNA database. MM= mined, mainstem; MV=mined, valley fill; UM=unmined.

[Figure 28 plot: NMDS1 versus NMDS2, with points marked by site type (mined, unmined) and percent watershed mined, and vectors for PC1.Water, PC2.Water, and PC2.Biofilm. Stress = 0.16; F1,10 = 1.4; P < 0.001.]

Figure 28. NMDS of OTUs (Fig. 13a) with environmental variable vectors that had a significant fit (P<0.05).

[Figure 29 plot: NMDS1 versus NMDS2, with points marked by site type (mined, unmined) and percent watershed mined.]

Figure 29. NMDS of KEGG orthogroups (Fig. 13b) with environmental variable vectors that had a significant fit (P<0.05).


Table 20. OTUs with a relative abundance significantly less (-) or greater (+) at sites with alkaline mine drainage (AlkMD). Significant indicator groups of mined and unmined sites noted.

OTU ID Response to AlkMD Phylogeny Indicator
1 otu_49230 - Bacteria; Acidobacteria; Solibacteres; Solibacterales; Solibacteraceae; Candidatus Solibacter; Candidatus Solibacter usitatus
2 otu_49124 - Bacteria; Acidobacteria; unclassified (from Acidobacteria); unclassified (from Acidobacteria); unclassified (from Acidobacteria); Candidatus Koribacter; Candidatus Koribacter versatilis * Unmined
3 otu_301262 - Bacteria; Proteobacteria; Deltaproteobacteria; Desulfuromonadales; Pelobacteraceae; Pelobacter; Pelobacter propionicus
4 otu_122184 - Bacteria; Proteobacteria; Deltaproteobacteria; Desulfuromonadales; Geobacteraceae; Geobacter; Geobacter lovleyi * Unmined
5 otu_322282 - Bacteria; Bacteroidetes; unclassified (from Bacteroidetes); unclassified (from Bacteroidetes); unclassified (from Bacteroidetes); Prolixibacter; Prolixibacter bellariivorans * Unmined
6 otu_27880 - Bacteria; Proteobacteria; Betaproteobacteria; Rhodocyclales; Rhodocyclaceae; Azoarcus; Azoarcus sp. BH72 * Unmined
7 otu_296844 - Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides; Parabacteroides merdae * Unmined
8 otu_306969 - Bacteria; Proteobacteria; Alphaproteobacteria; Caulobacterales; Caulobacteraceae; Phenylobacterium; Phenylobacterium zucineum * Unmined
9 otu_49613 - Bacteria; Bacteroidetes; Flavobacteriia; Flavobacteriales; Flavobacteriaceae; Capnocytophaga; Capnocytophaga gingivalis * Unmined
10 otu_86442 - Bacteria; Proteobacteria; Deltaproteobacteria; Desulfobacterales; Desulfobacteraceae; Desulfobacterium; Desulfobacterium autotrophicum * Unmined
11 otu_122198 - Bacteria; Proteobacteria; Deltaproteobacteria; Desulfuromonadales; Geobacteraceae; Geobacter; Geobacter sp. M21 * Unmined
12 otu_122212 - Bacteria; Proteobacteria; Deltaproteobacteria; Desulfuromonadales; Geobacteraceae; Geobacter; Geobacter uraniireducens * Unmined
13 otu_354049 - Bacteria; Acidobacteria; Solibacteres; Solibacterales; Solibacteraceae; Candidatus Solibacter; Candidatus Solibacter usitatus * Unmined
14 otu_49052 - Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; unclassified (from Bacteroidales); Candidatus Azobacteroides; Candidatus Azobacteroides pseudotrichonymphae * Unmined
15 otu_320113 - Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Porphyromonas; Porphyromonas endodontalis * Unmined
16 otu_5547 - Bacteria; Proteobacteria; Gammaproteobacteria; Aeromonadales; Aeromonadaceae; Aeromonas; Aeromonas veronii
17 otu_50671 - Bacteria; Firmicutes; Clostridia; Thermoanaerobacterales; Thermoanaerobacteraceae; Caldanaerobacter; Caldanaerobacter subterraneus * Unmined
18 otu_382471 + Bacteria; Proteobacteria; Alphaproteobacteria; unclassified (from Alphaproteobacteria); unclassified (from Alphaproteobacteria); unclassified (from Alphaproteobacteria); uncultured alpha proteobacterium
19 otu_355614 + Bacteria; Proteobacteria; Alphaproteobacteria; Sphingomonadales; Sphingomonadaceae; Novosphingobium; Novosphingobium aromaticivorans
20 otu_263787 + Bacteria; Bacteroidetes; Cytophagia; Cytophagales; Cytophagaceae; Microscilla; Microscilla marina
21 otu_336148 + Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales; Rhodobacteraceae; Rhodobacter; Rhodobacter sp. SW2
22 otu_276664 + Archaea; Thaumarchaeota; unclassified (from Thaumarchaeota); Nitrosopumilales; Nitrosopumilaceae; Nitrosopumilus; Nitrosopumilus maritimus
23 otu_7502 + Bacteria; Verrucomicrobia; Verrucomicrobiae; Verrucomicrobiales; Verrucomicrobiaceae; Akkermansia; Akkermansia muciniphila * Mined
24 otu_49152 + Archaea; Thaumarchaeota; unclassified (from Thaumarchaeota); Nitrosopumilales; Nitrosopumilaceae; Nitrosopumilus; Candidatus Nitrosopumilus sp. NM25 * Mined
25 otu_53789 + Archaea; Thaumarchaeota; unclassified (from Thaumarchaeota); Cenarchaeales; Cenarchaeaceae; Cenarchaeum; Cenarchaeum symbiosum * Mined
26 otu_397972 + Bacteria; Bacteroidetes; Flavobacteriia; Flavobacteriales; Flavobacteriaceae; Zunongwangia; Zunongwangia profunda * Mined
27 otu_7503 + Bacteria; Verrucomicrobia; Verrucomicrobiae; Verrucomicrobiales; Verrucomicrobiaceae; Akkermansia; Akkermansia muciniphila * Mined

Table 21. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD). See Appendix B Table 19 for key of Level A and Level B descriptions. Significant indicator groups noted.

KEGG Orthogroup ID Response to AlkMD Level A Level B Level C Level D Indicator
1 K02051 - EIP Mots NitT/TauT family transport system sulfonate/nitrate/taurine transport system substrate-binding protein
2 K01989 - EIP Mcats Putative ABC transport system putative ABC transport system substrate-binding protein
3 K07220 - hypothetical protein
4 K03412 - CP Cmo Bacterial chemotaxis [PATH:ko02030] cheB; two-component system chemotaxis family response regulator CheB
4 K03412 - EIP St Two-component system [PATH:ko02020] cheB; two-component system chemotaxis family response regulator CheB
5 K06001 - M Aam Glycine serine and threonine metabolism [PATH:ko00260] trpB; tryptophan synthase beta chain
5 K06001 - M O Biosynthesis of amino acids [PATH:ko01230] trpB; tryptophan synthase beta chain
5 K06001 - M Aam Phenylalanine tyrosine and tryptophan biosynthesis [PATH:ko00400] trpB; tryptophan synthase beta chain
6 K00688 - M Cm Starch and sucrose metabolism [PATH:ko00500] glgP PYG; starch phosphorylase
7 K02020 - EIP Mt ABC transporters [PATH:ko02010] modA; molybdate transport system substrate-binding protein
8 K06990 - CP Tubulin-binding proteins MEMO1 family protein
9 K00625 - M Em Methane metabolism [PATH:ko00680] pta; phosphate acetyltransferase
9 K00625 - M O Carbon metabolism [PATH:ko01200] pta; phosphate acetyltransferase
9 K00625 - M Em Carbon fixation pathways in prokaryotes [PATH:ko00720] pta; phosphate acetyltransferase
9 K00625 - M Cm Propanoate metabolism [PATH:ko00640] pta; phosphate acetyltransferase
9 K00625 - M Cm Pyruvate metabolism [PATH:ko00620] pta; phosphate acetyltransferase
9 K00625 - M Moaa Taurine and hypotaurine metabolism [PATH:ko00430] pta; phosphate acetyltransferase
10 K12140 - O hydrogenase-4 component E
11 K04102 - M Xbm Polycyclic aromatic hydrocarbon degradation [PATH:ko00624] pht5; 4 5-dihydroxyphthalate decarboxylase
12 K02006 - EIP Mt ABC transporters [PATH:ko02010] cbiO; cobalt/nickel transport system ATP-binding protein
13 K09252 - H Carboxylic-ester hydrolases feruloyl esterase
14 K09141 - hypothetical protein
15 K00169 - M O Carbon metabolism [PATH:ko01200] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Em Methane metabolism [PATH:ko00680] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Cm Citrate cycle (TCA cycle) [PATH:ko00020] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Cm Butanoate metabolism [PATH:ko00650] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Cm Propanoate metabolism [PATH:ko00640] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Cm Pyruvate metabolism [PATH:ko00620] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Xbm Nitrotoluene degradation [PATH:ko00633] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Cm Glycolysis / Gluconeogenesis [PATH:ko00010] porA; pyruvate ferredoxin oxidoreductase alpha subunit
15 K00169 - M Em Carbon fixation pathways in prokaryotes [PATH:ko00720] porA; pyruvate ferredoxin oxidoreductase alpha subunit
16 K14415 - LI Ligases that form phosphoric-ester bonds protein RtcB
17 K13034 - M Aam Cysteine and methionine metabolism [PATH:ko00270] ATCYSC1; L-3-cyanoalanine synthase/ cysteine synthase
17 K13034 - M Em Sulfur metabolism [PATH:ko00920] ATCYSC1; L-3-cyanoalanine synthase/ cysteine synthase
17 K13034 - M O Biosynthesis of amino acids [PATH:ko01230] ATCYSC1; L-3-cyanoalanine synthase/ cysteine synthase
17 K13034 - M Moaa Cyanoamino acid metabolism [PATH:ko00460] ATCYSC1; L-3-cyanoalanine synthase/ cysteine synthase
17 K13034 - M O Carbon metabolism [PATH:ko01200] ATCYSC1; L-3-cyanoalanine synthase/ cysteine synthase
18 K16792 - M O Biosynthesis of amino acids [PATH:ko01230] aksD; methanogen homoaconitase large subunit
18 K16792 - M Em Methane metabolism [PATH:ko00680] aksD; methanogen homoaconitase large subunit
18 K16792 - M Aam Lysine biosynthesis [PATH:ko00300] aksD; methanogen homoaconitase large subunit
18 K16792 - M O 2-Oxocarboxylic acid metabolism [PATH:ko01210] aksD; methanogen homoaconitase large subunit
19 K00186 - M Aam Valine leucine and isoleucine degradation [PATH:ko00280] vorA; 2-oxoisovalerate ferredoxin oxidoreductase alpha subunit
20 K11689 - EIP St Two-component system [PATH:ko02020] dctQ; C4-dicarboxylate transporter DctQ subunit

Table 22. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD). See Appendix B Table 19 for key of Level A and Level B descriptions. Significant indicator groups noted.

KEGG Orthogroup ID Response to AlkMD Level A Level B Level C Level D Indicator
21 K05601 - M Em Nitrogen metabolism [PATH:ko00910] hcp; hydroxylamine reductase
22 K00170 - M Em Methane metabolism [PATH:ko00680] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Xbm Nitrotoluene degradation [PATH:ko00633] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M O Carbon metabolism [PATH:ko01200] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Cm Butanoate metabolism [PATH:ko00650] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Cm Propanoate metabolism [PATH:ko00640] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Em Carbon fixation pathways in prokaryotes [PATH:ko00720] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Cm Citrate cycle (TCA cycle) [PATH:ko00020] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Cm Glycolysis / Gluconeogenesis [PATH:ko00010] porB; pyruvate ferredoxin oxidoreductase beta subunit
22 K00170 - M Cm Pyruvate metabolism [PATH:ko00620] porB; pyruvate ferredoxin oxidoreductase beta subunit
23 K00705 - M Cm Starch and sucrose metabolism [PATH:ko00500] malQ; 4-alpha-glucanotransferase
24 K02591 - M Xbm Chloroalkane and chloroalkene degradation [PATH:ko00625] nifK; nitrogenase molybdenum-iron protein beta chain
24 K02591 - M Em Nitrogen metabolism [PATH:ko00910] nifK; nitrogenase molybdenum-iron protein beta chain
25 K04656 - hydrogenase maturation protein HypF
26 K01818 - M Cm Fructose and mannose metabolism [PATH:ko00051] fucI; L-fucose isomerase
27 K14205 - EIP St Two-component system [PATH:ko02020] mprF fmtC; phosphatidylglycerol lysyltransferase
28 K02028 - polar amino acid transport system ATP-binding protein
29 K06152 - M Cm Pentose phosphate pathway [PATH:ko00030] E1.1.99.3G; gluconate 2-dehydrogenase gamma chain
30 K00436 - hydrogen dehydrogenase
31 K01917 - M Moaa Glutathione metabolism [PATH:ko00480] E6.3.1.8; glutathionylspermidine synthase
32 K10046 - M Cm Ascorbate and aldarate metabolism [PATH:ko00053] GME; GDP-D-mannose 3' 5'-epimerase
32 K10046 - M Cm Amino sugar and nucleotide sugar metabolism [PATH:ko00520] GME; GDP-D-mannose 3' 5'-epimerase
33 K00179 - indolepyruvate ferredoxin oxidoreductase, alpha subunit
34 K10011 - M Cm Amino sugar and nucleotide sugar metabolism [PATH:ko00520] arnA pmrI; UDP-4-amino-4-deoxy-L-arabinose formyltransferase
35 K02008 - EIP Mt ABC transporters [PATH:ko02010] cbiQ; cobalt/nickel transport system permease protein
36 K01531 - Mg2+-importing ATPase
37 K00929 - M Cm Butanoate metabolism [PATH:ko00650] E2.7.2.7 buk; butyrate kinase *Unmined
38 K01846 - M Cm C5-Branched dibasic acid metabolism [PATH:ko00660] glmS mutS mamA; methylaspartate mutase sigma subunit
38 K01846 - M Cm Glyoxylate and dicarboxylate metabolism [PATH:ko00630] glmS mutS mamA; methylaspartate mutase sigma subunit
38 K01846 - M O Carbon metabolism [PATH:ko01200] glmS mutS mamA; methylaspartate mutase sigma subunit
39 K14138 - M O Carbon metabolism [PATH:ko01200] acsB; acetyl-CoA synthase
39 K14138 - M Em Carbon fixation pathways in prokaryotes [PATH:ko00720] acsB; acetyl-CoA synthase
40 K13380 - M Em Oxidative phosphorylation [PATH:ko00190] nuoBCD; NADH-quinone oxidoreductase subunit B/C/D
41 K15022 - M Em Methane metabolism [PATH:ko00680] fdhB1; formate dehydrogenase beta subunit
41 K15022 - M Em Carbon fixation pathways in prokaryotes [PATH:ko00720] fdhB1; formate dehydrogenase beta subunit
41 K15022 - M O Carbon metabolism [PATH:ko01200] fdhB1; formate dehydrogenase beta subunit
42 K16179 - M Em Methane metabolism [PATH:ko00680] mtbC; dimethylamine corrinoid protein
42 K16179 - M O Carbon metabolism [PATH:ko01200] mtbC; dimethylamine corrinoid protein
43 K18012 - M Aam Lysine degradation [PATH:ko00310] kdd; L-erythro-3 5-diaminohexanoate dehydrogenase
44 K00534 - ferredoxin hydrogenase small subunit
45 K18030 - M Mcv Nicotinate and nicotinamide metabolism [PATH:ko00760] nicB; nicotinate dehydrogenase subunit B
46 K09155 - hypothetical protein *Unmined

Table 23. KEGG orthogroups with a relative abundance significantly less (negative) at sites with alkaline mine drainage (AlkMD). See Appendix B Table 19 for key of Level A and Level B descriptions. Significant indicator groups noted.

KEGG Orthogroup ID Response to AlkMD Level A Level B Level C Level D Indicator
47 K03082 - M Cm Pentose and glucuronate interconversions [PATH:ko00040] sgbU; hexulose-6-phosphate isomerase
48 K03207 - H In phosphorus-containing anhydrides colanic acid biosynthesis protein WcaH *Unmined
49 K00879 - M Cm Fructose and mannose metabolism [PATH:ko00051] fucK; L-fuculokinase *Unmined
50 K15855 - M Cm Amino sugar and nucleotide sugar metabolism [PATH:ko00520] csxA; exo-1 4-beta-D-glucosaminidase *Unmined
51 K08313 - fructose-6-phosphate aldolase 1
52 K08314 - fructose-6-phosphate aldolase 2
53 K12309 - CP Tc Lysosome [PATH:ko04142] GLB1 ELNR1; beta-galactosidase
53 K12309 - M Gbm Other glycan degradation [PATH:ko00511] GLB1 ELNR1; beta-galactosidase
53 K12309 - M Lm Sphingolipid metabolism [PATH:ko00600] GLB1 ELNR1; beta-galactosidase
53 K12309 - M Gbm Glycosaminoglycan degradation [PATH:ko00531] GLB1 ELNR1; beta-galactosidase
53 K12309 - M Cm Galactose metabolism [PATH:ko00052] GLB1 ELNR1; beta-galactosidase
53 K12309 - M Gbm Glycosphingolipid biosynthesis - ganglio series [PATH:ko00604] GLB1 ELNR1; beta-galactosidase
54 K08359 - EIP St Two-component system [PATH:ko02020] ttrC; tetrathionate reductase subunit C *Unmined
54 K08359 - M Em Sulfur metabolism [PATH:ko00920] ttrC; tetrathionate reductase subunit C *Unmined
55 K03079 - M Cm Ascorbate and aldarate metabolism [PATH:ko00053] ulaE sgaU; L-ribulose-5-phosphate 3-epimerase *Unmined
56 K07144 - mfnE; 5-(aminomethyl)-3-furanmethanol phosphate kinase *Unmined
57 K01910 - EIP St Two-component system [PATH:ko02020] citC; [citrate (pro-3S)-lyase] ligase *Unmined
58 K05084 - CP Tc Endocytosis [PATH:ko04144] ERBB3 HER3; receptor tyrosine-protein kinase erbB-3 *Unmined
58 K05084 - EIP St Calcium signaling pathway [PATH:ko04020] ERBB3 HER3; receptor tyrosine-protein kinase erbB-3 *Unmined
58 K05084 - EIP St ErbB signaling pathway [PATH:ko04012] ERBB3 HER3; receptor tyrosine-protein kinase erbB-3 *Unmined

Table 24. KEGG orthogroups with a relative abundance significantly greater (positive) at sites with alkaline mine drainage (AlkMD). See Appendix B Table 19 for key of Level A and B descriptions. No significant indicator groups.

KEGG Orthogroup ID Response to AlkMD Level A Level B Level C Level D Indicator
582 K03496 + chromosome partitioning protein
583 K07062 + uncharacterized protein
584 K07154 + hipA; serine/threonine-protein kinase HipA
585 K06193 + phosphonoacetate hydrolase
586 K01113 + EIP St Two-component system [PATH:ko02020] phoD; alkaline phosphatase D
586 K01113 + M Mcv Folate biosynthesis [PATH:ko00790] phoD; alkaline phosphatase D
586 K01113 + M Xbm Aminobenzoate degradation [PATH:ko00627] phoD; alkaline phosphatase D
587 K17686 + copA, ATP7; Cu+-exporting ATPase
588 K07093 + uncharacterized protein
589 K00452 + M Aam Tryptophan metabolism [PATH:ko00380] HAAO; 3-hydroxyanthranilate 3 4-dioxygenase
590 K05539 + tRNA-dihydrouridine synthase A
591 K09927 + hypothetical protein
592 K01673 + M Em Nitrogen metabolism [PATH:ko00910] cynT can; carbonic anhydrase
593 K00350 + Na+-transporting NADH ubiquinone oxidoreductase subunit D
594 K05844 + ribosomal protein S6 modification protein
595 K00349 + Na+-transporting NADH ubiquinone oxidoreductase subunit C
596 K09781 + hypothetical protein
597 K02575 + M Em Nitrogen metabolism [PATH:ko00910] NRT narK nrtP nasA; MFS transporter NNP family nitrate/nitrite transporter
598 K03670 + periplasmic glucans biosynthesis protein
599 K06954 + uncharacterized protein
600 K15727 + czcB; membrane fusion protein, cobalt-zinc-cadmium efflux system
601 K13628 + iron-sulfur cluster assembly protein
602 K12063 + conjugal transfer ATP-binding protein TraC
603 K12072 + conjugative transfer pilus assembly protein TraH
604 K03387 + alkyl hydroperoxide reductase subunit F
605 K15791 + DHKTD1; probable 2-oxoglutarate dehydrogenase E1 component DHKTD1
606 K17218 + M Em Sulfur metabolism [PATH:ko00920] sqr; sulfide:quinone oxidoreductase
607 K09883 + M Mcv Porphyrin and chlorophyll metabolism [PATH:ko00860] cobT; cobaltochelatase CobT
608 K17231 + IYD, DEHAL1; iodotyrosine deiodinase
609 K01490 + M Nm Purine metabolism [PATH:ko00230] AMPD; AMP deaminase
610 K13303 + EIP St PI3K-Akt signaling pathway [PATH:ko04151] SGK2; serum/glucocorticoid-regulated kinase 2
610 K13303 + EIP St FoxO signaling pathway [PATH:ko04068] SGK2; serum/glucocorticoid-regulated kinase 2
611 K07760 + cyclin-dependent kinase
612 K10209 + M Mtp Carotenoid biosynthesis [PATH:ko00906] crtN; 4 4'-diapophytoene desaturase
613 K04283 + M Moaa Glutathione metabolism [PATH:ko00480] TRYR; trypanothione-disulfide reductase

Table 25. MetaCyc pathways with a relative abundance significantly less (-) or greater (+) at sites with alkaline mine drainage (AlkMD). No significant indicator groups.

Pathway ID Response to AlkMD Pathway Indicator
1 P154 - PWY-1861: formaldehyde assimilation II (RuMP Cycle)
2 P431 - PWY-6588: pyruvate fermentation to acetone
3 P433 - PWY-6594: superpathway of Clostridium acetobutylicum solventogenic fermentation
4 P695 - PWY0-1241: ADP-L-glycero-β-D-manno-heptose biosynthesis
5 P12 - ARG+POLYAMINE-SYN: superpathway of arginine and polyamine biosynthesis
6 P69 - HEME-BIOSYNTHESIS-II: heme biosynthesis I (aerobic)
7 P760 - RUMP-PWY: formaldehyde oxidation I
8 P137 - POLYAMSYN-PWY: superpathway of polyamine biosynthesis I
9 P604 - PWY-7289: L-cysteine biosynthesis V
10 P76 - HSERMETANA-PWY: L-methionine biosynthesis III
11 P390 - PWY-6328: L-lysine degradation X
12 P375 - PWY-6279: myxol-2' fucoside biosynthesis
13 P711 - PWY0-42: 2-methylcitrate cycle I
14 P294 - PWY-5747: 2-methylcitrate cycle II
15 P339 - PWY-5993: superpathway of rifamycin B biosynthesis
16 P518 - PWY-6989: (-)-camphor degradation
17 P142 + PRPP-PWY: superpathway of histidine, purine, and pyrimidine biosynthesis
18 P333 + PWY-5971: palmitate biosynthesis II (bacteria and plants)
19 P2 + 1CMET2-PWY: N10-formyl-tetrahydrofolate biosynthesis
20 P359 + PWY-6163: chorismate biosynthesis from 3-dehydroquinate
21 P291 + PWY-5741: ethylmalonyl-CoA pathway
22 P350 + PWY-6122: 5-aminoimidazole ribonucleotide biosynthesis II
23 P374 + PWY-6277: superpathway of 5-aminoimidazole ribonucleotide biosynthesis
24 P785 + VALDEG-PWY: L-valine degradation I
25 P34 + COMPLETE-ARO-PWY: superpathway of aromatic amino acid biosynthesis
26 P18 + ARO-PWY: chorismate biosynthesis I
27 P139 + PPGPPMET-PWY: ppGpp biosynthesis
28 P441 + PWY-6628: superpathway of L-phenylalanine biosynthesis
29 P40 + FAO-PWY: fatty acid β-oxidation I

Table 26. Results from linear regression using KEGG orthogroups from site metagenomes and subsidies (P≤ 0.05).

Type KO ID p.value intercept slope R2 Adj. R2 F statistic df KO_name Independent Variable
Sulfur K00860 0.001 3.011 0.007 0.715 0.686 25.077 10 cysC; adenylylsulfate kinase SO4 (mg/L)
Sulfur K02045 0.001 9.958 -0.007 0.702 0.673 23.596 10 cysA; sulfate transport system ATP-binding protein SO4 (mg/L)
Sulfur K08359 0.001 8.810 -0.006 0.658 0.624 19.230 10 ttrC; tetrathionate reductase subunit C SO4 (mg/L)
Sulfur K01739 0.002 3.178 0.006 0.648 0.613 18.435 10 metB; cystathionine gamma-synthase SO4 (mg/L)
Sulfur K17486 0.002 9.754 -0.006 0.622 0.584 16.452 10 dmdA; dimethylsulfoniopropionate demethylase SO4 (mg/L)
Sulfur K01738 0.003 9.697 -0.006 0.600 0.560 15.028 10 cysK; cysteine synthase A SO4 (mg/L)
Sulfur K12381 0.004 9.629 -0.006 0.575 0.532 13.528 10 ARSG; arylsulfatase G SO4 (mg/L)
Sulfur K17217 0.009 3.544 0.006 0.513 0.465 10.549 10 mccB; cystathionine gamma-lyase / homocysteine desulfhydrase SO4 (mg/L)
Sulfur K15765 0.010 7.377 -0.005 0.499 0.449 9.952 10 tmoF tbuC touF; toluene monooxygenase electron transfer component SO4 (mg/L)
Sulfur K17226 0.011 9.390 -0.006 0.490 0.440 9.625 10 soxY; sulfur-oxidizing protein SoxY SO4 (mg/L)
Sulfur K16954 0.012 3.013 -0.002 0.482 0.431 9.323 10 mtsA; methylthiol:coenzyme M methyltransferase SO4 (mg/L)
Sulfur K16937 0.021 3.800 0.005 0.428 0.371 7.485 10 doxD; thiosulfate dehydrogenase [quinone] large subunit SO4 (mg/L)
Sulfur K13892 0.024 9.153 -0.005 0.414 0.355 7.051 10 gsiA; glutathione transport system ATP-binding protein SO4 (mg/L)
Sulfur K11181 0.025 9.138 -0.005 0.409 0.350 6.918 10 dsrB; sulfite reductase beta subunit SO4 (mg/L)
Sulfur K17218 0.031 9.070 -0.005 0.388 0.327 6.337 10 sqr; sulfide:quinone oxidoreductase SO4 (mg/L)
Nitrogen K00376 0.000 7.312 2.004 0.805 0.785 41.212 10 nosZ; nitrous-oxide reductase NO3 (mg/L) log10
Nitrogen K01673 0.006 7.170 1.654 0.548 0.503 12.142 10 cynT can; carbonic anhydrase NO3 (mg/L) log10
Nitrogen K10944 0.007 7.159 1.626 0.530 0.483 11.257 10 pmoA-amoA; methane/ammonia monooxygenase subunit A NO3 (mg/L) log10
Nitrogen K00260 0.009 5.852 -1.599 0.512 0.463 10.503 10 gudB rocG; glutamate dehydrogenase NO3 (mg/L) log10
Nitrogen K01674 0.023 5.916 -1.441 0.416 0.358 7.133 10 cah; carbonic anhydrase NO3 (mg/L) log10
Nitrogen K11260 0.048 3.569 -1.063 0.338 0.271 5.097 10 fwdG; 4Fe-4S ferredoxin NO3 (mg/L) log10
Methane K05884 0.001 10.945 -0.090 0.707 0.678 24.117 10 comC; L-2-hydroxycarboxylate dehydrogenase (NAD+) NPOC (mg/L)
Methane K00200 0.002 10.650 -0.084 0.616 0.578 16.058 10 fwdA fmdA; formylmethanofuran dehydrogenase subunit A NPOC (mg/L)
Methane K08691 0.003 2.371 0.084 0.610 0.571 15.639 10 mcl; malyl-CoA/(S)-citramalyl-CoA lyase NPOC (mg/L)
Methane K16179 0.003 10.615 -8.341 0.606 0.566 15.364 10 mtbC; dimethylamine corrinoid protein NPOC (mg/L)
Methane K00123 0.004 10.534 -0.082 0.582 0.540 13.929 10 fdoG fdfH; formate dehydrogenase major subunit NPOC (mg/L)
Methane K00925 0.004 10.519 -0.081 0.578 0.536 13.685 10 ackA; acetate kinase NPOC (mg/L)
Methane K16792 0.004 10.512 -0.081 0.576 0.534 13.582 10 aksD; methanogen homoaconitase large subunit NPOC (mg/L)
Methane K14028 0.005 10.493 -0.081 0.570 0.527 13.274 10 mdh1 mxaF; methanol dehydrogenase (cytochrome c) subunit 1 NPOC (mg/L)
Methane K16793 0.005 10.453 -0.080 0.559 0.515 12.675 10 aksE; methanogen homoaconitase small subunit NPOC (mg/L)
Methane K00194 0.007 6.194 -0.055 0.538 0.492 11.652 10 cdhD acsD; acetyl-CoA decarbonylase/synthase complex subunit delta NPOC (mg/L)
Methane K14082 0.007 7.167 -0.064 0.535 0.488 11.489 10 mtbA; [methyl-Co(III) methylamine-specific corrinoid protein]:coenzyme M methyltransferase NPOC (mg/L)
Methane K10944 0.010 2.744 0.076 0.505 0.455 10.193 10 pmoA-amoA; methane/ammonia monooxygenase subunit A NPOC (mg/L)
Methane K00625 0.011 10.218 -0.075 0.495 0.444 9.784 10 pta; phosphate acetyltransferase NPOC (mg/L)
Methane K16157 0.011 7.044 -0.062 0.494 0.443 9.752 10 mmoX; methane monooxygenase component A alpha chain NPOC (mg/L)
Methane K00850 0.011 10.199 -0.075 0.489 0.438 9.586 10 pfkA PFK; 6-phosphofructokinase 1 NPOC (mg/L)
Methane K00148 0.013 10.150 -0.074 0.477 0.424 9.108 10 fdhA; glutathione-independent formaldehyde dehydrogenase NPOC (mg/L)
Methane K03388 0.014 2.872 0.074 0.471 0.418 8.902 10 hdrA; heterodisulfide reductase subunit A NPOC (mg/L)
Methane K00584 0.020 3.188 -0.027 0.434 0.377 7.653 10 mtrH; tetrahydromethanopterin S-methyltransferase subunit H NPOC (mg/L)
Methane K00169 0.029 9.820 -0.067 0.394 0.334 6.512 10 porA; pyruvate ferredoxin oxidoreductase alpha subunit NPOC (mg/L)
Methane K12234 0.029 3.183 0.067 0.394 0.333 6.490 10 cofE fbiB; coenzyme F420-0:L-glutamate ligase / coenzyme F420-1:gamma-L-glutamate ligase NPOC (mg/L)
Methane K08094 0.030 8.765 -0.064 0.390 0.329 6.382 10 hxlB; 6-phospho-3-hexuloisomerase NPOC (mg/L)
Methane K00170 0.032 3.226 0.066 0.383 0.322 6.220 10 porB; pyruvate ferredoxin oxidoreductase beta subunit NPOC (mg/L)
Methane K00121 0.033 3.240 0.066 0.380 0.318 6.134 10 frmA ADH5 adhC; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase NPOC (mg/L)
Methane K03519 0.034 9.738 -0.066 0.375 0.313 6.002 10 coxM cutM; carbon-monoxide dehydrogenase medium subunit NPOC (mg/L)
Methane K15635 0.036 9.716 -0.065 0.370 0.307 5.876 10 apgM; 2 3-bisphosphoglycerate-independent phosphoglycerate mutase NPOC (mg/L)
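As a consistency check on the regression columns in Table 26, the adjusted R2 and the F statistic of a single-predictor linear regression can both be recovered from R2 and the residual degrees of freedom. The sketch below is illustrative only: the function name is hypothetical, and the site count of 12 is an assumption inferred from df = 10 (n - 2), not a value stated in the table.

```python
def regression_summary(r2: float, n_sites: int) -> dict:
    """Derive df, adjusted R2, and F for a simple (one-predictor) linear regression."""
    df_resid = n_sites - 2  # one slope and one intercept are estimated
    adj_r2 = 1.0 - (1.0 - r2) * (n_sites - 1) / df_resid
    f_stat = (r2 / (1.0 - r2)) * df_resid  # F with 1 and df_resid degrees of freedom
    return {"df": df_resid, "adj_r2": adj_r2, "f_stat": f_stat}


# cysC (K00860) regressed on sulfate: the table reports R2 = 0.715 and df = 10.
# n_sites = 12 is an assumption inferred from df = 10.
summary = regression_summary(r2=0.715, n_sites=12)
print(summary)
```

Because the tabulated R2 is rounded to three decimals, the recomputed values agree with the reported adjusted R2 (0.686) and F statistic (25.077) only to within rounding.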

Table 27. Results from linear regression using KEGG orthogroups from site metagenomes and stressors (P≤ 0.05).

Type KO ID p.value intercept slope R2 Adj. R2 F statistic df KO_name Independent Variable

Stress K05847 0.012 19.882 -2.108 0.481 0.429 9.266 10 opuA; osmoprotectant transport system ATP-binding protein Conductivity (µS/cm) log10

Methane K16179 0.000 10.101 -0.007 0.762 0.738 31.942 10 mtbC; dimethylamine corrinoid protein SO4 (mg/L)

Methane K16792 0.001 9.933 -0.007 0.692 0.662 22.510 10 aksD; methanogen homoaconitase large subunit SO4 (mg/L)

Methane K05884 0.001 9.927 -0.007 0.690 0.659 22.224 10 comC; L-2-hydroxycarboxylate dehydrogenase (NAD+) SO4 (mg/L)

Methane K00925 0.001 9.893 -0.007 0.676 0.644 20.892 10 ackA; acetate kinase SO4 (mg/L)

Methane K00625 0.002 9.803 -0.006 0.641 0.605 17.847 10 pta; phosphate acetyltransferase SO4 (mg/L)

Methane K00123 0.003 9.715 -0.006 0.607 0.568 15.463 10 fdoG fdfH; formate dehydrogenase major subunit SO4 (mg/L)

Methane K00194 0.003 5.728 -0.004 0.604 0.565 15.267 10 cdhD acsD; acetyl-CoA decarbonylase/synthase complex subunit delta SO4 (mg/L)

Methane K10944 0.003 3.326 0.006 0.592 0.551 14.497 10 pmoA-amoA; methane/ammonia monooxygenase subunit A SO4 (mg/L)

Methane K14082 0.004 6.595 -0.005 0.589 0.548 14.339 10 mtbA; [methyl-Co(III) methylamine-specific corrinoid protein]:coenzyme M methyltransferase SO4 (mg/L)

Methane K00200 0.005 9.601 -0.006 0.565 0.521 12.987 10 fwdA fmdA; formylmethanofuran dehydrogenase subunit A SO4 (mg/L)

Methane K00850 0.005 9.600 -0.006 0.564 0.521 12.957 10 pfkA PFK; 6-phosphofructokinase 1 SO4 (mg/L)

Methane K08691 0.006 3.466 0.006 0.541 0.495 11.772 10 mcl; malyl-CoA/(S)-citramalyl-CoA lyase SO4 (mg/L)

Methane K15635 0.007 9.504 -0.006 0.530 0.483 11.284 10 apgM; 2 3-bisphosphoglycerate-independent phosphoglycerate mutase SO4 (mg/L)

Methane K00584 0.008 2.988 -0.002 0.517 0.469 10.703 10 mtrH; tetrahydromethanopterin S-methyltransferase subunit H SO4 (mg/L)

Methane K16157 0.011 6.378 -0.005 0.495 0.444 9.787 10 mmoX; methane monooxygenase component A alpha chain SO4 (mg/L)

Methane K16793 0.012 9.379 -0.006 0.487 0.436 9.493 10 aksE; methanogen homoaconitase small subunit SO4 (mg/L)

Methane K00169 0.014 9.333 -0.006 0.472 0.419 8.925 10 porA; pyruvate ferredoxin oxidoreductase alpha subunit SO4 (mg/L)

Methane K00443 0.016 7.257 -0.005 0.454 0.400 8.324 10 frhG; coenzyme F420 hydrogenase subunit gamma SO4 (mg/L)

Methane K08094 0.017 8.247 -0.005 0.448 0.393 8.127 10 hxlB; 6-phospho-3-hexuloisomerase SO4 (mg/L)

Methane K00201 0.018 9.251 -0.005 0.445 0.389 8.006 10 fwdB fmdB; formylmethanofuran dehydrogenase subunit B SO4 (mg/L)

Methane K14028 0.020 9.216 -0.005 0.433 0.377 7.649 10 mdh1 mxaF; methanol dehydrogenase (cytochrome c) subunit 1 SO4 (mg/L)

Methane K13788 0.023 9.168 -0.005 0.418 0.360 7.187 10 pta; phosphate acetyltransferase SO4 (mg/L)

Methane K04480 0.025 5.245 -0.004 0.411 0.352 6.978 10 mtaB; methanol---5-hydroxybenzimidazolylcobamide Co-methyltransferase SO4 (mg/L)

Methane K01623 0.025 3.863 0.005 0.409 0.349 6.909 10 ALDO; fructose-bisphosphate aldolase class I SO4 (mg/L)

Methane K03518 0.025 9.137 -0.005 0.408 0.349 6.906 10 coxS; carbon-monoxide dehydrogenase small subunit SO4 (mg/L)

Methane K01499 0.029 9.085 -0.005 0.393 0.332 6.464 10 mch; methenyltetrahydromethanopterin cyclohydrolase SO4 (mg/L)

Methane K11781 0.029 9.083 -0.005 0.392 0.331 6.441 10 cofH; FO synthase subunit 2 SO4 (mg/L)

Methane K00121 0.031 3.941 0.005 0.385 0.323 6.255 10 frmA ADH5 adhC; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase SO4 (mg/L)

Methane K16256 0.032 8.041 -0.005 0.382 0.320 6.176 10 mxaA; mxaA protein SO4 (mg/L)

Methane K00399 0.038 2.190 -0.001 0.364 0.300 5.712 10 mcrA; methyl-coenzyme M reductase alpha subunit SO4 (mg/L)

Methane K05299 0.041 8.954 -0.005 0.354 0.289 5.470 10 fdhA1; formate dehydrogenase alpha subunit SO4 (mg/L)

Methane K03388 0.045 4.077 0.005 0.345 0.279 5.266 10 hdrA; heterodisulfide reductase subunit A SO4 (mg/L)

Methane K00148 0.048 8.893 -0.005 0.336 0.270 5.071 10 fdhA; glutathione-independent formaldehyde dehydrogenase SO4 (mg/L)

Methane K03519 0.048 8.891 -0.005 0.336 0.269 5.057 10 coxM cutM; carbon-monoxide dehydrogenase medium subunit SO4 (mg/L)

Methane K04041 0.050 8.550 -0.004 0.331 0.264 4.949 10 fbp3; fructose-1 6-bisphosphatase III SO4 (mg/L)

Table 28. Results from qPCR of biofilms. Primers with NA standards had sample amplification and were used for relative analysis.

Target gene        Amplification efficiency   Standard curve (R2)   qPCR program reference
aprA               1.86                       1.00                  Frank et al. 2013
dsrB               1.85                       1.00                  Frank et al. 2013
nirK cluster I     NA                         NA                    Wei et al. 2015
nirK cluster II    1.64                       0.99                  Wei et al. 2015
nirK cluster IV    1.52                       0.99                  Wei et al. 2015
pmoA (Mb601R)      2.29                       0.98                  Kolb et al. 2003
pmoA (Mc468R)      2.10                       0.98                  Kolb et al. 2003
mcrA               NA                         NA                    Steinberg and Regan 2008
serA               2.11                       0.99                  Wen et al. 2016
16S (bacteria)     2.04                       0.98                  Frank et al. 2013
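The table does not state how amplification efficiency was calculated. A minimal sketch, assuming the reported values are per-cycle amplification factors (2.0 would be perfect doubling) derived in the conventional way from the slope of a standard curve of Ct against log10 template copies; both function names are hypothetical:

```python
def amplification_factor(slope: float) -> float:
    """Per-cycle amplification factor from a standard-curve slope (Ct vs. log10 copies)."""
    return 10 ** (-1.0 / slope)


def percent_efficiency(factor: float) -> float:
    """Express an amplification factor as percent efficiency (2.0 corresponds to 100%)."""
    return (factor - 1.0) * 100.0


# A slope near -3.32 corresponds to near-perfect doubling each cycle.
ideal = amplification_factor(-3.32)

# Under the stated assumption, the aprA value of 1.86 would be about 86% efficiency.
apra_percent = percent_efficiency(1.86)
print(round(ideal, 3), round(apra_percent, 1))
```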

Appendix C

Term Definitions:

Microbial Community Structure: the characteristics of a community of microorganisms, including bacteria, archaea, and microeukaryotes, as measured by any metric of taxa or gene composition, diversity, and/or abundance via a range of molecular or culture-based techniques

Microbial Process: microbial activity measured at the community scale, either through direct assessment of enzyme activities that mediate a process (e.g. denitrification enzyme assay), monitoring of end product accumulation over time (e.g. net nitrification), or tracking element cycles through stable isotope tracers (e.g. gross nitrification/denitrification)

Dataset Definitions:

Full dataset: 148 papers that matched our search terms and contained experiments that measured both a microbial community structure and a microbial process

Changed dataset: a subset of 112 papers from the full dataset in which both microbial community structure and process changed

Link-tested dataset: a subset of 128 incidences in 38 papers from the changed dataset where microbial community structure-process incidences were tested for statistical significance

Linked dataset: a subset of 96 incidences in 28 papers from the link-tested dataset in which structure-process links were found to be statistically significant

Paper: a single peer-reviewed publication, often containing multiple incidences

Incidence: the combination of a process and structures between which the authors looked for a link (e.g. an experiment may make one structure measurement and two process measurements resulting in two incidences: 16S rRNA gene with N2O flux, and 16S rRNA gene with CO2 flux; or one process and two community measures resulting in one incidence: CO2 flux with 16S rRNA gene and nirK)
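The counting rule in the Incidence definition can be sketched as follows. This is one reading of the definition, not the procedure used to build the datasets: each process measurement yields one incidence, paired with the full set of structure measurements it was tested against. The function name is hypothetical.

```python
def incidences(processes: list[str], structures: list[str]) -> list[tuple[str, tuple[str, ...]]]:
    """Return one (process, structures) incidence per process measurement."""
    structure_set = tuple(structures)
    if not structure_set:
        return []  # no structure measurement means no link could be tested
    return [(process, structure_set) for process in processes]


# One structure measurement and two process measurements: two incidences.
two = incidences(["N2O flux", "CO2 flux"], ["16S rRNA gene"])

# One process and two community measures: one incidence.
one = incidences(["CO2 flux"], ["16S rRNA gene", "nirK"])

print(len(two), len(one))
```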

Methods

Instructions for tallying synthesis papers that showed a statistically tested link (i.e., link-tested dataset)

1. General Notes:
   a. Exclude any treatments where a functional measurement and a compositional measurement were not taken simultaneously.
   b. Use a separate column for each treatment/function combination.
   c. If a functional definition was made from multiple functions (ex. “overall function” in Reed et al. 2013) and was used in correlations only (not to determine if there were differences in function), mark presence of a correlation for each function.
   d. If microbial communities for treatments were grouped for a correlation test (ex. all put into a CCA in Nie et al. 2013), mark the presence of a correlation for each treatment that differed significantly from the composition of the control.
   e. Group levels of treatments or frequency of application into one column, and if a difference or correlation was found at any of the levels, count it as presence of a difference or correlation for that treatment type. (Ex. Additions of nitrate to different microcosms at 0.5 mg, 1 mg, and 2 mg are all lumped together.)
   f. If soil depths were grouped together for correlation/regression analysis, group the results together in one column.

2. Function Metrics:
   a. Write the name of the functions measured in Column B. To the right of Column B, make a separate column for each function and enter “1” if it was measured. If it wasn’t measured, leave blank.
   b. Include enzyme measurements as functions.
   c. Only use functions that have a rate measurement, not a pool.

3. Function Measure:
   a. Record whether the function measured was an actual measurement or a potential measurement (i.e., if a substrate was added, the measurement was of a potential function).

4. Community Metrics (first pass):
   a. Tally the community metrics used to measure each function with a “0.”
   b. Include both DNA and RNA measurements.

5. Community Comparison (first pass):
   a. Record with a “0” whether relative, presence/absence, or absolute measures of community were used. (Some of these will have both if multiple metrics were used.)

6. Test Site:
   a. Record with a “1” whether the experiment was performed in a lab or field setting. Microcosms fall into the lab category, and mesocosms fall into the field category.

7. Experiment Details:
   a. Duration of experiment: For each function, put the number of days the experiment had been running when the functional/composition measurement was made. For some experiments this will be thousands of days. For experiments where multiple days were measured, record the first day that a change was observed in both composition and function.
   b. Give a brief description of what the experimental manipulation was (ex. elevated CO2).
   c. Record whether or not dispersal was possible between the treatment unit and the outside environment during the experiment. Use “1” if dispersal was possible and “0” if not.
   d. Record which groups (bacteria, archaea, fungi) were targeted in the community metrics used. Use “1” for included and “0” for not included.
   e. Write down a list of which genes were measured in the “Genes” row.

8. Authors’ Conclusions:
   a. If only a function changed, record that in “Change in function only.” If only a gene or organism changed, tally that in “Change in Composition only.” If both the function and a gene or organism changed, record that in “Change in both [function and composition].” If neither the function nor a community metric changed, record that as “Change in neither.” Use a “1” to indicate which of these four options applied.
   b. Use a “1” or “0” to indicate whether or not the change in the function or composition, or the function-composition link, was tested for significance.
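
Two of the coding rules above (item 7a, the first day on which both composition and function had changed, and item 8a, the four mutually exclusive conclusion categories) can be sketched in Python as follows. This is an illustrative sketch only: the function names and the tuple layout of the observations are assumptions, and the original tallies were kept in a spreadsheet rather than in code.

```python
CATEGORIES = (
    "Change in function only",
    "Change in composition only",
    "Change in both",
    "Change in neither",
)


def conclusion_row(function_changed: bool, composition_changed: bool) -> dict:
    """Mark exactly one of the four categories with 1; leave the others blank."""
    if function_changed and composition_changed:
        chosen = "Change in both"
    elif function_changed:
        chosen = "Change in function only"
    elif composition_changed:
        chosen = "Change in composition only"
    else:
        chosen = "Change in neither"
    return {category: (1 if category == chosen else "") for category in CATEGORIES}


def first_day_both_changed(observations):
    """observations: iterable of (day, function_changed, composition_changed).
    Return the earliest day on which both changed, or None if that never happened."""
    for day, function_changed, composition_changed in sorted(observations):
        if function_changed and composition_changed:
            return day
    return None


# Hypothetical sampling days, for illustration only.
print(first_day_both_changed([(30, True, False), (60, False, True), (90, True, True)]))
print(conclusion_row(True, True))
```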

For statistics tallies:

1. Enter a number >0 in the “Genes/Organisms that changed” set of rows if the gene/organism correlated, and “0” if it changed and the authors looked for a correlation but it wasn’t significantly correlated. To decide which number >0 to use if the gene/organism is correlated, start with “1” and increase sequentially for each additional gene/organism.

2. Community Metrics (second pass):
   a. Return to the “Community Metrics” section. In Community Metrics, if there was a correlation, change the number “0” to the number that corresponds with the gene/organism that correlated. So, if a correlation was found with 18S using TRFLPs, you would enter “1” in the 18S row for “Genes/Organisms that changed” and “1” in the TRFLPs row of “Community Metrics.” And if, in that same study, 16S was also correlated using TRFLPs, use a “2” in the 16S row and make another row for TRFLPs and enter “2” in it. Keep the “0” if a metric that was tested for a correlation did not correlate.

3. Community Comparison (second pass):
   a. Return to the “Community Comparison” section. If a correlation was found, recode “0”s to the number corresponding with that in “Community Metrics” in order to indicate which type of comparison yielded a correlation. You may need to add multiple rows with the same comparison type if more than one correlation is found using the same community comparison type.
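
The sequential coding used in the statistics tallies can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 18S and 16S entries follow the TRFLP example in the instructions, and the nirK/qPCR entry is an invented non-correlated case added only to show the “0” code.

```python
def code_genes(tested):
    """tested: list of (gene_or_organism, correlated) in the order encountered.
    Correlated entries receive 1, 2, 3, ... in sequence; tested entries that
    did not correlate keep the code 0."""
    codes = {}
    next_code = 1
    for name, correlated in tested:
        if correlated:
            codes[name] = next_code
            next_code += 1
        else:
            codes[name] = 0
    return codes


def metric_rows(gene_codes, metric_for_gene):
    """Second pass: one Community Metrics row per gene/organism, carrying the
    same code, so a metric used twice appears as two rows."""
    return [(metric_for_gene[name], code) for name, code in gene_codes.items()]


codes = code_genes([("18S", True), ("16S", True), ("nirK", False)])
rows = metric_rows(codes, {"18S": "TRFLPs", "16S": "TRFLPs", "nirK": "qPCR"})
print(codes)
print(rows)
```

Carrying the same integer from the gene row into the metric row is what lets a reader of the tally sheet recover which metric produced which correlation.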

ISI Database
   Process Terms: decomp* OR methan* OR sulfate red* OR denitrif* OR dnf OR nitrif*
   Structural Terms: commun* OR gene* OR physiolog*
   199,749 Papers

ISI Filtering Process
   General Term: microb*
   32,386 Papers
   Source Titles: The ISME Journal OR FEMS Microbiology Ecology OR Science OR Nature OR Ecology OR Ecology Letters OR Microbial Ecology OR Limnology and Oceanography OR Soil Biology & Biochemistry
   1,189 Papers

Manual Filtering Process
   148 Papers: Experiments that measured both a microbial community structure and a process.
   112 Papers: The authors found a change in at least one of the structures and one of the processes they measured.
   38 Papers: Tested for a link between the measured structure and process.

Figure 30. Filtering process for papers, beginning with search terms in ISI Web of Science and followed by manual filtering through reading abstracts and papers.
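
The paper counts in Figure 30 form a filtering funnel. The following Python sketch, offered only as an illustration, computes the fraction of papers retained at each step. The counts are those reported in the figure, while the stage labels and the function name are paraphrased or hypothetical.

```python
# Paper counts as reported in Figure 30; stage labels are paraphrased.
STAGES = [
    ("ISI search: process and structural terms", 199_749),
    ("ISI filter: general term microb*", 32_386),
    ("ISI filter: source titles", 1_189),
    ("Manual: measured both a structure and a process", 148),
    ("Manual: change in at least one structure and one process", 112),
    ("Manual: tested for a structure-process link", 38),
]


def retention(stages):
    """Return (stage name, paper count, fraction of previous stage retained)."""
    return [
        (name, count, count / previous_count)
        for (_, previous_count), (name, count) in zip(stages, stages[1:])
    ]


for name, count, fraction in retention(STAGES):
    print(f"{name}: {count} papers ({fraction:.1%} of previous stage)")
```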

Table 29. Distribution of studies from 2009-2013 by journal.

                                        Publication Year¹                        Total      Total Papers Published   % Papers Scanned from
Journal                       2009   2010   2011   2012   2013   Total    Scanned²   2009-2013                Total Papers Published
Ecology                       0/1    0      0/1    0      0      0/2      34         1,571                    0.127
Ecology Letters               0      0      0      1/2    1/1    2/3      24         829                      0.121
FEMS Microbiology Ecology     0/4    2/12   0/6    2/6    1/8    5/36     272        1,081                    0.463
Limnology and Oceanography    0      0      0      0      0      0        30         969                      0.000
Microbial Ecology             0/1    0/1    2/2    1/4    0/2    3/10     157        938                      0.213
Nature                        1/1    0      0/1    0      0      1/2      26         13,199                   0.008
Science                       0/1    0      0      0/1    0      0/2      19         19,255                   0.000
Soil Biology & Biochemistry   0/1    2/8    2/7    4/9    4/16   12/41    475        1,622                    0.678
The ISME Journal              1/2    2/4    1/4    0/3    1/3    5/16     152        943                      0.742
Total                         2/11   6/25   5/21   8/25   7/30   28/112   1,189      40,407                   0.005

¹ Number of papers that: Detected link between changes/Measured both a structural and process change
² Number of papers that matched all search terms.

[Bar chart. Journals, in plotted order: Soil Biology & Biochemistry, FEMS Microbiology Ecology, Microbial Ecology, The ISME Journal, Ecology, Limnology and Oceanography, Nature, Ecology Letters, Science. X-axis: Number of Papers (0 to 500).]

Figure 31. Distribution by journal of 1,189 total papers from 2009-2013 that met all search term requirements. Black bars represent subset of 148 papers that detected at least one structural or process change.

Table 30. Number of incidences associated with each type of manipulation among the 38 papers that tested for a statistical link between microbial community structure and microbial process

Manipulation Type                          Link Present   Link Absent
Fertilization                              28             8
Warming & eCO2                             17             10
Agriculture techniques                     15             2
Redox                                      14             1
Salinity                                   10             0
Toxin addition                             5              1
Forestry (deforestation, afforestation)    5              1
pH                                         1              1

Table 31. Additional microbial processes tested for a link with microbial community structure not included in Fig. 22.

Process                      Link Present   Link Absent   No. of Papers
Aminopeptidase - acidic      1              0             1
Aminopeptidase - alkaline    1              0             1
Endochitinase                1              0             1
Nitrate reductase            1              0             1
Peptidase                    1              0             1
Peroxidase                   1              0             1
Phosphatase - acidic         1              0             1
Phosphatase - alkaline       1              0             1
Sulfatase - acidic           1              0             1
Sulfatase - alkaline         1              0             1
NO flux                      0              1             1
β-xylosidase                 0              1             1

Table 32. Additional genes and organisms tested for a link with microbial processes not included in Fig. 23.

Gene/Organism                  Link Present   Link Absent   No. of Papers
Acetoclastic Methanogens       8              0             1
Hydrogenotrophic Methanogens   8              0             1
Methanotrophs                  8              0             1
pmoA                           4              2             1
ITS (Ascomycota)               1              1             1
ITS (Basidiomycota)            1              1             1
16S Acidobacteria              1              0             1
16S Actinobacteria             1              0             1
16S Verrucomicrobia            1              0             1
assA                           1              0             1
bcrA                           1              0             1
bssA                           1              0             1
mcrA                           1              0             1
AM fungi PLFA                  0              3             1
norB                           0              2             1
18S                            0              1             1
16S (α-proteobacteria)         0              1             1
Nitrospira-like (16S)          0              1             1

[Two bar-chart panels: 16S (Universal) and amoA (bacteria). Y-axis: Number of Incidences. Legend: Link Present, Link Absent.]

Figure 32. Examples of the distribution of incidences associated with a universal gene (bacterial 16S (A)) and a specific gene (bacterial amoA (B)) across the different microbial processes for which a link was tested.

[Scatter plot. X-axis: Number of Processes per Paper. Y-axis: Number of Structures per Paper. Legend: Link Present, Link Absent.]

Figure 33. Comparison between the number of structure and process metrics used in each paper that tested for a link (38 papers) shaded by whether or not the authors detected a link. Marker size weighted by number of papers; range 1-7 papers.

References

Aanderud, Z. T., S. E. Jones, N. Fierer, and J. T. Lennon. 2015. Resuscitation of the rare biosphere contributes to pulses of ecosystem activity. Front Microbiol 6:24.

Abeliovich, A. 2007. The nitrate oxidizing bacteria. Pages 861-872 The Prokaryotes.

Allison, S. D. 2012. A trait-based approach for modelling microbial litter decomposition. Ecol Lett 15:1058-1070.

Allison, S. D., and J. B. Martiny. 2008. Resistance, resilience, and redundancy in microbial communities. Proceedings of the National Academy of Sciences 105:11512-11519.

Anderson, M. J. 2001. A new method for non-parametric multivariate analysis of variance. Austral Ecol 26:32-46.

Anderson, M. J., and T. J. Willis. 2003. Canonical analysis of principal coordinates: a useful method of constrained ordination for ecology. Ecology 84:511-525.

Arnold, M. C., T. T. Lindberg, Y. T. Liu, K. A. Porter, H. Hsu-Kim, D. E. Hinton, and R. T. Di Giulio. 2014. Bioaccumulation and speciation of selenium in fish and insects collected from a mountaintop removal coal mining-impacted stream in West Virginia. Ecotoxicology 23:929-938.

Auguet, J. C., A. Barberan, and E. O. Casamayor. 2010. Global ecological patterns in uncultured Archaea. ISME J 4:182-190.

Baker, B. J., and J. F. Banfield. 2003. Microbial communities in acid mine drainage. FEMS Microbiol Ecol 44:139-152.

Baker, B. J., G. W. Tyson, L. Goosherst, and J. F. Banfield. 2009. Insights into the diversity of eukaryotes in acid mine drainage biofilm communities. Appl Environ Microbiol 75:2192-2199.

Barbour, M. T., J. Gerritsen, B. D. Snyder, and J. B. Stribling. 1999. Rapid Bioassessment Protocols for Use in Streams and Wadeable Rivers: Periphyton, Benthic Macroinvertebrates, and Fish. Washington, D.C.

Bates, S. T., D. Berg-Lyons, J. G. Caporaso, W. A. Walters, R. Knight, and N. Fierer. 2011. Examining the global distribution of dominant archaeal populations in soil. ISME J 5:908-917.

Battin, T. J., K. Besemer, M. M. Bengtsson, A. M. Romani, and A. I. Packmann. 2016. The ecology and biogeochemistry of stream biofilms. Nature Reviews Microbiology:251-263.

Beevers, R. K. 2006. Sampling Strategies for Particle Filtering SLAM. Troy, NY.

Bell, T., J. A. Newman, B. W. Silverman, S. L. Turner, and A. K. Lilley. 2005. The contribution of species richness and composition to bacterial services. Nature 436:1157-1160.

Belser, L. W., and E. L. Mays. 1980. Specific inhibition of nitrite oxidation by chlorate and its use in assessing nitrification in soils and sediments. Appl Environ Microbiol 39:505-510.

Bernhardt, E. S., B. D. Lutz, R. S. King, J. P. Fay, C. E. Carter, A. M. Helton, D. Campagna, and J. Amos. 2012. How many mountains can we mine? Assessing the regional degradation of Central Appalachian rivers by surface coal mining. Environ Sci Technol 46:8115-8122.

Bernhardt, E. S., and M. A. Palmer. 2011. The environmental costs of mountaintop mining valley fill operations for aquatic ecosystems of the Central Appalachians. Ann NY Acad Sci 1223:39-57.

Besemer, K., H. Peter, J. B. Logue, S. Langenheder, E. S. Lindstrom, L. J. Tranvik, and T. J. Battin. 2012. Unraveling assembly of stream biofilm communities. ISME J 6:1459-1468.

Besemer, K., G. Singer, C. Quince, E. Bertuzzo, W. Sloan, and T. J. Battin. 2013. Headwaters are critical reservoirs of microbial diversity for fluvial networks. Proc Biol Sci 280:20131760.

Bier, R. L., E. S. Bernhardt, C. M. Boot, E. B. Graham, E. K. Hall, J. T. Lennon, D. R. Nemergut, B. B. Osborne, C. Ruiz-Gonzalez, J. P. Schimel, M. P. Waldrop, and M. D. Wallenstein. 2015a. Linking microbial community structure and microbial processes: an empirical and conceptual overview. FEMS Microbiol Ecol 91:fiv113.

Bier, R. L., K. A. Voss, and E. S. Bernhardt. 2015b. Bacterial community responses to a gradient of alkaline mountaintop mine drainage in Central Appalachian streams. ISME J 9:1378-1390.

Blagodatskaya, E., and Y. Kuzyakov. 2013. Active microorganisms in soil: Critical review of estimation criteria and approaches. Soil Biology and Biochemistry 67:192-211.

Blazewicz, S. J., R. L. Barnard, R. A. Daly, and M. K. Firestone. 2013. Evaluating rRNA as an indicator of microbial activity in environmental communities: limitations and uses. ISME J 7:2061-2068.

Bouskill, N. J., J. Barker-Finkel, T. S. Galloway, R. D. Handy, and T. E. Ford. 2010. Temporal bacterial diversity associated with metal-contaminated river sediments. Ecotoxicology 19:317-328.

Bouskill, N. J., J. Tang, W. J. Riley, and E. L. Brodie. 2012. Trait-based representation of biological nitrification: model development, testing, and predicted community composition. Front Microbiol 3:364.

Braker, G., J. Schwarz, and R. Conrad. 2010. Influence of temperature on the composition and activity of denitrifying soil communities. FEMS Microbiol Ecol 73:134-148.

Bray, J. R., and J. T. Curtis. 1957. An ordination of the upland forest communities of Southern Wisconsin. Ecological Monographs 27:325-349.

Brenzinger, K., P. Dorsch, and G. Braker. 2015. pH-driven shifts in overall and transcriptionally active denitrifiers control gaseous product stoichiometry in growth experiments with extracted bacteria from soil. Front Microbiol 6:961.

Burke, C., P. Steinberg, D. Rusch, S. Kjelleberg, and T. Thomas. 2011. Bacterial community assembly based on functional genes rather than species. Proceedings of the National Academy of Sciences 108:14288-14293.

Buttigieg, P. L., and A. Ramette. 2014. A Guide to Statistical Analysis in Microbial Ecology: a community-focused, living review of multivariate data analyses. FEMS Microbiol Ecol 90:543-550.

Caporaso, J. G., J. Kuczynski, J. Stombaugh, K. Bittinger, F. D. Bushman, E. K. Costello, et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7:335-336.

Carney, K. M., and P. A. Matson. 2005. Plant Communities, Soil Microorganisms, and Soil Carbon Cycling: Does Altering the World Belowground Matter to Ecosystem Functioning? Ecosystems 8:928-940.

Chao, A. 1984. Nonparametric estimation of the number of classes in a population. Scand J Stat 11.

Chen, J., K. Bittinger, E. S. Charlson, C. Hoffmann, J. Lewis, G. D. Wu, R. G. Collman, F. D. Bushman, and H. Li. 2012. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 28:2106–2113.

Cheng, Y., J. Wang, B. Mary, J.-b. Zhang, Z.-c. Cai, and S. X. Chang. 2013. Soil pH has contrasting effects on gross and net nitrogen mineralizations in adjacent forest and grassland soils in central Alberta, Canada. Soil Biology and Biochemistry 57:848-857.

Christophersen, C. T., M. Morrison, and M. A. Conlon. 2011. Overestimation of the abundance of sulfate-reducing bacteria in human feces by quantitative PCR targeting the Desulfovibrio 16S rRNA gene. Appl Environ Microbiol 77:3544-3546.

Comte, J., L. Fauteux, and P. A. Del Giorgio. 2013. Links between metabolic plasticity and functional redundancy in freshwater bacterioplankton communities. Front Microbiol 4:112.

Cox, M. P., D. A. Peterson, and P. J. Biggs. 2010. SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11:485.

De Beer, D., and P. Stoodley. 2006. Microbial biofilms. Pages 904-937 in M. Dworkin, S. Falkow, E. Rosenberg, K.-H. Schleifer, and E. Stackebrandt, editors. SIAM Review. Springer, New York, NY.

De Cáceres, M., and P. Legendre. 2009. Associations between species and groups of sites: indices and statistical inference. Ecology 90:3566-3574.

De Cáceres, M., P. Legendre, and M. Moretti. 2010. Improving indicator species analysis by combining groups of sites. Oikos 119:1674-1684.

de Vries, F. T., and A. Shade. 2013. Controls on soil microbial community stability under climate change. Front Microbiol 4:265.

Doak, D. F., D. Bigger, E. K. Harding, M. A. Marvier, R. E. O'Malley, and D. Thomson. 1998. The statistical inevitability of stability-diversity relationships in community ecology. Am Nat 151:264-276.

Dominguez, D. C., M. Guragain, and M. Patrauchan. 2015. Calcium binding proteins and calcium signaling in prokaryotes. Cell Calcium 57:151-165.

Doshi, S. M. 2006. Bioremediation of Acid Mine Drainage Using Sulfate-Reducing Bacteria. In N. N. o. E. M. Studies, editor. U.S. Environmental Protection Agency, Washington, DC.

Dufrene, M., and P. Legendre. 1997. Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecological Monographs 67:345-366.

Edgar, R. C. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26:2460-2461.

Edmonds, J. W., N. B. Weston, S. B. Joye, and M. A. Moran. 2008. Variation in prokaryotic community composition as a function of resource availability in tidal creek sediments. Appl Environ Microbiol 74:1836-1844.

Edmonds, J. W., N. B. Weston, S. B. Joye, X. Mou, and M. A. Moran. 2009. Microbial community response to seawater amendment in low-salinity tidal sediments. Microb Ecol 58:558-568.

Falkowski, P. G., T. Fenchel, and E. F. Delong. 2008. The microbial engines that drive Earth's biogeochemical cycles. Science 320:1034-1039.

Feris, K., P. Ramsey, C. Frazar, J. N. Moore, J. E. Gannon, and W. E. Holben. 2003. Differences in hyporheic-zone microbial community structure along a heavy-metal contamination gradient. FEMS Microbiol Ecol 69:5563-5573.

Feris, K. P., P. W. Ramsey, C. Frazar, M. Rillig, J. N. Moore, J. E. Gannon, and W. E. Holben. 2009. Hyporheic microbial community development is a sensitive indicator of metal contamination. Environmental Science and Technology 43:6158-6163.

Fierer, N., and R. B. Jackson. 2006. The diversity and biogeography of soil bacterial communities. Proceedings of the National Academy of Sciences 103:616-631.

Fierer, N., J. L. Morse, S. T. Berthrong, E. S. Bernhardt, and R. B. Jackson. 2007. Environmental controls on the landscape-scale biogeography of stream bacterial communities. Ecology 99.

Fierer, N., J. P. Schimel, and P. A. Holden. 2003. Variations in microbial community composition through two soil depth profiles. Soil Biol Biochem 35:167-176.

Fortunato, C. S., A. Eiler, L. Herfort, J. A. Needoba, T. D. Peterson, and B. C. Crump. 2013. Determining indicator taxa across spatial and seasonal gradients in the Columbia River coastal margin. ISME J 7:1899-1911.

Fuhrman, J. A. 2009. Microbial community structure and its functional implications. Nature 459:193-199.

Fukami, T., T. Martijn Bezemer, S. R. Mortimer, and W. H. Putten. 2005. Species divergence and trait convergence in experimental plant community assembly. Ecology Letters 8:1283-1290.

Ganguly, S., and B. B. Jana. 2002. Cadmium induced adaptive responses of certain biogeochemical cycling bacteria in an aquatic system. Water Research 36:1667-1676.

Garrity, G. 2005. The Proteobacteria. Page 1106 in G. Garrity, D. J. Brenner, N. R. Krieg, and J. R. Staley, editors. Bergey’s Manual of Systematic Bacteriology. Springer, New York, NY.

Gates, A. J., V. M. Luque-Almagro, A. D. Goddard, S. J. Ferguson, M. D. Roldan, and D. J. Richardson. 2011. A composite biochemical system for bacterial nitrate and nitrite assimilation as exemplified by Paracoccus denitrificans. Biochem J 435:743-753.

Geisseler, D., and K. M. Scow. 2014. Long-term effects of mineral fertilizers on soil microorganisms – A review. Soil Biology and Biochemistry 75:54-63.

Giller, K. E., E. Witter, and S. P. McGrath. 2009. Heavy metals and soil microbes. Soil Biology and Biochemistry 41:2031-2037.

Gomez-Alvarez, V., T. K. Teal, and T. M. Schmidt. 2009. Systematic artifacts in metagenomes from complex microbial communities. ISME J 3:1314-1317.

Goslee, A. S., and D. Urban. 2007. R package "ecodist".

Grace, J. B., T. M. Anderson, H. Olff, and S. M. Scheiner. 2010. On the specification of structural equation models for ecological systems. Ecological Monographs 80:67-87.

Graham, E. B., J. E. Knelman, A. Schindlbacher, S. Siciliano, M. Breulmann, A. Yannarell, J. M. Beman, G. Abell, L. Philippot, J. Prosser, A. Foulquier, J. C. Yuste, H. C. Glanville, D. L. Jones, R. Angel, J. Salminen, R. J. Newton, H. Burgmann, L. J. Ingram, U. Hamer, H. M. Siljanen, K. Peltoniemi, K. Potthast, L. Baneras, M. Hartmann, S. Banerjee, R. Q. Yu, G. Nogaro, A. Richter, M. Koranda, S. C. Castle, M. Goberna, B. Song, A. Chatterjee, O. C. Nunes, A. R. Lopes, Y. Cao, A. Kaisermann, S. Hallin, M. S. Strickland, J. Garcia-Pausas, J. Barba, H. Kang, K. Isobe, S. Papaspyrou, R. Pastorelli, A. Lagomarsino, E. S. Lindstrom, N. Basiliko, and D. R. Nemergut. 2016. Microbes as Engines of Ecosystem Function: When Does Community Structure Enhance Predictions of Ecosystem Processes? Front Microbiol 7:214.

Graham, E. B., W. R. Wieder, J. W. Leff, S. R. Weintraub, A. R. Townsend, C. C. Cleveland, L. Philippot, and D. R. Nemergut. 2014. Do we need to understand microbial communities to predict ecosystem function? A comparison of statistical models of nitrogen cycling processes. Soil Biology and Biochemistry 68:279-282.

Griffith, M. B., S. B. Norton, L. C. Alexander, A. I. Pollard, and S. D. LeDuc. 2012. The effects of mountaintop mines and valley fills on the physicochemical quality of stream ecosystems in the central Appalachians: a review. Sci Total Environ 417-418:1-12.

Griffiths, R. I., B. C. Thomson, P. James, T. Bell, M. Bailey, and A. S. Whiteley. 2011. The bacterial biogeography of British soils. Environ Microbiol 13:1642-1654.

Hao, O. J., L. Huang, J. M. Chen, and R. L. Buglass. 1994. Effects of metal additions on sulphate reduction activity in wastewaters. Toxicology and Environmental Chemistry 46:197-212.

Hedin, L. O., J. C. von Fischer, N. E. Ostrom, B. P. Kennedy, M. G. Brown, and G. P. Robertson. 1998. Thermodynamic constraints on nitrogen transformations and other biogeochemical processes at soil-stream interfaces. Ecology 79:684–703.

Hemme, C. L., Y. Deng, T. J. Gentry, M. W. Fields, L. Wu, S. Barua, K. Barry, S. G. Tringe, D. B. Watson, Z. He, T. C. Hazen, J. M. Tiedje, E. M. Rubin, and J. Zhou. 2010. Metagenomic insights into evolution of a heavy metal-contaminated groundwater microbial community. ISME J 4:660-672.

Hering, D., A. Buffagni, O. Moog, S. L, M. Sommerhauser, I. Stubauer, et al. 2003. The development of a system to assess the ecological quality of streams based on Macroinvertebrates – design of the sampling programme within the AQEM project. Int Rev Hydrobiol 88:345–361.

Herlemann, D. P., M. Labrenz, K. Jurgens, S. Bertilsson, J. J. Waniek, and A. F. Andersson. 2011. Transitions in bacterial communities along the 2000 km salinity gradient of the Baltic Sea. ISME J 5:1571-1579.

Herold, M. B., E. M. Baggs, and T. J. Daniell. 2012. Fungal and bacterial denitrification are differently affected by long-term pH amendment and cultivation of arable soil. Soil Biology and Biochemistry 54:25-35.

Hesselsoe, M., S. Fureder, M. Schloter, L. Bodrossy, N. Iversen, P. Roslev, P. H. Nielsen, M. Wagner, and A. Loy. 2009. Isotope array analysis of Rhodocyclales uncovers functional redundancy and versatility in an activated sludge. ISME J 3:1349-1364.

Hill, M. O. 1973. Diversity and evenness: a unifying notation and its consequences. Ecology 54:427-432.

Janssens, I. A., W. Dieleman, S. Luyssaert, J. A. Subke, M. Reichstein, R. Ceulemans, P. Ciais, A. J. Dolman, J. Grace, G. Matteucci, D. Papale, S. L. Piao, E. D. Schulze, J. Tang, and B. E. Law. 2010. Reduction of forest soil respiration in response to nitrogen deposition. Nature Geoscience 3:315-322.

Jin, R. C., P. Zheng, Q. Mahmood, and B. L. Hu. 2007. Osmotic stress on nitrification in an airlift bioreactor. J Hazard Mater 146:148-154.

Johnsen, K., C. Jacobsen, V. Torsvik, and J. Sørensen. 2001. Pesticide effects on bacterial diversity in agricultural soils - a review. Pages 443-453 Biology and Fertility of Soils.

Jones, S. E., and J. Lennon. 2010. Dormancy contributes to the maintenance of microbial diversity. Proceedings of the National Academy of Sciences 107:5881-5886.

Kalyuzhnaya, M. G., W. Martens-Habbena, T. Wang, M. Hackett, S. M. Stolyar, D. A. Stahl, M. E. Lidstrom, and L. Chistoserdova. 2009. Methylophilaceae link methanol oxidation to denitrification in freshwater lake sediment as suggested by stable isotope probing and pure culture analysis. Environ Microbiol Rep 1:385-392.

Kapoor, V., X. Li, M. Elk, K. Chandran, C. A. Impellitteri, and J. W. Santo Domingo. 2015. Impact of Heavy Metals on Transcriptional and Physiological Activity of Nitrifying Bacteria. Environ Sci Technol 49:13454-13462.

Kempf, B., and E. Bremer. 1995. OpuA, an osmotically regulated binding protein-dependent transport system for the osmoprotectant glycine betaine in Bacillus subtilis. J Biol Chem 270:16701-16713.

Kim, H., D. Kaown, B. Mayer, J. Y. Lee, Y. Hyun, and K. K. Lee. 2015. Identifying the sources of nitrate contamination of groundwater in an agricultural area (Haean basin, Korea) using isotope and microbial community analyses. Sci Total Environ 533:566-575.

Kimes, N. E., A. V. Callaghan, J. M. Suflita, and P. J. Morris. 2014. Microbial transformation of the Deepwater Horizon oil spill-past, present, and future perspectives. Front Microbiol 5:603.

Kirby, C. S., and C. A. Cravotta. 2005. Net alkalinity and net acidity 2: Practical considerations. Applied Geochemistry 20:1941-1964.

Klee, R. J., and T. E. Graedel. 2004. ELEMENTAL CYCLES: A Status Report on Human or Natural Dominance. Annual Review of Environment and Resources 29:69-107.

Kolb, S., C. Knief, S. Stubner, and R. Conrad. 2003. Quantitative Detection of Methanotrophs in Soil by Novel pmoA-Targeted Real-Time PCR Assays. Applied and Environmental Microbiology 69:2423-2429.

Konneke, M., A. E. Bernhard, J. R. de la Torre, C. B. Walker, J. B. Waterbury, and D. A. Stahl. 2005. Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437:543-546.

Konopka, A. 2009. What is microbial community ecology? ISME J 3:1223-1230.

Kramer, G. F., and B. N. Ames. 1988. Mechanisms of mutagenicity and toxicity of sodium selenite (Na2SeO3) in Salmonella typhimurium. Mutation Research 201:169-180.

Krause, S., X. Le Roux, P. A. Niklaus, P. M. Van Bodegom, J. T. Lennon, S. Bertilsson, H. P. Grossart, L. Philippot, and P. L. Bodelier. 2014. Trait-based approaches for understanding microbial biodiversity and ecosystem functioning. Front Microbiol 5:251.

Kremen, C. 2005. Managing ecosystem services: what do we need to know about their ecology? Ecol Lett 8:468-479.

Kuang, J. L., L. N. Huang, L. X. Chen, Z. S. Hua, S. J. Li, M. Hu, J. T. Li, and W. S. Shu. 2013. Contemporary environmental variation determines microbial diversity patterns in acid mine drainage. ISME J 7:1038-1050.

Kuczynski, J., Z. Liu, C. Lozupone, D. McDonald, N. Fierer, and R. Knight. 2010. Microbial community resemblance methods differ in their ability to detect biologically relevant patterns. Nat Methods 7:813-819.

Lami, R., L. C. Jones, M. T. Cottrell, B. J. Lafferty, M. Ginder-Vogel, D. L. Sparks, et al. 2013. Arsenite modifies structure of soil microbial communities and arsenite oxidization potential. FEMS Microbiol Ecol 84:270-279.

Langille, M. G., J. Zaneveld, J. G. Caporaso, D. McDonald, D. Knights, J. A. Reyes, J. C. Clemente, D. E. Burkepile, R. L. Vega Thurber, R. Knight, R. G. Beiko, and C. Huttenhower. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol 31:814-821.

Lauber, C. L., M. Hamady, R. Knight, and N. Fierer. 2009. Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Appl Environ Microbiol 75:5111-5120.

Lear, G., D. Niyogi, J. Harding, Y. Dong, and G. Lewis. 2009. Biofilm bacterial community structure in streams affected by acid mine drainage. Applied and Environmental Microbiology 75:3455-3460.

Lennon, J., Z. Aanderud, B. K. Lehmkuhl, and D. R. Schoolmaster. 2012. Mapping the niche space of soil microorganisms using taxonomy and traits. Ecology 93:1867-1879.

Lennon, J., and S. E. Jones. 2011. Microbial seed banks: the ecological and evolutionary implications of dormancy. Nature Reviews Microbiology 9:119-130.

Levine, U. Y., T. K. Teal, G. P. Robertson, and T. M. Schmidt. 2011. Agriculture's impact on microbial diversity and associated fluxes of carbon dioxide and methane. ISME J 5:1683-1691.

Lindberg, T. T., E. S. Bernhardt, R. L. Bier, A. M. Helton, R. B. Merola, A. Vengosh, et al. 2011. Cumulative impacts of mountaintop mining on an Appalachian watershed. Proceedings of the National Academy of Sciences 108:20929-20934.

Lindsay, M. B. J., D. W. Blowes, P. D. Condon, and C. J. Ptacek. 2009a. Managing pore-water quality in mine tailings by inducing microbial sulfate reduction. Environ Sci Technol 43:7086-7091.

Lindsay, M. B. J., P. D. Condon, J. L. Jambor, K. G. Lear, D. W. Blowes, and C. J. Ptacek. 2009b. Mineralogical, geochemical, and microbial investigation of a sulfide-rich tailings deposit characterized by neutral drainage. Applied Geochemistry 24:2212-2221.

Lindsay, M. B. J., K. D. Wakeman, O. F. Rowe, B. M. Grail, C. J. Ptacek, D. W. Blowes, and D. B. Johnson. 2011. Microbiology and Geochemistry of Mine Tailings Amended with Organic Carbon for Passive Treatment of Pore Water. Geomicrobiology Journal 28:229-241.

Lindstrom, E. S., and S. Langenheder. 2012. Local and regional factors influencing bacterial community assembly. Environ Microbiol Rep 4:1-9.

Liu, B., P. T. Morkved, A. Frostegard, and L. R. Bakken. 2010. Denitrification gene pools, transcription and kinetics of NO, N2O and N2 production as affected by soil pH. FEMS Microbiol Ecol 72:407-417.

Logan, M. V., K. F. Reardon, L. A. Figueroa, J. E. McLain, and D. M. Ahmann. 2005. Microbial community activities during establishment, performance, and decline of bench-scale passive treatment systems for mine drainage. Water Res 39:4537-4551.

Logares, R., E. S. Lindstrom, S. Langenheder, J. B. Logue, H. Paterson, J. Laybourn-Parry, K. Rengefors, L. Tranvik, and S. Bertilsson. 2013. Biogeography of bacterial communities exposed to progressive long-term environmental change. ISME J 7:937-948.

Logue, J. B., S. E. Findlay, and J. Comte. 2015. Editorial: Microbial Responses to Environmental Changes. Front Microbiol 6:1364.

Lozupone, C. A., M. Hamady, S. T. Kelley, and R. Knight. 2007. Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Appl Environ Microbiol 73:1576-1585.

Lozupone, C. A., and R. Knight. 2007. Global patterns in bacterial diversity. Proceedings of the National Academy of Sciences 104:11436-11440.

Marquez, B. 2005. Bacterial efflux systems and efflux pumps inhibitors. Biochimie 87:1137-1147.

Martiny, A. C., K. Treseder, and G. Pusch. 2013. Phylogenetic conservatism of functional traits in microorganisms. ISME J 7:830-838.

Matsutani, N., T. Nakagawa, K. Nakamura, R. Takahashi, K. Yoshihara, and T. Tokuyama. 2011. Enrichment of a Novel Marine Ammonia-Oxidizing Archaeon Obtained from Sand of an Eelgrass Zone. Microbes and Environments 26:23-29.

Maurice, C. F., H. J. Haiser, and P. J. Turnbaugh. 2013. Xenobiotics shape the physiology and gene expression of the active human gut microbiome. Cell 152:39-50.

McCullagh, P., and J. Nelder. 1989. Generalized Linear Models. 2nd edition. Chapman and Hall/CRC, Boca Raton, FL.

McCune, B., and M. J. Mefford. PC-ORD. Multivariate analysis of ecological data. Gleneden Beach, OR, MjM Software.

McGuire, K. L., and K. Treseder. 2010. Microbial communities and their relevance for ecosystem models: decomposition as a case study. Soil Biol Biochem 42:529-535.

Meron, D., R. Rodolfo-Metalpa, R. Cunning, et al. 2012. Changes in coral microbial communities in response to a natural pH gradient. ISME J 6:1775-1785.

Meyer, F., D. Paarmann, M. D'Souza, R. Olson, E. M. Glass, M. Kubal, T. Paczian, A. Rodriguez, R. Stevens, A. Wilke, J. Wilkening, and R. A. Edwards. 2008. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9:386.

Mohanakrishnan, J., M. V. Kofoed, J. Barr, Z. Yuan, A. Schramm, and R. L. Meyer. 2011. Dynamic microbial response of sulfidogenic wastewater biofilm to nitrate. Applied Microbiology and Biotechnology 91:1647–1657.

Moorhead, D. L., and R. L. Sinsabaugh. 2006. A theoretical model of litter decay and microbial interaction. Ecological Monographs 76:151-174.

Nemergut, D. R., S. K. Schmidt, T. Fukami, S. P. O'Neill, T. M. Bilinski, L. F. Stanish, J. E. Knelman, J. L. Darcy, R. C. Lynch, P. Wickey, and S. Ferrenberg. 2013. Patterns and processes of microbial community assembly. Microbiol Mol Biol Rev 77:342-356.

Nemergut, D. R., A. Shade, and C. Violle. 2014. When, where and how does microbial community composition matter? Front Microbiol 5:497.

Nie, M., E. Pendall, C. Bell, C. K. Gasch, S. Raut, S. Tamang, and M. D. Wallenstein. 2013. Positive climate feedbacks of soil microbial communities in a semi-arid grassland. Ecol Lett 16:234-241.

Niyogi, D. K., M. Koren, C. J. Arbuckle, and C. R. Townsend. 2007. Stream communities along a catchment land-use gradient: subsidy-stress responses to pastoral development. Environ Manage 39:213-225.

Oakley, B. B., F. Carbonero, S. E. Dowd, R. J. Hawkins, and K. J. Purdy. 2011. Contrasting patterns of niche partitioning between two anaerobic terminal oxidizers of organic matter. ISME J 6:905-914.

Odum, E. P., J. T. Finn, and E. H. Franz. 1979. Perturbation theory and the subsidy-stress gradient. BioScience 29:349-352.

Oksanen, A. J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O'Hara, and H. Wagner. 2011. Vegan: Community Ecology Package.

Oremland, R., J. T. Hollibaugh, A. S. Maest, T. S. Presser, L. G. Miller, and C. W. Culbertson. 1989. Selenate reduction to elemental selenium by anaerobic bacteria in sediments and culture: biogeochemical significance of a novel, sulfate independent respiration. Appl Environ Microbiol 55:2333-2343.

Oremland, R., J. Switzer Blum, A. Burns Bindi, P. R. Dowdle, M. Herbel, and J. F. Stolz. 1999. Simultaneous reduction of nitrate and selenate by cell suspensions of selenium-respiring bacteria. Appl Environ Microbiol 65:4385-4392.

Oremland, R. S., and S. Polcin. 1982. Methanogenesis and sulfate reduction: competitive and noncompetitive substrates in estuarine sediments. Appl Environ Microbiol 44:1270-1276.

Paerl, H. W., J. Dyble, P. H. Moisander, R. T. Noble, M. F. Piehler, J. L. Pinckney, T. F. Steppe, L. Twomey, and L. M. Valdes. 2003. Microbial indicators of change: current applications to eutrophication studies. FEMS Microbiology Ecology 46:233-246.

Palmer, M. A., E. S. Bernhardt, W. H. Schlesinger, K. N. Eshleman, E. Foufoula-Georgiou, M. S. Hendryx, et al. 2010. Mountaintop mining consequences. Science 327:148-149.

Pereira, L. B., R. Vicentini, and L. M. Ottoboni. 2014. Changes in the bacterial community of soil from a neutral mine drainage channel. PLoS One 9:e96605.

Pereira, L. B., R. Vicentini, and L. M. Ottoboni. 2015. Characterization of the core microbiota of the drainage and surrounding soil of a Brazilian copper mine. Genet Mol Biol 38:484-489.

Petersen, D. G., S. J. Blazewicz, M. Firestone, D. J. Herman, M. Turetsky, and M. Waldrop. 2012. Abundance of microbial genes associated with nitrogen cycling as indices of biogeochemical process rates across a vegetation gradient in Alaska. Environ Microbiol 14:993-1008.

Philippot, L., S. G. Andersson, T. J. Battin, J. I. Prosser, J. P. Schimel, W. B. Whitman, and S. Hallin. 2010. The ecological coherence of high bacterial taxonomic ranks. Nat Rev Microbiol 8:523-529.

Pierre, W. H. 1928. Nitrogenous fertilizers and soil acidity: I. Effect of various nitrogenous fertilizers on soil reaction. J Am Soc Agron 20.

Poisot, T., B. Pequin, and D. Gravel. 2013. High-throughput sequencing: a roadmap toward community ecology. Ecol Evol 3:1125-1139.

Pond, G. J. 2010. Patterns of Ephemeroptera taxa loss in Appalachian headwater streams (Kentucky, USA). Hydrobiologia 641:185-201.

Pond, G. J. 2012. Biodiversity loss in Appalachian headwater streams (Kentucky, USA): Plecoptera and Trichoptera communities. Hydrobiologia 679:97-117.

Pond, G. J., M. E. Passmore, F. A. Borsuk, L. Reynolds, and C. J. Rose. 2008. Downstream effects of mountaintop coal mining: comparing biological conditions using family- and genus-level macroinvertebrate bioassessment tools. Journal of the North American Benthological Society 27:717-737.

Portillo, M. C., S. P. Anderson, and N. Fierer. 2012. Temporal variability in the diversity and composition of stream bacterioplankton communities. Environ Microbiol 14:2417-2428.

Prosser, J. 2013. Think before you sequence. Nature 494:41.

Prosser, J. I. 2010. Replicate or lie. Environ Microbiol 12:1806-1810.

Prosser, J. I. 2012. Ecosystem processes and interactions in a morass of diversity. FEMS Microbiol Ecol 81:507-519.

Prosser, J. I., B. J. Bohannan, T. P. Curtis, R. J. Ellis, M. Firestone, R. P. Freckleton, J. Green, L. E. Green, K. Killham, J. J. Lennon, A. M. Osborn, M. Solan, C. van der Gast, and P. W. Young. 2007. The role of ecological theory in microbial ecology. Nature Reviews Microbiology 5:382-392.

Ramirez, K. S., J. M. Craine, and N. Fierer. 2010. Nitrogen fertilization inhibits soil microbial respiration regardless of the form of nitrogen applied. Soil Biology and Biochemistry 42:2336-2338.

Raskin, L., B. E. Rittmann, and D. A. Stahl. 1996. Competition and coexistence of sulfate-reducing and methanogenic populations in anaerobic biofilms. Appl Environ Microbiol 62:3847-3857.

Reed, H. E., and J. B. Martiny. 2007. Testing the functional significance of microbial composition in natural communities. FEMS Microbiol Ecol 62:161-170.

Reed, H. E., and J. B. Martiny. 2013. Microbial composition affects the functioning of estuarine sediments. ISME J 7:868-879.

Rocca, J. D., E. K. Hall, J. T. Lennon, S. E. Evans, M. P. Waldrop, J. B. Cotner, D. R. Nemergut, E. B. Graham, and M. D. Wallenstein. 2015. Relationships between protein-encoding gene abundance and corresponding process are commonly assumed yet rarely observed. ISME J 9:1693-1699.

Rose, A. W., and C. A. Cravotta. 1998. Geochemistry of coal mine drainage. Pages 1-22 in K. B. C. Brady, M. W. Smith, and J. Schueck, editors. Coal Mine Drainage Prediction and Pollution Prevention in Pennsylvania. Pennsylvania Department of Environmental Protection, Harrisburg, PA.

Rousk, J., E. Baath, P. C. Brookes, C. L. Lauber, C. Lozupone, J. G. Caporaso, R. Knight, and N. Fierer. 2010. Soil bacterial and fungal communities across a pH gradient in an arable soil. ISME J 4:1340-1351.

Ruiz-González, C., T. Lefort, R. Massana, R. Simó, and J. M. Gasol. 2012. Diel changes in bulk and single-cell bacterial heterotrophic activity in winter surface waters of the northwestern Mediterranean Sea. Limnology and Oceanography 57:29-42.

Ruiz-González, C., J. P. Niño-García, J.-F. Lapierre, and P. A. del Giorgio. 2015. The quality of organic matter shapes the functional biogeography of bacterioplankton across boreal freshwater ecosystems. Global Ecology and Biogeography 24:1487-1498.

Sabater, S., H. Guasch, M. Ricart, A. Romani, G. Vidal, C. Klunder, and M. Schmitt-Jansen. 2007. Monitoring the effect of chemicals on biological communities. The biofilm as an interface. Anal Bioanal Chem 387:1425-1434.

Sabaty, M., C. Avazeri, D. Pignol, and A. Vermeglio. 2001. Characterization of the reduction of selenate and tellurite by nitrate reductases. Appl Environ Microbiol 67:5122–5126.

Schimel, J. P. 1995. Ecosystem consequences of microbial diversity and community structure. Pages 239-254 in F. S. Chapin and C. Korner, editors. Arctic and Alpine Biodiversity: Patterns, Causes, and Ecosystem Consequences. Springer, Berlin.

Schimel, J. P., J. Bennett, and N. Fierer. 2005. Microbial community composition and soil N cycling: is there really a connection? Pages 171-188 in R. D. Bardgett, D. W. Hopkins, and M. B. Usher, editors. Biological Diversity and Function in Soils. Cambridge University Press, New York, NY.

Schimel, J. P., and S. M. Schaeffer. 2012. Microbial control over carbon cycling in soil. Front Microbiol 3:348.

Schroeder, I., S. Rech, T. Krafft, and J. M. Macy. 1997. Purification and characterization of the selenate reductase from Thauera selenatis. J Biol Chem 272:23675–23678.

Shade, A., J. G. Caporaso, J. Handelsman, R. Knight, and N. Fierer. 2013. A meta-analysis of changes in bacterial and archaeal communities with time. ISME J 7:1493-1506.

Shade, A., H. Peter, S. D. Allison, D. L. Baho, M. Berga, H. Burgmann, D. H. Huber, S. Langenheder, J. T. Lennon, J. B. Martiny, K. L. Matulich, T. M. Schmidt, and J. Handelsman. 2012a. Fundamentals of microbial community resistance and resilience. Front Microbiol 3:417.

Shade, A., J. S. Read, N. D. Youngblut, N. Fierer, R. Knight, T. K. Kratz, N. R. Lottig, E. E. Roden, E. H. Stanley, J. Stombaugh, R. J. Whitaker, C. H. Wu, and K. D. McMahon. 2012b. Lake microbial communities are resilient after a whole-ecosystem disturbance. ISME J 6:2153-2167.

Sims, A., Y. Zhang, S. Gajaraj, P. B. Brown, and Z. Hu. 2013. Toward the development of microbial indicators for wetland assessment. Water Res 47:1711-1725.

Smets, B. F., and T. Barkay. 2005. Horizontal gene transfer: perspectives at a crossroads of scientific disciplines. Nat Rev Microbiol 3:675-678.

Smith, M. B., A. M. Rocha, C. S. Smillie, S. W. Olesen, C. Paradis, L. Wu, J. H. Campbell, J. L. Fortney, T. L. Mehlhorn, K. A. Lowe, J. E. Earles, J. Phillips, S. M. Techtmann, D. C. Joyner, D. A. Elias, K. L. Bailey, R. A. Hurt, Jr., S. P. Preheim, M. C. Sanders, J. Yang, M. A. Mueller, S. Brooks, D. B. Watson, P. Zhang, Z. He, E. A. Dubinsky, P. D. Adams, A. P. Arkin, M. W. Fields, J. Zhou, E. J. Alm, and T. C. Hazen. 2015. Natural bacterial communities serve as quantitative geochemical biosensors. MBio 6:e00326-00315.

Solimini, A. G., G. Free, I. Donohue, K. Irvine, M. Pusch, B. Rossaro, et al. 2006. Using benthic macroinvertebrates to assess ecological status of lakes: current knowledge and way forward to support WFD implementation. Institute for Environment and Sustainability, European Commission.

Steinberg, L. M., and J. M. Regan. 2008. Phylogenetic comparison of the methanogenic communities from an acidic, oligotrophic fen and an anaerobic digester treating municipal wastewater sludge. Appl Environ Microbiol 74:6663-6671.

Sudduth, E. B. 2011. Effects of Urbanization on Stream Ecosystem Functions. Duke University, Durham, NC.

Sun, J., Z. Deng, and A. Yan. 2014. Bacterial multidrug efflux pumps: mechanisms, physiology and pharmacological exploitations. Biochem Biophys Res Commun 453:254-267.

Suzek, B. E., Y. Wang, H. Huang, P. B. McGarvey, C. H. Wu, and the UniProt Consortium. 2015. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31:926-932.

Suzuki, M. T., O. Beja, L. T. Taylor, and E. F. DeLong. 2001. Phylogenetic analysis of ribosomal RNA operons from uncultivated coastal marine bacterioplankton. Environ Microbiol 3:323-331.

Sverdrup, L. E., R. Linjordet, G. Strømman, S. B. Hagen, C. A. M. van Gestel, S. Frostegård, et al. 2006. Functional and community-level soil microbial responses to zinc addition may depend on test system biocomplexity. Chemosphere 65:1747-1754.

Tchobanoglous, G., F. L. Burton, and H. D. Stensel. 2003. Wastewater engineering: treatment and reuse. 4th edition. McGraw-Hill Education, New York, NY.

Telang, A. J., S. Ebert, J. M. Foght, D. W. S. Westlake, G. E. Jenneman, D. Gevertz, and G. Voordouw. 1997. Effect of nitrate injection on the microbial community in an oil field as monitored by reverse sample genome probing. Appl Environ Microbiol 63:1785-1793.

Todd-Brown, K. E. O., F. M. Hopkins, S. N. Kivlin, J. M. Talbot, and S. D. Allison. 2011. A framework for representing microbial decomposition in coupled climate models. Biogeochemistry 109:19-33.

Townsend, P. A., D. P. Helmers, C. C. Kingdon, B. E. McNeil, K. M. de Beurs, and K. N. Eshleman. 2009. Changes in the extent of surface mining and reclamation in the Central Appalachians detected using a 1976–2006 Landsat time series. Remote Sensing of Environment 113:62-72.

Treseder, K. 2008. Nitrogen additions and microbial biomass: a meta-analysis of ecosystem studies. Ecol Lett 11:1111-1120.

Treseder, K. K., T. C. Balser, M. A. Bradford, E. L. Brodie, E. A. Dubinsky, V. T. Eviner, K. S. Hofmockel, J. T. Lennon, U. Y. Levine, B. J. MacGregor, J. Pett-Ridge, and M. P. Waldrop. 2011. Integrating microbial ecology into ecosystem models: challenges and priorities. Biogeochemistry 109:7-18.

USEPA. 2011. The Effects of Mountaintop Mines and Valley Fills on Aquatic Ecosystems of the Central Appalachian Coalfields. U.S. Environmental Protection Agency, Washington, D.C.

USGS. Variously dated. National field manual for the collection of water-quality data. U.S. Geological Survey Techniques of Water-Resources Investigations. http://pubs.water.usgs.gov/twri9A

van der Heijden, M. G., R. D. Bardgett, and N. M. van Straalen. 2008. The unseen majority: soil microbes as drivers of plant diversity and productivity in terrestrial ecosystems. Ecol Lett 11:296-310.

Vishnivetskaya, T. A., J. J. Mosher, A. V. Palumbo, Z. K. Yang, M. Podar, S. D. Brown, S. C. Brooks, B. Gu, G. R. Southworth, M. M. Drake, C. C. Brandt, and D. A. Elias. 2011. Mercury and other heavy metals influence bacterial community structure in contaminated Tennessee streams. Appl Environ Microbiol 77:302-311.

Voss, K. A. 2015. Linking Structural and Functional Responses to Land Cover Change in a River Network. Duke University.

Wagner, M., A. J. Roger, J. L. Flax, G. A. Brusseau, and D. A. Stahl. 1998. Phylogeny of dissimilatory sulfite reductases supports an early origin of sulfate respiration. Journal of Bacteriology 180:2975–2982.

Wakelin, S. A., M. J. Colloff, and R. S. Kookana. 2008. Effect of wastewater treatment plant effluent on microbial function and community structure in the sediment of a freshwater stream with variable seasonal flow. Appl Environ Microbiol 74:2659-2668.

Wallenstein, M. D., and E. K. Hall. 2011. A trait-based framework for predicting when and where microbial adaptation to climate change will affect ecosystem functioning. Biogeochemistry 109:35-47.

Wallenstein, M. D., S. McNulty, I. J. Fernandez, J. Boggs, and W. H. Schlesinger. 2006. Nitrogen fertilization decreases forest soil fungal and bacterial biomass in three long-term experiments. Forest Ecology and Management 222:459-468.

Walters, W. A., and R. Knight. 2014. Technology and techniques for microbial ecology via DNA sequencing. Ann Am Thorac Soc 11 Suppl 1:S16-20.

Wei, W., K. Isobe, T. Nishizawa, L. Zhu, Y. Shiratori, N. Ohte, K. Koba, S. Otsuka, and K. Senoo. 2015. Higher diversity and abundance of denitrifying microorganisms in environments than considered previously. ISME J 9:1954-1965.

Welch, W. J. 1993. How cells respond to stress. Scientific American:56-64.

Wen, L. L., C. Y. Lai, Q. Yang, J. X. Chen, Y. Zhang, A. Ontiveros-Valencia, and H. P. Zhao. 2016. Quantitative detection of selenate-reducing bacteria by real-time PCR targeting the selenate reductase gene. Enzyme Microb Technol 85:19-24.

Wertz, S., A. K. Leigh, and S. J. Grayston. 2012. Effects of long-term fertilization of forest soils on potential nitrification and on the abundance and community structure of ammonia oxidizers and nitrite oxidizers. FEMS Microbiol Ecol 79:142-154.

West, A. W., and G. P. Sparling. 1986. Modifications to the substrate-induced respiration method to permit measurement of microbial biomass in soils of differing water contents. J Microbiol Methods 5.

Wetzel, R. G. 1983. Limnology. 2nd edition. Saunders College Publishing, Philadelphia, PA.

Whittaker, R. H. 1956. Vegetation of the Great Smoky Mountains. Ecological Monographs 26:1-80.

Whittaker, R. J., K. J. Willis, and R. Field. 2001. Scale and species richness: towards a general, hierarchical theory of species diversity. Journal of Biogeography 28:453-470.

Widdel, F. 1992. The genus Desulfotomaculum. Pages 1792–1799 in A. Balows, H. G. Trüper, M. Dworkin, W. Harder, and K. H. Schleifer, editors. The Prokaryotes. Springer, New York, NY.

Widenfalk, A., S. Bertilsson, I. Sundh, and W. Goedkoop. 2008. Effects of pesticides on community composition and activity of sediment microbes--responses at various levels of microbial community organization. Environ Pollut 152:576-584.

Wieder, W. R., G. B. Bonan, and S. D. Allison. 2013. Global soil carbon projections are improved by modelling microbial processes. Nature Climate Change 3:909-912.

Williams, J. W., and S. Silver. 1984. Bacterial resistance and detoxification of heavy metals. Enzyme and microbial technology 6:530-537.

Wittebolle, L., M. Marzorati, L. Clement, A. Balloi, D. Daffonchio, K. Heylen, P. De Vos, W. Verstraete, and N. Boon. 2009. Initial community evenness favours functionality under selective stress. Nature 458:623-626.

WVDEP. 2009. Selenium bioaccumulation among select stream and lake fishes in West Virginia. Page 39 in West Virginia Department of Environmental Protection, editor. Charleston, WV.

Xie, X., M. Liao, A. Ma, and H. Zhang. 2011. Effects of contamination of single and combined cadmium and mercury on the soil microbial community structural diversity and functional diversity. Chinese Journal of Geochemistry 30:366-374.

Xu, M., Q. Zhang, C. Xia, Y. Zhong, G. Sun, J. Guo, T. Yuan, J. Zhou, and Z. He. 2014. Elevated nitrate enriches microbial functional genes for potential bioremediation of complexly contaminated sediments. ISME J 8:1932-1944.

Yee, N. 2011. Geomicrobiology of selenium: Life and death by selenite. Applied Geochemistry 26:S324-S325.

Yin, H., J. Niu, Y. Ren, J. Cong, X. Zhang, F. Fan, Y. Xiao, X. Zhang, J. Deng, M. Xie, Z. He, J. Zhou, Y. Liang, and X. Liu. 2015. An integrated insight into the response of sedimentary microbial communities to heavy metal contamination. Sci Rep 5:14266.

Zeglin, L. H. 2015. Stream microbial diversity in response to environmental changes: review and synthesis of existing research. Front Microbiol 6:454.

Biography

Raven Lee Bier was born May 9, 1985 in Pittsburgh, PA, USA. She received a Bachelor of Arts in Biology in 2007 from Carleton College, Northfield, MN, USA. After this degree, she held three research assistant positions. The first position was in Dr. Emily Bernhardt’s lab at Duke University, where she worked on projects in coastal biogeochemistry and soil-plant feedbacks. Second, she worked in Dr. David Moeller’s lab at the University of Georgia, where she assisted with a project in evolutionary and ecological genetics of Clarkia spp. Last, she oversaw a project examining the ecological effects of nanoTiO2 in Dr. Bradley Cardinale’s lab at the University of California, Santa Barbara. This last project prompted her to return to graduate school to seek a Ph.D. in the Duke University Program in Ecology with a focus on microbial community responses to contaminant perturbations. At Duke, Raven was a student in Dr. Emily Bernhardt’s lab. She was supported by a Science to Achieve Results fellowship from the US Environmental Protection Agency. To date she has co-authored three peer-reviewed articles related to her graduate work: Bacterial community responses to a gradient of alkaline mine drainage in Central Appalachian stream (2015), Linking microbial community structure and microbial processes: an empirical and conceptual overview (2015), and Cumulative impacts of mountaintop mining on an Appalachian watershed (2011).
