
MOLECULAR CHARACTERIZATION OF MICROBIAL

COMMUNITIES IN LAKE ERIE SEDIMENTS

Torey Looft

A Thesis

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

December 2005

Committee:

Juan L. Bouzat, Advisor

George Bullerjahn

Michael McKay

ABSTRACT

Juan L. Bouzat, Advisor

Microorganisms perform important roles in elemental cycling and organic decomposition, which are vital for ecosystems to function. Lake Erie offers a unique opportunity to study microbial communities across a large environmental gradient. Lake Erie consists of three basins and is affected by allochthonous inputs of dissolved organic matter (DOM) that increase to the west of the lake. In addition, the Central Basin of Lake Erie is characterized by a large area dominated by a Dead Zone, which experiences periodic hypoxic events. To evaluate patterns of microbial diversity, environmental samples from eleven sites were selected for PCR amplification, cloning and sequencing of 16S ribosomal DNA genes from microbial communities.

Samples included inshore sites from the Western, Central and Eastern Basins as well as from the Dead Zone of the Central Basin. DNA representing the microbial community was extracted directly from sediment samples and universal primers were designed to amplify a 370 bp region of the small subunit of the 16S rDNA gene. Characterization of DNA sequences was performed through sequence database searches and phylogenetic analyses of environmental DNA sequences, the latter using reference DNA sequences from Archaea and all major bacterial groups. These analyses were used to assign environmental sequences to specific taxonomic groups. Biodiversity indices (Berger-Parker number and Bray-Curtis cluster analysis) were calculated and measures of sequence diversity were obtained from inshore sites of the three basins and the Dead Zone of Lake Erie. Results from this study demonstrated considerable levels of spatial variability of microbial communities throughout Lake Erie. Characterized species included bacterial groups with diverse metabolic capabilities and key members involved in the cycling of nutrients. The relative preponderance of in the Western and Central Basins, but not in the Eastern Basin, may reflect the presumably widespread carbon substrate range found in the Western and Central Basins due to the greater number of allochthonous inputs of DOM. The Eastern and Central Basins showed similarities in the cluster analysis for species diversity, while the Dead Zone represented the most distinct group. These results are consistent with the idea that the Dead Zone’s unique conditions may lead to a unique microbial community structure. Although the presence/absence of some taxonomic groups revealed some patterns of spatial structuring, genetic distance calculations showed limited sequence differences between sites and groups.


ACKNOWLEDGMENTS

Many people made contributions to the completion of this project. I would like to thank Juan L. Bouzat for his advising, and Michael McKay and George Bullerjahn for completing my thesis committee.

I would like to thank David Porta for collecting the Lake Erie sediments used in this study. Matt Hoostal assisted with data analysis and provided helpful suggestions during the writing process. I would like to thank my other MOLECON cohorts: Jeremy Ross, Brian Roller, Bethany Swanson and Tim Herman for many helpful discussions. Thanks to Sandra Marcu for her grammatical edits; her English is better than mine.

I would like to thank my parents, Terry and Carole Looft, for their support. My parents’ encouragement through my long college career was much appreciated and they never let me forget that there is life outside college. Finally, I would like to thank the discipline of science and scientific principles for helping to explain how and why things happen.


TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

INTRODUCTION

MATERIALS AND METHODS

RESULTS

DISCUSSION

TABLES AND FIGURES

APPENDIX

REFERENCES


LIST OF TABLES

Table 1. Characterization of environmental DNA sequences based on the Ribosome Database Project Classifier.

Table 2. Rank Abundance table based on characterizations of environmental DNA sequences using the Ribosome Database Project Sequence Match.

Table 3. Berger-Parker Indexes of species dominance for the Western, Central, and Eastern Basins as well as the Dead Zone and sites associated with industrial inputs in Lake Erie.

Table 4. Characterization of unclassified sequences, cross-searched with the RDP Sequence Match.

Table 5. Characterization of DNA sequences based on phylogenetic analysis.

Table 6. Neighbor-Joining distances between sampling sites estimated using the Kimura-2 Parameter model of evolution and estimated standard errors based on 1050 bootstrap resampling of the data.

Table 7. Genetic distances within sampling sites and group averages estimated using the Kimura-2 Parameter model of evolution.

Table 8. Genetic distances between sampling group averages estimated using the Kimura-2 Parameter model of evolution.


LIST OF FIGURES

Figure 1. Map and cross-section of Lake Erie.

Figure 2. Map of Lake Erie showing location of sampling stations. Pie charts represent results from the Ribosome Database Project Sequence Classifier for each sample group.

Figure 3. Bray-Curtis Cluster analysis trees based on diversity between sampled groups.

Figure 4. Bray-Curtis Cluster analysis tree, based on species diversity between sampled groups.

Figure 5. Neighbor-Joining phylogenetic tree of all 16S rDNA sequences obtained in this study.

Figure 6. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Western Basin.

Figure 7. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Central Basin.

Figure 8. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Eastern Basin.

Figure 9. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Dead Zone.



INTRODUCTION

Microorganisms are important components of every ecosystem because they perform significant roles such as elemental cycling and organic decomposition. Microbial community processes include photosynthesis, N2 fixation, denitrification, sulfate reduction, methanogenesis, and metal reduction reactions (Paerl and Pickney, 1996). Many of these processes result in the detoxification of pesticides and other harmful contaminants that pose health risks to humans.

This wide range of metabolic plasticity allows microorganisms to inhabit many environments, including habitats under extreme environmental conditions (Head et al., 1998). Microorganisms play an indispensable role in the environment and many organisms are dependent on their relationships with microorganisms (Berman-Frank and Parker, 2003). An assessment of microbial community diversity and structure is therefore an important step for understanding the role of microbial communities in ecosystem function.

One of the longstanding problems in environmental microbiology is determining what species are present in a given environment (Kirk et al., 2004). The extent of microbial diversity is unknown since it is much greater than what is observed from studies that obtain microorganisms in culture from the environment under laboratory conditions. Given the limitations of culture-based methods to identify non-culturable organisms (Cho and Tiedje, 2000), it has been estimated that up to 99% of soil microorganisms are unidentified (Borneman et al., 1996). Microorganisms are not easy to characterize on the basis of morphology because few morphological traits are widespread enough to allow meaningful comparisons between distantly related microorganisms (Edwards, 2000). Many microbial species are characterized biochemically on the basis of their metabolic function. Metabolism varies with regard to the sources of energy microbial species use for assembling macromolecules and other cellular components as seen, for example, with sulfur-oxidizers (Dexter-Dyer, 2003). However, biochemical assays also have limitations since these often require laboratory culturing, which for most taxa is not applicable. Other factors affecting microbial diversity, such as interactions between microbes, dispersal rates, changing conditions, and evolutionary change, cannot be taken into account in the laboratory setting (Travisano and Rainey, 2000).

An alternative approach to the biochemical characterization of microbial species and communities has been the use of DNA molecular techniques (Kirk et al., 2004). Examination of 16S ribosomal DNA sequences can be used to identify organisms and establish evolutionary relationships among species (Gray and Herwig, 1996). Nucleic acid sequences can be used to characterize microorganisms based on genes that have been highly conserved through evolutionary time. Sequences within ribosomal DNA genes are ideal for species identification since ribosomal function is necessary for protein synthesis, which requires ribosomes to be structurally conserved to perform their essential role (Kirk et al., 2004). Ribosome sequence analysis was used by Carl Woese to demonstrate that all living organisms fall into one of three major domains of life: Bacteria, Archaea, and Eukarya (Woese et al., 1985). Phylogenies constructed using 16S rDNA sequence analysis ended the use of the old classification system, which was based on separating organisms as being either prokaryotes or eukaryotes (Woese, 1998). By comparing environmental sequences to those from well-characterized organisms, researchers can conduct ecological surveys without the need to culture microorganisms in the lab (Mikesell et al., 1993). Phylogenies, which are hypotheses about historical relationships among organisms, allow researchers to characterize unknown species from environmental samples and make inferences about ecological adaptations based on the properties of closely related species.

Microbial diversity can be a sensitive indicator of environmental pollution and change because microorganisms respond quickly to changes in the ecosystem (Seckbach and Kluwer,

2000). Since microbial communities are involved in many soil/sediment processes, they may also be used as a reliable indicator of ecosystem integrity. Functional adaptations of microbial communities to environmental factors, for example, can reflect environmental gradients and spatial heterogeneity regarding contaminants or dissolved organic matter (DOM) (Robertson et al., 1999). Although molecular characterizations of microbial communities have been previously performed (Kirstine, 2001; Muller et al., 2002; Borneman et al., 1996), these have been mainly

restricted to microbial communities from “single sites” or communities associated with specific environmental factors (e.g., sites with high concentrations of contaminants or hot springs). Less common are studies that evaluate microbial diversity throughout a large ecosystem, which may be associated with environmental gradients or spatial variation regarding levels of DOM inputs.

Due to the physical and chemical gradients found across large ecosystems, a variety of habitats

and micro-niches can potentially be inhabited by metabolically diverse microorganisms (Baross

and Deming, 1995). Little is known, however, about spatial patterns of microbial diversity

associated with local adaptations across major environmental gradients.

In this study, I used molecular and bioinformatics techniques to characterize patterns of microbial diversity in sediments across Lake Erie, one of the largest lake ecosystems in the world

(Figure 1). Lake Erie is the eleventh largest lake in the world by surface area (25,700 km²).

Lake Erie is naturally divided into three basins and has an average depth of about 19 m and a maximum depth of 64 m. Lake Erie is the shallowest of the North American Great Lakes


(http://www.great-lakes.net) as well as the smallest in volume. Additionally, it is exposed to the

greatest effects from urbanization and agriculture. The area around Lake Erie is intensively

farmed and receives water runoff from agricultural fields in southwestern Ontario and parts of

Ohio, Indiana and Michigan (Painter et al., 2001). As a consequence, the Lake Erie ecosystem exhibits spatial heterogeneity with regard to dissolved organic matter inputs and other

environmental contaminants. How these differences across the three major Lake Erie basins influence microbial communities is not clear.

In addition to the geophysical differences among the three basins, Lake Erie has both polluted and relatively pristine areas, creating an environmental gradient with increasing levels

of DOM and pollution towards the west (Painter et al., 2001). The western parts of the lake are typified by industrialized areas, with many tributaries leading to considerable amounts of inorganic, organic and contaminant inputs being loaded into the lake (Painter et al., 2001).

These inputs have led to many instances of contaminated sediment in the west (e.g., with PCBs and metals; Painter et al., 2001; Marvin et al., 2002). The West-East environmental gradient was identified by Painter et al. (2001) and Marvin et al. (2002), with surficial sediment in the West having higher concentrations of chromium, copper, nickel, lead, mercury, zinc, and PCBs compared to those found in the East. In addition, it was shown that phosphorus concentrations in Lake Erie

were higher in the Western and Central basins compared to the Eastern basin (Painter et al.,

2001). For example, the watersheds for the Western Basin represent about 50% of the crop-land

drainage into Lake Erie (Forster and Raush, 2002). These sources of DOM combined with a

large input of contaminants from the Detroit River result in an environment with the highest

organic content in the lake (Painter et al., 2001). In contrast, the Eastern Basin is very deep

(with maximum depths up to 60 m) and the surrounding area is not as industrialized as the


Western and Central Basins. As a consequence, this basin has not had serious issues regarding

sediment contamination and remains relatively pristine.

The Central Basin is the largest basin of Lake Erie, covering a large area from central Ohio to Erie, Pennsylvania. The Central Basin includes a large offshore area, known as the Dead Zone, which experiences annual hypoxia (Smith et al., 2004). The physical characteristics of the Central Basin are key to the Dead Zone formation (21 m deep, 18 m epilimnion, and 3 m hypolimnion). It is not fully understood what additional factors cause the appearance of the so-called Dead Zone, but conditions of low dissolved oxygen usually appear in late summer to early fall and have continued even though nutrient load reductions have been in effect for over two decades (Smith et al., 2004). The hypoxic conditions in the Dead Zone are expected to impact the types of microbial communities that exist within the sediment, which may be dominated by anaerobic microbes. Explanations for the occurrence of a Dead Zone have also focused on invasive species like the zebra mussel, lower water levels, and high phosphorus concentrations (Smith et al., 2004).

It is believed that anthropogenic activities, such as urban development, agriculture,

pollution, and use of pesticides also have significant effects on microbial communities (Kirk et

al., 2004). We therefore expect that environmental gradients of nutrient inputs would affect the spatial structure and function of microbial communities. Regions with a high diversity of DOM, for example, might exhibit higher levels of microbial diversity, reflecting a higher diversity of metabolic pathways associated with carbon assimilation. In contrast, areas with low nutrient levels and oxygen concentration, like that observed in the Dead Zone, would most likely be associated with anaerobic metabolism and lower activity. A similar pattern may be observed in

the Eastern Basin, which receives relatively fewer nutrient inputs than the Central and Western

Basins and, therefore, may exhibit lower bacterial diversity.

The overall goal of this thesis is to provide a molecular characterization of microbial communities from a large ecosystem. Environmental DNA and bioinformatics techniques were used to evaluate spatial patterns of sediment microbial diversity across the three major basins of

Lake Erie. Specifically, I evaluated the potential effects of environmental gradients regarding nutrient inputs and pollutants on microbial communities. I addressed three specific research questions:

1. Are there major differences in microbial diversity across the three major basins of Lake Erie?

2. Do sites with high input levels of DOM contain more diversity than sites receiving lower allochthonous inputs?

3. Are there unique characteristics of microbial communities associated with the Dead Zone located in the Central Basin of Lake Erie?

Results from this study provide an initial characterization of the sediment microbial diversity throughout the Lake Erie ecosystem. In addition, this study provides a preliminary analysis of spatial patterns of microbial communities associated with environmental gradients of dissolved organic matter and the unique ecological conditions of the Dead Zone.


MATERIALS AND METHODS

Sample collection

Sediment samples for the molecular characterization of microbial communities were

collected in September of 2002 during a cruise of the research vessel Lake Guardian (EPA,

http://www.epa.gov/greatlakes/monitoring/guard/ship.html). Surface sediments were collected

using a Ponar grab from 19 sample stations distributed throughout the three basins of Lake Erie

(Figure 2). Each sediment sample consisted of about 60 g wet weight of total sediment collected in a sterile 50 ml polypropylene tube. Sampling stations were distributed across the lake in four transects, which included one transect through each of the Western and Eastern Basins and two transects through the Central Basin (Figure 2). Samples were stored at -20ºC until environmental

DNA extractions were performed.

DNA Extractions

Three DNA extraction kits developed for soil samples were tested for DNA yield and quality of sediment environmental DNA. These kits included Fast DNA (Bio 101, La Jolla, CA),

Ultra Clean DNA (MO BIO Laboratories, Carlsbad, CA), and Bactozol (Molecular Research

Center, Cincinnati, OH). All three extraction kits were used following the manufacturers’

instructions. The quality and efficiency of the extraction protocols were evaluated by

electrophoresis of extracted DNA samples on 1% agarose gels, and the quality and quantity of

DNA was resolved by staining gels with ethidium bromide and visualization under ultraviolet

light (see Appendix for extraction protocols).

The Fast DNA Spin kit (Bio 101, La Jolla, CA) was selected for DNA extraction of

sediment samples because it yielded the most DNA with the least amount of DNA shearing,

which indicated less degradation, thus reducing the potential for chimera formation during PCR.


For each sample, total environmental DNA was extracted from 1 g of wet sediment. DNA

extractions were eluted in 50 µl of water and stored at -20ºC. The DNA was quantified and

confirmed using a light spectrophotometer and by gel electrophoresis. DNA concentrations

obtained from environmental DNA extractions were in the range of 20-50 ng/µl.

16S Ribosomal DNA Amplification, Cloning and Sequencing

PCR amplification of bacterial 16S ribosomal DNA was performed using total

environmental DNA samples extracted from Lake Erie sediment as template and with PCR

primers designed to amplify conserved regions of the 16S rDNA gene (Lane et al., 1985). The

universal primers selected to amplify a segment of bacterial 16S rDNA included: Com-1-

Forward (5’-CAGCAGCCGCGGTAATAC-3’) and Com-1-Reverse (5’-

CCGTCAATTCCTTTGAGTTT-3’) located at positions 519 to 536 and 907 to 926 of the

Escherichia coli genome, respectively (Lane et al., 1985). These primers have been shown to successfully amplify a 370 base-pair section located between two conserved regions of the 16S rDNA gene (Lane et al., 1985; Schwieger and Tebbe, 2000).
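As a quick check on the reported fragment size, the arithmetic below uses only the primer sequences and E. coli positions quoted above. It is an illustrative sketch; the variable and function names are my own and are not part of the original protocol.

```python
# Primer sequences and E. coli positions as quoted in the text (inclusive).
FORWARD_SEQ = "CAGCAGCCGCGGTAATAC"      # Com-1-Forward, positions 519-536
REVERSE_SEQ = "CCGTCAATTCCTTTGAGTTT"    # Com-1-Reverse, positions 907-926
FORWARD_POS = (519, 536)
REVERSE_POS = (907, 926)


def span_length(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


# The quoted coordinates should match the primer lengths.
assert span_length(*FORWARD_POS) == len(FORWARD_SEQ)
assert span_length(*REVERSE_POS) == len(REVERSE_SEQ)

# Region lying strictly between the two primer binding sites.
between_primers = span_length(FORWARD_POS[1] + 1, REVERSE_POS[0] - 1)

# Full PCR product, primers included.
full_product = span_length(FORWARD_POS[0], REVERSE_POS[1])

print(between_primers)  # 370
print(full_product)     # 408
```

Under these coordinates the 370 bp figure corresponds to the region between the primers, while the full product including both primers would be 408 bp.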

To evaluate patterns of microbial diversity, environmental samples from eleven sites were selected for PCR amplification, cloning and sequencing of 16S rDNA genes. Samples included sites from Western, Central and Eastern Basins. Sediment samples taken from the

Western Basin included samples from stations S-61 and Sandusky (SAN), which were both shoreline sites. The Central Basin contained shoreline sites Ashtabula (ASH), Cleveland (CLE),

Port Alma (PAL), and Dead Zone (offshore) sites (S) S-42, S-43, and S-78. The Eastern Basin sites chosen were from shoreline stations Erie, Port Dover (DOV), and Barcelona (BAR) (see

Figure 2 for site locations).


PCR reactions for the amplification of 16S rDNA genes were performed in an MJ PTC-100 (MJ Research, Waltham, MA) thermal cycler. Reaction mixtures contained 1X PCR buffer, 2.0 mM MgCl2, 2.0 U of Taq polymerase, 0.5 mM of each primer, and 80 ng of total environmental DNA as template in a final reaction volume of 50 µl. The reaction profile included an initial denaturing step at 94ºC for 3 min, followed by 35 cycles of 60 s at 94ºC, 60 s at 50ºC, and 90 s at 74ºC, and a final DNA extension step at 72ºC for 4 min, followed by 4 min at 4ºC (Schwieger, 1998). Negative controls without DNA template were included to test for potential contamination of reagents.

PCR amplicons were tested by 1% agarose gel electrophoresis and subsequently cloned using the Topo TA cloning kit (Invitrogen, Carlsbad, CA) following the manufacturer’s instructions. Fifteen positive clones were selected at random from each of the shoreline sampling stations for plasmid extraction and DNA sequencing, whereas ten clones were selected from each of the Dead Zone sites. Plasmids were extracted using Bio-Rad (Bio-Rad Laboratories, Hercules, CA) and Qiagen (Qiagen Inc., Valencia, CA) miniprep kits. The presence of cloned PCR inserts was confirmed by performing an EcoRI restriction enzyme digestion, followed by agarose electrophoresis. Additional positive clones (20-30 clones per site) were preserved in glycerol at -80ºC for future analysis.

Sequencing reactions from 125 clones were performed on an ABI 310 capillary DNA sequencer (Applied Biosystems, Foster City, CA). Sequencing reactions were performed using the ABI BigDye v3.0 reaction kit (Applied Biosystems, Foster City, CA) and the M13 reverse primer. The reaction profile consisted of an initial denaturing step of 90 s at 94ºC, followed by 39 cycles of 15 s at 94ºC, 15 s at 50ºC, and 4 min at 60ºC, with a final step of 10 min at 4ºC. DNA sequences were cross-scored for base miscalling by direct visualization of electropherograms. Twenty-five clones were outsourced to GeneGateway (San Francisco, CA) for DNA sequencing.

Characterization of DNA Sequences

An initial characterization of DNA sequences was performed through BLAST searches into GenBank at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). BLAST top hits and E-values, a parameter that describes the number of hits one can expect to see just by chance when searching a database of a particular size, were noted for each query sequence. Each DNA sequence was assigned to either a taxonomic group (at the class and order level) or marked as unclassified (e.g., when there were no significant matches in the GenBank database or matches corresponded to unclassified environmental sequences). Sequences were assigned to taxonomic groups when the E-values were smaller than 10⁻¹⁵. In addition to GenBank searches, DNA sequences were classified using the Ribosomal Database Project (RDP) Release 9 Sequence Classifier web resource (Cole et al., 2005). RDP classifications used a naive Bayesian rRNA classifier in which sequences are placed in the major formal taxonomic ranks: domain, phylum, class, order, family, genus, and species (Cole et al., 2005). The taxonomic hierarchy is determined using only type strain individuals characterized by the bacteria described by Bergey's Manual of Systematic Bacteriology, release 0.9 (Cole et al., 2005). In this study, sequence classifications were resolved to the phylum level, and in the case of large phyla such as Proteobacteria, classifications were specified to the class level. Only microorganisms classified with at least an 80% confidence level were assigned to groups with the RDP classifier tool. The program places sequences to the lowest taxonomic node, but sequences that did not meet the criterion of an 80% match were considered unclassified. Unclassified sequences were described using the RDP Sequence Match program, which includes all submitted sequences in GenBank.
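The two assignment rules described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the RDP or BLAST implementation: the function names, the input format (a list of taxon and confidence pairs), and the example values are hypothetical; only the two cutoffs come from the text.

```python
E_VALUE_CUTOFF = 1e-15         # BLAST threshold stated in the text
RDP_CONFIDENCE_CUTOFF = 0.80   # RDP Classifier threshold stated in the text


def assign_from_blast(taxon, e_value):
    """Accept a BLAST top hit only if its E-value is below the cutoff."""
    if taxon is None or e_value >= E_VALUE_CUTOFF:
        return "unclassified"
    return taxon


def assign_from_rdp(lineage):
    """Return the lowest rank whose confidence meets the cutoff.

    `lineage` is a list of (taxon, confidence) pairs ordered from the
    highest rank (domain) to the lowest rank reported.
    """
    assigned = "unclassified"
    for taxon, confidence in lineage:
        if confidence < RDP_CONFIDENCE_CUTOFF:
            break
        assigned = taxon
    return assigned


# Hypothetical classifier output for one clone (values are made up).
example_lineage = [
    ("Bacteria", 1.00),
    ("Proteobacteria", 0.95),
    ("Deltaproteobacteria", 0.62),
]
print(assign_from_rdp(example_lineage))          # Proteobacteria
print(assign_from_blast("Clostridia", 1e-40))    # Clostridia
print(assign_from_blast("Clostridia", 1e-5))     # unclassified
```

In this sketch a sequence is resolved only as far down the hierarchy as the confidence values allow, which mirrors the way low-confidence sequences were left unclassified in this study.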

Phylogenetic Analysis

Characterization of DNA sequences was also performed through phylogenetic analysis of

environmental DNA sequences, using reference DNA sequences from Archaea and all major

bacterial groups (see Appendix for 16S rDNA reference sequences and corresponding GenBank

accession numbers). Division representatives that were used to build phylogenetic trees included

sequences from , , ,

Gammaproteobacteria, , , , Flavobacteria, ,

Bacilli, , , and Archaea. In contrast to BLAST assignments, which are

based on sequence similarity searches, the phylogenetic analysis allowed the assignment of novel

DNA sequences based on patterns of ancestral-descent relationships.

Multiple DNA sequence alignments for phylogenetic analyses of environmental clones

and reference sequences were performed using Clustal-X, version 1.83 (Thompson et al., 1997).

Neighbor-Joining phylogenetic trees were built using the Molecular Evolutionary Genetics

Analysis (MEGA), version 3.0 software (Kumar et al., 2004), using the Kimura-2 parameter

model of evolution. Bootstrap resampling of 1000 multiple datasets was performed to provide

estimates of relative confidence for phylogenetic groupings.
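The idea behind bootstrap resampling of an alignment can be sketched as below. This is a minimal illustration and not the MEGA implementation: the function names, the toy alignment, and the `grouping_holds` predicate (a stand-in for rebuilding a tree and checking whether a clade is recovered) are all assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_support(alignment, grouping_holds, replicates=1000, seed=0):
    """Fraction of replicates in which `grouping_holds(replicate)` is true."""
    rng = random.Random(seed)
    hits = sum(
        bool(grouping_holds(bootstrap_alignment(alignment, rng)))
        for _ in range(replicates)
    )
    return hits / replicates


def mismatches(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy alignment (made up): sequences 0 and 1 are identical, sequence 2
# differs from them at every site.
toy = ["ACGTACGTAC", "ACGTACGTAC", "TGCATGCATG"]
support = bootstrap_support(
    toy,
    lambda aln: mismatches(aln[0], aln[1]) < mismatches(aln[0], aln[2]),
)
print(support)  # 1.0
```

In a real analysis the predicate would be replaced by tree reconstruction on each pseudo-replicate, and the support value for a node would be the fraction of replicate trees containing that grouping.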

An overall phylogenetic analysis was performed including all environmental DNA samples and reference DNA sequences. This analysis was used to assign environmental sequences to specific taxonomic groups. Independent phylogenetic analyses were performed for groups of DNA environmental samples on the basis of basin location, industrial/DOM inputs, and the Central Basin Dead Zone, and were designed to explore potential patterns of spatial variability. Sampling stations for the Western Basin included shore sites S-61 and Sandusky (30 sequences). The Central Basin group included DNA samples from shore stations Ashtabula, Port Alma, and Cleveland (45 sequences). The Eastern Basin contained sites Erie, Port Dover, and Barcelona (45 sequences). The Dead Zone group included offshore sites S-42, S-43, and S-78 (30 sequences). Sites characterized by high industrial and nutrient inputs included sites S-61 and Sandusky from the Western Basin, and Cleveland from the Central Basin (with a total of 150 sequences).
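The group sizes listed above follow from the number of clones sequenced per site (fifteen per shoreline station, ten per Dead Zone station). The short tally below reproduces them; the dictionary layout and names are illustrative only.

```python
CLONES_PER_SHORE_SITE = 15
CLONES_PER_DEAD_ZONE_SITE = 10

# Sites per group as listed in the text, with clones sequenced per site.
groups = {
    "Western Basin": (["S-61", "Sandusky"], CLONES_PER_SHORE_SITE),
    "Central Basin": (["Ashtabula", "Port Alma", "Cleveland"], CLONES_PER_SHORE_SITE),
    "Eastern Basin": (["Erie", "Port Dover", "Barcelona"], CLONES_PER_SHORE_SITE),
    "Dead Zone": (["S-42", "S-43", "S-78"], CLONES_PER_DEAD_ZONE_SITE),
}

counts = {name: len(sites) * per_site for name, (sites, per_site) in groups.items()}
total_sites = sum(len(sites) for sites, _ in groups.values())
total_sequences = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print(total_sites, total_sequences)  # 11 150
```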

Microbial Diversity Measures

Microbial diversity was estimated by calculating both species diversity and sequence diversity. Species diversity is a measure of the diversity of species present in a given ecosystem.

This measure requires species information, which is sometimes subjective. Sequence diversity takes into account sequence differences between sampled organisms to get a measure of diversity. Sequence difference based measures can underestimate diversity, because only the most recent change at a given locus is detectable. Together these estimates can give a more complete description of the microbial diversity within the communities of our samples.

Species diversity requires the categorization of environmental sequences into known taxonomic groups. Estimates of species diversity were based on taxonomic grouping performed by the RDP Sequence Classifier program, which uses type strain sequences to make assignments

(Table 1). Sequences that were not assigned by the RDP Sequence Classifier were subsequently cross-searched with the RDP Sequence Match program, which searches for significant matches within the entire GenBank sequence database. A rank abundance table was made to show the distributions of taxa between sample groups (Table 2).


The Berger-Parker index was calculated with the software BioDiversity Professional, developed by the Scottish Association for Marine Science, using the taxonomic assignments made with the RDP programs. Species diversity is a measure of the variety of organisms within a given environment or area, but also takes into account the relative abundance of each species (Freeman et al., 2004). The Berger-Parker number (d) is a diversity index based on the proportional importance of the most abundant species (McAleece, 1997). The more abundant a dominant species is, the less biodiversity is present. The Berger-Parker number (d) is negatively related to biodiversity, so as the index d decreases, biodiversity increases. The Berger-Parker index was calculated independently for each basin as well as for the Dead Zone (Table 3).
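For illustration, the Berger-Parker number can be computed as the count of the most abundant taxon divided by the total count. The sketch below is not the BioDiversity Professional code, and the example counts are hypothetical rather than values from this study.

```python
def berger_parker(counts):
    """Berger-Parker dominance d = N_max / N for a list of taxon counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one individual is required")
    return max(counts) / total


# Hypothetical taxon counts for two sample groups.
dominated = [10, 5, 5]      # one taxon makes up half of the sample
even = [5, 5, 5, 5]         # four equally abundant taxa

print(berger_parker(dominated))  # 0.5
print(berger_parker(even))       # 0.25
```

The more even community gives the lower d, consistent with the inverse relationship between d and biodiversity described above.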

Data on species composition and abundance was also used to generate a dissimilarity matrix based on the Bray-Curtis analysis (Pielou, 1984). Bray-Curtis analysis uses clustering methods to create a group-average hierarchy to show similarity between groups. These relationships are displayed as a dendrogram (McAleece, 1997; Figure 3).
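As an illustration of the dissimilarity measure underlying this analysis, the sketch below computes the standard Bray-Curtis dissimilarity between two groups from their taxon counts. It is an assumption that BioDiversity Professional uses this exact form (some programs report the complementary similarity as a percentage), and the counts shown are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("both samples must list the same taxa")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("at least one sample must be non-empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical counts of the same three taxa in two sample groups.
group_a = [6, 3, 1]
group_b = [2, 3, 5]

print(bray_curtis(group_a, group_b))   # 0.4
print(bray_curtis(group_a, group_a))   # 0.0
print(bray_curtis([5, 0], [0, 5]))     # 1.0
```

A value of 0 indicates identical composition and 1 indicates no shared taxa; a matrix of such values between all pairs of groups is the input to the group-average clustering.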

Measures of DNA sequence diversity were obtained using MEGA version 3.0 (Kumar et al., 2004). Computations of genetic distances included pairwise distances and average distances between sites, as well as net average distances among groups of sequences (Western, Central and Eastern Basins and the Dead Zone).

Neighbor-Joining genetic distances were calculated using the Kimura-2 Parameter model of evolution (Kumar et al., 2004). Genetic distances were calculated using bootstrapping with 1050 replicates to assign confidence levels. Genetic distances reflect the proportion of nucleotide sites at which two sequences being compared are different. Distances are obtained by dividing the number of nucleotide differences by the total number of nucleotides compared. The distance estimates do not make any correction for multiple substitutions at the same site, substitution rate biases, or differences in evolutionary rates among sites (Kumar et al., 2004).
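The paragraph above names the Kimura 2-parameter model and also describes an uncorrected proportion of differing sites. The sketch below shows both quantities side by side so they can be compared; it uses the standard textbook formulas, is not the MEGA implementation, and the example sequences are made up.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def site_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitions and transversions.

    Only aligned positions where both sequences carry A, C, G or T are
    compared; gaps and ambiguous bases are skipped.
    """
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    transitions = transversions = 0
    for a, b in pairs:
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions / len(pairs), transversions / len(pairs)


def p_distance(seq1, seq2):
    """Uncorrected proportion of sites that differ."""
    P, Q = site_proportions(seq1, seq2)
    return P + Q


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance: -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    P, Q = site_proportions(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Made-up pair: one transition (A/G) and one transversion (A/C) in ten sites.
s1 = "AAAAAAAAAA"
s2 = "GAAAAAAAAC"
print(p_distance(s1, s2))
print(k2p_distance(s1, s2))
```

For closely related sequences the two values are nearly equal; the Kimura 2-parameter estimate becomes noticeably larger than the uncorrected proportion as sequences diverge, because it accounts for multiple substitutions at the same site.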


RESULTS

DNA sequences from environmental samples grouped within many divisions of bacteria using the RDP rDNA Sequence Classifier. This program allowed us to categorize DNA sequences on the basis of “Type Strains” with well categorized representatives. Sequences were placed in a taxonomical hierarchy with a confidence rating of 80% or higher. Results from the

RDP Sequence Classifier are shown in Table 1. More than half the sequences (78/150) fell below the 80% level of confidence and therefore were categorized as unclassified environmental samples. The remaining sequences included representatives of the phylum Proteobacteria (51 sequences) with representatives of beta, gamma, and delta classes. Other phylum representatives included Bacteroidetes (10 sequences), Nitrospira (7 sequences), and (4 sequences).

Most of our “unclassified” sequences still had matches with sequences in the RDP database; however, they did not have confident matches that allowed a categorization of query sequences at the phylum or class level.

Database searches were also conducted using the RDP Sequence Match web resource

(Cole et al., 2005), which allows matches with any sequence in the GenBank database. With the exception of two clones, no sequences had identical matches with those in the GenBank database, but many allowed taxonomic assignments based on sequence similarity (Table 4).

Initial characterizations were done at the class level, which allowed identification of major bacterial groups occurring in the sequences. As mentioned above, this characterization differs from that of the RDP Sequence Classifier because environmental and unclassified sequences are both included in the GenBank database.

The majority of the sequences fell within the phylum Proteobacteria, with class representatives from Deltaproteobacteria (9 sequences), Gammaproteobacteria (2 sequences), and


Betaproteobacteria (1 sequence). From a total of 150 sequences, 27 matched with sequences

identified as unknown bacteria, which are very often DNA sequences obtained from

environmental samples. Within the phylum Firmicutes, class representatives included Clostridia

(12 sequences). Two sequences had significant matches with sequences from Acidobacteria

and two matched with Thermomicrobia. Two sequences matched with those from a

candidate division for Bacteria (Genera incertae sedis WS3). Seventeen sequences failed to return

any matches within the database. Because the RDP database does not contain archaeal

sequences, BLAST searches were performed using the complete GenBank database to identify

potential archaeal representatives. BLAST searches for Archaea sequences resulted in nine significant matches and corresponded to environmental clones SAN-5, PAL-L5, PAL-12, DOV-2,

DOV-21, DOV-27, ERIE-25, 78-12, and 78-15 (see Appendix).

Phylogenetic analysis of environmental DNA sequences and reference sequences from major bacterial groups and Archaea allowed the categorization of environmental clones on the

basis of ancestor-descendant relationships. A Neighbor-Joining phylogenetic tree including all 150

environmental sequences and reference sequences is shown in Figure 5. Most sequences

(112/150) were assigned to major bacterial groups (Table 5) with bootstrap confidence values >

40%. Seven sequences clustered within the archaeal clade. It is worth noting that many

sequences grouped with a reference sequence from an uncultured Dehalococcoides (Dehalo).

This bacterium belongs to a proposed new phylum Dehalococcoides, which is a group associated

with the dechlorination of chlorinated ethenes (Hendrickson et al., 2002) and, thus, may serve an

important role in bioremediation.

To search for specific patterns associated with environmental characteristics of the three

basins and the Dead Zone, phylogenetic trees were constructed using environmental sequences


and reference sequences grouped by basin. Our sample groups included sites from Western,

Central and Eastern Basins, as well as the offshore sites of the Dead Zone.

The Western Basin is represented by DNA environmental clones from near shore sites 61 and Sandusky (Figure 2). A Neighbor-Joining phylogenetic analysis allowed taxonomic assignments of environmental clones to the phylum level, and in some cases the class level

(Figure 6). Phylogenetic trees showed sixteen environmental sequences grouped with

Proteobacteria reference sequences, and included Alphaproteobacteria (2 sequences),

Betaproteobacteria (2 sequences), Deltaproteobacteria (3 sequences), and Gammaproteobacteria

(5 sequences), with the remaining four being ungrouped Proteobacteria. Within the

Bacteroidetes phylum, two Sphingobacteria were identified. Clustered within the Firmicutes phylum there were Clostridia (2 sequences), and no Flavobacteria. Cyanobacteria, Spirochaetes, and Actinobacteria sequences were not identified in the environmental samples from the Western

Basin. Two environmental DNA sequences were assigned to the phylum Nitrospira. One of the environmental clones grouped with Archaea with a bootstrap value of 60%. Seven environmental DNA sequences could not be assigned to any particular group. These sequences branched too deep in the tree and had no bootstrap support to be assigned to any particular group.

The phylogenetic analysis of the Central Basin environmental sequences included DNA sequences from near shore sites Ashtabula, Port Alma, and Cleveland (Figure 7). Seventeen

DNA sequences grouped with Proteobacteria reference sequences, and included representatives of Alphaproteobacteria (1 sequence), Betaproteobacteria (5 sequences), Deltaproteobacteria (5 sequences), and Gammaproteobacteria (6 sequences). Four sequences clustered within the

Bacteroidetes phylum, which included one Flavobacteria, two Sphingobacteria and one unclassified Bacteroidetes. There was one DNA sequence that grouped within the class


Clostridia (phylum Firmicutes). No sequences were grouped with Cyanobacteria or

Actinobacteria. Two sequences were assigned to the phylum and four sequences grouped with Nitrospira. There were two sequences that were identified as Archaea. From a total of 45 sequences taken from the Central Basin, fourteen were not assigned phylogenetically to any particular group and were considered unclassified sequences.

Phylogenetic analysis of environmental DNA sequences from the Eastern Basin (Figure

8) revealed five clones that grouped with Archaea. Interestingly, there were no representatives of Cyanobacteria found as well as none for Actinobacteria. In the phylum Bacteroidetes there were no Flavobacteria found, but there were two clones grouped within the class Sphingobacteria and one within the class Bacteroidetes. Within the phylum Firmicutes, one and one

Bacillus sequence was found. There were two sequences grouped with the Nitrospira phylum.

Sixteen isolates grouped in the phylum Proteobacteria; one grouped within

Gammaproteobacteria, six grouped with Betaproteobacteria, eight with Deltaproteobacteria, and one environmental sequence could not be assigned to any particular group because it branched deep, before alpha and delta Proteobacteria split. Six sequences grouped with Spirochaete reference sequences. The remaining eleven DNA sequences clustered together with no type strain reference sequences available and thus could not be assigned to any particular taxonomic group.

Phylogenetic trees based on environmental clones from the Central Basin dead zone (42,

43, and 78) showed relatively high species diversity (Figure 9). Seventeen sequences grouped with Proteobacteria, including five Betaproteobacteria, seven Deltaproteobacteria, and one

Gammaproteobacteria, with the remaining four being ungrouped within the Proteobacteria lineages. In the Bacteroidetes phylum, there were two Flavobacteria and one unclassified


Bacteroidetes. There were no sequences found within the phylum Firmicutes in the Dead Zone.

One sequence grouped with Cyanobacteria, and one with Actinobacteria. Spirochaete and

Nitrospira sequences were not found among the environmental clones from the Dead Zone. Two sequences grouped with Archaea. The remaining six sequences were grouped with unclassified

sequences associated with environmental isolates. In general, DNA sequences that could not be

assigned phylogenetically corresponded to sequences that had significant matches to

environmental clones using the RDP Sequence Match program.

In addition to phylogenetic analysis we estimated indices of microbial diversity based on rank abundance data. Rank abundance plots were determined on the basis of sequences characterized by the RDP Sequence Classifier and Sequence Match programs (Tables 1 and 2).

Berger-Parker Dominance indices (d) (Berger and Parker, 1979) were then calculated on the basis of relative importance of the most abundant species (Table 3). Decreasing d values therefore indicate increasing diversity. The Western Basin had a value (d) of 0.333, the Central was 0.281, the Eastern was 0.207, and the Dead Zone was 0.25. Samples from stations 61,

Sandusky, and Cleveland grouped as industrial sites revealed a dominance value of 0.231.
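The three forms of the Berger-Parker index reported here (d, 1/d, and %d) all follow from the count of the most abundant taxon divided by the total number of sequences. The sketch below shows the arithmetic; the function name `berger_parker` and the list of counts are hypothetical and are not taken from the rank abundance tables of this study.

```python
def berger_parker(counts):
    """Return (d, 1/d, %d) for a list of per-taxon sequence counts.

    d is the proportion of all sequences that belong to the most
    abundant taxon, so lower d means higher diversity.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one sequence")
    d = max(counts) / total
    return d, 1.0 / d, 100.0 * d


# Hypothetical sample: 20 sequences, the most abundant taxon has 5
example_counts = [5, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1]
d, inverse_d, percent_d = berger_parker(example_counts)
print(d, inverse_d, percent_d)
```

Because only the single most abundant taxon enters the numerator, the index is simple to compute but insensitive to how the remaining sequences are distributed among the other taxa.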

Additional biodiversity calculations were performed using the software BiodiversityPro

(McAleece, 1997), which is a biostatistics analysis program that includes a variety of community similarity measures. The Bray-Curtis measure (Pielou, 1984) was used to convert the species composition and abundance data into a dissimilarity matrix using Group-Average clustering and to display relationships as a dendrogram (Figure 3). The branching pattern produced by the cluster analysis showed the Eastern and Central Basins to be the most similar in terms of microbial diversity compared to the Western Basin and the Dead Zone. The organisms representing the Dead Zone showed the least similarity between the site groupings, branching


independently from a major group consisting of the Western, Central, and Eastern Basins.

Similar patterns were obtained when using Bray-Curtis cluster measures based on phylogenetic assignments (Figure 4).
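Group-average clustering of a dissimilarity matrix can be sketched in a few lines. The code below is a generic illustration of the linkage rule, not the BiodiversityPro procedure. The dissimilarity values are invented for illustration and are not the values behind Figures 3 and 4; they were chosen only so that the merge order mirrors the branching pattern described above.

```python
from itertools import combinations


def group_average_clustering(labels, dist):
    """Agglomerative clustering with group-average linkage.

    labels: list of sample names
    dist:   dict mapping frozenset({a, b}) to the dissimilarity of a and b
    Returns the merges in order as (members_1, members_2, height).
    """
    clusters = [(name,) for name in labels]
    merges = []

    def linkage(c1, c2):
        # Mean of all pairwise dissimilarities between the two clusters
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical dissimilarities (illustration only, not results of this study)
labels = ["Western", "Central", "Eastern", "Dead Zone"]
dist = {
    frozenset(("Central", "Eastern")): 0.40,
    frozenset(("Western", "Central")): 0.55,
    frozenset(("Western", "Eastern")): 0.60,
    frozenset(("Dead Zone", "Western")): 0.80,
    frozenset(("Dead Zone", "Central")): 0.75,
    frozenset(("Dead Zone", "Eastern")): 0.85,
}

merges = group_average_clustering(labels, dist)
for left, right, height in merges:
    print(left, "+", right, "at", round(height, 3))
```

Each merge corresponds to one node of the dendrogram, and the height at which two clusters join is their average dissimilarity.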

Genetic distances were also estimated using MEGA and based on the Kimura-2 Parameter

model of evolution. Most distance comparisons within and among groups showed little

significance. The overall mean distance of all environmental samples was 0.267 (S.E.=0.023).

The genetic distances among sites were found not to be significantly different and did not show

any apparent geographic patterns (Table 6).

Individual site and group average genetic distances also showed minor differences and no

apparent spatial patterns (Table 7). The genetic distances among groups also showed no

significant differences. When looking at the genetic distances within each site, the largest difference was found between sampling sites S-78 and S-42, with an overall genetic distance of

0.358 and 0.206 for sampling sites S-78 and S-42 respectively (Table 7). Interestingly, both of these sites are found in the Dead Zone. The Industrial group (which includes sites S-61,

Sandusky, and Cleveland) revealed an average distance of 0.252, while the average distance for non-Industrial sites was 0.280.


DISCUSSION

Microorganisms play important roles in the environment. Studies of microbial community composition can, therefore, provide potentially descriptive information about ecosystem function. Anthropogenic activities, such as urban development, agriculture, use of pesticides, and pollution are believed to have significant effects on microbial communities (Kirk et al., 2004). The characterization of microbial communities is therefore important to determine what kinds of microorganisms are associated with specific sediment or soil properties as well as to gain an understanding of the ecological functions of uncultured microbes. The overall goal of this thesis was to provide a preliminary molecular characterization of microbial communities from Lake Erie sediments, as well as identify possible spatial patterns in microbial communities as they relate to gradients of DOM and the presence of a region subjected to annual cycles of anoxia (i.e., Lake Erie’s Dead Zone).

DNA from environmental samples of Lake Erie sediments was extracted to obtain a

representation of the associated microbial communities. The sediment samples were obtained

throughout the lake and included areas from different basins subjected to differential levels of

DOM inputs. The identification of darker sediment the further west samples were collected was consistent with reports from other studies (Painter et al., 2001) that indicated larger amounts of organic material in the sediment towards the Western Basin.

Initial characterizations of microbial communities were performed using the RDP

Sequence Classifier. Taxonomic assignments were made based on similarity of environmental

DNA sequences to typed species sequences in the Ribosomal Database Project database (Cole et al., 2005). Sequences that could not be assigned by the RDP Sequence Classifier were then cross-searched using the RDP Sequence Match program, which is not limited to type strain


sequences, but includes all bacterial 16S rDNA sequences in the GenBank database.

Additionally, sequence searches were performed using the NCBI BLAST program (Altschul et al.,

1990) to identify any possible Archaea or Eukaryotes, since the RDP includes only bacterial

sequences in the database. The RDP Sequence Match program was used to create a rank

abundance table for species diversity estimates. Sequence data from our 150 environmental

clones were then used to construct phylogenetic trees and calculate genetic distances for sites

grouped by basin and in the Dead Zone.

General Characterization of Environmental DNA Clones

Microbial characterizations using the RDP Sequence Classifier were made based on

similarity of our environmental DNA sequences to typed species sequences in the Ribosomal

Database Project database. Assignments that were given by the RDP Classifier placed sequences

with at least an 80% confidence rating into characterized bacterial groups. Proteobacteria were

the largest bacterial group recognized by our sequence data, making up about 34% of our

environmental sequences according to the RDP sequence classifier. This is not surprising

because Proteobacteria represent one of the largest bacterial groups, containing one third of all

known species of bacteria (Dexter-Dyer, 2003). Proteobacteria are gram negative, but represent a diverse range of organisms such as the purple phototrophic bacteria, and

enteric bacteria spanning five large classes, the alpha, beta, delta, gamma, and epsilon-

proteobacteria (Dexter-Dyer, 2003). This metabolic versatility is reflected in their wide

distribution in natural environments and they are common in aquatic sediments, including

freshwater rivers and lakes (Colwell and Grimes, 2000).

Other groups identified by the RDP Sequence Classifier were Firmicutes, Bacteroidetes,

and Nitrospira. Each of these phyla was detected across all of the basin sampling groups, except


in the Dead Zone, where no Firmicutes were identified. No Firmicutes were identified in the

Dead Zone based on phylogenetic tree scores either. Firmicutes are gram-positive bacteria and

have two main branches, Clostridia and Bacilli. Clostridia are anaerobic whereas Bacilli consist

of obligate aerobes and facultative aerobes (Wolf et al., 2004). The Dead Zone, consisting of

seasonal periods of low dissolved oxygen, would create an unsuitable environment for obligate

aerobes, which may be the reason why Bacilli were not identified in the Dead Zone. The Dead

Zone experiences the most hypoxic conditions in the late summer and early fall, the time at

which our samples were collected. For the remainder of the year the Dead Zone is small or nonexistent (National Parks of Canada, http://www.pc.gc.ca/progs/amnc-nmca/plan/gla5_E.asp).

These seasonal changes in the dissolved oxygen may result in significant changes in the composition of the microbial community. As a result, obligate anaerobes might be lacking in this area. This seasonal variability in the dissolved oxygen content in the Dead Zone is not fully

understood, and its effect on microbial communities is still not clear.

A more complete database search with our environmental sequences was done using the

RDP Sequence Match program. Taxonomic assignments could be made using type strains, which provide documented landmarks, as well as non-type strains in the database. Non-type strains are often associated with sequences from environmental clones or unknown microorganisms. The three most common classes of bacteria found in our sequences using the

RDP Sequence Match program included the delta, beta, and gamma Proteobacteria.

Deltaproteobacteria are represented by morphologically diverse, anaerobic sulfidogens.

Deltaproteobacteria are known sulfate-reducing bacteria and are therefore likely to be important players in sulfur cycling (Seckbach and Kluwer, 2000). Betaproteobacteria comprise chemoheterotrophs and chemoautotrophs which derive nutrients from decomposition of organic


material (Dexter-Dyer, 2003). Lastly, Gammaproteobacteria are comprised of a group of

facultatively anaerobic and fermentative gram-negative bacteria (Dexter-Dyer, 2003). Other

bacterial members exhibiting diverse metabolic pathways were also identified. Clostridium

bacteria, for example, are anaerobic rods which are common soil heterotrophs (Cardenas, 1989).

In contrast, Bacillus bacteria are aerobic spore-forming rods and are also common soil organisms

(Foster and Johnstone, 1989). Sphingobacteria, on the other hand, are rod-shaped non-spore-forming chemoorganotrophic bacteria (Holmes et al., 1988). Nitrospira are aerobic nitrite-oxidizing bacteria that are important in marine habitats. However, this group is found in both fresh- and saltwater environments and is considered to be the dominant nitrite-oxidizing bacterium in wastewater treatment systems and other reactors (Hovanec et al., 1998).

Overall, results from this study revealed bacterial candidates that represent diverse metabolic capabilities. At all of the sites we found both aerobic and anaerobic bacterial representatives, as well as facultative anaerobes. Thus, although the number of sequences obtained likely underrepresented the total bacterial diversity at individual sites, we were able to identify key members involved in the cycling of nutrients using diverse metabolic pathways.

Most major groups of bacteria were represented in our sequences. We found high levels of microbial diversity including bacterial groups with different types of carbon assimilation metabolisms.

The characterization of a few representatives of archaeal sequences was somewhat surprising since the primers used to obtain our environmental DNA libraries were specifically designed for the amplification of bacterial 16S rDNA (Lane et al., 1985). However, analysis of primer sequences showed complete matches with archaeal 16S rDNA sequences, thus allowing cross-amplification of sequences from archaeal representatives. Our results showed a few


representatives of methanogenic Archaea in each of our sampling groups. These

microorganisms are also important terminal oxidizers in the anaerobic mineralization of organic

matter (Purdy et al., 2003). BLAST searches using the NCBI web resource were performed to

identify any potential Archaea sequences, because the RDP database only includes bacterial 16S

rDNA sequences. The resulting BLAST searches for Archaea sequences resulted in nine

significant matches. It is not surprising that Archaea exist in these areas because this group has been found in many common environments such as lake sediments (Forterre et al., 2002). Of the

nine sequences assigned to Archaea using BLAST search data, four were found in the Eastern

Basin. Many Archaea are methanogens, producing methane (CH4) and carbon dioxide (CO2) during organic decomposition. Methanogens are common in most types of anoxic environments

(Garcia, 1998). The processes of sulfate reduction, methanogenesis and oxidation in sediments are tightly linked, resulting in the bioremineralization of organic compounds in lake sediments.

Organic compounds are oxidized by sulfate reducers as long as sulfate is available. In sediments where sulfate is depleted, the terminal steps of carbon remineralization are performed by methanogens, which can utilize a variety of carbon sources including acetate, ethanol, formate, methanol and methylamines (Whitman et al., 1992). DNA assignments using sequence database searches and phylogenetic analysis presented in this study provide a snapshot of microbial communities involved in the cycling of key nutrients, as well as potential links between anaerobic and aerobic metabolisms.

Phylogenetic Characterization of Environmental DNA Clones

Initial phylogenies of all environmental sequences (Figure 5) were performed to identify

representatives of specific bacterial groups as well as potential differences with the assignments

made with the RDP Sequence Classifier and Sequence Match programs. Across all groups we

found bacterial candidates that represent diverse metabolic capabilities. Key members involved in the cycling of nutrients could be inferred phylogenetically from our data. A characteristic group assigned through database searches was Desulfobacterales. Desulfobacterales are anaerobic, sulfate-reducing bacteria that are ubiquitously distributed throughout ecosystems (Loy et al., 2002). Indeed, Desulfobacterales were found in most of our samples and defined groups

(three basins and the Dead Zone). The sulfate-reducing bacteria reduce sulfate to sulfide, which in turn can be oxidized by sulfur-oxidizing bacteria, a group best typified by the

Alphaproteobacteria. This group of sulfur-oxidizing bacteria was also found in several samples.

Besides their role in sulfur cycling, sulfate-reducing bacteria ferment organic acids and provide links between the sulfur and carbon cycles (Hansen, 1994). The groups with most individuals identified in the phylogenetic analysis were delta, gamma and beta Proteobacteria. These

Proteobacteria assignments agreed with the ones made with the RDP programs. Several sequences grouped with a reference sequence from Dehalococcoides on the phylogenetic tree and many remained ungrouped, and thus could not be identified.

Assignments made with our phylogenetic trees differed at times with those seen with the

RDP sequence programs. From a total of 150 DNA sequences, using a relatively low bootstrap value (>40%), only 38/150 sequences remained ungrouped. In contrast, 78/150 sequences remained unassigned using sequence database searches (RDP Sequence Classifier and Sequence

Match). It is important to understand that sequence databases are limited and cannot always give correct assignments. Matches with sequences in the GenBank database do not always mean an ancestral relationship is present since matches are determined only by sequence similarity. On the other hand, phylogenetic analyses use evolutionary models to identify ancestry relationships and thus provide insights into the evolutionary history of the microbial groups studied.


Phylogenetic trees were also constructed with sequences from grouped sites and

reference sequences to identify potential patterns across the lake. One notable distinction in our results is the significant number of Gammaproteobacteria in the Western and Central Basins, but an absence of Gammaproteobacteria in the Eastern Basin. Based on our phylogenetic trees, the

Eastern Basin and the Dead Zone had only one sequence each that grouped with

Gammaproteobacteria, while the other groups had between five and nine gamma sequences.

Gammaproteobacteria include the closely related Pseudomonads and Azotobacter, which are

ecologically widespread bacteria with a wide range of carbon substrates that can be assimilated,

such as organic xenobiotics (Soltmann et al., 2002). The relative preponderance of

Gammaproteobacteria in the Western and Central Basins, but not in the Eastern Basin, may

reflect the presumably widespread carbon substrate range found in the Western and Central

Basins due to the greater number of allochthonous inputs of DOM. In spite of being the largest

class within the Proteobacteria phylum, the Gammaproteobacteria group was underrepresented in

the Eastern Basin. This East-West pattern in the respective species of this group may reflect

microbial community adaptations to the East-West environmental gradient regarding DOM

inputs. However, further sampling should be performed to confirm this pattern. The Eastern

Basin is the only basin with a large community of Spirochaetes, which are often associated with

parasitic bacteria. Spirochaetes are anaerobic and facultatively anaerobic bacteria that are

indigenous to aquatic environments such as the mud and water of ponds and marshes

(Terracciano and Canale-Parola, 1984).

The Dead Zone was the only group that revealed no Nitrospira and no Sphingobacteria.

Nitrospira are a surprisingly underrepresented group, considering that they have been shown to

be the dominant nitrite oxidizers in most environments sampled (Hovanec et al., 1998).


Considering the hypoxic conditions that are found within the Dead Zone, these differences in the

microorganisms identified may reflect the unique geochemistry of this region. The Central Basin

is the only basin with examples of Flavobacteria found, two being in the Dead Zone and one in

an offshore site. Flavobacteria, which are yellow-pigmented, non-fermentative gram-negative bacteria, are common in fresh- and marine water environments (Botha et al., 1989).

Actinobacteria are Gram-positive and almost exclusively aerobic, with their DNA being biased toward high G-C content (Monciardini, 2003). Only one sequence from site 42 was grouped with the Actinobacteria phylum.

Besides the ecological significance of these results, characterizing microbial communities within the lake sediment may have direct implications for the natural bioremediation of contaminated sediments. Our phylogenies suggest that a significant number of sequences obtained from sites high in organochlorine contamination (Painter et al., 2001) are representatives of Pseudomonads, a genus whose role in organochlorine degradation has been well-characterized (Hoostal et al., 2002). Pseudomonads seemed particularly well-represented at sites from the Western and Central Basins, which have been especially impacted by contaminated loadings (Painter et al., 2001). Microbial communities at these sites may have become adapted to the presence of environmental contaminants by increasing the number of species that can catabolize such contaminants. The identification of bacterial groups involved in organochlorine degradation at specific sites therefore suggests that these sites may have a high potential for bioremediation.

Diversity Measures

Environmental DNA sequences characterized with the RDP Sequence Match were used to estimate species diversity. The most obvious differences between the groups analyzed at


individual sampling sites and defined groups (West, Central, East, and Dead Zone) are illustrated by the Bray-Curtis cluster analysis (Figure 3). This analysis showed the divergence among

microbial communities found in different samples based on an index of community composition

and abundance. The input for this analysis was the rank abundance table (Table 2), which

included the identifiable classifications obtained from the RDP Sequence Classifier and

Sequence Match. The Dead Zone showed the least similarity with the other sampled groups,

followed by the West, which illustrated a difference from the East and Central Basins. The East

and Central Basins clustered together and were the groups with the fewest differences. These

results are consistent with the idea that the Dead Zone has unique conditions and possibly a

unique community structure associated with those conditions. The separation of the Western

Basin from the Central and Eastern Basins may reflect the large inputs of nutrients and DOM from the Detroit River. This is also an expected result considering the unique nature of the nutrient and industrial inputs present in this region compared to the other sites. The nutrient loads produced by these inputs would be expected to affect community composition, most likely increasing levels of microbial diversity associated with the metabolism of a more diverse array of carbon sources. This pattern, of the Dead Zone being dissimilar from the other groups, was also confirmed when performing Bray-Curtis analysis based on phylogenetic assignments. The cluster analysis consistently grouped Central, Eastern, and Western basins independently from the Dead Zone.

Genetic distance calculations based on DNA sequence comparisons showed limited differences among sites and groups. No apparent spatial heterogeneity was detected. This was

expected considering the limited number of clones sequenced and the wide range of species


detected at each site. With larger data sets, statistical analysis may show significant differences between sites.

With the development of molecular genetic tools the field of microbial ecology is no longer dependent on culture techniques to address ecological questions. Only with the application of these molecular techniques have researchers begun to understand the extent of biodiversity at the microscopic level (Colwell and Grimes, 2000). Microorganisms share complex relationships with other microorganisms and the environment, and they can act as sensitive indicators of disturbances such as environmental toxicity attributed to pesticides, metals, nutrient loading and other anthropogenic pollutants (Robertson et al., 1999). Due to the responsive nature of microbial communities to environmental characteristics such as DOM inputs, climate, and disturbances, cross-site comparisons can be used in ecological studies

(Robertson et al., 1999). The methods used in this study provide preliminary insights into the

spatial variability of microbial communities throughout Lake Erie. We found bacterial

candidates that are aerobic and anaerobic, as well as facultative anaerobes, representing diverse

metabolic capabilities. Although the number of sequences obtained certainly underrepresents the total bacterial diversity present at individual sites, the presence of key members involved in the cycling of nutrients was inferred from our data. The ecological significance of these groups was apparent in the identification of specific groups associated with environmental gradients of

DOM inputs, levels of pollution, and the Lake Erie Dead Zone.


TABLES AND FIGURES


Table 1. Characterization of environmental DNA sequences based on the Ribosomal Database Project Classifier. Assignments were determined to the Class level with 80% confidence against type strain sequences in the RDP database. Data indicate the number of sequences in the Western, Central, and Eastern Basins as well as the Dead Zone of Lake Erie.

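__________________________________________________________________________
CLASS                      Western   Central   Eastern Dead Zone     Total
__________________________________________________________________________
Flavobacteria                    0         0         0         2         2
Sphingobacteria                  1         2         0         0         3
Alphaproteobacteria              1         1         0         0         2
Betaproteobacteria               1         5         6         2        14
Deltaproteobacteria              4         3         4         2        13
Gammaproteobacteria              4         3         0         1         8
Unclass. Proteobacteria          3         2         5         4        14
Nitrospira                       1         2         2         2         7
Clostridia                       0         2         1         0         3
Bacilli                          0         0         1         0         1
Unclass. Bacteroidetes           1         2         2         0         5
Unclass. Bacteria               14        23        24        17        78
__________________________________________________________________________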

* Unclass: Unclassified DNA sequences

Table 2. Rank Abundance table based on characterizations of environmental DNA sequences using the Ribosomal Database Project Sequence Match. Assignments were determined to the Order level and indicate the number of sequences in the Western, Central, and Eastern Basins as well as the Dead Zone of Lake Erie.

______________________________________________________________________________________________________________
Western                      Central                      Eastern                      Dead Zone
______________________________________________________________________________________________________________
Order                  No.   Order                  No.   Order                  No.   Order                  No.
______________________________________________________________________________________________________________
Clostridiales            4   Clostridiales            6   Clostridiales            5   Desulfuromonales         5
Burkholderiales          3   Burkholderiales          4   Desulfobacterales        3   Nitrospirales            3
Sphingobacteriales       2   Nitrospirales            3   Desulfuromonales         3                            2
Desulfobacterales        2   Desulfuromonales         3   Acidobacteriales         3   Desulfobacterales        1
Chromatiales             2   Sphingobacteriales       2   Thermomicrobiales        3   Burkholderiales          1
Nitrospirales            1   Desulfobacterales        2   Nitrospirales            2   Clostridiales            1
Unclass. Bacteroidetes   1   Myxococcales             2   Burkholderiales          2   Rhodocyclales            1
                         1   Legionellales            1   Aquificales              2   Methylococcales          1
Legionellales            0   Rhodospirillales         1   Spirochaetales           2                            1
                         0                            1   Bacillales               1   Verrucomicrobiales       1
Rhodocyclales            0   Syntrophobacterales      1   Rhodocyclales            1   Actinomycetales          1
Desulfuromonales         0   Acidobacteriales         1   Thermoanaerobacteriales  1   Unclass. Bacteroidetes   1
Rhodospirillales         0   Thermomicrobiales        1   Sphingobacteriales       0   Deferribacterales        1
Myxococcales             0   Aquificales              1   Legionellales            0   Sphingobacteriales       1
______________________________________________________________________________________________________________

Table 3. Berger-Parker indices of species dominance (d, 1/d, and %d) for the Western, Central, and Eastern Basins as well as the Dead Zone and sites associated with industrial inputs in Lake Erie. Decreasing d values (dominance) indicate increasing diversity.

______
Index  Western  Central  Eastern  Dead Zone  Industrial
______
d        0.333    0.281    0.207      0.250       0.231
1/d      3.000    3.556    4.833      4.000       4.333
%d      33.333   28.125   20.690     25.000      23.077
______
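The three quantities reported in Table 3 are related by simple arithmetic: d is the proportion of sequences that belong to the most abundant taxon, 1/d is its reciprocal, and %d is d expressed as a percentage. The sketch below illustrates that relationship. The function name `berger_parker` and the counts in `example_counts` are hypothetical; the counts were chosen only so that d comes out at 0.333, the value printed for the Western Basin, and they are not the data behind Table 3.

```python
def berger_parker(counts):
    """Return (d, 1/d, %d) for a list of per-taxon sequence counts.

    d is the share of all sequences that fall in the single most
    abundant taxon (Berger-Parker dominance).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one sequence")
    d = max(counts) / total
    return d, 1.0 / d, 100.0 * d


# Hypothetical community (illustrative only): 12 sequences in total,
# 4 of which belong to the most abundant taxon.
example_counts = [4, 3, 2, 1, 1, 1]
d, inv_d, pct_d = berger_parker(example_counts)
print(f"d = {d:.3f}, 1/d = {inv_d:.3f}, %d = {pct_d:.3f}")
```

A lower d (and therefore a higher 1/d) means that no single taxon dominates the sample, which is how the table relates dominance to diversity.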

Table 4. Characterization of unclassified sequences, cross-searched with the RDP Sequence Match. Sequences unclassified by the RDP Sequence Classifier were assigned by the RDP Sequence Match, which uses all the 16S sequences in the GenBank database. Only significant matches were given assignments.

______
Western                        Central                        Eastern                  Dead Zone
Class                    No.   Class                    No.   Class              No.   Class              No.
______
Clostridia                 4   Deltaproteobacteria        4   Clostridia           4   Deltaproteobacteria  4
Gammaproteobacteria        1   Clostridia                 4   Acidobacteria        2   Gammaproteobacteria  1
Genera incertae sedis WS3  1   Betaproteobacteria         1   Thermomicrobia       2   Unclass. Bacteria    8
Unclass. Bacteria          4   Verrucomicrobiae           1   Deltaproteobacteria  1   No match             4
No match                   4   Genera incertae sedis WS3  1   Bacteroidetes        1
                               Unclass. Bacteria          9   Holophaga            1
                               No match                   3   Spirochaetes         1
                                                              Unclass. Bacteria    6
                                                              No match             6
______

Table 5. Characterization of DNA sequences based on phylogenetic analysis. Taxonomic assignments were given to environmental sequences based on branching patterns with reference sequences on Neighbor-Joining phylogenetic trees. Data indicates the number of sequences in the Western, Central, and Eastern Basins as well as the Dead Zone of Lake Erie.

______
Class                     Western  Central  Eastern  Dead Zone  Total
______
Alphaproteobacteria             1        0        2          0      3
Deltaproteobacteria             5        8        3          7     23
Gammaproteobacteria             6        1        5          1     13
Betaproteobacteria              5        6        2          5     18
Unclass. Proteobacteria         0        1        4          4      9
Nitrospira                      4        2        2          0      8
Flavobacteria                   1        0        0          2      3
Sphingobacteria                 2        2        2          0      6
Unclass. Bacteroidetes          1        1        0          1      3
Clostridia                      1        1        2          0      4
Bacilli                         0        1        0          0      1
Actinobacteria                  0        0        0          1      1
Spirochaetes                    2        6        0          0      8
Cyanobacteria                   0        0        0          1      1
Archaea                         3        5        1          2     11
Ungrouped                      14       11        7          6     38
Total                          45       45       30         30    150
______

Table 6. Neighbor-Joining distances between sampling sites estimated using the Kimura-2 Parameter model of evolution (below diagonal) and estimated standard errors based on 1050 bootstrap resampling of the data (above diagonal). See Methods for abbreviations of sampling stations.

______
         S-42   S-43   S-61   S-78    ASH    BAR    CLE    DOV   ERIE   PALM    SAN
______
S-42        -  0.023  0.023  0.028  0.023  0.023  0.023  0.025  0.022  0.025  0.023
S-43    0.222      -  0.024  0.028  0.024  0.024  0.024  0.027  0.024  0.026  0.024
S-61    0.221  0.227      -  0.029  0.024  0.023  0.022  0.027  0.024  0.026  0.024
S-78    0.310  0.306  0.310      -  0.028  0.028  0.029  0.029  0.028  0.030  0.029
ASH     0.235  0.239  0.238  0.320      -  0.024  0.024  0.026  0.024  0.026  0.024
BAR     0.217  0.223  0.220  0.310  0.239      -  0.023  0.027  0.023  0.026  0.024
CLE     0.222  0.236  0.228  0.326  0.251  0.226      -  0.027  0.024  0.026  0.024
DOV     0.279  0.291  0.291  0.343  0.298  0.290  0.302      -  0.026  0.027  0.026
ERIE    0.229  0.247  0.242  0.316  0.257  0.237  0.248  0.288      -  0.026  0.024
PALM    0.259  0.273  0.271  0.337  0.288  0.270  0.278  0.316  0.278      -  0.026
SAN     0.239  0.251  0.249  0.325  0.265  0.251  0.257  0.300  0.262  0.280      -
______
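For readers unfamiliar with the Kimura 2-parameter model named in the caption, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P is the proportion of sites that differ by a transition and Q the proportion that differ by a transversion. The sketch below computes that quantity for a single pair of sequences. It is an illustration only: the function name `k2p_distance`, the toy sequences, and the choice to skip gapped or ambiguous sites are assumptions, and the code does not reproduce the between-site averaging or the bootstrap standard errors reported in Table 6.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with a gap or ambiguous base are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical): one transition (A/G), one transversion (A/C).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 3))
```

Because transitions and transversions are weighted separately, the corrected distance is larger than the raw proportion of differing sites (0.2 in the toy alignment).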


Table 7. Genetic distances within sampling sites and group averages estimated using the Kimura-2 Parameter model of evolution. Standard errors were based on 1050 bootstrap resampling of the data.

______
                  Distance  Standard Error
______
WESTERN              0.244           0.023
  S-61               0.224           0.023
  SAN                0.270           0.025
______
CENTRAL              0.284           0.026
  ASH                0.255           0.026
  PALM               0.303           0.027
  CLE                0.231           0.024
______
EASTERN              0.270           0.025
  DOV                0.328           0.029
  BAR                0.219           0.024
  ERIE               0.255           0.024
______
DEAD ZONE            0.276           0.025
  S-42               0.206           0.022
  S-43               0.239           0.025
  S-78               0.358           0.032
______
Industrial           0.252           0.022
Non-Industrial       0.280           0.024
______


Table 8. Genetic distances (below diagonal) between sampling group averages estimated using the Kimura-2 Parameter model of evolution. Standard errors (above diagonal) were based on 1050 bootstrap resampling of the data.

______
Sample Basin     Dead    West  Central    East
______
Dead                -   0.025    0.026   0.025
West            0.262       -    0.025   0.024
Central         0.277   0.264        -   0.026
East            0.273   0.260    0.277       -
______

DOM Input        Dead  Industrial  Non-Industrial
______
Dead                -       0.023           0.023
Industrial      0.271           -           0.023
Non-Industrial  0.282       0.271               -
______


Figure 1. Map and cross section of Lake Erie. Cross-section shows basin depths and spatial variability. (Obtained from the Pennsylvania Fish & Boat Commission website, http://sites.state.pa.us/PA_Exec/Fish_Boat/anglerboater/1998/novdec98/bcontour.html)


Figure 2. Map of Lake Erie showing location of sampling stations. Pie charts represent results from the Ribosome Database Project Sequence Classifier for each sample group.


Figure 3. Bray-Curtis Cluster analysis tree, based on species diversity between sampled groups. The input for this analysis is the identifiable classifications obtained from the RDP Sequence Classifier, and Sequence Match (see Table 2).


Figure 4. Bray-Curtis Cluster analysis tree, based on species diversity between sampled groups. The input for this analysis is the taxonomic assignments made using phylogenetic analysis (see Table 5).
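The cluster trees in Figures 3 and 4 are built from pairwise Bray-Curtis comparisons of the sampled groups. As a minimal sketch of one such comparison, the code below computes the Bray-Curtis dissimilarity, 1 - 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b)), between the Western and Central columns as printed in Table 5. The function name `bray_curtis` is hypothetical, and including every row of the table (the "Ungrouped" row among them) is an assumption made for illustration; the exact input used for the figure and the subsequent clustering step are not reproduced here.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two vectors of taxon counts."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / total


# Counts per class, in the row order of Table 5 (Alphaproteobacteria ...
# Ungrouped), taken from the Western and Central columns as printed.
western = [1, 5, 6, 5, 0, 4, 1, 2, 1, 1, 0, 0, 2, 0, 3, 14]
central = [0, 8, 1, 6, 1, 2, 0, 2, 1, 1, 1, 0, 6, 0, 5, 11]

dissimilarity = bray_curtis(western, central)
print(f"Bray-Curtis dissimilarity: {dissimilarity:.3f}")
print(f"Percent similarity: {100 * (1 - dissimilarity):.1f}")
```

A value of 0 means the two groups have identical counts in every class, and a value of 1 means they share no classes at all.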

[Tree image not reproduced; labeled groups: Archaea, Cyanobacteria, Nitrospira, Firmicutes, Bacteroidetes, Spirochaetes, Actinobacteria, and Alpha-, Beta-, Gamma-, and Deltaproteobacteria; scale bar = 0.05.]

Figure 5. Neighbor-Joining phylogenetic tree of all 16S rDNA sequences obtained in this study. The red branches indicate environmental sequences from our sites and the black branches represent reference sequences retrieved from GenBank (see Appendix for reference sequence information).

[Tree image not reproduced; labeled groups: Archaea, Cyanobacteria, Nitrospira, Firmicutes, Bacteroidetes, Spirochaetes, Actinobacteria, and Alpha-, Beta-, Gamma-, and Deltaproteobacteria; scale bar = 0.05.]

Figure 6. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Western Basin. The red branches indicate environmental sequences from our sites and the black branches represent reference sequences retrieved from GenBank (see Appendix for reference sequence information).

[Tree image not reproduced; labeled groups: Archaea, Cyanobacteria, Nitrospira, Firmicutes, Bacteroidetes, Spirochaetes, Actinobacteria, and Alpha-, Beta-, Gamma-, and Deltaproteobacteria; scale bar = 0.05.]

Figure 7. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Central Basin. The red branches indicate environmental sequences from our sites and the black branches represent reference sequences retrieved from GenBank (see Appendix for reference sequence information).

[Tree image not reproduced; labeled groups: Archaea, Cyanobacteria, Nitrospira, Firmicutes, Bacteroidetes, Spirochaetes, Actinobacteria, and Alpha-, Beta-, Gamma-, and Deltaproteobacteria; scale bar = 0.05.]

Figure 8. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Eastern Basin. The red branches indicate environmental sequences from our sites and the black branches represent reference sequences retrieved from GenBank (see Appendix for reference sequence information).

[Tree image not reproduced; labeled groups: Archaea, Cyanobacteria, Nitrospira, Firmicutes, Bacteroidetes, Spirochaetes, Actinobacteria, and Alpha-, Beta-, Gamma-, and Deltaproteobacteria; scale bar = 0.05.]

Figure 9. Neighbor-Joining phylogenetic tree of the 16S rDNA sequences obtained from the Dead Zone. The red branches indicate environmental sequences from our sites and the black branches represent reference sequences retrieved from GenBank (see Appendix for reference sequence information).


APPENDIX


DNA Extractions

The most critical step in microbial community studies is the extraction of nucleic acids from the environment. DNA extracted from environmental sediment samples should represent the microbial communities present at the time of sampling. Biases, however, can emerge from differences in the lysis efficiency of the bacteria present in the sample. For example, gram-positive cells can be under-represented because they resist lysis. Other biases can result from DNA loss during purification and from selective amplification of certain taxa during PCR. Further factors that can bias the amplification of total community DNA include primer design and the number of 16S rRNA genes in each organism.

In this study, three soil DNA extraction kits were tested for extraction of environmental DNA from Lake Erie sediment samples: Fast DNA (Bio 101, La Jolla, CA), Ultra Clean DNA (MO BIO Laboratories, Carlsbad, CA), and Bactozol (Molecular Research Center, Cincinnati, OH). The kits were compared for DNA yield and quality, and all three were used following the manufacturers’ instructions. Extracted DNA samples were run on 1% agarose gels, and the quality and quantity of the DNA were evaluated by staining the gels with ethidium bromide and visualizing them under ultraviolet light (Figure A-1).

The Fast DNA Spin kit (Bio 101, La Jolla, CA) was selected for the DNA extraction of sediment samples in this study because it yielded the most DNA with the least apparent shearing. Less shearing indicates less degradation, which in turn reduces the potential for chimera formation during PCR. For each sample, total environmental DNA was extracted from one gram of wet sediment. DNA extractions were eluted in 50 µl of water and stored at -20 °C.

The DNA was quantified and confirmed using a light spectrophotometer and by gel electrophoresis. DNA concentrations obtained from environmental DNA extractions were in the range of 20-50 ng/µl.
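The figures given above (one gram of wet sediment, a 50 µl elution, and concentrations of 20-50 ng/µl) imply an approximate total DNA yield per gram of sediment. The short sketch below works through that arithmetic. It assumes the full elution volume was recovered, and the function name `total_yield_ng` is hypothetical; the resulting yields are derived here for illustration and are not values reported in the thesis.

```python
def total_yield_ng(concentration_ng_per_ul, elution_volume_ul):
    """Total DNA mass (ng) = concentration (ng/µl) x elution volume (µl)."""
    return concentration_ng_per_ul * elution_volume_ul


ELUTION_VOLUME_UL = 50.0   # elution volume stated in the text
SEDIMENT_MASS_G = 1.0      # wet sediment per extraction, from the text

# Lower and upper ends of the reported concentration range.
for concentration in (20.0, 50.0):
    yield_ng = total_yield_ng(concentration, ELUTION_VOLUME_UL)
    per_gram_ug = yield_ng / 1000.0 / SEDIMENT_MASS_G
    print(f"{concentration:.0f} ng/µl -> {yield_ng:.0f} ng "
          f"({per_gram_ug:.1f} µg per g wet sediment)")
```

Under these assumptions the extractions correspond to roughly 1.0-2.5 µg of DNA per gram of wet sediment.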

Figure A-1. Picture of DNA electrophoresis with a 1% agarose gel of DNA extractions with tested soil extraction kits using lake sediment. Well 1 shows a 1kb ladder, while the remaining wells contain DNA extractions from environmental samples; well 2-Fastprep DNA, well 3-Ultra Clean DNA, well 4-Bactozol-vortexed and well 5-Bactozol-not vortexed.


Environmental DNA Sequences

List of sequences, basins and sites for sequenced clones obtained from environmental samples from each sample site (the Dead Zone is in the Central Basin of Lake Erie).

Basin      Site        Clone     16S rDNA Length

Western    S-61        61-1      363 bp
Western    S-61        61-2      356 bp
Western    S-61        61-3      357 bp
Western    S-61        61-4      357 bp
Western    S-61        61-5      359 bp
Western    S-61        61-6      358 bp
Western    S-61        61-7      349 bp
Western    S-61        61-8      358 bp
Western    S-61        61-9      356 bp
Western    S-61        61-10     358 bp
Western    S-61        61-11     356 bp
Western    S-61        61-12     360 bp
Western    S-61        61-14     360 bp
Western    S-61        61-15     350 bp
Western    S-61        61-16     355 bp

Western    Sandusky    SAN-1     359 bp
Western    Sandusky    SAN-3     360 bp
Western    Sandusky    SAN-4     358 bp
Western    Sandusky    SAN-5     362 bp
Western    Sandusky    SAN-6     369 bp
Western    Sandusky    SAN-7     364 bp
Western    Sandusky    SAN-8     356 bp
Western    Sandusky    SAN-9     356 bp
Western    Sandusky    SAN-10    352 bp
Western    Sandusky    SAN-11    360 bp
Western    Sandusky    SAN-12    351 bp
Western    Sandusky    SAN-14    356 bp
Western    Sandusky    SAN-15    357 bp
Western    Sandusky    SAN-16    363 bp
Western    Sandusky    SAN-17    357 bp

Central    Cleveland   CLE-1     364 bp
Central    Cleveland   CLE-2     364 bp
Central    Cleveland   CLE-3     359 bp
Central    Cleveland   CLE-4     355 bp
Central    Cleveland   CLE-5     351 bp

Central    Cleveland   CLE-6     361 bp
Central    Cleveland   CLE-7     362 bp
Central    Cleveland   CLE-8     364 bp
Central    Cleveland   CLE-9     358 bp
Central    Cleveland   CLE-10    363 bp
Central    Cleveland   CLE-11    361 bp
Central    Cleveland   CLE-12    357 bp
Central    Cleveland   CLE-14    352 bp
Central    Cleveland   CLE-16    348 bp
Central    Cleveland   CLE-24    360 bp

Central    Ashtabula   ASH-2     359 bp
Central    Ashtabula   ASH-3     359 bp
Central    Ashtabula   ASH-5     354 bp
Central    Ashtabula   ASH-6     352 bp
Central    Ashtabula   ASH-7     360 bp
Central    Ashtabula   ASH-8     360 bp
Central    Ashtabula   ASH-9     360 bp
Central    Ashtabula   ASH-10    357 bp
Central    Ashtabula   ASH-12    349 bp
Central    Ashtabula   ASH-13    353 bp
Central    Ashtabula   ASH-14    363 bp
Central    Ashtabula   ASH-17    356 bp
Central    Ashtabula   ASH-18    362 bp
Central    Ashtabula   ASH-20    359 bp
Central    Ashtabula   ASH-21    357 bp

Central    Port Alma   PALM-1    362 bp
Central    Port Alma   PALM-2    357 bp
Central    Port Alma   PALM-3    363 bp
Central    Port Alma   PALM-4    354 bp
Central    Port Alma   PALM-5    363 bp
Central    Port Alma   PALM-6    354 bp
Central    Port Alma   PALM-7    363 bp
Central    Port Alma   PALM-8    363 bp
Central    Port Alma   PALM-9    360 bp
Central    Port Alma   PALM-10   354 bp
Central    Port Alma   PALM-11   358 bp
Central    Port Alma   PALM-12   368 bp
Central    Port Alma   PALM-13   360 bp
Central    Port Alma   PALM-14   363 bp
Central    Port Alma   PALM-16   357 bp

Eastern    Erie        ERIE-1    359 bp
Eastern    Erie        ERIE-2    356 bp
Eastern    Erie        ERIE-3    357 bp
Eastern    Erie        ERIE-4    354 bp
Eastern    Erie        ERIE-5    358 bp
Eastern    Erie        ERIE-6    359 bp
Eastern    Erie        ERIE-8    365 bp
Eastern    Erie        ERIE-10   365 bp
Eastern    Erie        ERIE-20   359 bp
Eastern    Erie        ERIE-21   363 bp
Eastern    Erie        ERIE-22   361 bp
Eastern    Erie        ERIE-24   361 bp
Eastern    Erie        ERIE-25   362 bp
Eastern    Erie        ERIE-26   360 bp
Eastern    Erie        ERIE-27   363 bp

Eastern    Port Dover  DOV-1     353 bp
Eastern    Port Dover  DOV-2     361 bp
Eastern    Port Dover  DOV-3     354 bp
Eastern    Port Dover  DOV-4     351 bp
Eastern    Port Dover  DOV-5     355 bp
Eastern    Port Dover  DOV-6     353 bp
Eastern    Port Dover  DOV-9     358 bp
Eastern    Port Dover  DOV-15    361 bp
Eastern    Port Dover  DOV-16    353 bp
Eastern    Port Dover  DOV-19    355 bp
Eastern    Port Dover  DOV-20    362 bp
Eastern    Port Dover  DOV-21    365 bp
Eastern    Port Dover  DOV-25    363 bp
Eastern    Port Dover  DOV-26    352 bp
Eastern    Port Dover  DOV-27    364 bp

Eastern    Barcelona   BARC-2    362 bp
Eastern    Barcelona   BARC-3    360 bp
Eastern    Barcelona   BARC-4    359 bp
Eastern    Barcelona   BARC-5    361 bp
Eastern    Barcelona   BARC-6    357 bp
Eastern    Barcelona   BARC-7    360 bp
Eastern    Barcelona   BARC-10   356 bp
Eastern    Barcelona   BARC-11   358 bp
Eastern    Barcelona   BARC-13   360 bp
Eastern    Barcelona   BARC-16   359 bp

Eastern    Barcelona   BARC-18   357 bp
Eastern    Barcelona   BARC-20   362 bp
Eastern    Barcelona   BARC-21   367 bp
Eastern    Barcelona   BARC-22   360 bp
Eastern    Barcelona   BARC-23   365 bp

Dead Zone  S-42        42-2      358 bp
Dead Zone  S-42        42-4      355 bp
Dead Zone  S-42        42-5      367 bp
Dead Zone  S-42        42-6      361 bp
Dead Zone  S-42        42-7      358 bp
Dead Zone  S-42        42-8      359 bp
Dead Zone  S-42        42-9      358 bp
Dead Zone  S-42        42-10     356 bp
Dead Zone  S-42        42-11     357 bp
Dead Zone  S-42        42-16     354 bp

Dead Zone  S-43        43-1      358 bp
Dead Zone  S-43        43-2      359 bp
Dead Zone  S-43        43-5      357 bp
Dead Zone  S-43        43-6      359 bp
Dead Zone  S-43        43-10     357 bp
Dead Zone  S-43        43-11     359 bp
Dead Zone  S-43        43-12     361 bp
Dead Zone  S-43        43-13     357 bp
Dead Zone  S-43        43-14     368 bp
Dead Zone  S-43        43-15     362 bp

Dead Zone  S-78        78-2      363 bp
Dead Zone  S-78        78-3      360 bp
Dead Zone  S-78        78-4      359 bp
Dead Zone  S-78        78-5      360 bp
Dead Zone  S-78        78-6      361 bp
Dead Zone  S-78        78-10     357 bp
Dead Zone  S-78        78-11     356 bp
Dead Zone  S-78        78-12     362 bp
Dead Zone  S-78        78-13     357 bp
Dead Zone  S-78        78-15     364 bp


Reference Sequences

List of reference sequences and abbreviations used on the phylogenetic analysis. These sequences were obtained from GenBank and include representatives of major bacterial groups and Archaea.

Bacterial Sequences

Abbreviation  Species                           Phylum                 Class            Accession No.
______
Nhamb         hamburgensis                      Proteobacteria         Alpha            L11663
Magne         Magnetospirillum magnetotacticum  Proteobacteria         Alpha            Y10110
Smacr         Sphingopyxis macrogoltabida       Proteobacteria         Alpha            D13723
Cvibr         Caulobacter vibrioides            Proteobacteria         Alpha            AJ009957
Mtund         Methylocella tundrae              Proteobacteria         Alpha            AJ555244
Bmobi         Beijerinckia mobilis              Proteobacteria         Alpha            AJ563932
Rspha         Rhodobacter sphaeroides           Proteobacteria         Alpha            X53853
Ccres         Caulobacter vibrioides            Proteobacteria         Alpha            AJ009957

Rbasi         Cupriavidus necator               Proteobacteria         Beta             AF191737
Vario         Variovorax paradoxus              Proteobacteria         Beta             AF209469
Burk          Burkholderia sp.                  Proteobacteria         Beta             AF092889
Naest         aestuarii                         Proteobacteria         Beta             AJ298734
Taqua         Thiobacillus aquaesulis           Proteobacteria         Beta             U58019
Bcepa         Burkholderia cepacia              Proteobacteria         Beta             AF175314
Cviol         Chromobacterium violaceum         Proteobacteria         Beta             M22510
Lthio         Limnobacter thiooxidans           Proteobacteria         Beta             AJ289885
Aruhl         Achromobacter ruhlandii           Proteobacteria         Beta             AB010840

Gmetal        Geobacter grbiciae                Proteobacteria         Delta            AF335182
Adehalo       Anaeromyxobacter dehalogenans     Proteobacteria         Delta            AF382396
Desulfa       hydrothermale                     Proteobacteria         Delta            AF170417
Desulfo       Desulfuromonas chloroethenica     Proteobacteria         Delta            U49748

Morit         Moritella abyssi                  Proteobacteria         Gamma            AJ252022
Pvulg         Proteus vulgaris                  Proteobacteria         Gamma            AJ301683
Pfluo         Pseudomonas fluorescens           Proteobacteria         Gamma            AJ278812
Paeru         Pseudomonas aeruginosa            Proteobacteria         Gamma            Z76651
Shewa         Shewanella violacea               Proteobacteria         Gamma            D21225
Pvero         Pseudomonas veronii               Proteobacteria         Gamma            AF064460
Pphos         Photobacterium phosphoreum        Proteobacteria         Gamma            X74687

Cjohn         Flavobacterium johnsoniae         Bacteroidetes          Flavobacteria    D12664
Cflev         Flavobacterium flevense           Bacteroidetes          Flavobacteria    M58767
Atroi         Arenibacter troitsensis           Bacteroidetes          Flavobacteria    AB080771
Flimi         Flavobacterium limicola           Bacteroidetes          Flavobacteria    AB075230

Lcoha         Lewinella cohaerens               Bacteroidetes          Sphingobacteria  AF039292
Sgran         Saprospira grandis                Bacteroidetes          Sphingobacteria  M58795
Hhydr         Haliscomenobacter hydrossis       Bacteroidetes          Sphingobacteria  M58790
Fjapo         Flexibacter japonensis            Bacteroidetes          Sphingobacteria  AB078055
Cyto          Cytophaga sp.                     Bacteroidetes          Sphingobacteria  AB013834

Staph         Staphylococcus aureus             Firmicutes             Bacilli          X68417
Lacto         versmoldensis                     Firmicutes             Bacilli          AJ496791
Blich         Bacillus licheniformis            Firmicutes             Bacilli          X68416
Aferm         Amphibacillus fermentum           Firmicutes             Bacilli          AF418603
Ooxal         Oxalophagus oxalicus              Firmicutes             Bacilli          Y14581

Cbowa         Clostridium bowmanii              Firmicutes             Clostridia       AJ506120
Parbo         Propionispira arboris             Firmicutes             Clostridia       Y18190
Aburk         burkinensis                       Firmicutes             Clostridia       AJ010961
Clost         Clostridium bowmanii              Firmicutes             Clostridia       AJ506120
Tprae         Tissierella praeacuta             Firmicutes             Clostridia       X80841
Nagger        Natronoanaerobium aggerbacterium  Firmicutes             Clostridia       AJ271452
Cbart         Clostridium bartlettii            Firmicutes             Clostridia       AY438672

Borsi         sinica                            Spirochaetes                            AB022101
Spiha         halophila                         Spirochaetes                            M88722
Trepa         parvum                            Spirochaetes                            AF302937
Lepwo         biflexa                           Spirochaetes                            Z12821
Brchy         hyodysenteriae                    Spirochaetes                            M57743
Lepto                                           Spirochaetes                            Z12817

Therm         Thermodesulfovibrio islandicus    Nitrospira                              X96726
Nitro                                           Nitrospira                              X82558
Lept          Leptospirillum ferrooxidans       Nitrospira                              X86776
Candi         Magnetobacterium bavaricum        Nitrospira                              X71838

Maeru         Microcystis aeruginosa            Cyanobacteria                           AB035549
Syne          Cylindrospermum stagnale          Cyanobacteria                           AJ133163
Oprin         Oscillatoria princeps             Cyanobacteria                           AB045961
Anaba         Anabaena variabilis               Cyanobacteria                           AF247593
Plank         Planktothrix agardhii             Cyanobacteria                           AB045952

Strep         Streptomyces chartreuses          Actinobacteria                          AJ399468
Rtrit         Rathayibacter rathayi             Actinobacteria                          X77439
Rhodo         Rhodococcus erythropolis          Actinobacteria                          X76691
Mlute         Micrococcus luteus                Actinobacteria                          AJ536198
Smult         Salana multivorans                Actinobacteria                          AJ400627
Aboli         Arsenicicoccus bolidensis         Actinobacteria                          AJ558133

Dehalo        uncultured Dehalococcoides        Genera incertae sedis                   AY622903

Archaea Sequences

Abbreviation  Species                           Accession No.
______
Arch2         Methanosarcina acetivorans        M59137
Arch5         Uncultured archaeon               AJ294873
Arch3         Uncultured archaeon               U87519

58

REFERENCES

Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, 1990. Basic local alignment

search tool. Journal of Molecular Biology 215: 403-410.

Baross, J. A., and J. W. Deming, 1995. Growth at high temperatures: isolation and taxonomy,

physiology, and ecology. In D. M. Karl (ed.), Microbiology of deep-sea hydrothermal

vents. CRC Press, Inc., Boca Raton, Fla.

Berger, W. H., and F. L. Parker, 1979. Diversity of planktonic in deep-sea

sediments. Science. 168: 1345-1357.

Berman-Frank, I., P. Lundgren, and P. Falkowski, 2003. Nitrogen fixation and photosynthetic

oxygen evolution in Cyanobacteria. Research in Microbiology 154: 157-164.

Borneman, J., P. W. Skroch, K. M. O'Sullivan, J. A. Palus, N. G. Rumjanek, J. L. Jansen, J.

Nienhuis, and E. W. Triplett, 1996. Molecular microbial diversity of an agricultural soil

in Wisconsin. Applied and Environmental Microbiology 62: 1935-1943.

Botha, W. C., P. J. Jooste, and T. J. Britz, 1989. The taxonomic relationship of certain

environmental flavobacteria to the genus Weeksella. Journal of Applied 67:

551–559

Cardenas, E. 1989. Biochemistry of oxygen toxicity. Annual Review Biochemistry 598: 79–110

Cole, J. R., B. Chai, R. J. Farris, Q. Wang, S. A. Kulam, D. M. McGarrell, G. M. Garrity, and J.

M. Tiedje, 2005. The Ribosomal Database Project (RDP-II): sequences and tools for

high-throughput rRNA analysis. Nucleic Acids Res. 33: D294-D296.

Colwell, R. R., and D. J. Grimes, (eds.) Nonculturable Microorganisms in the Environment;

American Society for Microbiology Press: Washington, D.C., 2000.

Dexter-Dyer, B., A Field Guide to Bacteria; Comstock Publishing Assoc.: Ithaca, NY, 2003;

59

Vol. 1, pp24-220.

Forster, D. L., and J. N. Rausch, 2002. Evaluating agricultural nonpoint-source pollution

programs in two Lake Erie tributaries. Journal of Environmental Quality 31: 24-31.

Forterre, P., C. Brochier, and H. Philippe, 2002. Evolution of the Archaea Theoretical

Population. Biology 61: 409-422.

Foster S. J., and K. Johnstone. The trigger mechanism of bacterial germination. Smith, R. A.

Slepecky, and P. Setlow (ed.) Regulation of Procaryotic Development, Structural and

Functional Analysis of Bacterial Sporulation and Germination. American Society for

Microbiology Washington, D.C. 1989 pp. 89–108I

Freeman, A. R., C. M. Meghen, D. E. Machugh, R. T. Loftus, M. D. Achukwi, A. Bado, B.

Sauveroche, and D. G. Bradley, 2004. Admixture and diversity in West African cattle

populations. Molecular Ecology 13: 3477-3487.

Edwards, C., 2000. Problems posed by natural environments for monitoring microorganisms.

Molecular Biotechnology 15: 211-223.

EPA, 2005. http://www.epa.gov/greatlakes/monitoring/guard/ship.html> May 28, 2005.

Garcia, J. L., 1998. The methanogenic Archaea Comptes Rendus de l'Academie d'Agriculture de

France 84: 23-33.

Gillooly, D. J., A. G. Robertson, and C. A. Fewson, 1998. Molecular Characterization of Benzyl

Alcohol Dehydrogenase and Benzaldehyde Dehydrogenase II of Acinetobacter

Calcoaceticus. Biochemical Journal 330: 1375-1381

Gray, J. P.; and R. P. Herwig, 1996. Phylogenetic analysis of the bacterial communities in

marine sediments. Applied and Environmental Microbiology 62: 4049-4059.

Great Lakes information network, 2004 , Jan. 3, 2004

60

Hansen, T. A., 1994. Metabolism of sulfate-reducing prokaryotes. Antonie van Leeuwenhoek

66: 165-185.

Head, I. M., J. R. Saunders, and R. W. Pickup, 1998. Microbial evolution, diversity, and ecology:

A decade of ribosomal RNA analysis of uncultivated microorganisms. Microbial Ecology

35: 1-21

Hendrickson, E. R., J. A. Payne, R. M. Young, M. G. Star, M. P. Perry, S. Fahnestock, D. E.

Ellis and R. C. Ebersole, 2002. Molecular analysis of Dehalococcoides 16S ribosomal

DNA from chloroethene-contaminated sites throughout North America and Europe.

Appl. Environ. Microbiol. 68: 485-495.

Holmes, B., R. E. Weaver, A. G. Steigerwalt, and D. J. Brenner, 1988. A taxonomic study of

Flavobacterium spiritivorum and Sphingobacterium mizutae: Proposal of

Flavobacterium yabuuchiae sp. nov. and Flavobacterium mizutaii. International Journal

of Systematic Bacteriology 38: 348–353.

Hoostal, M. J., G. S. Bullerjahn, and M. L. McKay, 2002. Molecular assessment of the potential

for in situ bioremediation of PCBs from aquatic sediments. Hydrobiologia 469: 59-65

Hovanec, T. A., L. T. Taylor, A. Blakis, and E. F. Delong, 1998. Nitrospira-like bacteria

associated with nitrite oxidation in freshwater aquaria. Applied and Environmental

Microbiology 64: 258-264.

Itoh, T., 2003. Taxonomy of Nonmethanogenic Hyperthermophilic and Related Thermophilic

Archaea. Journal of Bioscience and Bioengineering. 96: 203-212.

Kirk, J. L., L. A. Beaudette, M. Hart, P. Moutoglis, J. N. Klironomos, H. Lee, and J. T. Trevors,

2004. Methods of studying soil microbial diversity. Journal of Microbiological Methods

58: 169-188.

61

Koonce, J. F., W. Busch, N. Dieter, and T. Czapla, 1996. Restoration of Lake Erie: Contribution

of water quality and natural resource management. Canadian Journal of Fisheries and

Aquatic Sciences 53: 105-112.

Kumar, S., K. Tamura, and M. Nei, 2004. MEGA 3: Integrated software for Molecular

Evolutionary Genetics Analysis and sequence alignment. Briefings in Bioinformatics 5:

150-163.

Lane D. J., B. Pace, G. J. Olsen, D. A. Stahl, M. L. Sogin, and N. R. Pace, 1985. Rapid

Determination of 16S Ribosomal RNA Sequences for Phylogenetic Analyses.

Proceedings of the National Academy of Sciences of the United States of America 82:

6955-6959

Loy A., A. Lehner, N. Lee, J. Adamczyk, H. Meier, J. Ernst, K. H. Schleifer, and M. Wagner,

2002. Oligonucleotide Microarray for 16S rRNA Gene-Based Detection of All

Recognized Lineages of Sulfate-Reducing Prokaryotes in the Environment. Applied and

Environmental Microbiology, 68: 5064-5081

Marvin, C. H., M. N. Charlton, E. J. Reiner, T. Kolic, K. MacPherson, G. A. Stern, E.

Braekevelt, J. F. Estenik, L. Thiessen, and S. Painter, 2002. Surficial sediment

contamination in Lakes Erie and Ontario: A comparative analysis. Journal of Great Lakes

Research 28: 437-450.

McAleece, N., 1997. BioDiversity Professional beta version software. The National History

Museum and the Scottish Association for Marine Science.

Mikesell, M. D., J. J. Kukor, and R. H. Olsen, 1993. Metabolic diversity of aromatic

hydrocarbon-degrading bacteria from a petroleum-contaminated aquifer. Biodegradation

4: 249-259

62

Monciardini, P., L. Cavaletti, P. Schumann, M. Rohde, and S. Donadio, 2003. Conexibacter

woesei gen. nov., sp. nov., a novel representative of a deep evolutionary line of descent

within the class Actinobacteria. International Journal of Systematic and Evolutionary

Microbiology 53: 569-576.

Muller, A. K., L. D. Rasmussen, and S. J. Sorensen, 2001. Adaptation of the bacterial

community to mercury contamination. FEMS Microbiology Letters 204: 49-53

Muller, A. K., K. Westergaard, S. Christensen, and S. J. Sorensen, 2002. The diversity and

function of soil microbial communities exposed to different disturbances. Microbial

Ecology 44: 49-58

National Center for Biotechnology Information, 2004 , Sept. 12,

2004

National Parks of Canada. Lake Erie, 2005

gla5_E.asp>, March 2, 2005

Paerl, H. W., and J. L. Pinckney, 1996 A mini-review of microbial consortia: Their roles in

aquatic production and biogeochemical cycling. Microbial Ecology 31: 225-247

Painter, S., C. Marvin, F. Rosa, T. B. Reynoldson, M. N. Charlton, M. Fox, P. A. Thiessen, and

J. F. Estenik, 2001. Sediment contamination in Lake Erie: A 25-year retrospective

analysis. Journal of Great Lakes Research 27 :434-448.

Pielou, E. C., The interpretation of Ecological Data, Wiley Pub. Co., New York, 1984.

Purdy, K. J., D. B. Nedwell, and T. M. Embley, 2003. Analysis of the Sulfate-Reducing Bacterial

and Methanogenic Archaeal Populations in Contrasting Antarctic Sediments. Applied and

Environmental Microbiology 69: 3181-3191.

Robertson, P., D. C. Coleman, C. S. Bledsoe, and P. Sollins, (eds.) Standard Soil Methods for

Long-Term Ecological Research; Oxford University Press: Oxford, NY, 1999.

Schwieger, F., and C. C. Tebbe, 2000. A new approach to utilize PCR-single-strand-

conformation polymorphism for 16S rRNA gene-based microbial community analysis.

Applied and Environmental Microbiology 64: 4870-4876.

Seckbach, J. (ed.) Journey to Diverse Microbial Worlds; Kluwer Academic Publishers:

Dordrecht, The Netherlands, 2000.

Smith, R. E. H., C. D. Allen, and M. N. Charlton, 2004. Dissolved organic matter and ultraviolet

radiation penetration in the Laurentian Great Lakes and tributary waters. Journal of Great

Lakes Research 30: 367-380.

Soltmann, U., H. Wand, A. Mueller, P. Kuschk, and U. Stottmeister, 2002. Exposure to

xenobiotics deeply affects the bacteriocenosis in the rhizosphere of helophytes. Acta

Biotechnologica 22: 161-166.

Terracciano, J. S., and E. Canale-Parola, 1984. Enhancement of chemotaxis in Spirochaeta

aurantia grown under conditions of nutrient limitation. Journal of Bacteriology 159:

173-178.

Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins, 1997. The

ClustalX Windows interface: flexible strategies for multiple sequence alignment aided by

quality analysis tools. Nucleic Acids Research 25: 4876-4882.

Travisano, M., and P. B. Rainey, 2000. Studies of adaptive radiation using model microbial systems.

American Naturalist 156: S35-S44.

Whitman, W. B., T. L. Bowen, and D. R. Boone, 1992. The Methanogenic Bacteria. Pages

719-767. In: A. Balows, H.G. Trüper, M. Dworkin, W. Harder, and K.-H. Schleifer (eds.),

The Prokaryotes, 2nd edition. Springer Verlag, New York Berlin Heidelberg.

Woese, C. R., E. Stackebrandt, T. J. Macke, and G. E. Fox, 1985. A Phylogenetic Definition of

the Major Eubacterial Taxa. Systematic and Applied Microbiology 6: 143-151.

Woese, C. R., 1998. Default taxonomy: Ernst Mayr's view of the microbial world. Proceedings

of the National Academy of Sciences of the United States of America 95: 11043-11046.

Wolf, M., T. Moller, T. Dandekar, and J. D. Pollack, 2004. Phylogeny of Firmicutes with special

reference to Mycoplasma (Mollicutes) as inferred from phosphoglycerate kinase amino

acid sequence data. International Journal of Systematic and Evolutionary Microbiology

54: 871-875.