PROKARYOTIC DIVERSITY OF BOILING SPRINGS LAKE, LASSEN VOLCANIC NATIONAL PARK

HUMBOLDT STATE UNIVERSITY

By

Andrea Bartles

A Thesis

Presented to

The Faculty of Humboldt State University

In Partial Fulfillment

Of the Requirements for the Degree

Master of Arts

In Biology

(May, 2007)

We certify that we have read this study and that it conforms to acceptable standards of scholarly presentation and is fully acceptable, in scope and quality, as a thesis for the degree of Master of Arts.

Approved by the Master’s Thesis Committee:

Patricia Siering, Major Professor

Mark Wilson, Committee Member

Michael Camann, Committee Member

Brian Arbogast, Committee Member

Michael Mesler, Graduate Coordinator

Chris A. Hopper, Interim Dean, Research, Graduate Studies & International Programs

ABSTRACT

The identification of organisms present in an environment is a crucial prerequisite for understanding the role of organisms in the environment and the processes influencing the diversity of those organisms. In general, the basic ecology of acidic thermal environments is poorly understood, partially due to the lack of cultivability of prokaryotes from those environments. The purpose of this project was to examine the prokaryotic community composition in Boiling Springs Lake (BSL), a hot, acidic lake in Lassen Volcanic National Park (LVNP). Culture-independent methods were used to identify the prokaryotes living in BSL and to compare the prokaryotic community composition at four sites around the lake, ranging in temperature from 52.2-82.3ºC. First, prokaryotes were identified using a 16S rRNA clone library constructed from water samples collected at the warmest of the four sites. Analysis of the clone library identified sequences from the domains Bacteria and Archaea. Approximately 75% of the clones sequenced and 27% of the identified phylotypes belonged to the domain Bacteria.

Terminal Restriction Fragment Length Polymorphism (TRFLP), a molecular method that has been used to approximate diversity within microbial communities, was used to compare community composition at all four sampling sites. TRFLP diversity fingerprints were examined for intra-sample and inter-sample variation in an effort to evaluate the sensitivity and reproducibility of this method in detecting variation between and among prokaryotic communities in BSL. Some variation between independent extractions of the same sample was observed, but, in most cases, it was less than the variation observed between different samples. TRFLP was able to resolve community composition differences between samples where differences were expected based on temperature. All of the phylotypes identified in the clone library were detected by TRFLP in at least one of the sampling sites, and six of the phylotypes were detected in water samples from all four sites. Nonmetric multidimensional scaling ordination of TRFLP fingerprints generally resulted in clusters of extractions correlated with sampling site and temperature. TRFLP fingerprints from the warmest site were characterized by a Thermoplasmatales-like phylotype, an Ignicoccus-like phylotype, and Hydrogenobaculum. The results of this study provide an initial look at the bacterial diversity present in BSL and suggest that differences in community composition around the lake are correlated with temperature. They also indicate that TRFLP is a sensitive and repeatable enough method to detect variation in the prokaryotic community composition at various sites in BSL.


ACKNOWLEDGEMENTS

I would sincerely like to thank the people who made it possible for me to complete this project. Most of all, I would like to thank Patty Siering and Mark Wilson for their guidance and support on every aspect of this project. Over the past four years, they have been invaluable as both teachers and friends. Thanks to Mike Camann and Brian Arbogast for their assistance in developing this project and Mike Camann for his assistance with the statistical analyses. Thanks to Anthony Baker for sharing his laboratory knowledge and space. Thanks to Michelle Hauser, Luke Hamm, and Christy Whitehouse for their assistance in collecting and processing samples. Michelle also did extremely helpful preliminary work on TRFLP. Thanks to Michelle Ansell for helping me complete plasmid preps. Thanks to Ryan Brodie and Jocelyn Jones for collecting AODC data. Thanks to Chris White for working side by side with me on TRFLP, sharing his data, and providing encouragement along the way. Thanks to Donnie Carter for his suggestions in the lab. I would also like to thank Casey Lu for giving me guidance and providing me with opportunities to help me become a better teacher. Finally, I would like to thank Marty Yip for encouraging me to accomplish my goals, providing emotional support when I needed it, and helping me to keep balance in my life along the way.


TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: BACKGROUND AND OBJECTIVES
    Overview
    Site Description
    Thermoacidophilic Life
    Cultivation vs. Culture Independent Investigation
    Molecular Estimates of Diversity
CHAPTER 2: CLONE LIBRARY AND PHYLOGENY
    Introduction
    Methods
        Sample Collection
        Acridine Orange Direct Counts (AODC)
        Nucleic Acid Extractions
        Creation of Site D rRNA Gene Clone Library
        Sequence Analysis and Phylogeny Estimation
    Results
        Site Characterization
        Community Analysis
    Discussion
CHAPTER 3: TERMINAL-RESTRICTION FRAGMENT LENGTH POLYMORPHISM (TRFLP)
    Introduction
    Methods
        Sample Collection and Nucleic Acid Extraction
        PCR Amplification
        Mung Bean Nuclease Digestion
        Restriction Enzyme Digestion
        Polyacrylamide Gel Electrophoresis
        TRFLP Analysis
    Results
        Intra-sample Variation
        Inter-sample Variation
        Inter-site Variation
        Comparison of TRFLP with Clone Library
        Statistical Analysis
    Discussion
        Assessment of the Sensitivity, Repeatability and Resolution of TRFLP
            Sources of variation
            Intra-sample variation
            Inter-sample variation
        Assessment of Diversity in BSL
            Inter-site variation
            Pooled vs. Averaged Extractions
            Factors influencing microbial communities
            Factors influencing TRFLP results
        Suggestions for Future Studies
CHAPTER 4: CONCLUSION
REFERENCES

LIST OF TABLES

Table

1 Common known thermoacidophiles

2 Common phylotypes in BSL site A sediment clone libraries

3 Summary of BSL site D clone library phylotypes

4 Summary of nucleic acid extractions from BSL water samples collected July 19, 2004

5 Bray-Curtis similarity indices showing variation between multiple extractions from each BSL sample (higher values indicate greater similarity)

6 Bray-Curtis and Jaccard similarity indices showing variation between water samples from BSL (higher values indicate greater similarity)

7 Bray-Curtis similarity indices showing variation between sampling sites in BSL (higher values indicate greater similarity)

8 Bray-Curtis similarity indices comparing TRFLP profiles of pooled samples with averaged TRFLP profiles from sampling sites in BSL (higher values indicate greater similarity). For each site, pooled samples are compared with averaged results for all extractions and with averaged results for only those extractions included in the pooled sample. All results are from the primer set U341F/U1406R

9 Comparison of averaged TRFLP results for sites A, B, C, and D with phylotypes detected in site D clone library

LIST OF FIGURES

Figure

1 Maps of Lassen Volcanic National Park. (a) LVNP is located at the southern end of the Cascade Range in northern California. (image taken from www.lib.utexas.edu) (b) Shaded area indicates LVNP boundaries. BSL (indicated by a white star) is in the Warner Valley of LVNP. (image taken from www.americansouthwest.net)

2 Photograph and schematic of BSL. (a) The view from a cliff above the south end of the lake shows sampling sites A, B, and C and outflow (indicated by arrow). (b) The surface temperature and pH of several locations around the lake periphery are relatively constant (~52˚C, pH 2.24 in June-August 1999-2006), except for warmer and more variable temperatures at sites C and D. Several high temperature springs and mudpots on the periphery are indicated by grey swirls. GPS positions and pH/T data are from July 19, 2004. (image from Siering and Wilson, personal communication)

3 Summary of TRFLP. (1) Environmental samples are collected, filtered and the DNA extracted. (2) DNA is amplified through PCR with fluorescently-labeled 16S rRNA forward primer and unlabeled 16S rRNA reverse primer. (3) Digestion of PCR products produces labeled fragments of various lengths. (4) Polyacrylamide gel electrophoresis is used to separate fragments. (5) Analysis software is used to determine band size, intensity, and to generate a chromatogram with peaks whose area corresponds to the percent community composition of the microbe represented by the peak. (image taken from 35.8.164.52/html/t-rflp_jul02.html)

4 Schematic of BSL. The approximate locations of sampling sites are indicated

5 effort curve for BSL Site D clone library. The curve indicates that the clone library was sampled sufficiently to identify the majority of phylotypes present

6 Phylogenetic tree of Bacterial sequences from BSL Site D clone library. 16S rRNA genes were amplified using universal primers 341F and 1406R. Sequences of reference taxa were downloaded from the RDP II database (Cole et al. 2003) in an aligned format. Tree topology was consistent across parsimony, distance, maximum likelihood and Bayesian analyses. Bayesian tree is shown. Numbers at nodes represent bootstrap confidence values from maximum likelihood method and are based on 100 bootstrap resamplings of dataset. Numbers at nodes within parentheses represent Bayesian posterior probabilities if different from bootstrap values.

7 Phylogenetic tree of Archaeal sequences from BSL Site D clone library. 16S rRNA genes were amplified using universal primers 341F and 1406R. Sequences of reference taxa were downloaded from the RDP II database (Cole et al. 2003) in an aligned format. Tree topology was consistent across parsimony, distance, maximum likelihood and Bayesian analyses. Bayesian tree is shown. Numbers at nodes represent bootstrap confidence values from maximum likelihood method and are based on 100 bootstrap resamplings of dataset. Numbers at nodes within parentheses represent Bayesian posterior probabilities if different from bootstrap values

8 Dendrogram showing the relationship of BSL extractions to each other based on Bray-Curtis distances between TRFLP profiles. Smaller Bray-Curtis distance values indicate greater similarity between extractions. All results are from the primer set U341F/U1406R

9 Dendrogram showing the relationship of BSL extractions to each other based on Bray-Curtis distances between TRFLP profiles. Smaller Bray-Curtis distance values indicate greater similarity between extractions. All results are from the primer set U515F/U1406R. Extractions are labeled with the letter indicating site, the first number indicating sample, and the second number indicating extraction

10 Nonmetric multidimensional scaling analysis of TRFLP profiles for BSL extractions. In plot A, symbols represent sampling sites. In plot B, vectors show correlation of individual TRFLP fragments and temperature with site clusters. The unreadable group of vectors associated with site D extractions includes temperature and fragments with sizes 46, 49, 171, 174, and 200. All results are from the primer set U341F/U1406R.

CHAPTER 1: BACKGROUND AND OBJECTIVES

Overview

The purpose of this project was to examine the prokaryotic community composition in Boiling Springs Lake (BSL), a hot, acidic lake in Lassen Volcanic National Park (LVNP). Culture-independent methods were used to identify the prokaryotes living in BSL and to compare the prokaryotic community composition at four sites around the lake, ranging in temperature from 52.2-82.3ºC. First, prokaryotes were identified using a 16S rRNA clone library constructed from water samples collected at the warmest of the four sites. Then, Terminal Restriction Fragment Length Polymorphism (TRFLP), a molecular method that has been used to approximate diversity within microbial communities (22, 32, 65, 84, 90), was used to crudely estimate community composition at all four sampling sites. TRFLP diversity fingerprints were examined for intra-sample and inter-sample variation in an effort to evaluate the sensitivity and reproducibility of this method in detecting variation between and among prokaryotic communities in BSL. Correlations were made between observed differences in community composition around the lake and temperature. This project is part of a larger study which aims to examine microbial community structure in the geothermal features at LVNP. Baseline studies such as this are the first step in addressing ecological questions concerning the metabolic processes and geomicrobiology of these extremophilic communities.

The objectives of this study were to: (1) use culture independent methods to identify the prokaryotes living in BSL; (2) determine if TRFLP is a sensitive enough method to detect variation in the prokaryotic community composition at various sites in BSL; (3) use TRFLP to assess differences in prokaryotic community composition at thermally distinct sites around BSL, and look for correlations between any detected variation and temperature.

Site Description

LVNP is located in north central California at the southern end of the Cascade Range (Figure 1). Lassen is an active volcanic system formed by the subduction of the oceanic Juan de Fuca and Gorda Plates beneath the continental North American Plate (23). Volcanic activity in the area began approximately 600,000 years ago with the formation of the Mt. Tehama stratovolcano. Prior to the formation of Lassen Peak, Mt. Tehama collapsed, and a substantial part of it has been removed by erosion. Lassen Peak is a plug dome volcano formed 27,000 years ago from a vent on Tehama’s northeastern slope. The most recent eruptions in LVNP occurred from 1914-1917, again altering the surrounding landscape.

Active volcanism in LVNP has created an extensive system of hydrothermal features, including fumaroles, mudpots, boiling pools, and steaming ground. The hydrothermal system at Lassen is a vapor-dominated system resulting in low chloride, acid sulfate waters (158). Water from rain and snow flows down through the permeable rock, fractures and faults to an underground heat source. After the water is heated, it rises to a depth of approximately 1 km, where the pressure has decreased enough to allow boiling, creating underground steam chambers. Steam then rises to the surface creating fumaroles and steaming ground, or condenses and heats ground water near the surface, creating mudpots and boiling pools. In this near-surface, oxygen-rich environment, hydrogen sulfide gas is oxidized to form elemental sulfur and sulfuric acid, which helps shape the chemistry of hydrothermal features. Few studies have examined the microbial diversity in the features of LVNP. Two studies have focused on the biogeography of specific organisms (53, 161). One study examined the genetic diversity of protists (16), and one study investigated the diversity of microorganisms in geochemically distinct hydrothermal features around the park (144).

Figure 1. Maps of Lassen Volcanic National Park. (a) LVNP is located at the southern end of the Cascade Range in northern California. (image taken from www.lib.utexas.edu) (b) Shaded area indicates LVNP boundaries. BSL (indicated by a white star) is in the Warner Valley of LVNP. (image taken from www.americansouthwest.net)

Boiling Springs Lake (BSL) is one of the hydrothermal features at LVNP (Figure 2). It is located at the southern end of the park, along Terminal Geyser fault, at an elevation of 2,062 m. With an area of approximately 12,000 m², it is the largest acidic hot spring in North America. Physical characteristics and geochemistry of BSL have been described previously (66, 144, 158). In summer (June-August), BSL surface water has an average temperature of ~52 +/- 3ºC, and an average pH of 2.0 +/- 0.4. In winter, access to the lake is limited by snow, but preliminary satellite remote sensing suggests average winter temperatures up to 30ºC cooler (Pedelty et al., unpublished). Temperature across the peripheral lake surface is variable, with the hottest area (Site D, Figure 2b, 65-95ºC in summer) located at the southern end where mud pots and bubbling springs are visible. High temperature springs and mud pots also surround the periphery of BSL and can be in contact with the lake during part of the year. There is a seasonal outflow stream at the northeast end of the lake until approximately mid-July each year. Geochemical analyses have shown BSL to have high sulfate, low chloride, and low metal water chemistry (144, 158), making it different from most of the hot, acidic springs characteristic of Yellowstone National Park (YNP).

Figure 2. Photograph and schematic of BSL. (a) The view from a cliff above the south end of the lake shows sampling sites A, B, and C and outflow (indicated by arrow). (b) The surface temperature and pH of several locations around the lake periphery are relatively constant (~52˚C, pH 2.24 in June-August 1999-2006), except for warmer and more variable temperatures at sites C and D. Several high temperature springs and mudpots on the periphery are indicated by grey swirls. GPS positions and pH/T data are from July 19, 2004. (image from Siering and Wilson, personal communication)

Thermoacidophilic Life

The high temperature and low pH of BSL create an extreme environment, limiting the growth and survivability of many organisms. The organisms thriving in conditions of BSL are labeled thermoacidophiles, indicating that optimal growth occurs at high temperatures and low pH. Thermophilic bacteria are commonly defined as having an optimal growth temperature at or above 55ºC (13). This boundary is based on the widespread presence of temperatures lower than 50ºC on Earth and the presently known upper temperature limit for eukaryotic life at 60ºC (14). Organisms that exhibit optimal growth above 80ºC are referred to as hyperthermophiles.

Growth at high temperatures presents unique challenges to bacteria. Their proteins, cytoplasmic membranes, and nucleic acids must be heat stable. Protein thermostability is achieved by an increased number of ion pairs, a highly hydrophobic core, the formation of oligomers containing multiple subunits, an increased number of salt bridges, expression of heat-shock chaperone-like proteins, a decrease in the surface to volume ratio, and decreased length of surface loops (29, 71, 116, 134).

Cytoplasmic membranes of thermophiles in the domain Bacteria are heat-stabilized by high proportions of saturated fatty acids, which form stronger hydrophobic interactions resulting in membranes more resistant to thermal lysis. Thermophilic Archaea have membranes composed of repeating units of the five-carbon compound isoprene bonded by ether linkage to glycerol phosphate, forming a lipid monolayer, which is more resistant to melting than the typical lipid bilayer (94).

The DNA of thermophiles may be protected from denaturation by high intracellular salt concentrations (95, 116). KCl and MgCl₂ protect DNA from depurination and hydrolysis. Hyperthermophiles have a DNA topoisomerase called reverse DNA gyrase, which positively supercoils DNA (44). Positive supercoiling has been shown to stabilize DNA to heat denaturation. Although G-C pairs of nucleic acids are more stable in high temperatures than A-T pairs, no correlation has been found between G+C ratios and thermophily. There is, however, an increase in G+C content of RNA in organisms with higher optimal growth temperatures, resulting in stabilization of secondary structure (48). Post-transcriptional modification of RNA, such as methylation or acetylation of nucleotides, restricts flexibility, thereby contributing to thermal stability (83, 116).
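The G+C content referred to above is simply the fraction of guanine and cytosine bases in a sequence. The following Python sketch shows one way this could be computed; the function name and the two short RNA fragments are hypothetical illustrations and are not data from this study.

```python
def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of a sequence."""
    # Treat RNA (U) and DNA (T) alike and ignore ambiguous symbols such as N.
    seq = sequence.upper().replace("U", "T")
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Hypothetical rRNA fragments, used only to show the calculation.
low_gc_fragment = "AUUAGCAUUAAU"
high_gc_fragment = "GGCGCCGGAUCC"

for name, fragment in [("low", low_gc_fragment), ("high", high_gc_fragment)]:
    print(f"{name} G+C fragment: {gc_fraction(fragment):.2f}")
```

Applied to aligned rRNA sequences from organisms with known optimal growth temperatures, a calculation of this kind is what underlies the reported relationship between rRNA G+C content and thermophily.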

Acidophiles are organisms that grow optimally at pH 6 or lower. Unlike intracellular temperature which is determined by the environmental temperature, intracellular pH is usually controlled to remain near neutrality, regardless of the environmental pH. Acidophilic prokaryotes maintain transmembrane pH gradients of up to 5 pH units (93). The most extreme acidophiles described, Picrophilus oshimae and Picrophilus torridus, grow optimally at pH 0.7 and have intracellular pHs of 4.6 (47, 116, 159). Therefore, the challenge to an organism living in an acidic environment is often not how to maintain the structure of intracellular molecules, but how to control intracellular pH.
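To give a sense of scale for these gradients, the short Python sketch below converts the Picrophilus pH values quoted above into a proton concentration ratio, using only the definition pH = -log10[H+]. The function names are hypothetical, and the calculation ignores activity coefficients.

```python
def proton_concentration(ph: float) -> float:
    """Molar H+ concentration implied by a pH value (pH = -log10[H+])."""
    return 10.0 ** (-ph)


def proton_gradient_ratio(external_ph: float, internal_ph: float) -> float:
    """Fold excess of external over internal H+ concentration."""
    return proton_concentration(external_ph) / proton_concentration(internal_ph)


# Values quoted in the text for Picrophilus: optimal external pH 0.7, internal pH 4.6.
external_ph, internal_ph = 0.7, 4.6
delta_ph = internal_ph - external_ph
ratio = proton_gradient_ratio(external_ph, internal_ph)
print(f"delta pH = {delta_ph:.1f}; external/internal [H+] ratio = {ratio:.0f}")
```

A difference of 3.9 pH units corresponds to a roughly 8,000-fold higher proton concentration outside the cell than inside it, and a gradient of 5 pH units corresponds to a 100,000-fold difference.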

One of the mechanisms used by acidophiles to maintain internal pH is an extremely low cell membrane permeability to protons (47). A tetraether membrane monolayer (described above) is extremely proton impermeable. This membrane structure is present in all known acidophilic Archaea and may be crucial for their survival in acid (93). However, acidophilic Bacteria lack this membrane structure. Other mechanisms include positive surface charges, high internal buffer capacity, overexpression of H⁺ export enzymes and unique transporters (97, 134). Many acidophiles have unusual bioenergetics, as the membrane potential is positive inside the cell relative to the outside, unlike in most neutrophiles (97). Although intracellular pH of acidophiles is usually close to neutral, their membrane-associated and extracellular proteins must be acid stable. The mechanisms used to achieve this are not entirely understood, but they have been observed to include both reductions and increases in the number of charged amino acid residues and either highly positive or highly negative surface charges (116).

Hydrothermal habitats support a variety of thermoacidophilic prokaryotes with diverse metabolic strategies. Most thermoacidophiles use either aerobic or anaerobic respiration as their energy-producing pathway (73), and the majority are Archaea. Thermoacidophiles in the domain Bacteria are less numerous than Archaea, but they also use a variety of metabolic strategies for survival. Some common known thermoacidophiles are listed in Table 1.

Table 1. Common known thermoacidophiles.

Domain | Genus | Aerobic | Anaerobic (e⁻ acceptors) | Heterotrophic | Lithotrophic (e⁻ donors) | Temperature for growth | pH for growth | Reference
Archaea |  | + | - | + | + (sulfidic ores, S⁰, tetrathionate, sulfide, hydrogen) | 65-85ºC | 1.0-5.5 | (15, 61)
Archaea | Picrophilus | + | - | + | - | 47-60ºC | 0-3.5 | (138)
Archaea | Metallosphaera | + | - | + | + (sulfidic ores, S⁰, hydrogen) | 50-80ºC | 1.0-4.5 | (60, 61)
Archaea | Sulfurococcus | + | - | + | + (S⁰, sulfide minerals, Fe²⁺) | 40-85ºC | 1.0-5.8 | (51, 131)
Archaea | Acidianus | + | + (S⁰) | - | + (hydrogen) | 45-96ºC | 1.0-6.0 | (73, 140)
Archaea | Sulfurisphaera | + | + (S⁰) | + | + (S⁰) | 63-92ºC | 1.0-5.0 | (87, 130)
Archaea | Thermoplasma | + | + (S⁰) | + | - | 33-67ºC | 0.5-4.0 | (139)
Archaea |  | - | + (S⁰) | - | + (hydrogen) | 57-89ºC | 1.0-5.5 | (141)
Archaea | Acidilobus | - | + (S⁰) | + | - | 60-92ºC | 2.0-6.0 | (123)
Archaea |  | - | + (S⁰, thiosulfate) | + | - | 65-92ºC | 3.5-5.6 | (68)
Archaea |  | - | + (S⁰) | + | - | 45-80ºC | 2.3-5.4 | (69)
Bacteria | Leptospirillum | + | - | - | + (Fe²⁺) | 30-55ºC | 1.3-2.0 | (74)
Bacteria | Sulfobacillus | + | - | + | + (S⁰, Fe²⁺, sulfide minerals) | 17-60ºC | 1.1-5.5 | (12, 110)
Bacteria | Acidimicrobium | + | - | + | + (Fe²⁺) | 45-50ºC (opt) | 2.0 (opt) | (21)
Bacteria | Hydrogenobaculum | + | - | - | + (hydrogen, reduced sulfur compounds) | 55-65ºC (opt) | 3.0-4.0 (opt) | (31, 142, 152)
Bacteria | Alicyclobacillus | + | - | + | + (S⁰, sulfide minerals, Fe²⁺) | <20-70ºC | 1.5-5.0 | (78)
Bacteria | Acidicaldus | + | + (Fe³⁺) | + | - | 40-65ºC | 1.75-3.0 | (76)

(+) indicates presence, (-) indicates absence, (opt) indicates the temperature or pH given is for optimal growth

Eukaryotic diversity in acidic hot springs is primarily limited by temperature rather than pH (73). The upper temperature limits for eukaryotic microorganisms are 56ºC for protozoa, 60ºC for algae, and 62ºC for fungi (94). The red alga Cyanidium caldarium lives in thermal acidic environments up to 55ºC and at pH 1.85-3.8 (14). A photosynthetic Euglena strain has been isolated from a hot, acidic mud pool in Costa Rica (146). The fungus Dactylaria gallopava is widespread in acid thermal habitats (14). The alga Chlamydomonadales and diatoms Pinnularia and Aulacoseira were identified in an acidic hot spring in LVNP (16).

Cultivation vs. Culture Independent Investigation

The identification of organisms present in an environment is a crucial prerequisite for understanding the role of organisms in the environment and the processes influencing the diversity of those organisms. Traditionally, cultivation efforts have been used to characterize inhabitants of microbial communities. While culturing gives important information regarding the physiology, structure, genetics, growth characteristics and pathogenicity of microorganisms, it cannot be used to adequately assess microbial diversity in natural environments; even in culture conditions presumed to be nearly identical to the natural environment, less than 1% of microorganisms are culturable from most environments (5, 9, 28, 64, 128), and in extreme and low nutrient environments culturability is significantly lower. Furthermore, microorganisms cultivated from the environment often do not represent the numerically dominant or functionally important organisms in that ecosystem (55). Culture-independent methods are, therefore, essential to assess microbial community diversity. Many of these methods rely on ribosomal RNA sequences. rRNAs have been widely used in culture-independent studies of microbial communities for several reasons (163): (1) they occur in all organisms; (2) they maintain a high degree of functional constancy; (3) they are large molecules with many domains; (4) different parts of the rRNA molecule evolve at different rates, resulting in highly conserved domains (allowing the inference of relationships among Bacteria, Archaea, and Eukarya) and hypervariable domains (allowing discrimination at the genus and species levels); (5) they are not believed to be laterally transferred among organisms. Recently, there has been growing evidence from genome sequencing that rRNA genes may be laterally transferred, but the rate of occurrence is unknown, and transfer appears to be minimal compared to other genes (50).

The most comprehensive and laborious of the culture-independent methods for studying microbial diversity in nature is sequencing clone libraries of rRNA genes derived from environmental DNAs. This method involves extracting nucleic acids from the sample, amplifying small subunit rRNA genes using the Polymerase Chain Reaction (PCR), ligating amplified genes into plasmids, inserting plasmids into E. coli, screening the colonies for the presence of plasmids, isolating the plasmids, and sequencing rRNA genes to identify individuals in a population. Although cloning and sequencing is a powerful method for identifying individuals in an environmental sample, it is not without problems. Biases can be introduced during nucleic acid extraction, because some cells may be more resistant to lysis than others (103). Rigorous conditions needed to lyse gram-positive cells may damage the nucleic acids of the gram-negative cells, potentially biasing the reported diversity of the sample (88). The use of PCR also introduces biases, because it does not amplify all genes equally (132, 153). This bias may be due to variable energetics in primer annealing and DNA denaturation due to G+C content in template or primer DNA (143), or variation in genome size and the number of rRNA genes within a given genome (39). PCR may also result in the formation of chimeric sequences from the artifactual joining of 16S rRNA gene sequences of two organisms (82, 88) or from within the genome of a single organism (160). Chimeras can be mistaken for novel taxa in subsequent analyses (63). DNA damage has been shown to contribute to the formation of chimeras during PCR (115). It has been suggested that increasing elongation time and minimizing the number of PCR cycles helps to reduce the occurrence of chimeras (3, 124). Also, if a clone library is not sequenced exhaustively, all organisms in the sample may not be represented by the subset of clones that was sequenced. Finally, sampling bias can affect clone library results. Variations in community composition can occur at a small scale in microbial ecosystems, so sample heterogeneity can be an issue if the sampling scale is not large enough to include all the organisms present in the natural habitat.
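Whether a clone library has been sequenced exhaustively is commonly judged from a sampling effort curve, which plots the cumulative number of distinct phylotypes against the number of clones sequenced. The Python sketch below is a minimal illustration of that idea; the phylotype labels are hypothetical, and the sketch is not a reconstruction of the analysis used in this study.

```python
def sampling_effort_curve(clone_phylotypes):
    """Cumulative count of distinct phylotypes after each successive clone."""
    seen = set()
    curve = []
    for phylotype in clone_phylotypes:
        seen.add(phylotype)
        curve.append(len(seen))
    return curve


# Hypothetical phylotype assignments, in the order the clones were sequenced.
clones = ["A", "A", "B", "A", "C", "B", "A", "A", "C", "A"]
curve = sampling_effort_curve(clones)
for clones_sequenced, phylotypes_found in enumerate(curve, start=1):
    print(clones_sequenced, phylotypes_found)
```

A curve that flattens, as this toy example does after the fifth clone, suggests that additional sequencing would recover few new phylotypes, whereas a curve that is still rising indicates that the library has been undersampled.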

Molecular Estimates of Diversity

Cloning and sequencing is useful to identify organisms present in a sample, but it is not practical to observe community composition changes over time due to its large time requirements and expenses. Some molecular techniques that can be used to obtain crude indices of microbial diversity in a sample are fluorescent in situ hybridization (FISH), DNA microarrays, ribosomal intergenic spacer analysis (RISA), denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE), restriction fragment length polymorphism (RFLP), and terminal restriction fragment length polymorphism (TRFLP). Each of these methods has limitations and advantages that must be considered when selecting which method to use for a specific study. All of these methods, except FISH, rely on nucleic acid extraction and PCR, so their results are affected by the biases previously mentioned.

In fluorescent in situ hybridization (FISH), fluorescently-labeled DNA probes are annealed to a signature region of an rRNA molecule of fixed cells. Signature phylogenetic probes have been identified for individual microbial species as well as for entire domains, so the degree of specificity can be controlled by the sequence of the probe. Fixed environmental samples can be examined by probing with a suite of probes (with different fluorescent labels) that are capable of identifying bacteria at varying levels of taxonomic hierarchy (143). Quantification can be done by comparing probe signal intensity of the environmental sample with that of a known amount of positive control (34, 129). One advantage of this technique is that it does not depend on nucleic acid extraction and PCR, eliminating the biases imposed by those methods. It is also rapid and inexpensive. Since their short length allows them to distinguish between single-mismatch differences in their potential targets, probes can be very specific (157); however, this specificity can also lead to the exclusion or underemphasis of organisms in a sample (150). Another limitation of FISH is that there must be a significant amount of target rRNA sequence in the cell for it to be detected. There are approximately 10³-10⁵ ribosomes per cell in actively growing cells, but microbial growth rates are often very slow in natural environments (143). In complex communities, organisms that comprise <0.1% of the sample can rarely be detected (5). Several methods have been developed in an effort to overcome the problem of low signal strength, including using polynucleotide probes (122), rRNA enrichment (41), and catalyzed reporter deposition (CARD)-FISH (40, 42, 67, 121). Variable signal strength, fading of fluorescent signal, and autofluorescence of non-target cells and acellular material are other limitations of this technique (143).

DNA microarrays provide a qualitative profile of taxonomic groups present in an environmental sample. Taxonomically differentiated fragments of 16S rDNA are spotted and immobilized on a glass slide to make hybridization targets. Microbial community DNA is either amplified by PCR or extracted directly from environmental samples, and labeled for use as a hybridization probe. The resulting hybridization is microscopically visualized and detected using lasers. The data are then analyzed using task-specific computer software (150). One of the benefits of DNA microarrays is that samples can be examined for many microorganisms quickly and easily, provided that a microarray facility is available. In some cases, communities can be analyzed without PCR amplification. Microarrays provide specific species information and general community profiles at the same time. Some commercially produced microarrays are now available, reducing development time and expense.

One significant limitation of DNA microarrays for community characterization is that sequence information must be available for a microorganism in order to assay for it, thereby limiting application to previously characterized species (150). Additionally, sequence divergence between probe and target sequences occurs. The accuracy of quantitative species assessment is questionable, because it is difficult to distinguish differences in signal intensity due to population abundance versus sequence divergence (166). Also, the hybridization of multiple probes to a single target can occur, making it difficult to differentiate between closely related organisms in a sample (162). Secondary structure of rRNA can create false negatives and positives (119). Since specialized equipment is required, use of this method can be limited by equipment availability. Finally, for environmental samples to be analyzed without using PCR, target genes must be present in large quantities (150). DNA microarrays have been used on unamplified nucleic acid extractions from sediment samples to characterize microbial communities (38, 120), successfully detecting community members with abundances as low as 2% (120). Recently, whole-community RNA amplification (WCRA) was developed to amplify environmental RNAs prior to microarray analysis without introducing the biases associated with PCR (49).

Ribosomal intergenic spacer analysis (RISA) is a technique that uses the natural variability of the prokaryotic ribosomal intergenic spacer (IGS) region of the rRNA gene to create a community profile. This region, flanked by the genes for 16S rRNA and 23S rRNA, is constant among members of a given species, but varies in length from 50 bp to 1500 bp between species (126), and it possesses a high degree of sequence variability (150). PCR amplification and gel electrophoresis of this region produces a community fingerprint based on size, and isolation and sequencing of gel bands can allow taxonomic identification of organisms in the sample. RISA is a rapid and simple method to obtain community fingerprints of environmental samples. Also, an automated method of RISA, called ARISA, has been developed. The variability in both size and sequence of the region allows for more detailed taxonomic identification of individual PCR amplicons than RFLP or DGGE, which depend on more conserved regions of the rRNA gene (109).

One major limitation of RISA is that the IGS database is much smaller and less comprehensive than the other ribosomal databases, making species identification from RISA bands less likely (150). Also, since IGS amplicons cover such a broad range of sizes, PCR biases may disproportionately affect RISA due to the preferential amplification of shorter sequences (43).

Community profiles can also be obtained using denaturing gradient gel electrophoresis (DGGE), which separates PCR amplified rRNA genes based on differences in G+C content. Amplicons are run on a polyacrylamide gel made with a gradient of DNA-denaturing compounds such as urea and formamide. As it moves through the gel, DNA becomes increasingly denatured, decreasing its mobility until movement stops when it is almost completely denatured. Since G-C base pairs are harder to denature than A-T base pairs, the position on the gel at which DNA stops moving depends on the G+C content of the DNA. If properly calibrated, DGGE is sensitive enough to detect single base pair differences between amplicons (101). The gel bands can be isolated and sequenced for taxonomic identification. Some problems with using DGGE are: (1) the denaturing gradient and electrophoretic duration must be carefully calibrated prior to sample analysis (104); (2) the size of DNA fragments is typically limited to below 500 bp (106); and (3) large quantities of DNA are required for effective resolution (107). There is also the possibility that multiple amplicons migrate to the same position in the gel, or that one species results in more than one band due to the existence of multiple copies of rRNA genes in a single organism (112). Brighter bands in a DGGE gel are often assumed to represent the dominant members of the community, but PCR biases can lead to over- or underrepresentation of certain species (150). Typically, only DNA from organisms comprising at least 1% of the community can be visualized using this technique (105). Finally, extremely complex communities can result in too many bands to visualize individually (150). Temperature gradient gel electrophoresis (TGGE) is a variant of DGGE, using a temperature gradient instead of a chemical concentration gradient to denature the DNA, and it has advantages and disadvantages similar to those of DGGE.
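The dependence of DGGE band position on G+C content can be illustrated with a small calculation. The following is a minimal sketch in Python, not part of the methods used in this study; the function name and the two short amplicon sequences are hypothetical. Overall G+C fraction is only a crude proxy, since actual migration depends on the melting behavior of individual domains within each amplicon.

```python
def gc_fraction(seq):
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


# Hypothetical amplicons of equal length but different G+C content.
amplicons = {
    "amplicon_1": "ATGCGCGCTA",
    "amplicon_2": "ATATATGCTA",
}

# Rank amplicons from lowest to highest G+C fraction; under the simple
# reasoning in the text, higher G+C amplicons denature later in the gradient.
for name, seq in sorted(amplicons.items(), key=lambda kv: gc_fraction(kv[1])):
    print(name, gc_fraction(seq))
```

Two amplicons with the same overall G+C fraction could still separate on a real gel, which is why this calculation should be read as an illustration of the principle rather than a predictor of band position.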

Restriction fragment length polymorphism (RFLP) is a method in which PCR amplified rRNA genes from an environmental sample are digested with one or more restriction enzyme(s). Since sequence variation between species will create different restriction sites for various enzymes, different species will yield different fragment sizes. The digested DNAs are run on a polyacrylamide gel, resulting in a pattern of fragment sizes that is characteristic of the community. This method has also been called amplified ribosomal DNA restriction analysis (ARDRA). One limitation of RFLP is that it provides a community fingerprint, but it does not allow for taxonomic identification of organisms in the sample. Probe hybridization can be used in conjunction with RFLP for species identification (92). Also, each experiment requires preliminary digestions to determine which restriction enzyme(s) provide the highest resolution between samples (127). If a clone library is constructed using the same PCR primers, “virtual digests” (www.restrictionmapper.org) of clone library sequences can be used to choose restriction enzymes.

Terminal-restriction fragment length polymorphism (TRFLP) is an extension of RFLP (Figure 3). Using one fluorescently labeled primer and one unlabeled primer, rRNA genes are PCR-amplified from nucleic acids extracted from an environmental sample. The amplicons are digested with one or more restriction enzyme(s). Digested samples are run on a polyacrylamide gel, and the fluorescently labeled terminal restriction fragments (TRFs) are detected by an automated DNA sequencer, which reads both the size of the fragments and the intensity of each band. TRFLP output consists of a digital profile with size represented on the horizontal axis and intensity represented on the vertical axis. Ideally, each TRF peak represents a single species, and peak intensity represents the abundance of a given fragment size in the community. There are extensive rDNA databases that can be used to predict TRF size for all known sequences for a given set of PCR primers and restriction enzymes. TRFLP can also be combined with a clone library constructed using the same primers to identify major members of the community. This is accomplished by using sequences of clones to predict TRF size based on restriction enzyme cut site. Peaks in the TRFLP output that correspond to an expected fragment size from a clone library sequence may tentatively be assumed to represent that clone and its corresponding phylotype. Like RFLP, this method requires preliminary studies to determine the restriction enzyme(s) that will best differentiate between species in the sample. If paired with a clone library, this task can be simplified by using “virtual digests” (www.restrictionmapper.org) of clone library sequences to choose restriction enzymes.

Figure 3. Summary of TRFLP. (1) Environmental samples are collected, filtered and the DNA extracted. (2) DNA is amplified through PCR with fluorescently-labeled 16S rRNA forward primer and unlabeled 16S rRNA reverse primer. (3) Digestion of PCR products produces labeled fragments of various lengths. (4) Polyacrylamide gel electrophoresis is used to separate fragments. (5) Analysis software is used to determine band size, intensity, and to generate a chromatogram with peaks whose area corresponds to the percent community composition of the microbe represented by the peak. (image taken from 35.8.164.52/html/t-rflp_jul02.html)
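The idea behind a virtual digest, predicting the labeled TRF size from a clone sequence, can be sketched in a few lines of Python. This is an illustrative sketch only and is not the tool cited above; the function name, the clone sequences, and the choice of HaeIII (which cuts GG^CC) are assumptions made for the example. It also assumes that each sequence begins at the 5' end of the labeled forward primer and that the recognition site is palindromic, so that searching one strand is sufficient.

```python
def predict_trf_length(amplicon, site, cut_offset):
    """Predict the length of the labeled 5' terminal restriction fragment.

    amplicon   -- sequence read 5'->3', starting at the labeled forward primer
    site       -- recognition sequence of the enzyme, e.g. "GGCC"
    cut_offset -- number of bases of the site that remain 5' of the cut
    Returns None when the enzyme does not cut the amplicon.
    """
    position = amplicon.upper().find(site.upper())
    if position == -1:
        return None
    return position + cut_offset


# Hypothetical clone sequences; HaeIII (GG^CC) is used only as an example.
clones = {
    "clone_a": "ACGTACGTGGCCTTAGGCCA",
    "clone_b": "ACGGCCTTTTTTTTTTTTTT",
    "clone_c": "ACGTACGTACGTACGTACGT",
}

for name, seq in clones.items():
    print(name, predict_trf_length(seq, "GGCC", 2))
```

Only the first site from the labeled end matters, because fragments downstream of it carry no label. An enzyme that gives a distinct predicted length for each phylotype in the clone library would be a reasonable candidate for the actual digests.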

Even with careful enzyme selection, some TRF peaks may represent more than one organism, because some organisms with different rRNA sequences may have the same restriction site or some very closely related organisms may have the same rRNA sequence. Also, one species may result in more than one band due to the existence of multiple copies of rRNA genes in a single organism. Some studies have shown that TRFLP results are typically limited to the 50 most abundant organisms in a sample (135), and only organisms that make up at least 0.5% of the amplified DNA can be detected (90). There may also be differences between predicted TRF length and observed TRF length due to differential migration of TRFs based on length or purine content (77). Finally, although peak intensity represents abundance of a given TRF length in the amplified sample, it may not be accurately representative of abundance of the corresponding organism in the environment due to previously discussed extraction and PCR biases.

Despite its limitations, TRFLP possesses some clear advantages over other DNA fingerprinting techniques. It allows for the fast, fairly inexpensive analysis of environmental samples and requires only small amounts of sample. Automated analysis allows for higher sensitivity to small changes in community profile between samples (150). Finally, the equipment required is already available in many labs, because it is commonly used in DNA sequencing. These advantages led to the selection of TRFLP as the DNA fingerprinting technique used in this study.

CHAPTER 2: CLONE LIBRARY AND PHYLOGENY

Introduction

Several clone libraries were previously generated from nucleic acid extractions of BSL site A sediment (Figure 2b, Table 2) (Siering and Wilson, in preparation). All three rRNA clone libraries were constructed using the same forward primer, U515F (133), with different reverse primers (universal U1406R, Archaeal A1100R, prokaryotic P1525R). Phylotypes were defined as groups of sequences with at least 97% identity to all other sequences within the group and greater than 3% difference when compared to all other sequences. Unique single sequence phylotypes had less than 97% identity to all other groups of sequences. Approximately 94%, 83%, and 61% of the phylotypes in the A1100R, U1406R, and P1525R libraries, respectively, shared <95% similarity with the closest described isolate, indicating that several novel taxa may be present in BSL. The A1100R and U1406R libraries consisted of mostly Archaeal phylotypes. The majority of A1100R and U1406R sequences shared <85% and <87% similarity with the closest described isolates, respectively. Some of these shared 97-98% identity with environmental sequences from a small 60ºC, pH 3 sulfate-chloride spring in YNP (70).

The P1525R library contained both Bacterial and Archaeal sequences. The most abundant phylotype was only 91% similar to the closest described isolate, Pelotomaculum thermopropionicum. Approximately 10% of the sequences in the P1525R library shared 99% identity with Hydrogenobaculum sp., an obligate aerobic chemolithoautotroph that requires elemental sulfur or thiosulfate for growth (152). The cultured taxa most closely related to the common phylotypes detected utilized a variety of metabolic strategies and included chemolithoautotrophs and heterotrophs that are obligate anaerobes, obligate aerobes and microaerophiles.

Table 2. Common phylotypes in BSL site A sediment clone libraries.

Closest cultured match in databases              A1100R library   U1406R library   P1525R library
Staphylothermus marinus (84%)                    -                75/165           2/179
Thermofilum pendens (83%)                        74/80            11/165           -
Pelotomaculum thermopropionicum (91%)            2/80             1/165            56/179
Picrophilus oshimae (85%)                        -                35/165           -
Hydrogenobaculum sp. NOR3L3B (99%)               -                2/165            18/179
Sulfobacillus thermosulfidooxidans (92-95%)      -                -                20/179
Desulfotomaculum australicum (89%)               -                -                20/179
Thermoplasma acidophilum (98%)                   -                10/165           -
Comamonas acidovorans (99%)                      -                1/165            8/179
% of phylotypes that share <95% identity
  with the closest described isolate             94%              83%              61%

The first goal of this project was to identify the prokaryotes living in BSL using a 16S rRNA clone library constructed from site D (Figures 2b and 4) water samples. This investigation, along with the previous clone libraries constructed from site A sediments, will increase our understanding of the biological diversity at the two sampling sites that are farthest from each other in both distance and temperature. Combining biological and geochemical data from BSL and other thermal features at LVNP can provide insight into the biotic and abiotic factors controlling these extremophilic communities and will ultimately assist in efforts to identify the function of taxonomic groups in ecosystem level processes.

Methods

Sample Collection

Water samples were collected from BSL on July 19, 2004. Four sites around the lake were selected for sampling based on physical characteristics and accessibility (sites A, B, C, D, Figures 2 and 4), and within each site, three samples were taken approximately 1 m apart (samples A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3, Figure 4). BSL water samples were collected in sterile 1 L Nalgene containers attached to the end of a telescoping aluminum paint rod. The outsides of the Nalgene containers were surface-sterilized with a rinse of 80% ethanol, followed by a rinse of sterile, distilled, deionized water. Samples were collected from immediately below the water surface and approximately 3 m from the shore. The location of each sampling site was determined with a Garmin 12XL Global Positioning System. Values for temperature and pH were recorded on site using a Thermo-Orion 290A Plus meter (Fisher Scientific, Pittsburgh, PA).

Figure 4. Schematic of BSL. The approximate locations of sampling sites are indicated.

Following sample collection, water was carried back to the trailhead (~4 km) and driven to lab facilities within the park for processing. Approximately 4-10 hours passed from the time of sample collection until the completion of processing for acridine orange direct counting (AODC) and nucleic acid extraction.

Acridine Orange Direct Counts (AODC)

Samples were fixed for acridine orange direct counting (AODC) by the addition of 1/10 volume formalin (Sigma-Aldrich, St. Louis, MO) for a final concentration of 3.7% formaldehyde. Fixed samples were stored at 4ºC until counts were performed. Quantitative direct counts of total microbial numbers in fixed water samples were performed using previously described AODC methods (145).

Nucleic Acid Extractions

Cells were collected aseptically by filtering water samples through 0.2 µm Durapore Millipore filters at volumes of 500 mL/filter for site A and B samples, 100 mL/filter for site C samples and 50 mL/filter for site D samples. The volume of water put through each filter was dependent on the amount of suspended solids in the sample. Each filter was folded into a sterile, 2 ml screw-capped microcentrifuge tube and stored at -80ºC until nucleic acids were extracted.

Nucleic acids were extracted from collected cells and purified by previously reported methods (144). Each filter containing cells was incubated at 50ºC for 30 minutes in 800 µl lysis buffer (0.1 M NaCl, 10 mM EDTA, 10 mM Tris pH 8.0, 2% SDS) and 0.3 mg Proteinase K. Following incubation, tubes were microcentrifuged for 30 seconds, and supernatants were combined with 400 µl 24:1 chloroform:isoamyl alcohol (C/IAA) and stored on ice during the subsequent lysing steps. Filters were removed from the tubes, and the remaining material was combined with 400 µl lysis buffer, 300 µl C/IAA and 1-1.5 g sterile zirconia-silica beads (0.1 mm) (Biospec Products, Bartlesville, OK). Cells were lysed by beating for 1 minute at 2500 in a Mini Beadbeater-8 (Biospec Products, Bartlesville, OK). Samples were microcentrifuged, and the supernatants were combined with those from the previous step. Addition of lysis buffer and C/IAA, bead beating, centrifugation and removal of supernatants was repeated a second time. Beads were washed by adding 800 µl ice-cold sterile TE, beating for 10 seconds in the bead beater, and microcentrifuging for 2 minutes. The supernatants were combined with those from the previous steps. For C2 samples, lysates from 5 separate filters were pooled. For C3 samples, lysates from 4 separate filters were pooled. For all A, B, and D samples, lysates from single filters were kept separate. The lysates were extracted with phenol/chloroform, concentrated by butanol extractions, further purified on Sepharose CL-4b columns (100), and precipitated overnight in ethanol using standard methods. Nucleic acids were resuspended in 50 µl of 2 mM Tris, 0.2 mM EDTA (pH 8) and stored at -80ºC.

When possible, 3 nucleic acid extractions were obtained from each water sample and labeled S1, S2, and S3. For site C, there were only 0, 1, and 2 successful extractions from samples C1, C2, and C3 respectively. For site D, there were no successful extractions from sample D2 (Table 4).

Creation of Site D rRNA Gene Clone Library

Prior to amplification, 3 µl from D1-S3, D3-S1, D3-S2, and D3-S3 extractions were pooled to make a BSL D pooled sample. The extractions were not pooled in equimolar amounts due to very low concentrations of some extractions. Also, the extremely low concentrations of D1 extractions led to the need to conserve an adequate volume of extractions for remaining experiments. Therefore, only one of three D1 extractions was included in the pooled sample. 16S ribosomal RNA genes were amplified with the primer set U341F (5’-CCT ACG GGR SGC AGC AG-3’) (56) and U1406R (5’-GAC GGG CGG TGT GTR CA-3’) (133) using Promega MasterMix (Promega Corporation, Madison, WI), according to manufacturer’s protocol. Amplicons from four PCR reactions were pooled, purified with Wizard PCR Preps DNA Purification System (Promega) and cloned into the pGEM-T Easy cloning vector (Promega), according to manufacturer’s protocol. Blue/white colony screening and insert PCR were used to confirm clones, and plasmid DNA was isolated from confirmed clones using Wizard Plus SV Minipreps (Promega). Initial sequencing was done in a single direction by CSUPERB Microchemical Core Facility (San Diego, CA). Sequencing of clones selected for phylogenetic analysis was done in two directions in-house using the Licor 4200 Long ReadIR automated sequencing system according to the manufacturer’s recommendations. Sequence data were processed and vector sequences removed using the Staden Package (http://staden.sourceforge.net/index.html) (151) and Sequencher, Version 4.0 (Gene Codes Corporation, Ann Arbor, MI) programs. There were a total of 163 clones sequenced in this study, and 34 of these clones were sequenced in two directions.
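Both primers contain degenerate positions (R and S), so each primer is in practice a mixture of oligonucleotides. The short Python sketch below enumerates the variants implied by the standard IUPAC ambiguity codes. It is offered only as an illustration; the function name is hypothetical and the code is not part of the protocol used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


# Primer sequences as given in the text (written 5'->3').
u341f = expand_primer("CCT ACG GGR SGC AGC AG")
u1406r = expand_primer("GAC GGG CGG TGT GTR CA")

print(len(u341f), "variants of U341F:", u341f)
print(len(u1406r), "variants of U1406R:", u1406r)
```

U341F has two twofold-degenerate positions and therefore expands to four sequences, while U1406R has one and expands to two.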

Sequence Analysis and Phylogeny Estimation

Partial sequences (~500 nucleotides) from all 163 clones were aligned to one another using Clustal W, and genetic distances were estimated using Clustal Dist (http://workbench.sdsc.edu). Clones were grouped into phylotypes based on percent similarity of the sequence to the sequences of the other clones. Clones whose sequences showed 97% or greater similarity to each other were grouped into the same phylotype. Clones with less than 97% similarity to all other clones were considered to be unique phylotypes. For each phylotype, 2 clones were selected for phylogenetic analysis (the most divergent and the most conserved sequences within the cluster). The 2 representative clones from each phylotype and the single clones that did not initially group with other clones during cluster analysis were bidirectionally sequenced and used for phylogenetic estimation. Since initial cluster analysis was performed with unidirectional sequences, some clones that did not initially cluster together were found to represent the same phylotypes after further sequencing. Therefore, some phylotypes are represented by more than 2 clones in phylogenetic analyses. Clone sequences were compared to sequences in the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov) using the Discontinuous Megablast program (4). Sequences in the database showing the highest similarities to clone library phylotypes were downloaded from the Ribosomal Database Project (RDP) database (24) or GenBank (http://www.ncbi.nlm.nih.gov) for phylogenetic analyses. Sequences were checked for potential chimeras using the Chimera Check program available from the RDP (24). Species effort analyses were performed using Analytic Rarefaction 1.3 (58).
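The 97% grouping rule can be illustrated with a simplified Python sketch. This is not the Clustal-based procedure used in this study: it assumes the sequences are already rows of one alignment, scores identity only over columns where neither sequence has a gap, and groups clones greedily so that every member of a group meets the threshold against every other member. The function names and the three short alignment rows are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be rows of the same alignment")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_phylotypes(aligned, threshold=97.0):
    """Greedy grouping: a clone joins the first group in which it meets the
    threshold against every member; otherwise it starts a new group."""
    groups = []
    for name, seq in aligned.items():
        for group in groups:
            if all(percent_identity(seq, aligned[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical 100-column alignment rows (the real data were ~500 nucleotides).
base = "ACGT" * 25
aligned = {
    "clone_1": base,
    "clone_2": "T" + base[1:],        # 1 mismatch against clone_1
    "clone_3": "T" * 12 + base[12:],  # 9 mismatches against clone_1
}

print(group_phylotypes(aligned))
```

A greedy pass of this kind depends on the order in which clones are examined, which is one reason a distance-matrix clustering approach is generally preferable for real data.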

For phylogenetic analyses, Bacterial and Archaeal sequences were aligned separately. Each group was aligned with reference sequences downloaded from the RDP database in an aligned format. Clone library sequences and reference sequences were aligned manually using the Sequencher program, version 4.0 (Gene Codes Corporation). Conserved primary and secondary structural features and the RDP alignments were used as a guide during alignment. For the Bacteria, Microcystis aeruginosa was used as an outgroup sequence. For the Archaea, Methanococcus jannaschii, Methanococcus infernus, Archaeoglobus fulgidus, and Thermococcus hydrothermalis were used as outgroup sequences.

A comparison of distance matrix, parsimony, maximum likelihood, and Bayesian methods was used to estimate phylogenies, as previously reported (144). Distance matrix and parsimony trees were inferred using the PAUP 4.0 b10 software package (154). The search method was heuristic, and starting trees were obtained by stepwise addition with a random addition sequence and 10,000 replicates. The tree-bisection-reconnection (TBR) branch-swapping algorithm was used. For distance matrix trees, distance measure was maximum likelihood with two substitution types, a transition/transversion ratio of two, and empirical nucleotide frequencies. Maximum likelihood trees were estimated using PAUP and fastDNAml (113), with empirical base frequencies, random sequence addition order, and global tree rearrangements. Nonparametric bootstrap analyses were performed using the PAUP software package. The optimality criterion was set to likelihood with two substitution types, a transition/transversion ratio of two, and empirical nucleotide frequencies. Gamma distributed rate variation was applied. A heuristic search was done with 100 bootstrap replicates. The starting tree was obtained via stepwise addition with a random addition sequence. Bayesian analyses were carried out using the MrBayes 3.1 program (62). For each data set (Bacterial and Archaeal), the General Time Reversible (GTR) model of nucleotide substitution with gamma distributed rate variation across sites was applied, and four independent runs of at least 2 million generations with burn-in values of 25% were completed. These parameters were adequate to reach convergence. Consensus trees were constructed in PAUP.

The SSU rRNA gene sequences analyzed in this study were deposited in the GenBank database under the accession numbers EF558669-EF558700.

Results

Site Characterization

GPS coordinates for each site refer to the location on shore from which the samples were collected. Site A was located at N 40º26’09.8”, W 121º23’52.4”. Site B was located at N 40º26’10.4”, W 121º23’50.2”. Site C was located approximately three meters to the east of N 40º26’05.8”, W 121º23’47.3”, and site D was located approximately three meters to the west of N 40º26’05.8”, W 121º23’47.3”.

At the time samples for this study were taken (July 19, 2004), the water temperature and pH were consistent across sites A and B (including subsites A1, A2, A3, B1, B2, and B3), measuring 52.2ºC and pH 2.23. All three subsites at sampling site C (C1, C2, and C3) were slightly warmer at 56.8ºC and pH 2.25. The subsites at sampling site D were much warmer and more variable, presumably due to the actively bubbling springs and mud pots continually visible at this site. Site D1 measured 82.3ºC, pH 2.35; site D2 measured 76.8ºC, pH 2.33; and site D3 measured 70.6ºC, pH 2.31.

Community Analysis

Sites A, B, C, and D had average cell concentrations of 2.49 x 10^7, 1.05 x 10^7, 7.64 x 10^6, and 5.66 x 10^6 cells/ml, respectively (Brodie and Jones, unpublished). Each cell count was based on 4 independent counts from 2 different smears. These values are comparable to those of mesothermic, neutral pH, aquatic environments, and they agree with counts from site A waters and sediments in multiple years (Siering and Wilson, unpublished).

Cluster analysis identified 11 unique phylotypes among the 163 clones sequenced from the site D water clone library (Table 3). Of the 11 phylotypes, 3 are within the domain Bacteria and 8 are within the domain Archaea. Approximately 75% of the clones sequenced belonged to the domain Bacteria. Species effort analysis (Figure 5) indicated that the clone library was sampled adequately to identify the majority of phylotypes represented. The species effort curve reached a plateau, indicating that further sampling would be unlikely to identify additional phylotypes.

Table 3. Summary of BSL site D clone library phylotypes.

Closest BLAST match (% identity)                               Abundance in clone library   Clones used for phylogeny
Acidimicrobium sp. Y0018 (98-99%)                              76/163 (47%)                 BSLdp 52, 208, 42, 17, 56
Hydrogenobaculum sp. NOR3L3B (96-98%)                          30/163 (18%)                 BSLdp 119, 93, 31
Acidicaldus organivorus (95-96%)                               17/163 (10%)                 BSLdp 28, 88, 13, 150, 37
Vulcanisaeta distributa IC-124 (97-99%)                        16/163 (10%)                 BSLdp 81, 85, 16, 84
Uncultured Thermal Soil Archaeal Clone YNPFFA23 (99%)          10/163 (6%)                  BSLdp 39, 64, 127, 180
  Methanococcus fervens (82-83%)
Uncultured Thermoplasmatales Archaeal Clone CPCA007 (98%)      4/163 (2%)                   BSLdp 48, 215, 140
  Picrophilus torridus DSM 9790 (88%)
Uncultured Archaeal Clone SK291 (96-97%)                       3/163 (2%)                   BSLdp 201, 69, 153
  Ignisphaera sp. TOK10A.S1 (90-91%)
Uncultured Archaeal Clone A14 (98-99%)                         2/163 (1%)                   BSLdp 6, 123
  Ignicoccus sp. KIN4-I (91%)
Metallosphaera hakonensis (96-97%)                             2/163 (1%)                   BSLdp 90, 112
Sulfolobus acidocaldarius (92%)                                2/163 (1%)                   BSLdp 186, 155
Stygiolobus azoricus (98%)                                     1/163 (1%)                   BSLdp 26
% of phylotypes that share <95% identity
  with the closest described isolate                           5/12 (42%)

BSL Site D Clone Library Species Effort Curve (plot of Number of Phylotypes against Total Clones Sampled)

Figure 5. Species effort curve for BSL Site D clone library. The curve indicates that the clone library was sampled sufficiently to identify the majority of phylotypes present.

Figure 6. Phylogenetic tree of Bacterial sequences from BSL Site D clone library. 16S rRNA genes were amplified using universal primers 341F and 1406R. Sequences of reference taxa were downloaded from the RDP II database (Cole et al. 2003) in an aligned format. Tree topology was consistent across parsimony, distance, maximum likelihood and Bayesian analyses. Bayesian tree is shown. Numbers at nodes represent bootstrap confidence values from maximum likelihood method and are based on 100 bootstrap resamplings of dataset. Numbers at nodes within parentheses represent Bayesian posterior probabilities if different from bootstrap values.
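The shape of a species effort curve such as the one in Figure 5 can be approximated from the phylotype abundances in Table 3. The Python sketch below assumes the standard analytic rarefaction formula, in which the expected number of phylotypes in a random subsample of n clones is the sum over phylotypes of 1 - C(N - Ni, n) / C(N, n), where N is the total number of clones and Ni is the number of clones in phylotype i. Whether Analytic Rarefaction 1.3 implements exactly this calculation is an assumption, and the function name is hypothetical.

```python
from math import comb


def expected_phylotypes(counts, n):
    """Expected number of phylotypes in a random subsample of n clones."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of clones")
    # comb(total - c, n) is 0 whenever n exceeds total - c, so a phylotype
    # is certain to be observed once the subsample is large enough.
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)


# Clones per phylotype, taken from the abundance column of Table 3.
counts = [76, 30, 17, 16, 10, 4, 3, 2, 2, 2, 1]

for n in (10, 40, 80, 120, 163):
    print(n, round(expected_phylotypes(counts, n), 2))
```

A curve whose slope approaches zero well before n reaches the full library size is consistent with a plateau, although rare phylotypes represented by only one or two clones will always contribute some residual slope.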

Phylogenetic analyses were done separately for Bacteria and Archaea, to increase speed of phylogenetic analyses and reduce problems associated with long-branch attraction. Bayesian, parsimony, distance, and maximum likelihood methods all resulted in consistent tree topologies, and the placement of known reference taxa was largely in agreement with currently accepted tree topologies (24) (Figures 6 and 7). Bayesian analyses showed a higher degree of support than bootstrap analyses for the specific location of some internal nodes, as reported in previous studies (28, 37, 144).

Of the BSL Site D clones analyzed, approximately 75% were in the domain Bacteria (Table 3, Figure 6). Approximately 47% formed a well-supported monophyletic clade with an Acidimicrobium sp. isolated from geothermal acidic sites in YNP and closely related to Acidimicrobium ferrooxidans (75). A. ferrooxidans has been described to grow optimally at pH 2 and 45-50ºC (21). Phylogenetic analyses placed this species, and the associated BSL site D clones, in a monophyletic clade with species in the class Actinobacteria. Approximately 18% grouped with Hydrogenobaculum sp. isolated from a mud hole (91ºC, pH 3) in YNP (33). Phylogenetic analyses placed this species, and the BSL Site D clones grouped with it, in a clade with species in the class Aquificae, order Aquificales, family Aquificaceae. The remaining 10% were most closely related to Acidicaldus organivorus, isolated from geothermal sites in YNP (75, 76). However, the BSL Site D clones sequenced shared only 95-96% identity with A. organivorus, which grows optimally at pH 2.5-3.0 and 50-55ºC (76). Phylogenetic analyses placed this species, and the associated BSL Site D clones, in a clade with species in the class Alphaproteobacteria, order .

Figure 7. Phylogenetic tree of Archaeal sequences from BSL Site D clone library. 16S rRNA genes were amplified using universal primers 341F and 1406R. Sequences of reference taxa were downloaded from the RDP II database (Cole et al. 2003) in an aligned format. Tree topology was consistent across parsimony, distance, maximum likelihood and Bayesian analyses. Bayesian tree is shown. Numbers at nodes represent bootstrap confidence values from maximum likelihood method and are based on 100 bootstrap resamplings of dataset. Numbers at nodes within parentheses represent Bayesian posterior probabilities if different from bootstrap values.

Approximately 25% of the clones sequenced were in the domain Archaea (Table 3, Figure 7). Unfortunately, the reference taxa for Archaeal phylogenetic analyses were selected prior to the final BLAST analyses of the sequences. Therefore, some of the closest BLAST matches (Table 3) were not included in the phylogenetic analyses. The majority of the Archaeal clones sequenced clustered with species in the phylum. Approximately 10% formed a monophyletic clade closely associated with Vulcanisaeta distributa. V. distributa was isolated from a hot spring in Japan (90ºC, pH 2.4), and is reported to have optimal growth conditions of 85-90ºC and pH 4.0-4.5 (68). Approximately 2% of the clones were most closely related to clone SK291 (GenBank accession number AY882840) identified in YNP hot springs (Korf et al., unpublished), and they formed a monophyletic clade with the 1% of clones most closely related to clone A14 (GenBank accession number AF325186) identified in an acid thermal spring (pH 3.1, 58-62ºC) in YNP (70). Previous studies placed clone A14 in the deeply branching terrestrial hot spring Crenarchaeota group 1 (70, 155), which agrees with the placement of the BSL site D clones related to clones A14 and SK291 in a clade with species in the phylum Crenarchaeota. Approximately 3% of clones sequenced grouped with species in the order . One third of these were closely related to Metallosphaera hakonensis, isolated from a hot spring in Japan and showing optimal growth at pH 3.0, 70-75ºC (86), and one third were closely related to Stygiolobus azoricus, with optimal growth at pH 3.0, 80ºC (141). Approximately 1% shared 92% identity with Sulfolobus acidocaldarius, but Chimera Check (24) results indicated that this phylotype represents a possible chimera. The remaining 8% of site D water clones sequenced clustered with species in the Euryarchaeota phylum. Approximately 6% of the clones were most closely related to clone YNPFFA23 (GenBank accession number AF391911) identified in geothermally heated soil of YNP (Botero et al., unpublished). Approximately 2% of the clones sequenced were most closely related to clone CPCA007 (GenBank accession number AY861721) retrieved from a hot spring (88ºC, pH 4.2) in YNP (149), and they grouped with species in the order Thermoplasmatales, which agrees with the previous study.

Discussion

Sampling site D in BSL is much warmer and more variable than site A from which BSL clone libraries were previously constructed. Due to the high temperature (70.6-82.3ºC) of this site, I expected to find limited microbial diversity and the majority of sequences to be Archaeal. Surprisingly, the diversity of sequences in the BSLdp clone library (Table 3) was not significantly lower than the diversity of sequences previously amplified from site A sediments (Table 2) (55). Additionally, although the majority of phylotypes (8 of 11) identified were Archaeal, 75% of sequences amplified from site D water were Bacterial. This may indicate that Bacterial diversity at site D is low, but abundance is high. This result could also be due to preferential amplification of Bacterial sequences with the universal 341F/1406R primer set. Since site A clone libraries were constructed from sediment samples instead of water samples and with different primer sets, it is not surprising that the major phylotypes differed between libraries. One of the major phylotypes, Hydrogenobaculum sp., was common to the BSLdp library and two of the site A sediment libraries. Complete sequence analysis of a clone library constructed from site A water is underway, but it will not be completed in time to include it in this study.

All three of the bacterial phylotypes identified are closely related to isolates from acidic thermal features of YNP (33, 75). Acidimicrobium ferrooxidans grows chemolithoautotrophically by oxidizing ferrous iron, which is stable at low pH. It can also grow heterotrophically on yeast extract. Optimal growth occurs at 45-50ºC and pH 2.0 (21). Hydrogenobaculum sp. is a microaerophilic chemolithoautotroph that uses hydrogen and reduced sulfur compounds as electron donors. It requires elemental sulfur or thiosulfate for growth. Hydrogenobaculum sp. grows optimally at 55-65ºC and pH 3.0-4.0 (31, 142, 152). Acidicaldus organivorus is an obligate heterotroph and facultative anaerobe, growing by ferric iron respiration in the absence of oxygen. Optimal growth occurs at 50-55ºC and pH 2.5-3.0 (76). The optimal growth conditions of these isolates are similar to the temperature and pH of BSL sites A, B, and C at the time of sampling, but cooler than site D water. This may indicate that either the isolates have a larger range of temperatures at which they can grow, or the BSL phylotypes have different growth optima than the isolates. Alternatively, the isolates may be present at site D but not growing or not viable.

Of the 8 Archaeal phylotypes, only three are closely related to previously described species. Vulcanisaeta distributa is an obligately anaerobic heterotroph, but it can tolerate low levels of oxygen (68). Elemental sulfur or thiosulfate is required as an electron acceptor. It is capable of growth at 65-92ºC and pH 3.5-5.6. Metallosphaera hakonensis is an obligately aerobic chemolithoautotroph that grows on elemental sulfur and reduced sulfur compounds (60, 61, 156). It is capable of growth at 50-80ºC and pH 1.0-4.5. Stygiolobus azoricus is also chemolithotrophic, but it is an obligate anaerobe that oxidizes H2 (141). It is capable of growth at 57-89ºC and pH 1.0-5.5. The temperature of site D water at the time of sampling falls within the growth range for all three described species. However, the pH of BSL site D water was lower than the pH range at which V. distributa is capable of growth. This suggests that either the BSL phylotype exhibits a different pH range, or V. distributa is capable of growth at a pH lower than previously described. Four of the Archaeal phylotypes closely match clones that have been sequenced from thermal soil or acidic hot springs in YNP. Temperature and pH data are available for two of these thermal areas (58-62ºC, pH 3.1 and 88ºC, pH 4.2). The temperatures of the springs are similar to the temperature of BSL; the pH values are slightly higher, but still acidic. The phylotype that shared 82-83% identity with Methanococcus fervens clustered between the phylotypes associated with the phylum Crenarchaeota and the Thermoplasmatales-like phylotype associated with the phylum Euryarchaeota.

Based on the results of previously constructed clone libraries and geochemical data from BSL, I expected the BSLdp clone library to include novel organisms with sequences similar to thermoacidophiles that have previously been identified in BSL and other thermal features of LVNP and YNP. Ten of the eleven BSLdp phylotypes match clones or described organisms from thermal acidic features. The remaining phylotype shared 92% identity with Sulfolobus acidocaldarius, but it does not closely match any previously reported sequence. Based on results of Chimera Check analyses (data not shown) (24), it may represent a chimera in which neither fragment is closely related to a previously described species.

Oxygen solubility decreases rapidly as temperature increases, so the oxygen concentration of BSL site D water is presumably low, though dissolved oxygen measurements were not obtained in this investigation. In fresh water, the concentrations of dissolved oxygen at 20ºC, 50ºC, 70ºC, and 90ºC are approximately 9.2 mg/L, 5.6 mg/L, 4.0 mg/L, and 1.7 mg/L, respectively. Therefore, although samples were collected near the water surface, I expected to find mostly microaerophiles and facultative anaerobes. It is likely that the deeper waters and sediments of BSL provide a better habitat for obligate anaerobes. However, two of the Archaeal phylotypes represent obligate anaerobes. The extreme turbidity of site D water during sample collection could indicate that this site is shallow, and the water may be mixed with sediment. Therefore, the obligate anaerobes detected by the clone library could have been from sediments collected with the water. Also, two of the phylotypes detected represent obligate aerobes, so there must be sufficient oxygen present at the water surface to support aerobic metabolism. It is possible that both aerobic and anaerobic microhabitats exist on a very small scale in BSL, providing a suitable environment for the growth of organisms with various oxygen requirements. Alternatively, the oxygen relationships of the BSL phylotypes might be distinct from those of the described isolates with similar 16S rRNA sequences.

Molecular-based identification of organisms present in BSL gives information that can be used to direct cultivation efforts. As expected, sequences matching described organisms with a variety of metabolic strategies were observed in the BSLdp clone library. Sulfur oxidizers such as Hydrogenobaculum sp. and Metallosphaera hakonensis oxidize elemental sulfur and thiosulfate to sulfate, forming sulfuric acid in the process. This process, along with the chemical conversion of sulfite to sulfate in acidic environments, probably contributes to the high sulfate concentrations (833 ppm) (144) and low pH of BSL. Elemental sulfur is also used as an electron acceptor for anaerobic respiration by organisms such as Vulcanisaeta distributa and Stygiolobus azoricus. Acidimicrobium ferrooxidans oxidizes ferrous iron for energy. Total iron has been measured at 38.4 ppm in BSL (144) and is stable in the ferrous state at low pH. Since the oxidation of ferrous to ferric iron only generates a small amount of energy, iron-oxidizing bacteria often precipitate a large amount of iron, which combines with sulfur to form jarosite or pyrite. The ferric iron produced may also be used by organisms such as Acidicaldus organivorus as an electron acceptor during anaerobic respiration. In addition to sulfur- and iron-dependent metabolism, hydrogen and organic compounds are important for growth in several observed phylotypes.

In general, the BSLdp clone library results are consistent with expectations based on BSL geochemistry, and they give insights into the possible metabolic processes occurring in the lake. In the future, it would be interesting to repeat the clone library using different primer sets. It is possible that Bacterial or Archaeal specific primers would amplify different sequences and identify new phylotypes. Also, analysis of a clone library constructed from site A water is currently underway, and those results will allow a more accurate comparison of community composition at very different temperatures and opposite ends of BSL.

CHAPTER 3: TERMINAL-RESTRICTION FRAGMENT LENGTH POLYMORPHISM (TRFLP)

Introduction

TRFLP is a culture-independent method used to rapidly assess microbial diversity and make comparisons between community compositions of environmental samples. Since its introduction in 1997 (90), this method has been used to examine and compare microbial community compositions in a wide variety of environments such as soil (30, 32, 54, 84, 118), marine sediments (137), freshwater (65, 136), acid mine drainage (17), wetlands (25, 102), bioreactors (98), wastewater (165), and digestive tracts (46, 59). The method has been praised for its advantages of greater resolution than other fingerprinting methods like DGGE and TGGE, immediate gel analysis and digital output, and the ability to assign phylogenetic identities to fragments by referencing databases (96). In spite of its wide use in molecular microbial ecology, questions remain regarding the ability of TRFLP to accurately and effectively detect variability between environmental samples or identify phylotypes present within samples. Few studies have investigated the variation among replicate samples or intra-sample variation (11, 45, 65).

Much of the discussion regarding the efficacy of TRFLP focuses on methods of data analysis. TRFLP output is in the form of a chromatogram, with size and relative fluorescence of each terminal fragment calculated by the software used. There are several variables that affect the conclusions that can be made from this output. Variation between true and observed TRF length has been reported to range from one to seven base pairs (77, 117). This variation can affect both the identification of microbes in a community and comparison of community composition between samples. Purine content and length of fragments, differing chemistries of fluorescent primer labels, and fluctuations in laboratory temperature have all been shown to affect TRF drift (77, 117). To accommodate TRF size imprecision, fragments within a range of sizes are often grouped together during data analysis. The method used to group fragments can affect results and the conclusions drawn regarding relatedness between community fingerprints (57). Community diversity can be under- or overestimated by grouping fragments representing multiple organisms together or splitting fragments representing one organism into multiple groups, respectively. Researchers have tried to address this problem by grouping TRFs manually or with computer programs (57, 79, 148), but a consensus on how to group fragments to minimize artifacts associated with technique has yet to be reached.
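The sensitivity of apparent richness to the grouping rule can be illustrated with a minimal Python sketch. It does not reproduce the method of any study cited above. The function name `bin_fragments`, the chaining rule (a fragment joins a group if it lies within a tolerance of the previous fragment in that group), and the fragment sizes are all hypothetical and chosen only for illustration.

```python
def bin_fragments(sizes, tolerance=1.0):
    """Group fragment sizes (bp) into bins.

    A size joins the current bin if it is within `tolerance` bp of the
    previous (next-smaller) size in that bin; otherwise it starts a new bin.
    This chaining rule is an assumption made for illustration.
    """
    bins = []
    for size in sorted(sizes):
        if bins and size - bins[-1][-1] <= tolerance:
            bins[-1].append(size)
        else:
            bins.append([size])
    return bins


# Hypothetical observed fragment lengths (bp), not data from this study.
observed = [120.2, 120.9, 121.6, 250.0, 251.4, 300.1]

narrow = bin_fragments(observed, tolerance=1.0)
wide = bin_fragments(observed, tolerance=2.0)

print(len(narrow), "groups with a 1 bp tolerance")
print(len(wide), "groups with a 2 bp tolerance")
```

With these made-up sizes, the 1 bp tolerance gives four groups and the 2 bp tolerance gives three, so the same chromatogram would be scored with different richness depending only on the grouping rule.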

Another question that arises when evaluating TRFLP data involves the determination of which chromatogram peaks to include in the data. It is not always easy to determine which peaks represent community members and which peaks result from background noise. Often researchers set a fluorescence baseline above which all peaks are included in their results (1, 27, 102, 114). In some studies, samples are loaded on a gel multiple times, and any peaks not included in all replicates are discarded (25, 32, 65).

Once peaks that represent community members have been determined, peak area is normalized to percentage of total sample fluorescence to allow comparisons between samples independent of concentration loaded on gel. The concentration of a sample loaded on a gel can affect results, because minor members of a community may be represented when a sample is loaded at high concentration but not at low concentration (114). Often, peaks that make up less than 1% of the total fluorescence of the sample are excluded from data analysis (10, 30, 65, 84), but this cutoff value varies among studies.

Finally, there is a question as to whether peak height or peak area is a better representation of abundance for each fragment length. Some researchers prefer to use peak areas, claiming that using peak heights will underestimate the abundance of larger fragments due to diffusion during electrophoresis (52, 80). Others argue that overlapping peaks can result in inaccurate peak areas, so the use of peak heights is better for comparison between communities (11).

TRFLP is often used to quantitatively assess microbial communities by using relative fluorescence of terminal fragments to indicate abundance of the population they represent in the community. There are many variables, however, that can bias these results and limit accurate quantification of the relative abundance of microorganisms in natural communities. One study showed that peak height and area were highly variable, depending on nucleic acid extraction method (45). Easily-lysed cells (gram-negative bacteria) were overrepresented, and including bead-beating in the extraction procedure significantly skewed the results. As mentioned previously, bias is also introduced during PCR. Primer selection can influence results, as certain primers may preferentially amplify some target genes over others (45). Changing the annealing temperature during PCR has been shown to result in varying relative abundance detected during TRFLP (22). One form of PCR bias unique to TRFLP is the formation of pseudo-terminal restriction fragments (pseudo-TRFs), which are TRFs resulting from partially single-stranded amplicons that do not represent organisms present in the sample. The occurrence of pseudo-TRFs has been observed to be positively correlated with the number of PCR cycles (35). Digestion of PCR amplicons with single-strand-specific mung bean nuclease (MBN) prior to restriction enzyme digestion has been shown to reduce the occurrence of pseudo-TRFs due to single-stranded amplicons (19, 35), but this can lead to the underestimation of the relative abundance of amplicons which are affected by the formation of pseudo-TRFs (35). Finally, the choice of restriction enzyme can affect relative abundance of populations in a community indicated on a chromatogram. Depending on the rRNA gene sequences present in the sample and the cut site of the restriction enzyme, one peak may represent multiple members of a community. It has been suggested that the use of more than one restriction enzyme can help to minimize this problem (22, 32, 36, 114). Even with multiple restriction enzyme digests, the ability of TRFLP to resolve single populations in natural communities may be limited to environments where there is low to intermediate species richness (36). Another complication to using the relative fluorescence of peaks as a measure of abundance in natural communities is the presence of numerous copies of rRNA genes within a genome, which can result in multiple peaks representing a single population (7, 26). Up to 4 copies have been detected in Archaea, but Bacteria can have as many as 15 (2, 81). In addition to leading to an overestimation of sample diversity, this can cause an underestimation of abundance for those organisms represented by multiple peaks.

Despite these limitations, TRFLP is one of the better methods available to microbial ecologists for obtaining crude estimates of diversity in a timely manner. The second goal of this project was to determine if TRFLP is a sensitive enough method to detect variation between prokaryotic communities of four sampling sites in BSL, varying in temperature by up to 30ºC, and to assess if the methods used provide consistent, reliable results when repeated. Intra-sample variation between TRFLP profiles was assessed to look for biases that may be introduced by the extraction method and PCR. Inter-sample variation between fingerprints was used to look for differences between microbial communities around the lake. TRFLP results were also compared to the site D clone library phylogeny in an attempt to observe the distribution of detected organisms among sampling sites around the lake.

Methods

Sample Collection and Nucleic Acid Extraction

Samples were collected and nucleic acids extracted using the methods described for preparation of the Site D clone library (Chapter 2). Since not all extraction attempts were successful, these methods resulted in a total of 27 extractions for use in TRFLP analysis (Table 4). The locations of sampling sites in BSL are shown in Figure 4.

PCR Amplification

16S rRNA genes were amplified for TRFLP analysis using fluorescently labeled U341F (341F*) (LICOR Biosciences, Lincoln, NE) and unlabeled U1406R primers. To increase yield of labeled PCR amplicons, the forward primer was mixed to include 90% fluorescently labeled 341F and 10% unlabeled 341F. Each 100 µl reaction contained 50 µl Master Mix (Promega), 0.4 µM 341F*, 0.4 µM 1391R, 0.04 µM 341F, 47.16 µl nuclease-free water and 2 µl DNA (diluted to maximize amplification for that extraction). PCRs were executed in a 2720 Thermal Cycler (Applied Biosystems, Foster City, CA) with the following program: 4’ 94ºC; 30 cycles of 1’ 94ºC, 1’ 55ºC, 1’ 72ºC; 7’ 72ºC. For each extraction from sites A1, A2, A3, B1, B2, B3, C2, and C3, four 100 µl PCR reactions were pooled. Due to lower amplification from D samples, eight 100 µl PCR reactions were pooled for each site D1 and D3 extraction. Amplicons from the four or eight reactions were pooled and purified with Promega Wizard PCR Preps DNA Purification System, following manufacturer’s protocol. PCR products were quantified on a 1% agarose gel by comparisons with known molecular weight standards.

Table 4. Summary of nucleic acid extractions from BSL water samples collected July 19, 2004.

Sampling Site   Number of Nucleic     Names of Extractions   Estimated Concentration of
                Acid Extractions      Used in Study          Nucleic Acids Extracted (ng/µl)
A1              3                     A1-S1                  40
                                      A1-S2                  10
                                      A1-S3                  10
A2              3                     A2-S1                  20
                                      A2-S2                  10
                                      A2-S3                  10
A3              3                     A3-S1                  15
                                      A3-S2                  25
                                      A3-S3                  12
B1              3                     B1-S1                  10
                                      B1-S2                  >24
                                      B1-S3                  4
B2              3                     B2-S1                  10
                                      B2-S2                  >24
                                      B2-S3                  >24
B3              3                     B3-S1                  20
                                      B3-S2                  20
                                      B3-S3                  20
C1              0
C2              1                     C2-S1                  1
C3              2                     C3-S1                  <0.2
                                      C3-S2                  <0.2
D1              3                     D1-S1                  <0.6
                                      D1-S2                  <0.6
                                      D1-S3                  <0.6
D2              0
D3              3                     D3-S1                  1.6
                                      D3-S2                  4
                                      D3-S3                  6

Mung Bean Nuclease Digestion

For each sample, mung bean nuclease digestion reactions were prepared with 30 µl of purified PCR product (mass ranged from 100-900 ng), 5 µl 10X MBN buffer, 0.05 µl MBN at 100 U/µl, and 15 µl nuclease-free water. Reactions were incubated at 30ºC for 15 minutes, and 10 µl of 1 M Tris (pH 8) were added to each reaction to stop the digestion. MBN digested amplicons were purified using Wizard PCR Preps DNA Purification System (Promega), following manufacturer’s protocol. Purified products were quantified on a 1% agarose gel.

Restriction Enzyme Digestion

MBN digested samples were digested with the restriction enzyme Hae III (Promega). For each sample, a restriction enzyme digest containing 8 µl of the sample (mass ranged from 30-160 ng), 1 µl of Hae III (10 U/µl), and 1 µl of buffer C (Promega) was incubated at 37ºC for 3 hours. After digestion, 5 µl IR2 Stop Solution (LICOR Biosciences) were added, and samples were stored at -20ºC until gel electrophoresis.
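Because only the fragment carrying the labeled forward primer is detected, the expected terminal fragment length for a known sequence is set by the first Hae III site downstream of the labeled 5' end. Hae III recognizes GGCC and cuts between the central G and C. The Python sketch below illustrates that relationship only; it is not the analysis performed in this study. The function name `predicted_trf_length` and the toy sequence are hypothetical, and the sequence is assumed to begin at the 5' end of the labeled primer.

```python
def predicted_trf_length(amplicon, site="GGCC", cut_offset=2):
    """Predicted length (bp) of the 5'-labeled terminal fragment.

    `amplicon` is assumed to start at the 5' end of the labeled primer.
    Hae III cuts GG^CC, so the cut falls 2 bases into the recognition site.
    If the site is absent, the uncut amplicon length is returned.
    """
    position = amplicon.upper().find(site)
    if position == -1:
        return len(amplicon)
    return position + cut_offset


# Toy sequence for illustration only (not a 16S rRNA gene sequence).
toy = "ACGTACGTAGGCCTTAGGCCA"
print(predicted_trf_length(toy))         # first GGCC starts at index 9
print(predicted_trf_length("ACGTACGT"))  # no site: full length returned
```

Only the first site matters for the labeled fragment, which is why two sequences that differ downstream of that site give the same peak.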

Polyacrylamide Gel Electrophoresis

Hae III digested samples were loaded onto a polyacrylamide gel for electrophoresis. The 5.5% polyacrylamide gel was prepared in a total volume of 30 ml, with 12.6 g of urea, 6 ml of 5X TBE, 3.3 ml of LongRanger 50% acrylamide solution (BioWhittaker Molecular Applications, Walkersville, Maine), 200 µl of 10% ammonium persulfate, and 20 µl of TEMED (Life Technologies, Grand Island, New York). A portion of each sample was diluted 1:5 in TE buffer (10 mM Tris, 1 mM EDTA, pH 7.8). Prior to loading onto the gel, undiluted samples, 1/5 diluted samples, and a 50-700 bp Sizing Standard (LICOR Biosciences) were heated at 95ºC for 5 minutes and then cooled to 4ºC. 1 µl of each undiluted and 1/5 diluted sample and 0.5 µl of the size standard were loaded onto the gel using the following pattern: sizing standard, empty lane, undiluted sample, empty lane, 1/5 dilution, empty lane. The LICOR 4200 BaseImageIR software Data Collection function was used to image the gel run. The configuration used was called 25TRFLP.COL; HVPS was set to 1500V, 35mA, and 35W, scan speed to 4/fastest, and the filter to 3.
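The final concentrations implied by the gel recipe can be checked with a short calculation. The Python sketch below is only an arithmetic check of the stated volumes; the molar mass of urea (60.06 g/mol) is a standard value that is not given in the text, and the variable names are illustrative.

```python
UREA_G_PER_MOL = 60.06  # molar mass of urea (standard value, not from the text)

total_ml = 30.0  # final gel volume stated in the recipe

# Dilution of the 50% acrylamide stock: C_final = C_stock * V_stock / V_total
acrylamide_pct = 3.3 * 50.0 / total_ml

# Dilution of the 5X TBE stock
tbe_x = 6.0 * 5.0 / total_ml

# Urea molarity: grams -> moles, divided by the final volume in liters
urea_molar = 12.6 / UREA_G_PER_MOL / (total_ml / 1000.0)

print(f"acrylamide: {acrylamide_pct:.1f}%")
print(f"TBE: {tbe_x:.1f}X")
print(f"urea: {urea_molar:.2f} M")
```

The acrylamide figure agrees with the 5.5% stated for the gel; the TBE and urea values (about 1X and about 7 M) are inferred from the recipe rather than stated in the text.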

To confirm consistency between TRFLP results, Hae III digests of samples A1-S1, A1-S2, A1-S3, A2-S1, A2-S2, A2-S3, A3-S1, A3-S2, A3-S3, D1-S1, D1-S2, and D1-S3 were loaded on the gel in duplicate. TRFLP (from PCR amplification through polyacrylamide gel electrophoresis) was repeated for samples A2-S1, A2-S3, A3-S1, A3-S2, D1-S1, D1-S2, and D1-S3. These samples were also loaded on the gel in duplicate. In addition, TRFLP of sample A1-S1 was repeated 3 times (from PCR amplification through polyacrylamide gel electrophoresis) following the protocol described above, using fluorescently labeled U515F (515*, 5’-GTGCCAGCMGCCGCGGTAA-3’) (LICOR Biosciences) and unlabeled U1406R. Each of these 3 separate Hae III digests was loaded on the gel 3 times.

To give additional information regarding variation between sampling sites, several extractions from each site were pooled, and these pooled samples were subjected to the entire TRFLP process (from PCR through polyacrylamide gel electrophoresis). For site A, extractions A1-S2, A2-S2, and A3-S3 were pooled and labeled Apool. For site B, extractions B1-S1, B2-S1, and B3-S1 were pooled and labeled Bpool. For site C, extractions C2-S1, C3-S1, and C3-S2 were pooled and labeled Cpool. For site D, extractions D1-S3, D3-S1, D3-S2, and D3-S3 were pooled and labeled Dpool. For sites A, B, and C, all samples were pooled by combining equal masses of DNA from each extraction. For site D, samples were pooled by combining equal volumes of DNA from each extraction. TRFLP procedure was followed as previously described. Each sample was run on the polyacrylamide gel only once.

Finally, TRFLPs of extractions from sites A, B, and D3 were repeated using the primers U515F and U1406R (White, unpublished). TRFLP procedure was followed as previously described, except each sample was only run on the polyacrylamide gel once.

TRFLP Analysis

The program RFLPscan Plus version 3.0, now called GeneProfiler version 3.52 (Scanalytics Incorporated, Fairfax, VA), was used to analyze the gel. Bands were initially identified by the RFLPscan program and then edited manually to include only those bands visible on the computer screen. RFLPscan analysis resulted in a chromatogram for each sample, on which the position of peaks corresponds to the base-pair length of labeled fragments in each band in relation to the size standard, and the area of each peak corresponds to the relative band intensity. Peaks that composed less than 1% of the total band intensity for that sample were not included in the analysis. Each remaining peak area was normalized to the total area of its corresponding chromatogram. For samples that were run on the gel multiple times, peak area was reported as an average of the runs.
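The filtering, normalization, and averaging steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the RFLPscan procedure itself: the function names are hypothetical, normalization is assumed to be to the summed area of the retained peaks, a peak missing from one run is assumed to count as zero when runs are averaged, and the peak areas are made up.

```python
def relative_profile(peak_areas, min_fraction=0.01):
    """Drop peaks below `min_fraction` of total area, then renormalize.

    `peak_areas` maps fragment size (bp) to raw peak area. The result maps
    each retained fragment size to its share of the retained area (sums to 1).
    """
    total = sum(peak_areas.values())
    kept = {size: area for size, area in peak_areas.items()
            if area / total >= min_fraction}
    kept_total = sum(kept.values())
    return {size: area / kept_total for size, area in kept.items()}


def average_profiles(profiles):
    """Average relative abundances across runs; absent peaks count as 0."""
    sizes = set().union(*profiles)
    return {size: sum(p.get(size, 0.0) for p in profiles) / len(profiles)
            for size in sizes}


# Hypothetical raw peak areas for two gel runs of the same digest.
run1 = relative_profile({120: 500.0, 250: 495.0, 300: 5.0})  # 300 bp peak is 0.5%
run2 = relative_profile({120: 600.0, 250: 400.0})
average = average_profiles([run1, run2])
print(sorted(average.items()))
```

In this example the 300 bp peak is removed before normalization because it accounts for 0.5% of the total area, so it contributes to neither run nor to the average.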

The incidence-based Jaccard similarity index, $S_{12}/(S_1 + S_2 - S_{12})$, and the abundance-based Bray-Curtis similarity index, $BC_{ij} = 1 - \sum_k |n_{ik} - n_{jk}| / \sum_k (n_{ik} + n_{jk})$, were used to compare TRFLP diversity profiles for different samples (20). Nonmetric multidimensional scaling (NMS) using R software (125) was used to identify groups within and between samples.
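A minimal Python sketch of the two indices is given below for illustration. It assumes that each profile is a mapping from fragment size to relative abundance, that the Jaccard terms are the numbers of fragments in each profile and the number shared, and that Bray-Curtis takes its usual ratio-of-sums form. The function names and the example profiles are hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Incidence-based similarity: shared / (n_a + n_b - shared)."""
    sizes_a, sizes_b = set(profile_a), set(profile_b)
    shared = len(sizes_a & sizes_b)
    return shared / (len(sizes_a) + len(sizes_b) - shared)


def bray_curtis_similarity(profile_a, profile_b):
    """Abundance-based similarity: 1 - sum|a - b| / sum(a + b)."""
    sizes = set(profile_a) | set(profile_b)
    differences = sum(abs(profile_a.get(s, 0.0) - profile_b.get(s, 0.0))
                      for s in sizes)
    totals = sum(profile_a.get(s, 0.0) + profile_b.get(s, 0.0) for s in sizes)
    return 1.0 - differences / totals


# Hypothetical relative-abundance profiles (fragment size in bp -> fraction).
sample_1 = {100: 0.5, 200: 0.3, 300: 0.2}
sample_2 = {100: 0.4, 200: 0.6}

print(round(jaccard(sample_1, sample_2), 3))
print(round(bray_curtis_similarity(sample_1, sample_2), 3))
```

The two indices can disagree: profiles that share every fragment have a Jaccard value of 1.0 but can still have a low Bray-Curtis value if the relative abundances differ.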

Results

Intra-sample Variation

To test for variation between nucleic acid extractions of subsamples of the same sample, three separate extractions of each sample were examined with TRFLP. Using these methods, variation between extractions was observed. Bray-Curtis similarity coefficients indicating the degree of similarity between extractions of each sample are listed in Table 5. For the primer set U341F/U1406R, Bray-Curtis values ranged from 0.626-0.944. The average within-sample similarity was lowest for sample A2 and highest for sample B1, at 0.729 and 0.904, respectively. Within sample A2, extraction A2-S2 was responsible for much of the variation observed. There were five fragments detected in this extraction that were not present in either of the other extractions from that sample. Sample A3 also showed relatively low average similarity at 0.746. Within sample A3, extraction A3-S3 was responsible for much of the variation detected, also with five fragments not detected in the other two A3 extractions.

For the primer set U515F/U1406R, Bray-Curtis values ranged from 0.514-0.937 (Table 5). The average within-sample similarity was lowest for sample A3 and highest for sample B1, at 0.648 and 0.929, respectively. Within sample A3, extraction A3-S3 was responsible for much of the variation observed, with five fragments not detected in either of the other two A3 extractions. Extraction A2-S2 had six fragments not detected in the other A2 extractions, resulting in a relatively low average Bray-Curtis value of 0.75 for sample A2. Extraction D3-S3 was responsible for much of the variation observed within sample D3. It had nine fragments not detected in the other D3 extractions.
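The average within-sample similarities are the means of the three pairwise Bray-Curtis values for each sample. The short Python sketch below recomputes three of them from the U341F/U1406R values in Table 5; the dictionary layout and variable names are illustrative only.

```python
from statistics import mean

# Pairwise Bray-Curtis values (U341F/U1406R) for three samples, from Table 5:
# S1 vs S2, S1 vs S3, S2 vs S3.
pairwise_341f = {
    "A2": [0.773, 0.754, 0.659],
    "A3": [0.900, 0.626, 0.712],
    "B1": [0.890, 0.900, 0.923],
}

within_sample_mean = {sample: round(mean(values), 3)
                      for sample, values in pairwise_341f.items()}

for sample, value in within_sample_mean.items():
    print(sample, value)
```

These means match the averages reported for A2, A3, and B1 with this primer set (0.729, 0.746, and 0.904).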

TRFLP of a single extraction (A1-S1) was repeated three times with primers U515F/U1406R to test for consistency and repeatability of results. Bray-Curtis similarity indices between the results ranged from 0.88-0.97. Identical fragment sizes were detected in each replicate, but proportion of total sample fluorescence for each fragment varied slightly between replicates.

Table 5. Bray-Curtis similarity indices showing variation between multiple extractions from each BSL sample (higher values indicate greater similarity).

EXTRACTION 1   EXTRACTION 2   BRAY-CURTIS (341F)   BRAY-CURTIS (515F)
A1-S1          A1-S2          0.785                0.879
A1-S1          A1-S3          0.868                0.771
A1-S2          A1-S3          0.797                0.705
A2-S1          A2-S2          0.773                0.707
A2-S1          A2-S3          0.754                0.725
A2-S2          A2-S3          0.659                0.817
A3-S1          A3-S2          0.9                  0.758
A3-S1          A3-S3          0.626                0.514
A3-S2          A3-S3          0.712                0.672
B1-S1          B1-S2          0.89                 0.937
B1-S1          B1-S3          0.9                  0.923
B1-S2          B1-S3          0.923                0.926
B2-S1          B2-S2          0.878                0.823
B2-S1          B2-S3          0.869                0.845
B2-S2          B2-S3          0.92                 0.913
B3-S1          B3-S2          0.753                0.931
B3-S1          B3-S3          0.786                0.783
B3-S2          B3-S3          0.899                0.788
C3-S1          C3-S2          0.834
D1-S1          D1-S2          0.881
D1-S1          D1-S3          0.944
D1-S2          D1-S3          0.826
D3-S1          D3-S2          0.929                0.718
D3-S1          D3-S3          0.76                 0.667
D3-S2          D3-S3          0.76                 0.68

Inter-sample Variation

TRFLP results from individual extractions were averaged, producing an average TRFLP profile for each water sample. These averaged profiles were compared using the abundance-based Bray-Curtis Similarity Index and the incidence-based Jaccard Similarity Index (Table 6). For the primer set U341F/U1406R, sites A and B showed relatively high similarity between samples. For site A, the three samples had Bray-Curtis similarity values ranging from 0.772-0.874 and Jaccard values ranging from 0.71-0.79. Of the 15 fragments detected in site A samples, 10 were present in all three samples (A1, A2, and A3), and two fragments were detected in two of the three samples. All three fragments that were only present in one sample measured less than 1% of the total sample fluorescence. The Bray-Curtis similarity values for site B samples ranged from 0.857-0.884, and the Jaccard values ranged from 0.92-1.0. Of the 12 fragments detected in site B samples, only one was not present in all three samples. It made up 3.9% of the total fluorescence of sample B3. Comparison of the average of the two extractions for site C3 to the results of the single extraction from site C2 gave a low Bray-Curtis similarity value (0.472) due to variation in the relative fluorescence of the fragments, all of which were detected in both C2 and C3 (Jaccard value 1.0). Samples D1 and D3 also resulted in a low Bray-Curtis value (0.526) when compared to each other. Unlike site C inter-sample variation, site D inter-sample variation can be attributed to both variation in fragments detected (Jaccard value 0.46) and the relative fluorescences of those fragments. More than 50% of the fragments detected in site D samples were present in only D1 or D3.

Table 6. Bray-Curtis and Jaccard similarity indices showing variation between water samples from BSL (higher values indicate greater similarity).

SAMPLE 1   SAMPLE 2   BRAY-CURTIS (341F)   JACCARD (341F)   BRAY-CURTIS (515F)   JACCARD (515F)
A1         A2         0.772                0.71             0.888                0.714
A1         A3         0.779                0.79             0.84                 0.588
A2         A3         0.874                0.79             0.783                0.632
B1         B2         0.884                1.0              0.926                0.833
B1         B3         0.857                0.92             0.927                0.692
B2         B3         0.862                0.92             0.965                0.846
C2         C3         0.472                1.0
D1         D3         0.526                0.46

For the primer set U515F/U1406R, only sites A and B were evaluated for inter-sample variability, because TRFLP profiles were obtained for only one sample from site D and no samples from site C. Both sites showed relatively high similarity between samples. For site A, Bray-Curtis values ranged from 0.783-0.888, and Jaccard values ranged from 0.588-0.714 (Table 6). The number of fragments detected in each site A sample ranged from 10-17. Of the 19 fragments detected at site A, 10 fragments were detected in all three samples (A1, A2, and A3), two were detected in samples A2 and A3, and seven were only detected in either A2 or A3. Eight of the nine fragments that were detected in one or two site A samples measured less than 1% of the total sample fluorescence. Sample A3 had a unique fragment that measured 1.5% of the total sample fluorescence. For site B, Bray-Curtis values ranged from 0.927-0.965, and Jaccard indices ranged from 0.692-0.846. Of the 13 fragments detected at site B, nine were detected in all three samples (B1, B2, and B3), and three were detected in two samples. Sample B3 had one fragment not detected in the other site B samples, and it measured less than 1% of the total sample fluorescence.
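The difference between what the two indices measure can be illustrated with a short calculation. The following Python sketch is not the software used in this study. It assumes that each TRFLP profile is stored as a mapping from fragment size (bp) to relative fluorescence, that Bray-Curtis similarity is twice the shared fluorescence divided by the total fluorescence of both profiles, and that Jaccard similarity is the number of shared fragments divided by the number of fragments detected in either profile. The function names and profile values are hypothetical.

```python
def bray_curtis_similarity(profile_1, profile_2):
    """Bray-Curtis similarity of two profiles (fragment size -> relative fluorescence)."""
    sizes = set(profile_1) | set(profile_2)
    # Fluorescence shared by both profiles, fragment by fragment.
    shared = sum(min(profile_1.get(s, 0.0), profile_2.get(s, 0.0)) for s in sizes)
    total = sum(profile_1.values()) + sum(profile_2.values())
    return 2.0 * shared / total


def jaccard_similarity(profile_1, profile_2):
    """Jaccard similarity based only on fragment presence or absence."""
    present_1 = {s for s, value in profile_1.items() if value > 0}
    present_2 = {s for s, value in profile_2.items() if value > 0}
    return len(present_1 & present_2) / len(present_1 | present_2)


# Hypothetical profiles: the same three fragments, in different proportions.
sample_x = {45: 60.0, 171: 30.0, 391: 10.0}
sample_y = {45: 20.0, 171: 20.0, 391: 60.0}

print(jaccard_similarity(sample_x, sample_y))      # 1.0
print(bray_curtis_similarity(sample_x, sample_y))  # 0.5
```

Both hypothetical profiles contain the same fragments, so the Jaccard value is 1.0, while the Bray-Curtis value is lower because the fragments differ in relative fluorescence. This is the pattern reported for samples C2 and C3 with the primer set U341F/U1406R.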

Dendrograms were constructed to visualize the relationships of BSL extractions to each other based on Bray-Curtis distances between TRFLP profiles (Figures 8 and 9).

For the primer set U341F/U1406R (Figure 8), all site B extractions clustered together.

Sample C3 extractions clustered with the site A extractions, and this group was most closely related to the site B cluster. Sample D1 extractions clustered separately from sample D3 extractions. The single extraction from sample C2 did not group with any other extractions. For the primer set U515F/U1406R (Figure 9), extractions from sites A and B grouped together. Extractions from sample D3 formed a separate cluster.
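How extractions come to be grouped in a dendrogram of this kind can be sketched as follows. The linkage method is not stated in this section, so average linkage (UPGMA) is assumed here purely for illustration, with Bray-Curtis distance taken as one minus Bray-Curtis similarity. The extraction names follow the labeling used for the figures, but the profile values are hypothetical.

```python
def bray_curtis_distance(profile_1, profile_2):
    """Bray-Curtis distance (1 - similarity) between two TRFLP profiles."""
    sizes = set(profile_1) | set(profile_2)
    shared = sum(min(profile_1.get(s, 0.0), profile_2.get(s, 0.0)) for s in sizes)
    return 1.0 - 2.0 * shared / (sum(profile_1.values()) + sum(profile_2.values()))


def average_linkage(profiles):
    """Agglomerative clustering; returns merge steps as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in profiles]

    def cluster_distance(c1, c2):
        # Average linkage: mean of all pairwise distances between members.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(bray_curtis_distance(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        distance, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical extraction profiles (fragment size in bp -> relative fluorescence).
extractions = {
    "A1-S1": {45: 50.0, 391: 50.0},
    "A1-S2": {45: 55.0, 391: 45.0},
    "D1-S1": {174: 60.0, 391: 40.0},
    "D1-S2": {174: 50.0, 391: 50.0},
}

for cluster_a, cluster_b, distance in average_linkage(extractions):
    print(cluster_a, cluster_b, round(distance, 4))
```

In this hypothetical case the two site A extractions merge first, the two site D extractions merge next, and the two groups join last at a much larger distance.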

Inter-site Variation

TRFLP results from all the individual extractions for each site were averaged to produce TRFLP diversity fingerprints for sites A, B, C, and D. These averaged results were compared using the Bray-Curtis similarity index to assess diversity in community composition between sites (Table 7). For the primer set U341F/U1406R, sites A and C were the most similar (0.72), followed closely by sites A and B (0.681). All the fragments detected in site C were also detected in site A, but site A had six additional fragments that were not present in C. Sites A and B had 11 fragments in common and five that were only detected in one or the other. Sites B and C had relatively low similarity (0.564) to each other, with 50% of the fragments only observed in one of the sites. Site D had very low similarity (0.248-0.362) to all three other sites due to both variation in fragment presence and relative fluorescence of the fragments.
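The averaging step used to build each site fingerprint can be sketched as below. This is an illustration rather than the procedure as it was actually run: it assumes that each extraction profile is a mapping from fragment size (bp) to relative fluorescence, and that a fragment missing from an extraction counts as zero fluorescence in the average. The function name and the values are hypothetical.

```python
def average_profiles(profiles):
    """Average several TRFLP profiles, treating undetected fragments as zero."""
    sizes = set()
    for profile in profiles:
        sizes.update(profile)
    count = len(profiles)
    return {
        size: sum(profile.get(size, 0.0) for profile in profiles) / count
        for size in sorted(sizes)
    }


# Hypothetical extractions from one site (values are % of total fluorescence).
extraction_1 = {45: 50.0, 391: 50.0}
extraction_2 = {45: 40.0, 171: 20.0, 391: 40.0}
extraction_3 = {45: 60.0, 391: 40.0}

site_fingerprint = average_profiles([extraction_1, extraction_2, extraction_3])
print(site_fingerprint)
```

A fragment detected in only one of three extractions, such as the 171-bp fragment here, is retained in the site fingerprint at one third of its single-extraction value.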

When TRFLP was carried out with the primer set U515F/U1406R, sites A and B were the most similar (Table 7). There were six fragments detected at site A that were not detected at site B, each constituting less than 0.5% of the total fluorescence. All of the fragments present at site B were also present at site A. Site D had relatively low similarity to both sites A (0.606) and B (0.593). There were four fragments detected at site D that were not detected at site A or B, each responsible for 0.3-2.9% of the total fluorescence. Site A had five fragments that were not present at site D, each comprising


Figure 8. Dendrogram showing the relationship of BSL extractions to each other based on Bray-Curtis distances between TRFLP profiles. Smaller Bray-Curtis distance values indicate greater similarity between extractions. All results are from the primer set U341F/U1406R.


Figure 9. Dendrogram showing the relationship of BSL extractions to each other based on Bray-Curtis distances between TRFLP profiles. Smaller Bray-Curtis distance values indicate greater similarity between extractions. All results are from the primer set U515F/U1406R. Extractions are labeled with the letter indicating site, the first number indicating sample, and the second number indicating extraction.

0.1-1% of the total fluorescence. Site B had two fragments that were not present at site D, each measuring 0.3% of the total fluorescence.

As an alternate method of measuring variation between community compositions

of the four sampling sites, extractions from each water sample were pooled, and these

pooled samples were analyzed with TRFLP using the primer set U341F/U1406R. This

resulted in Apool, Bpool, Cpool and Dpool TRFLP profiles, which were compared to

each other using the Bray-Curtis similarity index (Table 7). Sites A and B had profiles

that were very similar to each other in both the presence of fragments and their relative

fluorescence. Sites A and B were also relatively similar to site C in fragment presence,

but the relative fluorescences of fragments in Apool and Bpool were less similar to Cpool

than to each other. Dpool shared low similarity with all three other sites due to

differences in both fragments detected and their relative fluorescences.

Profiles from the samples pooled prior to TRFLP analysis were compared to the

averaged profiles from each site using the Bray-Curtis similarity index (Table 8).

Averaged profiles were obtained in two ways. First, TRFLP results from all extractions

for each site were averaged. Second, TRFLP results from only those extractions included

in the pooled sample for that site were averaged. Relatively low similarity was observed

between pooled and averaged results. Site A had the highest Bray-Curtis value of 0.736,

but this value decreased slightly when there were fewer extractions included in the

averaged profile. For sites B and D, the similarity increased when fewer extractions were

included in the averaged profile. Since all three site C extractions were included in the

pooled sample, a second averaged profile was not calculated for that site.

Table 7. Bray-Curtis similarity indices showing variation between sampling sites in BSL (higher values indicate greater similarity). Averaged values are TRFLP results from separate extractions averaged; pooled values are from extractions pooled prior to TRFLP.

SITE 1   SITE 2   AVERAGED (341F)   AVERAGED (515F)   POOLED (341F)
A        B        0.681             0.836             0.93
A        C        0.72                                0.799
A        D        0.359             0.606             0.37
B        C        0.564                               0.751
B        D        0.362             0.593             0.338
C        D        0.248                               0.435

Table 8. Bray-Curtis similarity indices comparing TRFLP profiles of pooled samples with averaged TRFLP profiles from sampling sites in BSL (higher values indicate greater similarity). For each site, pooled samples are compared with averaged results for all extractions and with averaged results for only those extractions included in the pooled sample. All results are from the primer set U341F/U1406R.

PROFILE 1   PROFILE 2     BRAY-CURTIS (AVERAGE INCLUDES   BRAY-CURTIS (AVERAGE INCLUDES ONLY
                          ALL EXTRACTIONS)                EXTRACTIONS USED IN POOLED SAMPLE)
Apool       A (average)   0.736                           0.687
Bpool       B (average)   0.536                           0.56
Cpool       C (average)   0.655
Dpool       D (average)   0.563                           0.676

Comparison of TRFLP with Clone Library

Averaged TRFLP results from sites A, B, C, and D (primer set U341F/U1406R)

were compared to phylotypes detected in the site D clone library to get a crude estimate

of phylotype distribution in BSL. The abundance of each phylotype indicated by the

relative fluorescence of the expected fragment size was compared to the abundance

observed in the clone library (Table 9). All of the phylotypes identified in the clone

library were detected by TRFLP in at least one of the sampling sites, and six of the

phylotypes were detected in water samples from all four sites. There were five fragment

sizes detected by TRFLP that did not match expected fragment sizes from the clone

library.
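The comparison of observed fragments with the clone library amounts to a lookup against expected fragment sizes, which can be sketched as follows. The size ranges are transcribed from Table 9, with phylotype names shortened. The matching rule (an observed size matches if it falls within the expected range, inclusive) and the function name are assumptions made for illustration, not a description of how fragments were assigned in this study.

```python
# Expected Hae III T-RF size ranges (bp) transcribed from Table 9.
# Single expected sizes are stored as a range with equal bounds.
EXPECTED_TRF_SIZES = [
    ("Vulcanisaeta distributa", 35, 55),
    ("Acidimicrobium sp.", 68, 78),
    ("Thermoplasmatales archaeal clone CPCA007", 174, 174),
    ("Sulfolobus acidicaldarus", 180, 180),
    ("Stygiolobus azoricus", 182, 182),
    ("Archaeal clone SK291", 185, 185),
    ("Archaeal clone A14", 196, 205),
    ("Metallosphaera hakonensis", 202, 202),
    ("Thermal soil archaeal clone YNPFFA23", 247, 247),
    ("Hydrogenobaculum sp.", 375, 395),
    ("Acidicaldus organivorous", 520, 530),
]


def match_phylotypes(fragment_size_bp, expected=EXPECTED_TRF_SIZES):
    """Return every phylotype whose expected T-RF range contains the observed size."""
    return [name for name, low, high in expected if low <= fragment_size_bp <= high]


for size in (62, 174, 202, 391):
    print(size, match_phylotypes(size))
```

Under this rule a 62-bp fragment matches no phylotype in the library, whereas a 202-bp fragment matches two phylotypes, so a single fragment cannot always be attributed to a single organism.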

Statistical Analysis

Nonmetric multidimensional scaling ordination of TRFLP fingerprints (primer set U341F/U1406R) resulted in clusters of extractions correlated, in most cases, with sampling site (Figure 10). Site A, B, and C extractions were separated from site D extractions along axis 1. Site D extractions formed two clusters, each representing a different sample (D1 or D3). Temperature and TRFLP fragment lengths attributed to the uncultured archaeal clone A14, Hydrogenobaculum sp., Vulcanisaeta distributa and the uncultured Thermoplasmatales archaeal clone CPCA007 were correlated with both site D clusters. Site A and C extractions were correlated with fragments attributed to V. distributa and the thermal soil archaeal clone YNPFFA23. The site B extractions were separated from site A and C extractions along axis 2. The site B cluster was correlated with a fragment attributed to the Acidicaldus-like phylotype. Sample C3 extractions

Table 9. Comparison of averaged TRFLP results for sites A, B, C, and D with phylotypes detected in site D clone library. Phylotypes are from the BSLdp U341F/U1406R clone library (% shared identity in parentheses). Clone abundance is from the Dpool U341F/U1406R clone library. Expected T-RF sizes (bp) are for sequences cut with Hae III. Observed abundances are the averaged Hae III T-RF results for each site (ave A, ave B, ave C, ave D). A single set of observed abundances is listed for the 180-, 182-, and 185-bp T-RFs, and a single set for the 196-205- and 202-bp T-RFs.

PHYLOTYPE (% SHARED IDENTITY)                               CLONE ABUNDANCE  EXPECTED T-RF (bp)  AVE A   AVE B   AVE C   AVE D
Vulcanisaeta distributa IC-124 (97-99%)                     16/163 (10%)     35-55               23.6%   13.6%   13.3%   21%
Not observed                                                                 62                  4.9%    0%      31%     0%
Acidimicrobium sp. Y0018 (98-99%)                           76/163 (47%)     68-78               14.5%   10.4%   16.2%   4.1%
Not observed                                                                 151                 0.2%    3.6%    0%      4.1%
Not observed                                                                 155                 2%      0%      0%      3.2%
Not observed                                                                 171                 5.1%    3.5%    3.8%    9.6%
Uncultured Thermoplasmatales archaeal clone CPCA007 (98%)   4/163 (2%)       174                 0%      0%      0%      6.5%
  Picrophilus torridus DSM 9790 (88%)
Sulfolobus acidicaldarus (92%)                              2/163 (1%)       180
Stygiolobus azoricus (98%)                                  1/163 (1%)       182                 0.1%    8.1%    0%      0%
Uncultured archaeal clone SK291 (96-97%)                    3/163 (2%)       185
  Ignisphaera sp. TOK10A.S1 (90-91%)
Not observed                                                                 190                 3.2%    0%      3.5%    0%
Uncultured archaeal clone A14 (98-99%)                      2/163 (1%)       196-205             4.8%    1.1%    4.2%    17.2%
  Ignicoccus sp. KIN4-I (91%)
Metallosphaera hakonensis (96-97%)                          2/163 (1%)       202
Uncultured thermal soil archaeal clone YNPFFA23 (99%)       10/163 (6%)      247                 10%     7.7%    7.4%    2.5%
  Methanococcus fervens (82-83%)
Hydrogenobaculum sp. NOR3L3B (96-98%)                       30/163 (18%)     375-395             14.6%   20.5%   5.1%    28.9%
Acidicaldus organivorous (95-96%)                           17/163 (10%)     520-530             16.9%   31.4%   15.7%   3%

[Figure 10: two plots, A and B, each with axes labeled Axis 1 and Axis 2 and scaled from 0.0 to 1.0. Plot A legend: Site A, Site B, Site C, Site D. Plot B vector labels: temp, X46, X49, X55, X62, X171, X174, X200, X244, X391, X526.]

Figure 10. Nonmetric multidimensional scaling analysis of TRFLP profiles for BSL extractions. In plot A, symbols represent sampling sites. In plot B, vectors show correlation of individual TRFLP fragments and temperature with site clusters. The unreadable group of vectors associated with site D extractions includes temperature and fragments with sizes 46, 49, 171, 174, and 200. All results are from the primer set U341F/U1406R.

clustered with the site A extractions, but the sample C2 extraction was an outlier correlated with a 62-bp fragment not predicted by the site D clone library.

Discussion

Assessment of the Sensitivity, Repeatability and Resolution of TRFLP

Sources of variation. There are several possible sources of variation between

multiple extractions of each sample. One of these sources is sample heterogeneity.

Variations in community composition can occur at a small scale in microbial ecosystems,

so sample heterogeneity can be an issue if the sampling scale is not large enough to

include all the organisms present in the natural habitat. After samples were filtered to

collect cells, there were noticeable differences in the amount of particulate matter

collected on filters from the same volume of water.

Another possible source is the method of nucleic acid extraction, because slight

changes in extraction technique may lead to differences in the proportion of cells lysed

and the quality of DNA recovered. Although efforts were made to be consistent during

the extraction procedure, the length of extraction time varied between days. Depending

on where in the procedure additional time occurred, this could have led to differences

between nucleic acid qualities. Also, the final concentrations of extractions were not

consistent, even between extractions from different aliquots of the same sample. This

could be due to differences in the number of cells collected on each filter, the proportion

of cells lysed, or the amount of nucleic acids lost during column purification of extracts.

Past studies have shown that bead-beating during nucleic acid extraction contributes to DNA degradation, possibly biasing TRFLP results (45). The extraction method used in this study was devised to reduce degradation of DNA during cell lysis. After each lysis step, nucleic acids were recovered and stabilized with C/IAA. Therefore, nucleic acids from easily lysed cells were collected prior to increased efforts to lyse more resistant cells, reducing the nucleic acid degradation caused by bead-beating. During development of the extraction method, post-extraction AODC results indicated a high percentage of cell lysis (Siering, unpublished).

Inconsistent concentrations of extractions following restriction enzyme digestion may have contributed to differences between TRFLP profiles of extractions from the same sample. The appearance of peaks representing minor members of the community can be dependent on slight differences in the concentration of samples loaded on the gel.

Because they may represent background noise, peaks measuring less than 1% of the total fluorescence were omitted during data analysis.
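The 1% cutoff can be expressed as a simple filter on peak fluorescence. The sketch below is hypothetical: it assumes that peaks are stored as raw fluorescence values keyed by fragment size, that the cutoff is applied to each peak's share of the total fluorescence in that profile, and that the remaining peaks are not rescaled afterward. None of these details are specified in this section.

```python
def drop_minor_peaks(peak_fluorescence, threshold=0.01):
    """Keep only peaks contributing at least `threshold` of the total fluorescence."""
    total = sum(peak_fluorescence.values())
    return {
        size: value
        for size, value in peak_fluorescence.items()
        if value / total >= threshold
    }


# Hypothetical raw peak fluorescence values keyed by fragment size (bp).
raw_peaks = {45: 5000.0, 62: 40.0, 171: 3000.0, 391: 1960.0}

filtered = drop_minor_peaks(raw_peaks)
print(sorted(filtered))  # the 62-bp peak (0.4% of the total) is dropped
```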

Intra-sample variation. Even without the weakest peaks, the relatively low

similarity (0.746, 0.648) between A3 extractions may be due to variation in concentration

loaded on the gel. The concentration of the A3-S3 sample was approximately three times

greater than the other two A3 samples, which could have contributed to the presence of

the five additional fragments detected in A3-S3, regardless of the primer set used. The

relatively large intra-sample variation of A2 (0.729, 0.75) cannot be attributed to

concentration, however, because concentration was the same for all three A2 samples.

Extraction A2-S2 had more fragments than the other two A2 extractions regardless of

primer set used, so the variation observed is unlikely to be due to PCR bias. Similarly, PCR bias is not likely to be the only cause of the variation observed within sample D3, because, although the average Bray-Curtis similarity for sample D3 was much lower for the primers U515F/U1406R (0.688) than for the primers U341F/U1406R (0.816), extraction D3-S3 was responsible for the majority of the variation observed in both cases.

It is possible that the intra-sample variability observed within samples A2, A3, and D3 is due to extraction bias, failure to thoroughly mix the sample prior to filtration, limits to the repeatability of the method, or a combination of these factors.

Some variation between results was observed when TRFLP was repeated several times on a single extraction (A1-S1). However, in most cases, this variation was smaller than the variation detected between extractions and between TRFLP profiles from different samples or sites (Tables 5, 6, 7). In this study, Bray-Curtis similarity indices as low as 0.88 may be due to limits in the repeatability of the method, and may not indicate actual differences between samples or sites.

Inter-sample variation. To assess if TRFLP had sufficient resolution to detect

differences between prokaryotic communities in BSL, I compared TRFLP profiles of

multiple samples from each site. For sites A, B, and C, the temperature was consistent

between samples, so I expected the samples within each site to have a high degree of

similarity to each other. Site D sampling locations, however, varied in temperature by

approximately 12ºC, so I expected to see lower similarity between site D samples.

As expected, samples from sites A and B showed a high degree of within-site

similarity (Table 6). The majority of fragments detected were present in all three samples

from each site, indicating that the microbial diversity was consistent across the approximately two meter distance sampled at each site. Site C had low similarity between samples, but variation was in abundance of fragments detected rather than diversity (Jaccard value of 1.0). The variation between site C samples is not surprising, because, although temperature between samples did not vary, the water was much more turbid than at sites A and B. The increased turbidity may indicate that site C is shallower, and there may be more mixing of the water with sediments due to gas inputs from the bottom. As a result of the increased turbidity at site C, only 100 ml of water was run through each filter. It is also possible that these results indicate high heterogeneity in community structure at a 100 ml scale. For site A and B samples, biodiversity was assessed at a larger scale, as 500 ml of BSL water was run through each filter. A culture-independent study of the microbial communities in an acidic lake and river in Indonesia found low similarity between DGGE profiles of replicate samples (300-1000 ml) and suggested the variation could be due to low cell numbers of certain groups or clumped cells in the water (91). Also, attempted nucleic acid extractions of site C samples were often unsuccessful at retrieving DNA, resulting in samples C2 and C3 being represented by only one or two extractions, respectively. The low number of extractions for each sample may have increased the effect of extraction bias on results. Finally, since site C is in the transition area between the cooler, less turbid sites A and B, and the warmer, more turbid site D, it is possible that the samples collected at site C are influenced by communities at sites A/B and site D in varying proportions, contributing to the observed differences between site C samples. As predicted, site D TRFLP profiles indicated that

the prokaryotic communities represented by each sample were very different from each other in both diversity (Jaccard value of 0.46) and abundance (Bray-Curtis value of 0.526). The between-sample temperature variation and the small sampling scale of 50 ml could both have contributed to the heterogeneity of prokaryotic communities between site D samples.

For the most part, TRFLP was able to resolve differences in community composition between samples where differences were expected. Since predictions regarding variation between samples were based on temperature, the variation between site C samples may indicate that microbial diversity is influenced by other variables in addition to temperature, including the sampling scale. Sampling on a larger scale at sites

C and D may detect greater microbial diversity. However, given the physical limitations of the filters, increasing the sampling scale would require a modification of either the sampling method or extraction protocol.

Assessment of Diversity in BSL

Inter-site variation. Of the two physical variables measured at each sampling site

(temperature and pH), only temperature varied significantly between sites. Previous

studies have shown microbial community composition to be associated with temperature

(18, 65, 89, 111, 147). Therefore, I expected to see differences in the microbial diversity

of BSL sampling sites correlated with temperature. Since the temperature of samples

taken at sites A and B was constant, I expected the community composition of samples

taken from those sites to be very similar. Samples collected at site D were much warmer

than any other samples, and the temperature was highly variable between sampling

locations within the site. Therefore, I expected the prokaryotic diversity of site D samples to be highly variable when compared to each other or to the other sites. Since the temperature of site C was 4.6ºC warmer than sites A/B and 13.8-25.5ºC cooler than site D, I expected the prokaryotic community of site C to be more similar to sites A/B than to site D, but less similar to sites A/B than they were to each other.

As predicted, the bacterial communities of sites A and B were relatively similar to each other, but, in most cases, less similar than samples within each site. With the primers U515F/U1406R, the Bray-Curtis value comparing samples A2 and A3 (0.783) was smaller than the Bray-Curtis value comparing sites A and B (0.836). The high intra-sample variabilities detected for samples A2 and A3 (Table 5) suggest that the TRFLP profiles for these samples may have been influenced by factors other than bacterial diversity, as previously discussed. Surprisingly, sites A and C were slightly more similar than sites A and B and significantly more similar than sites B and C. Site B was distinguished from sites A and C by a higher abundance of the phylotype most closely related to A. organivorous. However, the variation between TRFLP profiles of samples C2 and C3 indicates that sampling on a larger scale may uncover greater diversity at site C, possibly leading to increased similarity between communities of sites B and C.

The effect of sampling bias on similarity between bacterial communities of sites

A, B and C is supported by a comparison of pooled samples from each site (Table 7).

The pooled samples only contain a subset of the extractions represented by the average results for each site. When comparing the TRFLP profiles for the pooled samples, sites

A and B have the highest similarity to each other, while site C is slightly more similar to

site A than to site B. It is possible that the actual prokaryotic community composition of sites A, B and C is mostly homogeneous, since these sites only vary in temperature by up to 6ºC. The variation detected between communities at these sites may reflect heterogeneity at the scale of sampling rather than heterogeneity between sites, and additional studies are needed to further investigate this possibility.

The bacterial community at site D showed very low similarity to sites A, B, and

C, as indicated by both the averaged TRFLP profiles and the pooled TRFLP profiles.

The fragment attributed to the Thermoplasmatales-like phylotype was unique to site D, and it was more than twice as abundant in D1 samples as in D3 samples. The sequence from the NCBI database (http://www.ncbi.nlm.nih.gov) that most closely matched this phylotype was retrieved from a hot spring with a temperature of 88ºC, which is consistent with the warmer temperatures of site D. Also, the fragment attributed to the Ignicoccus-like phylotype was at least three times more abundant in site D samples than in samples from any of the other sites. The sequence from the NCBI database that most closely matched this phylotype was identified in a YNP thermal spring with temperature 58-62ºC and pH 3.1, conditions similar to those of BSL. The fragment attributed to Hydrogenobaculum was detected at all four sites, but it was most abundant at site D. Hydrogenobaculum grows optimally at 55-65ºC and pH 3.0-4.0, so it is not surprising to find it distributed throughout the lake. As expected, clustering of site D extractions separately from site A, B, and C extractions was highly correlated with temperature. It is likely that the differences between the prokaryotic community composition of site D and that of the other sites are due, at least partially, to the higher temperatures at this site.

Pooled vs. averaged extractions. When profiles obtained from pooling multiple extractions from each site prior to TRFLP analysis were compared to profiles obtained by performing TRFLP analysis separately on independent extractions and then averaging the results, Bray-Curtis similarity values were much lower than expected (Table 8). This was true even when only those extractions included in the pooled sample were included in the averaged profile. Even though an attempt was made to pool the extractions in equal masses (except for site D), the method of determining extraction concentration and the variable accuracy of pipettes used may have led to unequal contributions of certain extractions to the pooled samples. These results indicate that averaging separate TRFLP fingerprints from multiple extractions of each sample is a more accurate method for detecting differences between the community compositions of different sites using TRFLP, because each extraction contributes equally to the sample profile.
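The effect of unequal contributions to a pooled sample can be illustrated with a simplified model. The sketch below treats a pooled profile as a mass-weighted mean of the individual extraction profiles, which is an assumption made only for illustration; it ignores PCR and digestion effects and is not a description of the pooling procedure used in this study. The function name, profiles, and weights are hypothetical.

```python
def weighted_profile(profiles, weights):
    """Weighted mean of TRFLP profiles; weights model each extraction's share of the pool."""
    total_weight = sum(weights)
    sizes = sorted(set().union(*profiles))
    return {
        size: sum(w * p.get(size, 0.0) for p, w in zip(profiles, weights)) / total_weight
        for size in sizes
    }


# Two hypothetical extractions with contrasting profiles (% of total fluorescence).
extraction_1 = {45: 80.0, 391: 20.0}
extraction_2 = {45: 20.0, 391: 80.0}

equal_pool = weighted_profile([extraction_1, extraction_2], [1, 1])
skewed_pool = weighted_profile([extraction_1, extraction_2], [3, 1])

print(equal_pool)   # {45: 50.0, 391: 50.0}
print(skewed_pool)  # {45: 65.0, 391: 35.0}
```

With equal contributions the modeled pool matches the simple average, but a threefold excess of one extraction shifts the pooled profile toward that extraction.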

Factors influencing microbial communities. There are many physical and chemical variables in addition to temperature that could be influencing the microbial communities in BSL. The other variable measured in this study was pH. The samples collected at site D had a slightly higher pH than samples from other sites (site D pH 2.31-2.35; sites A, B, and C pH 2.23-2.25), so pH may have contributed to the observed differences between sites. In a study investigating the bacterial communities in 15 diverse lakes in northern Europe, pH and temperature were found to be strongly related to the distribution of taxa (89). Another study that used ARISA to assess microbial communities in 30 lakes in Wisconsin found pH to be strongly related to community composition (164). A study of several geochemically diverse hot springs in YNP

suggests that minor differences in pH can translate to varying energy yields from chemolithotrophic reactions, influencing microbial community structure (99). Nutrient concentrations have also been associated with bacterial community composition (147, 164). Since they were not measured in this study, it is not known if variations of nutrient concentrations (or some other chemical or physical factor) between sites contributed to the results of this study.

Finally, while the geographical distance between sampling sites varied, the differences between community compositions cannot be explained by distance alone. Sites C and D were the closest geographically (~6 meters apart), but the site C TRFLP profile was less similar to site D than it was to sites A (~172 meters away) or B (~157 meters away) (Table 7). Sites C and D were both approximately the same distance from site A (~172 meters), but the similarity between sites A and C was much higher than the similarity between sites A and D (Table 7).

Factors influencing TRFLP results. It is important to remember that TRFLP profiles may not accurately portray actual natural communities in the environment.

Detected abundance of organisms may be influenced by extraction and PCR bias. Also, fragments detected by TRFLP may represent more than one organism, or one organism may result in multiple TRFLP fragments. While careful selection of restriction enzyme can help to minimize this effect, it does not eliminate the problem. In this study, Hae III was chosen based on “virtual digests” (www.restrictionmapper.org) of clone library sequences with the enzymes Alu I, BstU I, Dpn I, Fat I, Hae III, Hha I, Msp I, Rsa I, and Tai I. Of the four-base-pair cutting enzymes examined, Hae III resulted in the best resolution between BSL site D clones. Using multiple restriction enzymes would increase the ability to distinguish between organisms, but it would also increase the length and expense of the study. Also, when associating fragments present in the TRFLP profiles with specific organisms, it is important to recognize that the clone library used as a reference was constructed using site D water. If the microbial communities present at sites A, B, and C include organisms that are not present at site D, fragments in those profiles may be misidentified using the site D library. A clone library constructed from site A water is in the process of being analyzed, and its completion will increase the inferences that can be drawn from the TRFLP data.

In molecular studies that require PCR, the primers used can also influence which organisms are detected, as indicated by the three clone libraries constructed from BSL site A sediment (Table 2). A study of the published primers commonly used for 16S rRNA amplification concluded that the choice of primers can significantly affect the microbial communities detected (8). In this study, TRFLPs of extractions from sites A, B, and D were repeated using the primers U515F and U1406R (White, unpublished). In most cases, the relationships detected between samples and sites were consistent between the two primer sets. The absence of a BSL water clone library constructed using primers U515F/U1406R prevented the relation of fragments from TRFLP profiles with actual organisms, so it is unknown if this different primer set detected different prokaryotic communities in the BSL samples.

Suggestions for Future Studies

These results suggest that TRFLP is a sufficiently consistent and repeatable method to use in studies assessing community composition. To reduce the bias introduced during nucleic acid extraction, multiple extractions from each sample should be analyzed separately and results averaged. Further studies are necessary to determine the number of independent extractions required to sufficiently reduce variability due to extraction bias and sample heterogeneity. The low Bray-Curtis similarity (0.472) between samples C2 and C3 indicates that comparisons based on one or two extractions may be highly influenced by extraction bias and/or sample heterogeneity. Therefore, a minimum of three independent extractions should be combined to represent each sample.

In fact, when TRFLP profiles from three extractions were averaged to represent site A, B, and D samples, Bray-Curtis similarity values between samples from the same site were only slightly lower than similarity values between repeated TRFLPs of a single extraction. Further studies are required to determine if increasing the number of extractions would continue to reduce the effects of extraction bias and sample heterogeneity. In addition, different samples can be more accurately compared if the concentration of the restriction enzyme digest loaded on the gel is kept constant. Finally, some variation in relative fluorescence of fragments should be expected, even between repeated runs of the same extraction. Prior to beginning any study using TRFLP, background levels of variation should be assessed and considered during data analysis.

In comparing environmental samples, variation in the presence of the most abundant fragments or large differences in relative fluorescence of fragments most likely indicate differences in community composition of the original samples.

CHAPTER 4: CONCLUSION

The results of this study provide an initial look at the bacterial diversity present in

BSL and suggest that differences in community composition around the lake are correlated with temperature. They also indicate that TRFLP is a sensitive and repeatable enough method to detect variation in the prokaryotic community composition at various sites in BSL. When drawing conclusions from molecular studies of microbial diversity, one must consider the potential biases and assumptions, as detailed throughout this thesis, that may be influencing results.

In microbial ecology studies such as this, rRNA sequences are often used to determine phylogenetic relationships and assess diversity in the environment. When drawing conclusions based on these studies, it is important to recognize that bacteria can have multiple, heterogeneous rRNA operons. Although up to 15 rRNA operons have been detected in bacteria, the majority of organisms sequenced have fewer. In a study of

355 Bacterial genomes, approximately 40% of the strains studied had only one or two operons, and 65.2% of the Archaeal strains had only one operon (2). In the genomes with multiple operons, over 43% of Bacteria and 25% of Archaea had invariant rRNAs. In most cases, the nucleotide divergence between operons was less than 1%. However, divergence was found to be greater in thermophiles than in other organisms. It is possible that divergent rRNA operons in BSL organisms have resulted in an overestimation of diversity in this study. However, if this is the case, it is likely that all BSL samples were affected in the same way. For this reason, I believe that comparisons between samples still provide a good indication of the actual variation between microbial communities around the lake.

One of the reasons rRNA sequences have been extensively used to determine phylogenetic relationships between taxa is the assumption that rRNAs are only weakly affected by horizontal gene transfer. However, several studies have suggested that the exchange of rRNA genes between organisms of different species is more frequent than previously believed (50, 72), especially in thermophiles (6, 108). It has been suggested that horizontal gene transfer between prokaryotes is prevalent enough to make traditional

“tree-like” phylogenies inadequate to describe prokaryotic evolution (50). It has also been argued, however, that the rate of occurrence of horizontal gene transfer has been overestimated (85), and exchange that has successfully occurred has been between closely related organisms (2). In addition, phylogenies based on whole-genome sequences are very similar to rRNA-based phylogenies (85). This suggests that the impact of horizontal gene transfer on rRNA-based phylogenies is minimal, and comparisons of rRNA gene sequences continue to be a successful method to estimate evolutionary relationships between prokaryotes.

In future studies using TRFLP to compare the microbial community compositions of environmental samples, the effects of biases on results can be reduced by: (1) averaging at least three independent extractions for each sample; (2) keeping the concentration of samples loaded on the polyacrylamide gel constant; (3) performing preliminary studies to determine the level of variation expected due to the limits of methods and equipment; and (4) adjusting the sampling scale to thoroughly assess the diversity at each site.
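Recommendation (1) above can be sketched as follows. This Python sketch is illustrative and is not the analysis used in this study: the fragment lengths and peak heights are hypothetical, and normalizing each replicate to relative abundance before averaging is an assumption introduced here so that replicates with different total signal contribute equally.

```python
from collections import defaultdict


def relative_abundance(profile):
    """Convert raw peak heights {fragment_length_bp: height} to proportions."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile has no signal")
    return {frag: height / total for frag, height in profile.items()}


def average_profiles(replicates, min_replicates=3):
    """Average normalized TRFLP profiles from independent extractions.

    A fragment missing from a replicate is treated as zero in that replicate.
    """
    if len(replicates) < min_replicates:
        raise ValueError(f"need at least {min_replicates} replicate profiles")
    sums = defaultdict(float)
    for profile in replicates:
        for frag, proportion in relative_abundance(profile).items():
            sums[frag] += proportion
    n = len(replicates)
    return {frag: total / n for frag, total in sorted(sums.items())}


# Hypothetical peak heights from three independent extractions of one sample.
replicates = [
    {72: 500, 150: 300, 310: 200},
    {72: 450, 150: 350, 310: 200},
    {72: 550, 150: 250, 405: 200},
]

mean_profile = average_profiles(replicates)
for frag, proportion in mean_profile.items():
    print(f"{frag} bp: {proportion:.3f}")
```

In this example the 405 bp fragment appears in only one extraction, so averaging reduces its weight in the mean profile rather than discarding it or treating it as equal to the consistently detected fragments.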

Further studies are needed to thoroughly investigate the extent of community heterogeneity within BSL and the factors influencing those communities. Some questions that remain to be answered are: (1) What are the chemical and physical characteristics of BSL, and how do they vary temporally and spatially? (2) How does the sampling scale affect the diversity detected? (3) Does the microbial community composition vary temporally? The information discussed here can be used to develop future studies which will expand our understanding of the microbial community structure and metabolic processes occurring in BSL and the other thermal features of LVNP. The relatively simple biological communities of extreme environments make them model systems for investigating basic ecological questions, such as what factors are controlling community structure and how particular organisms function in ecosystem processes.

Since thermophilic microorganisms are capable of growth in conditions that may be similar to those of early Earth, studying these organisms may give us insights into evolutionary history. Also, the potential uses of extremophilic microorganisms in industry, biotechnology and bioremediation continue to be recognized. Understanding the microbial diversity and ecology of thermal features is essential to being able to preserve these environments so they will continue to be a source of knowledge and technology in the future.

REFERENCES

1. Abdo, Z., U. M. E. Schuette, S. J. Bent, C. J. Williams, L. J. Forney, and P. Joyce. 2006. Statistical methods for characterizing diversity of microbial communities by analysis of terminal restriction fragment length polymorphisms of 16S rRNA genes. Environmental Microbiology 8: 929-938.

2. Acinas, S. G., L. A. Marcelino, V. Klepac-Ceraj, and M. F. Polz. 2004. Divergence and redundancy of 16S rRNA sequences in genomes with multiple rrn operons. Journal of Bacteriology 186: 2629-2635.

3. Acinas, S. G., R. Sarma-Rupavtarm, V. Klepac-Ceraj, and M. F. Polz. 2005. PCR-induced sequence artifacts and bias: insights from comparison of two 16S rRNA clone libraries constructed from the same sample. Applied and Environmental Microbiology 71: 8966-8969.

4. Altschul, S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25: 3389-3402.

5. Amann, R. I., W. Ludwig, and K. H. Schleifer. 1995. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiological Reviews 59: 143-169.

6. Aravind, L., R. L. Tatusov, Y. I. Wolf, D. R. Walker, and E. V. Koonin. 1998. Evidence for massive gene exchange between Archaeal and Bacterial hyperthermophiles. Trends in Genetics 14: 442-444.

7. Avis, P. G., I. A. Dickie, and G. M. Mueller. 2006. A 'dirty' business: testing the limitations of terminal restriction fragment length polymorphism (TRFLP) analysis of soil fungi. Molecular Ecology 15: 873-882.

8. Baker, G. C., and D. A. Cowan. 2004. 16S rDNA primers and the unbiased assessment of thermophile diversity. Biochemical Society Transactions 32: 218-221.

9. Barns, S. M., R. E. Fundyga, M. W. Jeffries, and N. R. Pace. 1994. Remarkable archaeal diversity detected in a Yellowstone National Park hot spring environment. Proceedings of the National Academy of Sciences 91: 1609-1613.

10. Blackwood, C. B., and J. S. Buyer. 2007. Evaluating the physical capture method of terminal restriction fragment length polymorphism for comparison of soil microbial communities. Soil Biology and Biochemistry 39: 590-599.

11. Blackwood, C. B., T. L. Marsh, S.-H. Kim, and E. A. Paul. 2003. Terminal fragment length polymorphism data analysis for quantitative comparison of microbial communities. Applied and Environmental Microbiology 69: 926-932.

12. Bogdanova, T. y. I., I. A. Tsaplina, T. F. Kondrat'eva, V. I. Duda, N. E. Suzina, V. S. Melamud, T. P. Tourova, and G. I. Karavaiko. 2006. Sulfobacillus thermotolerans sp. nov., a thermotolerant, chemolithotrophic bacterium. International Journal of Systematic and Evolutionary Microbiology 56: 1039-104

13. Brock, T. D. 1986. Thermophiles: general, molecular, and applied microbiology. John Wiley & Sons, New York.

14. Brock, T. D. 1978. Thermophilic microorganisms and life at high temperatures. Springer-Verlag, New York.

15. Brock, T. D., K. M. Brock, R. T. Belly, and R. L. Weiss. 1972. Sulfolobus: a new genus of sulfur-oxidizing bacteria living at low pH and high temperature. Archives of Microbiology 84: 54-68.

16. Brown, P. B., and G. V. Wolfe. 2006. Protist genetic diversity in the acidic hydrothermal environments of Lassen Volcanic National Park, USA. Journal of Eukaryotic Microbiology 53: 420-431.

17. Bruneel, O., R. Duran, C. Casiot, F. Elbaz-Poulichet, and J. C. Personne. 2006. Diversity of microorganisms in Fe-As-rich acid mine drainage waters of Carnoules, France. Applied and Environmental Microbiology 72: 551-556.

18. Bryanskaya, A. V., Z. B. Namsaraev, O. M. Kalashnikova, D. D. Barkhutova, B. B. Namsaraev, and V. M. Gorlenko. 2006. Biogeochemical processes in the algal-bacterial mats of the Urinskii alkaline hot spring. Microbiology 75: 611-620.

19. Carter, A., and M. Wilson. 2004. Determination of Microbial Populations in Thermal Springs Using Terminal Restriction Fragment Length Polymorphisms (T-RFLP) and the Effects of Digestion with Mung Bean Nuclease on T-RFLP. Department of Biological Sciences, Humboldt State University, Arcata, CA.

20. Chao, A., R. L. Chazdon, R. K. Colwell, and T.-J. Shen. 2006. Abundance-based similarity indices and their estimation when there are unseen species in samples. Biometrics 62: 361-371.

21. Clark, D. A., and P. R. Norris. 1996. Acidimicrobium ferrooxidans gen. nov., sp. nov., mixed-culture ferrous iron oxidation with Sulfobacillus species. Microbiology 142: 785-790.

22. Clement, B. G., L. E. Kehl, K. L. DeBord, and C. L. Kitts. 1998. Terminal restriction fragment patterns (TRFPs), a rapid, PCR-based method for the comparison of complex bacterial communities. Journal of Microbiological Methods 31: 135-142.

23. Clynne, M. A., C. J. Janik, and L. J. P. Muffler. 2003. "Hot Water" in Lassen Volcanic National Park - fumaroles, steaming ground, and boiling mudpots (USGS Fact Sheet 101-02). U. S. Geological Survey.

24. Cole, J. R., B. Chai, T. L. Marsh, R. J. Farris, Q. Wang, S. A. Kulam, S. Chandra, D. M. McGarrell, T. M. Schmidt, G. M. Garrity, and J. M. Tiedje. 2003. The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Research 31: 442-443.

25. Cordova-Kreylos, A. L., Y. Cao, P. G. Green, H.-M. Hwang, K. M. Kuivila, M. G. LaMontagne, L. C. Van De Werfhorst, P. A. Holden, and K. M. Scow. 2006. Diversity, composition, and geographical distribution of microbial communities in California salt marsh sediments. Applied and Environmental Microbiology 72: 3357-3366.

26. Crosby, L. D., and C. S. Criddle. 2003. Understanding bias in microbial community analysis techniques due to rrn operon copy number heterogeneity. Biotechniques 34: 2-9.

27. Danovaro, R., G. M. Luna, A. Dell'Anno, and B. Pietrangeli. 2006. Comparison of two fingerprinting techniques, terminal restriction fragment length polymorphism and automated ribosomal intergenic spacer analysis, for determination of bacterial diversity in aquatic environments. Applied and Environmental Microbiology 72: 5982-5989.

28. Dawson, S. C., and N. R. Pace. 2002. Novel kingdom-level eukaryotic diversity in anoxic environments. Proceedings of the National Academy of Sciences U.S.A. 99: 8324-8329.

29. Demirjian, D. C., F. Moris-Varas, and C. S. Cassidy. 2001. Enzymes From Extremophiles. Current Opinion in Chemical Biology 5: 144-151.

30. Derakshani, M., T. Lukow, and W. Liesack. 2001. Novel bacterial lineages at the (sub)division level as detected by signature nucleotide-targeted recovery of 16S rRNA genes from bulk soil and rice roots of flooded rice microcosms. Applied and Environmental Microbiology 67: 623-631.

31. Donahoe-Christiansen, J., S. D'Imperio, C. R. Jackson, W. P. Inskeep, and T. R. McDermott. 2004. Arsenite-oxidizing Hydrogenobaculum strain isolated from an acid-sulfate-chloride spring in Yellowstone National Park. Applied and Environmental Microbiology 70: 1865-1868.

32. Dunbar, J., L. O. Ticknor, and C. R. Kuske. 2000. Assessment of microbial diversity in four southwestern United States soils by 16S rRNA gene terminal restriction fragment analysis. Applied and Environmental Microbiology 66: 2943-2950.

33. Eder, W., and R. Huber. 2002. New isolates and physiological properties of the Aquificales and description of Thermocrinis albus sp. nov. Extremophiles 6: 309-318.

34. Edgcomb, V. P., J. H. McDonald, R. Devereux, and D. W. Smith. 1999. Estimation of bacterial cell numbers in humic acid-rich salt marsh sediments with probes directed to 16S ribosomal DNA. Applied and Environmental Microbiology 65: 1516-1523.

35. Egert, M., and M. W. Friedrich. 2003. Formation of pseudo-terminal restriction fragments, a PCR-related bias affecting terminal restriction fragment length polymorphism analysis of microbial community structure. Applied and Environmental Microbiology 69: 2555-2562.

36. Engebretson, J. J., and C. L. Moyer. 2003. Fidelity of select restriction endonucleases in determining microbial diversity by terminal-restriction fragment length polymorphism. Applied and Environmental Microbiology 69: 4823-4829.

37. Erixon, P., B. Svennblad, T. Britton, and B. Oxelman. 2003. Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Systematic Biology 52: 665-673.

38. Fantroussi, S. E., H. Urakawa, A. E. Bernhard, J. J. Kelly, P. A. Noble, H. Smidt, G. M. Yershov, and D. A. Stahl. 2003. Direct profiling of environmental microbial populations by thermal dissociation analysis of native rRNAs hybridized to oligonucleotide microarrays. Applied and Environmental Microbiology 69: 2377-2382.

39. Farrelly, V., F. A. Rainey, and E. Stackebrandt. 1995. Effect of genome size and rrn gene copy number on PCR amplification of 16S rRNA genes from a mixture of bacterial species. Applied and Environmental Microbiology 61: 2798-2801.

40. Fazi, S., S. Amalfitano, J. Pernthaler, and A. Puddu. 2005. Bacterial communities associated with benthic organic matter in headwater stream microhabitats. Environmental Microbiology 7: 1633-1640.

41. Ferrari, B., S. Binnerup, and M. Gillings. 2005. Microcolony cultivation on a soil substrate membrane system selects for previously uncultured soil bacteria. Applied and Environmental Microbiology 71: 8714-8720.

42. Ferrari, B. C., N. Tujula, K. Stoner, and S. Kjelleberg. 2006. Catalyzed reporter deposition-fluorescence in situ hybridization allows for enrichment-independent detection of microcolony-forming soil bacteria. Applied and Environmental Microbiology 72: 918-922.

43. Fisher, M. M., and E. W. Triplett. 1999. Automated approach for ribosomal intergenic spacer analysis of microbial diversity and its application to freshwater bacterial communities. Applied and Environmental Microbiology 65: 4630-4636.

44. Forterre, P. 2002. A hot story from comparative genomics: reverse gyrase is the only hyperthermophile-specific protein. Trends in Genetics 18: 236-238.

45. Frey, J. C., E. R. Angert, and A. N. Pell. 2006. Assessment of biases associated with profiling simple, model communities using terminal-restriction fragment length polymorphism-based analyses. Journal of Microbiological Methods 67: 9-19.

46. Frey, J. C., J. M. Rothman, A. N. Pell, J. B. Nizeyi, M. R. Cranfield, and E. R. Angert. 2006. Fecal bacterial diversity in a wild gorilla. Applied and Environmental Microbiology 72: 3788-3792.

47. Futterer, O., A. Angelov, H. Liesegang, G. Gottschalk, C. Schleper, B. Schepers, C. Dock, G. Antranikian, and W. Liebl. 2004. Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proceedings of the National Academy of Sciences, U.S.A. 101: 9091-9096.

48. Galtier, N., and J. R. Lobry. 1997. Relationships Between Genomic G+C Content, RNA Secondary Structures, and Optimal Growth Temperature in Prokaryotes. Journal of Molecular Evolution 44: 632-636.

49. Gao, H., Z. K. Yang, T. J. Gentry, L. Wu, C. W. Schadt, and J. Zhou. 2007. Microarray-based analysis of microbial community RNAs by whole-community RNA amplification. Applied and Environmental Microbiology 73: 563-571.

50. Gogarten, J. P., W. F. Doolittle, and J. G. Lawrence. 2002. Prokaryotic evolution in light of gene transfer. Molecular Biology and Evolution 19: 2226-2238.

51. Golovacheva, R. S., K. M. Valiejo-Roman, and A. V. Troitsky. 1987. Sulfurococcus mirabilis gen. nov. sp. nov., a new thermophilic archaebacterium with the ability to oxidize sulfur. Microbiology 56: 84-91.

52. Grant, A., and L. A. Ogilvie. 2003. Terminal restriction fragment length polymorphism data analysis. Applied and Environmental Microbiology 69: 6342-6343.

53. Gross, W., I. Heilmann, D. Lenze, and K. Schnarrenberger. 2001. Biogeography of the Cyanidiaceae (Rhodophyta) based on 18S ribosomal RNA sequence data. European Journal of Phycology 36: 275-280.

54. Grueter, D., B. Schmid, and H. Brandl. 2006. Influence of plant diversity and elevated atmospheric carbon dioxide levels on below ground bacterial diversity. BMC Microbiology, vol. 6, doi: 10.1186/1471-2180-6-68.

55. Hamm, L., J. Kee, A. M. Anacker, M. S. Wilson, and P. L. Siering. 2006. N-100 isolation of heterotrophic prokaryotes from a hot, acidic lake (Boiling Springs Lake) in northern CA. American Society for Microbiology 106th General Meeting. Orlando, Fl.

56. Hansen, M. C., T. Tolker-Nielsen, M. Givskov, and S. Molin. 1998. Biased 16S rDNA PCR amplification caused by interference from DNA flanking template region. FEMS Microbiology Ecology 26: 141-149.

57. Hewson, I., and J. A. Fuhrman. 2006. Improved strategy for comparing microbial assemblage fingerprints. Microbial Ecology 51: 147-153.

58. Holland, S. M. 2003. Analytic Rarefaction 1.3. http://www.uga.edu/~strata/software.

59. Hongoh, Y., L. Ekpornprasit, T. Inoue, S. Moriya, S. Trakulnaleamsai, M. Ohkuma, N. Noparatnaraporn, and T. Kudo. 2006. Intracolony variation of bacterial gut microbiota among castes and ages in the fungus-growing termite Macrotermes gilvus. Molecular Ecology 15: 505-516.

60. Huber, G., C. Spinnler, A. Gambacorta, and K. O. Stetter. 1989. Metallosphaera sedula gen. nov., and sp. nov. represents a new genus of aerobic, metal-mobilizing, thermoacidophilic archaebacteria. Systematic and Applied Microbiology 12: 38-47.

61. Huber, H., and K. O. Stetter. 2001. Order III. Sulfolobales. In D. R. Boone and R. W. Castenholz (ed.), Bergey's Manual of Systematic Bacteriology, 2nd ed. Vol. 1.

62. Huelsenbeck, J. P. 2000. Mr.Bayes: Bayesian Inference of Phylogeny, p. Distributed by the author. Department of Biology, University of Rochester. 3.1 ed.

63. Hugenholtz, P., and T. Huber. 2003. Chimeric 16S rDNA sequences of diverse origin are accumulating in the public database. International Journal of Systematic and Evolutionary Microbiology 53: 289-293.

64. Hugenholtz, P., C. Pitulle, K. L. Hershberger, and N. R. Pace. 1998. Novel division level bacterial diversity in a Yellowstone hot spring. Journal of Bacteriology 180: 366-376.

65. Hullar, M. A. J., L. A. Kaplan, and D. A. Stahl. 2006. Recurring seasonal dynamics of microbial communities in stream habitats. Applied and Environmental Microbiology 72: 713-722.

66. Ingebritsen, S. E., and M. L. Sorey. 1985. A quantitative analysis of the Lassen hydrothermal system, north central California. Water Resources Research 21: 853-868.

67. Ishii, K., M. Mußmann, B. J. MacGregor, and R. Amann. 2004. An improved fluorescence in situ hybridization protocol for the identification of bacteria and archaea in marine sediments. FEMS Microbiology Ecology 50: 203-212.

68. Itoh, T., K. Suzuki, and T. Nakase. 2002. Vulcanisaeta distributa gen. nov., sp. nov., and Vulcanisaeta souniana sp. nov., novel hyperthermophilic, rod-shaped crenarchaeotes isolated from hot springs in Japan. International Journal of Systematic and Evolutionary Microbiology 52: 1097-1104.

69. Itoh, T., K. Suzuki, P. C. Sanchez, and T. Nakase. 2003. Caldisphaera lagunensis gen. nov., sp. nov., a novel hyperthermophilic crenarchaeote isolated from a hot spring at Mt. Maquiling, Philippines. International Journal of Systematic and Evolutionary Microbiology 53: 1149-1154.

70. Jackson, C. R., H. W. Langner, J. Donahoe-Christiansen, W. P. Inskeep, and T. R. McDermott. 2001. Molecular analysis of microbial community structure in an arsenite-oxidizing acidic thermal spring. Environmental Microbiology 3: 532-542.

71. Jaenicke, R. 1996. Stability and folding of ultrastable proteins: eye lens crystallins and enzymes from thermophiles. FASEB Journal 10: 84-92.

72. Jain, R., M. C. Rivera, J. E. Moore, and J. A. Lake. 2002. Horizontal gene transfer in microbial genome evolution. Theoretical Population Biology 61: 489-495.

73. Johnson, D. B. 1998. Biodiversity and ecology of acidophilic microorganisms. FEMS Microbiology Ecology 27: 307-317.

74. Johnson, D. B. 2001. Genus II. Leptospirillum, p. 453-457. In G. M. Garrity (ed.), Bergey's Manual of Systematic Bacteriology, 2 ed, vol. 1. Springer-Verlag New York, Inc., New York.

75. Johnson, D. B., N. Okibe, and F. F. Roberto. 2003. Novel thermo-acidophilic bacteria isolated from geothermal sites in Yellowstone National Park: physiological and phylogenetic characteristics. Archives of Microbiology 180: 60-68.

76. Johnson, D. B., B. Stallwood, S. Kimura, and K. B. Hallberg. 2006. Isolation and characterization of Acidicaldus organivorus, gen. nov., sp. nov., a novel sulfur-oxidizing, ferric iron-reducing thermo-acidophilic heterotrophic Proteobacterium. Archives of Microbiology 185: 212-221.

77. Kaplan, C. W., and C. L. Kitts. 2003. Variation between observed and true Terminal Restriction Fragment length is dependent on true TRF length and purine content. Journal of Microbiological Methods 54: 121-125.

78. Karavaiko, G. I., T. y. I. Bogdanova, T. y. P. Tourova, T. F. Kondrat'eva, I. A. Tsaplina, M. A. Egorova, E. N. Krasil'nikova, and L. M. Zakharchuk. 2005. Reclassification of 'Sulfobacillus thermosulfidooxidans subsp. thermotolerans' strain K1 as Alicyclobacillus tolerans sp. nov. and Sulfobacillus disulfidooxidans Dufresne et al. 1996 as Alicyclobacillus disulfidooxidans comb. nov., and emended description of the genus Alicyclobacillus. International Journal of Systematic and Evolutionary Microbiology 55: 941-947.

79. Kent, A. D., D. J. Smith, B. J. Benson, and E. W. Triplett. 2003. Web-based phylogenetic assignment tool for analysis of terminal restriction fragment length polymorphism profiles of microbial communities. Applied and Environmental Microbiology 69: 6768-6776.

80. Kitts, C. L. 2001. Terminal restriction fragment patterns: a tool for comparing microbial communities and assessing community dynamics. Current Issues in Intestinal Microbiology 2: 17-25.

81. Klappenbach, J. A., J. M. Dunbar, and T. M. Schmidt. 2000. rRNA operon copy number reflects ecological strategies of bacteria. Applied and Environmental Microbiology 66: 1328-1333.

82. Kopczynski, E. D., M. M. Bateson, and D. M. Ward. 1994. Recognition of chimeric small-subunit ribosomal DNAs composed of genes from uncultured organisms. Applied and Environmental Microbiology 60: 746-748.

83. Kowalak, J. A., J. J. Dalluge, J. A. McCloskey, and K. O. Stetter. 1994. The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry 33: 7869-7876.

84. Kraigher, B., B. Stres, J. Hacin, L. Ausec, I. Mahne, J. D. van Elsas, and I. Mandic-Mulec. 2006. Microbial activity and community structure in two drained fen soils in the Ljubljana Marsh. Soil Biology and Biochemistry 38: 2762-2771.

85. Kurland, C. G., B. Canback, and O. G. Berg. 2003. Horizontal gene transfer: a critical view. Proceedings of the National Academy of Sciences, U.S.A. 100: 9658-9662.

86. Kurosawa, N., Y. H. Itoh, and T. Itoh. 2003. Reclassification of Sulfolobus hakonensis Takayanagi et al. 1996 as Metallosphaera hakonensis comb. nov. based on phylogenetic evidence and DNA G+C content. International Journal of Systematic and Evolutionary Microbiology 53: 1607-1608.

87. Kurosawa, N., Y. H. Itoh, T. Iwai, A. Sugai, I. Uda, N. Kimura, T. Horiuchi, and T. Itoh. 1998. Sulfurisphaera ohwakuensis gen. nov., sp. nov., a novel extremely thermophilic acidophile of the order Sulfolobales. International Journal of Systematic Bacteriology 48: 451-456.

88. Liesack, W. H., H. Weyland, and E. Stackebrandt. 1991. Potential risks of gene amplification by PCR as determined by 16S rDNA analysis of a mixed culture of strict barophilic bacteria. Microbial Ecology 21: 192-198.

89. Lindstrom, E. S., M. P. Kamst-Van Agterveld, and G. Zwart. 2005. Distribution of typical freshwater bacterial groups is associated with pH, temperature, and lake water retention time. Applied and Environmental Microbiology 71: 8201-8206.

90. Liu, W.-T., T. L. Marsh, H. Cheng, and L. J. Forney. 1997. Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Applied and Environmental Microbiology 63: 4516-4522.

91. Lohr, A. J., A. M. Laverman, M. Braster, N. M. van Straalen, and W. F. M. Roling. 2006. Microbial Communities in the World's Largest Acidic Volcanic Lake, Kawah Ijen in Indonesia, and in the Banyupahit River Originating from It. Microbial Ecology 52: 609-618.

92. Lovell, C. R., and Y. Hui. 1991. Design and testing of a functional group-specific DNA probe for the study of natural populations of acetogenic bacteria. Applied and Environmental Microbiology 57: 2602-2609.

93. Macalady, J. L., M. M. Vestling, D. Baumler, N. Boekelheide, C. W. Kaspar, and J. F. Banfield. 2004. Tetraether-linked membrane monolayers in Ferroplasma spp: a key to survival in acid. Extremophiles 8: 411-419.

94. Madigan, M. T., J. M. Martinko, and J. Parker. 2003. Brock biology of microorganisms. 11 ed. Pearson Education, Inc., Upper Saddle River, NJ.

95. Marguet, E., and P. Forterre. 1998. Protection of DNA by salts against thermodegradation at temperatures typical for hyperthermophiles. Extremophiles 2: 115-122.

96. Marsh, T. L. 1999. Terminal restriction fragment length polymorphism (T-RFLP): an emerging method for characterizing diversity among homologous populations of amplification products. Current Opinion in Microbiology 2: 323-327.

97. McArthur, J. V. 2006. Microbial ecology: an evolutionary approach. Elsevier, Inc., Burlington, MA.

98. McGuinness, L. M., M. Salganik, L. Vega, K. D. Pickering, and L. J. Kerkhof. 2006. Replicability of bacterial communities in denitrifying bioreactors as measured by PCR/T-RFLP analysis. Environmental Science & Technology 40: 509-515.

99. Meyer-Dombard, D. R., E. L. Shock, and J. P. Amend. 2005. Archaeal and bacterial communities in geochemically diverse hot springs of Yellowstone National Park, USA. Geobiology 3: 211-227.

100. Miller, D. N. 2001. Evaluation of gel filtration resins for the removal of PCR-inhibitory substances from soils and sediments. Journal of Microbiological Methods 44: 49-58.

101. Miller, K. M., T. J. Ming, A. D. Schulze, and R. E. Withler. 1999. Denaturing gradient gel electrophoresis (DGGE): a rapid and sensitive technique to screen nucleotide sequence variation in populations. Biotechniques 27: 1016-1024.

102. Morales, S. E., P. J. Mouser, N. Ward, S. P. Hudman, N. J. Gotelli, D. S. Ross, and T. A. Lewis. 2006. Comparison of bacterial communities in New England Sphagnum bogs using terminal restriction fragment length polymorphism (T-RFLP). Microbial Ecology 52: 34-44.

103. Moré, M., J. B. Herrick, M. Silva, W. C. Ghiorse, and E. L. Madsen. 1994. Quantitative cell lysis of indigenous microorganisms and rapid extraction of microbial DNA from sediment. Applied and Environmental Microbiology 60: 1572-1580.

104. Muyzer, G., and K. Smalla. 1998. Application of denaturing gradient gel electrophoresis (DGGE) and temperature gradient gel electrophoresis (TGGE) in microbial ecology. Antonie Van Leeuwenhoek 73: 127-141.

105. Muyzer, G., E. C. de Waal, and A. G. Uitterlinden. 1993. Profiling of complex microbial populations by denaturing gradient gel analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Applied and Environmental Microbiology 59: 695-700.

106. Myers, R. M., S. G. Fischer, L. S. Lerman, and T. Maniatis. 1985. Nearly all single base substitutions in DNA fragments joined to a GC-clamp can be detected by denaturing gradient gel electrophoresis. Nucleic Acids Research 13: 3131-3145.

107. Nakagawa, T., and M. Fukui. 2002. Phylogenetic characterization of microbial mats and streamers from a Japanese alkaline hot spring with a thermal gradient. Journal of General and Applied Microbiology 48: 211-222.

108. Nelson, K. E., R. A. Clayton, S. R. Gill, M. L. Gwinn, R. J. Dodson, D. H. Haft, E. K. Hickey, J. D. Peterson, W. C. Nelson, K. A. Ketchum, L. McDonald, T. R. Utterback, J. A. Malek, K. D. Linher, M. M. Garrett, A. M. Stewart, M. D. Cotton, M. S. Pratt, C. A. Phillips, D. Richardson, J. Heidelberg, G. G. Sutton, R. D. Fleischmann, J. A. Eisen, O. White, S. L. Salzberg, H. O. Smith, J. C. Venter, and C. M. Fraser. 1999. Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima. Nature 399: 323-329.

109. Normand, P., C. Ponsonnet, X. Nesme, M. Neyra, and P. Simonet. 1996. ITS analysis of prokaryotes, p. 1-12. In A. D. L. Akkermans, J. D. van Elsas, and F. J. de Bruijn (ed.), Molecular microbiology ecological manual. Kluwer Academic, Dordrecht.

110. Norris, P. R., D. A. Clark, J. P. Owen, and S. Waterhouse. 1996. Characteristics of Sulfobacillus acidophilus sp. nov. and other moderately thermophilic mineral-sulphide-oxidizing bacteria. Microbiology 142: 775-783.

111. Norris, T. B., J. M. Wraith, R. W. Castenholz, and T. R. McDermott. 2002. Soil microbial community structure across a thermal gradient following a geothermal heating event. Applied and Environmental Microbiology 68: 6300-6309.

112. Nubel, U., F. Garcia-Pichel, and G. Muyzer. 1997. PCR primers to amplify 16S rRNA genes from cyanobacteria. Applied and Environmental Microbiology 63: 3327-3332.

113. Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. FastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Computer Applications in the Biosciences 10: 41-48.

114. Osborne, C. A., G. N. Rees, Y. Bernstein, and P. H. Janssen. 2006. New threshold and confidence estimates for terminal restriction fragment length polymorphism analysis of complex bacterial communities. Applied and Environmental Microbiology 72: 1270-1278.

115. Paabo, S., D. M. Irwin, and A. C. Wilson. 1990. DNA damage promotes jumping between templates during enzymatic amplification. Journal of Biological Chemistry 265: 4718-4721.

116. Pakchung, A. A. H., P. J. L. Simpson, and R. Codd. 2006. Life on Earth. Extremophiles continue to move the goal posts. Environmental Chemistry 3: 77-93.

117. Pandey, J., K. Ganesan, and R. K. Jain. 2006. Variation in T-RFLP profiles with differing chemistries of fluorescent dyes used for labeling the PCR primers. Journal of Microbiological Methods, doi: 10.1016/j.mimet.2006.11.012.

118. Park, S., Y. K. Ku, M. J. Seo, D. Y. Kim, J. E. Yeon, K. M. Lee, S.-C. Jeong, W. K. Yoon, C. H. Harn, and H. M. Kim. 2006. Principal component analysis and discriminant analysis (PCA-DA) for discriminating profiles of terminal restriction fragment length polymorphism (T-RFLP) in soil bacterial communities. Soil Biology and Biochemistry 38: 2344-2349.

119. Peplies, J., F. O. Glockner, and R. Amann. 2003. Optimization strategies for DNA microarray-based detection of bacteria with 16S rRNA-targeting oligonucleotide probes. Applied and Environmental Microbiology 69: 1397-1407.

120. Peplies, J., C. Lachmund, F. O. Glockner, and W. Manz. 2006. A DNA microarray platform based on direct detection of rRNA for characterization of freshwater sediment-related prokaryotic communities. Applied and Environmental Microbiology 72: 4829-4838.

121. Pernthaler, A., J. Pernthaler, and R. Amann. 2002. Fluorescence in situ hybridization and catalyzed reporter deposition for the identification of marine bacteria. Applied and Environmental Microbiology 68: 3094-3101.

122. Pernthaler, A., C. M. Preston, J. Pernthaler, E. F. DeLong, and R. Amann. 2002. Comparison of fluorescently labeled oligonucleotide and polynucleotide probes for the detection of pelagic marine bacteria and archaea. Applied and Environmental Microbiology 68: 661-667.

123. Prokofeva, M. I., M. L. Miroshnichenko, N. A. Kostrikina, N. A. Chernyh, B. B. Kuznetsov, T. P. Tourova, and E. A. Bonch-Osmolovskaya. 2000. Acidolobus aceticus gen. nov., sp. nov., a novel anaerobic thermoacidophilic archaeon from continental hot vents in Kamchatka. International Journal of Systematic and Evolutionary Microbiology 50: 2001-2008.

124. Qiu, X., L. Wu, H. Huang, P. E. McDonel, A. V. Palumbo, J. M. Tiedje, and J. Zhou. 2001. Evaluation of PCR-generated chimeras, mutations, and heteroduplexes with 16S rRNA gene-based cloning. Applied and Environmental Microbiology 67: 880-887.

125. R Development Core Team. 2005. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

126. Ranjard, L., F. Poly, J. C. Lata, C. Mougel, J. Thioulouse, and S. Nazaret. 2001. Characterization of bacterial and fungal soil communities by automated ribosomal intergenic spacer analysis fingerprints: biological and methodological variability. Applied and Environmental Microbiology 67: 4479-4487.

127. Ranjard, L., F. Poly, and S. Nazaret. 2000. Monitoring complex bacterial communities using culture-independent molecular techniques: application to soil environment. Research in Microbiology 151: 167-177.

128. Rappe, M. S., and S. J. Giovannoni. 2003. The uncultured microbial majority. Annual Review of Microbiology 57: 369-394.

129. Raskin, L., J. M. Stromley, B. E. Rittman, and D. A. Stahl. 1994. Group-specific 16S rRNA hybridization probes to describe natural communities of methanogens. Applied and Environmental Microbiology 60: 1232-1240.

130. Reysenbach, A.-L. 2001. Genus V. Sulfurisphaera, p. 208-209. In D. R. Boone and R. W. Castenholz (ed.), Bergey's Manual of Systematic Bacteriology, 2nd ed. Vol. 1.

131. Reysenbach, A.-L. 2001. Genus VI. Sulfurococcus, p. 209-210. In D. R. Boone and R. W. Castenholz (ed.), Bergey's Manual of Systematic Bacteriology, 2nd ed. Vol. 1.

132. Reysenbach, A.-L., L. J. Giver, G. S. Wickham, and N. R. Pace. 1992. Differential amplification of rRNA genes by polymerase chain reaction. Applied and Environmental Microbiology 58: 3417-3418.

133. Reysenbach, A.-L., and N. R. Pace. 1995. Reliable amplification of hyperthermophilic Archaeal 16S rRNA genes by the polymerase chain reaction, p. 101-106. In F. T. Robb and A. R. Place (ed.), Archaea - a laboratory manual (thermophiles). Cold Spring Harbor Laboratory Press, Cold Spring Harbor.

134. Rothschild, L. J., and R. L. Mancinelli. 2001. Life in extreme environments. Nature 409: 1092-1101.

135. Sakano, Y., and L. Kerkhof. 1998. Assessment of changes in microbial community structure during operation of an ammonia biofilter with molecular tools. Applied and Environmental Microbiology 64: 4077-4082.

136. Scala, D. J., E. L. Hacherl, R. Cowan, L. Y. Young, and D. S. Kosson. 2006. Characterization of Fe(III)-reducing enrichment cultures and isolation of Fe(III)-reducing bacteria from the Savannah River site, South Carolina. Research in Microbiology 157: 772-783.

137. Scala, D. J., and L. J. Kerkhof. 2000. Horizontal heterogeneity of denitrifying bacterial communities in marine sediments by terminal restriction fragment length polymorphism. Applied and Environmental Microbiology 66: 1980-1986.

138. Schleper, C., G. Puehler, I. Holz, A. Gambacorta, D. Janekovic, U. Santarius, H.-P. Klenk, and W. Zillig. 1995. Picrophilus gen. nov., fam. nov.: a novel aerobic, heterotrophic, thermoacidophilic genus and family comprising Archaea capable of growth around pH 0. Journal of Bacteriology 177: 7050-7059.

139. Segerer, A., T. A. Langworthy, and K. O. Stetter. 1988. Thermoplasma acidophilum and Thermoplasma volcanium sp. nov. from solfatara fields. Systematic and Applied Microbiology 10: 161-171.

140. Segerer, A., A. Neuner, J. K. Kristjansson, and K. O. Stetter. 1986. Acidianus infernus gen. nov., sp. nov., and Acidianus brierleyi comb. nov.: facultatively aerobic, extremely acidophilic thermophilic sulfur-metabolizing archaebacteria. International Journal of Systematic Bacteriology 36: 559-564.

141. Segerer, A. H., A. Trincone, M. Gahrtz, and K. O. Stetter. 1991. Stygiolobus azoricus gen. nov., sp. nov. represents a novel genus of anaerobic, extremely thermoacidophilic archaebacteria of the order Sulfolobales. International Journal of Systematic Bacteriology 41: 495-501.

142. Shima, S., and K.-i. Suzuki. 1993. Hydrogenobacter acidophilus sp. nov., a thermoacidophilic, aerobic, hydrogen-oxidizing bacterium requiring elemental sulfur for growth. International Journal of Systematic Bacteriology 43: 703-708.

143. Siering, P. L. 1998. The double helix meets the crystal lattice: the power and pitfalls of nucleic acid approaches for biomineralogical investigations. American Mineralogist 83: 1593-1607.

144. Siering, P. L., J. M. Clarke, and M. S. Wilson. 2006. Geochemical and biological diversity of acidic, hot springs in Lassen Volcanic National Park. Geomicrobiology Journal 23: 129-141.

145. Siering, P. L., and W. C. Ghiorse. 1997. Development and application of 16S rRNA-targeted probes for detection of iron- and manganese-oxidizing sheathed bacteria in environmental samples. Applied and Environmental Microbiology 63: 644-651.

146. Sittenfeld, A., M. Mora, J. M. Ortega, F. Albertazzi, A. Cordero, M. Roncel, E. Sanchez, M. Vargas, M. Fernandez, J. Weckesser, and A. Serrano. 2002. Characterization of a photosynthetic Euglena strain isolated from an acidic hot mud pool of a volcanic area of Costa Rica. FEMS Microbiology Ecology 42: 151-161.

147. Skirnisdottir, S., G. O. Hreggvidsson, S. Hjorleifsdottir, V. T. Marteinsson, S. K. Petursdottir, O. Holst, and J. K. Kristjansson. 2000. Influence of sulfide and temperature on species composition and community structure of hot spring microbial mats. Applied and Environmental Microbiology 66: 2835-2841.

148. Smith, C. J., B. S. Danilowicz, A. K. Clear, F. J. Costello, B. Wilson, and W. G. Meijer. 2005. T-Align, a web-based tool for comparison of multiple terminal restriction fragment length polymorphism profiles. FEMS Microbiology Ecology 54: 375-380.

149. Spear, J. R., J. J. Walker, T. M. McCollom, and N. R. Pace. 2005. Hydrogen and bioenergetics in the Yellowstone geothermal ecosystem. Proceedings of the National Academy of Sciences, USA 102: 2555-2560.

150. Spiegelman, D., G. Whissell, and C. W. Greer. 2005. A survey of the methods for the characterization of microbial consortia and communities. Canadian Journal of Microbiology 51: 355-386.

151. Staden, R. 1996. The Staden sequence analysis package. Molecular Biotechnology 5: 233-241.

152. Stohr, R., A. Waberski, H. Volker, B. J. Tindall, and M. Thomm. 2001. Hydrogenothermus marinus gen. nov., sp. nov., a novel thermophilic hydrogen-oxidizing bacterium, recognition of Calderobacterium hydrogenophilum as a member of the genus Hydrogenobacter and proposal of the reclassification of Hydrogenobacter acidophilus as Hydrogenobaculum acidophilum gen. nov., comb. nov., in the phylum 'Hydrogenobacter/Aquifex'. International Journal of Systematic and Evolutionary Microbiology 51: 1853-1862.

153. Suzuki, M. T., and S. J. Giovannoni. 1996. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Applied and Environmental Microbiology 62: 625-630.

154. Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10 (PPC). Sinauer Associates, Sunderland, Massachusetts.

155. Takai, K., and K. Horikoshi. 1999. Genetic diversity of Archaea in deep-sea hydrothermal vent environments. Genetics 152: 1285-1297.

156. Takayanagi, S., H. Kawasaki, K. Sugimori, T. Yamada, A. Sugai, T. Ito, K. Yamasato, and M. Shioda. 1996. Sulfolobus hakonensis sp. nov., a novel species of acidothermophilic archaeon. International Journal of Systematic Bacteriology 46: 377-382.

157. Theron, J., and T. E. Cloete. 2000. Molecular techniques for determining microbial diversity and community structure in natural environments. Critical Reviews in Microbiology 26: 37-57.

158. Thompson, J. M. 1985. Chemistry of thermal and nonthermal springs in the vicinity of Lassen Volcanic National Park. Journal of Volcanology and Geothermal Research 25: 81-104.

159. van de Vossenberg, J. L. C. M., A. J. M. Driessen, W. Zillig, and W. N. Konings. 1998. Bioenergetics and cytoplasmic membrane stability of the extremely acidophilic, thermophilic archaeon Picrophilus oshimae. Extremophiles 2: 67-74.

160. Wang, G. C.-Y., and Y. Wang. 1997. Frequency of formation of chimeric molecules as a consequence of PCR coamplification of 16S rRNA genes from mixed bacterial genomes. Applied and Environmental Microbiology 63: 4645-4650.

161. Whitaker, R. J., D. W. Grogan, and J. W. Taylor. 2003. Geographic barriers isolate endemic populations of hyperthermophilic Archaea. Science 301: 976-978.

162. Wilson, J. W., R. Ramamurthy, S. Porwollik, M. McClelland, T. Hammond, and P. Allen. 2002. Microarray analysis identifies Salmonella genes belonging to the low-shear modeled microgravity regulon. Proceedings of the National Academy of Sciences, USA 99: 13807-13812.

163. Woese, C. R. 1987. Bacterial evolution. Microbiological Reviews 51: 221-271.

164. Yannarell, A. C., and E. W. Triplett. 2005. Geographic and environmental sources of variation in lake bacterial community composition. Applied and Environmental Microbiology 71: 227-239.

165. Yoshie, S., H. Makino, H. Hirosawa, K. Shirotani, S. Tsuneda, and A. Hirata. 2006. Molecular analysis of halophilic bacterial community for high-rate denitrification of saline industrial wastewater. Applied Microbiology and Biotechnology 72: 182-189.

166. Zhou, J., and D. K. Thompson. 2002. Challenges in applying microarrays to environmental studies. Current Opinion in Biotechnology 13: 204-207.