
University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln

Theses and Dissertations in Biochemistry Biochemistry, Department of

12-2010

The In Situ Function of a Microbial Community Profiled by FT-IR: A Snapshot in Time

Ryan Roberts University of Nebraska-Lincoln, [email protected]

Follow this and additional works at: https://digitalcommons.unl.edu/biochemdiss

Part of the Biochemistry Commons

Roberts, Ryan, "The In Situ Function of a Microbial Community Profiled by FT-IR: A Snapshot in Time" (2010). Theses and Dissertations in Biochemistry. 5. https://digitalcommons.unl.edu/biochemdiss/5

This Article is brought to you for free and open access by the Biochemistry, Department of at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Theses and Dissertations in Biochemistry by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln.

THE IN SITU FUNCTION OF A MICROBIAL COMMUNITY PROFILED BY FT-IR:

A SNAPSHOT IN TIME

by

Ryan Roberts

A THESIS

Presented to the Faculty of

The Graduate College at the University of Nebraska

In Partial Fulfillment of Requirements

For the Degree of Master of Science

Major: Biochemistry

Under the Supervision of Professor Kenneth W. Nickerson

Lincoln, Nebraska

December, 2010

THE IN SITU FUNCTION OF A MICROBIAL COMMUNITY PROFILED BY FT-IR: A SNAPSHOT IN TIME

Ryan Roberts, M. S.

University of Nebraska, 2010

Advisor: Kenneth W. Nickerson

Photographs of an ecosystem are an important tool in macro-community ecology. The photograph is a permanent record of species phenotype. In microbiology, biochemical activity provides the most descriptive information about an organism’s phenotype. A method for fingerprinting all biochemical activities occurring within a microbial community is analogous to a photograph. Infrared spectroscopy in the region between wavelengths of 2,500 and 20,000 nm (mid-IR) is a well-established instrumental method for fingerprinting the total biochemical profile of axenic cultures. Spectra are complex and sensitive, as demonstrated by the ability to discriminate between strains of bacteria, fungi, and algae. This thesis develops the method of applying mid-IR spectroscopy by attenuated total reflectance to attain a biochemical fingerprint. Chapter 2 establishes methods for obtaining mid-IR spectra by prefiltering and concentrating lake water onto a 0.2 µm filter membrane. To optimize signal, a combined spectrum from PVDF and nylon membranes is used following a 20-25 µm prefilter. Chapter 3 verifies that the established method is capable of discriminating the biochemical fingerprints of six interdunal lakes from Western Nebraska with differing chemistries and community structures. Mid-IR spectra provide functional data but do not identically mirror community structure as determined from 16S rRNA data.

Now, a record of biochemical activity within a microbial community can be captured for future comparison. This technique is culture-independent and provides functional, in situ information.

In the future, more information can be deduced by comparing different functional communities to one another, assigning spectral information to particular biochemical activities in a community.

Acknowledgments

I would like to thank Dr. Kenneth Nickerson and Dr. Bradley Plantz for accepting me into their lab. Dr. Nickerson’s fascination with military history has always kept my interest and reminds me of where I came from and how I found myself in a biochemistry/biological sciences lab at the University of Nebraska-Lincoln. His knowledge of biochemistry and biological sciences has been the bedrock from which all of our studies have extended. Dr. Plantz’s commitment to seeing our work come to fruition, and teaching me along the way, has been the key to my development. Without it, I never would have arrived at this point.

I would also like to thank Dr. Donald Becker, who was the Graduate Committee Chairperson when I arrived and walked me through the entire application process to achieve acceptance from the University of Nebraska – Lincoln graduate studies.

I am grateful to Dr. Julie Shaffer and Betty Jo Jacques for their assistance in the Sandhills Region measuring lake water chemistries, their knowledge of the microbiology of the lakes, and their support in developing the sampling plan.

The entire lab has been very helpful throughout the process. We continue to have an attitude of partnership and teamwork, where we all “cooperate to graduate.” Thank you, Suman Gosh, Swetha Tati, and Sahar Hasim. I am grateful for John Ferguson’s assistance sampling lakes, being a common sense check, and potentially extending mid-IR into algal classification.

Also, a special thanks to Karolyn Fox, an undergraduate student, for working with the FT-IR and developing the work into classification of bacteria by attenuated total reflectance.

I am also grateful to Dr. Charles Kuszynski and Danielle Shea for their expertise on flow cytometry and for continually struggling with an idea and a problem that remains a problem to this day. Dr. Javier Seravalli assisted me greatly in understanding the ICP-MS ion data we acquired from the lakes. I am indebted to Drs. Jens Walter and Andrew Benson and everyone from CAGE for their assistance with DGGE and 454 pyrosequencing, as well as Inez Martinez for her assistance in a number of DGGE experiments and data analyses. I am also grateful for the DGGE done at Dr. Russell Hill’s Lab in the Center of Marine Biology (COMB) in Baltimore, Maryland.

Drs. Ed Cahoon and Rhae Drijber were invaluable in their knowledge of lipids and GC-MS analysis of FAMEs. Also, I would like to thank Dr. David Gosselin for his expertise on lake geochemistry and for explaining to me in a number of meetings the concepts and measuring dilemmas of water chemistry. Dr. Han Chen, from the Core Microscopy Facility, was very helpful in explaining the SEM procedures and teaching me how to prepare samples and examine them in the instrument.

And finally, I thank quite possibly the most important group that supported me: my loving wife, Eun Ju, and children, Amelia and Albert, who motivated and supported me day in and day out. Thank you for standing by me and allowing me to do this work.

Table of Contents

INTRODUCTION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

CHAPTER 1: Introduction and Literature Review

CHAPTER 2: Characterization of the in situ Function of a Planktonic Microbial Community by mid-IR Spectroscopy
  Abstract
  Introduction
  Methods
  Results
  Discussion
  References
  Figures

CHAPTER 3: A Biochemical Fingerprint by Mid-IR Spectroscopy of the Microbial Community from Six Very Different, Nebraska Sandhills’ Lakes
  Abstract
  Introduction
  Methods
  Results
  Discussion
  References
  Figures

Conclusions and Future Directions

Appendix 1: Data from the Alkaline Lakes, 2009
Appendix 2: Flow cytometry
Appendix 3: GC-FAME
Appendix 4: SEMs of capture membranes, 2010
Appendix 5: Alkaline Lakes Isolate Database
Appendix 6: Continuity and Lessons Learned

List of Figures

Chapter 1.

Figure 1. Halophilic and alkalophilic adaptations
Figure 2. Eco-systems biology
Figure 3. Infrared correlation chart
Figure 4. Mid-IR spectrum of biological sample

Chapter 2.

Figure 1. Merging of PVDF and Nylon spectra
Figure 2. SEMs of different capture membranes
Figure 3. Eukaryotic cell count by SEM
Figure 4. DGGE
Figure 5. Lipid analysis by thin layer chromatography
Figure 6. Column chart of signal-to-noise ratio by prefilter
Figure 7. Column chart of mean D-values within replicates
Figure 8. Optimized cluster analysis of prefilter mid-IR spectra
Figure 9. Cluster analysis of perturbed lake water
Figure 10. First derivative spectra by window
Figure 11. Mid-IR spectra of different filter chemistries
Figure 12. Diagram of differing evanescent wave penetration

Chapter 3.

Figure 1. Cluster analysis by lake physico-chemical data
Figure 2. Pie charts of microbial community structure by phyla
Figure 3. Pie charts of Tree Claim Lake structure by phyla
Figure 4. Cluster analysis by pyrosequencing data classified to genus
Figure 5. Average mid-IR spectra of each lake
Figure 6. Cluster analysis of lakes by window of information
Figure 7. First derivative of average spectra by window of information
Figure 8. Optimized cluster analysis
Figure 9. A concept of potential difference in community structure and function

Appendices.

Figure A1-1. Average spectra of ten lakes from Western Nebraska, 2009
Figure A1-2. SEMs of prefilter and capture membrane compared to spectra
Figure A2-1. Histograms of prefiltered lake water by flow cytometry
Figure A2-2. Histograms of stained lake water by flow cytometry
Figure A3-1. A chromatogram of FAMEs from prefiltered lake water
Figure A3-2. A list of FAMEs from GC-MS data of prefiltered lake water
Figure A6-1. A charge balance worksheet

List of Tables

Chapter 1.

Table 1. Lipid biomarkers of soil microbiology
Table 2. A list of mid-IR classification experiments

Chapter 2.

Chapter 3.

Table 1. A list of lake physico-chemical properties
Table 2. A list of genera by lake from pyrosequencing data

Abbreviations

2DE 2-dimensional electrophoresis

ABE Acetone Butanol Ethanol

AM Fungi Arbuscular mycorrhiza fungi

ANN Artificial Neural Networks

ANOVA Analysis of Variance

ARISA Automated ribosomal intergenic spacer analysis

ATP Adenosine triphosphate

ATR Attenuated Total Reflectance

BLAST Basic Local Alignment Search Tool

C14 Carbon-14

CAGE Core for Applied Genomics and Ecology

cDNA Complementary deoxyribonucleic acid

CE-MS Capillary Electrophoresis - Mass Spectrometry

CFU Colony forming units

CLNWR Crescent Lake National Wildlife Refuge

CVA Canonical Variate Analysis

DA Discriminant Analysis

DGGE Denaturing Gradient Gel Electrophoresis

DNA Deoxyribonucleic acid

dNTPs Deoxynucleotide triphosphates

DRIFT Diffuse Reflectance Infrared Fourier Transform

FAME Fatty acid methyl ester

FISH Fluorescent in situ hybridization

FT-IR Fourier Transform Infrared Spectroscopy

GC-FAME Forming fatty acid methyl esters from lipids to analyze by gas chromatography

GC-MS Gas Chromatography - Mass Spectrometry

GS-FLX Genome sequencing from 454 Life Sciences, a subsidiary of Roche Ltd.

HCA Hierarchical Cluster Analysis

ICP-MS Inductively coupled plasma - mass spectrometry

IRE Internal Reflectance Element

KNN k-Nearest Neighbor

LB Luria-Bertani

LC-MS Liquid Chromatography - Mass Spectrometry

MEL Mobile Environmental Lab (from UNK)

Mid-IR mid-infrared spectroscopy

mRNA messenger ribonucleic acid

Nha Sodium (Na) Hydrogen antiporter

NMR Nuclear Magnetic Resonance

PCA Principal Component Analysis

PCR Polymerase chain reaction

pH -log[H+]

PVDF Polyvinylidene fluoride

RDP Ribosomal Database Project

rRNA Ribosomal ribonucleic acid

RT-PCR Reverse transcriptase-polymerase chain reaction

SEM Scanning Electron Microscope

SNR Signal-to-noise ratio

TAE A common buffer consisting of Tris base, acetic acid, and EDTA

Taq Type of DNA polymerase acquired from Thermus aquaticus

TLC Thin Layer Chromatography

TRFLP Terminal Restriction Fragment Length Polymorphism

UPGMA Unweighted Pair Group Method with Arithmetic Mean

W1 Window 1 - 2800-3000 cm⁻¹; Fatty Acid Window

W2 Window 2 - 1500-1800 cm⁻¹; Amide Window

W3 Window 3 - 1200-1500 cm⁻¹; Mixed Window or Phosphate Window

W4 Window 4 - 900-1200 cm⁻¹; Polysaccharide Window

W5 Window 5 - 700-900 cm⁻¹; Fingerprint Window

ZnSe Zinc Selenide (salt)
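The mid-IR range quoted in the abstract is given in wavelength (2,500 to 20,000 nm), while windows W1 through W5 above are given in wavenumber. As a minimal sketch of how the two relate, the following Python converts nm to cm⁻¹ and looks up which window a wavenumber falls in. The function names and the handling of shared window boundaries are assumptions made for illustration and are not taken from the thesis.

```python
# Spectral windows as listed in the abbreviations (bounds in cm^-1).
WINDOWS = {
    "W1": (2800.0, 3000.0),  # fatty acid window
    "W2": (1500.0, 1800.0),  # amide window
    "W3": (1200.0, 1500.0),  # mixed or phosphate window
    "W4": (900.0, 1200.0),   # polysaccharide window
    "W5": (700.0, 900.0),    # fingerprint window
}


def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1.0e7 / wavelength_nm


def window_for(wavenumber_cm):
    """Return the label of the window containing the wavenumber, or None.

    Assumption: a shared boundary (e.g. 1500 cm^-1) is assigned to the
    higher-wavenumber window, because windows are checked from W1 to W5.
    The thesis does not state how boundary values are handled.
    """
    for label, (low, high) in WINDOWS.items():
        if low <= wavenumber_cm <= high:
            return label
    return None


# The abstract's range in wavenumbers: 2,500 nm and 20,000 nm.
mid_ir_high = nm_to_wavenumber(2500)
mid_ir_low = nm_to_wavenumber(20000)
```

Under this conversion the abstract's range corresponds to 4000 to 500 cm⁻¹, so the five windows cover only part of the full mid-IR region.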

CHAPTER 1 – AN INTRODUCTION AND LITERATURE REVIEW

Introduction

The question we are trying to address is whether a method is available to capture the in situ function of a planktonic microbial community as a fingerprint in time.

This is important because what a microbial community is doing cannot be determined by watching it, as is possible for the larger animals classically studied by ecologists. To determine what a microbial community is doing, it is necessary to monitor the biochemistry of the community as a whole. The function of the community is the important component of the ecosystem: it determines how quickly carbon is turned over, how nitrogen is assimilated into the community, which metabolites are excreted into the system, and so on. The health of the ecosystem is based on the function of the microbial community. If carbon turnover is delayed, biomass builds up and a bog forms, changing the dynamics of the body of water, which can affect the water supply to cattle and waterfowl as well as recharge to groundwater.

Currently, a method to examine the function of a microbial community as a whole has not been fully developed. Meta-metabolomics is in its infancy. The ability to capture a community without a change in metabolism is elusive. Most efforts to study microbial communities have been structure-focused. We are becoming increasingly better at identifying which species are present in a microbial community, but we are limited in our ability to identify changes that affect the behavior of the community. We can see who is in the community, but not what the entire community is doing.

The unique approach presented in this thesis applies mid-IR spectroscopy to attain a biochemical fingerprint of the entire community. It is not culture-based, meaning that we do not rely on the microbes being amenable to growth on nutrient-rich media in the lab. Also, the study is not focused on a very small aspect, such as one gene, one enzyme, or one pathway, as in a fluorescent in situ hybridization (FISH) experiment that uses a probe for one or two genes to identify one function in a microbial community. Mid-IR spectroscopy gathers a fingerprint of the whole: every biochemical pathway in the community. Computer modeling methods (in silico) have been designed to take metagenomic data and predict the genes that are active. I am developing an experimental method, rather than an in silico method, to assist in answering that question.

How can we attain a picture of the whole function of a microbial community in situ? This thesis begins with a review of the environment we are sampling (the Alkaline Lakes of Western Nebraska), the unique organisms we may find in that environment, the current techniques to examine a microbial community, and the development of mid-IR spectroscopy in biology, with a particular focus on attenuated total reflectance, the technique used. Chapter 2 focuses on developing the method’s parameters to ensure optimal signal is attained. Chapter 3 verifies that a mid-IR spectroscopic technique is capable of acquiring a biochemical fingerprint with discriminative power between lake samples.

Literature Review

Lakes of the Nebraska Sandhills

Small interdunal lakes and wetlands are common throughout the Nebraska Sandhills (89). The Nebraska Sandhills Region is the largest vegetated dune field in the Western Hemisphere (134), covering 58,000 km². The interdunal lakes cover 450 km² and are part of a larger groundwater recharge area for the Ogallala or High Plains Aquifer. Lakes can be fed by groundwater discharge or by rainwater (41). Lakes can be described as recharge lakes, those that lose water to the aquifer; discharge lakes, those that gain water from the groundwater; and flow-through lakes, those that receive groundwater in one area of the lake and discharge water to the aquifer through the rest of the lake basin (42). However, flow-through lakes can have so little recharge to the groundwater that evaporation plays the major role in loss of water, so that they behave as though they are closed-basin lakes. Groundwater discharge is characterized by dune topography, the rate of recharge, and hydraulic conductivity (40). One study showed that the route infiltrating waters take to fill a lake determined the alkalinity or acidity of the lake. Those waters that travelled through deeper soils produced a more alkaline lake (17). Playa lakes, lakes that dry to nearly salt basins in the winter, have the highest solute concentrations and the largest variability in all solute concentrations (41). The Sandhills region has a higher estimated evapotranspiration rate than precipitation rate, yet maintains wetlands, meadows, and lake systems (43). The strongly alkaline lakes reside in Sheridan, Garden, and Morrill counties (89). Garden County has approximately 174 lakes that cover 5,368 ha (89). Lakes in close proximity have wide differences in biology and chemistry (89). Water chemistry is determined by rainwater, evapotranspiration, and biological community (11). Anions present tend to be (in descending order) bicarbonate, carbonate, sulfate, and chloride; cations present tend to be (in descending order) sodium, potassium, magnesium, and calcium (11).

Sand dunes most likely settled in the area 12,000 to 6,000 years ago, blocking streams, which caused the Ogallala aquifer to rise (83). There is evidence that the sand dunes were active until 1000 to 700 years ago, coinciding with a great drought in the region (88). The vegetation maintained by this water level keeps the sand dunes in place, thus keeping the aquifer full (83).

The biology of the interdunal lakes has never been fully examined down to the prokaryotes that inhabit the waters. Lakes are well circulated, have many sunny days (with sunlight that reaches the bottom, where there is an important benthic population), and have an ample supply of allochthonous minerals and organic matter (89) that play an important role in the biology of the lakes.

Waterfowl may carry many different species of algae, and one would assume bacteria as well, to the lakes (89). One thorough study of Border Lake (57) found the dominant algal species to be Nitzschia spp., Oocystis spp., Pseudotetrahedron spp., and the blue-green filamentous algae (Cyanobacteria) Oscillatoriaceae. Additional species classified included brine shrimp (Artemia salina), rotifers (Hexarthra jenkinae and Brachionus plicatilis), and the brine fly (Ephydra hians). Birds were primarily avocets, which migrate through the Sandhills and feed on brine flies and brine shrimp. Bacterial production measured by counts of bacteria was highest in December, whereas acetate consumption (C14) was lowest in December and highest in October (57). Values of productivity measured by Hiskey (57) place Border Lake among the most productive natural systems ever described. In a two-year study of Border Lake, alkalinity changes brought about a change in the biological community. Brine shrimp outnumbered brine flies at high alkalinity. At lower alkalinity the brine flies outnumbered the brine shrimp (57).

The lakes of the Nebraska Sandhills will continue to be an important ecosystem for economic viability. During World War I, potash (K2O) was a several-million-dollar economy (1918 values). The lakes provided 40% of the nation’s potash during World War I (18). This area is also the second most productive waterfowl region in the U.S. (74). The wet meadows of the Sandhills provide hay for the industry which produces a third of Nebraska’s beef cattle (41).

The lakes in this region are interesting due to the harsh conditions they present to the microbial flora that inhabits the ecosystem. Lakes found near one another can have vastly different chemistries, from fresh to brine and from high alkalinity/pH to moderate pH (40). In the summer, lakes tend to have depths of water from less than a foot to 4 feet, depending on the rainfall. In the winter, some lakes can dry to nothing more than salt basins. A microbial community that can withstand changes of this nature over time must be able to cope with salt concentration changes as well as pH changes. Due to the strong selectivity of the lake, biological diversity may be low (57). These characteristics make the microbial community of the lakes a model system for examining changes in the microbial community over time and for examining function, and they give the lakes potential for harboring novel species.

Halophiles and Alkaliphiles

The harsh conditions of the alkaline lakes exert stress on the microbes present, selecting for alkaliphiles and halophiles, which have adapted unique characteristics to overcome the conditions. ‘Extremophiles’ are significant ecologically and economically. Four percent of all DNA sequencing is done on genes of extremophiles for industrial applications (29). Horikoshi defined an alkaliphile as an organism that grows well at pH 9 but cannot grow, or grows slowly, at neutral pH (60), whereas an alkalitolerant organism may survive at pH 9 or higher but grows optimally at lower pH. Halophiles may have optimum growth at salt concentrations from 0.2 to >3.0 M, whereas a halotolerant organism may grow at salt concentrations greater than 1.0 M but has an optimum growth range below 0.2 M salt (44).
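The definitions above can be restated as simple decision rules. The following Python is only an illustrative sketch: the function names, the argument choices, and the "neither" label are assumptions introduced here, and the thresholds (0.2 M, 1.0 M, pH 9) are those quoted in the text from references 44 and 60.

```python
def classify_salt_response(optimum_molar, max_tolerated_molar):
    """Classify an organism by its response to salt.

    optimum_molar: salt concentration (M) giving optimum growth.
    max_tolerated_molar: highest salt concentration (M) at which it grows.
    """
    if optimum_molar >= 0.2:
        # Optimum growth at 0.2 M salt or higher.
        return "halophile"
    if max_tolerated_molar > 1.0:
        # Grows above 1.0 M salt, but the optimum is below 0.2 M.
        return "halotolerant"
    return "neither"  # label assumed for this sketch


def classify_ph_response(grows_well_at_ph9, grows_well_at_neutral, survives_at_ph9):
    """Classify an organism by its response to alkaline pH.

    Assumption: an organism that grows well at both pH 9 and neutral pH is
    reported as alkalitolerant; the text does not address that case directly.
    """
    if grows_well_at_ph9 and not grows_well_at_neutral:
        return "alkaliphile"
    if survives_at_ph9:
        return "alkalitolerant"
    return "neither"  # label assumed for this sketch
```

For example, an organism with optimum growth at 0.1 M salt that still grows at 1.5 M would be labeled halotolerant by these rules.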

The challenge to growth at high salt concentrations involves overcoming the osmotic pressure. The common adaptation is to accumulate small molecules inside the cell to balance the difference in osmolarity. These small molecules are typically called osmolytes or compatible solutes (111). Halobacteria common to the Dead Sea, which are actually halophilic archaea belonging to the order Halobacteriales, accumulate potassium (K+) ions in their cytoplasm to overcome the osmotic pressure (90). Halotolerant bacteria and algae accumulate small organic molecules in their cytoplasm to overcome the external osmotic pressure (46). Because the interior concentration of salts is so high in halobacteria, their cytoplasmic proteins must be functional at high salt concentrations, and they may be hard to study in the laboratory because they tend to unfold at lower salt concentrations (46). Dunaliella (algae) exist at the northern end of the Dead Sea, where salt concentrations are slightly lower, and they overcome osmotic pressure by synthesizing glycerol. At higher salt concentrations, prokaryotes dominate and the primary producers are Cyanobacteria (44, 125).

Most alkaliphiles, on the other hand, take a different approach to adaptation, importing hydrogen (H+) ions into the cytoplasm to lower the pH inside the cell, bringing the pH closer to neutral. There are a number of known integral membrane protein antiporters (Figure 1) that exchange Na+ or K+ for H+ (72). E. coli has only one gene for the NhaA antiporter (sodium hydrogen antiporter), whereas alkaliphilic Bacillus clausii has 11 antiporters (72). Alkaliphiles maintain their cytoplasm approximately 1 pH unit lower than their surroundings, even at pH levels higher than 10.5 (63). This pH gradient opposes the energetically favorable electrochemical gradient required for oxidative phosphorylation. However, alkaliphiles continue oxidative phosphorylation to produce ATP in these conditions. Several theories have been proposed, including the tunneling of H+ ions pumped out of the cytoplasm through the exterior end of membrane proteins to the ATP synthase (63). In support of this theory, common features of the alkaliphilic ATP synthase have proven to be necessary for oxidative phosphorylation, including an AXAXAXA protein sequence on the N-terminal end of each c-subunit of the F0 complex and a P51XXEXXP57 sequence on the C-terminal helix-2 of the c-subunit (82) (Figure 1). These structural features of the ATP synthase may increase the affinity for H+ ions and allow ATP synthase to pick up H+ ions immediately after they leave proteins of the electron transport system.
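To illustrate why a cytoplasm more acidic than its surroundings works against oxidative phosphorylation, the sketch below uses the standard chemiosmotic relation Δp = Δψ − (2.303RT/F)·ΔpH, with ΔpH = pH(in) − pH(out). This equation is not given in the thesis and is assumed here; the membrane potential of −180 mV and the pH values are illustrative numbers, not measurements from the cited studies.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def proton_motive_force_mv(delta_psi_mv, ph_in, ph_out, temp_k=298.15):
    """Proton motive force in mV (negative values drive H+ into the cell).

    delta_psi_mv: membrane potential in mV (inside relative to outside).
    ph_in, ph_out: cytoplasmic and external pH.
    """
    # 2.303*R*T/F, roughly 59 mV per pH unit at 25 degrees C.
    z_mv = 1000.0 * math.log(10) * R * temp_k / F
    delta_ph = ph_in - ph_out
    return delta_psi_mv - z_mv * delta_ph


# Illustrative comparison (assumed values): same membrane potential,
# a neutrophile with an alkaline-inside gradient versus an alkaliphile
# whose cytoplasm is 1 pH unit below its surroundings.
neutrophile = proton_motive_force_mv(-180.0, ph_in=7.5, ph_out=7.0)
alkaliphile = proton_motive_force_mv(-180.0, ph_in=9.5, ph_out=10.5)
```

With these assumed inputs the alkaliphile's proton motive force is smaller in magnitude than its membrane potential alone, because the reversed pH gradient subtracts about 59 mV per pH unit, whereas the neutrophile's gradient adds to it.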

Microbial Community Analysis

Methods to study a microbial community, not only its structure but its function as well, are improving every day. A review of the literature reveals what methods are commonly used and where more improvement is needed. Since Antonie van Leeuwenhoek first looked under a microscope at bacteria scraped from his teeth, microbiologists have been interested in microbial communities, asking not only “What organisms are there?” but also “What are they doing?” A lot has happened since 1683, and this review will begin at the onset of modern gene studies into microbial communities. The following methods for studying microbial communities are arranged by their systems biology (Figure 2).

Through the 20th century, culturing bacteria has been an important part of microbiology, thanks in large part to Robert Koch’s postulates. Despite the emphasis on cultured bacteria, it was widely known that only a very small percentage of viable bacteria in the environment was cultivable. “The Great Plate Count Anomaly” describes the observation that many more cells were identified by microscopy than could be counted on agar plates as colony forming units (123). Inarguably, one of the largest, most important advances in studying uncultured organisms has been in the field of gene studies. Examination of the thermophiles in the hot springs of Yellowstone National Park serves as an example where “The Great Plate Count Anomaly” was an obstacle before gene studies began and can be followed in later gene studies. The thermophiles found in the hot springs could not be cultured (13). In order to count bacteria, Thomas Brock and colleagues would place microscope slides into the hot springs and allow the microbes to attach to the slides; then, they would count them directly from the slides.

Gene studies. Later, after the introduction of 16S rRNA sequences as a phylogenetic marker of prokaryotes (32, 136), Norman Pace used a similar method to characterize the community of a Yellowstone National Park hot spring by 5S rRNA sequences (122). They were able to identify 5S rRNA similar to the bacteria Thermus spp. and the Archaea (at the time classified as archaebacteria) Thermoproteales and Sulfolobus spp. Now, not only was a general count of cells possible, but cells could be classified. Since that seminal study, where oligonucleotides were used as biomarkers to describe the structure of a microbial community, gene studies of microbial communities have continued and improved.

A large amount of biomass was required in the early gene studies. Attempts to improve the signal of nucleotides in a sample continued, from using reverse transcriptase to characterize all the RNA bound in ribosomes (77) to the now popular amplification of DNA by polymerase chain reaction (112). Now, the most popular methods of examining the structure of a microbial community include examining the sequences of amplified 16S rRNA genes from DNA extracts and microscopy by fluorescent in situ hybridization (FISH) (3, 4, 23). Or, in place of sequencing the nucleotides, which can be expensive, Muyzer and colleagues (91) developed a simple procedure to separate oligonucleotides from different species by denaturing gradient gel electrophoresis (DGGE). The banding pattern can then be used to compare one community to another or to compare different treatments.

Methods to examine the function of a microbial community by gene studies were also developed. FISH can also be used to identify function by binding to a specific gene that encodes an enzyme necessary for a specific function of the community (56). Genomic functional analysis (49, 124) also allows the function of a non-cultivable community to be further elucidated by cloning DNA from the community into vectors which are then harbored by E. coli cells. Genes can then be selected for by growing the E. coli in specific media, say media deprived of a certain vitamin. One could also use an E. coli mutant deficient in a galactose transporter to select for a transporter gene by growing it in a defined medium with galactose as the only source of energy; a growing colony would indicate that the community analyzed had a transporter for galactose.

DNA-based methods can be delineated as gene studies, which look only at 16S rRNA sequences, or genomic studies, where the entire genome of an isolated culture is examined.

Beyond the isolated organism, we are interested in the community. Jo Handelsman first coined the term “metagenomics” to characterize a study of the entire genome of a community that behaves as a meta-organism (50). Now, systems biology that examines a community could be named, “eco-systems biology” (108). The next sections of this review will cover eco-systems biology by meta-transcriptomics, meta-proteomics, and meta-metabolomics.

Transcriptomic studies. The first publication to look at the collective transcriptome of planktonic communities was by Mary Ann Moran and colleagues (107). The method used involved total RNA extracts, subtractive hybridization to remove rRNA, and reverse transcription (RT)-PCR. The cDNA was then sequenced and analyzed in BLASTX and BLASTN formats. The primers used in this study focused on a Shine-Dalgarno sequence to amplify bacterial mRNA. Over 80,000 unique transcripts were identified by this method. Some of the genes include soxA, a gene for inorganic sulfur oxidation; trkA, a gene for potassium uptake; psbA2, a gene for photosystem II; and proV, a gene for proline/glycine betaine/dimethylsulfoniopropionate transport. The presence of compatible solute transporters is not surprising considering the sample came from a closed-basin, hypersaline lake near Lake Tahoe, CA. One of the major challenges to transcriptomics identified by Moran’s group is that mRNA, especially bacterial mRNA, degrades rapidly. To overcome this problem, they tried to rapidly extract the mRNA but admit that they may have inadvertently emphasized the mRNA transcripts that are more stable.

In a meta-transcriptomic study of the Tinto River, Victor Parro et al. sequenced cDNA of transcripts by using a microarray preferential for genes of L. ferrooxidans, in which they were most interested (104). By using a microarray, many transcripts can be sequenced; however, the microarray primers restrict and focus the types of transcripts that can be sequenced. Although the microarray is preferential for genes of L. ferrooxidans, there is no way to eliminate the possibility that a transcript came from another organism, but the authors believed that the community of genes (the genes of the meta-organism) is what is really important.

In a third type of meta-transcriptomic study, Gilbert and colleagues used the latest pyrosequencing GS-FLX technology (from 454 Life Sciences, now owned by Roche). After enriching for mRNA over rRNA, the mRNA was applied to a reverse transcription reaction (35). The authors looked at changes in transcription due to acidification in an ocean environment. The most abundant transcripts found were housekeeping genes, stress-induced chaperonin proteins, ABC transporters, and genes related to ATPase activity and AMP-binding, as well as RNA polymerases. Of the three methods of transcriptomics reviewed here, each has its advantages and disadvantages, and no method is clearly superior to another.

Proteomic studies. Beyond the transcriptome there are many regulatory factors that determine if translation will occur. To define the function of the community more accurately, a look at the metaproteome is necessary. The term “metaproteome” was first proposed by Francisco Rodriguez-Valera (110). In 2004, Wilmes and Bond did the first metaproteome study (135) of an activated sludge system optimized for biological phosphorus removal. In this publication, the authors developed a popular method of analyzing the metaproteome by 2-Dimensional electrophoresis (2DE) analysis and mass spectrometry. The concept of separating proteins first by isoelectric point, then by molecular weight was established in 1975 by Patrick O'Farrell (102) and further evaluated in a yeast model to ensure each protein is resolved (47). Wilmes and Bond were able to identify 728 spots on their gel and compared the images to one another by a software program called PROTEOMWEAVER. Nine spots were further analyzed by mass spectrometry: outer membrane protein porins, acetyl-CoA acetyltransferase, and ABC-type branched-chain transport system proteins were identified based on database searches of the protein sequences.

2DE is still the most popular method to analyze a metaproteome. Another emerging method is separation of the proteome by liquid chromatography. To resolve each protein, more than one separation is typically necessary, meaning more than one type of column is used (19, 67). Other metaproteome studies have looked at the change in the proteome when the system is insulted by a pollutant (cadmium) (75). The metaproteome of an uncultivable symbiont in a deep-sea tube worm has been examined (87), and the metaproteome of microbes in the human infant gastrointestinal tract has been elucidated (71).

Metabolomics. The youngest eco-systems biology approach is meta-metabolomics. This section will focus on the commonly used approaches in metabolomics. Separation techniques in tandem with mass spectrometry, NMR, and mid-IR spectroscopy are common in metabolomic studies.

Gas chromatography and mass spectrometry (GC-MS) is often used to analyze the metabolites of a cell, primarily in plant studies (34, 61). The primary limitation of GC-MS is that the analytes studied must be volatile. In a similar fashion, the fatty acid portion of cellular lipids (a product of metabolism not often lumped in with the low-molecular weight metabolites commonly referred to as comprising the metabolome) from a cell or community can be analyzed after saponification of the lipids to form fatty acid methyl esters (FAMEs), which can be studied by gas chromatography (25, 33). If an axenic bacterial or fungal culture is grown according to a specific protocol, the species can be verified by its lipid profile. From a soil community, certain types of bacteria or fungi can be identified based on the lipid biomarkers identified by GC-FAME (Table 1).

Liquid chromatography is also a common separation technique used in tandem with mass spectrometry to profile cellular metabolites (34, 85, 133). In one LC-MS study, 93% of all commercially available metabolites from the in silico metabolomes of E. coli and were identified (119). In silico means the metabolome was elucidated by a computer model of the known genome; in silico studies by computer modeling of metabolomes are an entire technique by themselves (108). In another study, the metabolic fingerprints of E. coli and Saccharomyces cerevisiae were attained by LC-MS under normal growth conditions and under a starvation stress response (16).

Capillary electrophoresis (CE) is also a common separation technique used in tandem with mass spectrometry for metabolomic studies (34). In one CE-MS study, 375 charged hydrophilic intermediates of E. coli were identified and 198 were quantified (103).

NMR is also commonly used in metabolomic studies (34, 131, 133). In one study of Aspergillus nidulans, the metabolomic profiles under normal conditions and in response to a chemical lead, 8-azaxanthine, as part of a drug discovery program, were compared in wild type and mutant strains (31). Nine metabolites were identified by NMR including: adenine, ATP, uridine, and cytosine. NMR spectra and comparisons by principal component analysis were able to identify the effects of a drug that influences the purine degradation pathway and the pyrimidine biosynthetic pathway as compared to mutants that were defective in these pathways.

Mid-IR spectroscopy, the final technique covered in this metabolomics section, is the most common method of attaining a spectral fingerprint from biological samples (85). This method captures a biochemical profile of cells including the entire metabolic pool. The following section takes a closer look at mid-IR spectroscopy as it applies to biological studies.

Mid-IR Spectroscopy

A convention of describing infrared spectroscopy as Fourier Transform Infrared (FT-IR) is very common. However, many other instruments today use the Fourier Transform to take a variety of data and manipulate them into simple sine and cosine components to derive useful data: X-ray crystallography, NMR, and some mass spectrometers. Thus, because all modern infrared spectrometers are of the Fourier Transform type, there is no longer a need to refer to the technique as FT-IR (45); therefore, I will refer to the technique described herein as mid-IR spectroscopy and the instrument as a Fourier Transform Infrared Spectrometer (FT-IR) throughout this review.

The mid-infrared region of the electromagnetic spectrum ranges from wavelengths of 2500 nm to 20,000 nm. In inverse wavelengths (wavenumbers) this range is 4000 cm-1 to 500 cm-1. Covalent bonds absorb at a specific wavenumber in the infrared in a quantized manner, become excited, and relax back to the ground state by vibrations in the bond: symmetrical and asymmetrical stretching, rocking, wagging, twisting, and scissoring. Analytical chemists have been able to use this fact to identify an unknown analyte in a known solvent for many years. A correlation chart (Figure 3) can be used to match absorbances in a spectrum with functional groups on a molecule and so identify characteristics of an unknown sample.
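The wavelength and wavenumber limits quoted above are related by a simple reciprocal, which the following minimal Python sketch illustrates. The function name is hypothetical and chosen only for this example; the only fact used is that 1 cm equals 10^7 nm.

```python
def wavelength_nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nanometres to a wavenumber in cm^-1.

    Illustrative helper (the name is an assumption, not from the thesis).
    1 cm = 1e7 nm, so wavenumber [cm^-1] = 1e7 / wavelength [nm].
    """
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1e7 / wavelength_nm


# The two limits of the mid-IR region as quoted in the text.
short_limit = wavelength_nm_to_wavenumber(2500)    # 4000 cm^-1
long_limit = wavelength_nm_to_wavenumber(20000)    # 500 cm^-1
```

Because the relation is reciprocal, the short-wavelength limit corresponds to the high-wavenumber end of the spectrum, which is why mid-IR spectra are conventionally quoted from 4000 cm-1 down to 500 cm-1.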

Biologists have been interested in the spectrum of a cell since the 1950s (99), but the complexity of the spectra hindered useful application of mid-IR spectroscopy with whole cell analysis. Dieter Naumann's lab at the Robert Koch Institute in Berlin published the seminal papers that applied the advances in mid-IR spectroscopy to overcome the spectral complexities – better FT-IR instruments, digital data storage, more powerful computers, and software programming capable of statistical analysis of large data sets with multiple variables (53, 54, 55, 59, 92, 93, 94, 95). Naumann's group defined the five windows of information from biological samples within the mid-IR range (Figure 4). In addition, Naumann's work verified that mid-IR spectroscopy could be used as a tool to identify and classify bacteria based on the whole-cell, biochemical profile obtained by the reproducible mid-IR spectrum of a strain of bacteria. Typically, spectra recorded digitally on a computer can be compared to one another by a process Naumann's group established: calculating the non-overlapping area between two spectra to determine the spectral distance, or D-value. Since this work, many other types of bacteria, fungi, and algae have been classified by mid-IR spectroscopy.
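The idea of a spectral distance based on the non-overlapping area between two spectra can be sketched in a few lines of Python. This is only an illustration of the description given above, not the published procedure of Naumann's group: the unit-area normalization, the trapezoidal integration, and all function names are assumptions made for this example.

```python
def trapezoid_area(x, y):
    """Area under y sampled at x by the trapezoidal rule (x may ascend or descend)."""
    total = sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
                for i in range(len(x) - 1))
    return abs(total)


def spectral_distance(wavenumbers, spec_a, spec_b):
    """Non-overlapping area between two spectra on a shared wavenumber grid.

    Assumption: each spectrum is first scaled to unit area so that overall
    intensity (e.g. sample thickness) does not dominate the comparison.
    """
    if not (len(wavenumbers) == len(spec_a) == len(spec_b)):
        raise ValueError("spectra must share the same wavenumber grid")
    area_a = trapezoid_area(wavenumbers, spec_a)
    area_b = trapezoid_area(wavenumbers, spec_b)
    if area_a == 0 or area_b == 0:
        raise ValueError("cannot normalize a spectrum with zero area")
    diff = [abs(a / area_a - b / area_b) for a, b in zip(spec_a, spec_b)]
    return trapezoid_area(wavenumbers, diff)


# Toy absorbance values on a made-up grid, for illustration only.
grid = [0, 1, 2, 3, 4, 5, 6]
peak_low = [0, 1, 0, 0, 0, 0, 0]
peak_high = [0, 0, 0, 0, 0, 1, 0]
same = spectral_distance(grid, peak_low, peak_low)       # identical spectra
apart = spectral_distance(grid, peak_low, peak_high)     # no overlap at all
```

Under these assumptions the distance is zero for identical spectra and reaches its maximum of 2 when two unit-area spectra do not overlap at all, so smaller values indicate more similar biochemical fingerprints.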

Concurrently, Nichols et al., from Florida State University, did work on mid-IR spectroscopy of whole cells in 1985 that supports Naumann's work (98). Nichols proposed that mid-IR spectroscopy could be used to monitor a microbial community, identifying the production of gum-arabic (1150 cm-1) or poly-β-hydroxybutyrate (1740 cm-1) in pure cultures of E. coli and B. subtilis, respectively. Nichols also examined biofilms as they accumulated in a flow-through cell monitored by ATR.

Mariey et al. (86) already compiled a very useful table of experiments involving classification of different microorganisms (Table 2). The table outlines the technique as either: transmission mode (Trans.), attenuated total reflectance (ATR), Diffuse Reflectance (Diff. refl.), or FT-IR microspectroscopy (Microsp.), where an infrared microscope is used to focus the IR beam on a very small spot of a particular sample. The table illustrates that, through the 1990s, there were a number of publications using mid-IR spectroscopy to classify bacteria from a wide variety of genera as well as some publications that classify yeasts and algae by their biochemical fingerprints. In addition, a number of different statistical analysis approaches are used with mid-IR spectra and are listed in the table: Discriminant Analysis (DA), Hierarchical cluster analysis (HCA), Canonical Variate Analysis (CVA), Principal Component Analysis (PCA), Artificial Neural Network (ANN), and k-nearest neighbor (KNN).

Since the 1990s, literature on classification has continued. Oberreuter et al. classified 31 strains of Brevibacterium linens, 27 Corynebacterium glutamicum, and 29 Rhodococcus erythropolis (101). They found mid-IR spectroscopy was capable of identifying down to the species level, but not necessarily the strain level. They also found no correlation between 16S rDNA sequence similarities and spectral distance. Because 16S rRNA sequencing is the gold standard of phylogenetic classification, Oberreuter and colleagues concluded that mid-IR spectroscopy is not a good phylogenetic classification tool. Then, Kirschner et al. examined 18 Enterococci strains of six different species and found that mid-IR spectra were better suited to distinguishing strains than the most used phenotypic tests at the time (70). Similarly, Tindall et al. found mid-IR spectroscopy capable of discriminating strains from different phyla isolated from a mat in Antarctica, whereas both FAME and 16S rDNA sequencing with BLAST analysis were unable to distinguish these strains (129). Fischer et al. extended classification by mid-IR spectroscopy to the fungi, examining the possibility of testing known fungal pathogens isolated from a hospital and examining the conidiospores, rather than yeast cells, as the conidiospore is the common form of the pathogen which infects a host at the hospital (30). Fischer's group was able to develop the optimal parameters for the method and classify nine different types of Aspergillus and Penicillium species.

Staphylococcus aureus small colony variants (10); E. coli, Staphylococcus aureus, and Candida albicans by DRIFT (26); and Listeria spp. by macro- and micro-spectroscopy (109) have also been classified by mid-IR spectroscopy. Allwood et al. were able to discriminate between different mutants of Pseudomonas syringae by mid-IR spectroscopy of the metabolomic footprint (2). Domenighini and colleagues extended mid-IR spectroscopy classification to algae, properly discriminating 12 green algae and 2 Cyanobacteria, using mid-IR microspectroscopy (24).

Extracts have also been examined. Naumann et al. examined extracts from five different bacteria in order to determine the 3-Dimensional structure of peptidoglycan (96). Hedrick et al. examined lipid extracts from five Archaea and four bacteria, determining that the two could easily be distinguished by mid-IR DRIFT of total lipid extracts (52). Bacterial lipids have absorbance in the 3000-2800 cm-1 region as expected, as well as at 1740 cm-1 due to ester linkages as in phospholipids, diacylglycerides, and triacylglycerides. However, Archaea lack the high intensity absorbance at 1740 cm-1 because their phospholipids are ether linked. Ananthi et al. examined extracts of polysaccharides from a brown alga, determining the functional groups present on the polysaccharides, such as uronic acid (1609-1420 cm-1), sulfate groups (1240-1260 and 920-836 cm-1), and acidic groups (1139-1037 cm-1) (7).

Classification of whole cells by mid-IR spectroscopy is a classification by the total biochemical fingerprint of a cell, where the growth conditions must be the same in order for the functional difference to be solely based on species differences. Other functional studies may be interested in what metabolic changes occur when growth conditions change. Vannini et al. examined the growth of Lactobacillus spp. and Yarrowia lipolytica and were able to observe the loss of glucose (a substrate for the cultures) and gain of metabolites such as lactic acid by taking samples from a culture at different times and analyzing them by ATR (132). Similarly, Schuster et al. examined Clostridium cultures during industrial acetone and butanol production. They were able to determine, by mid-IR spectroscopy, when the culture was in early solventogenic phase and late solventogenic phase, and when the culture was still in the process of producing unwanted byproducts – acetic acid and butyric acid – in the acidogenic phase (115). Sporulation was also identified by spectral features in 1378 – 1625 cm-1 late in the production process (115). The ability and ease by which mid-IR spectroscopy could be used to examine a wide variety of bioprocesses is clear at this point. Landgrebe et al. (76) list the attributes of an FT-IR instrument which make it the obvious choice for bioprocess examination: non-invasive, no time delay due to sensor response times, the sensor has no influence on the bioprocess itself, and many analytes can be measured at once. Indeed, lipid production by algae has been examined by Dean et al. (22) by mid-IR spectroscopy, which supports mid-IR spectroscopy as an outstanding tool to monitor algal biodiesel production.

Other functional examinations by mid-IR include: analyzing chicken breast spoilage, including accumulation of bacteria on chicken breast cells (27); analyzing structural changes in C. albicans brought about by a change in the growth conditions (1); identification of dipicolinic acid as a biomarker for sporulation in Bacillus (39); examining the change in a tomato cell due to extreme changes in salt concentrations (65); and analyzing the change in the metabolome of a plant cell culture when infected by Pseudomonas syringae (2).

Other mid-IR spectroscopy experiments include: testing for drug interactions, such as those by Kim and colleagues (68) which used mid-IR to determine what changes were occurring in cervical carcinoma cells when introduced to a drug, lopinavir. The resulting spectra showed changes in the protein, nucleotide, and carbohydrate regions. Also, Obata et al. used mid-IR spectroscopy to determine the effect of a drug on stratum corneum to test the ability of the drug to penetrate the skin (100). Mid-IR spectroscopy determined that L-menthol disordered the structure of intercellular lipids in the stratum corneum much the same as heat treatments did.

In another unique application of mid-IR spectroscopy, Thomzig et al. obtained structural information on prion proteins, using 1616-1640 cm-1 as an indication of β-sheet structures and 1656-1659 cm-1 as α-helices (126). The PrP27-30 prion proteins yielded different spectra based on the different 'strains' that cause scrapie, bovine spongiform encephalopathy (BSE), or Creutzfeldt-Jakob disease.

Attenuated total reflectance (ATR). Several publications listed above used a particular method of acquiring a mid-IR signal called attenuated total reflectance (ATR), which I will now describe further. Harrick (51) first published the technique and described it as spectral analysis of totally internally reflected radiation. Fahrenfort (28) published separately on the same method the next year, and both are credited with discovering ATR. Total internal reflectance is common today in fiber optics. The name comes from the fact that as light travels through a fiber, it is reflected from the inside edge of the medium it is travelling through, and, as long as the angle of incidence (measured from the surface normal) exceeds the critical angle, the total energy is reflected internally. Despite the total energy being reflected, there is a portion of the beam that penetrates into the outer medium. This portion carries no energy, but maintains the same wavelength and is called the evanescent wave. According to Maxwell's equations, every solution describing an internal reflection includes a component wave existing on the exterior of the surface, so every photon that strikes the interior of the medium has such a component. If a substance in the outer medium were to absorb at this wavelength, a loss of energy would be observed. Based on this fact, a spectrum can be acquired when a substance is pressed against a crystal through which an infrared beam is passing by total internal reflectance. How deep the evanescent wave penetrates into the outer medium is dependent upon the angle at which it strikes, the wavelength, and the indices of refraction of both the crystal and the outer medium:

$$d_p = \frac{\lambda}{2\pi n_1 \sqrt{\sin^2\theta - (n_2/n_1)^2}}$$

where dp is the penetration depth, λ is the wavelength of the light, θ is the angle of incidence of the IR beam, n1 is the index of refraction of the crystal, and n2 is the index of refraction of the outer medium (106).
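To give a feel for the magnitudes involved, the sketch below evaluates the penetration depth in Python, assuming the commonly cited form dp = λ / (2π n1 sqrt(sin²θ − (n2/n1)²)). The function name and the example values (a crystal with n1 = 2.4, a sample with n2 = 1.5, a 45° angle of incidence, and a 10 µm wavelength) are illustrative assumptions, not parameters taken from this thesis.

```python
import math


def penetration_depth(wavelength, theta_deg, n1, n2):
    """Evanescent-wave penetration depth, in the same units as `wavelength`.

    Assumes dp = wavelength / (2*pi*n1*sqrt(sin^2(theta) - (n2/n1)^2)),
    with theta measured from the surface normal. Hypothetical helper name.
    """
    theta = math.radians(theta_deg)
    term = math.sin(theta) ** 2 - (n2 / n1) ** 2
    if term <= 0:
        # Below the critical angle there is no total internal reflection.
        raise ValueError("angle of incidence does not exceed the critical angle")
    return wavelength / (2 * math.pi * n1 * math.sqrt(term))


# Illustrative values only (assumed, not from the thesis): wavelength in micrometres.
depth_um = penetration_depth(10.0, 45.0, 2.4, 1.5)   # roughly 2 micrometres
```

With these assumed values the evanescent wave reaches only about two micrometres into the sample, which is why good contact between the sample and the crystal matters. The depth also scales linearly with wavelength, so bands at low wavenumbers probe deeper than bands at high wavenumbers.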

In one ATR method of note, Davis et al. rinsed contaminated chicken breast with a PBS solution (21). The PBS solution was then passed through a 0.2 µm filter, concentrating the Salmonella spp. on the surface of the cellulose nitrate filter. The filter was then pressed against a multiple-bounce internal reflectance element (IRE), meaning the IR beam is reflected within the crystal several times before passing out of the IRE to the detector. The resulting spectra were able to detect low levels of Salmonella contamination and to discriminate between Salmonella enterica serovars.

Mid-IR spectroscopy to profile microbial communities. Mid-IR Spectroscopy has been applied to a microbial community (116). Scullion et al. examined earthworm feces and disturbed soil, together called earthworm casts, when the earthworms, L. terrestris and L. rubellus, are fed different leaves. Mid-IR spectroscopy was able to identify differences in the earthworm casts based on the different diets. Differences in the microbial community in the earthworm casts were also confirmed by DGGE. Scullion and colleagues did not find correspondence in similarities of the two techniques. Using the same technique, Huang et al. (61), also from the University of Wales, Aberystwyth and the University of Manchester, UK, monitored metabolic changes in artificial soil slurries due to horizontal gene transfer. An Acinetobacter mutant with a dysfunctional salicylate hydroxylase gene (salA) was introduced to a soil slurry as well as naked salA DNA to monitor the gene transfer. The change in mid-IR spectra coincided with the horizontal gene transfer event analyzed by the Green Fluorescent Protein gene added to the plasmid which contained the salA gene. Relative concentrations of salicylic acid could also be monitored at wavenumbers 1701-1708 and 1141-1164 cm-1.
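Several of the studies reviewed here interpret a spectrum by matching absorbance positions to known bands. As a rough illustration of that lookup, the Python sketch below collects band ranges quoted in this review into a small table and reports which of them contain a given wavenumber. The table is not a complete correlation chart, the labels are shorthand, and the ±10 cm-1 window placed around the single 1740 cm-1 ester value is an assumption made only so that it can be treated as a range.

```python
# (low, high, label) in cm^-1, as quoted in this review; not a full correlation chart.
QUOTED_BANDS = [
    (2800, 3000, "lipids"),
    (1730, 1750, "ester linkage"),   # quoted as 1740; the +/-10 window is an assumption
    (1420, 1609, "uronic acid"),     # quoted as a range, 1609-1420
    (1240, 1260, "sulfate groups"),
    (836, 920, "sulfate groups"),
    (1037, 1139, "acidic groups"),
    (1616, 1640, "beta-sheet"),
    (1656, 1659, "alpha-helix"),
    (1701, 1708, "salicylic acid"),
    (1141, 1164, "salicylic acid"),
]


def assign_bands(wavenumber):
    """Return labels of all quoted bands whose range contains `wavenumber`."""
    return [label for low, high, label in QUOTED_BANDS
            if low <= wavenumber <= high]


ester_hit = assign_bands(1740)   # ["ester linkage"]
no_hit = assign_bands(2000)      # [] : nothing quoted in this region
```

A lookup of this kind returns every candidate assignment, which reflects a real limitation of whole-cell spectra: a single absorbance can have more than one plausible origin, so assignments are normally supported by other evidence.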

Considering the techniques used today to monitor a microbial community; the use, convenience, and attributes of mid-IR spectroscopy; and the need to study function in a microbial community, the advantages of mid-IR spectroscopy are straightforward. It is quick, easy to use, inexpensive compared to many other gene or analytical chemistry techniques, and information-rich, as it pertains to metabolites in the community. My goal was to develop and evaluate a method to use mid-IR spectroscopy as a tool to profile the in situ function of a microbial community.

References

1. Adt, I., Toubas, Dominique, Pinon, Jean-Michel, Manfait, Michel, Sockalingum, Ganesh D. 2006. FTIR spectroscopy as a potential tool to analyze structural modifications during morphogenesis of Candida albicans. Archives of Microbiology, 185: 277-285.

2. Allwood, W.J., Goodacre, Royston, Clarke, Andrew, Mur, Luis A.J. 2010. Dual metabolomics: A novel approach to understanding plant-pathogen interactions. Phytochemistry, IN PRESS.

3. Amann, R.I., Ludwig, Wolfgang, Schleifer, Karl-Heinz. 1995. Phylogenetic Identification and In Situ Detection of Individual Microbial Cells without Cultivation. Microbiological Reviews, 59: 143-169.

4. Amann, R.I., Fuchs, Bernhard M., Behrens, Sebastian. 2001. The identification of microorganisms by fluorescence in situ hybridization. Current Opinion in , 12: 231-236.

5. Amiel, C., Curk, M.C., Denis, C., Ingouf, J.C., Reyrolle, J., and Travert, J. 1997. Ed. Carmona, P. Spectroscopy of Biological Molecules: Modern Trends, Kluwer Academic Publishers (Dordrecht, Netherlands), 435.

6. Amiel, C., Mariey, L., Curk-Daubie, M.C., Pichon, P., and Travert, J. 2000. Potentiality of Fourier Transform Infrared Spectroscopy (FTIR) for discrimination and identification of dairy lactic acid bacteria. Le Lait, 20: 445-459.

7. Ananthi, S., Raghavendran, Hanumantha R.B., Sunil, Adoor Gopalan, Gayathri, Veeraraghavan, Ramakrishnan, Ganapathy, Vasanthi, Hannah R. 2010. In vitro antioxidant and in vivo anti-inflammatory potential of crude polysaccharide from Turbinaria ornata (Marine Brown Alga). Food and Chemical Toxicology, 48: 187-192.

8. Bastert, J., Korting, H.C., Traenkle, P., and Schmalreck, A.F. 1999. Identification of dermatophytes by Fourier Transform Infrared Spectroscopy (FT-IR). Mycoses, 42: 525-528.

9. Beattie, S.H., Holt, C., Hirst, D., and Williams, A.G. 1998. Discrimination among Bacillus cereus, B. mycoides, and B. thuringiensis and some other species of the genus Bacillus by Fourier transform infrared spectroscopy. FEMS Microbiology Letters, 164: 201-206.

10. Becker, K., Al Laham, Nahed, Fegeler, Wolfgang, Proctor, Richard A., Peters, Georg, von Eiff, Christof. 2006. Fourier-Transform Infrared Spectroscopic Analysis Is a Powerful Tool for Studying the Dynamic Changes in Staphylococcus aureus Small-Colony Variants. J. Clinical Microbiology, 44: 3274-3278.

11. Bleed, A.S. 1998. Lakes and Wetlands. Ed. Bleed, A.S. An Atlas of the Sand Hills (Conservation and Survey Division: Lincoln, NE), 115-122.

12. Borel, M. and Lynch, B.M. 1993. A study of differentiation of living bacteria by ATR/FTIR spectroscopy. Canadian J. of Applied Spectroscopy, 38: 18-21.

13. Bott, T.L., Brock, T.D. 1969. Bacterial Growth Rates above 90 °C in Yellowstone Hot Springs. Science, 164: 1411-1412.

14. Bouhedja, W., Sockalingum, G.D., Pina, P., Allouch, P., Bloy, C., Labia, R., Millot, J.M., and Manfait, M. 1997. ATR-FTIR spectroscopic investigation of E. coli transconjugants β-lactams-resistance phenotype. FEBS Letters, 412: 39-42.

15. Bouhedja, W., Sockalingum, G.D., Pina, P., Gainvors, A., Millot, J.M., and Manfait, M. 1997. Ed. Carmona, P. Spectroscopy of Biological Molecules: Modern Trends, Kluwer Academic Publishers, Dordrecht, Netherlands, 487.

16. Brauer, M.J., Yuan, J., Bennett, B.D., Lu, W.Y., Kimball, E., Botstein, D., Rabinowitz, J.D. 2006. Conservation of the metabolomic response to starvation across two divergent microbes. PNAS, 103: 19302-19307.

17. Chen, C.W., Gherini, S.A., Peters, N.E., Murdoch, P.S., Newton, R.M., Goldstein, R.A. 2008. Hydrologic analyses of acidic and alkaline lakes. Water Resources Research, 20: 1875-1882.

18. Condra, G.E. 1918. Preliminary Report on the Potash Industry of Nebraska. The Nebraska Conservation and Soil Survey, 8: 1-39.

19. Cravatt, B.F., Simon, Gabriel M., Yates, John R. III. 2007. The biological impact of mass-spectrometry-based proteomics. Nature, 250: 991-1000.

20. Curk, M.C., Peladan, F., Hubert, J.C. 1994. Fourier transform infrared (FTIR) spectroscopy for identifying Lactobacillus species. FEMS Microbiology Letters, 123: 241-248.

21. Davis, R., Burgula, Y., Deering, A., Irudayaraj, J., Reuhs, B.L., Mauer, L.J. 2010. Detection and differentiation of live and heat-treated Salmonella enterica serovars inoculated onto chicken breast using Fourier transform infrared (FT-IR) spectroscopy. J. Applied Microbiology, IN PRESS.

22. Dean, A.P., Sigee, David C., Estrada, Beatriz, Pittman, Jon K. 2010. Using FTIR spectroscopy for rapid determination of lipid accumulation in response to nitrogen limitation in freshwater microalgae. Bioresource Technology, 101: 4499-4507.

23. DeLong, E.F., Pace, Norman R., Wickham, Gene S. 1989. Phylogenetic Stains: Ribosomal RNA-Based Probes for the Identification of Single Cells. Science, 243: 1360-1363.

24. Domenighini, A., and Giordano, M. 2009. Fourier transform infrared spectroscopy of microalgae as a novel tool for biodiversity studies, species identification, and the assessment of water quality. J. of Phycology, 45: 522-531.

25. Drijber, R., Doran, J.W., Parkhurst, A.M., Lyon, D.J. 2000. Changes in soil microbial community structure with tillage under long-term wheat-fallow management. Soil Biology and Biochemistry, 32: 1419-1430.

26. D'Souza, L., Devi, Prabha, Kamat, Tonima, Naik, Chandrakant G. 2009. Diffuse reflectance infrared Fourier transform spectroscopic (DRIFTS) investigation of E. coli, Staphylococcus aureus, and Candida albicans. Indian Journal of Marine Sciences, 38: 45-51.

27. Ellis, D.I., Goodacre, Royston, Broadhurst, David I., Kell, Douglas B., Rowland, Jem J. 2002. Rapid and quantitative detection of the microbial spoilage of meat by Fourier transform infrared spectroscopy and machine learning. Applied and Environmental Microbiology, 68: 2822-2828.

28. Fahrenfort, J. 1961. Attenuated total reflection: A new principle for the production of useful infra-red reflection spectra of organic compounds. Spectrochimica Acta, 17: 698-709.

29. Ferrer, M., Golyshina, Olga, Beloqui, Ana, Golyshin, Peter N. 2007. Mining from extreme environments. Current Opinion in Microbiology, 10: 207-214.

30. Fischer, G., Braun, Silvia, Thissen, Ralf, Dott, Wolfgang. 2006. FT-IR spectroscopy as a tool for rapid identification and inter-species characterization of airborne filamentous fungi. J. Microbiological Methods, 64: 63-77.

31. Forgue, P., Halouska, S., Werth, M., Xu, K., Harris, S., and Powers, R. 2006. NMR Metabolic Profiling of Aspergillus nidulans to Monitor Drug and Protein Activity. J. of Proteome Research, 5: 1916-1923.

32. Fox, G.E., Stahl, David A., Hespell, R.B., Gibson, J., Maniloff, J., Dyer, T.A., Balch, W.E., Tanner, R.S., Magrum, L.J., Zablen, L.B., Gupta, R., Bonen, L., Lewis, B.J., Luehrsen, K.R., Chen, K.N. 1980. The Phylogeny of Prokaryotes. Science, 209: 457-463.

33. Frey, S.D., Drijber, R., Smith, H., Melillo, J. 2008. Microbial biomass, functional capacity, and community structure after 12 years of soil warming. Soil Biology and Biochemistry, 40: 2904-2907.

34. Garcia, D.E., Baidoo, Edward E., Benke, Peter I., Pingitore, Francesco, Tang, Yinjie J., Villa, Sandra, Keasling, Jay D. 2008. Separation and mass spectrometry in microbial metabolomics. Current Opinion in Microbiology, 11: 233-239.

35. Gilbert, J.A., Field, Dawn, Huang, Ying, Edwards, Rob, Le, Weizhong, Gilna, Paul, Joint, Ian. 2008. Detection of Large Numbers of Novel Sequences in the Metatranscriptomes of Complex Marine Microbial Communities. PLoS ONE, 3: e3042.

36. Gomez, J.E., Sockalingum, G.D., Aubert, D., Toubas, D., Pinon, J.M., Witthun, F., and Manfait, M. 1999. Ed. Greve, J., Puppels, G.J., Otto, C. Spectroscopy of Biological Molecules: New Directions, (Kluwer Academic Publishers: Dordrecht, Netherlands) 469.

37. Goodacre, R., Timmins, E.M., Rooney, P.J., Rowland, J.J., and Kell, D.B. 1996. Rapid identification of Streptococcus and Enterococcus species using diffuse reflectance-absorbance Fourier transform infrared spectroscopy and artificial neural networks. FEMS Microbiology Letters, 140: 233-239.

38. Goodacre, R., Kell, Douglas B., Timmins, Eadaoin M., Burton, Rebecca, Kaderbhai, Naheed N., Woodward, Andrew M., Rooney, Paul J. 1998. Rapid identification of urinary tract infection bacteria using hyperspectral whole-organism fingerprinting and artificial neural networks. Microbiology, 144: 1157-1170.

39. Goodacre, R., Kell, Douglas B., Timmins, Eadaoin M., Shann, Beverley, Gilbert, Richard J., McGovern, Aoife C., Alsberg, Bjorn K., Logan, Niall A. 2000. Detection of the Dipicolinic Acid Biomarker in Bacillus Spores Using Curie-Point Pyrolysis Mass Spectrometry and Fourier Transform Infrared Spectroscopy. Analytical Chemistry, 72: 119-127.

40. Gosselin, D., Sibray, Steve, Ayers, Jerry. 1994. Geochemistry of K-rich Alkaline Lakes. Geochimica et Cosmochimica Acta, 58: 1403-1418.

41. Gosselin, D., Harvey, F. Edwin, Sridhar, Venkataramana, Goeke, James. 2006. Hydrological Effects and Groundwater Fluctuations in Interdunal Environments in the Nebraska Sandhills. Great Plains Research, 16: 17-28.

42. Gosselin, D. and Khisty, M.J. 2001. Simulating the influence of two shallow, flow-through lakes on a groundwater system: implications for groundwater mounds and hinge lines. Hydrogeology Journal, 9: 476-486.

43. Gosselin, D., Harvey, F. Edwin, Goeke, James, Drda, Steve. 1999. Hydrologic Setting of Two Interdunal Valleys in the Central Sand Hills of Nebraska. Ground Water, 37: 924-933.

44. Grant, W.D., McGenity, T.J., and Gemmell, R.T. 1998. Halophiles. Ed. Horikoshi, K. and Grant, W.D. Extremophiles: Microbial Life in Extreme Environments (John Wiley and Sons, Inc.: New York) 93-132.

45. Griffiths, P.R. 2010. FTIR vs. FT-IR vs. mid-IR. Applied Spectroscopy, 64: 40A.

46. Gross, M. 1996. Life on the Edge: Amazing Creatures Thriving in Extreme Environments (Perseus Books: Cambridge).

47. Gygi, S.P., Corthals, Garry L., Zhang, Yanni, Rochon, Yvan, Aebersold, Ruedi. 2000. Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology. PNAS, 97: 9390-9395.

48. Haag, H., Gremlich, H.U., Bergmann, R., and Sanglier, J.J. 1996. Characterization and identification of actinomycetes by FT-IR spectroscopy. J. of Microbiological Methods, 27: 157-163.

49. Handelsman, J. 2004. Metagenomics: Application of Genomics to Uncultured Microorganisms. Microbiology and Molecular Biology Reviews, 68: 669-685.

50. Handelsman, J., Rondon, Michelle R., Brady, Sean F., Clardy, Jon, Goodman, Robert M.

1998. Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chemistry and Biology, 5: R245-249. 26 51. Harrick, N.J. 1960. Surface chemistry from spectral analysis of totally internally reflected radiation. Physical Chemistry, 64: 1110-1114.

52. Hedrick, D.B., Nivens, David E., White, David C., Stafford, Chris. 1991. Rapid differentiation of archaebacteria from eubacteria by diffuse reflectance Fourier Transform IR spectroscopic analysis of lipid preparations. J. Microbiological Methods, 13: 67-73.

53. Helm, D. and Naumann, D. 1995. Identification of some bacterial cell components by

FT-IR spectroscopy. FEMS Microbiology Letters, 126: 75-80.

54. Helm, D., Dieter Naumann, Labischinski, H., Schallehn, G. 1991. Classification and identification of bacteria by Fourier-Transform infrared spectroscopy. J. General Microbiology,

137: 69-79.

55. Helm, D., Dieter Naumann, Labischinski, H. 1991. Elaboration of a procedure for identification of bacteria using Fourier-transform infrared spectral libraries: A stepwise correlation approach. J. Microbiological Methods, 14: 127-142.

56. Heng, H.H.Q., Tsui, L.C. 1993. Modes of DAPI banding and simultaneous in situ hybridization. Chromosoma, 102: 325-332.

57. Hiskey, R.M. 1981. The Trophic-Dynamics of an Alkaline-Saline Nebraska Sandhills

Lake. Thesis: University Nebraska – Lincoln.

58. Holt, C., Hirst, D., Sutherland, A., MacDonald, F. 1995. Discrimination of Species in the

Genus Listeria by Fourier Transform Infrared Spectroscopy and Canonical Variate Analysis.

Applied and Environmental Microbiology, 61: 377-378.

59. Horbach, I., Dieter Naumann, Fehrenbach, F.J. 1988. Simultaneous infections with different serogroups of Legionella pneumophila investigated by routine methods and Fourier

Transform infrared spectroscopy. J. Clinical Microbiology, 26: 1106-1110.

60. Horikoshi, K. 1998. Alkaliphiles. Ed. Horikoshi, K. and Grant, W.D. Extremophiles: Microbial Life in Extreme Environments. (John Wiley and Sons, Inc.: New York) 155-179.

61. Huang, W.E., Goodacre, Royston, Elliott, Geoff N., Beckmann, Manfred, Worgan, Hilary, Bailey, Mark J., Williams, Peter A., Scullion, John, Draper, John. 2005. The use of chemical profiling for monitoring metabolic changes in artificial soil slurries caused by horizontal gene transfer. Metabolomics, 1: 305-315.

62. Irmscher, H.M., Fischer, R., Beer, W., and Seltmann, G. 1999. Characterization of nosocomial Serratia marcescens isolates: comparison of Fourier-transform infrared spectroscopy with pulsed-field gel electrophoresis of genomic DNA fragments and multilocus enzyme electrophoresis. Zentralbl Bakteriology, 289: 249-263.

63. Ivey, D. M., Ito, Masahiro, Gilmour, Raymond, Zemsky, Jennifer, Guffanti, Arthur A.,

Sturr, Michael G., Hicks, David B., Krulwich, Terry A. 1998. Alkaliphile Bioenergetics. Ed.

Horikoshi, K. and Grant, W.D. Extremophiles: Microbial Life in Extreme Environments. (John

Wiley and Sons, Inc.: New York) 181-210.

64. Johnsen, K. and Nielsen, P. 1999. Diversity of Pseudomonas strains isolated with King's

B and

Gould's S1 agar determined by repetitive extragenic palindromic-polymerase chain reaction, 16S rDNA sequencing and Fourier transform infrared spectroscopy characterization. FEMS

Microbiology Letters, 173: 155.

65. Johnson, H.E., Goodacre, Royston, Broadhurst, David I., Smith, Aileen R. 2003.

Metabolic fingerprinting of salt-stressed tomatoes. Phytochemistry, 62: 919-928.

66. Kansiz, M., Heraud, P., Wood, B., Burden, F., Beardall, J., and Mac Naughton, D. 1999. Fourier Transform Infrared microspectroscopy and chemometrics as a tool for the discrimination of cyanobacterial strains. Phytochemistry, 52: 407-417.

67. Keller, M. and Hettich, R. 2009. Environmental Proteomics: a Paradigm Shift in Characterizing Microbial Activities at the Molecular Level. Microbiology and Molecular Biology Reviews, 73: 62-70.

68. Kim, Dong-Hyun, Jarvis, Roger M., Xu, Yun, Oliver, Anthony W., Allwood, William J., Hampson, Lynne, Hampson, Ian N. 2010. Combining metabolic fingerprinting and footprinting to understand the phenotypic response of HPV16 E6 expressing cervical carcinoma cells exposed to the HIV anti-viral drug lopinavir. Analyst, IN PRESS.

69. Kirschner, C., Ngo Thi, N.A., and Naumann, D. 1999. Ed. Greve, J., Puppels, G.J., and Otto, C. Spectroscopy of Biological Molecules: New Directions (Kluwer Academic Publishers: Dordrecht, The Netherlands) 561.

70. Kirschner, C., Dieter Naumann, Manfait, Michel, Sockalingum, Ganesh D., Maquelin,

K., Pina, P., Ngo Thi, N.A., Choo-Smith, L.P., Sandt, C., Ami, D., Doglia, S.M., Allouch, P.,

Puppels, G.J., Orsini, F. 2001. Classification and Identification of Enterococci: a Comparative

Phenotypic, Genotypic, and Vibrational Spectroscopic Study. J. of Clinical Microbiology, 39:

1763-1770.

71. Klaassens, E.S., de Vos, W.M., and Vaughan, E.E. 2007. Metaproteomics approach to study the functionality of the microbiota in the human infant gastrointestinal tract. Applied and Environmental Microbiology, 73: 1388-1392.

72. Krulwich, T. A., Ito, Masahiro, Hicks, David B. 2009. Cation/proton antiporter complements of bacteria: why so large and diverse? Molecular Microbiology, 74: 257-260.

73. Kummerle, M., Scherer, S., and Seiler, H. 1998. Rapid and reliable identification of food-borne yeasts by Fourier-Transform Infrared Spectroscopy. Applied and Environmental Microbiology, 64: 2207-2214.

74. Labedz, T.E. 1990. Birds. Ed. Bleed, A. and Flowerday, C. An Atlas of the Sand Hills

(Conservation and Survey Division, University of Nebraska-Lincoln), 161-180.

75. Lacerda, C.M.R., Choe, L.H., and Reardon, K.F. 2007. Metaproteomic analysis of a bacterial community response to cadmium exposure. J. Proteome Research, 6: 1145-1152.

76. Landgrebe, D., Haake, Claas, Hopfner, Tim, Beutel, Sascha, Hitzmann, Bernd, Scheper, Thomas, Rhiel, Martin, Reardon, Kenneth F. 2010. On-line infrared spectroscopy for bioprocess monitoring. Applied Microbiology and Biotechnology, 88: 11-22.

77. Lane, D.J., Pace, Bernadette, Olsen, Gary J., Stahl, David A., Sogin, Mitchell L., Pace,

Norman R. 1985. Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. PNAS, 82: 6955-6959.

78. Lang, P.L. and Sang, S.C. 1998. The in-situ infrared microspectroscopy of bacterial colonies on agar plates. Cellular and Molecular Biology, 44: 231-238.

79. Lefier, D., Hirst, D., Holt, C., and Williams, A.G. 1997. Effect of sampling procedure on strain variation in Listeria monocytogenes on the discrimination of species in the genus Listeria by Fourier transform infrared spectroscopy and canonical variates analysis. FEMS Microbiology

Letters, 147: 45-50.

80. Lefier, D., Lamprell, H., and Mazerolles, G. 2000. Evolution of Lactococcus strains during ripening in Brie cheese using Fourier transform infrared spectroscopy. Le Lait, 80: 247.

81. Lipkus, A.H., Chittur, K.K., Vesper, S.J., Robinson, J.B., and Pierce, G.E. 1990.

Evaluation of infrared spectroscopy as a bacterial identification method. J. Industrial

Microbiology, 6: 71-75.

82. Liu, J., Hicks, David B., Krulwich, Terry A., Fujisawa, Makoto. 2009. Characterization of the Functionally Critical AXAXAXA and PXXEXXP Motifs of the ATP Synthase c-Subunit from an Alkaliphilic Bacillus. J. Biological Chemistry, 284: 8714-8725.

83. Loope, D.B. and Swinehart, J. 2000. Thinking Like a Dune Field: Geologic History in the Nebraska Sand Hills. Great Plains Research, 10: 5-35.

84. Mac Naughton, D., Romeo, M., Kansiz, M., Wood, B., Heraud, P., Burden, F., and Beardall, J. 1999. Ed. Greve, J., Puppels, G.J., Otto, C. Spectroscopy of Biological Molecules: New Directions (Kluwer Academic Publishers: Dordrecht, The Netherlands), 475.

85. Mapelli, V., Lisbeth Olsson, Jens Nielsen. 2008. Metabolic footprinting in microbiology: methods and applications in functional genomics and biotechnology. Cell, 26: 490-497.

86. Mariey, L., Signolle, J.P., Amiel, C., Travert, J. 2001. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vibrational

Spectroscopy, 26: 151-159.

87. Markert, S., Arndt, C., Felbeck, H., Becher, D., Sievert, S.M., Hugler, M., Albrecht, D., Robidart, J., Bench, S., Feldman, R.A., Hecker, M., and Schweder, T. 2007. Physiological proteomics of the uncultured endosymbiont of Riftia pachyptila. Science, 315: 247-250.

88. Mason, J.A., Loope, David B., Swinehart, James, Goble, Ronald J. 2004. Late

Holocene dune activity linked to hydrological drought, Nebraska Sand Hills, USA. The Holocene,

14: 209-217.

89. McCarraher, D.B. 1977. Nebraska’s Sandhills Lakes. Ed. Huff, E. (Nebraska Game and

Parks Commission: Lincoln, NE).

90. McGenity, T.J., Gemmell, Renia T., Grant, William D., Stan-Lotter, Helga. 2000. Origins of halophilic microorganisms in ancient salt deposits. Environmental Microbiology, 2: 243-250.

91. Muyzer, G., De Waal, Ellen C., Uitterlinden, Andre G. 1993. Profiling of Complex

Microbial Populations by Denaturing Gradient Gel Electrophoresis Analysis of Polymerase Chain

Reaction-Amplified Genes Coding for 16S rRNA. Applied and Environmental Microbiology, 59:

695-700.

92. Naumann, D., Dieter Helm, Labischinski, H., Giesbrecht, P. 1991. The characterization of microorganisms by Fourier-Transform infrared spectroscopy (FT-IR). Ed. Nelson, W.H.

Modern Techniques for Rapid Microbiological Analysis (VCH Publishers: New York), 43-96.

93. Naumann, D., Dieter Helm, Labischinski, H. 1991. Microbiological characterizations by Fourier-Transform infrared spectroscopy (FT-IR). Nature, 351: 81-82.

94. Naumann, D., Labischinski, H., Giesbrecht, P., Fijala, V. 1988. The rapid differentiation and identification of pathogenic bacteria using Fourier Transform infrared spectroscopy and multivariate statistical analysis. J. Molecular Structure, 174: 165-170.

95. Naumann, D. 1985. The ultra rapid differentiation and identification of pathogenic bacteria using FT-IR techniques. SPIE: International Conference on Fourier and Computerized Infrared Spectroscopy, 553: 268-269.

96. Naumann, D., Barnickel, G., Bradaczek, H., Labischinski, H., and Giesbrecht, P. 1982. Infrared Spectroscopy, a tool for probing bacterial peptidoglycan. European J. Biochemistry, 125: 505-515.

97. Ngo Thi, N.A., Kirschner, C., and Naumann, D. 1999. Eds. Greve, J., Puppels, G.J., Otto, C. Spectroscopy of Biological Molecules: New Directions (Kluwer Academic Publishers: Dordrecht, The Netherlands), 557.

98. Nichols, P.D., Henson, J.M., Guckert, J.B., Nivens, David E., White, David C. Fourier-transform infrared spectroscopic methods for microbial ecology: analysis of bacteria, bacteria-polymer mixtures, and biofilms. J. Microbiological Methods, 4: 79-94.

99. Norris, K.P. and Greenstreet, J.E.S. 1958. On the infrared absorption spectrum of

Bacillus megaterium. J. General Microbiology, 19: 566-580.

100. Obata, Y., Utsumi, Shunichi, Watanabe, Hiroshi, Suda, Masafumi, Tokudome, Yoshihiro,

Otsuka, Makoto, Takayama, Kozo. 2010. Infrared spectroscopic study of lipid interaction in stratum corneum treated with transdermal absorption enhancers. International Journal of

Pharmaceutics, IN PRESS.

101. Oberreuter, H., Charzinski, Joachim, Scherer, Siegfried. 2002. Intraspecific diversity of Brevibacterium linens, Corynebacterium glutamicum and Rhodococcus erythropolis based on partial 16S rDNA sequence analysis and Fourier-transform infrared (FT-IR) spectroscopy. Microbiology, 148: 1523-1532.

102. O'Farrell, P.H. 1975. High Resolution Two-Dimensional Electrophoresis of Proteins. J. Biological Chemistry, 250: 4007-4021.

103. Ohashi, Y., Hirayama, A., Ishikawa, T., Nakamura, S., Shimizu, K., Ueno, Y., Tomita,

M., and Soga T. 2008. Depiction of metabolome changes in histidine-starved Escherichia coli by

CE-TOF-MS. Molecular Biosystems, 4: 135-147.

104. Parro, V., Moreno-Paz, Mercedes, Gonzalez-Toril, Elena. 2007. Analysis of environmental transcriptomes by DNA microarrays. Environmental Microbiology, 9: 453-464.

105. Picand, C., Lebaud, J., Novel, G., and Travert, J. 1995. Eds. Merlin, J.C., Turrell, S., and Huvenne, J.P. Spectroscopy of Biological Molecules (Kluwer Academic Publishers: Dordrecht, The Netherlands), 509.

106. Installation and User Guide MIRacle™ ATR for FTIR Spectrometers. 2004. Pike Technologies: Madison, WI.

107. Poretsky, R.S., Bano, N., Buchan, A., LeCleir, G., Kleikemper, J., Pickering, M., Pate, W.M., Moran, M.A., Hollibaugh, J.T. 2005. Analysis of Microbial Gene Transcripts in Environmental Samples. Applied and Environmental Microbiology, 71: 4121-4126.

108. Raes, J. and Bork, P. 2008. Molecular eco-systems biology: towards an understanding of community function. Nature Reviews Microbiology, 6: 693-699.

109. Rebuffo-Scheer, C.A., Scherer, Siegfried, Dietrich, Jochen, Wenning, Mareike. 2008.

Identification of five Listeria species based on infrared spectra (FTIR) using macrosamples is superior to a microsample approach. Annals of Bioanalytical Chemistry, 390: 1629-1635.

110. Rodriguez-Valera, F. 2004. Environmental genomics, the big picture? FEMS

Microbiology Letters, 231: 153-158.

111. Romantsov, T., Guan, Ziqiang, Wood, Janet M. 2009. Cardiolipin and the osmotic stress responses of bacteria. Biochimica et Biophysica Acta, 1788: 2092-2100.

112. Saiki, R.K., Gelfand, David H., Stoffel, Susanne, Scharf, Stephen J., Higuchi, Russell, Horn, Glenn T., Mullis, Kary B., Erlich, Henry A. 1988. Primer-Directed Enzymatic Amplification of DNA with a Thermostable DNA Polymerase. Science, 239: 487-491.

113. Schmalreck, A.F., Trankle, P., Vanca, E., and Blaschke-Hellmessen, R. 1998.

Differentiation and characterization of Candida albicans, Exophiala dermatitidis and Prototheca spp. by Fourier-transform infrared Spectroscopy (FT-IR) in comparison with conventional methods. Mycoses, 41(S1): 71-77.

114. Schmitt, J., Udelhoven, T., Naumann, D., and Flemming, H.C. 1998. Stacked spectral data processing and artificial neural networks applied to FT-IR and FT-Raman spectra in biomedical applications. Proceedings SPIE International Society for Optical Engineering, 3257:

236.

115. Schuster, K.C., Mertens, F., Gapes, J.R. 1999. FTIR spectroscopy applied to bacterial cells as a novel method for monitoring complex biotechnological processes. Vibrational

Spectroscopy, 19: 467-446.

116. Scullion, J., Goodacre, Royston, Huang, Wei E., Elliott, Geoff N., Worgan, Hilary, Bailey, Mark J., Draper, John, Gwynn-Jones, Dylan, Griffith, Gareth W., Winson, Michael K., Clegg, Christopher. 2003. Use of earthworm casts to validate FT-IR spectroscopy as a 'sentinel' technology for high-throughput monitoring of global changes in microbial ecology. Pedobiologia, 47: 440-446.

117. Seltmann, G., Voigt, W., and Beer, W. 1994. Application of Physico-Chemical typing methods for the epidemiological analysis of Salmonella enteritidis strains of phage type 25/17.

Epidemiology and Infection, 113: 411-424.

118. Seltmann, G., Beer, W., Claus, H., and Seifer, H. 1995. Comparative classification of Acinetobacter baumannii strains using seven different typing methods. Zentralbl Bacteriology, 282: 372-383.

119. Smilde, A.K., van der Werf, M.J., Bijlsma, S., van der Werff-van-der-Vat, B.J.C., and Jellema, R.H. 2005. Fusion of mass spectrometry-based metabolomics data. Analytical Chemistry, 77: 6729-6736.

120. Sockalingum, G.D., Bouhedja, W., Pina, P., Allouch, P., Mandray, C., Labia, R., Millot, J.M., and Manfait, M. 1997. ATR–FTIR Spectroscopic Investigation of Imipenem-Susceptible and -Resistant Pseudomonas aeruginosa Isogenic Strains. Biochemical and Biophysical Research Communications, 232: 240-246.

121. Sockalingum, G.D., Bouhedja, W., Pina, P., Allouch, P., Bloy, C., and Manfait, M. 1998.

FT-IR spectroscopy as an emerging method for rapid characterization of microorganisms.

Cellular Molecular Biology, 44: 261-269.

122. Stahl, D.A., Lane, David J., Olsen, Gary J., Pace, Norman R. 1985. Characterization of a

Yellowstone Hot Spring Microbial Community by 5S rRNA Sequences. Applied and

Environmental Microbiology, 49: 1379-1384.

123. Staley, J.T. and Konopka, A. 1985. Measurements of in situ activities of nonphotosynthetic microorganisms in aquatic and terrestrial habitats. Annual Reviews of

Microbiology, 39: 321-346.

124. Stein, J.L., Marsh, Terence L., Wu, Ke Ying, Shizuya, Hiroaki, DeLong, Edward F.

1996. Characterization of uncultivated prokaryotes: isolation and analysis of a 40-kilobase-pair genome fragment from a planktonic marine archaeon. J. of Bacteriology, 178: 591-599.

125. Steinhorn, I and Gat, J.R. 1983. The Dead Sea. Scientific American, 249: 102-109.

126. Thomzig, A., Dieter Naumann, Spassov, Sashko, Friedrich, Manuela, Beekes, Michael. 2004. Discriminating Scrapie and Bovine Spongiform Encephalopathy Isolates by Infrared Spectroscopy of Pathological Prion Protein. J. Biological Chemistry, 279: 33847-33854.

127. Timmins, E.M., Quain, D.E., and Goodacre, R. 1998. Differentiation of brewing yeast strains by pyrolysis mass spectrometry and Fourier transform infrared spectroscopy. Yeast, 14: 885.

128. Timmins, E.M., Howell, S.A., Alsberg, K.A., Noble, W.C., and Goodacre, R. 1998.

Rapid Differentiation of Closely Related Candida Species and Strains by Pyrolysis-Mass

Spectrometry and Fourier Transform-Infrared Spectroscopy. J. Clinical Microbiology, 36: 367-

374.

129. Tindall, B.J., Brambilla, Evelyne, Steffen, Maike, Neumann, Regine, Pukall, Rudiger,

Kroppenstedt, Reiner M., Stackebrandt, Erko. 2000. Cultivatable microbial biodiversity: gnawing at the Gordian knot. Environmental Microbiology, 2: 310-318.

130. Tintelnot, K., Haase, G, Seibold, M., Bergmann, F., Staemmler, M., Franz, T., and

Naumann, D. 2000. Evaluation of Phenotypic Markers for Selection and Identification of

Candida dubliniensis. J. Clinical Microbiology, 38: 1599-1608.

131. Turnbaugh, P.J. and Gordon, J.I. 2008. An invitation to the marriage of metagenomics and metabolomics. Cell, 132: 708-713.

132. Vannini, L., Lanciotti, R., Gardini, F., Guerzoni, M.E. 1996. Evaluation of Fourier-transform infrared spectroscopy for data capture in predictive microbiology. World Journal of Microbiology and Biotechnology, 12: 85-90.

133. Viant, M.R. 2008. Recent developments in environmental metabolomics. Molecular

BioSystems, 4: 980-986.

134. Wang, T., Zlotnik, Vitaly A., Wedin, David, Wally, Kathryn D. 2008. Spatial trends in saturated hydraulic conductivity of vegetated dunes in the Nebraska Sand Hills: Effects of depth and topography. J. of Hydrology, 349: 88-97.

135. Wilmes, P. and Bond, P.L. 2004. The Application of two-dimensional polyacrylamide gel electrophoresis and downstream analyses to a mixed community of prokaryotic microorganisms. Environmental Microbiology, 6: 911-920.

136. Woese, C.R. 1987. Bacterial Evolution. Microbiological Reviews, 51: 221-271.

Figure 1. and alkaliphile adaptations. Horikoshi and Grant, 1998.

Figure 2. Eco-systems biology. Goodacre, 2000.

Table 1. List of lipid biomarkers. Frey et al. (2008) identified concentrations of lipids from soil plots heated and not heated.

Figure 3. A correlation chart of functional groups and at what wavenumbers they absorb. http://unodc.org

Figure 4. Mid-IR spectrum. The five windows of information include: W1, 3000-2800 cm-1; W2, 1800-1500 cm-1; W3, 1500-1200 cm-1; W4, 1200-900 cm-1; and W5, 900-500 cm-1.

Table 2. A listing of numerous classification experiments by mid-IR spectroscopy, Mariey et al. (5, 6, 8, 9, 12, 14, 15, 20, 36, 37, 38, 48, 54, 55, 58, 62, 64, 66, 69, 73, 78, 79, 80, 81, 84, 97, 105, 113, 114, 115, 117, 118, 120, 121, 126, 127, 129)

CHAPTER 2 - Characterization of the in situ Function of a Planktonic Microbial Community by mid-IR Spectroscopy

Ryan Roberts, Julie Shaffer, Vicki Schlegel, Bradley Plantz, and Kenneth W. Nickerson

Abstract: The function of a microbial community includes how efficiently it decomposes organic matter and moves carbon through the environment. These capabilities are important to the health of the ecosystem, and structural information alone cannot describe these metabolic functions of a microbial community. A method is needed to record a snapshot in time of the in situ function of a planktonic microbial community. Herein, a mid-IR spectroscopic method is developed to provide optimal signal and biological information. The microbial community is concentrated on 0.2 micron filters, dried to quench metabolism, and analyzed by attenuated total reflectance. Optimal results were obtained with a 20-25 micron pore size prefilter coupled with both PVDF and nylon filter membranes which, because they have complementary transparent windows, provide the most spectral information. Prokaryotic and some eukaryotic signatures are included in the spectra, and the method can detect a change of one colony forming unit per milliliter. This method can detect small changes in the function of a planktonic microbial community and it should prove useful for monitoring biochemical processes in sensitive ecosystems.

Introduction

Microbial ecologists are interested in the structure, function, and interactions of constituent community populations. Because bacteria are r-strategists, meaning they focus on creating large quantities of offspring, and are sensitive to environmental factors, species abundance patterns are likely to be dynamic in all but the most environmentally stable of ecosystems. This means a large number of samples must be collected and analyzed across space and time if the true patterns of diversity are to be discerned, or if correlations between environmental variables and species diversity are to be made. The inability to collect and process data from a sufficiently large sampling plan has been a limiting factor in our ability to properly census microbial communities. DNA technologies such as denaturing gradient gel electrophoresis (DGGE), terminal restriction fragment length polymorphism (T-RFLP), or automated ribosomal intergenic spacer analysis (ARISA), together with accompanying data analysis software packages, now allow for the rapid elucidation of community species richness (13, 9, 18). Despite having to make assumptions as to polymerase chain reaction bias and the frequency of target genes within cells, pyrosequencing technologies can determine both richness and abundance of species within samples. The ability to tag samples with short 6-base barcodes, the reduction in the cost of sequencing, and a plethora of data analysis software packages now enable population biologists to evaluate spatial and temporal patterns of bacterial biodiversity (10). Variants of these same technologies can elucidate functionality by probing for functional genes or by sequencing the metagenome, from which gene function can be inferred (12, 17). However, data analyses are computationally intensive, and the need to sequence and analyze the large number of samples required makes metagenomic sequencing cost prohibitive as a tool.

Metatranscriptomics (15) and metaproteomics (23) provide some functional information, but metabolomics provides the ultimate information regarding function. Metabolites are the end-products of all processes in a cell and continue to modify enzymatic activity, translation, and transcription. The ideal method for examining total function is therefore a metabolomic approach. We are developing mid-infrared (mid-IR) spectroscopy as an in situ method to fingerprint planktonic populations of microorganisms in lakes. Infrared radiation is absorbed by covalent bonds, and the absorption is detectable when the vibration changes the dipole moment of the bond; therefore, a signature of every macromolecule and metabolite in a cell or community of cells is available. Naumann et al. (14) describe five windows of wavenumbers that provide information regarding biological material in the mid-IR region. They include: window 1 (W1), the fatty acid window (3000-2800 cm-1); window 2 (W2), the amide window (1700-1500 cm-1); window 3 (W3), a mixed region of absorbance due to P=O bonds (1500-1200 cm-1); window 4 (W4), the polysaccharide window (1200-900 cm-1); and window 5 (W5), the fingerprint window (900-600 cm-1). Wavenumbers in the 4000-3000 cm-1 and 2800-1700 cm-1 regions have not shown reproducible patterns useful in analyzing biological samples. When applied to complex chemical milieus such as a cell, the resulting spectral pattern provides a biochemical fingerprint. Since the improvement of infrared spectroscopy, in the form of Fourier Transform devices (FT-IR), computers, and computational multivariate statistical analysis tools, reproducible identification of bacteria (11), fungi (8), and algae (6) has been possible. The advantage of this technique is that it provides, in a sense, a metabolomic fingerprint of the sample in question.
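The five windows listed in this paragraph can be written down as a small lookup table. The following is only an illustrative sketch: the window limits are those given in the text, while the function name and the rule for shared boundaries (a boundary value goes to the window listed first) are assumptions made for the example.

```python
# Spectral windows as listed in the text (wavenumbers in cm-1).
WINDOWS = {
    "W1": (2800, 3000),  # fatty acid window
    "W2": (1500, 1700),  # amide window
    "W3": (1200, 1500),  # mixed region, including P=O absorbance
    "W4": (900, 1200),   # polysaccharide window
    "W5": (600, 900),    # fingerprint window
}


def window_for(wavenumber):
    """Return the window label for a wavenumber, or None if it is outside all windows.

    Limits are treated as inclusive; a value on a shared boundary is assigned
    to the window listed first (an arbitrary choice for this sketch).
    """
    for label, (low, high) in WINDOWS.items():
        if low <= wavenumber <= high:
            return label
    return None


if __name__ == "__main__":
    for wn in (2920, 1650, 1080, 3500):
        print(wn, window_for(wn))
```

Wavenumbers in the 4000-3000 cm-1 and 2800-1700 cm-1 regions return None here, which matches their exclusion from the windows of information.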

Our objective is to apply mid-IR fingerprinting to microbial communities. Our rationale is that spectra will reflect the biochemical fingerprints of a community and differences between fingerprints will reflect structural and functional differences between communities. Specific criteria considered for the development of the process were:

1. The in-field sampling method must concentrate planktonic cells sufficiently in order for the instrument to detect signal.

2. The in-field sampling method must rapidly quench metabolic activity so that the signal reflects the metabolic state of the community rather than the post-process state.

3. The signal should reflect bacterial rather than eukaryotic information.

Here we report on the development of attenuated total reflectance mid-IR spectroscopy (ATR-midIR) as a method for profiling microbial planktonic communities from a lake. In ATR techniques, light is transmitted through a crystal by total internal reflection. The evanescent wave formed penetrates 0.5 to 5 µm into the surrounding medium and is attenuated by any material that absorbs the corresponding wavelengths. A membrane filter concentrates particulates and is amenable to being pressed against the ATR crystal, thus providing a characteristic mid-IR image (fingerprint) of the microbial community. This is a novel approach to profiling the metabolome of a microbial community and, thus, the optimal parameters must be determined.
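The penetration range quoted above can be checked against the standard expression for evanescent-wave penetration depth, d_p = λ / (2π n1 √(sin²θ − (n2/n1)²)). The sketch below is not taken from this work. The refractive indices (about 2.4 for a ZnSe crystal, about 1.5 for a dried organic film) and the 45° angle of incidence are assumed values chosen only for illustration.

```python
import math


def penetration_depth_um(wavenumber_cm1, n_crystal, n_sample, angle_deg):
    """Evanescent-wave penetration depth in micrometres.

    Uses d_p = wavelength / (2*pi*n1*sqrt(sin^2(theta) - (n2/n1)^2)),
    where the wavelength is the vacuum wavelength of the light.
    """
    wavelength_um = 1.0e4 / wavenumber_cm1  # convert cm-1 to micrometres
    theta = math.radians(angle_deg)
    term = math.sin(theta) ** 2 - (n_sample / n_crystal) ** 2
    if term <= 0:
        raise ValueError("angle is at or below the critical angle")
    return wavelength_um / (2 * math.pi * n_crystal * math.sqrt(term))


if __name__ == "__main__":
    # Assumed values: ZnSe crystal (n ~ 2.4), organic film (n ~ 1.5), 45 degrees.
    for wn in (3000, 1650, 1000, 750):
        depth = penetration_depth_um(wn, 2.4, 1.5, 45.0)
        print(f"{wn} cm-1: {depth:.2f} um")
```

Under these assumptions the depth grows as the wavenumber decreases, so the same film is sampled more deeply at the low-wavenumber end of the spectrum than at the high-wavenumber end.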

Materials and methods

Lake water sampling. Lake water from Salt Lake, Lincoln, NE (40°49'18.8"N 96°44'05.8"W) was collected from the lake surface in 1-gallon sterile plastic containers and stored overnight at 4°C. All 1-gallon containers were pooled into a six-gallon container, then passed through a 100 µm, ethanol-sterilized, brass sieve to remove large particulates. This water was classified as unfiltered. The prefilters used were a 20-25 µm paper filter (Whatman No. 541 Hardened Ashless), an 8 µm paper filter (Whatman No. 2), or a 2.5 µm paper filter (Whatman No. 42, Ashless). Capture filters had a 0.2 µm pore size and were made of nitrocellulose (NC), nylon (NY), or polyvinylidene fluoride (PVDF) (Whatman® Cellulose Nitrate Membrane Filters, 0.2 µm, 47 mm diameter; Millipore® Nylon Membrane, 0.2 µm GNWP; and Millipore® Durapore® Membrane Filters, 0.22 µm GV, respectively). The quantity of water filtered through each capture filter was based on the volume of water required to almost plug the prefilter; the actual quantity varied with the prefilter. Each capture membrane was then rinsed with 5 ml sterile, distilled water. The 0.2 µm filters were then dried in vacuo. Drying took ca. 3 hours. After drying, filters were stored with a desiccant pouch under nitrogen in 50 mL conical tubes.

Scanning Electron microscopy (SEM). After filters were dried and stored with desiccant under nitrogen, one nitrocellulose filter from each treatment was placed in a 3% glutaraldehyde phosphate solution to fix the samples and rehydrate the cells prior to examination with a Hitachi S-3000N variable pressure SEM. Filters were bake dried at 40 °C instead of the traditional critical point drying method in ethanol, because the nitrocellulose filters were degraded in ethanol. The filters were then coated in a gold-palladium alloy before being examined by SEM. Images were captured and stored digitally on a PC.

Mid-IR spectroscopy. Capture filters (0.2 µm) were analyzed with a Bruker Instruments Equinox 55 FT-IR spectrometer with a deuterated triglycine sulfate detector and a Pike Technologies MIRacle® ZnSe crystal, single reflectance attachment. Spectra were recorded from 4,000 to 750 cm-1 at a resolution of 4 cm-1, and 128 scans were co-added to obtain one spectrum. All spectra were merged and analyzed using the Bruker software OPUS NT version 6.5. Two replicate filters of each water sample were analyzed and three instrumental replicates of each capture filter were recorded. All mid-IR cluster analyses were based on D-values of first derivative spectra and clustered using Ward's algorithm (22).
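As a rough illustration of the distance measure named above, the sketch below computes a D-value between two spectra from their first derivatives. It assumes the commonly used definition D = (1 − r) × 1000, where r is the Pearson correlation coefficient. That definition, the simple finite-difference derivative, and the toy absorbance values are all assumptions of this example and are not taken from the OPUS documentation. The resulting pairwise D-values would then feed a hierarchical clustering step such as Ward's algorithm, which is not implemented here.

```python
import math


def first_derivative(absorbances):
    """Finite-difference first derivative (uniform wavenumber spacing assumed)."""
    return [b - a for a, b in zip(absorbances, absorbances[1:])]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def d_value(spectrum_a, spectrum_b):
    """Spectral distance, assumed here to be (1 - r) * 1000 on first derivatives."""
    r = pearson_r(first_derivative(spectrum_a), first_derivative(spectrum_b))
    return (1.0 - r) * 1000.0


if __name__ == "__main__":
    # Toy absorbance values, purely illustrative.
    a = [0.10, 0.30, 0.80, 0.40, 0.20, 0.25]
    b = [0.12, 0.28, 0.75, 0.45, 0.22, 0.24]
    print(round(d_value(a, b), 2))
```

Because the measure is built on a correlation of derivatives, a constant baseline offset or a uniform scaling of one spectrum leaves the D-value unchanged.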

DNA extraction, amplification, and DGGE. DNA was extracted from one filter using the QIAGEN DNeasy® Blood and Tissue kit with a modified protocol. Briefly, all DNA extractions used Whatman® Cellulose Nitrate Membrane Filters, 0.2 µm pore size, 47 mm in diameter. The filters were cut with a razor blade into approximately 4 mm squares then placed in an Eppendorf tube with the enzyme lysis buffer. At that point, the QIAGEN DNeasy Blood and Tissue Kit protocol for bacterial DNA extraction was followed for cell lysis. DNA was extracted with 24:1 chloroform/isoamyl alcohol followed by a 25:24:1 phenol/chloroform/isoamyl alcohol extraction. DNA was then precipitated in 2 vol ethanol and 3 ml sodium-acetate (25%) solution and allowed to sit overnight. Samples were then centrifuged for 5 min at 13,200 rpm, the supernatant was discarded, and the pellet was washed twice with 700 µl of 70% ethanol. DNA was washed with 300 µl of 100% ethanol, microcentrifuged, and allowed to dry at room temperature.

DNA pellets were resuspended in 50 µl of elution buffer and the DNA concentration was adjusted to 20 ng/µl. Primer design was based on Muyzer et al. (13). The forward primer was 5'-cgcccgccgcgcgcggcgggcggggcgggggcacggggggcctacgggaggcagcag-3' and the reverse primer was 5'-attaccgcggctgctgg-3'. All amplifications were done using a 50 µl final volume. Each reaction included: 5 µl of Invitrogen's Platinum® Taq DNA Polymerase buffer solution, 1 µl each of primers set to 100 pmol, 2 µl MgCl2, 1 µl Platinum® Taq dNTPs, 0.2 µl Platinum® Taq Polymerase, and 2 µl of DNA from samples. The amplification procedure consisted of an initial 96 °C step for 5 min, followed by 25 cycles of 92 °C for 1 min, 46 °C for 2 min, and 72 °C for 1.5 min. A final 72 °C step for 5 min and a 4 °C hold until samples were collected completed the procedure.
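The reaction volumes and cycling times above lend themselves to a quick arithmetic check. The sketch below sums the listed components and the programmed hold times. It assumes that the balance of the 50 µl reaction is made up with water, which the text does not state, and it ignores temperature ramp times and the final 4 °C hold.

```python
# Component volumes in microlitres, as listed in the text.
REACTION_UL = {
    "buffer": 5.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "MgCl2": 2.0,
    "dNTPs": 1.0,
    "Taq polymerase": 0.2,
    "template DNA": 2.0,
}
FINAL_VOLUME_UL = 50.0


def water_volume_ul(components, final_volume):
    """Volume needed to reach the final volume (assumed here to be water)."""
    return final_volume - sum(components.values())


def program_minutes(initial, cycles, cycle_steps, final_extension):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return initial + cycles * sum(cycle_steps) + final_extension


if __name__ == "__main__":
    print(f"water: {water_volume_ul(REACTION_UL, FINAL_VOLUME_UL):.1f} ul")
    # 5 min initial step, 25 cycles of 1 + 2 + 1.5 min, 5 min final extension.
    print(f"run time: {program_minutes(5.0, 25, (1.0, 2.0, 1.5), 5.0)} min")
```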

Amplicons were loaded onto an 8% (w/v) polyacrylamide gel with a 40% to 70% urea gradient (16). An electric potential of 60 V was applied for 16 h in 1x TAE buffer. The gel was stained with 1x SYBR green for 10 minutes and an image was captured with the Typhoon 9410 imaging system. The TIFF image was then analyzed with BioNumerics 5.0 (Applied Maths) using Dice analysis (band presence or absence), with distances determined by Pearson's correlation and clustering by UPGMA.
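Band-based comparison of gel lanes by presence or absence is commonly scored with the Dice coefficient, 2 × (shared bands) / (bands in lane A + bands in lane B). Assuming that this is the analysis meant here, a minimal sketch follows. The band positions are hypothetical labels, and the code does not reproduce how BioNumerics detects or matches bands.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient between two lanes described by their band positions."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty lanes are treated as identical in this sketch
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical band positions (arbitrary labels) for two lanes.
    lane_1 = {12, 27, 33, 48, 61}
    lane_2 = {12, 33, 48, 70}
    print(round(dice_similarity(lane_1, lane_2), 3))
```

A value of 1 means both lanes show exactly the same bands, and 0 means they share none; band intensity plays no part in this score.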

Lipid extractions and Thin Layer Chromatography (TLC). Lipids from 0.2 µm nitrocellulose filters were extracted by a modified Bligh and Dyer (1959) procedure (2). The filters were cut and shaken in chloroform:methanol (2:1) for 4 hours, then equal parts chloroform and sterile, distilled water were added, and the mixture was shaken for another 30 min. Solids were removed by filtration through Whatman no. 1 paper. The filtrate was then moved to a separatory funnel and left for 30 minutes. The organic phase at the bottom of the separatory funnel was then removed and heated to 50 °C under vacuum to remove the chloroform. When the total volume was at or less than 5 ml, the chloroform phase was moved to a pre-weighed glass test tube with a Teflon-lined cap and placed in a 50 °C heat block to dry under a stream of nitrogen. Once the chloroform had evaporated completely, the glass test tube was capped and allowed to return to room temperature before the weight was recorded. Lipids were kept at 4 °C for storage.

The lipids were resuspended in 10 µl of chloroform for every 1 mg of lipid, and 10 µl of sample was then placed onto a Whatman silica 60 Å TLC plate (60 general purpose, 20 x 20 cm, 250 µm, Maidstone, U.K.) for analysis. A solvent of hexane:diethyl ether:acetic acid (90:10:2) served as the mobile phase. After the mobile phase reached 1 cm from the top of the plate, the plate was removed and allowed to dry. The plate was stained with 10% cupric sulfate and 8% phosphoric acid in water, allowed to dry, and charred at 160 °C for 10-15 min until spots were visible on the plate. A 9-lipid standard, prepared as published previously (24), was run concurrently on the TLC plate for lipid identification.

Sensitivity Experiment. A sensitivity experiment determines how much change to the lake water is required in order to detect a change by mid-IR spectroscopy. Lake water from Western Nebraska that was available in the lab was used for this experiment. Bacterial strains of Escherichia coli and Staphylococcus aureus, Gram negative and Gram positive, respectively, were grown in standard Luria-Bertani (LB) broth in shake flasks at 170 rpm. After 24 hours of growth, the optical density in Klett units was determined for each culture. The culture was diluted with sterile LB broth to 100 Klett units and colony forming units (CFU) ml⁻¹ were determined on triplicate plates of DIFCO Laboratories Bacto® Plate Count Agar. Dilutions of the media were then prepared from 1:10 down to 1:10⁸. One milliliter of diluted media was then added to 19 ml of lake water. The total 20 ml was passed through a 20-25 µm prefilter, then passed through a PVDF or nylon filter membrane. Three replicates of each filter membrane were prepared and dried in vacuo, and three instrumental replicates were recorded from each membrane. PVDF and nylon spectra were cut and merged; then, the three instrument replicates were averaged to represent one sample. Dilutions at 10⁻¹, 10⁻⁶, 10⁻⁷, and 10⁻⁸ were then analyzed by mid-IR spectroscopy. As a control, 19 ml of lake water with 1 ml of sterile Luria-Bertani broth was passed through a PVDF and nylon filter.
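The spiking arithmetic described above can be sketched in a few lines. This is only an illustration: the function name and the stock titre of 2 x 10⁸ CFU ml⁻¹ are hypothetical and were not reported for this experiment; only the 1 ml spike, the 19 ml of lake water, and the dilution exponents come from the text.

```python
def spiked_cfu_per_ml(stock_cfu_per_ml, dilution_exponent, spike_ml=1.0, lake_ml=19.0):
    """Return CFU per ml in the final mixture.

    stock_cfu_per_ml   titre of the undiluted culture (hypothetical input)
    dilution_exponent  k for a 10^-k dilution of that culture
    spike_ml, lake_ml  volumes mixed before filtration (1 ml + 19 ml in the text)
    """
    diluted = stock_cfu_per_ml * 10.0 ** (-dilution_exponent)
    cells_added = diluted * spike_ml
    return cells_added / (spike_ml + lake_ml)


if __name__ == "__main__":
    stock = 2.0e8  # hypothetical titre, chosen only for illustration
    for k in (1, 6, 7, 8):
        print(f"10^-{k}: {spiked_cfu_per_ml(stock, k):.3g} CFU per ml in 20 ml")
```

With the assumed titre, the 10⁻⁸ dilution contributes about 2 cells to the 20 ml sample, which is the order of magnitude discussed in the Results.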

Results

Selection of capture membrane filters. Total counts of bacterial cells within planktonic communities will range from a few cells per milliliter to more than 1 x 10⁷. The more dilute samples will require a concentration step in order to be above the detection limit for infrared spectroscopy. Concentration of cells onto the surface of a membrane filter is a logical approach.

Attenuated total reflectance is based on the principle of an evanescent wave penetrating the surface of a dried film. The variables for evanescent wave penetration depth include frequency and index of refraction. Assuming a homogeneously mixed surface, the index of refraction should be the same across the filter surface. Wavenumbers below 750 cm⁻¹ are not used in ATR experiments to ensure adequate penetration depths. The first step in developing the method was to identify filter matrices that do not interfere in spectral regions that contain biological information, i.e. the windows of information as described by Naumann (14). Negative peaks were a significant issue (visible in Fig. 1); they are an unavoidable result of the sample partially blocking signal from the filter that has otherwise been subtracted through the process of blanking the filter. The ideal solution would be a filter chemistry that does not absorb IR in the windows of information. Such a filter does not exist. To this end, glass (GL), nylon (NY), polycarbonate (PC), polyvinylidene fluoride (PVDF), nitrocellulose (NC) and cellulose mixed esters (CE) filters were evaluated, all with an absolute pore size of 0.2 µm (Figure 11). Our solution was to merge spectra from two separate filter chemistries in a process analogous to the split mull method (20).
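The dependence of penetration depth on frequency and refractive index noted above can be illustrated with the standard textbook expression for the evanescent wave, d_p = λ / (2π n₁ √(sin²θ − (n₂/n₁)²)). The sketch below is not taken from this thesis: the crystal index (2.4), the sample index (1.5), and the 45° angle of incidence are illustrative assumptions, as is the function name.

```python
import math


def penetration_depth_um(wavenumber_cm1, n_crystal=2.4, n_sample=1.5, angle_deg=45.0):
    """Evanescent-wave penetration depth in micrometres.

    All default optical parameters are assumptions for illustration only.
    """
    wavelength_um = 1.0e4 / wavenumber_cm1  # 10,000 divided by cm^-1 gives µm
    term = math.sin(math.radians(angle_deg)) ** 2 - (n_sample / n_crystal) ** 2
    if term <= 0.0:
        # Below the critical angle there is no total internal reflection.
        raise ValueError("no total internal reflection for these parameters")
    return wavelength_um / (2.0 * math.pi * n_crystal * math.sqrt(term))


if __name__ == "__main__":
    for wn in (3000.0, 1650.0, 1000.0, 750.0):
        print(f"{wn:6.0f} cm^-1: {penetration_depth_um(wn):.2f} µm")
```

Under these assumed values the depth is roughly 2 µm at 1,000 cm⁻¹ and scales in proportion to wavelength, so it differs several-fold between the two ends of the mid-IR range used here.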

Figure 1 shows the spectral patterns from lake water captured onto PVDF and NY filters, the merger of the two spectra, and a comparison of the merged spectrum to an absorbance spectrum from E. coli, obtained in transmission mode, which represents the maximum amount of information that could be obtained by mid-IR. PVDF membranes provide good signal from 3,000 to 1,275 cm⁻¹, whereas NY membranes provide good signal from 1,275 down to 900 cm⁻¹. Although not quite the complete mid-IR spectrum, windows that contain information are well represented in the merged spectrum.
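The cut-and-merge step can be sketched as follows, assuming each spectrum is a list of (wavenumber, absorbance) pairs. The function names and the data layout are hypothetical; the 3,000, 1,275 and 900 cm⁻¹ boundaries are the ones quoted above, and the vector normalization follows the legend to Figure 1. Baseline correction is omitted, and the instrument software used for the thesis may implement these steps differently.

```python
import math


def merge_spectra(pvdf, nylon, split_cm1=1275.0, high_cm1=3000.0, low_cm1=900.0):
    """Keep PVDF points from high_cm1 down to split_cm1 and nylon points
    below split_cm1 down to low_cm1; return them sorted by descending wavenumber."""
    upper = [(w, a) for w, a in pvdf if split_cm1 <= w <= high_cm1]
    lower = [(w, a) for w, a in nylon if low_cm1 <= w < split_cm1]
    return sorted(upper + lower, key=lambda point: point[0], reverse=True)


def vector_normalize(spectrum):
    """Scale absorbances so that their Euclidean norm equals 1."""
    norm = math.sqrt(sum(a * a for _, a in spectrum))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [(w, a / norm) for w, a in spectrum]


if __name__ == "__main__":
    # Toy values only; real spectra contain hundreds of points.
    pvdf = [(3000.0, 0.1), (2000.0, 0.2), (1275.0, 0.3), (1000.0, -0.5)]
    nylon = [(3000.0, -0.4), (1275.0, 0.9), (1200.0, 0.4), (900.0, 0.5)]
    print(vector_normalize(merge_spectra(pvdf, nylon)))
```

Taking the split point itself from the PVDF spectrum is an arbitrary choice made here so that no wavenumber appears twice in the merged result.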

Biological evaluation of prefilter selection. Lake water contains “other” particles including and suspended particulates that may or may not be of biological origin. The primary interest of our method is to evaluate planktonic bacterial communities and these other particles reduce and interfere with the bacterial signal. Prefiltering is one method of removing this interfering signal. Although a prefilter step may enhance the bacterial signal, it may also remove filamentous, clumping, or encapsulated bacteria from the sample. Thus, particle passage through prefilters was evaluated by several methods.

Figure 2 shows scanning electron micrographs of 0.2 µm NC filters after drying in vacuo. Diatoms and large cells are visible in these images. Diatom counts decrease with decreasing prefilter pore size (Fig. 2, A-D and Fig. 3). Retention of soft-bodied cell morphology requires critical point drying through a series of dehydrating solvents such as ethanol (4). The films formed after drying in vacuo are amorphous. Pure cultures of Vibrio metschnikovii form a similar amorphous film when captured onto NC membrane (Fig. 2G) whereas rods from an environmental isolate closely related to Bacillus hemicellulosilyticus form distinct rod shapes (Fig. 2H). Figure 2E shows that very little material is trapped onto the 0.2 µm capture filter after passage through a 1.2 µm prefilter. Additionally, a 1.2 µm prefilter was found to be impractical because it rapidly fouled and filtration times were in excess of 1.5 hours to filter 250 ml of water.

Figure 4 shows the pattern of 16S rRNA amplicons obtained by DGGE analysis, confirming that the films from prefiltered water do contain bacteria. Cluster analysis indicates that prefilter selection results in a presence/absence banding pattern of no worse than 80% similarity between samples. There is little difference in the bacterial community of each prefilter. The 2.5 µm prefilter treatment was most similar to the unfiltered, and presumably complete, bacterial community. Bands are clearly absent in regions of the gel for the unfiltered water sample, such as near the higher end of the denaturant concentration, suggesting that the 20 and 8 µm prefilters enhanced gene amplification. In addition to DNA based studies, lipids provide biological information as to the population types captured on membranes. Bacteria, with the exception of mycoplasmas, do not contain sterols and therefore the presence of sterol bands on thin layer chromatography is indicative of eukaryotes. Figure 5 shows that sterols are present in every sample extracted from the 0.2 µm filter regardless of prefilter pore size. Thus, none of the prefilters used seemed to exclude eukaryotes.

Instrumental based selection of prefilter. The signal-to-noise ratio (SNR) is a limiting factor for any instrumental method. A significantly higher SNR was recorded for the 20 – 25 µm prefilter compared to lake water or either the 8 or 2.5 µm prefilter (Fig. 6). Important, too, is the distance between spectra when there is a compositional difference between samples. The spectral distance, or D-value, is a calculation of the nonoverlapping area between two spectra, bound by the start and end wavenumbers. As mentioned previously, five regions of the mid-IR spectrum contain information about general chemical compounds within a cell. Windows 1 through 4 were included in the distance calculation: 3,040 – 2,800 cm⁻¹ and 1,800 – 900 cm⁻¹ (Fig. 1). Other regions of the spectrum contain no useful information and were excluded.

Reproducible spectra with low inherent noise between replicates will provide better statistical power when comparing spectra collected from unique samples. An estimate of inherent noise within these regions is the D-value calculated from replicate samples. Six replicate spectra were collected from unfiltered and prefiltered lake water. The mean of D-values within replicates and their standard deviations were calculated (Figure 7). Passage through a 20 – 25 µm prefilter resulted in the lowest within sample mean D-value.
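The within-sample D-value summary can be sketched as below. The text describes the D-value only as the nonoverlapping area between two spectra, so the trapezoid approximation of that area, the function names, and the list-based data layout are all assumptions made for illustration; the instrument software's exact formula is not given.

```python
from itertools import combinations
from statistics import mean, stdev


def d_value(spec_a, spec_b, wavenumbers):
    """Approximate the area between two spectra sampled at the same
    wavenumbers, using the trapezoid rule on the absolute difference."""
    area = 0.0
    for i in range(len(wavenumbers) - 1):
        width = abs(wavenumbers[i + 1] - wavenumbers[i])
        left = abs(spec_a[i] - spec_b[i])
        right = abs(spec_a[i + 1] - spec_b[i + 1])
        area += 0.5 * (left + right) * width
    return area


def within_sample_d_values(replicates, wavenumbers):
    """D-values for every unordered pair of replicate spectra."""
    return [d_value(a, b, wavenumbers) for a, b in combinations(replicates, 2)]


if __name__ == "__main__":
    wavenumbers = [900.0, 950.0, 1000.0]
    # Six toy replicates, each offset slightly from the previous one.
    replicates = [[0.01 * k, 0.02 * k, 0.01 * k] for k in range(6)]
    values = within_sample_d_values(replicates, wavenumbers)
    print(len(values), mean(values), stdev(values))
```

Six replicates give 15 pairwise D-values, which matches the count stated in the legend to Figure 7.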

The final step for evaluating differences in mid-IR spectra between samples is to establish a cluster method. Here we used a heuristic approach (based on the following five assumptions and analysis of each window, Fig. 10). The resulting dendrogram is shown in Figure 8.

1. Pretreatment has an effect on the sample,

2. Within treatment spectral distances should be minimal,

3. Between treatments spectral distances should be maximal,

4. Distances within windows are not equal, and

5. Spectral distance between unfiltered and prefiltered lake water should increase with decreasing pore size.

Windows, window weighting, spectral preprocessing, and clustering algorithm are provided in the legend to Figure 8.
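Assumption 4, that distances within windows are not equal, amounts to weighting each window before spectra are compared. A minimal sketch is given below, using the window boundaries from Figure 10 and the weighting factors from the legend to Figure 8. Combining the per-window areas as a weighted mean is an assumption, since the text does not state how the software combines them, and the clustering step itself (Ward's algorithm) is not reproduced here.

```python
# (low cm^-1, high cm^-1, weight); boundaries from Figure 10, weights from Figure 8
WINDOWS = {
    "window_1": (2800.0, 3040.0, 2.0),
    "window_2": (1500.0, 1800.0, 3.0),
    "window_3": (1200.0, 1500.0, 1.0),
    "window_4": (900.0, 1200.0, 1.0),
}


def window_area(wavenumbers, spec_a, spec_b, low, high):
    """Trapezoid area between two spectra, using only adjacent point
    pairs that both lie inside the window [low, high]."""
    area = 0.0
    for i in range(len(wavenumbers) - 1):
        w0, w1 = wavenumbers[i], wavenumbers[i + 1]
        if low <= w0 <= high and low <= w1 <= high:
            gap0 = abs(spec_a[i] - spec_b[i])
            gap1 = abs(spec_a[i + 1] - spec_b[i + 1])
            area += 0.5 * (gap0 + gap1) * abs(w1 - w0)
    return area


def weighted_distance(wavenumbers, spec_a, spec_b, windows=WINDOWS):
    """Weighted mean of the per-window areas (an assumed combination rule)."""
    total_weight = sum(weight for _, _, weight in windows.values())
    weighted_sum = sum(
        weight * window_area(wavenumbers, spec_a, spec_b, low, high)
        for low, high, weight in windows.values()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    wn = [1500.0, 1600.0, 1700.0, 1800.0, 2800.0, 2900.0]
    a = [0.0] * 6
    b = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
    print(weighted_distance(wn, a, b))
```

A pairwise matrix of such distances could then be passed to any hierarchical clustering routine to produce a dendrogram comparable in form to Figure 8.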

Sensitivity of spectra to changes in bacterial composition. The ability to detect small changes in the bacterial community was estimated by addition of a known number of viable Escherichia coli or Staphylococcus aureus cells to prefiltered lake water. One milliliter of a series of diluted cultures was added to 19 ml of prefiltered lake water and another milliliter of each dilution was plated onto tryptic soy agar plates to obtain the number of colony forming units (CFU) ml⁻¹. These spiked samples were then filtered through 0.2 µm NY and PVDF membranes, spectra were obtained, and the spectra were analyzed by cluster analysis. Sensitivity to the change is reflected in the distance between the spiked and control samples (Fig. 9). In both cases 1 CFU ml⁻¹ was detected, and for S. aureus approximately 2 cells in 20 ml (0.1 CFU ml⁻¹) was sufficient for a difference in spectral distance to be detected.

Discussion

Attenuated total reflectance mid-IR successfully imaged a spectral pattern of suspended planktonic particulates from a lake water column. Organic molecules absorb energy in the mid-IR spectral region and spectra from whole cells are a fingerprint formed from all IR active organic molecules (7). Here we extended the concept of fingerprinting whole cells to planktonic microbial communities targeting the bacterial populations. Mid-IR as a method to profile microbial communities is not a novel concept. Scullion et al. (19) demonstrated how mid-IR spectroscopy can be used to profile global changes in earthworm cast communities. Davis et al. (5) applied mid-IR ATR to detecting Salmonella enterica serovars as members of a mixed bacterial community contaminating the surface of chicken breasts. Here we extend the concept of mid-IR fingerprints to lake water planktonic microbial communities.

As with axenic cultures, the mid-IR fingerprints represent all molecules of biotic origin within the community, i.e., they are a measure of cellular function. There are several advantages to using mid-IR fingerprinting. The method is fast. Spectra are typically the sum of 64 to 128 individual scans and the total time needed for collection is about 2 minutes for 128 scans. Six or more spectra can be obtained from a single 25 mm diameter filter; thus, assuming the sample is homogeneously mixed, multiple replicates can be obtained from a single filter. The software for most modern Fourier transform IR spectrometers includes programs for multivariate statistical comparison between samples, or the data can be downloaded in spreadsheet format for analysis by a stand-alone statistical program. This means that once the methods for comparing spectra are established, little computational expertise is required for evaluating spectra. Sampling in the field is also straightforward. Filtration concentrates suspended particles and dehydration quenches biochemical activity, thus preserving the biochemical state. A mid-IR spectrum reflects the in situ functional profile of a sample, in this case the planktonic microbial community. In this study we addressed the feasibility of using mid-IR ATR as suggested. Pressing concerns included being able to detect signal, determining if the signal was, or at least included, chemical signatures from the bacterial community, and sensitivity of the signal.

A significant advantage to using filter membranes is that particles in dilute suspension are concentrated. Our goal was to load the filters to a point just before they plugged. This allowed for inclusion of a wash step with organic-free water to reduce the amount of dissolved chemicals, such as salts or dissolved organic carbon, trapped within the film. Filter membrane chemical composition was problematic due to absorption in the spectral areas of interest. Removing background signal by a blanking process reduces matrix effects, but there are still effects reflected as negative peaks within regions where the membranes absorb IR energy (Figure 12). This is due to penetration depth of the evanescent wave and thickness of the dried sample film. To overcome this problem two membrane chemistries were used, NY and PVDF, because they have complementary transparency windows in the IR spectrum. Merged spectra were a reasonable facsimile of mid-IR absorbance spectra in transmission mode for three windows: the 3,000 – 2,800 cm⁻¹ lipid window, the 1,800 – 1,500 cm⁻¹ amide window, and the 1,200 – 900 cm⁻¹ polysaccharide window. Unfortunately, information does appear to be reduced in the 1,500 – 1,200 cm⁻¹ phospholipid-nucleotide window. However, the information that remains within this spectral region is still discriminative. The addition of as few as two S. aureus cells in 19 mL of lake water was sufficient to detect a change in the spectra. These shifts amplified with increasing numbers of spiked cells as indicated in the larger cluster distance between dilution samples.

An important question to be answered is: of what biological entities is the spectrum a signature? The objective of the work is for spectra to reflect bacterial populations. Spectra are truly an absorption pattern for all covalent bonds trapped on the membrane. The strongest statement that can currently be made is that bacteria are contributing to the mid-IR signal, based on DGGE analysis which clearly shows that Eubacterial rRNA gene sequences are amplified from total DNA extracted off the surface of the capture membranes. The use of a prefilter to remove or reduce large particulates from a water sample is common practice, yet prefilter selection appears to be arbitrary. For example, Andersson et al. (1) filtered water collected from the Baltic Sea through a 3.0 µm prefilter, Bodaker et al. (3) made no mention of a prefilter, and Venter et al. (21) mention a 0.8 µm prefilter in series with a 0.2 µm capture filter. Prefilter selection is critical because, regardless of pore size, some eukaryotes will pass through and some prokaryotes will be retained by the prefilter. Although oligonucleotide primers preferentially target a particular community, thus reducing or eliminating signal from non-targeted populations, the prefilter could also impose a change in the population and, consequently, comparisons cannot be made between samples that are not treated identically. The DGGE data show little difference in band counts: the difference between the highest and lowest counts was 4 out of a maximum of 40 bands detected. A possible explanation for this is that eukaryotic DNA dilutes the abundance of 16S rRNA gene targets. The TLC assay for sterols is the best evidence that prefilters, within the pore size range evaluated, do pass eukaryotes. What then should be the overriding criterion for prefilter pore size selection? Our belief is signal quality.

The 20 – 25 µm pore size gave the best signal quality as determined from the SNR and spectral reproducibility. The importance of a high SNR to an instrumental method is self-evident. The importance of spectral reproducibility is a function of variance and statistical power. A normally distributed sample from a population has a variance, and large variances increase the minimum distance between two means that can be declared significantly different. The difference between spectra is based on area, which can be compared as a parametric statistic. Thus, reducing within sample spectral variance increases the power with respect to declaring two unique samples to be the same or different. Why the 20 – 25 µm prefilter had the highest SNR as well as the least within sample spectral variance is not clear. Our supposition, based on Figure 2, is that cells which passed through the 20-25 µm prefilter, but were excluded by smaller prefilters, provided a stronger absorbance and coated the filter more uniformly.

At present we conclude that ATR mid-IR in combination with the concentration of suspended particulates onto 0.2 µm capture filters can detect the signal of a microbial community in lake water. The signal should be sufficient to discriminate between samples which are spatially or temporally distinct. Testing this hypothesis is a logical next step. Much of the signal is from bacterial populations within the community but signal from inert organic particulates and eukaryotes must also be present. The method as presented detects signal from members of the microbial community which are less than 25 µm but greater than 0.2 µm in diameter. The power of mid-IR spectral information is to identify differences between samples and to suggest within what general class of molecules those changes are most significant. The biological relevancy of these shifts can only be determined by application of other methods such as correlating spectral patterns to community composition as determined by DGGE, 16S pyrotag deep sequencing, or fatty acid methyl ester profiles. Of value, too, would be qualitative and quantitative measures of specific metabolites using mass spectrometry based methods. Thus, a long-term goal of this research is to ascribe biological meaning to mid-IR patterns.

Acknowledgements

The authors are indebted to the Russell T. Hill Lab at the Center of Marine Biology (COMB), Baltimore, Maryland and the Jens Walter Lab at the University of Nebraska – Lincoln for assistance with DGGE and analysis; and Han Chen, PhD at the Morrison Microscopy Core Research Facility, Center for Biotechnology, University of Nebraska – Lincoln.

References

1. Andersson, A.F., Riemann, L., and Bertilsson, S. 2010. Pyrosequencing reveals contrasting seasonal dynamics of taxa within Baltic Sea bacterioplankton communities: Bacterioplankton dynamics in the Baltic Sea. The ISME Journal, 4: 171-181.

2. Bligh, E.G. and Dyer, W.J. 1959. A rapid method of total lipid extraction and purification. Canadian Journal of Biochemistry and Physiology, 37: 911-917.

3. Bodaker, I., Sharon, I., Suzuki, M.T., Feingersch, R., Shmoish, M., Andreishcheva, E., Sogin, M.L., Rosenberg, M., Maguire, M.E., Belkin, S., Oren, A., and Beja, O. 2010. Comparative community genomics in the Dead Sea, an increasingly extreme environment. The ISME Journal, 4: 399-407.

4. Bowden, W.B. 1977. Comparison of Two Direct-Count Techniques for Enumerating Aquatic Bacteria. Applied and Environmental Microbiology, 33: 1229-1232.

5. Davis, R., Burgula, Y., Deering, A., Irudayaraj, J., Reuhs, B.L., and Mauer, L.J. 2010. Detection and differentiation of live and heat-treated Salmonella enterica serovars inoculated onto chicken breast using Fourier transform infrared (FT-IR) spectroscopy. J. of Applied Microbiology, IN PRESS.

6. Domenighini, A. and Giordano, M. 2009. Fourier transform infrared spectroscopy of microalgae as a novel tool for biodiversity studies, species identification, and the assessment of water quality. J. of Phycology, 45: 522-531.

7. Dunn, W.B. and Ellis, D.I. 2005. Metabolomics: Current analytical platforms and methodologies. Trends in Analytical Chemistry, 24: 285-294.

8. Fischer, G., Braun, S., Thissen, R., and Dott, W. 2004. FT-IR spectroscopy as a tool for rapid identification and intra-species characterization of airborne filamentous fungi. J. of Microbiological Methods, 64: 63-77.

9. Fisher, M.W. and Triplett, E.W. 1999. Automated Approach for Ribosomal Intergenic Spacer Analysis of Microbial Diversity and Its Application to Freshwater Bacterial Communities. Applied and Environmental Microbiology, 65: 4630-4636.

10. Gilbert, J.A., Dawn, F., Huang, Y., Edwards, R., Weizhong, L., Gilna, P., and Joint, I. 2008. Detection of Large Numbers of Novel Sequences in the Metatranscriptomes of Complex Marine Microbial Communities. PLoS ONE, 3: e3042.

11. Helm, D., Naumann, D., Labischinski, H., and Schallehn, G. 1991. Elaboration of a procedure for identification of bacteria using Fourier-transform infrared spectral libraries: A stepwise correlation approach. J. of Microbiological Methods, 14: 127-142.

12. Heng, H. H.Q. and Tsue, L.C. 1993. Modes of DAPI banding and simultaneous in situ hybridization. Chromosoma, 102: 325-332.

13. Muyzer, G., De Waal, E.C., and Uitterlinden, A.G. 1993. Profiling of Complex Microbial Populations by Denaturing Gradient Gel Electrophoresis Analysis of Polymerase Chain Reaction-Amplified Genes Coding for 16S rRNA. Applied and Environmental Microbiology, 59: 695-700.

14. Naumann, D., Helm, D., Labischinski, H., and Giesbrecht, P. 1991. The characterization of microorganisms by Fourier-Transform infrared spectroscopy (FT-IR). Modern Techniques for Rapid Microbiological Analysis. Edited by Nelson, W.H., New York: VCH Publisher, 43-96.

15. Poretsky, R.S., Bano, N., Buchan, A., LeCleir, G., Kleikemper, J., Pickering, M., Pate, W. M., Moran, M.A., and Hollibaugh, J. T. 2005. Analysis of Microbial Gene Transcripts in Environmental Samples. Applied and Environmental Microbiology, 71: 4121-4126.

16. Radwan, Hanora, A., Zan, J., Mohamed, N. M., Abo-Elmatty, D. M., Abou-El-Ela, S. H., and Hill, R. T. 2010. Bacterial Community Analyses of Two Red Sea Sponges. Marine Biotechnology, 12: 350-360.

17. Raes, J. and Bork, P. 2008. Molecular eco-systems biology: towards an understanding of community function. Nature Reviews Microbiology, 6: 693-699.

18. Robinson, C.J., Bohannan, B. J.M., and Young, V. B. 2010. From Structure to Function: the Ecology of Host-Associated Microbial Communities. Microbiology and Molecular Biology Reviews, 74: 453-476.

19. Scullion, J., Goodacre, R., Huang, W. E., Elliott, G. N., Worgan, H., Bailey, M. J., Draper, J., Gwynn-Jones, D., Griffith, G. W., Winson, M. K., and Clegg, C. 2003. Use of earthworm casts to validate FT-IR spectroscopy as a 'sentinal' technology for high-throughput monitoring of global changes in microbial ecology. Pedobiologia, 47: 440-446.

20. Smith, B.C. 1996. Fundamentals of Fourier Transform Infrared Spectroscopy. Boca Raton: CRC Press, 1996.

21. Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J., Wu, D., Paulsen, I., Nelson, K.E., Nelson, W., Fouts, D.E., Levy, S., Knap, A.H., Lomas, M.W., Nealson, K., White, O., Peterson, J., Hoffman, J., Parsons, R., Baden-Tilson, H., Pfannkoch, C., Rogers, Y.H., and Smith, H.O. 2004. Environmental Genome Shotgun Sequencing of the Sargasso Sea. Science, 304: 66-74.

22. Ward, J.H. 1963. Hierarchical Grouping to Optimize an Objective Function. J. of the American Statistical Association, 58: 236-244.

23. Wilmes P. and Bond, P.L. 2004. The Application of two-dimensional polyacrylamide gel electrophoresis and downstream analyses to a mixed community of prokaryotic microorganisms. Environmental Microbiology, 6: 911-920.

24. Zbasnik, R., Carr, T., Weller, C., Hwang, K. T., Wang, L., Cuppett, S., Schlegel, V. 2009. Antiproliferation Properties of Grain Sorghum Dry Distiller's Grain Lipids in Caco-2 Cells. J. of Agricultural and Food Chemistry, 57: 10435-10441.

Figure 1. Mid-IR spectra of biological samples on PVDF and nylon filters recorded by attenuated total reflectance (ATR) using a clean filter as a blank leaves areas masked by the strong absorbance of the filters themselves. Overlaying PVDF and nylon spectra with one another shows positive signal in windows 1 and 2 from PVDF and windows 3 and 4 from nylon. Merging two spectra, then performing baseline correction and vector normalization on the merged spectra, in comparison to spectra obtained in transmission mode, reveals information in all four windows by either method.


Figure 2. Scanning electron micrographs (SEMs) of 0.2 µm nitrocellulose filters. A) 20-25 µm prefiltered lake water, [x500]. B) 11 µm prefiltered lake water, [x500]. C) 8 µm prefiltered lake water, [x500]. D) 2.5 µm prefiltered lake water, [x1200]. E) 1.2 µm prefiltered lake water, [x1200]. F) Clean, 0.2 µm nitrocellulose filter, [x2500]. G) Gram negative bacteria, axenic culture scraped from agar plates, suspended in sterile, Nanopure® water, captured onto a filter, [x2500]. Individual cells are not distinguishable. On the right side of the SEM image is a tear in the filter due to SEM preparation. H) Gram positive bacteria, prepared identically as in G, [x4000].


Figure 3. Count of objects on SEM of 0.2 µm capture filters greater than 5 µm in diameter (two standard deviations depicted by error bars). Three fields-of-view at a magnification of x500 under SEM from one filter were averaged.


Figure 4. Denaturing Gradient Gel Electrophoresis (DGGE) of 16S rRNA amplicons, extracted from 0.2 µm filter membranes. Communities were analyzed by presence or absence of bands in prefiltered lake water. Band counts were 40 for both the 20-25 µm and 8 µm prefiltered samples, 38 for the 2.5 µm prefiltered sample, and 36 for the unfiltered sample. Cluster analysis generated distances by Pearson's correlation and clustered according to UPGMA.


Figure 5. Thin layer chromatography (TLC) of lipids extracted from 0.2 µm nitrocellulose filters of unfiltered lake water, 20-25 µm, 8 µm, and 2.5 µm prefiltered lake water. A 9-lipid standard provides identification of the different lipid types on the silica TLC plate.


Figure 6. Signal-to-noise ratios (SNR) of mid-IR spectra recorded by attenuated total reflectance (ATR). Six replicates of each sample. Analysis of variance (ANOVA) indicates there is a statistically significant difference between samples. A means plot illustrates that the 20-25 µm prefilter sample has a significantly higher SNR than other samples.


Figure 7. Within replicate mean D-values of first derivative spectra, from wavenumbers 3040-2800 and 1800-900 cm⁻¹, represent the within sample variance. Comparing six replicates provides 15 D-values. Error bars indicate two standard deviations.


Figure 8. Optimal clustering analysis to minimize within heterogeneity and maximize between heterogeneity. Cluster analysis is based on D-values (Helm, 1999) of first derivative spectra, clustered by Ward's algorithm. Cluster analysis was executed over wavenumbers in the following windows with weighting factors in brackets: Window 2 [3], Window 1 [2], and Windows 3 and 4 [1].


Figure 9. Dendrograms of first derivative spectra over wavenumbers 3040-2800 and 1800-1000 cm⁻¹ were clustered by Ward's algorithm. Each point on the dendrogram is an average spectrum of one sample. Clear clustering by number of colony forming units added to lake water indicates an ability to discriminate by amount of change to the system.


Figure 10. First derivative spectra of each window. All cluster analysis was based on D-values of first derivative spectra and clustered by Ward's algorithm. A. Window 1, the fatty acid window (wavenumbers 3040-2800 cm⁻¹). Cluster analysis accurately distinguishes 20-25 µm prefiltered lake water and unfiltered lake water. 8 and 2.5 µm prefiltered lake water did not cluster clearly. B. Window 2, the amide window (wavenumbers 1800-1500 cm⁻¹). All samples were clearly identified in this window. Also, similarities were identified as expected: unfiltered lake water is most similar to 20-25 µm prefiltered lake water, followed by 8 µm prefiltered lake water, and 2.5 µm prefiltered lake water is most dissimilar. C. Window 3, a mixed window due to absorbance by phospholipids and nucleotides (wavenumbers 1500-1200 cm⁻¹). Cluster analysis accurately distinguished 20-25 µm prefilter and 8 µm prefilter. There was some confusion between the unfiltered and 2.5 µm prefiltered lake water. There was a large amount of heterogeneity between the two pairs: pair-1) unfiltered and 20-25 µm prefiltered lake water; and pair-2) 8 and 2.5 µm prefiltered lake water. D. Window 4, the polysaccharide window (wavenumbers 1200-900 cm⁻¹). Cluster analysis distinguished the 8 µm prefiltered lake water accurately, but all others were confused.


Figure 11. Mid-IR spectra of filter chemistry types. Glass and cellulose nitrate filters were also examined (data not shown). Each filter chemistry absorbs in at least one window of information, which caused negative regions in the spectrum when the blank was subtracted from the sample.


Figure 12. If the filter absorbs at 1650 cm⁻¹, the absorbance of the virgin filter at 1650 cm⁻¹ will be greater than the absorbance at 1650 cm⁻¹ of the filter carrying the sample, because the path distance into the filter matrix has now decreased. The equation that determines the final spectrum is: Final spectrum = sample filter – blank.


Chapter 3: A Biochemical Fingerprint by Mid-IR Spectroscopy of the Microbial Community from Six Very Different, Nebraska Sandhills' Lakes

Abstract. The objective of this study was to verify that a mid-IR spectroscopy method of capturing the in situ function of a microbial community at a given moment in time can discriminate between different lakes. In this study, physico-chemical characteristics of six lakes were determined, structural information of the microbial community was described by pyrosequencing data, and mid-IR ATR spectra were obtained by concentrating the microbial community onto PVDF and nylon, 0.2 micron filter membranes. The results verify that the method is capable of capturing a discriminating biochemical signature of a lake microbial community. A tool is now available to compare the total biochemical activity of microbial communities at a given time-point.

Introduction

Microbial community function is the sum total of all biochemical processes. Although easy to conceptualize, the quantitative or qualitative measurement of function in situ is difficult.

Important questions in microbial community function include: what metabolites are present in the community, how do we compare the function of two communities or of one community over time, and how do environmental factors influence community function? In this article, we look at a unique collection of interdunal lakes in the Nebraska Sandhills to verify the discriminative power of a method that captures the in situ function of a microbial community at a given point in time by mid-IR spectroscopy.

The Nebraska Alkaline Lake ecosystem is unique because lakes with discrete chemistries are located in close proximity to one another (8, 11), as close as 500 meters. Because of this unique juxtaposition, one can assume the lakes are inoculated with the same microorganisms, over the same time period, from waterfowl, cattle, rodents, groundwater discharge, and runoff from the grass-covered sand dunes. The differences in lake chemistry can be explained by the flow of groundwater into and out of the lakes, the rate of evapotranspiration, and rainfall (3, 4, 8). The currently accepted paradigm is that environmental factors steer community structure by exerting a selective force. If, however, there is no relationship between water chemistry and microbial structure, an alternative hypothesis must be sought to explain differences in community structure. Far and away the most common method for measuring community structure is by 16S rRNA based phylogeny. This approach assumes that structure equates to function. Horizontal gene transfer or environmental influences such as alkalinity may cause biochemical variations within species, meaning that the microbial function of a community will not follow community structure.

Recently a method of capturing the in situ function of a planktonic microbial community by mid-IR spectroscopy has been developed (Chapter 2 of this thesis). By concentrating the microbial community onto a membrane filter of 0.2 µm pore size, an attenuated total reflectance (ATR) mid-IR spectrum of the microbial community is attainable. The resulting spectrum serves as a whole-cell, biochemical fingerprint of the community which can be compared to other communities or over time. The biochemical fingerprint can also be compared to the water chemistries of each lake and the community structure as identified by 16S rRNA pyrosequencing of filters prepared in the same manner.

Materials and Methods

Sampling. Six lakes from the Crescent Lake National Wildlife Refuge (CLNWR) in Western Nebraska were sampled: Border Lake, Kokjohn Lake, Perrin Lake, Smith Lake, Tree Claim Lake, and Windmill Lake. Each lake was sampled at five different locations, spread across the lake to provide a horizontal cross sectional sample, such that the average and standard deviation should be representative of the entire lake. Water was bottled in 1-Liter plastic, sterile sample bottles by immersion no deeper than 20 cm from the surface.

Water Processing. Lake water was brought to a centralized location and processed within two hours in the University of Nebraska-Kearney Mobile Environmental Laboratory (MEL). All water was prefiltered through 20-25 µm pore-sized filters (Whatman® 541 Filter Papers). The resulting microbial community was then captured on nitrocellulose filter membranes (Whatman® Cellulose Nitrate Membrane Filters, 0.2 µm, 47 mm diameter) for eventual DNA extraction, and on 25 mm polyvinylidene fluoride (PVDF) filter membranes (Millipore® Durapore® Membrane Filters, 0.22 µm GV) and 25 mm nylon filter membranes (Millipore® Nylon Membrane, 0.2 µm GNWP) for eventual analysis by mid-IR ATR. All capture filters were rinsed with 5 mL sterile ddH2O and then dried in a lyophilizer for 15 min before being stored with desiccant until analyzed.

Pyrosequencing. Nitrocellulose filters were cut into small pieces and placed into a deep-well, 96-sample holder plate. Cell lysis and DNA extraction were conducted on a QIAGEN® BioSprint® 96 One-For-All Vet robotic cell lysis instrument; buffers and procedures followed the QIAGEN® VET-100 protocol, modified to include 3 cycles of bead-beating in a Retsch™ TissueLyser II (QIAGEN®) for 4 min each cycle. DNA was extracted and then amplified using primers with fused 8-nt barcodes (designated as 'B') targeting the V1 and V2 regions of the 16S rRNA genes: A-8FM (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAGBBBBBBBBAGAGTTTGATCMTGGCTCAG-3') and B-357R (5'-CCTATCCCCTGTGTGCCTTGGCAGTCTCAGBBBBBBBBCTGCTGCCTYCCGTA-3'). After amplification, DNA was quantified by gel electrophoresis with PicoGreen (Invitrogen) based on the intensity of each band as measured by a Qubit Fluorometer (Invitrogen). Equivalent concentrations were then amplified in the Roche-454 GS FLX and sequenced in Titanium sequencing adapters.
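The barcode-and-primer layout described above lends itself to a short illustration of how reads might be assigned to samples. The following is a minimal sketch, not the pipeline used in this study. It assumes each read begins with the 8-nt barcode followed directly by the 16S-specific portion of primer A-8FM, and it expands the degenerate base M to A or C (Y, which occurs in B-357R, is C or T). The function names and the barcode-to-sample map are hypothetical.

```python
import re

# IUPAC codes needed for the 16S-specific primer portions given in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "[AC]", "Y": "[CT]"}

# 16S-specific 3' portion of forward primer A-8FM (after the 8 barcode positions).
FORWARD_16S = "AGAGTTTGATCMTGGCTCAG"
BARCODE_LENGTH = 8


def primer_to_regex(primer):
    """Translate a primer containing IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in primer)


# Assumed read layout: barcode, then 16S-specific primer, then the amplified insert.
READ_PATTERN = re.compile(
    "^([ACGT]{%d})(%s)" % (BARCODE_LENGTH, primer_to_regex(FORWARD_16S))
)


def split_read(read):
    """Return (barcode, insert), or None if barcode or primer is incomplete."""
    read = read.upper()
    match = READ_PATTERN.match(read)
    if match is None:
        return None
    return match.group(1), read[match.end():]


def demultiplex(reads, barcode_to_sample):
    """Group primer-trimmed inserts by sample; unknown barcodes are dropped."""
    by_sample = {}
    for read in reads:
        parts = split_read(read)
        if parts is None:
            continue
        barcode, insert = parts
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample.setdefault(sample, []).append(insert)
    return by_sample
```

A read that lacks a complete barcode and forward primer returns `None` from `split_read`, which corresponds to the first exclusion criterion listed in the quality checks.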

Sequences obtained were checked for quality and eliminated based on the following criteria: (1) absence of a complete forward primer and barcode; (2) two or more interruptions in the sequential flow of one read; (3) any sequence length less than 200 nt or greater than 500 nt; and (4) average quality score less than or equal to 20 (2). Primer sequences were removed from sequences, which were then sorted based on barcode. Sequences were then classified into known taxonomic categories with the Ribosomal Database Project (RDP) CLASSIFIER. After removal of rare species (less than 0.1% frequency), taxonomic data by genus were clustered by Ward's algorithm based on Euclidean distances in Statgraphics® Centurion version XVI.
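The four exclusion criteria and the rare-taxa cutoff can be written as two small functions. This is a sketch under stated assumptions: the `Read` record and its field names are hypothetical summaries of a read, and renormalizing percentages after rare taxa are dropped is one plausible reading of "percent abundance after removal of the rare species", not a confirmed detail of the analysis.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Read:
    """Hypothetical per-read summary; field names are illustrative only."""
    has_forward_primer_and_barcode: bool
    flow_interruptions: int
    length_nt: int
    mean_quality: float


def passes_quality(read: Read) -> bool:
    """Apply criteria (1)-(4) from the text; failing any one discards the read."""
    if not read.has_forward_primer_and_barcode:        # criterion 1
        return False
    if read.flow_interruptions >= 2:                   # criterion 2
        return False
    if read.length_nt < 200 or read.length_nt > 500:   # criterion 3
        return False
    if read.mean_quality <= 20:                        # criterion 4
        return False
    return True


def percent_abundance(counts: Dict[str, int],
                      rare_cutoff_percent: float = 0.1) -> Dict[str, float]:
    """Drop taxa below the cutoff, then express the rest as percentages.

    Assumption: percentages are renormalized over the retained taxa.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    kept = {taxon: n for taxon, n in counts.items()
            if 100.0 * n / total >= rare_cutoff_percent}
    kept_total = sum(kept.values())
    return {taxon: 100.0 * n / kept_total for taxon, n in kept.items()}
```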

Mid-IR Spectroscopy. Capture filters (0.2 µm) were analyzed with the Bruker Instruments Equinox 55 FT-IR spectrometer with a deuterated triglycine sulfate detector and a Pike Technologies MIRacle® single reflectance attachment with a ZnSe internal reflectance element (IRE). Spectra were recorded from 4,000 to 750 cm-1 at a resolution of 4 cm-1, with 128 scans co-added. All spectra were merged and analyzed using the Bruker software OPUS NT version 6.5. Each site had one pair of filters, PVDF and nylon, and three instrumental replicates were taken from each filter. Cluster analysis was based on the D-values of first derivative spectra in the regions of information: 3040 – 2800 cm-1 and 1800 – 750 cm-1.
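To make the distance step concrete, the sketch below computes a spectral distance between two spectra from their first derivatives inside the two regions of information. The text does not define the D-value. The form assumed here, D = (1 − r) × 1000 with r the Pearson correlation coefficient, is the one commonly described for FT-IR cluster analysis and is an assumption, not a statement of what OPUS computed for this study. The simple finite-difference derivative likewise stands in for whatever derivative and smoothing settings were applied.

```python
import math

# Regions of information named in the text, in cm^-1, as (high, low).
WINDOWS = [(3040.0, 2800.0), (1800.0, 750.0)]


def first_derivative(wavenumbers, absorbance):
    """Central-difference first derivative (one-sided at the two end points)."""
    n = len(wavenumbers)
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((absorbance[hi] - absorbance[lo]) /
                     (wavenumbers[hi] - wavenumbers[lo]))
    return deriv


def select_windows(wavenumbers, values, windows=WINDOWS):
    """Keep only values whose wavenumber falls inside one of the windows."""
    return [v for w, v in zip(wavenumbers, values)
            if any(low <= w <= high for high, low in windows)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def d_value(wavenumbers, spectrum_a, spectrum_b):
    """Assumed definition: D = (1 - r) * 1000 on windowed first derivatives."""
    da = select_windows(wavenumbers, first_derivative(wavenumbers, spectrum_a))
    db = select_windows(wavenumbers, first_derivative(wavenumbers, spectrum_b))
    return (1.0 - pearson_r(da, db)) * 1000.0
```

Under this definition, spectra that differ only by a baseline offset or a constant scaling give D close to 0, and perfectly anti-correlated derivatives give D close to 2000.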

Lake water physical and chemical properties. The conductivity, carbonate, bicarbonate, and chloride concentrations were determined at Ward Laboratories Inc. (4007 Cherry Ave, Kearney, NE 68848-0788). The water was stored in 1-Liter plastic, sterile bottles for shipment to Ward Laboratories. Temperature and pH were measured on site. Turbidity was measured on site by a Turner Designs (Sunnyvale, CA) Aquafluor (Model 8000-010). Nitrate, ammonia, phosphate, and sulfate were measured at a centralized location by the HACH (Loveland, CO) DR/890 Colorimeter, according to the Procedures Manuals. Total Dissolved Solids were measured by weighing 1 mL aliquots on planchets after oven-drying at 80 °C for 16 hours. Li, B, Na, Mg, P, S, K, Ca, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, and Cd were measured with an Agilent ICP-MS (7500 CX).

Results

Lake water physico-chemical characteristics. Table 1 lists the average physico-chemical properties of each lake. Border Lake stands out as the most concentrated lake and Windmill Lake as the most dilute. Similarities based on all 19 measurements in Table 1 are better illustrated by hierarchical cluster analysis of Euclidean distances by Ward's algorithm (Statgraphics Centurion XVI version 16.1.05) (Figure 1). The resulting dendrogram illustrates that Border and Windmill Lakes are very different from the other lakes based on their physico-chemical properties. Additionally, similarities can be seen between (1) Tree Claim and Kokjohn Lakes and (2) Perrin and Smith Lakes.
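For readers unfamiliar with Ward's algorithm, the merge rule behind dendrograms of this kind can be sketched in a few lines. This is a generic illustration, not a reproduction of the Statgraphics procedure: at each step the two clusters whose fusion gives the smallest increase in within-cluster sum of squares are joined. The sample labels and coordinates in the usage note are made up, and the dendrogram height scale reported by any particular package may differ from the raw merge cost returned here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_merges(points, labels):
    """Agglomerate with Ward's criterion and return the merge history.

    Each history entry is (members_a, members_b, increase_in_within_SS).
    """
    # Each cluster is stored as (member labels, centroid, size).
    clusters = [((label,), list(p), 1) for label, p in zip(labels, points)]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                _, ci, ni = clusters[i]
                _, cj, nj = clusters[j]
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * squared_distance(ci, cj)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        mi, ci, ni = clusters[i]
        mj, cj, nj = clusters[j]
        centroid = [(ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj)]
        history.append((mi, mj, cost))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((mi + mj, centroid, ni + nj))
    return history
```

With four hypothetical samples at `[0, 0]`, `[0, 1]`, `[10, 10]`, and `[10, 12]`, the two nearby pairs merge first and the two resulting groups merge last, which is the pattern a dendrogram would display.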

Microbial Community Structure based on Pyrosequencing. The distribution of Phyla for each lake is illustrated by pie charts in Figure 2. The dominant Phylum in Border Lake was Firmicutes; for Kokjohn Lake it was Cyanobacteria; and for Perrin, Smith, and Windmill Lakes it was Proteobacteria. For these five lakes, each of the five sampling sites had roughly the same bacterial profile. However, Tree Claim Lake was very different for each site (Figure 3). The results for the first three sites of Tree Claim Lake were reproduced to verify the outcome. Sites 1 and 3 were Proteobacteria dominant, whereas sites 2, 4, and 5 were Firmicutes dominant.

Table 2 lists the 29 most abundant Genera for each of the sites in the six lakes, classified by RDP Classifier, in terms of percent abundance. Rare species below 0.1% of the total were removed because they occurred only one or two times in the total and thus are unlikely to play a major role in nutrient flux through the community. Figure 4 is a dendrogram of the Genera frequency data, based on Euclidean distances clustered by Ward's algorithm (Statgraphics Centurion XVI version 16.1.05). The dendrogram illustrates the large difference between Border Lake and the other lakes. Also, Windmill and Perrin Lakes are the most similar of the six lakes, and Kokjohn and Smith Lakes are much different from one another. Tree Claim Lake, which had different bacterial profiles within the lake (Figure 3), had sites 2, 4, and 5 cluster with Border Lake and sites 1 and 3 cluster much differently (Figure 4). Note that the replicates of the individual sites from Tree Claim Lake, e.g. TC1-2 and TC2-2, clustered tightly with the first samples from those respective sites (Figure 4).

Mid-IR Spectroscopy. Averaged zero-order spectra from each of the six lakes, collected and processed as described in Chapter 2, are shown in Figure 5. The spectral patterns are complex but have a strong signal-to-noise ratio in each of the five windows of information (12). The goal of minimizing within-lake distances and maximizing between-lake distances drove our choices of preprocessing methods and window selection. Each window provided some discrimination between lakes, but the amide and polysaccharide windows were the most discriminative. Figure 5 provides the cluster analysis of each window of information.

Figure 6 shows averaged first derivative spectra for each lake. The optimized cluster diagram is shown in Figure 8 after scaling to first range with weighting factors for each window as described in the figure legend. Every sample location within a given lake clusters into a homogeneous grouping for that lake. Of the six lakes, Smith Lake had the least heterogeneity and Tree Claim Lake had the greatest heterogeneity.
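The stated goal of minimizing within-lake distances while maximizing between-lake distances can be expressed as a single number for comparing preprocessing choices. The sketch below is one hypothetical way to score a candidate choice, given any pairwise spectral distance. The function names and the use of a simple ratio of means are assumptions for illustration, not the criterion applied in this chapter.

```python
from itertools import combinations


def mean_within_between(distance, lake_of):
    """Return (mean within-lake distance, mean between-lake distance).

    distance: dict mapping frozenset({sample_a, sample_b}) -> distance
    lake_of:  dict mapping sample name -> lake name
    """
    within, between = [], []
    for a, b in combinations(sorted(lake_of), 2):
        d = distance[frozenset((a, b))]
        if lake_of[a] == lake_of[b]:
            within.append(d)
        else:
            between.append(d)
    return sum(within) / len(within), sum(between) / len(between)


def separation_ratio(distance, lake_of):
    """Between/within ratio; larger values mean lakes are better resolved."""
    within, between = mean_within_between(distance, lake_of)
    return between / within
```

A preprocessing method or window weighting that raises this ratio separates lakes more cleanly relative to the scatter among sites within a lake.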

Discussion

This study is the first report using mid-IR spectroscopy to compare biochemical fingerprints between individual lakes of known lake water chemistry. Mid-IR spectroscopy provides unique, functional data at a specific moment in time, describing the microbial community in situ. It is our contention that spectra reflect the sum of all microbial functions and are analogous to photographs of a habitat from which the behavior of macroorganisms can be discerned based on a priori knowledge of individual behavior. Although a spectrum is static, changes in spectra over space and time become a dynamic reflection of changes in microbial community function between two samples. Concentrating particulates onto the surface of a membrane is a key step in preparing the samples. Spectra reflect the particulates that pass through the 20 µm prefilter but are trapped on the 0.2 µm capture filters. Although at this time we cannot say for certain what portion of the trapped particles are Bacteria, Archaea, Eukaryotes, or abiotic particulates, we can be certain that the majority of the signal is from organic bonds simply by the nature of the bonds that absorb in the IR region. Furthermore, rinsing the filters with water removed a significant portion of organic molecules and salts to further enhance signal. Drying the filters acts to quench metabolic activity; although quenching was not immediate, owing to the time between initiating filtration and achieving complete desiccation, metabolic activity ceases once the filters are dry. Thus, spectra reflect in situ function.

Lake water chemistry: The physico-chemical data and cluster analysis detailed differences in the measured parameters between lakes. The objective of this analysis was to determine if there were significant differences in total water chemistry between lakes rather than if there were individual factors that steer microbial community structure or function. Based on this rationale, every parameter measured was equally weighted. Multiple locations within each lake were sampled and evaluated individually, allowing for a parametric statistical approach to determine the uniformity of water chemistry throughout each lake. With the exception of a small number of individual chemistries, the values were normally distributed, and the mean values reported in Table 1 are representative of the average lake water value. Uniformity of lake water chemistry is also reflected in Figure 1. Each lake clustered to a unique position on the dendrogram. Were this not the case, subsequent evaluations of microbial community structure and function would need to be evaluated as discrete samples based on sample site rather than as representative of the entire lake.
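Equal weighting matters because the measured parameters span very different numeric ranges, so a raw Euclidean distance would be dominated by whichever parameter has the largest values. One common way to give every parameter the same influence is to standardize each column to zero mean and unit standard deviation before computing distances. The sketch below illustrates that idea. Whether standardization was the scaling applied in this analysis is an assumption, and the numbers in the usage note are made up.

```python
import math


def zscore_columns(table):
    """Standardize each parameter (column) to mean 0 and sample SD 1.

    table: list of rows, one row per sample, one column per parameter.
    A column with no variation is mapped to zeros.
    """
    scaled_columns = []
    for col in zip(*table):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled_columns.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For a hypothetical table `[[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]]`, the raw distance between the first and last rows is driven almost entirely by the second column, whereas after `zscore_columns` both columns contribute equally.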

16S rRNA gene sequencing: The 16S rRNA gene sequencing data provide information about the bacterial richness and taxa abundance within each sample. Abundance calculations require the assumption that an individual amplicon represents an individual bacterium. Association of query sequences to a database of known sequences allows for some prediction of function, at least at higher taxonomic levels. Figure 2 shows the association and abundance of sequences at the taxonomic level of Phylum. Cyanobacteria and photosynthetic Eukaryotes were dominant in Kokjohn Lake, indicating high primary productivity. The dominant Phylum in Border Lake was Firmicutes, whereas Proteobacteria were dominant in the remaining four lakes. The reason for the dominance of Firmicutes versus Proteobacteria is unclear at this time, but both are heterotrophs. If physico-chemical parameters had a dominant influence on taxa richness, then the expectation is that the physico-chemical and 16S rRNA gene sequence dendrograms should have approximately the same branching pattern. Figure 4 is a dendrogram based on the 76 most prevalent taxa at the level of Genus, i.e., those present above 0.1% of the total sequences queried, which represent the majority of bacterial biomass present in the samples. Exclusion of sequences present at less than 0.1% is justified because, at the time of sampling, these "rare" sequences contribute little to the overall flux of nutrients through the lakes. The clustering of the physico-chemical data suggests that Border Lake was the most distinct, and the 16S rRNA gene data support a unique bacterial structure in Border Lake compared to the other lakes. However, the 16S rRNA gene cluster patterns for the remaining lakes do not coincide with the physico-chemical data.

Tree Claim Lake is problematic in that the cluster data suggest a heterogeneous community structure between sites, with some more similar to Border Lake and others to Perrin and Windmill Lakes. Because of this heterogeneity in Tree Claim Lake, three additional pyrosequencing runs were completed corresponding to sites 1, 2, and 3, with each site being sequenced in duplicate. The duplicate data are in near complete agreement with each other, thus showing that the sequencing was reproducible. However, why the sequence data are heterogeneous is unknown.

Mid-IR spectral data: Mid-IR spectral data provide information about the organic biomolecules present in the sample. Although a rigorous deconvolution of an individual spectrum can provide information about specific chemical bonds, the pattern reflecting the total organic presence is equally informative. In simplest terms, two spectra that are identical contain the same richness and abundance of organic compounds. As the distance between two spectra increases, so too does the difference in chemical composition between the samples. The distances between spectra are presented as a dendrogram in Figure 8. Again, according to the physico-chemical data, Border Lake should be more distant from all other lakes if physico-chemical pressure is the driving force for the biomolecular composition of a sample. However, the spectral distance was greatest between Tree Claim Lake and all other lakes. Unlike the 16S rRNA based dendrogram, there was little spectral heterogeneity between locations within Tree Claim Lake. This juxtaposition suggests that, although the taxa present at each location vary, the biochemical signature at each location does not, at least with respect to all other lakes. Of the remaining lakes, the spectral pattern for Smith Lake was more heterogeneous, followed by Border, Kokjohn, Perrin, and Windmill Lakes in decreasing order of heterogeneity.

Similarities between the pyrosequencing data and the mid-IR spectroscopy became clear when each window was examined individually. Windmill and Perrin Lakes are the most closely related lakes by pyrosequencing data, and they are also closely related in three windows of information from mid-IR spectroscopy: the amide window, the mixed window, and the polysaccharide window. The optimized mid-IR cluster analysis resembles the pyrosequencing data, but the two are not identical. Tree Claim Lake sites have very different 16S rRNA gene profiles, yet the mid-IR spectra from these sites cluster closely. Smith Lake has a unique pattern with a strong, broad absorbance in the 1500 cm-1 region and a weak, sharp absorbance at around 800 cm-1. This pattern is indicative of aromatics, possibly resulting from lignocellulose degradation.

Are 16S rRNA gene sequence and mid-IR spectral data comparable? The 16S rRNA gene sequence and mid-IR spectral dendrograms are not in concordance. 16S rRNA gene sequence based trees are a measure of the evolutionary history of a species. As has been pointed out by Oberreuter et al. (14), Scullion et al. (17), and others, the associations between species based on the two methods will not be in perfect agreement. We contend that the differences in associations between physico-chemistry, 16S rRNA gene sequences, and mid-IR spectra reflect differences in the evolution of a conserved core genome and the fluidity of the accessory gene pool brought about by horizontal gene transfer. The concept is illustrated in Figure 9. Many experiments classifying single-celled organisms by mid-IR spectroscopy were published through the 1990s and during the last decade (1, 5, 7, 9, 10, 12, 15, 18). In these experiments great care was taken to ensure that conditions were the same throughout the growth of cultures because, if conditions were changed, spectral clustering would change, thus invalidating comparisons between species. Prokaryotes in particular have a wide variety of inducible genes and biochemical pathways, which would cause their biochemical profiles to change drastically (6, 13, 10). On the other hand, two evolutionarily distinct taxa may exhibit similar spectral patterns because their functioning genes are similar.

In conclusion, we have verified that spectra from different lakes, with different chemistries and microbial communities, can be discriminated. Future experiments are required to investigate differences observed between physico-chemical data, pyrosequencing data, and mid-IR spectroscopy. Additional information will be required to assign biological significance to spectra, in much the same way as the biological significance of a photograph cannot be interpreted without a priori information. Connections must be made between spectral profiles and known biochemical function. The connection between the presence and activity of metabolic pathways and mid-IR spectra may also be better defined, allowing validation of current models that connect genomic data to function (16) in a simple, rapid, and inexpensive manner.

Acknowledgements

The authors are indebted to Javier Saravelli, Ph.D., from the University of Nebraska-Lincoln Spectroscopy and Biophysics Core – Mass Spectrometry, who provided expertise on the ICP-MS to ensure ion measurements were precise. The authors thank the Core for Applied Genomics and Ecology (CAGE), University of Nebraska – Lincoln, who sequenced all samples.

References

1. Becker, K., Al Laham, N., Fegeler, W., Proctor, R.A., Peters, G., von Eiff, C. 2006. Fourier-Transform Infrared Spectroscopic Analysis Is a Powerful Tool for Studying the Dynamic Changes in Staphylococcus aureus Small-Colony Variants. J. Clinical Microbiology, 44: 3274-3278.

2. Benson, A.K., Scott, A.K., Legge, R., Fangrui, M., Low, S.J., Kim, J.H., Zhang, M., Oh, P.L., Nehrenberg, D., Hua, K., Kachman, S.D., Moriyama, E.N., Walter, J., Peterson, D.A., and Pomp, D. 2010. Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. PNAS, 107: 18933-18938.

3. Bleed, A.S. 1998. Lakes and Wetlands. Ed. Bleed, A.S. An Atlas of the Sand Hills (Conservation and Survey Division: Lincoln, NE), 115-122.

4. Chen, C.W., Gherini, S.A., Peters, N.E., Murdoch, P.S., Newton, R.M., Goldstein, R.A. 2008. Hydrologic analyses of acidic and alkaline lakes. Water Resources Research, 20: 1875-1882.

5. Domenighini, A., and Giordano, M. 2009. Fourier transform infrared spectroscopy of microalgae as a novel tool for biodiversity studies, species identification, and the assessment of water quality. J. of Phycology, 45: 522-531.

6. Eisenstark A. 2010. Genetic diversity among offspring from archived Salmonella enterica spp. enterica Serovar Typhimurium (Demeree Collection): In search of survival strategies. Eds. Gottesman, S. and Harwood, C.S. Annual Review of Microbiology, 64: 277-292.

7. Fischer, G., Silvia Braun, Ralf Thissen, Wolfgang Dott. 2006. FT-IR spectroscopy as a tool for rapid identification and inter-species characterization of airborne filamentous fungi. J. Microbiological Methods, 64: 63-77.

8. Gosselin, D., Sibray, Steve, Ayers, Jerry. 1994. Geochemistry of K-rich Alkaline Lakes. Geochimica et Cosmochimica Acta, 58: 1403-1418.

9. Kirschner, C., Dieter Naumann, Manfait, Michel, Sockalingum, Ganesh D., Maquelin, K., Pina, P., Ngo Thi, N.A., Choo-Smith, L.P., Sandt, C., Ami, D., Doglia, S.M., Allouch, P., Puppels, G.J., Orsini, F. 2001. Classification and Identification of Enterococci: a Comparative Phenotypic, Genotypic, and Vibrational Spectroscopic Study. J. of Clinical Microbiology, 39: 1763-1770.

10. Mariey, L., Signolle, J.P., Amiel, C., Travert, J. 2001. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vibrational Spectroscopy, 26: 151-159.

11. McCarraher, D.B. 1977. Nebraska's Sandhills Lakes. Ed. Huff, E. (Nebraska Game and Parks Commission: Lincoln, NE).

12. Naumann, D., Dieter Helm, Labischinski, H., Giesbrecht, P. 1991. The characterization of microorganisms by Fourier-Transform infrared spectroscopy (FT-IR). Ed. Nelson, W.H. Modern Techniques for Rapid Microbiological Analysis (VCH Publishers: New York), 43-96.

13. Norris, K.P. and Greenstreet, J.E.S. 1958. On the infrared absorption spectrum of Bacillus megaterium. J. General Microbiology, 19: 566-580.

14. Oberreuter, H., Charzinski, Joachim, Scherer, Siegfried. 2002. Intraspecific diversity of Brevibacterium linens, Corynebacterium glutamicum and Rhodococcus erythropolis based on partial 16S rDNA sequence analysis and Fourier-transform infrared (FT-IR) spectroscopy. Microbiology, 148: 1523-1532.

15. Raes, J. and Bork, P. 2008. Molecular eco-systems biology: towards an understanding of community function. Nature Reviews Microbiology, 6: 693-699.

16. Rebuffo-Scheer, C.A., Scherer, Siegfried, Dietrich, Jochen, Wenning, Mareike. 2008. Identification of five Listeria species based on infrared spectra (FTIR) using macrosamples is superior to a microsample approach. Annals of Bioanalytical Chemistry, 390: 1629-1635.

17. Scullion, J., Goodacre, Royston, Huang, Wei E., Elliott, Geoff N., Worgan, Hilary, Bailey, Mark J., Draper, John, Gwynn-Jones, Dylan, Griffith, Gareth W., Winson, Michael K., Clegg, Christopher. 2003. Use of earthworm casts to validate FT-IR spectroscopy as a 'sentinel' technology for high-throughput monitoring of global changes in microbial ecology. Pedobiologia, 47: 440-446.

18. Tindall, B.J., Brambilla, Evelyne, Steffen, Maike, Neumann, Regine, Pukall, Rudiger, Kroppenstedt, Reiner M., Stackebrandt, Erko. 2000. Cultivatable microbial biodiversity: gnawing at the Gordian knot. Environmental Microbiology, 2: 310-318.


Table 1. Temperature, pH, and turbidity measurements taken on site. Bicarbonate, carbonate, chloride, and conductivity measured by Ward Laboratories Inc., Kearney, NE. Sulfate, Phosphate, Nitrate, Ammonia tested by HACH Instruments. Na, K, Mg, Ca, Cu, Fe, and Mn concentrations determined by ICP-MS. Standard deviations shown in parentheses. Values of '0.00' are less than the detection limit of the instrument, or by ICP-MS, < 0.00.

Name B1 B2 B3 B4 B5 K1 K2 K3 K4 K5
GpIIa 15.6 0.1 5.4 11.8 6.8 52.4 50.4 65.3 55.3 49.8
unclassified_Betaproteobacteria 1.3 0.1 1.3 0.9 1.5 1.6 0.8 0.6 1.0 1.3
unclassified_Comamonadaceae 3.6 0.5 2.2 2.2 1.9 2.6 2.1 0.6 1.3 1.3
Weissella 11.9 22.1 18.1 17.5 20.0 1.3 2.3 2.3 1.0 2.8
Leuconostoc 8.9 17.2 18.3 16.7 20.8 1.3 2.5 2.4 1.3 2.6
Hydrogenophaga 6.4 0.0 3.4 1.9 1.1 1.8 4.0 0.6 1.1 1.0
Polynucleobacter 0.3 0.0 0.8 0.7 0.5 1.0 0.7 0.2 0.6 1.1
unclassified_Burkholderiales 0.7 0.0 0.4 0.3 0.7 3.3 1.7 1.0 2.2 2.7
unclassified_Bacteroidetes 5.2 0.3 1.8 2.7 1.2 5.6 4.6 3.9 7.6 5.7
unclassified_Actinomycetales 0.7 0.1 0.3 0.5 0.4 1.2 1.2 0.6 1.1 1.3
Lactococcus 5.4 11.8 9.6 11.4 11.7 0.7 1.1 1.2 0.6 1.1
Cryobacterium 0.4 0.0 1.1 0.8 0.1 1.2 1.3 0.3 0.6 0.6
Flavobacterium 0.1 0.1 0.7 0.2 0.4 0.3 0.2 0.1 0.2 0.3
Acinetobacter 4.4 7.1 8.3 5.4 6.8 0.4 0.6 0.7 0.3 0.9
Chloroplast 0.5 1.5 0.2 0.5 0.4 1.5 2.6 1.4 0.6 2.6
unclassified_Microbacteriaceae 4.8 0.1 1.8 2.5 1.3 1.2 1.5 0.3 1.0 1.3
unclassified_Flexibacteraceae 0.6 0.0 0.4 0.6 0.5 3.9 2.1 2.6 5.2 3.7
unclassified_Proteobacteria 0.3 0.1 0.2 0.2 0.3 0.5 0.3 0.4 0.5 0.5
unclassified_Flavobacteriaceae 1.5 1.1 1.8 1.2 1.4 1.6 1.7 1.4 2.3 1.4
unclassified_Gammaproteobacteria 2.5 0.3 1.7 1.5 0.8 1.3 1.6 0.6 1.6 1.0
Cryptomonadaceae 0.3 0.0 0.1 0.2 0.2 0.0 0.0 0.0 0.0 0.1
unclassified_Firmicutes 0.7 0.2 0.7 0.5 0.1 0.1 0.2 0.1 0.1 0.1
unclassified_Flavobacteriales 2.1 0.0 0.4 0.8 0.1 1.2 1.2 1.2 1.7 1.5
unclassified_Pseudomonadales 2.9 2.9 3.2 2.3 3.7 0.4 0.7 0.7 0.6 0.6
Fluviicola 0.0 0.0 0.3 0.3 0.4 0.2 0.0 0.0 0.1 0.0
Arcicella 0.1 0.0 0.2 0.1 0.3 0.0 0.0 0.0 0.0 0.0
Curvibacter 0.0 0.0 0.1 0.2 0.2 0.1 0.0 0.0 0.0 0.1
unclassified_Rhodobacteraceae 0.1 0.0 0.2 0.3 0.0 2.6 2.4 2.0 2.1 2.4
unclassified_Sphingobacteriales 0.2 0.0 0.2 0.3 0.0 0.6 0.5 0.4 0.7 0.5
Citrobacter 1.3 2.5 1.9 2.2 3.2 0.0 0.2 0.2 0.1 0.2
unclassified_Enterobacteriaceae 1.5 2.7 1.9 1.8 2.4 0.1 0.3 0.2 0.1 0.2
unclassified_Alcaligenaceae 2.8 0.1 0.5 0.8 0.4 2.3 2.4 1.2 1.7 1.5
unclassified_Chloroflexales 0.1 0.0 0.1 0.0 0.0 1.1 1.0 1.2 1.3 1.0
Arcobacter 0.1 0.8 0.7 0.4 0.5 0.0 0.0 0.0 0.0 0.1

Table 2. A listing of pyrosequencing data, classified to genus level in RDP Classifier. The most abundant 75 genera, overall, are displayed for each site within each lake, listed as a percent abundance after removal of the rare species below 0.1%. Page 1 of 6.

Name B1 B2 B3 B4 B5 K1 K2 K3 K4 K5
Pseudomonas 0.4 0.4 1.0 0.3 0.2 0.0 0.1 0.1 0.0 0.1
unclassified_Rhizobiales 0.2 0.0 0.1 0.3 0.2 0.2 0.1 0.1 0.1 0.1
unclassified_Sphingomonadaceae 0.2 0.1 0.3 0.1 0.0 0.2 0.2 0.3 0.2 0.3
unclassified_Burkholderiaceae 0.0 0.0 0.1 0.2 0.2 0.0 0.0 0.0 0.0 0.0
unclassified_Alphaproteobacteria 0.0 0.1 0.0 0.1 0.1 0.2 0.4 0.1 0.2 0.2
Chlorophyta 0.1 0.0 0.1 0.2 0.1 1.0 1.5 1.0 0.4 1.9
Enterococcus 1.2 1.3 1.1 1.3 1.5 0.1 0.1 0.1 0.1 0.1
unclassified_Micrococcineae 0.7 0.1 0.3 0.2 0.2 0.5 0.6 0.2 0.5 0.4
Streptococcus 0.7 1.7 1.0 0.8 1.3 0.1 0.2 0.1 0.1 0.2
Haliscomenobacter 0.0 0.0 0.0 0.0 0.0 0.5 0.4 0.6 0.5 0.6
Clostridium 0.1 6.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Incertae sedis 5 0.1 0.0 0.1 0.2 0.1 0.2 0.1 0.0 0.1 0.2
unclassified_Actinobacteria 0.1 0.0 0.2 0.3 0.1 0.1 0.1 0.1 0.0 0.2
Veillonella 0.4 0.9 0.7 0.7 1.7 0.0 0.1 0.1 0.1 0.1
Terrimonas 0.1 0.0 0.0 0.0 0.0 0.3 0.4 0.1 0.2 0.1
unclassified_Chlorobiaceae 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Porphyrobacter 0.1 0.0 0.1 0.1 0.0 0.6 0.7 0.7 0.7 1.0
unclassified_Crenotrichaceae 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1
Acidovorax 0.5 0.7 1.1 0.8 0.7 0.0 0.0 0.0 0.0 0.2
unclassified_Clostridiales 0.1 3.3 0.3 0.0 0.1 0.0 0.1 0.1 0.0 0.1
unclassified_Pseudomonadaceae 0.5 0.5 0.8 0.4 0.2 0.0 0.0 0.1 0.1 0.1
unclassified_Chloroplast 0.1 0.0 0.1 0.0 0.0 0.4 0.5 0.3 0.2 0.5
Erysipelotrichaceae Incertae Sedis 0.0 6.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Lamprocystis 0.0 0.0 0.0 0.4 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Chromatiaceae 0.0 0.0 0.0 0.2 0.1 0.5 0.6 0.5 0.3 0.5
Marinibacillus 0.0 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0
Sphingopyxis 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Thioalkalimicrobium 3.9 0.0 0.6 1.6 0.7 0.0 0.0 0.0 0.0 0.0
Algoriphagus 0.0 0.1 0.0 0.0 0.1 0.3 0.1 0.1 0.3 0.3
unclassified_Rhodocyclaceae 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Microvirgula 0.4 0.3 0.6 0.3 0.6 0.0 0.0 0.0 0.0 0.1
Brevundimonas 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Chryseobacterium 0.5 0.6 0.3 0.3 1.0 0.0 0.0 0.1 0.0 0.1
Methylophilus 0.1 0.0 0.1 0.1 0.0 0.8 0.4 0.5 1.0 0.6
Sphingobium 0.4 0.7 0.8 0.2 0.0 0.0 0.0 0.1 0.0 0.1
Comamonas 0.3 0.4 0.6 0.3 0.1 0.0 0.0 0.0 0.0 0.1
unclassified_Saprospiraceae 0.0 0.0 0.2 0.0 0.0 0.5 0.5 0.6 1.0 0.8
unclassified_"Lachnospiraceae" 0.2 1.4 0.8 0.2 0.2 0.0 0.0 0.0 0.0 0.0
Lactobacillus 0.9 2.0 0.2 0.1 0.1 0.0 0.0 0.1 0.0 0.0
unclassified_Acetobacteraceae 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Stenotrophomonas 0.0 0.7 0.4 0.3 0.2 0.0 0.0 0.1 0.0 0.0

Table 2. Page 2 of 6.

Name P1 P2 P3 P4 P5 S1 S2 S3 S4 S5
GpIIa 1.4 3.3 2.7 1.0 2.0 11.9 12.4 25.3 12.8 25.9
unclassified_Betaproteobacteria 13.0 6.4 5.1 25.9 30.0 6.5 7.3 6.0 7.3 5.5
unclassified_Comamonadaceae 12.9 10.4 9.0 16.0 8.4 4.4 6.2 4.2 5.9 4.2
Weissella 2.0 1.3 2.4 0.3 0.8 0.3 0.3 0.6 0.4 0.5
Leuconostoc 1.8 1.3 2.6 0.5 0.6 0.3 0.4 0.7 0.4 0.3
Hydrogenophaga 6.7 6.1 6.0 9.3 6.1 1.3 0.8 0.9 2.0 0.8
Polynucleobacter 4.6 8.0 4.3 7.9 6.3 10.2 9.7 7.6 12.4 6.2
unclassified_Burkholderiales 5.8 6.3 5.8 7.7 4.1 10.2 9.8 8.7 9.8 7.6
unclassified_Bacteroidetes 1.4 5.0 5.3 0.6 0.6 2.6 3.4 2.8 2.7 4.4
unclassified_Actinomycetales 2.4 4.1 3.1 1.8 3.2 13.2 11.5 8.6 11.6 9.4
Lactococcus 1.1 0.9 1.1 0.2 0.4 0.2 0.2 0.4 0.2 0.3
Cryobacterium 1.8 2.9 2.5 2.3 3.3 2.5 2.3 1.9 2.6 2.2
Flavobacterium 5.0 2.5 3.6 5.2 6.0 0.3 0.3 0.3 0.5 0.3
Acinetobacter 0.9 0.6 1.0 0.1 0.3 0.1 0.2 0.4 0.1 0.1
Chloroplast 2.5 1.9 2.3 0.9 1.2 2.9 1.4 1.6 1.7 1.8
unclassified_Microbacteriaceae 3.6 3.5 3.1 1.4 3.3 0.9 1.6 1.3 1.5 1.6
unclassified_Flexibacteraceae 0.6 2.8 2.8 0.2 0.3 0.3 0.4 0.5 0.4 0.5
unclassified_Proteobacteria 7.1 1.5 1.1 3.8 4.7 7.2 7.0 4.4 6.3 3.5
unclassified_Flavobacteriaceae 2.3 1.4 2.3 2.3 2.4 0.1 0.2 0.2 0.2 0.1
unclassified_Gammaproteobacteria 1.7 1.8 1.7 1.5 2.0 3.7 4.2 3.1 2.9 2.8
Cryptomonadaceae 1.0 0.8 0.6 0.6 0.6 2.3 1.0 0.9 1.4 1.2
unclassified_Firmicutes 0.1 0.1 0.1 0.0 0.0 0.0 0.1 0.0 0.1 0.0
unclassified_Flavobacteriales 0.9 2.0 1.8 0.5 0.8 0.6 0.6 0.5 0.4 0.5
unclassified_Pseudomonadales 0.7 0.3 0.8 0.2 0.2 0.2 0.2 0.3 0.2 0.2
Fluviicola 4.7 3.7 6.3 1.9 2.1 0.2 0.5 0.5 0.5 0.5
Arcicella 4.1 4.5 4.7 0.9 1.9 0.5 0.4 0.2 0.3 0.2
Curvibacter 1.0 0.9 1.0 1.2 0.9 1.6 2.1 1.7 2.0 0.9
unclassified_Rhodobacteraceae 0.8 2.2 1.9 0.4 0.6 0.9 1.0 1.0 0.8 1.7
unclassified_Sphingobacteriales 0.9 3.7 4.4 0.4 0.7 0.6 0.7 1.0 0.7 1.2
Citrobacter 0.1 0.1 0.4 0.0 0.1 0.0 0.1 0.1 0.0 0.1
unclassified_Enterobacteriaceae 0.2 0.2 0.4 0.1 0.1 0.0 0.0 0.1 0.0 0.1
unclassified_Alcaligenaceae 0.0 0.2 0.1 0.0 0.1 0.9 1.1 1.1 0.9 1.0
unclassified_Chloroflexales 0.0 0.1 0.1 0.0 0.1 1.8 2.1 2.6 1.7 2.2
Arcobacter 0.2 0.0 0.1 0.7 0.2 0.2 0.3 0.1 0.1 0.0

Table 2. Page 3 of 6.

Name P1 P2 P3 P4 P5 S1 S2 S3 S4 S5
Pseudomonas 0.1 0.2 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0
unclassified_Rhizobiales 0.4 1.2 1.0 0.3 0.5 1.3 1.2 1.2 0.9 2.0
unclassified_Sphingomonadaceae 0.4 0.7 0.7 0.2 0.3 0.4 0.4 0.2 0.2 0.2
unclassified_Burkholderiaceae 0.6 0.7 0.3 0.9 0.8 0.7 0.5 0.4 0.6 0.3
unclassified_Alphaproteobacteria 0.6 0.5 0.4 0.3 0.5 1.2 1.0 0.9 1.3 1.0
Chlorophyta 1.3 0.8 1.6 0.2 0.4 0.1 0.1 0.2 0.1 0.4
Enterococcus 0.1 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Micrococcineae 0.7 0.8 0.4 0.3 0.6 0.6 0.7 0.7 0.7 0.7
Streptococcus 0.1 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Haliscomenobacter 0.1 0.4 0.5 0.0 0.0 1.7 1.3 1.2 1.0 2.1
Clostridium 0.1 0.0 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0
unclassified_Incertae sedis 5 0.1 0.3 0.3 0.3 0.2 0.7 0.8 0.8 0.5 0.7
unclassified_Actinobacteria 0.0 0.0 0.0 0.0 0.0 1.4 1.3 0.9 1.4 0.9
Veillonella 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Terrimonas 0.3 0.7 0.4 0.1 0.2 0.4 0.2 0.3 0.3 0.2
unclassified_Chlorobiaceae 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0
Porphyrobacter 0.1 0.5 0.8 0.1 0.0 0.1 0.2 0.1 0.0 0.1
unclassified_Crenotrichaceae 0.4 0.8 0.3 0.2 0.2 0.1 0.3 0.2 0.2 0.2
Acidovorax 0.1 0.2 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Clostridiales 0.1 0.2 0.1 0.2 0.2 0.1 0.1 0.1 0.1 0.1
unclassified_Pseudomonadaceae 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0
unclassified_Chloroplast 0.2 0.1 0.1 0.1 0.1 0.4 0.2 0.2 0.2 0.2
Erysipelotrichaceae Incertae Sedis 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Lamprocystis 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Chromatiaceae 0.1 0.0 0.1 0.1 0.2 0.0 0.0 0.0 0.0 0.1
Marinibacillus 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Sphingopyxis 0.2 0.4 0.4 0.0 0.2 0.4 0.2 0.3 0.1 0.3
Thioalkalimicrobium 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Algoriphagus 0.1 0.5 0.4 0.1 0.2 0.3 0.4 0.5 0.4 0.8
unclassified_Rhodocyclaceae 0.2 0.2 0.2 0.4 0.3 0.1 0.1 0.1 0.2 0.1
Microvirgula 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Brevundimonas 0.0 0.0 0.0 0.0 0.0 0.6 0.7 1.2 0.5 1.2
Chryseobacterium 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Methylophilus 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0
Sphingobium 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Comamonas 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1
unclassified_Saprospiraceae 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.1
unclassified_"Lachnospiraceae" 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0
Lactobacillus 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0
unclassified_Acetobacteraceae 0.1 0.2 0.4 0.1 0.0 0.4 0.5 0.4 0.3 0.5
Stenotrophomonas 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0

Table 2. Page 4 of 6.

Name T1 T1-2 T2 T2-2 T3 T3-2 T4 T5 W1 W2 W3 W4 W5

GpIIa 15.6 0.1 5.4 11.8 6.8 52.4 50.4 65.3 55.3 49.8 1.4 3.3 2.7

unclassified_Betaproteobacteria 1.3 0.1 1.3 0.9 1.5 1.6 0.8 0.6 1.0 1.3 13.0 6.4 5.1

unclassified_Comamonadaceae 3.6 0.5 2.2 2.2 1.9 2.6 2.1 0.6 1.3 1.3 12.9 10.4 9.0

Weissella 11.9 22.1 18.1 17.5 20.0 1.3 2.3 2.3 1.0 2.8 2.0 1.3 2.4

Leuconostoc 8.9 17.2 18.3 16.7 20.8 1.3 2.5 2.4 1.3 2.6 1.8 1.3 2.6

Hydrogenophaga 6.4 0.0 3.4 1.9 1.1 1.8 4.0 0.6 1.1 1.0 6.7 6.1 6.0

Polynucleobacter 0.3 0.0 0.8 0.7 0.5 1.0 0.7 0.2 0.6 1.1 4.6 8.0 4.3

unclassified_Burkholderiales 0.7 0.0 0.4 0.3 0.7 3.3 1.7 1.0 2.2 2.7 5.8 6.3 5.8

unclassified_Bacteroidetes 5.2 0.3 1.8 2.7 1.2 5.6 4.6 3.9 7.6 5.7 1.4 5.0 5.3

unclassified_Actinomycetales 0.7 0.1 0.3 0.5 0.4 1.2 1.2 0.6 1.1 1.3 2.4 4.1 3.1

Lactococcus 5.4 11.8 9.6 11.4 11.7 0.7 1.1 1.2 0.6 1.1 1.1 0.9 1.1

Cryobacterium 0.4 0.0 1.1 0.8 0.1 1.2 1.3 0.3 0.6 0.6 1.8 2.9 2.5

Flavobacterium 0.1 0.1 0.7 0.2 0.4 0.3 0.2 0.1 0.2 0.3 5.0 2.5 3.6

Acinetobacter 4.4 7.1 8.3 5.4 6.8 0.4 0.6 0.7 0.3 0.9 0.9 0.6 1.0

Chloroplast 0.5 1.5 0.2 0.5 0.4 1.5 2.6 1.4 0.6 2.6 2.5 1.9 2.3

unclassified_Microbacteriaceae 4.8 0.1 1.8 2.5 1.3 1.2 1.5 0.3 1.0 1.3 3.6 3.5 3.1

unclassified_Flexibacteraceae 0.6 0.0 0.4 0.6 0.5 3.9 2.1 2.6 5.2 3.7 0.6 2.8 2.8

unclassified_Proteobacteria 0.3 0.1 0.2 0.2 0.3 0.5 0.3 0.4 0.5 0.5 7.1 1.5 1.1

unclassified_Flavobacteriaceae 1.5 1.1 1.8 1.2 1.4 1.6 1.7 1.4 2.3 1.4 2.3 1.4 2.3

Unclass._Gammaproteobacteria 2.5 0.3 1.7 1.5 0.8 1.3 1.6 0.6 1.6 1.0 1.7 1.8 1.7

Cryptomonadaceae 0.3 0.0 0.1 0.2 0.2 0.0 0.0 0.0 0.0 0.1 1.0 0.8 0.6

unclassified_Firmicutes 0.7 0.2 0.7 0.5 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1

unclassified_Flavobacteriales 2.1 0.0 0.4 0.8 0.1 1.2 1.2 1.2 1.7 1.5 0.9 2.0 1.8

unclassified_Pseudomonadales 2.9 2.9 3.2 2.3 3.7 0.4 0.7 0.7 0.6 0.6 0.7 0.3 0.8

Fluviicola 0.0 0.0 0.3 0.3 0.4 0.2 0.0 0.0 0.1 0.0 4.7 3.7 6.3

Arcicella 0.1 0.0 0.2 0.1 0.3 0.0 0.0 0.0 0.0 0.0 4.1 4.5 4.7

Curvibacter 0.0 0.0 0.1 0.2 0.2 0.1 0.0 0.0 0.0 0.1 1.0 0.9 1.0

unclassified_Rhodobacteraceae 0.1 0.0 0.2 0.3 0.0 2.6 2.4 2.0 2.1 2.4 0.8 2.2 1.9

unclassified_Sphingobacteriales 0.2 0.0 0.2 0.3 0.0 0.6 0.5 0.4 0.7 0.5 0.9 3.7 4.4

Citrobacter 1.3 2.5 1.9 2.2 3.2 0.0 0.2 0.2 0.1 0.2 0.1 0.1 0.4

unclassified_Enterobacteriaceae 1.5 2.7 1.9 1.8 2.4 0.1 0.3 0.2 0.1 0.2 0.2 0.2 0.4

unclassified_Alcaligenaceae 2.8 0.1 0.5 0.8 0.4 2.3 2.4 1.2 1.7 1.5 0.0 0.2 0.1

unclassified_Chloroflexales 0.1 0.0 0.1 0.0 0.0 1.1 1.0 1.2 1.3 1.0 0.0 0.1 0.1

Arcobacter 0.1 0.8 0.7 0.4 0.5 0.0 0.0 0.0 0.0 0.1 0.2 0.0 0.1

Table 2. Page 5 of 6.

Name T1 T1-2 T2 T2-2 T3 T3-2 T4 T5 W1 W2 W3 W4 W5

Pseudomonas 0.4 0.4 1.0 0.3 0.2 0.0 0.1 0.1 0.0 0.1 0.1 0.2 0.0

unclassified_Rhizobiales 0.2 0.0 0.1 0.3 0.2 0.2 0.1 0.1 0.1 0.1 0.4 1.2 1.0

Unclass._Sphingomonadaceae 0.2 0.1 0.3 0.1 0.0 0.2 0.2 0.3 0.2 0.3 0.4 0.7 0.7

unclassified_Burkholderiaceae 0.0 0.0 0.1 0.2 0.2 0.0 0.0 0.0 0.0 0.0 0.6 0.7 0.3

unclassified_Alphaproteobacteria 0.0 0.1 0.0 0.1 0.1 0.2 0.4 0.1 0.2 0.2 0.6 0.5 0.4

Chlorophyta 0.1 0.0 0.1 0.2 0.1 1.0 1.5 1.0 0.4 1.9 1.3 0.8 1.6

Enterococcus 1.2 1.3 1.1 1.3 1.5 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1

unclassified_Micrococcineae 0.7 0.1 0.3 0.2 0.2 0.5 0.6 0.2 0.5 0.4 0.7 0.8 0.4

Streptococcus 0.7 1.7 1.0 0.8 1.3 0.1 0.2 0.1 0.1 0.2 0.1 0.1 0.1

Haliscomenobacter 0.0 0.0 0.0 0.0 0.0 0.5 0.4 0.6 0.5 0.6 0.1 0.4 0.5

Clostridium 0.1 6.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1

unclassified_Incertae sedis 5 0.1 0.0 0.1 0.2 0.1 0.2 0.1 0.0 0.1 0.2 0.1 0.3 0.3

unclassified_Actinobacteria 0.1 0.0 0.2 0.3 0.1 0.1 0.1 0.1 0.0 0.2 0.0 0.0 0.0

Veillonella 0.4 0.9 0.7 0.7 1.7 0.0 0.1 0.1 0.1 0.1 0.0 0.0 0.1

Terrimonas 0.1 0.0 0.0 0.0 0.0 0.3 0.4 0.1 0.2 0.1 0.3 0.7 0.4

unclassified_Chlorobiaceae 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Porphyrobacter 0.1 0.0 0.1 0.1 0.0 0.6 0.7 0.7 0.7 1.0 0.1 0.5 0.8

unclassified_Crenotrichaceae 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.4 0.8 0.3

Acidovorax 0.5 0.7 1.1 0.8 0.7 0.0 0.0 0.0 0.0 0.2 0.1 0.2 0.1

unclassified_Clostridiales 0.1 3.3 0.3 0.0 0.1 0.0 0.1 0.1 0.0 0.1 0.1 0.2 0.1

unclassified_Pseudomonadaceae 0.5 0.5 0.8 0.4 0.2 0.0 0.0 0.1 0.1 0.1 0.1 0.1 0.0

unclassified_Chloroplast 0.1 0.0 0.1 0.0 0.0 0.4 0.5 0.3 0.2 0.5 0.2 0.1 0.1

Erysipelotrich. Incertae Sedis 0.0 6.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Lamprocystis 0.0 0.0 0.0 0.4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0

unclassified_Chromatiaceae 0.0 0.0 0.0 0.2 0.1 0.5 0.6 0.5 0.3 0.5 0.1 0.0 0.1

Marinibacillus 0.0 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Sphingopyxis 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.4 0.4

Thioalkalimicrobium 3.9 0.0 0.6 1.6 0.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Algoriphagus 0.0 0.1 0.0 0.0 0.1 0.3 0.1 0.1 0.3 0.3 0.1 0.5 0.4

unclassified_Rhodocyclaceae 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.2 0.2

Microvirgula 0.4 0.3 0.6 0.3 0.6 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.1

Brevundimonas 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Chryseobacterium 0.5 0.6 0.3 0.3 1.0 0.0 0.0 0.1 0.0 0.1 0.0 0.0 0.1

Methylophilus 0.1 0.0 0.1 0.1 0.0 0.8 0.4 0.5 1.0 0.6 0.1 0.0 0.0

Sphingobium 0.4 0.7 0.8 0.2 0.0 0.0 0.0 0.1 0.0 0.1 0.1 0.1 0.0

Comamonas 0.3 0.4 0.6 0.3 0.1 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.1

unclassified_Saprospiraceae 0.0 0.0 0.2 0.0 0.0 0.5 0.5 0.6 1.0 0.8 0.0 0.0 0.0

unclassified_"Lachnospiraceae" 0.2 1.4 0.8 0.2 0.2 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0

Lactobacillus 0.9 2.0 0.2 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.1

unclassified_Acetobacteraceae 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.2 0.4

Stenotrophomonas 0.0 0.7 0.4 0.3 0.2 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.1

Table 2. Page 6 of 6.

Figure 1. Hierarchical cluster analysis of the lakes by the physico-chemical properties in Table 1, clustered by Ward's algorithm based on Euclidean distances. Each lake is abbreviated by a letter, followed by a number that identifies the site: Border (B), Kokjohn (K), Perrin (P), Smith (S), Tree Claim (T), and Windmill (W).

Figure 2. Pie charts of the lake water pyrosequencing data by phylum. In Tree Claim Lake the abundance of phyla differs markedly from site to site, so the average for Tree Claim Lake is not a good representative of the whole lake.

Figure 3. Each site within Tree Claim Lake by phylum, represented as pie charts.

Figure 4. Dendrogram of lakes clustered by pyrosequencing data, identified by taxonomy in RDP's Classifier after removing unidentified bacteria and rare sequences making up less than 0.1% of the total sequences, and normalized by percent of total reads. Tree Claim is abbreviated TC. The number next to each lake name or abbreviation identifies the site from which the sample was taken.

Figure 5. Mid-IR average spectra of each lake. The five windows of information are: W1, the fatty acid window, 3040-2800 cm⁻¹; W2, the amide window, 1800-1500 cm⁻¹; W3, the mixed window, 1500-1200 cm⁻¹; W4, the polysaccharide window, 1200-900 cm⁻¹; and W5, the fingerprint window, 900-500 cm⁻¹.

Figure 6. Hierarchical cluster analysis of each window, based on D-values of first derivative spectra clustered by Ward's algorithm.

Figure 7. Average first derivative spectra of each lake, illustrating peak differences in each window of information.

Figure 8. Optimized hierarchical cluster analysis of mid-IR spectroscopic data. The dendrogram was constructed from D-values of first derivative data, based on windows with weighting factors in brackets: 3000-2800 cm⁻¹ [1], 1800-1500 cm⁻¹ [3], 1500-1200 cm⁻¹ [3], 1200-900 cm⁻¹ [3], and 900-750 cm⁻¹ [1]. Smith Lake has the least heterogeneity and Tree Claim Lake has the greatest heterogeneity.

Figure 9. A concept sketch of different organisms in different environments, which affect function differently, resulting in different metabolomic similarities.

Conclusion and Future Directions

Mid-IR spectroscopy is a viable tool for fingerprinting the in situ biochemical profile of a planktonic microbial community. The model system tested by this technique was the Alkaline Lakes of Western Nebraska. A review of other techniques used to analyze a microbial community revealed the need for a method that captures the biochemical profile of a microbial community in situ. Mid-IR spectroscopy has been extensively used to classify microorganisms and describe function in axenic cultures. Now, a mid-IR ATR method is available to analyze the in situ function of a planktonic microbial community at a specific moment in time.

Two challenges arose in optimizing the method: interference from infrared absorbance of the filter matrix, and low signal in some lake systems. The method parameters were optimized to overcome both. A combination of PVDF and nylon filter membranes was used; their spectra were cut and merged to avoid interference from the filter. A 20-25 µm pore-sized prefilter minimized the potential for biofouling and provided the greatest signal-to-noise ratio. The 20-25 µm prefilter also had the least variability within replicates, as measured by the mean D-values. These optimal parameters enabled the application of the method.

The mid-IR spectroscopic technique is able to differentiate the biochemical fingerprint of six interdunal lakes of Western Nebraska. Physico-chemical characteristics of six lakes from the Crescent Lake National Wildlife Refuge indicate clear differences in the chemistry of the lakes. Pyrosequencing data of 16S rRNA genes reveal structural differences in the microbial community of the lakes. Mid-IR spectroscopy by attenuated total reflectance is able to discriminate spectral signals from six different lakes, which we believe reflect the function of the microbial community. Similarities between the structural and functional data exist, but the relationships between lakes by both analyses are not identical.

The experiments required to deconvolute the information-rich spectra and assign functional information to spectral patterns will start with many simple models and expand to more complex systems. Small, simple mesocosms, experimental systems that simulate real-life conditions as closely as possible, may be the best approach to acquiring meaningful biological information from mid-IR spectra of controlled communities. Microbial communities with known, active pathways may be identified if several mesocosms are established with the same species and chemistries present in each, but with mutant strains deficient in one enzyme or pathway or supplemented with one enzyme or pathway. A change in a microbial community's function due to a change in chemistry can be controlled in a mesocosm; for example, artificial systems of the same microbial communities at high pH and low pH could be analyzed. The mesocosm approach has the added benefits of being simple and easily accessible: a treatment is simple to apply, and enough samples (N) to achieve statistical significance can be generated quickly. A comparison of mesocosm spectra to actual environmental systems is then necessary to validate that spectral changes in a mesocosm are similar to changes in the environment.

If a healthy and unhealthy microbial community were compared by mid-IR spectroscopy, a method of determining the health of an ecosystem could be established. Numerous healthy and unhealthy samples would need to be compared to develop this potential application as a quality control check. Several mesocosms could be established with the same chemistries and bacterial species present, then half of the mesocosms could encounter a bacteriophage, pollutant, invasive species, or natural chemical (e.g., sulfate) to perturb the lake function.

There are many applications where a recorded image of biochemical activity in a community is useful. This mid-IR technique will likely be applied to other communities as well, including host-associated microbial communities. Methods development must be applied to those communities to determine how to capture the sample, stop metabolism quickly, and acquire a mid-IR spectrum. Then comparisons can be made from one sample or treatment to another and, potentially, the health of the community could be assessed.

Appendix 1. Initial Investigation of the Alkaline Lakes, 2009

The following data are from an initial investigation of the Sandhills Lakes in June 2009.

Materials and Methods

Ten lakes were examined from Sheridan County. The lake water was first collected in 1-liter or 1-gallon sterile containers, and then brought to a centralized location. The water was prefiltered through a 20 µm prefilter and a 5 µm nitrocellulose filter membrane, before being captured onto a 0.2 µm nitrocellulose filter membrane. The capture membrane was then dried in vacuo with desiccant. The dried filters were then bottled in sterile, 50 mL conical tubes with a hole placed in the lid. All conical tubes were then stored in a box with desiccant to keep the samples dry.

At the lab, each nitrocellulose filter was examined by mid-IR spectroscopy. The blank was a virgin nitrocellulose filter.

A sample of each nitrocellulose filter (5 µm and 0.2 µm) was cut and prepared for scanning electron microscopy. The filters were fixed in a 2.5% glutaraldehyde solution. The filters were then dried by a bake-dry method at 80 °C, because the traditional critical point drying method with ethanol caused the nitrocellulose filters to dissolve. The filters were then coated in a gold-palladium alloy and examined by SEM.

Results and Discussion

The resulting spectra are shown in Figure A1-1. Of the ten lakes examined, five lakes produced a very strong signal. These lakes could be resolved by cluster analysis, meaning each lake was discriminated clearly. Five lakes produced a very weak signal. The lakes with weak signals could not be resolved from one another properly.

To determine why some of the lakes produced a strong signal and others did not, the filters were examined by SEM. The resulting images are shown in Figure A1-2. The images indicate that the prefilters from the five lakes with weak signals were biofouled: the prefilter quickly accumulated material that blocked its pores, preventing enough biomass from passing through to be examined later on the capture filter.

This initial examination led to further studies to optimize the method parameters, both to avoid biofouling and to ensure a strong signal on the FT-IR.

Figure A1-1. Five lakes from the Sandhills Region of Western Nebraska provide a strong signal. Five lakes from the Sandhills Region of Western Nebraska provide a weak signal.

Figure A1-2a. Brushpile Lake.

Figure A1-2b. Jessie Lake.

Figure A1-2c. Lilly Lake.

Figure A1-2d. Louden Lake.

Figure A1-2e. Merritt Lake.

Figure A1-2f. 114 Lake (named after a mile marker in the road next to the body of water).

Figure A1-2g. Ed Lake (on the Herian property, named after Uncle Ed, whose house was built next to this lake).

Figure A1-2h. Ellsworth Lake.

Figure A1-2i. Herian Lake.

Figure A1-2j. Windmill Lake.

Appendix 2. Flow Cytometry

A flow cytometer is an instrument used to analyze particles in a fluid sample. A common method for counting cells by flow cytometry is to dye the sample, staining viable cells. The stain must be excited by a wavelength used by the flow cytometer, and the detector must then detect the emission. After excluding autofluorescence, the particles detected with the correct emission are viable cells (4). In this manner viable cells can be stained and counted. This technique is more accurate than plate count methods because a large number of cells are uncultivable.

Flow cytometry can also be used to estimate the size of a particle in a fluid sample (3). Side scatter can be calibrated to standards of particular sizes. I attempted to use this procedure to determine the size of cells in lake water samples after prefiltering through paper filters of varying pore sizes.

Methods

Salt Lake, Capital Beach, Lincoln, NE was sampled in 1-gallon sterile containers and the water stored overnight at 4 °C. The lake water was then pooled in a five-gallon carboy and processed through prefilters of 20-25 µm, 8 µm, and 2.5 µm. Samples of prefiltered water were then stored at 4 °C until analyzed by flow cytometry. For each sample, one 10 mL aliquot was stained with 1 µg/mL ethidium bromide and one 10 mL aliquot was analyzed without stain. Sphero™ polystyrene beads (Spherotech Inc.) of standard sizes (15 µm, 7.6 µm, 5 µm, and 3 µm) were used as calibration standards.

Results and Discussion

Results are shown in Figures A2-1 and A2-2. As shown in the histograms, there does not appear to be a significant difference in particle size based on prefilter. Stained data indicate the percentage of particles that were stained with ethidium bromide: 93.97% with the 20-25 µm prefilter, 91.57% with the 8 µm prefilter, and 89.17% with the 2.5 µm prefilter. The total particles counted as events were 6798 for the 20-25 µm prefilter, 6546 for the 8 µm prefilter, and 6154 for the 2.5 µm prefilter.

When ethidium bromide bound to DNA is excited at 540 nm, emission at 590 nm is visible. The percentages of particles stained imply that the 20-25 µm prefilter allows a higher percentage of biologically significant particles to pass. Flow cytometry with this procedure is known to overestimate the number of cells in a sample (5). This is possibly due to ethidium bromide binding to inorganic phosphates in the environment, which may cause false positive readings. It is also possible that a cell is bound to an inorganic particle, so that the flow cytometer records only one particle, and that particle emits at 590 nm. The particle would then appear to be a cell when, in fact, only a small percentage of that particle is organic.

Figure A2-1. Flow cytometry of prefiltered water. R2-R5 indicate the standard bead sizes calibrated to those areas: R2 is 15 µm, R3 is 7.6 µm, R4 is 5 µm, and R5 is 3 µm.

Figure A2-2. Flow cytometry of prefiltered water stained with ethidium bromide. The percent fluoresced, shown in the lower right-hand corner of each histogram, is the percentage of particles that emitted in the 590 nm range, indicating ethidium bromide bound to DNA.

Appendix 3. GC-FAME

Classifying a microbial community by the lipids present is a well-developed method in soil microbiology (Drijber). The typical methods involve extracting total lipids from soil samples. The lipids are then saponified, forming fatty acid methyl esters (FAMEs), so that they are volatile and can be analyzed by gas chromatography (GC-FAME). Certain types of lipids act as biomarkers: cyclopropane structures are indicative of bacteria, fatty acids branched at the tenth carbon are indicative of Actinomycetes, C18:2cis9,12 is indicative of fungi, and C16:1cis11 is indicative of arbuscular mycorrhizal fungi (2).

The same methods were used to study prefiltered lake water from Chapter 2 of this thesis.

Methods and Materials

Lipids from three 0.2 µm nitrocellulose filters were extracted by a modified Bligh and Dyer (1959) procedure. The filters were cut and shaken in chloroform:methanol (2:1) for 4 hours. Then equal parts chloroform and sterile, distilled water were added and the mixture was shaken for another 30 min. The mixture was rapidly filtered through a Whatman no. 1 filter paper. The filtrate was then moved to a separatory funnel and allowed to settle for 30 min. Lipids from the Whatman no. 1 filter paper were rinsed with chloroform two times. The organic phase at the bottom of the separatory funnel was then removed and heated to 50 °C under vacuum to remove the chloroform. When the total volume was 5 mL or less, the chloroform was moved to a glass test tube with a Teflon-lined cap and placed in a 50 °C heat block to dry under a stream of nitrogen. When all of the chloroform had evaporated, the glass test tube was capped and allowed to return to room temperature before the weight was recorded. Lipids were kept at 4 °C for storage.

Lipids were then suspended in 0.5 mL methanol:toluene (1:1, v/v) with 0.5 mL freshly prepared 0.2 M methanolic KOH. The sample was incubated for 15 min at 37 °C, then neutralized with 255 µL of 1 M acetic acid. Once the tube cooled, 1 mL water and 1 mL heptanes were added and mixed thoroughly for 5 min. The sample was then centrifuged at 4000 RPM for 5 min. The top layer was then transferred to an autosampler vial for analysis by GC-MS.

Results and Discussion

The resulting chromatogram, with peaks analyzed by mass spectrometry, identified no bacterial biomarkers. The absence of bacterial biomarkers, despite prokaryotes verified by DNA in the samples, indicates that the lipid sample may not have been adequate. Only one sample was analyzed by GC-MS, based on the small mass of lipids available in a sample and the lack of bacterial biomarkers in the initial study. No literature has been found that analyzes a planktonic community by GC-FAME. It is possible that planktonic bacteria produce different lipid biomarkers and the catalogue of their fatty acids has not been assessed by this method, yet. Other techniques might need to be explored, such as centrifuging larger quantities of lake water to accumulate enough biomass in order to increase lipid yield.

Figure A3-1. Gas chromatography of fatty acid methyl esters.

Figure A3-2. Chromatogram peaks identified by mass spectrometry. Peak numbers coincide with those labeled in Figure A3-1.

Appendix 4. SEM of Lake Water from the Summer of 2010

Appendix 5. Alkaline Lakes Isolate Database

The Alkaline Lakes Isolate Database (ALID) can be found on the Labshare server at \\beadle3\Labshare-B3\nickerson\plantz\ryan\ALID2. Someone in the lab should be designated the manager. The manager is responsible for creating a backup each month to ensure the current information is secure.

The database opens to the switchboard. On the left is a list of Tables, Queries, Forms, and Reports; clicking on any one of them displays its list. These lists are not needed to begin entering data. Use the three buttons on the 'Main Switchboard' to enter data, view the data, or go directly to a new entry to create a new isolate.

Once the 'Enter Data' button on the switchboard is pressed, the primary form, 'form Isolate Properties with tabbed subforms,' is opened. The initial display is shown above. At the top is 'ID,' the automatic number that serves as an index for the 'Isolate Name.' All data are tied to the 'Isolate Name,' as shown in the table of relationships below.

The relationships are shown on the previous page. The table 'Isolate Name' is related to all other tables.

As seen in the initial screenshot of the primary form, there are tabs for each subform: Permanent ID, Biochem Properties, Acid, Growth Tests, Gelatinase, FTIR, and DNA sequence. The Permanent ID subform is shown. Above the tabs, the isolate name field is displayed. Type in this field only when in a new record and adding a new isolate name. To the right of that is 'Find Isolate Name.' This box allows you to type in an isolate name, or click the down arrow to open a list of all names. Either type the isolate name and press Enter, or click on the isolate name in the list; both take you directly to the isolate you chose.

The Permanent ID subform is used once an isolate is ready to put into permanent storage. 'Isolate ID' serves as the permanent ID and index for the table. The 'Last Modified' field will update automatically. At the bottom of each subform, you will see the 'Record' listed, with arrows pointing left and right. These should only be used if you have more than one entry for the current record. Because one isolate name will only have one permanent ID, you should never go to the second 'Record' of any particular isolate name in the Permanent ID subform.

At the bottom of the primary form you will also see a 'Record' listed, with left and right arrows. The 'Record' menu in the subform controls only that subform. The 'Record' menu at the bottom of the primary form controls the entire form. If you click on the right arrow, it will take you to the next isolate name. If you click on the right arrow with end-line, it will take you to the last isolate name. If you click on the right arrow with yellow asterisk, it will take you to a new record where you can record a new isolate name.

The subform 'Biochem Properties' is shown above. The same functionality in the subform 'Permanent ID' exists in this subform and all the following subforms. This subform is used to record the biochemical experiments: KOH, Gram stain, colony color, cell morphology, oxygen requirement, motility, CMCase, xylanase, spore formation, nitrate reduction, and anaerobic growth.

The subform 'Acid' is used to record whether acid is produced when the isolate is grown on glucose, xylose, arabinose, maltose, cellobiose, and sucrose.

The 'Growth Tests' tab includes three separate subforms: pH Growth Range Test, KCl Growth Range Test, and NaCl Growth Range Test.

The 'Gelatinase' subform is used to record an isolate's gelatinase activity under normal conditions and under alkaline conditions.

The 'FTIR' tab lists convenient notes on the behavior of each isolate when capturing a mid-IR spectrum for classification of the isolate. The entry 'OPUS file' hyperlinks to the OPUS file. 'Excel File' hyperlinks to the data point table of each isolate. 'Cluster Analysis' hyperlinks to the dendrogram showing where each isolate fits among the 21 isolates identified by DNA sequence, as shown below.

The PowerPoint slide displayed shows that PXY22 clusters closely with PXY21, which is currently in the OPUS library of known isolates. The upper right shows how the three separate spectra cluster and the bottom right shows how the average spectrum clusters. The bottom left is a dendrogram of the three replicates, illustrating the RL1 difference of the three replicates.

The fields 'Most Similar To,' 'Threshold,' 'Hit Quality,' and 'Within Clustering Point' indicate the scores based on the OPUS program Ident.

In order for the hyperlinks to work in ALID, all files must be stored in the ALID2 folder. The ALID2 folder currently has a folder for FTIR data and DNA sequences. All OPUS files, data point tables, and cluster analysis files are stored in the FTIR data folder.

The subform 'DNA sequence' is a work in progress. The primer fields are currently set as text entries in which to type the primer sequences used. 'DNA fragment sequence' and 'DNA Sequence Complete' are currently set as hyperlinks to text files for viewing the fragment sequences and the complete DNA sequence, respectively. The complete DNA sequence refers to the complete sequence of the 16S rRNA gene.

When the hyperlink is clicked, the file will open in Word, displaying the sequence.

Appendix 6. Continuity and Lessons Learned

1. Mid-IR Spectroscopy. In every case, for standardization, I vector-normalized and baseline-corrected all spectra. All cluster analysis was done on first derivative spectra by Ward's algorithm. If absorption from 2100-2300 cm⁻¹ is high, ensure in OPUS that vector normalization excludes the CO2 band. Vector normalization mean-centers the spectrum and scales it so that the sum of its squared intensities equals one. Many publications instead normalize to the amide I peak, meaning the amide I peak is adjusted to one and all other points are adjusted by the same factor. If this is done, the CO2 band can be ignored.
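The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the OPUS implementation: the function names are hypothetical, vector normalization is taken in its usual sense (mean-centering followed by scaling to unit Euclidean length), the derivative is a plain finite difference rather than a smoothed derivative, and the D-value is assumed to follow the commonly cited form D = (1 - r) x 1000, where r is Pearson's correlation coefficient between two spectra. The excluded range defaults to the 2100-2300 region named in the text.

```python
import math

def vector_normalize(wavenumbers, absorbance, exclude=(2100.0, 2300.0)):
    """Mean-center and scale a spectrum to unit Euclidean length.

    Points inside `exclude` are left out of the mean and norm so that a
    strong CO2 band does not dominate the scaling (assumed behavior).
    """
    lo, hi = exclude
    kept = [a for w, a in zip(wavenumbers, absorbance) if not (lo <= w <= hi)]
    mean = sum(kept) / len(kept)
    norm = math.sqrt(sum((a - mean) ** 2 for a in kept))
    return [(a - mean) / norm for a in absorbance]

def first_derivative(wavenumbers, absorbance):
    """Finite-difference first derivative between neighboring points."""
    return [
        (absorbance[i + 1] - absorbance[i]) / (wavenumbers[i + 1] - wavenumbers[i])
        for i in range(len(absorbance) - 1)
    ]

def d_value(spectrum_a, spectrum_b):
    """Spectral distance, assumed here to be (1 - Pearson r) * 1000."""
    n = len(spectrum_a)
    mean_a = sum(spectrum_a) / n
    mean_b = sum(spectrum_b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(spectrum_a, spectrum_b))
    var_a = sum((p - mean_a) ** 2 for p in spectrum_a)
    var_b = sum((q - mean_b) ** 2 for q in spectrum_b)
    r = cov / math.sqrt(var_a * var_b)
    return (1.0 - r) * 1000.0
```

The pairwise D-values computed this way could then be passed to any hierarchical clustering routine that implements Ward's algorithm.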

If samples flake from the filters after drying, it helps to allow suction from the vacuum to pull air through the filter for about one minute. This ensures the sample is not saturated with water and will adhere to the filter. If this does not alleviate the problem, others in the lab have attained great spectra by recording spectra from material that has flaked off the filter. In this case, the blank should be taken by clamping the press down on the crystal with nothing present (in effect, the thin layer of air between the press and the crystal is the blank). When done in this manner, no interference from the filter is observed.

Another publication using a nitrocellulose filter and ATR encountered no interference from the filter (1). I contacted Dr. Lisa Mauer to find out how her lab avoided this interference, but received no reply.

2. Water Chemistry. I learned a lot about lake water chemistry from Dr. David Gosselin. As water chemistry quality control checks, one can balance the mass (each ion concentration versus the total dissolved solids) and balance the charge. When the mass balance is done, bicarbonate ($\mathrm{HCO_3^-}$) must be adjusted by a factor of 0.4917. This is because the bicarbonate in the water will convert to CO2 when you dry the water sample in the oven to calculate total dissolved solids ($2\,\mathrm{HCO_3^-} \rightarrow \mathrm{CO_3^{2-}} + \mathrm{H_2O} + \mathrm{CO_2(g)}$).
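The factor of 0.4917 follows from the stoichiometry of the drying reaction: two moles of bicarbonate leave one mole of carbonate in the dried residue. A short check, assuming rounded standard atomic weights (the variable and function names are illustrative only):

```python
# Rounded standard atomic weights in g/mol (assumed values).
H, C, O = 1.008, 12.011, 15.999

M_HCO3 = H + C + 3 * O   # bicarbonate, about 61.02 g/mol
M_CO3 = C + 3 * O        # carbonate, about 60.01 g/mol

# 2 HCO3- -> CO3^2- + H2O + CO2(g): two bicarbonates leave one carbonate.
BICARBONATE_FACTOR = M_CO3 / (2 * M_HCO3)

def bicarbonate_as_residue(hco3_mg_per_l):
    """Carbonate mass (mg/L) left in the dried residue by measured bicarbonate."""
    return hco3_mg_per_l * BICARBONATE_FACTOR

print(round(BICARBONATE_FACTOR, 4))  # prints 0.4917
```

Under these assumptions, 100 mg/L of measured bicarbonate would contribute roughly 49 mg/L to the total dissolved solids determined by oven drying.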

To balance the charge, a useful spreadsheet is available from the website www.coalgeology.com. An example of the spreadsheet can be seen in Figure A6-1. The spreadsheet allows you to list each ion you have measured and add its atomic weight and measured concentration; equations are provided to calculate the molar concentration (mmol/L). The valence must then be added to obtain the milliequivalents per liter (meq/L). The spreadsheet sums the meq/L values in the cation category and in the anion category, then calculates the charge balance error (CBE).
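A minimal sketch of the calculation such a worksheet performs, assuming the conventional definition of charge balance error, CBE = 100 x (sum of cations - sum of anions) / (sum of cations + sum of anions), with both sums in meq/L. The exact formula used in the coalgeology.com worksheet is not reproduced here, and the function names and example concentrations are hypothetical.

```python
def meq_per_liter(mg_per_l, formula_weight, valence):
    """Convert a concentration in mg/L to milliequivalents per liter."""
    mmol_per_l = mg_per_l / formula_weight
    return mmol_per_l * abs(valence)

def charge_balance_error(cations, anions):
    """Percent CBE from lists of (mg/L, formula weight, valence) tuples."""
    total_cations = sum(meq_per_liter(*ion) for ion in cations)
    total_anions = sum(meq_per_liter(*ion) for ion in anions)
    return 100.0 * (total_cations - total_anions) / (total_cations + total_anions)

# Hypothetical example values, not measurements from the thesis.
cations = [(46.0, 22.99, 1), (20.0, 40.08, 2)]    # Na+, Ca2+
anions = [(61.0, 61.02, -1), (71.0, 35.45, -1)]   # HCO3-, Cl-
print(f"CBE = {charge_balance_error(cations, anions):.1f} %")
```

A CBE near zero indicates that the measured cations and anions account for each other; a large positive or negative value suggests that an ion was missed or mismeasured.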

Figure A6-1. The cation/anion balance calculation worksheet from http://coalgeology.com.

References (From Appendices)

1. Davis, R., Burgula, Y., Deering, A., Irudayaraj, J., Reuhs, B.L., and Mauer, L.J. 2010. Detection and differentiation of live and heat-treated Salmonella enterica serovars inoculated onto chicken breast using Fourier transform infrared (FT-IR) spectroscopy. Journal of Applied Microbiology, in press.

2. Frey, S.D., Drijber, R., Smith, H., and Melillo, J. 2008. Microbial biomass, functional capacity, and community structure after 12 years of soil warming. Soil Biology and Biochemistry, 40: 2904-2907.

3. Katz, A., Alimova, A., Xu, M., Rudolph, E., Shah, M.K., Savage, H.E., Rosen, R.B., McCormick, S.A., and Alfano, R.R. 2003. Bacteria size determination by elastic light scattering. IEEE Journal of Selected Topics in Quantum Electronics, 9: 277-287.

4. Porter, J., Edwards, C., Morgan, A.W., and Pickup, R.W. 1993. Rapid, automated separation of specific bacteria from lake water and sewage by flow cytometry and cell sorting. Applied and Environmental Microbiology, 59: 3327-3333.

5. Sieracki, M.E., Haugen, E.M., and Cucci, T.L. 1995. Overestimation of heterotrophic bacteria in the Sargasso Sea: direct evidence by flow and imaging cytometry. Deep Sea Research I: Oceanographic Research Papers, 42: 1399-1409.