
Characterization of the Cyanobacterial Harmful Algal Bloom Community in Hamilton Harbour, Lake Ontario

by

Rosemary Saati

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Cell and Systems Biology
University of Toronto

© Copyright by Rosemary Saati 2016

Abstract

Little is known about the interspecific relationships between and heterotrophic microorganisms throughout Cyanobacterial harmful algal bloom (CyanoHAB) development, mainly because most aquatic microorganisms cannot be cultured in the laboratory. The culture-independent technique Terminal Restriction Fragment Length Polymorphism (T-RFLP) was used to fingerprint microbial community DNA collected from Hamilton Harbour water throughout the summers of 2014 and 2015. Tag-encoded 454 pyrosequencing and shotgun sequencing revealed the dominant and throughout each summer. Actinobacteria was the dominant bacterial in June while Cyanobacteria and were co-dominant from August to September. Shotgun data also revealed the functional potential of the microbial community. Cyanobacterial dominance was associated with a higher potential for nitrate/nitrite assimilation and in the late summer. These results provide the first metagenomic characterization of the total CyanoHAB community in Hamilton Harbour.

Correlations with physico-chemical parameters provide further insight, relevant to future management, into which factors could be shaping the bacterial community throughout the summer.


Acknowledgments

The biggest thank-you goes to my supervisor, Dr. Roberta Fulthorpe, for allowing me to pursue this degree under her supervision. I truly could not have asked for a more supportive and encouraging supervisor throughout this journey. I would also like to thank my committee members Dr. George Espie and Dr. George Arhonditsis for your critique and invaluable advice during meetings.

I would like to extend my thanks to Dr. Arhonditsis for allowing me to participate in the Hamilton Harbour project. Also, thank you to collaborating researchers that helped with the water sampling and kindly shared the physico-chemical data including Dr. Susan Watson’s group (Environment Canada), Mark Verschoor in Dr. Lewis Molot’s group (YorkU) and Dr. Stefan Markovic in Dr. Maria Dittrich’s group (UTSC).

I would also like to acknowledge my lab colleagues: Roxana Shen, Genevieve Noyce, Nicole Ricker and Rhea Lamactud. Thank you for keeping me sane when the science wasn’t working and, of course, all the advice.

Lastly, I’d like to thank my family for believing in me and supporting me unconditionally. You can finally turn the TV volume up now :)


“Study the science of art. Study the art of science. Develop your senses, especially learn how to see. Realize that everything connects to everything else.” - Leonardo da Vinci


Table of Contents

List of Figures

List of Tables

List of Appendices

Abbreviations

Chapter 1: Introduction

1.1 Anthropogenic Eutrophication and Algal Blooms

1.2 Consequences of Cyanobacterial Blooms

1.3 Current Understanding of CyanoHAB Development and Persistence

1.4 Concerns in the Great Lakes

1.5 Study Site: Hamilton Harbour

1.6 The Application of Molecular Tools to Understand CyanoHAB’s

1.6.1 Why Use Molecular Tools?

1.6.2 Terminal Restriction Fragment Length Polymorphism (T-RFLP)

1.6.3 DNA Sequencing

1.7 Research Objective and Hypotheses

Chapter 2: Community Structure and Correlations to Physico-chemical Parameters

2.1 Introduction

2.2 Methods

2.2.1 Choice of Restriction Enzymes for T-RFLP

2.2.2 Sample Collection

2.2.3 DNA Isolation

2.2.4 Preparation of DNA Amplicons for T-RFLP

2.3 Statistical Analyses

2.3.1 Data Trimming and Diversity of Terminal Fragments


2.3.2 Ordination of Phylotype and Environmental Data

2.4 Results

2.4.1 Terminal Fragment Diversity

2.4.2 Microbial Community Structure

2.4.3 Environmental Correlates

2.5 Discussion

Chapter 3: Phylogenetic Composition of the 2014 and 2015 Communities

3.1 Introduction

3.2 Methods

3.2.1 Preparation of samples for 454 pyrosequencing and shotgun metagenomics

3.2.2 Bioinformatic Pre-processing

3.2.2.1 Pre-processing of 454-pyrosequencing data

3.2.2.2 Pre-processing of metagenomic data

3.2.3 Classification

3.2.3.1 Tag-encoded Amplicon 454 Pyrosequencing data

3.2.3.2 Metagenome data and MEGAN

3.2.4 Phylogenetic Tree Building of 454 pyrosequencing data

3.2.5 Diversity

3.2.5.1 454-Pyrosequencing

3.2.5.2 Metagenomes

3.3 Results

3.3.1 Sequence Statistics

3.3.2 Bacterial Community (2014)

3.3.3 Eukaryotic Community (2014)

3.3.4 Metagenome Taxonomy (2015)

3.3.5 Diversity


3.6 Discussion

Chapter 4: Functional Potentials of the 2015 Community

4.1 Introduction

4.2 Methods

4.3 Results

4.3.1 General Functions

4.3.2 Nitrogen Metabolism Overall

4.3.3 Nitrogen Transport

4.3.4 Intracellular Nitrogen Reduction

4.4 Discussion

Chapter 5: Conclusion and Future Direction


List of Figures

Figure 1.1: Map of Hamilton Harbour and its surrounding watershed. Image taken from Dermott et al. (2007); scale bar = 1800 m.

Figure 2.1: Diagram of Hamilton Harbour with the inshore site 9031 (I) and offshore site 1001 (O) highlighted.

Figure 2.2: Bacterial phylotype richness throughout 2014 and 2015.

Figure 2.3: Eukaryotic phylotype richness throughout 2014 and 2015.

Figure 2.4: a) PCoA results for the 2014 bacterial community T-RF dataset. Inshore (I) and offshore (O) samples are plotted with different symbols. The number following the sample site denotes the sampling week: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17, 5 = July 31, 6 = August 14, 7 = August 27. PCo1 has a relative corrected eigenvalue of 0.36 and PCo2 has a relative corrected eigenvalue of 0.24. b) Dendrogram results for the same dataset.

Figure 2.5: a) PCoA results for the 2015 bacterial community T-RF dataset. Inshore (I) and offshore (O) samples are plotted with different symbols. The number following the sample site denotes the sampling week: 5 = July 30, 6 = August 13, 7 = August 27, 8 = September 10, 9 = September 24. PCo1 has a relative corrected eigenvalue of 0.72 and PCo2 has a relative corrected eigenvalue of 0.11. b) Dendrogram results for the same dataset.

Figure 2.6: a) PCoA results for the 2014 eukaryotic community T-RF dataset. Inshore (I) and offshore (O) samples are plotted with different symbols. The number following the sample site denotes the sampling week: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17, 5 = July 31, 6 = August 14, 7 = August 27. PCo1 has a relative corrected eigenvalue of 0.35 and PCo2 has a relative corrected eigenvalue of 0.21. b) Dendrogram results for the same dataset.

Figure 2.7: a) PCoA results for the 2015 eukaryotic community T-RF dataset. Inshore (I) and offshore (O) samples are plotted with different symbols. The number following the sample site denotes the sampling week: 5 = July 30, 6 = August 13, 7 = August 27, 8 = September 10, 9 = September 24. PCo1 has a relative corrected eigenvalue of 0.48 and PCo2 has a relative corrected eigenvalue of 0.22. b) Dendrogram results for the same dataset.


Figure 2.8: a) PCoA results for the 2014 and 2015 bacterial community T-RF datasets. Inshore (I) and offshore (O) samples are plotted with different symbols. Red data points represent the 2014 samples and blue data points represent the 2015 samples. The number following the sample site denotes the sampling week. In 2014: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17, 5 = July 31, 6 = August 14, 7 = August 27. In 2015: 5 = July 30, 6 = August 13, 7 = August 27, 8 = September 10, 9 = September 24. PCo1 has a relative corrected eigenvalue of 0.46 and PCo2 has a relative corrected eigenvalue of 0.14. b) Dendrogram results for the same dataset. The 2014 samples are distinguished by -1 and the 2015 samples are distinguished by -2.

Figure 2.9: a) PCoA results for the 2014 and 2015 eukaryotic community T-RF datasets. Inshore (I) and offshore (O) samples are plotted with different symbols. Red data points represent the 2014 samples and blue data points represent the 2015 samples. The number following the sample site denotes the sampling week. In 2014: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17, 5 = July 31, 6 = August 14, 7 = August 27. In 2015: 5 = July 30, 6 = August 13, 7 = August 27, 8 = September 10, 9 = September 24. PCo1 has a relative corrected eigenvalue of 0.25 and PCo2 has a relative corrected eigenvalue of 0.18. b) Dendrogram results for the same dataset. The 2014 samples are distinguished by -1 and the 2015 samples are distinguished by -2.

Figure 2.10: The db-RDA results of the 2014 bacterial community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 44% and unconstrained ordination collectively explained 56% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 22% of the total variation and the second axis (CAP2) was also constrained but only captured 20% of the total variation.

Figure 2.11: The db-RDA results of the 2015 bacterial community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 80% and unconstrained ordination collectively explained 20% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 73% of the total variation and the second axis (CAP2) was also constrained but only captured 0.06% of the total variation.

Figure 2.12: The db-RDA results of the 2014 eukaryotic community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 40% and unconstrained ordination collectively explained 60% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 20% of the total variation and the second axis (CAP2) was also constrained but only captured 13% of the total variation.

Figure 3.1: Diversity analysis of the bacterial community throughout the 2014 sampling season: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17-July 31, 5 = August 14-August 27. Top: Data are plotted as OTU rarefaction curves representing richness as a) observed OTU’s and b) PD_whole_tree, which takes into account phylogenetic distance between 16S OTU’s. Bottom: PCoA ordination of 16S taxonomic dissimilarities between pooled samples with (c) Bray-Curtis distance matrix and (d) Weighted Unifrac, which accounts for phylogenetic distances between OTU’s.

Figure 3.2: The bacterial assemblage throughout the summer of 2014. Sample identification is based on DNA pools: 1 = June 5, 2 = June 19, 3 = July 2, 4 = July 17-July 31, 5 = August 14-August 27.

Figure 3.3: Correlations between 2014 bacterial higher-level taxa over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.

Figure 3.4: Diversity analysis of the eukaryotic community throughout the 2014 sampling season: 1=June 5, 2= June 19, 3= July 2-July 17, 4= July 31(inshore), 5=July 31(offshore)- August 14, 6= August 27. Top: Data is plotted as OTU rarefaction curves representing richness as a) observed OTU’s and b) PD_whole_tree taking into account phylogenetic distance between 18S OTU’s. Bottom: PCoA ordination of 18S taxonomic dissimilarities between pooled samples with (c) Bray-Curtis distance matrix (d) Weighted Unifrac, which accounts for phylogenetic distances between OTU’s.


Figure 3.5: The eukaryotic assemblage throughout the summer of 2014. Sample identification is based on DNA pools: 1=June 5, 2= June 19, 3=July 2-July 17, 4=July 31(inshore), 5=July 31(offshore)- August 14, 6=August 27.

Figure 3.6: Correlations between 2014 eukaryotic phyla over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.

Figure 3.7: Higher-level taxonomic assignment of the 2015 bacterial community over space and time in Hamilton Harbour. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; O=offshore).

Figure 3.8: Correlations between most abundant bacterial and eukaryotic taxa (>1% total community) over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.

Figure 3.9: Rarefaction curves generated for the 2015 metagenomes generated by MG-RAST. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; O=offshore).

Figure 3.10: PCoA ordination of the 2015 metagenomes. Sample identification is represented by month (J=July; A=August; S=September) and sampling day. Inshore and offshore samples are plotted with different symbols. Biplot vectors in green indicate which taxa have the largest influence in the PCoA plot.

Figure 4.1: Categories within nitrogen metabolism as described by SEED Subsystems.

Figure 4.2: Nitrogen transport potential throughout the summer. Sequences coding for a) ammonium and b) nitrate/nitrite transporter normalized against the bacterial housekeeping gene rpoB. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; O=offshore).

Figure 4.3: Taxonomy assignment for the ammonium transporter genes. Taxa contributing to >1% of the genes are represented in the pie charts.


Figure 4.4: Taxonomy assignment for the nitrate/nitrite transporter genes. Taxa contributing to >1% of the genes are represented in the pie charts.

Figure 4.5: Network of nitrogen metabolism pathways, including the enzymes (in boxes) that catalyze the conversion of different nitrogen forms. Genes detected in the July 30 and August 14 communities (blue) and the August 27-September 24 communities (red) are highlighted. Each enzyme code corresponds to a unique KEGG accession number. The pathways involved in prokaryotic nitrogen metabolism that are investigated in this chapter are within the red square.

Figure 4.6: Intracellular nitrogen reduction by enzymes described by KEGG with the corresponding accession numbers in []. The values plotted are a ratio between each gene and the copy number of rpoB present within each dataset.

Figure 4.7: Taxonomy assignment to the genes encoding subunits of the major enzymes responsible for catalyzing the stepwise reduction of nitrate to ammonium. Relative percentages of each taxon were then multiplied by their normalized gene copy values obtained from Figure 4.6.

Figure 4.8: Taxonomy assignment of sequences corresponding to nifH, representing the nitrogenase complex that catalyzes the reduction of dinitrogen to ammonium. Relative percentages of each taxon were then multiplied by their normalized gene copy values obtained from Figure 4.6.


List of Tables

Table 2.1: Oligonucleotide primers that were used to amplify 16S and 18S rRNA target genes.

Table 2.2: PCR conditions for three target genes: I= Initiation, D= Denaturation, A= Annealing, E= Elongation, FE= Final Elongation steps. Steps D-E were repeated for 35 cycles.

Table 2.3: Results of PERMANOVA tests for effect of site, week and year on the microbial community structure (permutations=999) with correlation coefficient R² assigned and p values: <0.05*, <0.01**, <0.001***.

Table 2.4: Mantel statistic r based on Pearson correlations between standardized environmental parameter distance matrices and phylotype distance matrices and pH/DO/temperature distance matrices (2014 only) below. The Mantel tests based on correlations between bacterial and eukaryotic distance matrices are also indicated. Significance was based on permutations (perm=999) with p values (<0.05*).

Table 2.5: Bioenv results for subsets of environmental variables with the best correlation to community data. Codes for each parameter are listed in Appendix A (Table A.6). Those with significant correlations (*) were further investigated by db-RDA analysis.

Table 3.1: Taxonomy assignment of the 2015 metagenomes at the domain level according to MEGAN. The values indicate relative percentages (%) of each domain within each sample. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; O=offshore).

Table 4.1: Nitrogen reduction genes in biological . Modified from the table in Moreno-Vivián et al. (1999).


List of Appendices

Appendix A

Appendix B

Appendix C


List of Abbreviations

AOC- Area of Concern

bp- base pair

C- Carbon

CyanoHAB- Cyanobacterial Harmful Algal Bloom

db-RDA- Distance Based Redundancy Analysis

HAB- Harmful Algal Bloom

MC- Microcystin

MEGA7- Molecular Evolutionary Genetics Analysis 7

MEGAN- Metagenomics Analyzer

MG-RAST- Metagenomics Rapid Annotation Using Subsystem Technology

N- Nitrogen

NCBI- National Center for Biotechnology Information

OTU- Operational Taxonomic Unit

P- Phosphorus

PCoA- Principal Coordinates Analysis

PCR- Polymerase Chain Reaction

QIIME- Quantitative Insights Into Microbial Ecology

RAP- Remedial Action Plan

SSU rRNA- Small subunit ribosomal RNA

TP- Total Phosphorus

T-RF- Terminal Restriction Fragment

T-RFLP- Terminal Restriction Fragment Length Polymorphism

WHO- World Health Organization


Chapter 1 Introduction

1.1 Anthropogenic Eutrophication and Algal Blooms

Anthropogenic eutrophication is the excessive loading of nutrients into aquatic systems as a result of human activity. Freshwater coastal regions have been particularly affected by eutrophication over the last 100 years because of urbanization and industrial and agricultural development. Oversupply of nutrients, mainly nitrogen and phosphorus, has stimulated a higher frequency of algal blooms that have greatly impacted the health of coastal freshwater systems around the world (Rastogi et al., 2015). During a bloom, primary production occurs at rates such that algal biomass accumulation exceeds dispersal through physical and biological processes. The result is an imbalance in natural biogeochemical cycling with numerous downstream consequences. These blooms are sometimes dominated by phytoplankton that produce and release toxic metabolites into their environment. Because of the negative impacts that blooms have on their environment and other organisms, they are often referred to as harmful algal blooms (HAB’s).

1.2 Consequences of Cyanobacterial Blooms

Excessive phytoplankton density increases water turbidity in the summer, which can hamper light penetration in shallow lakes and impinge on macrophyte growth and distribution (Kissoon et al., 2013). Dead phytoplankton biomass sinks to the sediments and is subjected to bacterial decomposition, leading to oxygen depletion (Palmer, 1997). The hypolimnion remains in an oxygen-depleted (hypoxic) state until it can be re-oxygenated by significant mixing events (Hutchinson, 1938). If hypoxia reaches extreme levels, massive fish kills can result (CENR, 2003).

Freshwater blooms associated with eutrophication are sometimes accompanied by a proliferation of problematic toxin-producing genera of phytoplankton. When the bloom population consists mainly of Cyanobacteria, the blooms are referred to as CyanoHAB’s. At high nutrient levels, Cyanobacteria and some eukaryotic phyla can become harmful, but Cyanobacteria are notoriously the most problematic phytoplankton in freshwater (Paerl et al., 2001). Cyanobacterial metabolites are known to cause unpleasant tastes and odors in drinking water (Watson, 2004). It is estimated that between 25% and 75% of Cyanobacterial blooms produce secondary metabolites called cyanotoxins (Chorus, 2001; Bláhová et al., 2007). Field data have suggested that these toxins can enter the food chain, reach higher trophic levels, and accumulate in zooplankton and fish (Lehman et al., 2010).

Studies of carcinogenesis by microcystins (MC’s), one of the most widespread groups of cyanotoxins, in rodent models in vitro have shown that chronic exposure to the specific isoform MC-LR has the potential to impair vital immune responses and indirectly increase the risk of cancer development (Lone et al., 2016). The World Health Organization (WHO) has defined its basic standard for problematic blooms more strictly, as a total Cyanobacterial cell density exceeding 100,000 cells/mL. Cell density can be estimated indirectly from chlorophyll a measurements, with 50 μg/L as the corresponding value if Cyanobacteria dominate (Bartram et al., 1994). Taking uncertainty factors into account, WHO has also set provisional guideline values for microcystin of 1 μg/L for drinking water and 20 μg/L for recreational uses (Bartram et al., 1994). In addition to the risks associated with levels measured within the water, CyanoHAB’s in recreational waters pose a threat to and humans when toxic scums are deposited on the shore in wind-concentrated areas (Zaccaroni & Scaravelli, 2008).
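As a rough illustration of how the thresholds quoted above might be applied when screening monitoring data, the following Python sketch flags a sample against each value. The function name, argument names, and returned labels are hypothetical, the toxin thresholds are assumed to refer to a measured microcystin concentration, and treating every value as a strict exceedance limit is also an assumption. It is not an official WHO procedure.

```python
from typing import List, Optional

# Threshold values as quoted in the text (Bartram et al., 1994).
CELL_DENSITY_LIMIT = 100_000      # Cyanobacterial cells per mL
CHLOROPHYLL_A_LIMIT = 50.0        # ug/L, relevant when Cyanobacteria dominate
DRINKING_TOXIN_LIMIT = 1.0        # ug/L, provisional guideline value
RECREATIONAL_TOXIN_LIMIT = 20.0   # ug/L


def screen_sample(cells_per_ml: Optional[float] = None,
                  chlorophyll_a: Optional[float] = None,
                  toxin: Optional[float] = None) -> List[str]:
    """Return a label for every threshold that the sample exceeds.

    Measurements that were not taken can be left as None and are skipped.
    """
    flags = []
    if cells_per_ml is not None and cells_per_ml > CELL_DENSITY_LIMIT:
        flags.append("cell density")
    if chlorophyll_a is not None and chlorophyll_a > CHLOROPHYLL_A_LIMIT:
        flags.append("chlorophyll a")
    if toxin is not None:
        if toxin > DRINKING_TOXIN_LIMIT:
            flags.append("drinking water")
        if toxin > RECREATIONAL_TOXIN_LIMIT:
            flags.append("recreational")
    return flags
```

For instance, `screen_sample(toxin=5.0)` would flag only the drinking-water value, while a sample sitting exactly on every threshold would not be flagged under the strict-exceedance assumption.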

1.3 Current Understanding of CyanoHAB Development and Persistence

The natural summer assemblage of freshwater phytoplankton includes phylogenetically diverse taxa such as the eukaryotic Chlorophyta (green algae), Dinophyta (dinoflagellates), Chrysophyta (golden algae) and Cryptophyta, and the prokaryotic Cyanobacteria (Paerl et al., 2001). Both top-down and bottom-up dynamics in lake systems have quantitative and qualitative effects on phytoplankton community composition and population abundance.

The Redfield ratio, described by Alfred C. Redfield in 1934, defines the stoichiometric ratio between carbon (C), nitrogen (N) and phosphorus (P) as 106:16:1 in all dead and living marine (Redfield, 1934). This ratio is a fundamental feature of our understanding of the biogeochemical cycles of the oceans. In marine systems, nitrogen limitation is assumed when the N:P ratio is below 16:1 and phosphorus limitation when the ratio is higher (Tyrell, 1999). In freshwater systems C:N:P ratios are not so constant: lakes are generally reported to have both higher spatio-temporal variability and higher mean C:P and C:N ratios than the offshore oceans (Hecky et al., 1993). Nutrient sources of lakes do not always follow the Redfield ratio and can range from an N:P mass ratio of ~200 from precipitation, groundwater and to <1 from sediments, sewage, urban runoff and feces inflow (Downing & McCauley, 1992). In contrast to the abiotic component, the range of N:P mass ratios within freshwater phytoplankton is relatively restricted and is closer in absolute magnitude to the Redfield ratio, varying between 7.1 and 44.2 (Downing & McCauley, 1992). Specific values depend on the composition of the dominant biological species present and on the nutrient state of the system. For example, N:P mass ratios in living freshwater organisms are highest (~40 to ~44) in algae and macrophytes of P-deficient systems, ~36 in bloom-forming Cyanobacteria and ~22 in algae of systems with high P (Downing & McCauley, 1992). Therefore, in eutrophic systems with high P, Cyanobacteria may only predominate if high N is also supplied or if they can obtain N by other means, such as biological nitrogen fixation, to meet their relatively higher cellular N needs.
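Because the paragraph above mixes a molar ratio (Redfield's 16:1) with mass ratios (the Downing & McCauley values), a small conversion sketch may help when comparing them. The function names below are hypothetical, and the simple below/above-16 rule is the marine rule of thumb quoted in the text; applying it to a lake sample is an assumption made only for illustration.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS_N = 14.007
ATOMIC_MASS_P = 30.974

# N:P from the Redfield C:N:P ratio of 106:16:1 (molar).
REDFIELD_NP_MOLAR = 16.0


def molar_to_mass_ratio(np_molar: float) -> float:
    """Convert an N:P molar ratio into an N:P mass ratio."""
    return np_molar * ATOMIC_MASS_N / ATOMIC_MASS_P


def mass_to_molar_ratio(np_mass: float) -> float:
    """Convert an N:P mass ratio into an N:P molar ratio."""
    return np_mass * ATOMIC_MASS_P / ATOMIC_MASS_N


def likely_limiting_nutrient(np_molar: float) -> str:
    """Apply the marine rule of thumb: N-limited below 16:1, P-limited above."""
    if np_molar < REDFIELD_NP_MOLAR:
        return "N"
    if np_molar > REDFIELD_NP_MOLAR:
        return "P"
    return "balanced"
```

Under these atomic masses the Redfield N:P of 16:1 corresponds to a mass ratio of about 7.2, close to the lower end of the phytoplankton range cited above, and a mass ratio of ~36 corresponds to a molar ratio of roughly 80.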

Biological kinetics is considered important in explaining the succession of phytoplankton in dynamic freshwater systems. The classical model in the literature explains the dominance of Cyanobacteria in eutrophic systems as a result of their having higher uptake rates for P than eukaryotic algae when P is high, whereas in systems with low P (i.e. oligotrophic systems) they have lower uptake rates for P than eukaryotes (Scavia et al., 1988). This can explain why eukaryotes are seen to dominate oligotrophic systems whereas Cyanobacteria dominate systems affected by a history of eutrophication. This theory would further imply that eukaryotic phytoplankton should outcompete their prokaryotic counterparts from summer to early fall, when dissolved P reaches low levels in the water column after being incorporated into spring and early summer primary production (Hyenstrand et al., 2001; Watson et al., 2008). Yet this is not always supported by data collected in the field. For example, Hamilton Harbour is a historically eutrophic system that still faces Cyanobacterial blooms in the late summer despite significant reductions in exogenous N and P loading into the system. Microscope analyses of the water column community suggest that a diversity of eukaryotic algae, including green algae, diatoms and dinoflagellates, peaks in the spring when runoff supposedly enriches the system, while potential MC-producing Cyanobacteria are the dominant phytoplankton in the late summer (Dermott et al., 2007).
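The uptake argument above can be sketched numerically. The text does not specify a functional form, so the example below assumes saturating Michaelis-Menten uptake, a common choice for nutrient kinetics; the class, the function names, and both parameter sets are hypothetical values in arbitrary units, chosen only so that the two curves cross.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UptakeTraits:
    v_max: float  # maximum uptake rate (arbitrary units)
    k_s: float    # half-saturation constant (same units as P concentration)


def uptake_rate(p: float, traits: UptakeTraits) -> float:
    """Michaelis-Menten uptake rate at dissolved P concentration p."""
    return traits.v_max * p / (traits.k_s + p)


def crossover_concentration(a: UptakeTraits, b: UptakeTraits) -> Optional[float]:
    """P concentration at which both groups take up P at the same rate.

    Returns None when the curves do not cross at a positive concentration.
    """
    if a.v_max == b.v_max:
        return None
    p = (b.v_max * a.k_s - a.v_max * b.k_s) / (a.v_max - b.v_max)
    return p if p > 0 else None


# Hypothetical traits: a high-capacity, low-affinity group versus a
# low-capacity, high-affinity group. These are not measured values.
CYANOBACTERIA = UptakeTraits(v_max=2.0, k_s=1.0)
EUKARYOTE = UptakeTraits(v_max=1.0, k_s=0.1)
```

With these illustrative values the high-affinity group takes up P faster below the crossover concentration and the high-capacity group takes it up faster above it, which is the qualitative pattern the classical model describes. As the Hamilton Harbour observations show, field data do not always follow this pattern.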

Ammonium is the preferred source of dissolved nitrogen for uptake and assimilation by phytoplankton including Cyanobacteria, but they can also transport and assimilate nitrate/nitrite (Dortch, 1990). If biologically available forms of nitrogen become limiting in the late summer, some bloom-forming Cyanobacteria have the potential to fix atmospheric nitrogen into ammonium, introducing N to the system. Nitrogen fixation is also thought to play a significant role in the development of CyanoHAB’s in systems where there is a noticeable shift from non-nitrogen fixers to nitrogen fixers (diazotrophs) in the late summer (Beversdorf et al., 2013). Theoretically, therefore, increasing exogenous N would lead to the dominance of non-nitrogen-fixing Cyanobacteria such as Microcystis, while an N-limited system can be dominated by diazotrophs. For this reason, Schindler (2012) suggests that management should always focus on P reduction, since controlling N inputs may be counterproductive by promoting the proliferation of either diazotrophs or non-diazotrophic Cyanobacteria, both of which are undesirable because they include genera that can produce toxins and form dense blooms. The relationships between the absolute abundances of N and P and/or their ratio and the subsequent shift to Cyanobacterial dominance are not clear, and there is an ongoing debate about which should be the focus of management efforts.

Despite the importance of nutrient availability, many other abiotic factors are at play. Many phytoplankton require relatively high light intensities for photosynthesis (~120 μE·m⁻²·s⁻¹) and high temperatures (20-30°C), although optima and tolerance thresholds can vary significantly between species (Latala, 1991). Cyanobacteria have evolved mechanisms to exploit light at both high and low intensities. They have photoprotective mechanisms to overcome high-light exposure (Lohscheider et al., 2011). They also have unique phycobilisome proteins involved in photosynthesis that allow them to harvest light over a wider range of wavelengths than eukaryotic phytoplankton, and they can therefore thrive in more turbid waters (Grossman et al., 1993). Many Cyanobacteria favor water temperatures of >25°C, which is characteristic of late-summer temperatures in lakes (Skulberg & Utkilen, 1999). Temperature optima can vary between Cyanobacteria species, which can significantly influence the composition and structure of the phytoplankton community (Robarts & Zohary, 1987). Some potentially toxic genera, including Microcystis, have pressure-resistant gas vesicles that provide them with buoyancy regulation. With this ability, they can migrate vertically in response to changing environmental conditions to overcome light and nutrient limitation in a stratified water column (Chu et al., 2007; Ganf & Olivier, 1982). Some researchers hypothesize this to be advantageous when loading of P, ferrous iron and other nutrients from sediments occurs near the anoxic zone in late summer (Molot et al., 2014). Yet there is recent speculation that buoyant Cyanobacteria such as Planktothrix lose this advantage when vertical disturbances, such as those brought on by temporary oscillations of the water called seiches, displace them and cause them to lose control of this regulation (Hingsamer et al., 2014).

All of the above describes our understanding of blooms so far based on bottom-up control of phytoplankton community structure, that is, the factors that regulate phytoplankton growth. Just as important are top-down factors, which describe the impact of zooplankton grazing and parasitism on phytoplankton survival. Cyanobacteria are typically less susceptible to grazing because of their low nutritional value, the filamentous growth of many nuisance genera, and toxin production (Haney, 1987). A comparison of data from 66 published studies covering 597 experiments also suggests that CyanoHAB’s often contain members that are less grazed than eukaryotic flagellates and Chlorophytes (Wilson et al., 2006). Morphology and toxicity often limit the edibility of Cyanobacteria for filter-feeding Cladocerans such as Daphnia (Ger et al., 2014; Paerl & Otten, 2013). Yet other zooplankton grazers, including copepods, are thought to be better equipped to bypass the challenge of morphology by either selecting for small colonies (Panosso et al., 2003) or breaking up the Cyanobacteria filaments into smaller components that are more available to themselves and other grazers (Bouvy et al., 2001).

Protozoan grazers such as ciliates are an important link in energy transfer, connecting their consumption of heterotrophic bacteria and algae to zooplankton. These grazers have relatively higher growth rates than other grazers, more comparable to HAB growth rates, making them good candidates for “top-down” control (Strom & Morello, 1998). A recent experiment by Combes et al. (2013) showed that the freshwater ciliate Nassula sp. can consume and grow on MC-producing Planktothrix for a prolonged period, but during short-term exposure it displayed slower growth and reproduction than when it consumed non-toxic Cyanobacteria. The role of freshwater fungi has been an overlooked aspect of aquatic food webs, where there is potential for parasitism against filamentous and toxin-producing Cyanobacteria. These potential parasites transfer energy throughout the microbial loop by releasing nutrients captured in inedible algae and providing energetic particles for grazers (Rasconi et al., 2014). Yet grazing experiments by Van Wichelen and colleagues (2010) suggest that aquatic amoeboids have a significantly greater impact than other grazers and parasites on reducing the biomass of specific strains of Microcystis. Viruses that specifically infect Cyanobacteria, called cyanophages, are being investigated as an important factor in top-down management of blooms (Manage et al., 1999), and cyanophages specifically infecting Microcystis aeruginosa have been isolated (Yoshida et al., 2006). There are fundamental unknowns regarding how future changes in climate will influence biological pressures and whether they will leverage or mitigate HAB development in aquatic environments (Wells et al., 2015).

It is clear that there are many prerequisites for the development and persistence of Cyanobacterial blooms. The shift to toxic species may be influenced by nutrient availability (“bottom-up influence”) from either exogenous or sediment loadings in the late summer. Furthermore, nuisance Cyanobacterial species have several physiological advantages over other phytoplankton in the late-summer water column in systems with a history of eutrophication. “Top-down” control through grazing or parasitism, or the lack thereof, affects the rate and degree of Cyanobacterial proliferation. Lastly, physical factors, including both system-specific environmental disturbances and more extreme weather in general, may also be at play but are not controllable by management efforts. In contrast to the original hypothesis, which focused primarily on exogenous P and phytoplankton dynamics, incorporating other factors may better explain the CyanoHAB phenomenon. A more synthetic approach to understanding CyanoHAB’s that integrates several of the dominant single-factor hypotheses in the literature has been suggested (Watson et al., 2008).

1.4 Concerns in the Great Lakes

The Laurentian Great Lakes constitute 84% of North America’s surface water and approximately 21% of the world’s freshwater supply while providing a habitat for native wildlife and fish populations. The lakes, which connect Canada and the United States, provide drinking water to >24 million North Americans, while smaller water bodies within the lakes’ watersheds provide recreational prospects for locals and tourists that support economic opportunities. Urbanization and agricultural and industrial development have led to increased P and N loadings within the Great Lakes and consequently created the potential for unwanted proliferation of nuisance phytoplankton in the summer. Early reports from the 1960’s indicate that eukaryotic Cladophora and the prokaryotic, non-toxin-producing Aphanizomenon flos-aquae initially dominated the HAB’s (Neil & Owen, 1964; IJC, 2013). Early remedial efforts consequently tried to control these algal populations by establishing total phosphorus (TP) reduction targets, as a means to reduce chlorophyll a levels, for both non-point and point sources of nutrient loading (IJC, 2013). Water quality assessments from 1970-2010 suggest that TP levels declined in the middle of Lakes Huron and Ontario, Georgian Bay and parts of Lake Erie, while levels in other regions such as the central basin of Lake Erie and Lake Superior have not declined significantly (IJC, 2013).

Within Lake Ontario, upgrades to municipal wastewater treatment plants (WWTP’s) allowed for better control of phosphorus loadings and were consequently the primary reason why TP reduction was successful (IJC, 2013). Despite these reductions, however, CyanoHAB’s still recur in some western and southern embayments of the Great Lakes, this time with a significant shift to potentially toxic species including the MC-producer Microcystis aeruginosa (IJC, 2013). In coastal regions of Lake Ontario, there has been a significant increase in the number of algal bloom reports, with the greatest increase in reports of CyanoHAB’s since 1994 (Winter, n.d.). Understanding the relationships between nutrient levels and other potential triggers of this shift has been challenging, as several temporal, geographical and historical factors need to be considered. For example, the introduction of the invasive zebra (Dreissena polymorpha) and quagga (Dreissena bugensis) mussels to the Great Lakes in the 1980’s is also possibly connected with phosphorus distribution in regions affected by CyanoHAB’s (IJC, 2013).

Within Lake Ontario, MC’s are the only cyanotoxins that have been detected and are therefore the cyanotoxins of concern (IJC, 2013). MC’s are cyclic peptides that occur as different chemical isoforms, with the –LR isoform most commonly associated with CyanoHAB’s in Lake Ontario (IJC, 2013). The concentrations of MC’s detected in the lower Great Lakes have been reported to exceed the WHO guideline. Surface scums have reached levels that could induce acute poisoning if ingested, with the highest summer values ranging from 60 μg/L to 400 μg/L (IJC, 2013). The worst CyanoHAB event to date in any of the lower Great Lakes occurred in 2001 in Hamilton Harbour, when surface scums contained up to 400 μg/L of Cyanobacterial toxins (Murphy et al., 2003).

The Great Lakes Water Quality Agreement defines Areas of Concern (AOC’s) as regions of the Great Lakes between Canada and the U.S. that have significantly low water quality due to CyanoHAB’s and/or other detrimental effects of anthropogenic activity on the lake systems. These areas have previously been, or currently are, targets of remedial action in North America. Under the 1987 Great Lakes Water Quality Agreement between Canada and the United States, 43 AOC’s were described, of which 12 were on the Canadian side and 5 were shared between both nations. AOC’s are characterized by the impairment of one or more of the 14 described beneficial uses associated with the chemical, biological or physical makeup of the water. Remedial Action Plans (RAP’s) and strategies have been described and implemented by government, communities and researchers working together with the aim of restoring these waterbodies to the point where they can be delisted as AOC’s. With system-specific targeted restoration and monitoring efforts in action, currently 7 Canadian AOC’s remain to be delisted in Lake Ontario including Port Hope Harbour, Bay of Quinte, Toronto and Region and Hamilton Harbour.

1.5 Study Site: Hamilton Harbour

Hamilton Harbour (also known as Burlington Bay) is a triangle-shaped bay located at the western tip of Lake Ontario (43°17'20"N, 79°50'2"W) and is naturally separated from the lake by a sand bar through which the Burlington canal passes (Figure 1.1). The bay itself is 8 km long (east-west) and 5 km wide (north-south), with the water column deepest (24 m) in the center. The Desjardins canal bridgeway connects the bay to a large, hyper-eutrophic marsh known as Cootes Paradise. On the southern shores, a deep-water port supports Canada’s iron and steel industries, while the upper reaches of the bay’s watershed have a mixture of rural and urban land uses. The Grindstone, Spencer and Red Hill creeks are the three major tributaries that drain the surrounding watershed.

Figure 1.1: Map of Hamilton Harbour and its surrounding watershed, taken from Dermott et al. (2007). Scale bar = 1800 m.

The Hamilton/Burlington interface is an economically important area because of its shipping center and maintains the largest concentration of heavy industry in Canada. Consequently, urban and industrial development in the surrounding watershed has resulted in excessive toxic contaminant inputs into the system. These include polychlorinated biphenyls, dichlorodiphenyltrichloroethane, polyaromatic hydrocarbons and toxic trace metals, detected in the bay mainly as a result of processed wastes from two large steel mills located in the watershed (Poulton, 1987). The exchange between the harbour and Lake Ontario dilutes harbour pollutant metals and organic contaminants and continuously supplies fresh oxygenated water. The natural inflow through the Burlington canal can exceed 11 m³ s⁻¹, creating significant mixing between fresh lake water and harbour water through cold upwellings, internal seiches and flow from Lake Ontario (Hamblin & He, 2003; Wu et al., 1996).

Although nutrient loadings were unregulated in the past, today the bay receives regulated loadings of treated wastewater from four Waste Water Treatment Plants (WWTP’s) in the natural watershed, as well as treated wastewater from regions outside of the natural watershed and urban runoff from the cities of Hamilton and Burlington (HH RAP, 2003). The discharge into the tributaries that drain into the bay comes from local treatment plants that process hundreds of thousands of m³ of sewage a day (HH RAP, 2003). Hamilton Harbour has been designated an AOC, and part of the RAP to delist it has been the active reduction of phosphorus inputs from WWTP’s to the harbour in order to lower the level of eutrophication (Dermott et al., 2007). Significant water quality improvements have been made since the RAP was initially implemented in the mid-80’s, with restricted phosphorus accompanied by less turbid waters and a resurgence of macrophytes in many areas of the bay (Charlton and Le Sage, 1996; Charlton, 2001). Yet despite remedial efforts, the bay still receives effluents from the WWTP’s and water quality remains below RAP targets. The bay also shifts to Cyanobacterial taxa, including some potential toxin producers, in August/September. According to microscope analysis, the late-summer water column is typically dominated by genera including Microcystis, Anabaena, Aphanizomenon, Lyngbya, Nostoc, Chroococcus and Planktothrix (Dermott et al., 2007; Jonlija, 2014). Furthermore, the decaying phytoplankton in the hypolimnion of Hamilton Harbour may represent 30-35% of the total water column oxygen demand during the summer (Dermott et al., 2007). Consequently, the bottom of the bay becomes almost entirely oxygen depleted every summer, which remains a persistent barrier to its long-term ecosystem recovery. Remediation targets have set dissolved oxygen (DO) to be >4 mg/L (HH RAP, 1992). However, during the water column stratification period from late June to September, hypolimnetic oxygen concentrations can reach as low as 0.5-1 mg/L. DO in the water column is replenished by photosynthesis, re-aeration and inflow from Lake Ontario (Dermott et al., 2007).

1.6 The Application of Molecular Tools to Understand CyanoHAB’s

1.6.1 Why Use Molecular Tools?

Relationships between Cyanobacterial succession, grazing and physico-chemical parameters of eutrophic systems have been extensively studied. Significantly less is known about the composition, functions and relationships of the heterotrophic assemblages that are associated with these toxin-producing species in the events leading up to and during CyanoHAB’s. The importance of filling this knowledge gap comes from the finding that as little as 1% of all environmental microorganisms have been cultured in the laboratory (Hugenholtz, 2002; Lutton et al., 2013). Overlooking these uncultivable members can dramatically limit our evaluation of the status of the system when attempting to explain the complex ecological phenomenon of CyanoHAB development and persistence. The isolation of nucleic acids from these environments and the application of high-throughput nucleic acid sequencing can shed light, in a culture-independent manner, on the microbiological diversity and succession associated with CyanoHAB’s, while simultaneously overcoming our inability to see and distinguish many important members under the microscope. In this way molecular analyses can elucidate any missing links between the physico-chemical condition of the system and apparent biological responses.

1.6.2 Terminal Restriction Fragment Length Polymorphism (T-RFLP)

Because nucleic acid sequencing can be costly for exploratory studies, DNA fingerprinting techniques can be performed on all samples prior to more expensive analyses. Samples with similar community fingerprints can subsequently be pooled for sequencing to determine species composition within a sample. Terminal restriction fragment length polymorphism (T-RFLP) was first described by Liu and colleagues (1997) and is one of the most commonly used molecular techniques for culture-independent profiling of microbial communities. In this technique, Polymerase Chain Reaction (PCR) amplifies a functionally conserved gene common to all bacteria or eukaryotes within each sample, typically the small subunit ribosomal RNA (SSU rRNA) gene for phylogenetic studies. The amplicons are cut with one or more restriction enzymes, which generate restriction fragments of varying sizes depending on the gene sequences. If the primers used in the amplification are labeled with fluorescent molecules, the terminal restriction fragments (T-RF’s) can be detected on a capillary DNA sequencer that detects the wavelength of the specific fluorophore(s). Finer resolution of taxa and diversity depends on an appropriate primer/enzyme combination. The dominant T-RF’s are assessed through electropherogram analysis, in which the X-axis represents the sizes of the fragments. Each T-RF is assumed to represent a different dominant "phylotype", a term interchangeable with operational taxonomic unit (OTU) and corresponding roughly to a prokaryotic or eukaryotic genus. The Y-axis represents fluorescence intensity, which is translated to fragment/phylotype abundance. The phylotype composition data can then be subjected to multivariate statistical analyses. If environmental parameters are also collected, multivariate statistics can be applied to find correlations between community structure and changes in environmental data (Ramette, 2007).

1.6.3 DNA Sequencing

Sequencing of single genes common to all known species can be used to describe the full range of both cultivable and uncultivable bacterial and eukaryotic plankton present within the environmental samples. While simple detection of organisms cannot reveal important interspecific ecological relationships, it is an important step as it is a better representation of diversity and provides a basis for determining co-occurrence relationships between Cyanobacteria and heterotrophs. Shotgun metagenomics is the non-targeted approach of sequencing all fragmented genetic material present in a sample. The shotgun approach simultaneously allows for both the identification of microbial community members and quantification of their functional genes.

Many nucleic acid sequencing technologies exist. The Roche 454 pyrosequencing chemistry generates longer reads (up to an average of ~400 bp) and can generate up to 400,000 reads per run at a relatively high cost per sample. The Illumina HiSeq chemistry, on the other hand, generates shorter reads but at a higher depth (millions of reads), and the cost per sample is relatively low. Metagenomics is a relatively new field in science. Although it is advancing our understanding of environmental microbes to a new level, it currently poses several challenges, particularly in bioinformatic post-processing, for which there is currently no standard pipeline (Teeling & Glöckner, 2012).

1.7 Research Objective and Hypotheses

The Cyanobacterial community in Hamilton Harbour throughout the summer represents a complex assemblage that includes heterotrophic organisms living in association with well-studied species. My overall objective was to explore and describe the complete planktonic community in Hamilton Harbour using up-to-date molecular tools. Altogether, the analyses can shed new light on the dynamics between competitors, grazers, parasites and symbionts associated with blooming species. The results will simultaneously provide a reference for future molecular studies on the Hamilton Harbour microbial community as well as similar ecosystems affected by recurrent CyanoHAB’s.

Cyanobacteria have historically been the dominant phytoplankton within the summer epilimnion of Hamilton Harbour. I hypothesized that a molecular approach would reveal:

1) Nitrogen, phosphorus and temperature would correlate most highly with total bacterial community structure throughout the summer
2) Novel components of the microbial community; diverse taxa would have different co-occurrence relationships with Cyanobacteria
3) The potential for Cyanobacterial nitrogen fixation would be the highest at the end of the summer sampling season

Chapter 2 Community Structure and Correlations to Physico-chemical Parameters

2.1 Introduction

Microbial community structure in lakes is dependent on the physical, chemical and biological factors in the environment. The same physico-chemical factors that are related to phytoplankton ecology can contribute to the proliferation of prokaryotic algae (Schindler, 1977). When light and temperature are high enough, phosphorus is the primary limiting nutrient and is linked to phytoplankton abundance (Schindler, 1977). Within the Great Lakes, excessive total phosphorus and nitrogen are thought to sustain the proliferation of CyanoHAB’s including MC producing genera (Davis et al., 2015).

RAP efforts for Hamilton Harbour have focused on lowering TP concentration in the water column to <17 μg/L, including both particulate and inorganic/soluble forms of phosphorus. A target TP concentration below 20 μg/L has been set as a supposedly low-risk level for bloom formation (Ontario Ministry of the Environment & Energy, 1994). Seasonal data collected from 2002-2004 at multiple inshore and offshore sampling stations in the harbour suggest that mean nutrient levels vary considerably spatially, seasonally and annually, but typically exceed the low-risk guidelines (Dermott et al., 2007). For example, TP and its bioavailable form, soluble reactive phosphorus (SRP), peak in different seasons and at different sites, and show variable mean distributions over the relatively short timespan of three sampling years. During the 2002-2004 sampling study in Hamilton Harbour, mean TP concentrations ranged from 25.5-30.8 μg/L in offshore water and from 35.8-37.9 μg/L inshore, suggesting that inshore levels are slightly higher.

The target for nitrogen concentration in the harbour is 20 μg/L un-ionized ammonia (HH RAP, 1992). Dissolved ammonia-nitrogen trends in the harbour are clearer than those of TP and tend to peak in spring and drop in the summer. In 2002-2004, the lowest values were recorded in mid-August to early September; however, average levels in the spring can still greatly exceed the RAP target. For example, in 2002-2004, seasonal means at all sampling stations were >600 μg/L un-ionized ammonia in the spring but dropped to <200 μg/L in the summer and stayed constant in the fall. There is no RAP target for dissolved nitrate/nitrite concentration in Hamilton Harbour, but reports indicate that nitrate/nitrite levels have also been historically high across the harbour and in all seasons in comparison to other monitored areas such as the Bay of Quinte in Lake Ontario. Nitrate/nitrite levels typically drop in the fall, and the highest concentrations are reportedly found near shore (Dermott et al., 2007).

In addition to the nutrient variability in the harbour, results from the biological component of the metadata published by Dermott et al. (2007) suggest that phytoplankton community succession patterns vary seasonally and annually. There are variable peaks of eukaryotic taxa including Diatomeae, Dinophyceae, Chlorophyta and Chrysophyceae. In general, primary productivity is dominated by larger phytoplankton (>20 μm) except in late spring to early summer, when picophytoplankton (<2 μm) dominate. Cyanobacteria become dominant in August/September, but their dominance can extend into the fall. The only consistently seasonal dominant eukaryotic group appears to be Chlorophyta, which emerge at some point in July before Cyanobacterial emergence. Genera of Cyanobacteria that have been found include Planktothrix, Pseudanabaena, Aphanizomenon, Dolichospermum, Limnothrix, Lyngbya and Microcystis, but composition within a sampling year varies depending on sampling site and depth (Jonlija, 2014). The filamentous Cyanobacterium Limnothrix has been found to be dominant in August, while Aphanizomenon is dominant in September/October at 1 m below the water surface closer to the shore (Jonlija, 2014). Offshore, Planktothrix has been found to be dominant from July to November. In the same study, Microcystis was the only Cyanobacterium common to all samples collected from June to November.

In this Chapter, I used T-RFLP to evaluate the total bacterial and eukaryotic community structure in Hamilton Harbour surface water at inshore and offshore sites, bi-weekly for two recent summers. I hypothesized that nitrogen, phosphorus and temperature would correlate most strongly with total bacterial community composition. The results from DNA fingerprinting and hierarchical clustering would further be used as a guide to pool samples for next generation sequencing.

2.2 Methods

2.2.1 Choice of Restriction Enzymes for T-RFLP

To capture the diversity of bacterial and eukaryotic phytoplankton within the Hamilton Harbour summer epilimnetic community, Dermott et al. (2007) was used as a primary reference. To capture the diversity of potential bacterial heterotrophs within the epilimnion, Newton et al. (2011) was used as a primary reference for library preparation. Commonly used universal primer sets were chosen for PCR amplification of the target 16S bacterial and 18S eukaryotic SSU rRNA genes, covering regions V1-V9 and V4-V6, respectively (Table 2.1). The 27F/1492R primers that target bacterial DNA generate an amplicon of ~1500 bp, while the 528F/R18R primers that target eukaryotic DNA generate an amplicon of ~500 bp. Ribosomal gene sequences belonging to these representative organisms were obtained from GenBank and amplified by PCR in silico by computationally annealing the primer sequences to each corresponding target gene and cutting out each of the generated amplicons. The in silico generated PCR amplicons were then forward digested with a series of common restriction enzymes using a Python script. The restriction enzymes chosen for subsequent laboratory work were those for which the enzyme/primer pair provided the highest fragment diversity in silico.

Table 2.1: Oligonucleotide primers that were used to amplify 16S and 18S rRNA target genes.

Primer ID Oligonucleotide Sequence

18S-Euk528F (Edgcomb et al., 2002) CCG CGG TAA TTC CAG CTC

18S-EukR18R (Hardy et al., 2011) CGT TAT CGG AAT TAA CCA GAC

16S-27F (Weisburg et al., 1991) AGA GTT TGA TYM TGG CTC AG

16S-1492R (Weisburg et al., 1991) GYI ACC TTG TTA CGA CTT
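The in silico PCR and forward digestion described in this section could be sketched as follows. This is a minimal illustration under stated assumptions, not the script used in this study: the function names, the toy template and the non-degenerate reverse primer are hypothetical, and only exact (IUPAC-aware) primer matches are considered. The forward primer is 16S-27F from Table 2.1; HaeIII (GG^CC) and MspI (C^CGG) are used only as examples of common restriction enzymes.

```python
# IUPAC degenerate-base codes that may occur in primer sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYMKSWN", "TGCAYRKMSWN")


def revcomp(seq):
    """Reverse complement, preserving IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def find_site(template, pattern, start=0):
    """First index >= start where the IUPAC pattern matches, or -1."""
    for i in range(start, len(template) - len(pattern) + 1):
        if all(template[i + j] in IUPAC[p] for j, p in enumerate(pattern)):
            return i
    return -1


def in_silico_pcr(template, fwd_primer, rev_primer):
    """Amplicon spanning both primers, or None if either fails to anneal."""
    start = find_site(template, fwd_primer)
    if start < 0:
        return None
    rev_site = revcomp(rev_primer)  # reverse primer as read on the forward strand
    end = find_site(template, rev_site, start + len(fwd_primer))
    if end < 0:
        return None
    return template[start:end + len(rev_site)]


def forward_trf_length(amplicon, site, cut_offset):
    """Length of the forward-labelled terminal fragment.

    cut_offset is the cut position within the recognition site;
    an amplicon without the site is returned at full length.
    """
    pos = find_site(amplicon, site)
    return len(amplicon) if pos < 0 else pos + cut_offset


fwd = "AGAGTTTGATYMTGGCTCAG"   # 16S-27F (Table 2.1), spaces removed
rev = "GGTTACCTTGTTACGACTT"    # hypothetical non-degenerate reverse primer
# Hypothetical toy template: flank + forward site + filler + HaeIII site + filler + reverse site + flank
template = ("TTT" + "AGAGTTTGATCCTGGCTCAG" + "ATATATAT" + "GGCC"
            + "TATATA" + revcomp(rev) + "AAA")

amplicon = in_silico_pcr(template, fwd, rev)
haeiii_trf = forward_trf_length(amplicon, "GGCC", 2)  # HaeIII cuts GG^CC
mspi_trf = forward_trf_length(amplicon, "CCGG", 1)    # MspI cuts C^CGG
print(len(amplicon), haeiii_trf, mspi_trf)
```

Running such a function over every reference sequence and every candidate enzyme, then counting the distinct fragment lengths per enzyme, would give one possible measure of the "fragment diversity" used to choose among enzyme/primer pairs.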

2.2.2 Sample Collection

Two geographically distant sites were sampled in 2014 and 2015 (Figure 2.1). The “inshore” site 9031 (43°16'50.0"N 79°52'32.0"W) has a shallow water column of approximately 12 m and is mostly influenced by effluents from the watershed on the west end of the harbour. The “offshore” site 1001 (43°17'17.0"N 79°50'23.0"W) is in the deepest and most central part of the harbour and has a water column that is approximately 24 m deep. The offshore site is also closer to the bridge separating the harbour from Lake Ontario. Water samples were collected bi-weekly from both sites from June 5th to August 27th in 2014 and from July 30th to September 24th in 2015, with volumes recorded in Table A.1 (Appendix A). Biomass from the epilimnion, approximately 1 m below the surface of the water column, was collected by either gravity (2014) or manual syringe filtration (2015) through 0.22 μm sterile PVDF membrane Sterivex filters (EMD Millipore, USA) to capture both free-living and particle-attached cells. Sterivex filters with collected biomass were stored the same day at -80°C until further analysis.

Figure 2.1: Diagram of Hamilton Harbour with the inshore site 9031 (I) and offshore site 1001 (O) highlighted in green.

2.2.3 DNA isolation

Biomass was removed from the filters as follows: 2 ml of sterile PCR water was added to each filter, which was sealed and vortexed for 5 minutes to remove the biomass from the filter portion of the Sterivex unit. The biomass was drawn out of the Sterivex with a sterile syringe and transferred to a sterile 1.5 ml microcentrifuge tube. This was not the most efficient way of processing the samples since some biomass was likely still left on the filter. However, to be consistent, all Sterivex were subsequently processed this way. The contents were centrifuged for 3 x 5 minutes at 6,000 g. The cell-free supernatant was discarded. DNA from the total biomass of the 2014 samples was extracted using a combination of the xanthogenate protocol (Tillett & Neilan, 2000) and the FastDNA SPIN Kit (MP BIO, USA), while DNA from the 2015 samples was isolated using only the FastDNA protocol, starting with the addition of CLS-Y from the FastDNA kit. For the 2014 samples, XS buffer was prepared with 1% potassium ethyl xanthogenate, 100 mM Tris-HCl [pH 8.0], 20 mM EDTA [pH 8.0], 1% SDS and 800 mM ammonium acetate. The XS buffer was heated for 5 minutes at 70°C and mixed until the SDS was in solution and the buffer was clear. Cells were then re-suspended in 200 μl of sterile dH2O and added to a Lysing Matrix A tube with 500 μl of XS buffer and 500 μl of CLS-Y solution. The tubes were bead-beaten for 3 minutes and centrifuged at 14,000 g for 10 minutes. The supernatant was transferred to a new tube and incubated for 60 minutes at 70°C. Samples were then vortexed for 15 seconds and immediately placed on ice for 30 minutes. Samples were then centrifuged at 14,000 rpm for 10 minutes. 750 μl of supernatant was transferred to a sterile 2.0 ml microcentrifuge tube and an equal volume of Binding Matrix was added. Tubes were then gently rotated for 5 minutes at room temperature. Samples were then loaded into binding columns provided with the FastDNA kit and centrifuged at 14,000 g for 1 minute. Columns were then washed three times with 500 μl of SEWS-M (with 100% ethanol added) in order to maximize the removal of environmental contaminants. Columns were dried with a final 2-minute spin, and residual ethanol was then removed by letting them dry completely in a biosafety hood for 2 hours. The DNA was eluted by adding 50 μl of warmed DES elution solution provided with the FastDNA kit, and columns were incubated for 5 minutes at 55°C in a heat block. A final centrifugation at 14,000 g for 1 minute brought the eluted DNA into a catch tube. Working volume dilutions were made so that stock DNA could be stored at -20°C.

2.2.4 Preparation of DNA amplicons for T-RFLP

The PCR reactions were carried out in a 25 μl volume with a final concentration of 0.5 μM each of the forward primer labeled with 5ʹ-fluorescein amidite dye (FAM from Life Technologies, Canada) and the reverse primer labeled with 5ʹ-hexachlorofluorescein dye (HEX from Life Technologies, Canada), and 2.5 units of HotStart Taq Plus DNA polymerase from the MasterMix kit (Qiagen, Canada). Approximately 2.5 ng of genomic DNA template was added to each reaction, with the rest of the volume made up with nanopure PCR water. Both no-template and positive PCR controls (DNA isolated from laboratory-grown Burkholderia phytofirmans OLGA R172) were always included in separate PCR runs. The reactions were carried out in a PTC-200 thermal cycler (MJ Research Inc.). PCR conditions varied depending on the primer sets used, with cycle details in Table 2.2. 4 μl of each PCR product was run on a 1.0% agarose gel to ensure successful amplification before downstream applications. PCR amplicons were purified using the GenElute PCR Clean-Up Kit (Sigma, Canada). Approximately 100 ng of purified amplicon DNA was incubated with 10 U of restriction enzyme (all purchased from Life Technologies) and the corresponding buffer in 20 μl reactions. The digested products were then sent to the Molecular Services at the Agricultural Food Laboratory (AFL) at the University of Guelph for a final cleanup and electropherogram analyses to quantify raw fragment fluorescence.

Table 2.2: PCR conditions for three target genes: I= Initiation, D= Denaturation, A= Annealing, E= Elongation, FE= Final Elongation steps. Steps D-E were repeated for 35 cycles.

Primers   I               D              A              E                FE
18S       95 °C, 15 min   95 °C, 45 s    60 °C, 45 s    72 °C, 1 min     72 °C, 10 min
16S       95 °C, 5 min    95 °C, 1 min   56 °C, 1 min   72 °C, 1.5 min   72 °C, 10 min

2.3 Statistical Analyses

2.3.1 Data Trimming and Diversity of Terminal Fragments

Either peak height or peak area can be used as a proxy for phylotype abundance. The peak height of each fluorescent T-RF was used in this study. Only fragments between 60 and 1200 base pairs (bp) in size and with fluorescent signal peaks greater than 100 units were included for further analysis, as recommended by AFL. In addition, any fragments greater than ~500 bp in the eukaryotic dataset were attributed to artifacts and/or non-specific binding of primers to the target template and were omitted (i.e., only 60-500 bp fragments were included in the analysis). The annual datasets were treated both separately and combined. The Microsoft Excel macro TreeFlap (Rees et al., 2004) was used to bin fragments, rounding fragment sizes to the nearest bp, and to generate a table of fragment sizes and their relative heights. Individual heights were converted to percentages of the sum of heights in order to normalize fluorescence. Fragment sizes that contributed less than 1% of total abundance were removed to account for any additional background noise, rare species and/or artifacts (Rees et al., 2004). The filtered data for both years are available in the Supplementary Information document. The data matrices were imported into the software R Studio v2.15.2 (R Core Team, 2015). Species richness was calculated by counting all fragments that contributed >1% of the relative fluorescence within a sample.
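The trimming steps described above can be illustrated with a short sketch. It is not the TreeFlap macro itself: the function names and the example peak list are hypothetical, half-up rounding to the nearest bp is an assumption, and percentages are not renormalized after the 1% cut because the text does not state whether that was done.

```python
from collections import defaultdict


def trim_profile(peaks, min_bp=60, max_bp=1200, min_height=100, min_percent=1.0):
    """Filter, bin and normalize one sample's T-RF peaks.

    peaks: iterable of (fragment size in bp, peak height).
    Returns {binned size in bp: percent of total retained peak height}.
    """
    binned = defaultdict(float)
    for size, height in peaks:
        # Keep only fragments in the size window with a signal above the threshold.
        if min_bp <= size <= max_bp and height > min_height:
            binned[int(size + 0.5)] += height  # bin to the nearest bp (half-up)
    total = sum(binned.values())
    if total == 0:
        return {}
    percent = {bp: 100.0 * h / total for bp, h in binned.items()}
    # Drop phylotypes below the relative-abundance cut-off.
    return {bp: p for bp, p in sorted(percent.items()) if p >= min_percent}


def richness(profile):
    """Phylotype richness: number of fragments retained after trimming."""
    return len(profile)


# Hypothetical electropherogram peaks for one sample: (size in bp, peak height).
example_peaks = [
    (59.2, 900),     # shorter than 60 bp: excluded
    (150.3, 20000),
    (187.8, 50000),  # bins together with the next peak at 188 bp
    (188.2, 10000),
    (240.1, 19880),
    (300.0, 90),     # signal not above 100 units: excluded
    (410.6, 120),    # passes the raw filters but is below 1% of the total
]
profile = trim_profile(example_peaks)
print(profile, richness(profile))
```

For the eukaryotic dataset, the same function could be called with `max_bp=500` to reproduce the narrower size window.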

2.3.2 Ordination of Phylotype and Environmental Data

Microbial Community Dissimilarities

Multivariate statistics were computed using the “vegan” package in R (Oksanen et al., 2011) unless otherwise stated. To facilitate the visualization of sample community structures, (dis)similarity matrices were created for both taxonomic markers using Bray-Curtis distance through the function “vegdist”. Bray-Curtis distance was chosen as it takes into account presence/absence and abundance of phylotypes without assuming linear relationships between sample distances and is therefore suitable for ecological diversity studies. Principal Coordinate Analysis (PCoA) was then chosen for unconstrained ordination to visualize sample clusters in 2-dimensional space because it preserves distances generated from any type of (dis)similarity measure. PCoA also generates eigenvalues, which summarize the variability captured on each axis and are useful for data interpretation. The PCoA ordination was performed using the function “pcoa” in the “ape” package (Didier et al., 2016) with the number of dimensions set to 2 and the Cailliez correction applied to correct for any negative eigenvalues generated. The correction adds a constant ‘c2’ to all values of the original distance matrix except the diagonal (Cailliez, 1983). To facilitate the separation of any sample clusters, the same distance matrices were then used for hierarchical clustering using Ward’s method with the function “hclust” (method “ward.D”). Dendrogram branch height was used as a measure of dissimilarity (Murtagh, 2014). Permutational Multivariate Analysis of Variance (PERMANOVA), using the function “adonis” with 999 permutations, was used to assess the significance of site, sampling date and year on community structure.
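To make the distance and permutation steps concrete, the sketch below computes Bray-Curtis dissimilarities and a one-way PERMANOVA pseudo-F statistic in plain Python. It is an illustration only and does not reproduce the “vegdist” or “adonis” implementations: the function names and example profiles are hypothetical, only a single grouping factor is handled, and the small floating-point tolerance in the permutation test is an assumption of this sketch.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = bray_curtis(samples[i], samples[j])
    return d


def pseudo_f(d, groups):
    """One-way PERMANOVA pseudo-F computed from squared distances."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(d, groups, permutations=999, seed=0):
    """Observed pseudo-F and permutation P value from shuffled group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(d, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        # Tolerance guards against rounding when a shuffle reproduces the original grouping.
        if pseudo_f(d, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical relative abundances (%) of three phylotypes in six samples.
samples = [
    [60, 30, 10], [55, 35, 10], [65, 25, 10],   # early-summer samples
    [10, 30, 60], [15, 25, 60], [10, 35, 55],   # late-summer samples
]
groups = ["early", "early", "early", "late", "late", "late"]
dist = distance_matrix(samples)
f_obs, p_value = permanova(dist, groups)
print(round(f_obs, 2), p_value)
```

With only three samples per group there are few distinct label arrangements, so the smallest attainable P value in a toy example like this is limited regardless of how well the groups separate.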

Microbial Communities Constrained by Environmental Parameters

Physico-chemical parameter measurements were collected by collaborating researchers at Environment Canada (Dr. Susan B. Watson’s group) and York University (Dr. Lewis Molot’s group). Sample-by-observation data matrices were created in Microsoft Excel. The seasonal and annual values were then z-score standardized using the equation z-score = (datapoint - mean)/standard deviation to remove any potential biases induced by magnitude differences between the different parameters and to remove parameter units, as recommended by Ramette (2007). The environmental datasets for separate years were then transformed into Euclidean distance matrices using the “vegdist” function (method = euc) so that Mantel tests could be performed (function: “mantel”) to investigate any significant linear correlations between environmental and microbial community dissimilarity matrices (permutations=999). The best subset of environmental variables that most highly correlated to community dissimilarities for each year’s dataset was then explored using the function “bioenv”. Distance-based redundancy analysis (db-RDA) was the constrained ordination technique chosen to find correlations between microbial community structure and the most important environmental variables determined from bioenv. This constrained ordination technique was chosen because it has the flexibility of using any type of ecological distance matrix and, importantly, determines what percentage of the variation in the biota is explained by these influential variables, which can simultaneously be visualized on a 2-D plot. The db-RDA analyses were carried out using the function “capscale”.
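The standardization and Mantel steps can be sketched in plain Python as follows. This is a hedged illustration rather than the vegan implementation: the function names and all example values are hypothetical, the sample standard deviation (n - 1 denominator) is an assumption, and the conventional (value - mean)/SD form of the z-score is used. The sign convention does not affect the resulting Euclidean distances.

```python
import math
import random
from itertools import combinations


def zscore_columns(rows):
    """Standardize each column (parameter) to mean 0 and sample SD 1."""
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled_cols.append([(v - mean) / sd for v in col])  # fails for constant columns
    return [list(r) for r in zip(*scaled_cols)]


def euclidean_matrix(rows):
    """Symmetric matrix of pairwise Euclidean distances between samples."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = math.dist(rows[i], rows[j])
    return d


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """Mantel r between two distance matrices and a one-sided permutation P value."""
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    observed = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # permute rows and columns of d2 together
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical environmental table: rows = samples, columns = temperature (°C), TP (μg/L).
env = [[18.0, 40.0], [20.0, 35.0], [24.0, 30.0], [26.0, 28.0]]
env_z = zscore_columns(env)
env_dist = euclidean_matrix(env_z)
# Hypothetical Bray-Curtis community dissimilarities for the same four samples.
community_dist = [
    [0.0, 0.2, 0.5, 0.6],
    [0.2, 0.0, 0.4, 0.5],
    [0.5, 0.4, 0.0, 0.1],
    [0.6, 0.5, 0.1, 0.0],
]
r, p = mantel(community_dist, env_dist)
print(round(r, 3), p)
```

Standardizing before computing Euclidean distances keeps a parameter measured on a large numerical scale from dominating the distance simply because of its units.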

2.4 Results

2.4.1 Terminal Fragment Diversity

Diversity and Richness of Bacterial and Eukaryotic variants

The libraries of predicted bacterial and eukaryotic freshwater organisms are listed in Tables A.2-A.3 (Appendix A), respectively. The 27F/1492R primers that target conserved regions of the bacterial 16S sequences generated ~1500 bp amplicon products before restriction digestion. After quality filtering the fragment data, the remaining terminal fragments ranged from 60 bp to 282 bp in size for the 2014 dataset and 69 bp to 246 bp for the 2015 dataset (Table A.4, Appendix A). Briefly, in 2014 the most dominant phylotypes were 150 bp, 188 bp and 240 bp and, of those, 188 bp and 240 bp were also most dominant in 2015. The fragments of ~188-190 bp likely represent Cyanobacterial phylotypes according to Table A.2. The average bacterial phylotype richness was 18 (+/- 5) fragments in 2014 and 15 (+/- 5) fragments in 2015 (Figure 2.2), suggesting higher average species richness in 2014 relative to 2015. Richness was also highest at the end of August in 2014 and decreased from July to September in 2015 at both sampling sites (Figure 2.2).

The Euk528F/EukR18R primer set that targets conserved regions of eukaryotic 18S sequences generated ~500 bp amplicon products on the agarose gel. The resulting terminal fragments in the T-RF data ranged from 72 bp to 759 bp for the 2014 dataset and from 90 bp to 755 bp for the 2015 dataset, as seen in Table A.5 (Appendix A). According to Table A.5, there were various eukaryotic phylotypes that were present at >20% in at least one sample in 2014, but only 226 bp, 229 bp and 233 bp were common to both years’ eukaryotic datasets. The average phylotype richness was 11 (+/- 6) in 2014 and 11 (+/- 2) in 2015 (Figure 2.3), indicating similar richness in both years. There was very high variability between sites and timepoints in the 2014 sampling season whereas there was little variability between samples in the 2015 sampling season, with diversity remaining more or less the same across sites and times in 2015.


[Figure 2.2 panels: “Bacterial Phylotype Richness Throughout the Summer at the Offshore Site 1001” and “Bacterial Phylotype Richness Throughout the Summer at the Inshore Site 9031”. Axes: Number of Phylotypes versus Sampling Week. Series: 2014, 2015.]

Figure 2.2: Bacterial phylotype richness throughout 2014 and 2015.

[Figure 2.3 panels: “Eukaryotic Phylotype Richness Throughout the Summer at the Inshore Site 9031” and “Eukaryotic Phylotype Richness Throughout the Summer at the Offshore Site 1001”. Axes: Number of Phylotypes versus Sampling Week (June-early to Sep-end). Series: 2014, 2015.]

Figure 2.3: Eukaryotic phylotype richness throughout 2014 and 2015.


2.4.2 Microbial Community Structure

PCoA ordination is generally considered successful when the two main axes cover at least 50% of the total variation between samples (Buttigieg & Ramette, 2014). This means that the resulting two axes after ordination summarized a significant portion of the variation between samples. Using this as a benchmark value, all PCoA ordinations were successful with the exception of the merged eukaryotic 2014/2015 dataset (Figure 2.9). In regard to interpreting individual PCoA plots, samples that cluster closely have more similar structures than samples that cluster further from each other.
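The 50% benchmark can be checked directly against the relative corrected eigenvalues reported in the captions of Figures 2.4-2.9. The short Python sketch below only adds the two reported values per ordination; the threshold constant and function name are illustrative.

```python
# Relative corrected eigenvalues (PCo1, PCo2) as reported in the figure captions.
axes = {
    "2014 bacterial (Figure 2.4)": (0.36, 0.24),
    "2015 bacterial (Figure 2.5)": (0.72, 0.11),
    "2014 eukaryotic (Figure 2.6)": (0.35, 0.21),
    "2015 eukaryotic (Figure 2.7)": (0.48, 0.22),
    "2014/2015 bacterial (Figure 2.8)": (0.46, 0.14),
    "2014/2015 eukaryotic (Figure 2.9)": (0.25, 0.18),
}
THRESHOLD = 0.50  # benchmark taken from Buttigieg & Ramette (2014), as cited in the text


def two_axis_coverage(pair):
    """Share of total variation captured by the first two axes together."""
    return round(sum(pair), 2)


below = [name for name, pair in axes.items() if two_axis_coverage(pair) < THRESHOLD]
for name, pair in axes.items():
    print(f"{name}: {two_axis_coverage(pair):.2f}")
print("Below benchmark:", below)
```

Only the merged eukaryotic dataset falls under the benchmark (0.25 + 0.18 = 0.43), which agrees with the exception noted in the text.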

In the 2014 bacterial dataset (Figure 2.4), the community showed clear differences between sample structures with generally greater temporal than spatial differences. In other words, samples collected from different sites on a given day clustered more closely to each other than they did to any other samples. In the 2015 bacterial community dataset (Figure 2.5), late July and early August clustered together while late August to late September clustered together on the first axis. Spatial homogeneity was also apparent in 2015; however, the July 30 and August 13 spatial samples were separated on the second axis, indicating some spatial heterogeneity in the early summer samples.

In the 2014 eukaryotic dataset, the June samples grouped with August samples (Figure 2.6). However, just like in the bacterial dataset, in many cases samples that were taken from different sites on a given day clustered closer to each other than to samples from other sampling weeks. In the 2015 eukaryotic dataset, none of the temporal or spatial samples showed any visual clustering patterns (Figure 2.7).

The PCoA ordination plots of both years combined show that the weekly bacterial communities in 2014 were very different from each other, whereas the 2015 communities were more similar to each other despite the divide between August 14 and August 27 (Figure 2.8). The plot further suggests that the late 2014 samples clustered with early 2015 samples, indicating that there was at least some overlap of community structure for both sampling years. For both eukaryotic annual datasets combined, in general the 2014 samples clustered closer to each other than to the 2015 samples (Figure 2.9), suggesting annual differences in community composition.


[Figure 2.4 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.4: a) PCoA results for the 2014 bacterial community T-RF dataset. Inshore (I) and offshore (O) samples are represented by different symbols. The sample site followed by number denotes sampling week; 1= June 5, 2= June 19, 3= July 2, 4= July 17, 5= July 31, 6= August 14, 7= August 27. PCo1 has a relative corrected eigenvalue of 0.36 and PCo2 has a relative corrected eigenvalue of 0.24. b) Dendrogram results for the same dataset.

[Figure 2.5 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.5: a) PCoA results for the 2015 bacterial community T-RF dataset. Inshore (I) and offshore (O) samples are represented by different symbols. The sample site followed by number denotes sampling week; 5= July 30, 6= August 13, 7= August 27, 8= September 10, 9= September 24. PCo1 has a relative corrected eigenvalue of 0.72 and PCo2 has a relative corrected eigenvalue of 0.11. b) Dendrogram results for the same dataset.

[Figure 2.6 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.6: a) PCoA results for the 2014 eukaryotic community T-RF dataset. Inshore (I) and offshore (O) samples are represented by different symbols. The sample site followed by number denotes sampling week; 1= June 5, 2= June 19, 3= July 2, 4= July 17, 5= July 31, 6= August 14, 7= August 27. PCo1 has a relative corrected eigenvalue of 0.35 and PCo2 has a relative corrected eigenvalue of 0.21. b) Dendrogram results for the same dataset.


[Figure 2.7 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.7: a) PCoA results for the 2015 eukaryotic community T-RF dataset. Inshore (I) and offshore (O) samples are represented by different symbols. The sample site followed by number denotes sampling week; 5= July 30, 6= August 13, 7= August 27, 8= September 10, 9= September 24. PCo1 has a relative corrected eigenvalue of 0.48 and PCo2 has a relative corrected eigenvalue of 0.22. b) Dendrogram results for the same dataset.

[Figure 2.8 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.8: a) PCoA results for the 2014 and 2015 bacterial community T-RF datasets. Inshore (I) and offshore (O) samples are represented by different symbols. Datapoints that are red represent the 2014 samples whereas datapoints that are blue represent the 2015 samples. The sample site followed by number denotes sampling week in 2014; 1= June 5, 2= June 19, 3= July 2, 4= July 17, 5= July 31, 6= August 14, 7= August 27. The sample site followed by number denotes sampling week in 2015; 5= July 30, 6= August 13, 7= August 27, 8= September 10, 9= September 24. PCo1 has a relative corrected eigenvalue of 0.46 and PCo2 has a relative corrected eigenvalue of 0.14. b) Dendrogram results for the same dataset. The 2014 samples are distinguished by -1 and the 2015 samples are distinguished by -2.


[Figure 2.9 panels: a) PCoA ordination, PCo1 versus PCo2; b) Cluster Dendrogram, Height versus distance, hclust (*, "ward.D").]

Figure 2.9: a) PCoA results for the 2014 and 2015 eukaryotic community T-RF datasets. Inshore (I) and offshore (O) samples are represented by different symbols. Datapoints that are red represent the 2014 samples whereas datapoints that are blue represent the 2015 samples. The sample site followed by number denotes sampling week in 2014; 1= June 5, 2= June 19, 3= July 2, 4= July 17, 5= July 31, 6= August 14, 7= August 27. The sample site followed by number denotes sampling week in 2015; 5= July 30, 6= August 13, 7= August 27, 8= September 10, 9= September 24. PCo1 has a relative corrected eigenvalue of 0.25 and PCo2 has a relative corrected eigenvalue of 0.18. b) Dendrogram results for the same dataset. The 2014 samples are distinguished by -1 and the 2015 samples are distinguished by -2.

The PERMANOVA results (Table 2.3) revealed significant bi-weekly changes within the bacterial communities but no spatial differences in either 2014 or 2015. In terms of the eukaryotic community, significant differences between the bi-weekly samples were only seen across the 2014 sampling season, again with no significant spatial differences. The bacterial and eukaryotic communities were also significantly different depending on the sampling year.

Table 2.3: Results of PERMANOVA tests for effect of site, week (date) and year on the microbial community structure (permutations=999) with correlation coefficient R2 assigned and p values: <0.05*, <0.01**, <0.001***.

Test                   F-Model   R2     p
(2014)
Date x Prokaryote      14.7      0.93   0.0009***
Site x Prokaryote      0.26      0.02   0.94
Date x Eukaryote       4.9       0.80   0.0009***
Site x Eukaryote       0.5       0.04   0.83
(2015)
Date x Prokaryote      11.1      0.90   0.002**
Site x Prokaryote      0.29      0.03   0.75
Date x Eukaryote       1.6       0.56   0.18
Site x Eukaryote       0.12      0.09   0.55
(2014 and 2015)
Year x Prokaryote      14.9      0.40   0.0009***
Year x Eukaryote       4.1       0.16   0.002**
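The F-Model and R2 columns of a PERMANOVA can be illustrated with a minimal one-factor sketch in Python. It follows the standard partitioning of squared distances into among-group and within-group sums of squares and is not the “adonis” implementation; the sample positions and week labels are hypothetical.

```python
import random
from itertools import combinations


def pseudo_f(d, groups):
    """Pseudo-F and R2 for one grouping factor from a distance matrix `d`."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    f = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f, ss_among / ss_total


def permanova(d, groups, permutations=999, seed=1):
    """Observed pseudo-F, R2 and permutation p value for one factor."""
    observed_f, r2 = pseudo_f(d, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(d, shuffled)[0] >= observed_f:
            hits += 1
    return observed_f, r2, (hits + 1) / (permutations + 1)


# Hypothetical one-dimensional community scores for nine samples in three weeks.
positions = [0, 1, 2, 10, 11, 12, 20, 21, 22]
weeks = ["w1"] * 3 + ["w2"] * 3 + ["w3"] * 3
d = [[abs(x - y) for y in positions] for x in positions]
f, r2, p = permanova(d, weeks)
print(f"F = {f:.1f}, R2 = {r2:.3f}, p = {p:.3f}")
```

A large pseudo-F with a high R2 indicates that most of the distance between samples lies among groups (here, weeks) rather than within them, which is the pattern reported for sampling date in Table 2.3.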

2.4.3 Environmental Correlates

The environmental parameters collected, along with annual averages for each sampling week and site, are listed in Appendix A.7. Measurements recorded by YSI equipment, including pH, dissolved oxygen and temperature, were lost for June 5th and August 14th 2014, and so correlations to the microbial community distance matrices were done separately for those parameters, excluding those dates.

The Mantel tests on correlations between the microbial community and environmental distance matrices revealed that, after 999 permutations, there were statistically significant correlations between the environmental data collected as a whole and community changes over both the 2014 and 2015 summer seasons, with the exception of the eukaryotic community in 2015 (Table 2.4). The separate Mantel tests for pH/DO/temp showed only weak, non-significant correlations with the 2014 communities, suggesting that these parameters likely did not contribute significantly to community structure.

A Mantel test was also conducted between prokaryotic and eukaryotic community composition and revealed that the bacterial community structure correlated to eukaryotic community structure only in 2014 (Table 2.4). This correlation was also statistically significant.

Table 2.4: Mantel statistic r based on Pearson correlations between standardized physico-chemical parameter distance matrices and phylotype distance matrices. pH/DO/temperature distance matrices for 2014 were assessed separately due to missing data. The Mantel test results between bacterial and eukaryotic distance matrices are also shown. Significance was based on permutations (perm=999) with p values (<0.05*).

Test                                    Mantel statistic r   p
Bacterial x Physico-chemical (2014)     0.45                 0.002*
Bacterial x Physico-chemical (2015)     0.77                 0.001*
Eukaryotic x Physico-chemical (2014)    0.41                 0.001*
Eukaryotic x Physico-chemical (2015)    -0.07                0.64

Bacterial x (pH/DO/temp) (2014)         0.13                 0.20
Eukaryotic x (pH/DO/temp) (2014)        0.012                0.43

Bacterial x Eukaryotic (2014)           0.32                 0.004*
Bacterial x Eukaryotic (2015)           0.16                 0.42


The result of the best subset argument (bioenv) after 999 permutations revealed that Cl⁻, SO₄²⁻ and chlorophyll a were most highly correlated to the bacterial community structure in 2014 based on linear correlations (r = 0.60) for this year (Table 2.5). The result of the best subset argument suggested that N-NH₃, NO₃⁻/NO₂⁻ and total particulate phosphorus (TPP) were most highly correlated to the bacterial community structure in 2015, with a very strong linear correlation (r = 0.89). The result of the 2014 best subset argument after 999 permutations revealed that NO₃⁻/NO₂⁻, Cl⁻ and P were most highly correlated to the eukaryotic community structure in 2014; the correlation was weak (r = 0.34) but still statistically significant.

Table 2.5: Bioenv results for subsets of environmental variables with the best correlation to community data. Codes for each parameter are listed in Appendix A (Table A.6). Those with significant correlations (*) were further investigated by db-RDA analysis.

Test                                    Most Influential Factors    Correlation r
Bacterial x Physico-chemical (2014)     Cl⁻, SO₄²⁻, Chla            0.60*
Eukaryote x Physico-chemical (2014)     NO₃⁻/NO₂⁻, Cl⁻, P           0.34*
Bacterial x pH/DO/temp (2014)           Temp                        0.19
Eukaryote x pH/DO/temp (2014)           Temp                        0.15

Bacterial x Physico-chemical (2015)     NO₃⁻/NO₂⁻, NH₃, TPP         0.89*
Eukaryote x Physico-chemical (2015)     F⁻, TP-D, Chla              0.26

Distance-based redundancy analysis (db-RDA) provided eigenvalues for constrained and unconstrained axes, which were added together to account for the total variation between samples, so that each environmental parameter’s influence could be calculated as a proportion of the total variation. Only correlates with significant best-subset results were assessed using db-RDA (Figures 2.10-2.12). The db-RDA results showed that only in 2015 did the most influential environmental variables collected explain more variation than any background/unconstrained variation (Figure 2.11). The July 30-August 14 samples clustered to the left of the center of the horizontal axis while the August 27-September 24 samples clustered on the right side, with NO₃⁻/NO₂⁻ correlating most strongly with structure out of the parameters measured, as noted by its closer alignment to the horizontal axis than TPP and N-NH₃.
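The proportion calculation described for db-RDA amounts to dividing each eigenvalue by the sum of all constrained and unconstrained eigenvalues. A minimal Python sketch follows; the eigenvalues are hypothetical and are not the values produced by “capscale” for this dataset.

```python
def variation_partition(constrained, unconstrained):
    """Express eigenvalues as proportions of total (constrained + unconstrained) variation."""
    total = sum(constrained) + sum(unconstrained)
    return {
        "constrained": sum(constrained) / total,
        "unconstrained": sum(unconstrained) / total,
        # Share of total variation captured by each constrained axis (CAP1, CAP2, ...).
        "per_axis": [value / total for value in constrained],
    }


# Hypothetical eigenvalues for two constrained and two unconstrained axes.
result = variation_partition(constrained=[3.0, 1.0], unconstrained=[4.0, 2.0])
print(f"constrained: {result['constrained']:.0%}")
print(f"unconstrained: {result['unconstrained']:.0%}")
print("per constrained axis:", [f"{share:.0%}" for share in result["per_axis"]])
```

Reading the constrained share against the unconstrained share is what allows the statement that the selected variables explained more variation than the background only in 2015.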

[Figure 2.10 plot: db-RDA ordination, CAP1 versus CAP2, showing samples and the constraining variables Chla, SO4.F and CL.F.]

Figure 2.10: The db-RDA results of the 2014 bacterial community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 44% and unconstrained ordination collectively explained 56% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 22% of the total variation and the second axis (CAP2) was also constrained but only captured 20% of the total variation.


[Figure 2.11 plot: db-RDA ordination, CAP1 versus CAP2, showing samples and the constraining variables TPP, NO2.NO3 and NH3.]

Figure 2.11: The db-RDA results of the 2015 bacterial community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 80% and unconstrained ordination collectively explained 20% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 73% of the total variation and the second axis (CAP2) was also constrained but only captured 0.06% of the total variation.

[Figure 2.12 plot: db-RDA ordination, CAP1 versus CAP2, showing samples and the constraining variables CL.F and NO3NO2.F.]

Figure 2.12: The db-RDA results of the 2014 eukaryotic community and environmental parameters collected that year. The analysis was restricted to include only the most influential environmental parameters as constraining variables. Constrained ordination collectively explained 40% and unconstrained ordination collectively explained 60% out of total variation between samples. The first axis (CAP1) by itself was constrained and captured 20% of the total variation and the second axis (CAP2) was also constrained but only captured 13% of the total variation.

2.5 Discussion

The bacterial community both years showed stronger correlations with environmental parameters than the eukaryotic component of the community. In 2014, the bacterial community correlated relatively strongly (0.60) to sulfate, chloride and chlorophyll a while the 2015 community was most influenced (0.89) by nitrate/nitrite, ammonia-nitrogen and total particulate phosphorus. Although the results only suggest correlations between the biological community composition and abiotic factors, chlorophyll a is not likely a driver but a reflection of the phytoplankton component of the community.

Constrained ordination of the 2015 community most closely supported the hypothesis that nitrogen and phosphorus would be most highly influencing community structure throughout the summer. Yet, these factors were not strongly correlated to the bacterial community in 2014. Furthermore, the results suggested that environmental factors in general collectively had weaker correlations to community composition in 2014 than in 2015. One reason for this could be that since sampling intervals were different between years, different factors could be shaping the community from June-September.

Nitrate/nitrite availability in particular was the most influential factor on bacterial community structure in the late summer in 2015. Although not shown, dissolved nitrate/nitrite and ammonia-nitrogen in Hamilton Harbour decreased throughout the summer and were lowest in late September. Inorganic nitrogen is also seen to decrease rapidly throughout the summer in other eutrophic systems of the Great Lakes such as Sandusky Bay in the Western basin of Lake Erie (Davis et al., 2015). Bioavailable nitrogen limitation has also been suggested to be a strong influence on changes in phytoplankton community composition in the late-summer water column in the Western basin of Lake Erie. Bioassay analyses have been used to show that in eutrophic portions of the Western basin of Lake Erie, N enrichment (without P) significantly increased chlorophyll a concentration and very low (<5 mmol/L) nitrate triggered a significant shift to the predominance of the diazotroph Anabaena (Chaffin et al., 2013). Therefore, nitrogen bioavailability could be a significant influence on the community composition approaching the months of August/September in Hamilton Harbour, even within the Cyanobacterial phylum.

Although pH was not determined to be the most influential factor in the community in this study, the average pH of the water column was much higher in 2015 than in 2014. Higher pH is associated with higher density blooms caused by Cyanobacteria such as Microcystis, as photosynthesis depletes inorganic carbon in the water column (Ha et al., 1999). The environmental pH can also influence the ionization state of ammonium/ammonia-nitrogen, and therefore pH has the potential to indirectly influence community structure. The ratio of ionized to un-ionized ammonia decreases with higher environmental pH, leaving less of the preferred form that is transported by algae (NH₄⁺) and a greater portion of NH₃ in the environment.
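The pH dependence of the NH₄⁺/NH₃ balance follows from the Henderson-Hasselbalch relationship. The Python sketch below assumes a pKa of about 9.25 for NH₄⁺ at 25 °C; the pKa shifts with temperature, so the values are indicative only and were not measured in this study.

```python
PKA_AMMONIUM_25C = 9.25  # assumed approximate pKa of NH4+ at 25 degrees C


def unionized_fraction(ph, pka=PKA_AMMONIUM_25C):
    """Fraction of total ammonia nitrogen present as un-ionized NH3 at a given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


for ph in (7.5, 8.5, 9.25):
    nh3 = unionized_fraction(ph)
    print(f"pH {ph}: NH3 {nh3:.1%}, NH4+ {1 - nh3:.1%}")
```

At a pH equal to the pKa the two forms are present in equal amounts, and each unit increase in pH below that point raises the NH₃:NH₄⁺ ratio roughly tenfold.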

Phosphorus was apparently less of an influence on community composition than nitrogen, but it is important to note that average TP in Hamilton Harbour throughout this sampling period was still above the RAP target of 20 μg/L at both sampling stations, especially in 2015. This suggests that phosphorus levels were still high enough for the summer water column to support the proliferation of a Cyanobacterial bloom.

Other interesting correlations found within the datasets include chloride and sulfate having the highest correlation to the bacterial community structure throughout the 2014 sampling season. Chloride and sulfate are anions that can be used as a proxy of monitoring water conductivity. Higher temperature also increases conductivity by solubilizing salts and minerals. Chloride in particular is used as an indicator for monitoring water salinity, which can contribute to shaping aquatic communities as native freshwater organisms can only tolerate a specific salinity range (Latala, 1991). Possible sources that introduce salinity in particular include heavy rain/flooding, evaporation and pollution from highways which introduce road salts that spike conductivity in the spring after winter seasons (Granato & Smith, 1999). Data on chloride collected from Dermott et al. (2007) indicate that Hamilton Harbour has previously exhibited chloride levels between low 70’s and above 100 mg/L in 2002-2003, which is within the range of values collected in 2014/2015. The chloride levels in Hamilton Harbour are much higher than other embayments including Bay of Quinte, which has reportedly detected levels of <20 mg/L chloride (Dermott et al., 2007). The trends in Hamilton Harbour show that the concentration of chloride gradually decreases throughout the summer, perhaps when flushing from Lake Ontario becomes more significant.

Dissolved organic carbon (DOC) provides a significant amount of carbon for heterotrophic uptake and assimilation for energy, derived from extracellular dissolved carbon released by phytoplankton (Laird et al., 1986) and/or from terrestrial allochthonous carbon loadings, especially to near-shore freshwater communities (Tallberg & Heiskanen, 1998). Although DOC was not assessed in this study, it likely also plays a significant role in shaping the heterotrophic component of the bacterial community. Furthermore, it may even account for a significant portion of the variation between samples that was not captured by constrained ordination, especially in 2014 when unconstrained ordination constituted >50% of the variation between samples.

Overall, the data suggest that the microbial community assemblage in the latter part of the summer is significantly shaped by the availability of nitrogen in the environment. Investigating the genes involved in nitrogen transport and assimilation pathways of the microbial community members could be the next avenue to explore. The major limitation of interpreting community structure based on T-RFLP is the difficulty in assigning taxonomy to phylotypes to characterize the community. Results from T-RFLP would need to be complemented by sequencing methods in order to identify the taxa (and subsequently their metabolic capabilities) that are associated with these major spatial or temporal changes in Hamilton Harbour’s microbial community. I used the results from this Chapter to decide which DNA samples could be pooled based on datapoint clustering, and I expand on taxa identification in the next chapter.

Chapter 3

Phylogenetic Composition of the 2014 and 2015 Communities

3.1 Introduction

Culture-independent techniques have shed light on the composition and diversity of microbial communities in aquatic systems. Venter and colleagues used whole-genome shotgun sequencing to detect an estimated 1800 genomic species, including 148 previously unknown bacterial phylotypes, in samples collected from the Sargasso Sea (Venter et al., 2004). Since then, other studies have been published on freshwater plankton communities in diverse habitats. These include monitoring the potential previous and future impact of climate change on the micro-eukaryotic community composition in the Arctic (Boopathi et al., 2015). Spatial variation of the bacterial communities of other unique environments, including high-altitude wetlands, has also been investigated (Zhang et al., 2014). Within North America, DNA barcoding is further becoming an integrated portion of research on monitoring and detection of invasive species in the five Laurentian Great Lakes (Trebitz et al., 2015).

Eiler & Bertilsson (2004) described the first full snapshot of the heterotrophic community associated with Cyanobacteria populations that cause problematic blooms in a freshwater system in Sweden. Besides Cyanobacteria, they determined that 4 different lakes had 158 common phylotypes of Proteobacteria, Bacteroidetes, Actinobacteria, and . More recent metagenomic snapshot studies have elucidated the bloom communities of other regions affected by eutrophication and pollution, including China and coastal regions of lakes bordering the United States such as Lake Erie (Steffen et al., 2012). Comparative analysis of these snapshots by Steffen et al. (2012) has further shown that the composition and collective community functions vary widely between Microcystis bloom communities from North America and Asia. The composition of Cyanobacterial orders within the phylum is apparently more conserved than the heterotrophic phyla associated with these Cyanobacteria, which can vary significantly between continents.

Significantly fewer studies have described total community changing over both temporal and spatial scales within the same water body. Because of the lack of wider-scale metagenomic studies, we are only beginning to understand the interactions that exist between Cyanobacteria, eukaryotes and heterotrophic bacteria throughout the events of Cyanobacterial proliferation.


Furthermore, a wider range of studies needs to be undertaken to expand the database of microorganisms associated with the characteristics of unique lake systems.

As discussed in Chapter 1, the information on water-column community composition in Hamilton Harbour has historically been dominated by research on phytoplankton and grazers based on microscope analyses. Zhan and colleagues have published two papers on the eukaryotic groups and rare organisms that could be detected by next-generation sequencing, using Hamilton Harbour as the model of a complex microbial community (Zhan et al., 2014; Zhan & MacIsaac, 2015). However, in their studies the focus was not on total community composition within the context of Cyanobacterial blooms in the harbour. There has yet to be a study covering a wider scope of spatial, weekly and annual changes of the total plankton community composition in this system.

The purpose of this Chapter was to expand on the sample structures from Chapter 2 by identifying the bacterial and eukaryotic microorganisms associated with major changes in community structure. The focus after taxa identification was to answer the question: what relationships exist between the microbial populations?

3.2 Methods

3.2.1 Preparation of samples for 454 pyrosequencing and shotgun metagenomics

The 2014 temporal and spatial genomic DNA samples were pooled based on the clustering information from the T-RFLP results in Chapter 2. DNA was pooled by combining highly similar samples in equimolar amounts to make a total of 100 ng. The samples were submitted for 454 tag-encoded pyrosequencing using the Roche 454 FLX titanium platform at MR. DNA (Molecular Research LP, Texas). Commonly used universal in-house primers were requested for sequencing. The barcoded primers 27F (5’-AGRGTTTGATCMTGGCTCAG-3’) and 530R (5’-CCGCNGCNGCTGGCAC-3’) were used to target the V1-V3 region of bacterial 16S SSU rRNA genes. The barcoded in-house general primers Euk528F (5’-CCGCGGTAATTCCAGCTC-3’) and EukR18R (5’-CGTTATCGGAATTAACCAGAC-3’) were used to target the V4-V6 region of eukaryotic 18S SSU rRNA genes. PCR conditions were the same for both primer sets and were as follows: 95°C for 5 minutes, 35 cycles of 95°C for 45 seconds followed by 60°C for 45 seconds and 72°C for 1 minute, and a final 72°C extension for 10 minutes.
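The primer sequences above contain IUPAC degenerate bases (R, M and N), so each primer represents a small pool of oligonucleotides. The following Python sketch, offered only as an illustration, expands a degenerate primer into its explicit variants using the standard IUPAC nucleotide codes; the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every explicit sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


variants_27f = expand_primer("AGRGTTTGATCMTGGCTCAG")
variants_530r = expand_primer("CCGCNGCNGCTGGCAC")
print(len(variants_27f), "variants of 27F:", variants_27f)
print(len(variants_530r), "variants of 530R")
```

With one R and one M, 27F expands to four sequences, while the two N positions in 530R give sixteen.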

All 10 of the 2015 temporal and spatial samples were standardized to 1.5 μg of DNA per sample and sent for full shotgun metagenome sequencing. The DNA was used for sample library preparation and sequenced with the Illumina HiSeq 2500 (2 x 150 bp) platform at Molecular Research LP, USA.

3.2.2 Bioinformatic Pre-processing

3.2.2.1 Pre-processing of 454-pyrosequencing data

Quantitative Insights Into Microbial Ecology (QIIME) software was used for processing and analyzing single gene pyrosequencing data (Caporaso et al., 2010). All bioinformatic processing of the raw sequences was carried out using MacQIIME (http://www.wernerlab.org/software/macqiime), which is a modified version of the original software QIIME (http://qiime.org/) v1.9.1 for Mac users. QIIME is written in Python, and its scripts were run through the Unix command shell interface. A mapping file was created with sample information, including names and assigned DNA barcode sequences, for the software to recognize samples. Sequence files were demultiplexed (i.e., barcodes removed) using the split_libraries.py command. Sequences with high error content were subsequently removed using the script denoiser.py with Denoiser (Reeder & Knight, 2010) and then grouped into operational taxonomic units (OTU's) of sequences sharing 97% nucleotide similarity using the script pick_otus.py and the UCLUST clustering method (Edgar, 2010). Chimeras sometimes form during PCR, whereby two or more biological sequences are joined together and falsely suggest that novel organisms are present in the samples. To avoid including chimeric sequences in further analyses, chimeras were detected and removed using UCHIME during OTU picking (Edgar et al., 2011).
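The idea of grouping sequences into OTU's at 97% similarity can be illustrated with a greedy centroid-based sketch in Python. This is a deliberate simplification and not the UCLUST algorithm: it assumes the reads are already aligned and of equal length, and it measures identity as the fraction of matching positions. The reads and helper names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes pre-aligned sequences of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each read to the first centroid it matches, or start a new OTU."""
    centroids = []
    otus = []  # one list of read indices per OTU
    for index, seq in enumerate(seqs):
        for k, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                otus[k].append(index)
                break
        else:
            centroids.append(seq)
            otus.append([index])
    return centroids, otus


SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def mutate(seq, positions):
    """Introduce a guaranteed mismatch at each listed position (for the demo only)."""
    chars = list(seq)
    for i in positions:
        chars[i] = SWAP[chars[i]]
    return "".join(chars)


base = "ACGT" * 25  # hypothetical 100-nt read
reads = [base, mutate(base, [0, 1]), mutate(base, range(5))]
centroids, otus = greedy_otus(reads)
print(otus)
```

The read with two mismatches (98% identity) joins the first OTU, while the read with five mismatches (95% identity) founds a second one.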

3.2.2.2 Pre-processing of Shotgun Metagenomic Data

All pre-processing of the shotgun datasets for taxonomy analysis was executed manually using the Unix command shell on a Linux computer. The raw .fastq sequence files with shotgun genome data were fed through the adaptive trimming tool Sickle (Joshi & Fass, 2011) with the option for Illumina datasets added. Sickle uses a “sliding window” algorithm in which it moves along each read and identifies and removes low-quality sequence from further analyses. After quality filtering, the software FLASH was downloaded and used to merge paired-end files together (Magoč & Salzberg, 2011). Since the FLASH defaults are set for sequence reads of 100 bp, the command was modified to account for longer reads. Merged files were then fed through the software Diamond (Buchfink et al., 2015). Diamond uses algorithms to identify and translate protein-coding regions of short shotgun DNA sequences into protein sequences and then aligns the translated sequences against the non-redundant database file obtained from NCBI (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz) using BLASTX alignment mode. The product of this is a .daa file that is compatible with taxonomy assignment and phylogenetic tree visualization in MEGAN software.
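The sliding-window idea behind quality trimming can be sketched in Python as follows. This is a simplified illustration and not Sickle's actual algorithm: it trims only the 3' end, and the window size, quality threshold and minimum length are hypothetical values chosen for the example. Quality strings are assumed to use the Phred+33 encoding.

```python
def phred33(quality_string):
    """Decode a Phred+33 quality string into integer quality scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_3prime(seq, qual, window=5, threshold=20, min_length=10):
    """Cut the read at the first window whose mean quality falls below the threshold.

    Returns (sequence, quality) after trimming, or None if the read becomes too short.
    """
    scores = phred33(qual)
    cut = len(seq)
    for start in range(len(scores) - window + 1):
        if sum(scores[start:start + window]) / window < threshold:
            cut = start
            break
    if cut < min_length:
        return None  # discard reads that are too short after trimming
    return seq[:cut], qual[:cut]


# Hypothetical 20-nt read: 15 high-quality bases ('I' = Q40) then 5 poor ones ('#' = Q2).
read = "ACGTACGTACGTACGTACGT"
quality = "I" * 15 + "#" * 5
trimmed = trim_3prime(read, quality)
print(trimmed)
```

Averaging over a window rather than testing single bases keeps one isolated low-quality call from truncating an otherwise good read.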

3.2.3 Taxonomy Classification

3.2.3.1 Tag-encoded Amplicon 454 Pyrosequencing data

OTU picking can be based on reference sequences, but this is often not recommended because environmental samples often contain DNA from organisms that are not in sequence databases and would therefore be removed from further processing. This problem is circumvented by the de novo clustering method: sequences of 97% similarity are clustered into “de novo” OTU’s, and unknown OTU’s are assigned to an “unclassified” group. A representative sequence for each OTU was chosen for taxonomy assignment using the script pick_rep_set.py. An OTU table containing sample/species and abundance information was made from the output using make_otu_table.py. Taxonomy was assigned to each representative sequence using assign_taxonomy.py with the built-in RDP classifier method, referencing the Greengenes database (latest version: 13_8_5) for bacterial sequences (DeSantis et al., 2006) and the SILVA SSU database (latest version: 111) for eukaryotic ribosomal sequences (Quast et al., 2013). The relative abundance values of OTU’s that were assigned taxonomy and collapsed at a designated taxon level were then plotted using Microsoft Excel. Taxon data was then imported into R, and correlation matrices between populations were generated with the “corrplot” function of the package corrplot (Wei & Simco, 2016).
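Collapsing an OTU table at a designated taxon level and converting it to relative abundance can be sketched in a few lines of Python. The function name `collapse_by_level`, the tuple representation of lineages and the toy counts in the usage below are hypothetical; the sketch does not reproduce the QIIME scripts.

```python
def collapse_by_level(otu_counts, otu_taxonomy, level):
    """Sum OTU counts per taxon at one level and convert to percent per sample.

    otu_counts: dict OTU id -> list of read counts (one entry per sample pool).
    otu_taxonomy: dict OTU id -> lineage tuple, e.g. ("Bacteria", "Cyanobacteria").
    level: index into the lineage (0 = domain, 1 = phylum, ...).
    OTUs whose lineage stops above `level` are grouped as "Unclassified".
    """
    n_samples = len(next(iter(otu_counts.values())))
    summed = {}
    for otu, counts in otu_counts.items():
        lineage = otu_taxonomy.get(otu, ())
        taxon = lineage[level] if len(lineage) > level else "Unclassified"
        row = summed.setdefault(taxon, [0] * n_samples)
        for i, count in enumerate(counts):
            row[i] += count
    totals = [sum(row[i] for row in summed.values()) for i in range(n_samples)]
    return {
        taxon: [100.0 * row[i] / totals[i] if totals[i] else 0.0
                for i in range(n_samples)]
        for taxon, row in summed.items()
    }
```

For example, with two OTU's assigned to Actinobacteria and one to Cyanobacteria, calling the function with `level=1` returns one percentage profile per phylum across the sample pools.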

3.2.3.2 Metagenome data and MEGAN

MEGAN was chosen to assess the shotgun taxonomy assignment results in detail, since it is robust at resolving taxa from all domains of life. The .daa files were passed through MEGAN v.6.4.9 Ultimate Edition (Huson et al., 2016). Since .daa files do not contain the names of the taxa associated with the reference sequences, an extra step (called “meganizing” the .daa file) was taken to add a mapping file that resolves this. The mapping file, “gi2taxFeb2015”, was last updated in February 2015 and is located on the MEGAN main webpage. Identifiers were mapped for references to NCBI taxa. The default settings were used for Lowest Common Ancestor taxonomy assignment, with the default threshold bit score (quality of alignment) and at least 0.1% of the assigned reads required to hit a taxon before that taxon takes a place in the tree; otherwise those reads move up the hierarchy until the reads assigned to the next node reach the threshold.
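The two ideas in this paragraph, placing a read on the lowest common ancestor of its hits and pushing weakly supported reads up the hierarchy, can be sketched as below. This is a simplified illustration and not MEGAN's implementation: lineages are represented as root-to-taxon tuples, the root is the empty tuple, and the function names are hypothetical.

```python
from collections import Counter


def lowest_common_ancestor(lineages):
    """Return the longest lineage prefix shared by all hits of one read."""
    prefix = tuple(lineages[0])
    for lineage in lineages[1:]:
        shared = 0
        for a, b in zip(prefix, lineage):
            if a != b:
                break
            shared += 1
        prefix = prefix[:shared]
    return prefix


def apply_min_support(read_assignments, min_support_fraction=0.001):
    """Move reads from weakly supported nodes to their parent nodes.

    read_assignments: list with one lineage tuple per assigned read.
    A node keeps its reads only if it holds at least
    min_support_fraction of all assigned reads (0.001 = 0.1%).
    """
    counts = Counter(read_assignments)
    threshold = min_support_fraction * len(read_assignments)
    max_depth = max((len(node) for node in counts), default=0)
    # Work from the deepest nodes upward so pushed reads are re-checked.
    for depth in range(max_depth, 0, -1):
        for node in [n for n in counts if len(n) == depth]:
            if counts[node] < threshold:
                moved = counts.pop(node)
                counts[node[:-1]] += moved
    return dict(counts)
```

Processing nodes from the deepest level upward means that reads pushed to a parent are counted toward that parent before it is tested against the threshold.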

3.2.4 Phylogenetic Tree Building of 454 Pyrosequencing data

Phylogenetic tree building for representative sequences of each OTU was used for 1) diversity analysis and 2) confirmation of assigned taxonomy from the QIIME reference databases. With respect to diversity analysis, the initial alignment for each tree was built using align_seqs.py with the default PyNAST alignment technique (Caporaso et al., 2010). The Greengenes alignment template (85% alignment threshold) was used for the bacterial dataset and SILVA (90% alignment threshold) was used for the eukaryotic sequences. Gaps and variable regions that may interfere with accurate tree building were removed using the filter_alignment.py command with the default Lane mask as the filtering method. Phylogenetic tree building for diversity analysis was carried out with the filtered alignment files through the make_phylogeny.py command.

Representative sequences for OTU’s that comprised >1% of the relative abundance within any of the samples were also manually searched using the Basic Local Alignment Search Tool (BLAST) against the National Center for Biotechnology Information (NCBI) database to find closest relatives and confirm the taxonomy assignments (Benson et al., 2005; Morgulis et al., 2008). Since the non-redundant database assigned the top 100 bacterial best-hits to “uncultured” environmental microorganisms, the 16S (Bacteria and Archaea) specific archive in NCBI was used to match reference sequences for the bacterial OTU’s. OTU’s that comprised >1% relative abundance of any of the samples, together with their best reference hits, were aligned using the multiple sequence aligner ClustalW (Thompson et al., 1994). The aligned sequences were then trimmed to equal lengths to avoid downstream computational issues with uneven sequence lengths. The statistical closeness of sequences to their nearest relatives was assessed using a Maximum Likelihood tree for each dataset, built with the phylogenetic analysis software MEGA7 (Kumar et al., 2016). Gap regions were removed throughout the process, and bootstrap (confidence) values out of 100 were assigned to each branch node.


3.2.5 Diversity

3.2.5.1 454-Pyrosequencing

Rarefaction curves to visualize community diversity were generated using the core_diversity_analyses.py script in QIIME, both with and without taking into account phylogenetic distances from the tree built previously. The PCoA ordination function in QIIME was used to visualize variation between sample pools. The distance matrices were formed using both Bray-Curtis and weighted Unifrac methods, the latter taking into account phylogenetic distances and abundances of related organisms based on single taxonomic markers. Data was exported and graphed in Microsoft Excel.
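For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute differences in OTU counts divided by the total count across both samples. A minimal Python sketch follows; the function names and the dictionary layout are illustrative assumptions, and the sketch is not the QIIME code path.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def pairwise_bray_curtis(samples):
    """Build a distance matrix as a dict keyed by (sample_a, sample_b)."""
    names = sorted(samples)
    return {(a, b): bray_curtis(samples[a], samples[b])
            for a in names for b in names}
```

The resulting matrix is what an ordination method such as PCoA takes as input. Unlike weighted Unifrac, this measure uses no phylogenetic information.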

3.2.5.2 Metagenomes

MG-RAST is an automated online interface that uses several bioinformatic tools to process and compare multiple metagenomic datasets from both private and public projects (Meyer et al., 2008). MG-RAST was mostly used for functional gene analysis (Chapter 4) but also for generating the 2015 species rarefaction curves. The raw sequence files were uploaded to MG-RAST v.3.0 and submitted for annotation. The initial parameters were set for filtering based on length and on the quality values assigned per base pair within each sequence read. Paired-end reads were then merged with the default settings of a minimum overlap of 8 bp and a maximum difference of 10%. Removal of artificial duplicate reads and dynamic trimming were selected, with the rest of the quality control left at the default settings. Rarefaction curves of the taxonomy assignment results for all genes were generated manually by selecting all project files in the “Metagenome Analysis” section of the MG-RAST homepage, with annotation against the M5RNA database. The M5RNA database contains collective information on non-redundant ribosomal genes from the Greengenes, SILVA and RDP databases. The annotation thresholds used were a maximum e-value of 10⁻²⁰, a minimum sequence identity of 60% and a minimum alignment length of 15 bp.
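The paired-end merging criterion described above (minimum overlap of 8 bp, maximum difference of 10%) can be illustrated with a small Python sketch. It is an assumption-laden toy and not the merging tool used by MG-RAST: it tries the longest overlap first, ignores quality scores, and expects uppercase sequences.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=8, max_diff=0.10):
    """Merge a read pair if the 3' end of `forward` overlaps the
    reverse complement of `reverse`.

    Overlaps are tried from longest to shortest; the first one with a
    mismatch fraction <= max_diff is accepted. Returns the merged
    sequence, or None if no acceptable overlap exists.
    """
    rc = reverse_complement(reverse)
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches / overlap <= max_diff:
            # Keep the forward read and append the non-overlapping remainder.
            return forward + rc[overlap:]
    return None
```

Pairs that cannot be merged return `None`, so a caller can decide whether to keep them as unmerged reads or discard them.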

MEGAN was used for generating PCoA ordination plots because it provided an extra tool to plot vectors that highlight the most influential taxa in the samples.

3.3 Results

3.3.1 Sequence Statistics

A total of 5 DNA pools were made from the 2014 samples for bacterial analysis and 6 pools were made for eukaryotic analysis (see Table B.1, Appendix B for sequence statistics). With the exception of the July 31st inshore eukaryotic sample, which may be an outlier, inshore and offshore samples collected on the same day were pooled together. A total of 34,606 raw sequences were obtained from the bacterial pools and 48,900 raw sequences from the eukaryotic pools. After quality filtering, 33 bacterial OTU’s were present at a relative abundance of at least 1% in any of the 5 bacterial pools, and 43 eukaryotic OTU’s were present at a relative abundance of at least 1% in any of the 6 eukaryotic pools. Up to 6% of the bacterial sequences were assigned to chloroplasts of the eukaryotic algae present in those samples; these were removed from further analyses of the bacterial community since eukaryotes were assessed separately. The sequences will be deposited into the GenBank database once published.
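The selection of dominant OTU's can be expressed as a short filter. The sketch below is illustrative only: the function name, the taxonomy strings and the order of operations (chloroplast-assigned OTU's removed before relative abundance is computed) are assumptions, since the text does not specify them.

```python
def dominant_otus(otu_counts, otu_taxonomy, min_fraction=0.01, exclude="Chloroplast"):
    """Return OTU ids reaching min_fraction relative abundance in any pool.

    otu_counts: dict OTU id -> list of read counts (one entry per pool).
    otu_taxonomy: dict OTU id -> taxonomy string.
    OTUs whose taxonomy string contains `exclude` are dropped first
    (assumed order of operations).
    """
    kept = {otu: counts for otu, counts in otu_counts.items()
            if exclude not in otu_taxonomy.get(otu, "")}
    if not kept:
        return []
    n_pools = len(next(iter(kept.values())))
    totals = [sum(counts[i] for counts in kept.values()) for i in range(n_pools)]
    dominant = []
    for otu, counts in kept.items():
        fractions = [counts[i] / totals[i] for i in range(n_pools) if totals[i] > 0]
        if any(f >= min_fraction for f in fractions):
            dominant.append(otu)
    return sorted(dominant)
```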

Sequence information for the shotgun metagenomes is in Tables B.2 and B.3 in Appendix B. It is worth mentioning that in some samples a large portion of the reads (>50%) either did not pass quality control or were not annotated. Accession IDs for public access to the MG-RAST annotation results are listed in Table B.2.

3.3.2 Bacterial Community (2014)

According to the rarefaction curves generated for the bacterial pools (Figure 3.1 a and b), whether phylogenetic distance between OTU’s was taken into account or not, the June samples had lower OTU richness than July/August. Specifically, the August pool had the steepest curve indicating that at any given number of sequences sampled from each pool, a larger number of new species would be recovered relative to the June samples. In neither case did the rarefaction curves reach their maximum, suggesting that deeper sequencing would uncover significantly more OTU diversity.
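The logic of a rarefaction curve, repeatedly subsampling a pool at increasing depths and counting the OTU's observed, can be sketched as follows. The function name, the number of iterations and the fixed random seed are illustrative assumptions; the sketch covers only the observed-OTU metric and not PD_whole_tree.

```python
import random


def rarefaction_curve(otu_counts, depths, n_iter=10, seed=0):
    """Mean number of observed OTUs at each subsampling depth.

    otu_counts: dict OTU id -> read count in one sample pool.
    depths: increasing list of sequences-per-sample values.
    Depths larger than the pool size are skipped.
    """
    rng = random.Random(seed)
    # Expand the counts into one entry per read so reads can be drawn
    # without replacement.
    reads = [otu for otu, count in otu_counts.items() for _ in range(count)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            break
        richness = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve.append((depth, sum(richness) / n_iter))
    return curve
```

A curve that is still rising at the largest depth, as reported here, indicates that additional sequences would continue to reveal new OTU's.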

Analysis of the PCoA ordination results (Figure 3.1 c and d) showed that the June pools clustered together on the positive end of the first axis, July 2nd sat at the center point, and July 17th-August 27th clustered together further away on the negative end of the first axis. Most of the variation between samples was captured on PCo1 (relative eigenvalue >50%); although samples were spread out along the second axis, that axis accounted for <20% of the variation between samples and was less important in explaining variation than the first axis. Lastly, it did not matter whether phylogenetic distance between species in a sample was taken into account: the Unifrac and Bray-Curtis distance matrices showed very similar clustering patterns on the first axis after PCoA ordination.

[Figure 3.1 panels: a) Observed_otus rarefaction curve and b) PD_whole_tree rarefaction curve (x-axis: sequences per sample, 10 to 5100; series: pools 1-5); c) PCoA (Bray-Curtis) and d) PCoA (Weighted Unifrac) (axis labels: PCo1, 53.21% and 68.88% variation explained; PCo2, 15.18% and 17.44% variation explained; series: pools 1-5).]

Figure 3.1: Diversity analysis of the bacterial community throughout the 2014 sampling season: 1=June 5, 2= June 19, 3=July 2, 4= July17-July31, 5=August 14-August 27. Top: Data is plotted as OTU rarefaction curves representing richness as a) observed OTU’s and b) PD_whole_tree which takes into account phylogenetic distance between 16S OTU’s. Bottom: PCoA ordination of 16S taxonomic dissimilarities between pooled samples with (c) Bray-Curtis distance matrix and (d) Weighted Unifrac, which accounts for phylogenetic distances between OTU’s.

The bacterial community over time at phylum level (Figure 3.2) showed that the most dominant phyla within the sample pools were Actinobacteria, Cyanobacteria, Proteobacteria (α and β classes) and Planctomycetes. There was a clear shift in the bacterial community from Actinobacteria sequences dominating in the early summer (up to ~84%) to Cyanobacteria emerging in early July and peaking in mid-late July (up to ~62%). Planctomycete presence and abundance was associated with Cyanobacteria: Planctomycetes emerged in early July and also peaked in mid-late July (~11%).

[Figure 3.2 plot: “Relative Abundance of Bacterial Taxa Throughout the Summer” (x-axis: sample pool, 1-5; y-axis: relative abundance, %; legend: Actinobacteria, Chloroflexi, Cyanobacteria, Planctomycetes, Verrucomicrobia, Proteobacteria_Other, Other).]

Figure 3.2: The bacterial assemblage throughout the summer of 2014. Sample identification is based on DNA pools: 1=June 5, 2= June 19, 3=July 2, 4= July17-July31, 5=August 14-August 27.

The correlation matrix between phyla over time (Figure 3.3) revealed significant positive correlations between the most dominant phyla Cyanobacteria and Planctomycetes (0.99), α-proteobacteria and Verrucomicrobia (0.84), and β-proteobacteria and Verrucomicrobia (0.96). Furthermore, significant negative correlations existed between Cyanobacteria and Actinobacteria (-0.95), Actinobacteria and Planctomycetes (-0.93), and Actinobacteria and γ-proteobacteria (-0.89).

[Figure 3.3 plot: correlation matrix of the 2014 bacterial higher-level taxa (colour scale from −1 to 1; p-values in the bottom matrix).]

Figure 3.3: Correlations between 2014 bacterial higher-level taxa over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.
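How a correlation coefficient and an accompanying p-value can be obtained for two taxa across a handful of sample pools is sketched below. The sketch uses Pearson's r with an exact two-sided permutation test; this choice is an assumption made to keep the example free of external dependencies and is not necessarily the test used to produce the figure. The function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either profile is constant


def exact_permutation_p(x, y):
    """Two-sided exact permutation p-value for the observed |r|.

    Enumerates every ordering of y, which is feasible only for small n
    (5 pools give 120 orderings).
    """
    observed = abs(pearson_r(x, y))
    orderings = list(permutations(y))
    extreme = sum(abs(pearson_r(x, p)) >= observed - 1e-12 for p in orderings)
    return extreme / len(orderings)
```

With five pools, the smallest p-value this exact test can return is 2/120, which gives a sense of the resolution available at this sample size.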

Phylogenetic tree analysis of nearest-best hits to the NCBI database suggested that there were at least 4 OTU’s with taxonomy assigned to fluminis (family C111), Candidatus , order and family ACK-M1 in the samples (Appendix B, Figure B.1) and each of them decreased in abundance throughout the summer (Appendix B, Table B.4). At least 3 different Planctomycetes of genera Pirellula, Rhodopirellula and Isosphaera were found in the samples according to reference sequences (Figure B.1). Planctomycetes were only detected at higher relative abundances in July and August (Table B.4). According to Figure B.1, Lyngbya aestuarii, Planktothrix agardhii, Microcystis aeruginosa, Cyanobium gracile, Calothrix and Synechococcus were the dominant Cyanobacteria present in the harbour. Table B.4 confirms that the abundance and diversity of Cyanobacteria increased throughout the summer. Synechococcus dominated July samples and the other Cyanobacterial genera only emerged in August. Proteobacteria appeared to be more diverse at lower taxonomic levels than other phyla (Figure B.1). The major classes of Proteobacteria included α-Proteobacteria and β-Proteobacteria, which made up a dominant proportion of the total relative abundance, but individual lower-level taxa remained at low abundance in all of the samples (Table B.4). The main Proteobacterial species were most closely related to Catellibacterium aquatile, Achromobacter xylosoxidans and Limnohabitans planktonicus (Figure B.1). Only one member of Verrucomicrobiaceae was identified in the samples and its closest relative according to NCBI was Brevifollis gellanilyticus.

3.3.3 Eukaryotic Community (2014)

As shown in Figure 3.4 (a and b), the earliest summer pool showed lower OTU richness than the latest summer pool; however, no clear pattern of change over time was evident. The PCoA ordination results in Figure 3.4 (c and d) indicated that most of the variation between samples (>50%) was captured only when both PCo1 and PCo2 were accounted for, and each axis explained significantly less than what was explained for the bacterial dataset. When distance matrices were formed using the Unifrac method, which takes phylogenetic distance into account (Figure 3.4 d), a clearer temporal gradient of succession emerged, with June pools clustering on the positive end of the axis, July 2nd at the central mark and July 17th-August 27th clustering together on the negative end of the axis. This was not the case for distance matrices formed using the Bray-Curtis method, which did not show the same pattern (Figure 3.4 c).

[Figure 3.4 panels: a) Observed_otus rarefaction curve and b) PD_whole_tree rarefaction curve (x-axis: sequences per sample, 10 to 6200; series: pools 1-6); c) and d) PCoA (Bray-Curtis) and PCoA (Weighted Unifrac) (axis labels: PCo1, 29.44% (c) and 39.72% (d) variation explained; PCo2, 25.96% and 22.80% variation explained; series: pools 1-6).]

Figure 3.4: Diversity analysis of the eukaryotic community throughout the 2014 sampling season: 1=June 5, 2= June 19, 3= July 2-July 17, 4= July 31(inshore), 5=July 31(offshore)- August 14, 6= August 27. Top: Data is plotted as OTU rarefaction curves representing richness as a) observed OTU’s and b) PD_whole_tree taking into account phylogenetic distance between 18S OTU’s. Bottom: PCoA ordination of 18S taxonomic dissimilarities between pooled samples with (c) Bray-Curtis distance matrix and (d) Weighted Unifrac, which accounts for phylogenetic distances between OTU’s.

The 18S primers targeted both single and multicellular eukaryotic organisms within each of the DNA pools. The dominant eukaryotic groups designated by SILVA included Alveolata, Rhizaria, Stramenopiles, Holozoa, Cryptomonadales, Metazoa, Chloroplastida, and Fungi (Figure 3.5). Holozoa typically include Metazoans (excluding fungi) but were classified as a separate group according to taxonomy assignment by QIIME against the SILVA database. Metazoa according to SILVA describe the rotifer/copepod/mussel group; their sequence abundance was highest in June, decreased over time and then re-emerged in late August (Figure 3.5). The proportion of most organisms went down drastically between July 31st and August 14th with the emergence of Holozoa followed by Alveolata sequence dominance. Metazoa re-emerged in late August along with Rhizaria, a relatively unknown group of unicellular eukaryotes that was not present in the other samples. Chloroplastida (green algae group) peaked (30%) in early-mid July with the onset of Cyanobacteria proliferation (Figure 3.2). Overall, patterns of eukaryotic succession were unclear and exhibited high variability between pools.

[Figure 3.5 plot: “Relative Abundance of Eukaryotic Taxa Throughout the Summer” (x-axis: sample pool, 1-6; y-axis: relative abundance, %; legend: Stramenopiles, Rhizaria, Alveolata, Metazoa, Holozoa, Fungi, Cryptomonadales, Chloroplastida, Eukaryota_Other).]

Figure 3.5: The eukaryotic assemblage throughout the summer of 2014. Sample identification is based on DNA pools: 1= June 5, 2= June 19, 3= July 2-July 17, 4= July 31(inshore), 5= July 31(offshore)- August 14, 6= August 27.

Dominant Eukaryotic Phytoplankton

An Alveolate related to Scrippsiella was dominant from late July to early August, with up to 90% of the sequence reads within the pool (Table B.5, Appendix B). Its taxonomy was assigned to Scrippsiella according to SILVA, but a BLAST search against the NCBI non-redundant database indicates that it is the algal dinoflagellate Peridiniopsis polonicum (Figure B.2, Appendix B). A wide diversity of green algae including Scenedesmus, Oocystis, Chlorella, Colemanosphaera, Carteria, Pseudopediastrum and Coelastrum were confirmed by comparison to the NCBI reference sequences (Figure B.2). This group was at its peak from mid-June to July, and its diversity and relative abundance decreased substantially in early August (Table B.5). The autotrophic Cryptomonas curvata was the only member of the group Cryptomonadales detected in the 2014 samples (Figure B.2). At least 3 OTU’s were assigned taxonomy to the group Stramenopiles, with the closest relative of one member being the phytoplanktonic diatom Stephanodiscus parvus.

Dominant Heterotrophic Organisms

Within the group Alveolata, all 3 OTU’s that were assigned to the heterotroph Oxyrrhis through Greengenes database are likely the same organism according to alignment against the closest sequences in the NCBI database (Figure B.2). Oxyrrhis showed its highest relative abundance (26%) in the mid-late June pool (Table B.5). An OTU assigned to Thecofilosea (phylum Rhizaria) according to SILVA dominated the late August pool (Table B.5) and was most closely related to an Unknown Eukaryote in the NCBI database (Figure B.2).

At least 4 different fungal OTU’s were identified, of which at least one was confirmed by NCBI to belong to (Figure B.2). Fungi were present at highest relative abundance in June-July (Table B.5). A heterotrophic member of Stramenopiles, identified as a fungal-like organism related to Peronosporomycetes, was at highest abundance in mid-June (Table B.5). Lastly, an unknown Holozoa belonging to the family Ichthyophonae made up as much as 59% of the relative abundance of the outlier July 31st pool (Table B.5).

A wide diversity of Metazoans was present in Hamilton Harbour over the 2014 summer (Figure B.2). Among them were at least 6 different species of rotifers and 4 different species of copepods, as well as Dreissena rostriformis bugensis (‘quagga mussel’). The most abundant Metazoans were the copepods Arctodiaptomus cf. stephanidesi and Acanthocyclops bicuspidatus in early June and the rotifer Synchaeta pectinata in mid-June to July.

The correlation matrix between dominant eukaryotic phyla (Figure 3.6) suggested that the strongest positive correlation between eukaryotic phyla throughout the summer was between Fungi and Chloroplastida (0.83). Most of the negative correlations between phyla were relatively weak.

[Figure 3.6 plot: correlation matrix of the 2014 eukaryotic phyla (colour scale from −1 to 1; p-values in the bottom matrix).]

Figure 3.6: Correlations between 2014 eukaryotic phyla over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.

3.3.4 Metagenome Taxonomy (2015)

The results from taxonomy assignment at domain level according to MEGAN (Table 3.1) suggested that most of the sequences within the metagenome datasets were bacterial.

Table 3.1: Taxonomy assignment of the 2015 metagenomes at domain level according to MEGAN. The values indicate relative percentages (%) of each domain within each sample. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).

Sample   Bacteria   Archaea   Eukaryota
J30_I    84         0         16
J30_O    93         0         7
A13_I    92         0         8
A13_O    96         0         3
A27_I    95         0         5
A27_O    99         0         1
S10_I    84         0         16
S10_O    81         0         18
S24_I    93         0         7
S24_O    96         0         4

The most abundant (>1% total community) higher-level taxa within the bacterial component of the metagenomes included Proteobacteria, Cyanobacteria, Bacteroidetes, Actinobacteria, Planctomycetes and Verrucomicrobia (Figure 3.7), while the major eukaryotic groups included Arthropoda, Annelida, Chordata and Bacillariophyta. The ratio of bacterial to eukaryotic sequences was high across all samples. When examining the communities at higher taxonomic level, the relative proportion of Cyanobacteria increased gradually over time, with less variability between different sites sampled on the same day than between sampling weeks. The relative proportion of Actinobacteria was highest in July and decreased most clearly throughout the summer.

[Figure 3.7 plot: “Relative Abundance of Dominant Taxa Throughout the Summer” (x-axis: sample, J30_I, J30_O, A13_I, A13_O, A27_I, A27_O, S10_I, S10_O, S24_I, S24_O; y-axis: relative abundance, %; legend: Bacillariophyta, Annelida, Arthropoda, Chordata, Gammaproteobacteria, Proteobacteria other, Betaproteobacteria, Alphaproteobacteria, Cyanobacteria, Actinobacteria, Verrucomicrobia, Planctomycetes, Bacteroidetes).]

Figure 3.7: Higher-level taxonomic assignment of the 2015 bacterial community over space and time in Hamilton Harbour. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).

Analysis of correlations between these higher-level taxa (Figure 3.8) in 2015 revealed that mostly positive or neutral correlations existed between eukaryotic groups. Either strong positive or weak relationships existed between heterotrophic bacterial groups. Most of the strong negative correlations existed between Cyanobacteria and other groups. For example, Cyanobacteria were negatively correlated with α-Proteobacteria (-0.79) and β-Proteobacteria (-0.66); the correlation with the β class was slightly weaker but still significant. Cyanobacteria and Actinobacteria (-0.79) and Cyanobacteria and Arthropoda (-0.71) also showed strong negative correlations. The correlation between α-Proteobacteria and Actinobacteria was positive but relatively weak (0.60).

[Figure 3.8 plot: correlation matrix of the most abundant 2015 bacterial and eukaryotic taxa (colour scale from −1 to 1; p-values in the bottom matrix).]

Figure 3.8: Correlations between most abundant bacterial and eukaryotic taxa (>1% total community) over time with statistical significance (p-values for each correlation) in the bottom matrix. p-values of less than 0.05 indicate statistically significant correlations between the corresponding populations.


3.3.5 Diversity

Species richness was analyzed via the rarefaction curves in Figure 3.9. The results suggested that July 30th and August 14th had similar richness, while late August and September had relatively lower overall diversity. This was the case for both inshore and offshore communities. No leveling-off was apparent within the metagenomes, indicating that sequencing depth was insufficient to fully cover diversity across samples.

[Figure 3.9 plot: “Species Rarefaction Distribution for the 2015 Metagenomes” (x-axis: sequences, 0 to 12,000,000; y-axis: number of species, 0 to 8000; series: J30_I, J30_O, A13_I, A13_O, A27_I, A27_O, S10_I, S10_O, S24_I, S24_O).]

Figure 3.9: Rarefaction curves for the 2015 metagenomes generated by MG-RAST. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).

PCoA ordination of the metagenome communities at genus level (Figure 3.10) confirmed the T-RFLP results from Chapter 2, in that early summer samples (J30-A13) clustered closer to each other on the primary axis than to the late summer (A27-S24) samples. Inshore site samples at the beginning of the summer were separated on the second axis, indicating some spatial differences in community structure, but after August 27th spatial homogeneity was apparent once again. September 10th deviated from the late summer samples on the second axis, likely due to the presence of a high amount of eukaryotic sequences during that week (Table 3.1). Biplot vectors added to the PCoA ordination suggest that the most influential genera on differences in community structure in the early summer samples were a member of Actinobacteria (miscellaneous), unknown and member of the ac1 cluster, as indicated by those three arrows pointing more horizontally than the other arrows. Lyngbya and Limnoraphis were highest in late August and late September and were also associated with separating early summer samples from the late summer samples. With regard to the vertical axis, which only accounted for 25.2% of the variation, Microcystis separated the early summer offshore samples from the early summer inshore samples and was present at a higher relative percentage in offshore metagenomes (Table B.6, Appendix B).

Figure 3.10: PCoA ordination of the 2015 metagenomes. Sample identification is represented by month (J=July; A=August; S=September) and sampling day. Inshore samples are represented by and offshore samples are represented by n. Biplot vectors in green indicate which taxa have the largest influence in PCoA plot.


3.6 Discussion

Microbial community diversity varied within the season and annually within the Hamilton Harbour epilimnion. In 2014, August samples had relatively higher overall bacterial diversity than June samples. In 2015, Cyanobacterial dominance in September was associated with lower overall bacterial diversity relative to the July samples. These apparently conflicting results may be reconciled by the fact that, to some extent, the proliferation of Cyanobacteria such as Microcystis can increase the concentration of dissolved oxygen in the water through photosynthesis. This can create a more optimal environment for aerobic microorganisms, while at the same time the phytoplankton biomass provides exudates for other bacteria to grow on (Li et al., 2015). Xie et al. (2016) used metagenomics to link the presence of Microcystis in Lake Taihu, China to shaping the heterotrophic bacterial community composition, with a basis for mutualistic exchange of carbon, energy and vitamins. Yet Li and colleagues further suggested that if growth conditions for Cyanobacteria become optimal and they reach their peak, a less optimal environment could result due to the downstream effects of blooms such as toxic microcystin concentrations and competition for inorganic nutrients.

The main found in this study have all been reported as co-occurring heterotrophs with Cyanobacteria in other eutrophic freshwater studies (Mou et al., 2013) and are considered common freshwater bacteria (Newton et al., 2011). Sampling in June 2014 revealed that the dominant heterotrophic group before the late summer was Actinobacteria, and a significant negative correlation existed in both years between the relative populations of Actinobacteria and Cyanobacteria over time. This agrees with Ghai et al. (2014), who reported a significant inverse relationship between Actinobacteria and Cyanobacteria and a positive correlation between Cyanobacteria and late-summer temperatures in the Amadorio Reservoir, Spain. They further suggested that Actinobacteria normally predominate in early summer water and may be outcompeted for nutrients by Cyanobacteria when temperature becomes optimal for Cyanobacterial proliferation.

The Actinobacterial taxa in this study were only assigned taxonomy to family-level Actinomycetales (also shown in the data as ACK-M1/acI clade) and Acidimicrobiales (also shown in the data as C111 clade). Many members of these clades are non-culturable, and so the ecological roles of the members that constitute these clades remain unknown. However, recent deep metagenomic sequencing has revealed how important they are in the natural biogeochemical cycling of nutrients in relatively pristine freshwater systems (Ghai et al., 2014). Furthermore, analyses of their genomes and revealed that acI-family members have genes to break down Cyanobacterial compounds. Perhaps during Cyanobacterial proliferation, Actinobacteria are inhibited by the toxicity and competitive advantages of Cyanobacteria but repopulate when blooms die off and actually benefit from the carbon and nutrients they can obtain from Cyanobacterial remnants. There is some evidence for this from temporal studies of CyanoHAB’s in Lake Taihu, where sequences belonging to acI and acIV were highest in spring, decreased in summer and increased again in the fall. Li et al. (2015) postulate that these organisms are able to take advantage of the large amount of cyanophycin provided by excessive Cyanobacteria to acquire energy and carbon.

Another finding to highlight is the positive correlation between Cyanobacteria and Planctomycetes seen in 2014. Although the same relationship was not found in 2015, Planctomycetes were present in all the 2015 samples. The literature suggests that there is a potential association between this group and Cyanobacteria that cause problematic blooms. Planctomycetes are known to be associated with macroalgae (Lage & Bondoso, 2014). The dominant Planctomycete found in both years was Rhodopirellula, of which the species Rhodopirellula baltica has been sequenced after isolation from marine particles. Genome analysis has revealed an ability of this organism to derive energy from the degradation of sulfated polysaccharides of algal origin (Glöckner et al., 2003; Schlesner et al., 2004). Steven et al. (2011) used 454-pyrosequencing to reveal that Rhodopirellula are more enriched in algal mats in lake littoral zones than other genera. They suggest that this genus was benefitting from the compounds that may be released from these aggregates. Cai et al. (2013) provide further evidence for this postulation, as they also found that Planctomycetes were the dominating phylum within the microbial community assemblage of carpet-like mucilaginous attached to Microcystis bloom aggregates in Lake Taihu. Altogether, the neutral to positive co-occurrence relationship between Planctomycetes and Cyanobacteria in both years may be ecologically relevant.

The dominant Proteobacteria in the datasets were highly diverse and differed in composition between years. The dominant genus that was confirmed to be present in both years and associated with the community before and during Cyanobacterial proliferation was the freshwater bacterium Limnohabitans. β-Proteobacteria in general are known for high efficiency in degrading dissolved organic matter (Niemi et al., 2009) and have been positively correlated to low-molecular-weight algal-derived substrates (Newton et al., 2011). The significance of Limnohabitans in relation to toxic CyanoHABs is not known, yet this group likely plays an important role in the freshwater food chain. Kasalický et al. (2013) describe Limnohabitans lineages as having a prominent role in freshwater bacterioplankton communities due to their high rates of substrate uptake and growth, and a key role in carbon flow from algal-derived substrates to the plankton grazer food chain through bacterivory.

A wide diversity of eukaryotic taxa was detected from the 454-tags. Since the 18S primers used in this study target conserved regions of all eukaryotic ribosomal sequences, they were sensitive enough to detect eukaryotic phytoplankton, single-celled heterotrophs and multicellular organisms. Many of the dominant green algae that were detected have been previously found as seasonal eukaryotic algae in Hamilton Harbour in 2014 including Cryptomonas, Coelastrum and Oocystis (Dermott et al., 2007). The July predominance of sequences belonging to multicellular organisms including Keratella, Mesocyclops, Trichocerca and Synchaeta has been reported before in Hamilton Harbour, in the same archive by Dermott and colleagues. Furthermore, DNA sequences belonging to an invasive mussel (Dreissena bugensis) were detected in late August, likely the DNA of mussel larvae, which have a planktonic stage. Importantly, this study adds a number of novel microorganisms, including non-photosynthetic unicellular heterotrophs, to the Hamilton Harbour biological community database. A diversity of uncharacterized fungi belonging to the group Chytridiomycota was detected. This finding suggests that there are potential fungal parasites on the phytoplankton community members in Hamilton Harbour. Chytrid parasitism could be a significant influence on phytoplankton composition based on recent suggestions in the literature. For example, researchers in Europe found that a decline of up to 60% in filamentous Cyanobacterial bloom populations in Lake Aydat, France, was due to Chytrid parasitism (Gerphagnon et al., 2013), highlighting the potential importance of these organisms in contributing to the regulation of Cyanobacterial populations.

The functional potentials of the dominant bacterioplankton are explored in the next Chapter, with a focus placed on nitrogen metabolism.

Chapter 4 Functional Potentials of the 2015 Community

4.1 Introduction

Microbial transformation naturally drives nutrient cycling in aquatic systems. With respect to nitrogen, less available forms are converted to a biologically available form, ammonium, by specialized microbes. Conversely, ammonium can be oxidized by nitrifying microbes and then returned to the atmosphere through denitrification by other specialized microbes. Ammonium and nitrate/nitrite can also be supplied to the water column in significant amounts through multiple processes, including internal loading from the sediments, external loading from terrestrial or anthropogenic sources, or intracellular biological conversion by nitrogen-fixing prokaryotes (Scott & Grantz, 2013). Consequently, the bioavailability of nitrogen in lake systems is both a product of and an influence on microbial ecology in the water column.

Nitrogen bioavailability is thought to affect both the growth rate and toxicity of Cyanobacterial blooms. Nitrogen is required for protein synthesis and is therefore required to sustain algal cell proliferation. The growth rate of Microcystis in culture experiments has been positively correlated to phosphorus, nitrogen, CO2 availability and light (Downing et al., 2005). In regard to toxin production, the physico-chemical requirements are less clear. Parameters showing quantifiable positive correlations with MC production have included growth rate, cellular nitrate uptake and intracellular nitrogen abundance (Downing et al., 2005). Yet, P uptake and carbon fixation in the same study by Downing and colleagues showed negative correlations with toxin production. This finding indicates that the conditions needed to sustain Cyanobacterial biomass are not necessarily correlated to bloom toxicity. Molecular analysis by Ginn et al. (2010) revealed that the transcription factor NtcA, a Cyanobacterial-specific nitrogen regulator, binds to the microcystin promoter region of the mcyA/D genes in Microcystis aeruginosa, and that both mcyB and NtcA are upregulated under nitrogen-limited and nitrogen-starved conditions. Nitrogen availability is evidently involved in Cyanobacterial proliferation and in modulating the synthesis of MCs, although it is still unclear why Cyanobacteria produce these toxins and which set of conditions is required to trigger bloom toxicity.

Many Cyanobacteria have unique high-affinity transport mechanisms and pathways for transport and intracellular nitrogen assimilation (Flores et al., 2005). Some genera can also undergo nitrogen fixation under severe nitrogen limitation. Nitrogen fixation is an energy-intensive metabolic process that only a selection of prokaryotic organisms can undergo. In marine systems, which are naturally N-limited, nitrogen fixation is a critical biochemical process carried out primarily by diazotrophic Cyanobacteria (Capone & Carpenter, 1982). With respect to models of eutrophication in freshwater systems, recent work by Scott & Grantz (2013) showed that if Cyanobacterial populations are excessively N-limited relative to other nutrients, primary production could in fact be sustained through sufficient rates of nitrogen fixation. In cases where non-nitrogen-fixing Cyanobacteria dominate in nitrogen-limiting systems, one concept that has been able to explain the blooms of non-nitrogen-fixing phytoplankton in marine systems has been the production of “new nitrogen”. In this model, NH4+ and dissolved organic nitrogen initially fixed by co-occurring Cyanobacterial diazotrophs leak out of cells and become available for non-diazotrophs (Glibert & Bronkt, 1994). Other researchers question the importance of nitrogen fixation in certain circumstances. They argue that microbially mediated reduction of available nitrate to N2 gas can offset nitrogen input by fixation, leaving a net balance of N-limitation, and emphasize anthropogenic N loads as the more influential source of nitrogen (Scott & McCarthy, 2010).

In this Chapter, the metabolisms of the dominant microorganisms identified in Chapter 3 are assessed using metagenomics. Since nitrogen fixers can theoretically bypass bioavailable nitrogen limitation, I hypothesized that the frequency of genes involved in nitrogen fixation would be highest in the late summer, when dissolved inorganic nitrogen in the water column is measured to be the lowest.

4.2 Methods

Functional analysis was conducted through MG-RAST against databases with annotation settings defaulted to a minimum 60% sequence identity, maximum e-value of 10^-20 and minimum 15 bp sequence alignment. Sequences were compared to the SEED Subsystems database. SEED classification contains a collection of biologically defined “subsystems”, which are groups of well-defined genes based on functional roles. In order to assess function within the context of enzymatic pathways, the KEGG functions were also explored. Taxonomy assignment to functional genes was also explored and limited to genes of interest. To do so, hierarchical search results for each gene of interest were sent to the workbench in MG-RAST. The workbench data were then subjected to best-hit classification against the M5RNA database using minimum 60% identity, maximum e-value 10^-20 and minimum 15 bp alignment length. Results were either plotted using features in MG-RAST or exported to plot in Microsoft Excel.
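The annotation cutoffs described above can be illustrated with a minimal sketch. This is not MG-RAST code: the `Hit` record, its field names and the example hits are hypothetical, and the e-value cutoff is read here as 1e-20 on the assumption that the exponent was written as a superscript in the original text.

```python
from dataclasses import dataclass

# Cutoffs taken from the Methods text; 1e-20 is an assumed reading of the e-value.
MIN_IDENTITY = 60.0    # minimum percent sequence identity
MAX_EVALUE = 1e-20     # maximum e-value
MIN_ALIGN_LEN = 15     # minimum alignment length


@dataclass
class Hit:
    """Hypothetical record for one read-to-database similarity hit."""
    read_id: str
    function: str
    identity: float   # percent identity of the alignment
    evalue: float
    align_len: int


def passes_cutoffs(hit: Hit) -> bool:
    """Return True only if the hit satisfies all three cutoffs."""
    return (
        hit.identity >= MIN_IDENTITY
        and hit.evalue <= MAX_EVALUE
        and hit.align_len >= MIN_ALIGN_LEN
    )


def filter_hits(hits):
    """Keep only the hits that pass every cutoff, preserving input order."""
    return [h for h in hits if passes_cutoffs(h)]


# Made-up example hits: only the first one meets all three cutoffs.
example_hits = [
    Hit("read1", "nifH", 85.0, 1e-30, 40),
    Hit("read2", "narB", 55.0, 1e-25, 30),   # identity too low
    Hit("read3", "nirA", 70.0, 1e-10, 25),   # e-value too high
    Hit("read4", "amt", 62.0, 1e-21, 12),    # alignment too short
]
kept = filter_hits(example_hits)
```

All three conditions must hold at once, so a hit with high identity can still be discarded for a weak e-value or a short alignment.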

4.3 Results

4.3.1 General Functions

General functions within the metagenomes are shown in Figure C.1 of Appendix C. In general, the highest proportion of reads belonged to the categories Clustering-based subsystems and Amino Acids/Derivatives across all samples. The Clustering-based subsystems category is one in which there is functional coupling evidence that certain genes belong together, but it is still unknown what they do. Nitrogen metabolism contributed approximately 1% of the relative proportion of all categories across metagenomes.

4.3.2 Nitrogen Metabolism Overall

Within the category of nitrogen metabolism (Figure 4.1), the relative proportions of ammonium assimilation, nitrate/nitrite ammonification and nitrogen fixation changed most significantly across samples. There was a noticeable increase in the latter two processes over time.

[Figure 4.1 plot. Title: Nitrogen Metabolism: Subsystems. Y-axis: Relative percentage of each category (%). X-axis: Sample. Legend: Other; Nitrogen Fixation; Denitrification; Dissimilatory Nitrate Reductase; Cyanate Hydrolysis; Allantoin Utilization; Nitrate and Nitrite Ammonification; Nitric Oxide Synthase; Ammonia Assimilation.]

Figure 4.1: Categories within nitrogen metabolism as described by SEED Subsystems.

4.3.3 Nitrogen Transport

The increase in the proportion of nitrogen fixation and nitrate/nitrite ammonification could have been the result of either a decrease in the number of genes involved in ammonium assimilation or a true increase in the frequency of these genes. Figure 4.2 shows a more “absolute” measure to clarify what was occurring. The read counts belonging to transport of nitrate/nitrite and ammonium forms were normalized against the copy number of the bacterial housekeeping gene rpoB (which encodes the β-subunit of bacterial RNA polymerase) within each metagenome. Eukaryotic algae, fungi and higher plants carry out some nitrogen transformations, but because the ratio of bacterial reads to eukaryotic reads was much higher in the samples (Chapter 3, Table 3.1), rpoB was chosen as the gene marker for normalization. As shown in Figure 4.2, the overall biological potential for ammonium transport did not change significantly over time. However, nitrate/nitrite transport potential was lowest in July and increased gradually in August and September, with only a slight decrease in mid-September.
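The difference between a relative proportion and an rpoB-normalized value can be sketched as follows. This is a minimal illustration with invented read counts, not the thesis data; the function names and the example numbers are hypothetical.

```python
def relative_percent(counts):
    """Express each category as a percentage of the summed counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {name: 100.0 * n / total for name, n in counts.items()}


def normalize_to_marker(counts, marker_count):
    """Divide each read count by the read count of a marker gene (here rpoB)."""
    if marker_count <= 0:
        raise ValueError("marker gene count must be positive")
    return {name: n / marker_count for name, n in counts.items()}


# Invented read counts: nitrogen fixation reads are constant,
# ammonium assimilation reads drop by half, rpoB is constant.
early = {"ammonium_assimilation": 600, "nitrogen_fixation": 100}
late = {"ammonium_assimilation": 300, "nitrogen_fixation": 100}
rpob_early = 1000
rpob_late = 1000

rel_early = relative_percent(early)
rel_late = relative_percent(late)
norm_early = normalize_to_marker(early, rpob_early)
norm_late = normalize_to_marker(late, rpob_late)
```

In this invented case the relative share of nitrogen fixation rises from about 14% to 25% even though its rpoB-normalized value stays at 0.1, which is the ambiguity the normalization is meant to resolve.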

[Figure 4.2 plots. Panel a) title: NH4+ Transporter: rpoB Throughout the Summer; y-axis: Ratio of NH4+ Transporter: rpoB. Panel b) title: NO3-/NO2- Transporter: rpoB Throughout the Summer; y-axis: Ratio of NO3-/NO2- Transporter: rpoB. X-axis in both panels: Sample (J30_I, J30_O, A13_I, A13_O, A27_I, A27_O, S10_I, S10_O, S24_I, S24_O).]

Figure 4.2: Nitrogen transport potential throughout the summer. Sequences coding for a) ammonium and b) nitrate/nitrite transporters, normalized against the bacterial housekeeping gene rpoB. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).

Taxonomy assignment to these nitrogen transport sequences (Figures 4.3 and 4.4) suggested that a wide diversity of bacterial phyla had nitrogen transport genes. At the beginning of the summer, Actinobacteria and Proteobacteria (α and β classes approximately equally) dominated the relative proportion of ammonium transport (Figure 4.3), but this changed as Cyanobacterial ammonium transport genes dominated in late August and September. For nitrate/nitrite transport (Figure 4.4), the beginning of the summer was dominated mainly by β-Proteobacteria that had the potential to transport nitrate/nitrite into their cells, but transport potential then became dominated by Cyanobacteria in August and September. A significantly smaller portion of the nitrate/nitrite transport (maximum 6% of total pie) was assigned to Actinobacteria in comparison to ammonium transport, suggesting that the taxa within this phylum had a lower potential to take up less reduced forms of nitrogen.

[Figure 4.3 pie charts: one chart per sample (J30_I, J30_O, A13_I, A13_O, A27_I, A27_O, S10_I, S10_O, S24_I, S24_O). Taxon labels for the individual slices are not recoverable from the extracted text.]

Figure 4.3: Taxonomy assignment for ammonium transporter genes. Taxa contributing to >1% of the genes are represented in the pie charts.

[Figure 4.4 pie charts: one chart per sample (J30_I, J30_O, A13_I, A13_O, A27_I, A27_O, S10_I, S10_O, S24_I, S24_O). Taxon labels for the individual slices are not recoverable from the extracted text.]

Figure 4.4: Taxonomy assignment for nitrate/nitrite transporter genes. Taxa contributing to >1% of the genes are represented in the pie charts.

4.3.4 Intracellular Nitrogen Reduction

Enzymes involved in the intracellular reduction of nitrate/nitrite and fixation of nitrogen were obtained from pathways of the hierarchical “Environmental Processing” category in KEGG. Analyses were restricted to intracellular bacterial nitrogen reduction and are highlighted in Figure 4.5. The figure suggests that the early and late summer metagenomes contained the same enzymatic pathways. However, this figure gives no information on the differences in abundance of these genes across samples.

Figure 4.5: Network of nitrogen metabolism, including enzymes (in boxes) that catalyze the conversion of different nitrogen forms. Genes detected in the July 30 and August 14 communities (blue) and August 27-September 24 communities (red) are highlighted. Each enzyme code corresponds to a unique KEGG accession number. The pathways involved in prokaryotic nitrogen metabolism that are investigated in this chapter are within the red square.

The absolute abundance of the dominant genes in the pathways highlighted in Figure 4.5 is plotted in Figure 4.6. In contrast to the presence/absence information, the results in Figure 4.6 showed that the frequency of these genes varied over time. Namely, the genes that increased most in frequency throughout the summer were 1.7.7.2, encoding an enzyme involved in ferredoxin-dependent nitrate reduction to nitrite; 1.7.7.1, encoding an enzyme involved in ferredoxin-dependent nitrite reduction to ammonium; and 1.18.6.1, encoding nitrogenase, which catalyzes the conversion of dinitrogen to ammonium. These genes were highest in late August and September. Furthermore, 1.7.99.4, which encodes a different nitrate reductase, and 1.7.1.4, which encodes a different nitrite reductase, were also present in the samples and highest in September.

[Figure 4.6 plot. Title: Abundance of Intracellular Nitrogen Reduction Genes Throughout the Summer. Y-axis: Ratio of each gene copy number: rpoB. X-axis: Sample. Legend: NO2- to NH4+ [1.7.7.1]; NO2- to NH4+ [1.7.1.4]; NO3- to NO2- [1.7.99.4]; NO3- to NO2- [1.7.7.2]; N2 to NH4+ [1.18.6.1].]

Figure 4.6: Intracellular nitrogen reduction by enzymes described by KEGG with the corresponding accession numbers in []. The values plotted are a ratio between each gene and the copy number of rpoB present within each dataset.

Taxonomic assignments to each of the genes in Figure 4.6 by MG-RAST are shown in Figure 4.7. All narB genes, involved in nitrate to nitrite reduction [1.7.7.2], were assigned to Cyanobacterial taxa. Succession within the phylum was apparent as the ratio of Chroococcales:Oscillatoriales/Nostocales decreased throughout the summer. The heterotrophic bacterial phyla Proteobacteria and Actinobacteria dominated the narG gene pool; narG is a member of the ‘nar’ family of nitrate reductases. Within the Proteobacteria, the ratio of α-Proteobacteria:β-Proteobacteria decreased throughout the summer, with the genes belonging to the β class predominating in September. With respect to nitrite reduction to ammonium, the nirA gene was dominated by Cyanobacteria, while Proteobacteria dominated the nirB gene pool with almost no sequences assigned to Actinobacteria. The ratio of α-Proteobacteria:β-Proteobacteria once again decreased over the summer for this gene, and the β class peaked in September. Representing nitrogen fixation, nifH genes increased over the summer and were almost all assigned to Cyanobacterial sequences, with some assigned to Proteobacteria (Figure 4.8).

Figure 4.7: Taxonomy assignment to the genes encoding subunits of the major enzymes responsible for catalyzing the stepwise reduction of nitrate to ammonium. The relative percentage of each taxon was multiplied by the normalized gene copy value obtained from Figure 4.6.

Figure 4.8: Taxonomy assignment of sequences corresponding to nifH, representing the nitrogenase complex that catalyzes the reduction of dinitrogen to ammonium. The relative percentage of each taxon was multiplied by the normalized gene copy value obtained from Figure 4.6.
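The weighting described in the two captions above can be sketched as follows, assuming that "multiplied" means each taxon's share of a gene's reads times that gene's rpoB-normalized ratio. The function name and the example numbers are hypothetical and are not values from the thesis.

```python
def weight_taxa_by_gene_ratio(taxon_percent, gene_to_rpob_ratio):
    """Scale each taxon's relative share of a gene by the gene:rpoB ratio.

    taxon_percent: mapping of taxon name to its percentage of the reads
        assigned to one gene (the shares need not sum to exactly 100).
    gene_to_rpob_ratio: normalized copy value of that gene in the sample.
    """
    total = sum(taxon_percent.values())
    if total <= 0:
        raise ValueError("taxon percentages must sum to a positive value")
    return {
        taxon: (pct / total) * gene_to_rpob_ratio
        for taxon, pct in taxon_percent.items()
    }


# Invented example: one gene in one sample.
nifh_taxa = {"Cyanobacteria": 80.0, "Proteobacteria": 20.0}
nifh_ratio = 0.5   # hypothetical nifH:rpoB ratio

weighted = weight_taxa_by_gene_ratio(nifh_taxa, nifh_ratio)
```

Because the shares are rescaled to fractions first, the weighted values for a gene always sum to that gene's normalized ratio, so bars built this way stay comparable across samples.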

4.4 Discussion

Analysis of the shotgun data revealed not only which genes were dominant in the microbial community but also which taxa had those genes. The surge of Cyanobacteria in August and September was associated with changes in overall microbial functional potentials. There were also clear changes with respect to the abundance of genes involved in certain pathways of nitrogen metabolism. As seen in Table 4.1, Cyanobacteria are already known to use the narB/nirA pathway primarily in the biosynthesis of N compounds for sustaining reproductive growth, such as in the development of an algal bloom (Moreno-Vivián et al., 1999). Furthermore, it was found that heterotrophic β-Proteobacteria within the bacterial community in Hamilton Harbour were the most abundant taxa with genes involved in nitrate/nitrite reduction (narG/nirB) in anaerobic nitrate/nitrite respiration (Table 4.1). Ultimately, however, both pathways use oxidized nitrogen compounds as substrates and produce NH4+.

Table 4.1: Nitrogen reduction genes in biological organisms. Modified from table found in Moreno-Vivián et al. (1999).

| KEGG Enzyme ID | Enzyme Name | Structural Gene | Main Function | Substrate → Product | Affiliated Taxa |
|---|---|---|---|---|---|
| 1.7.7.2 | Nitrate Reductase | narB | Biosynthesis of N compounds | NO3- → NO2- | Cyanobacteria |
| 1.7.99.4 | Nitrate Reductase | narG | PMF (anaerobic nitrate respiration) | NO3- → NO2- | Bacteria |
| 1.7.7.1 | Nitrite Reductase | nirA | Biosynthesis of N compounds | NO2- → NH4+ | Bacteria/Fungi/Plants/Algae |
| 1.7.1.4 | Nitrite Reductase | nirB | Dissipation of reducing power/Nitrite detoxification [1] | NO2- → NH4+ | Bacteria |

[1] Part of a ‘dissimilation’ path even though the ammonium generated can be assimilated, but primary use is to detoxify nitrite that accumulated in anaerobic nitrate-respiring cells and to regenerate NAD+.

Within the Hamilton Harbour epilimnion, Cyanobacteria are apparently the main microorganisms capable of fixing nitrogen, although some Proteobacterial sequences were also detected. Diazotrophic Cyanobacterial gene abundance was highest in August/September, when inorganic nitrogen is lowest in the water column based on seasonal trends (Chapter 2 Introduction) and analysis of the raw 2015 physico-chemical data, supporting the hypothesis. The significance of this may be that there is succession within the phylum throughout the summer, and those that can fix nitrogen may be utilizing this capability when the water column experiences the most nitrogen limitation, until fall mixing.

As seen in this study, in contrast to most members of the resident heterotrophic community, Cyanobacteria have the genetic equipment that allows them to utilize nitrate/nitrite and dinitrogen for assimilatory purposes if more reduced forms of nitrogen become limiting. The increase in frequency of Cyanobacterial nitrate/nitrite ammonification and nitrogenase genes in the environment suggests that resident Cyanobacteria in Hamilton Harbour can overcome the ammonium-nitrogen limitation the system faces during the summer. Importantly, however, functional potential does not equate to the actual expression of these genes, so true nitrogen limitation would be better supported by other methods such as transcriptomics/proteomics data. Other studies have analyzed nitrogen metabolism through metagenomic snapshots. Steffen et al. (2015) analyzed actual gene activity in situ by examining the transcription of Microcystis genes at multiple sample sites during August 2012 in the Western basin of Lake Erie. They found that the rates of nitrogen metabolic gene expression were significantly different at two geographically distant sites in the lake. They attributed cases of higher narB and nirA gene expression to potential nitrogen stress within certain areas of the lake. This was supported later by mesocosm experiments that showed primary producers were nitrogen limited during the same sampling season, August 2012, which was accompanied by a taxonomic shift to diazotrophs (Chaffin et al., 2013).

Chapter 5 Conclusion and Future Directions

In this study, I have described the microbial community composition of the surface water in Hamilton Harbour at two sites bi-weekly for two consecutive summers. Molecular tools allowed the description to include both cultivable and uncultivable organisms within this environment. The 2014 and 2015 sample sets were processed using slightly different methods (i.e., modifications in the DNA isolation protocol, sequencing chemistry and bioinformatic analysis pipeline), which limited true comparison of both years. Furthermore, different sampling intervals with only partial temporal overlap may also limit true comparison between years. The 2015 community was sampled later in the season, so the pre-Cyanobacterial dominant community was not captured; neither was post-Cyano dominance for both sampling years. Nevertheless, several conclusions were drawn from the study:

• The correlation between physico-chemical environment and community structure suggested that the dissimilarities in bacterial community structure were linked to changes in the physico-chemical characteristics of the system in the top 1 m of the water column throughout the summer. Out of the parameters measured, soluble inorganic nitrogen had the strongest correlation with the bacterial community structure in 2015, while anions had the strongest correlation with the 2014 bacterial community structure.

• Actinobacteria belonging to the candidate clades acI and acIV dominated the bacterial community before the Cyanobacteria peaked in August/September. Proteobacteria (specifically the β-class) were the major co-dominant heterotrophic group with Cyanobacteria in the late summer. Planctomycete populations also co-occurred with Cyanobacteria. Several eukaryotes were also detected in the samples, including potentially parasitic Chytrids, which merit future study for interspecific relations, as these eukaryotes have been a historically overlooked contributor to phytoplankton population control. The dominant zooplankton in 2014 were copepods and rotifers, whereas Daphnia was the most dominant grazer in 2015.

• There was no change in the presence/absence of genes involved in nitrogen metabolism throughout the summer. However, the frequency of genes involved in specific metabolic pathways within the category of nitrogen metabolism changed significantly throughout the summer. The potential for Cyanobacterial nitrogen fixation was highest in late August-September, when inorganic nitrogen was relatively low in the water column.

Most importantly, the metagenomic data that has been obtained adds another dimension to our understanding of the interactions that ultimately shape the biological network in the events leading up to and during Cyanobacterial proliferation. There are many more questions to investigate within the metagenome datasets, narrowing on genes that can reveal interspecific relationships:

• Are heterotrophic microcystinase genes present and who has them?

• Do Actinobacterial/Proteobacterial/Planctomycete organisms found have pathways of breakdown of Cyanobacterial metabolites?

• Do the heterotrophs in this system provide the Cyanobacteria with nutrients/vitamins?

• Can the zooplankton detected graze the dominant Cyanobacteria and are they doing so?

• What genes constitute the “miscellaneous” group within the metagenomes and what functions are in the % unannotated?

While metagenomics provides an idea of the collective functional potential of the community, it is not indicative of actual microbial activity. Metatranscriptomic data could be used to investigate whether there is evidence of Cyanobacterial nitrate/nitrite transport, assimilation and nitrogen fixation gene expression in the late summer in situ. Other enzymatic pathways that may be of interest to investigate include heterotrophic microcystinase activity and Cyanobacterial compound breakdown by heterotrophs. The results would confirm whether the Cyanobacterial succession in the late summer is mainly influenced by nitrogen bioavailability, whereas the heterotrophic bacterial assemblage is determined by metabolic capacity to degrade Cyanobacterial exudates and detrital materials. Ultimately, the results will require integration into long-term models that help predict the likelihood of toxic Cyanobacterial bloom proliferation in Hamilton Harbour and may even be extended to other similar systems affected by recurring CyanoHABs.

References

Bartram, J., Chorus, I., Kuiper-, T., Utkilen, H., Burch, M., & Codd, G. A. (1994). Chapter 5 . SAFE LEVELS AND SAFE PRACTICES.

Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., & Wheeler, D. L. (2005). GenBank. Nucleic Acids Research, 33(DATABASE ISS.), 34–38. http://doi.org/10.1093/nar/gki063

Beversdorf, L. J., Miller, T. R., & McMahon, K. D. (2013). The Role of Nitrogen Fixation in Cyanobacterial Bloom Toxicity in a Temperate, Eutrophic Lake. PLoS ONE, 8(2). http://doi.org/10.1371/journal.pone.0056103

Bláhová L, Babica P, Maršálková E, Smutná M, Maršálek B, Bláha L. (2007) Concentrations and seasonal trends of extracellular microcystins in freshwaters of the Czech Republic – results of the national monitoring program. CLEAN – , Air, Water 35:348–354.

Blomqvist, P., Petterson, A., Hyenstrand, P. (1994). Ammonium–nitrogen: a key regulatory factor causing dominance of non-nitrogen-fixing cyanobacteria in aquatic systems. Archiv. Hydrobiologie. 132, 141–164.

Boopathi, T., Faria, D. G., Lee, M. D., Lee, J., Chang, M., & Ki, J. S. (2015). A molecular survey of freshwater microeukaryotes in an Arctic reservoir (Svalbard, 79°N) in summer by using next-generation sequencing. Polar Biology, 38(2), 179–187. http://doi.org/10.1007/s00300-014-1576-9

Bouvy, M., Pagano, M., & Troussellier, M. (2001). Effects of a cyanobacterial bloom (Cylindrospermopsis raciborskii) on bacteria and zooplankton communities in Ingazeira reservoir (northeast Brazil). Aquatic Microbial Ecology, 25(3), 215–227. http://doi.org/10.3354/ame025215

Buchfink, B., Xie, C., & Huson, D. H. (2015). Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12(1), 59-60. doi:http://dx.doi.org/10.1038/nmeth.3176

Cai, H. Y., Yan, Z. sheng, Wang, A. J., Krumholz, L. R., & Jiang, H. L. (2013). Analysis of the Attached Microbial Community on Mucilaginous Cyanobacterial Aggregates in the Eutrophic Lake Taihu Reveals the Importance of Planctomycetes. Microbial Ecology, 66(1), 73–83. http://doi.org/10.1007/s00248-013-0224-1

Cailliez, F. (1983). The analytical solution of the additive constant problem. Psychometrika, 48(2), 305–308. http://doi.org/10.1007/BF02294026

Capone, D. G., & Carpenter, E. J. (1982). Nitrogen Fixation in the Marine Environment. Science, 217(4565), 1140–1142.

Caporaso, J. G., Bittinger, K., Bushman, F. D., Desantis, T. Z., Andersen, G. L., & Knight, R. (2010). PyNAST: A flexible tool for aligning sequences to a template alignment. Bioinformatics, 26(2), 266–267. http://doi.org/10.1093/bioinformatics/btp636

Caporaso, J. G. et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7, 335–336

CENR. (2003). An Assessment of Coastal Hypoxia and Eutrophication in U.S. Waters, 1–82.

Chaffin, J. D., Bridgeman, T. B., & Bade, D. L. (2013). Nitrogen Constrains the Growth of Late Summer Cyanobacterial Blooms in Lake Erie. Advances in Microbiology, 3(October), 16–26. http://doi.org/10.4236/aim.2013.36A003

Charlton, M.N. (2001). The Hamilton Harbour remedial action plan: eutrophication. Verh. Internat. Verein. Limnol. 27, 4069–4072

Charlton, M.N., Le Sage, R. (1996). Water quality trends in Hamilton Harbour: 1987 to 1995. Water Qual. Res. J. Can. 31, 473–484.

Chorus, I. (2001). Cyanotoxins, occurrence, causes, consequences. – Heidelberg.

Chu, Z., Jin, X., Yang, B., & Zeng, Q. (2007). Buoyancy regulation of Microcystis flos-aquae during phosphorus-limited and nitrogen-limited growth. Journal of Plankton Research, 29(9), 739–745. http://doi.org/10.1093/plankt/fbm054

Combes, A., Dellinger, M., Cadel-six, S., Severine, A., Comte, K. (2013). Ciliate Nassula sp. grazing on a microcystin–producing cyanobacterium (Planktothrix agardhii): impact on cell growth and in the microcystin fractions. Aquat Toxicol 126: 435–441

Davis, T. W., Bullerjahn, G. S., Tuttle, T., McKay, R. M., & Watson, S. B. (2015). Effects of Increasing Nitrogen and Phosphorus Concentrations on Phytoplankton Community Growth and Toxicity during Planktothrix Blooms in Sandusky Bay, Lake Erie. Environmental Science and Technology, 49(12), 7197–7207. http://doi.org/10.1021/acs.est.5b00799

Dermott, R., Johannsson, O., Munawar, M., Bonnell, R., Bowen, K., Burley, M., … Niblock, H. (2007). Assessment of lower food web in Hamilton Harbour, Lake Ontario, 2002-2004. Canadian Technical Report of Fisheries and Aquatic Sciences, 120. Retrieved from http://publications.gc.ca/collections/collection_2012/mpo-dfo/Fs97-6-2729-eng.pdf

DeSantis, T. Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E. L., Keller, K., … Andersen, G. L. (2006). Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology, 72(7), 5069–5072. http://doi.org/10.1128/AEM.03006-05

Didier, G., Durand, B., Heibl, C., Ives, A., & Lawson, D. (2016). Package “ ape .”

Dortch, Q. (1990). The interaction between ammonium and nitrate uptake in phytoplankton. Marine Ecology Progress Series, 61, 183–201. http://doi.org/10.3354/meps061183

Downing, J.A., McCauley, E. (1992). The nitrogen: phosphorus relationship in lakes. Limnol Oceanogr. 37: 936–945 doi:10.4319/lo.1992.37.5.0936

Downing, T.G., Meyer, C., Gehringer, M.M., van de Venter, M. (2005). Microcystin content of Microcystis aeruginosa is modulated by nitrogen uptake rate relative to specific growth rate or carbon fixation rate. Environmental Toxicology 20: 257–262. doi: 10.1002/tox.20106

Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. http://doi.org/10.1093/bioinformatics/btq461

Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C., & Knight, R. (2011). UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27(16), 2194–2200. http://doi.org/10.1093/bioinformatics/btr381

Edgcomb, V. P., Kysela, D. T., Teske, A., de Vera Gomez, A., & Sogin, M. L. (2002). Benthic eukaryotic diversity in the Guaymas Basin hydrothermal vent environment. Proceedings of the National Academy of Sciences of the United States of America, 99(11), 7658–62. http://doi.org/10.1073/pnas.062186399

Eiler, A., & Bertilsson, S. (2004). Composition of freshwater bacterial communities associated with cyanobacterial blooms in four Swedish lakes. Environmental Microbiology, 6(12), 1228–1243. http://doi.org/10.1111/j.1462-2920.2004.00657.x

Flores, E., Frías, J. E., Rubio, L. M., & Herrero, A. (2005). Photosynthetic nitrate assimilation in cyanobacteria. Photosynthesis Research, 83(2), 117–133. http://doi.org/10.1007/s11120-004-5830-9

Fredriksson, N. J., Hermansson, M., & Wilén, B.-M. (2013). The choice of PCR primers has great impact on assessments of bacterial community diversity and dynamics in a wastewater treatment plant. PLoS ONE, 8(10), e76431. http://doi.org/10.1371/journal.pone.0076431

Ganf, G. G. and Oliver, R. L. (1982). Vertical separation of light and available nutrients as a factor causing replacement of green algae by blue-green algae in the plankton of a stratified lake. J. Ecol. 70, 829–844.

Ger, K. A., Hansson, L. A., & Lürling, M. (2014). Understanding cyanobacteria-zooplankton interactions in a more eutrophic world. Freshwater Biology, 59(9), 1783–1798. http://doi.org/10.1111/fwb.12393

Gerphagnon, M., Latour, D., Colombet, J., & Sime-Ngando, T. (2013). Fungal Parasitism: Life Cycle, Dynamics and Impact on Cyanobacterial Blooms. PLoS ONE, 8(4), 2–11. http://doi.org/10.1371/journal.pone.0060894

Ginn, H.P., Pearson, L.A, Neilan, B.A. (2010). NtcA from Microcystis aeruginosa PCC 7806 is autoregulatory and binds to the microcystin promoter. Applied and Environmental Microbiology 76: 4362–4368. doi: 10.1128/aem.01862-09


Glibert, P. M., & Bronk, D. A. (1994). Release of Dissolved Organic Nitrogen by Marine Diazotrophic, 60(11), 3996–4000.

Glöckner, F. O., et al. (2003). Complete genome sequence of the marine planctomycete Pirellula sp. strain 1. Proc. Natl. Acad. Sci. U. S. A. 100:8298–8303.

Granato, B. G. E., & Smith, K. P. (1999). Estimating Concentrations of Road-Salt Constituents in Highway-Runoff from Measurements of Specific Conductance. Water Resources, 22.

Grossman, R., Schaefer, M. R., Chiang, G. G., & Collier, J. L. (1993). The phycobilisome, a light-harvesting complex responsive to environmental conditions. Microbiological Reviews, 57(3), 725–749.

Ha, K., Cho, E.-A., Kim, H.-W., & Joo, G.-J. (1999). Microcystis bloom formation in the lower Nakdong River, South Korea: importance of hydrodynamics and nutrient loading. Marine and Freshwater Research, 50, 89–94. http://doi.org/10.1071/MF99078

Hamblin, P. F., & He, C. (2003). Numerical models of the exchange flows between Hamilton Harbour and Lake Ontario. Water Research, 180, 168–180. http://doi.org/10.1139/L02-076

Hamilton Harbour RAP Stakeholder Forum (HH RAP), (2003). Remedial Action Plan for Hamilton Harbour: Stage 2 Update 2002. Burlington, Ontario

Hamilton Harbour Remedial Action Plan (HH RAP), 1992. Goals, options and recommendations. Volume 2 — Main Report. RAP Stage 2. Burlington, Ontario

Haney, J. F. (1987). Field studies on zooplankton-cyanobacteria interactions. N. Z. J. Mar. Freshwat. Res., 21, 467–475.

Hardy, C. M., Adams, M., Jerry, D. R., Court, L. N., Morgan, M. J., & Hartley, D. M. (2011). DNA barcoding to support conservation: species identification, genetic structure and biogeography of fishes in the Murray - Darling River Basin, Australia. Marine and Freshwater Research, 62(8), 887. http://doi.org/10.1071/MF11027

Hecky, R. E., Campbell, P. L., Hendzel, L. L. (1993). The stoichiometry of carbon, nitrogen, and phosphorus in particulate matter of lakes and oceans. Limnol. Oceanogr. 38: 709–724.

Hingsamer, P., Peeters, F., & Hofmann, H. (2014). The consequences of internal waves for phytoplankton focusing on the distribution and production of Planktothrix rubescens. PLoS ONE, 9(8). http://doi.org/10.1371/journal.pone.0104359

Hugenholtz, P. (2002). Exploring prokaryotic diversity in the genomic era. Genome Biology, 3(2), REVIEWS0003. http://doi.org/10.1186/gb-2002-3-2-reviews0003

Huson, D. H., et al. (2016). MEGAN Community Edition - Interactive exploration and analysis of large-scale sequencing data. PLoS Computational Biology, 12(6): e1004957. doi:10.1371/journal.pcbi.1004957

Hutchinson, G. (1938). On the relation between the oxygen deficit and the productivity and typology of lakes. Internationale Revue Der Gesamten Hydrobiologie Und Hydrographie, 36(2), 336–355. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/iroh.19380360205/abstract

Hyenstrand, P., Burkert, U., Pettersson, A., & Blomqvist, P. (2000). Competition between the green alga Scenedesmus and the cyanobacterium Synechococcus under different modes of inorganic nitrogen supply. Hydrobiologia, 435(1-3), 91–98.

Hyenstrand, P., Rydin, E., Gunnerhed, M., Linder, J., Blomqvist, P. (2001). Response of the cyanobacterium Gloeotrichia echinulata to iron and boron additions – an experiment from Lake Erken. Freshw. Biol. 46, 735–741.

IJC. (2013). Human Health Effects from Harmful Algal Blooms : a Synthesis.

Jonlija, M. (2014). Assessment of toxic cyanobacterial abundance at Hamilton Harbour from analysis of sediment and water.

Joshi NA, Fass JN. (2011). Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.33) [Software]. Available at https://github.com/najoshi/sickle.

Kasalický, V., Jezbera, J., Hahn, M. W., & Šimek, K. (2013). The Diversity of the Limnohabitans Genus, an Important Group of Freshwater Bacterioplankton, by Characterization of 35 Isolated Strains. PLoS ONE, 8(3). http://doi.org/10.1371/journal.pone.0058209

Kissoon, L. T. T., Jacob, D. L., Hanson, M. A., Herwig, B. R., Bowe, S. E., Otte, M. L. (2013). Macrophytes in shallow lakes: relationships with water, sediment and watershed characteristics. Aquatic Botany, 109, 39–48.

Kumar, S., Stecher, G., & Tamura, K. (2016). MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Molecular Biology and Evolution, 33(7), msw054. http://doi.org/10.1093/molbev/msw054

Lage, O. M., & Bondoso, J. (2014). Planctomycetes and macroalgae, a striking association. Frontiers in Microbiology, 5(JUN), 1–9. http://doi.org/10.3389/fmicb.2014.00267

Laird, G. A., Scavia, D., and Fahnenstiel, G. A. (1986). Algal organic carbon excretion in Lake Michigan. J. Great Lakes Res. 12:136-141.

Latala, A. (1991). Effects of salinity, temperature and light on the growth and morphology of green planktonic algae. Oceanologia, 31, 119–138.

Lehman, P.W., S.J. Teh, G.L. Boyer, M.L. Nobriga, E. Bass and C. Hogle. (2010). Initial impacts of Microcystis aeruginosa blooms on the aquatic food web in the San Francisco Estuary. Hydrobiologia 637:229-248.

Li, J., Zhang, J., Liu, L., Fan, Y., Li, L., Yang, Y., … Zhang, X. (2015). Annual periodicity in planktonic bacterial and archaeal community composition of eutrophic Lake Taihu. Scientific Reports, 5(April), 15488. http://doi.org/10.1038/srep15488

Liu, W. T., Marsh, T. L., Cheng, H., & Forney, L. J. (1997). Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Applied and Environmental Microbiology, 63(11), 4516–22.

Lohscheider, J. N., Strittmatter, M., Küpper, H., & Adamska, I. (2011). Vertical distribution of epibenthic freshwater cyanobacterial Synechococcus spp. strains depends on their ability for photoprotection. PLoS ONE, 6(5). http://doi.org/10.1371/journal.pone.0020134

Lone, Y., Bhide, M., & Koiri, R. K. (2016). Microcystin-LR Induced Immunotoxicity in Mammals. Journal of Toxicology, 2016. http://doi.org/10.1155/2016/8048125

Lutton, E. M., Schellevis, R., & Shanmuganathan, A. (2013). Increased culturability of soil bacteria from Marcellus shale temperate in Pennsylvania. Journal of Student Research, 2(1), 36–42. Retrieved from http://jofsr.bluetangi.com/index.php/path/article/view/110

Magoč, T., Salzberg, S. L. (2011). FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963. doi: 10.1093/bioinformatics/btr507

Manage, P. M., Kawabata, Z., Nakano, S. (1999). Seasonal changes in densities of cyanophage infectious to Microcystis aeruginosa in a hypereutrophic pond. Hydrobiologia 411:211-216.

Meyer, F., et al. (2008). The Metagenomics RAST server – A public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 9:386.

Molot L.A., Brown E.J. (1986) Method for determining the temporal response of microbial phosphate-transport affinity. Applied and Environmental Microbiology, 51, 524–531.

Molot, L. A., Watson, S. B., Creed, I. F., Trick, C. G., Mccabe, S. K., Verschoor, M. J., … Schiff, S. L. (2014). A novel model for cyanobacteria bloom formation: The critical role of anoxia and ferrous iron. Freshwater Biology, 59(6), 1323–1340. http://doi.org/10.1111/fwb.12334

Moreno-Vivián, C., Cabello, P., Blasco, R., Castillo, F., Cabello, N., Marti, M., & Moreno-vivia, C. (1999). Prokaryotic Nitrate Reduction: Molecular Properties and Functional Distinction among Bacterial Nitrate Reductases, 181(21), 6573–6584.

Morgulis, A., Coulouris, G., Raytselis, Y., Madden, T. L., Agarwala, R., & Schäffer, A. A. (2008). Database indexing for production MegaBLAST searches. Bioinformatics, 24(16), 1757–1764. http://doi.org/10.1093/bioinformatics/btn322

Mou, X., Jacob, J., Lu, X., Robbins, S., Sun, S., & Ortiz, J. D. (2013). Diversity and distribution of free-living and particle-associated bacterioplankton in Sandusky Bay and adjacent waters of Lake Erie Western Basin. Journal of Great Lakes Research, 39(2), 352–357. http://doi.org/10.1016/j.jglr.2013.03.014

Murphy, T. P., Irvine, K., Guo, J., Davies, J., Murkin, H., Charlton, M., & Watson, S. B. (2003). New microcystin concerns in the lower Great Lakes. Water Qual. Res. J. Canada, 38(1), 127–140.

Neil, J. H., Owen, G. E. (1964). Distribution, environmental requirements and significance of Cladophora in the Great Lakes. In Proceedings of the 7th Conference on Great Lakes Research. Publ. 11. Great Lakes Research Division, University of Michigan, Ann Arbor, pp. 113–21.

Newton, R. J., Jones, S. E., Eiler, A., McMahon, K. D., & Bertilsson, S. (2011). A guide to the natural history of freshwater lake bacteria. Microbiology and Molecular Biology Reviews, 75. http://doi.org/10.1128/MMBR.00028-10

Niemi, R. M., Heiskanen, I., Heine, R., Rapala, J. (2009). Previously uncultured β-Proteobacteria dominate in biologically active granular activated carbon (BAC) filters. Water Res. 43: 5075–5086. doi: 10.1016/j.watres.2009.08.037. pmid: 19783028

Oksanen, A. J., Blanchet, F. G., Kindt, R., Minchin, P. R., O'Hara, R. B., Simpson, G. L., … Wagner, H. (2011). Package “vegan.”

Ontario Ministry of Environment and Energy. (1994). Water Management Policies Guidelines and Provincial Water Quality Objectives, Toronto, ON, Canada.

Paerl, H. W., Fulton, R. S., Moisander, P. H., & Dyble, J. (2001). Harmful freshwater algal blooms, with an emphasis on cyanobacteria. The Scientific World Journal, 1, 76–113. http://doi.org/10.1100/tsw.2001.16

Paerl, H. W., & Otten, T. G. (2013). Harmful Cyanobacterial Blooms: Causes, Consequences, and Controls. Microbial Ecology, 65(4), 995–1010. http://doi.org/10.1007/s00248-012-0159-y

Palmer, M. A. (1997). Ecosystem Biodiversity and in Freshwater Sediments. Ambio, 26(8), 571–577.

Panosso, R., Carlsson, P., Kozlowsky-Suzuki, B., Azevedo, S. M. F. O., & Granéli, E. (2003). Effect of grazing by a neotropical copepod, Notodiaptomus, on a natural cyanobacterial assemblage and on toxic and non-toxic cyanobacterial strains. Journal of Plankton Research, 25(9), 1169–1175. http://doi.org/10.1093/plankt/25.9.1169

Poulton, D. J. (1987). Trace Contaminant Status of Hamilton Harbour. Journal of Great Lakes Research, 13(2), 193–201. http://doi.org/10.1016/S0380-1330(87)71642-6

Qian, H., Preston, M. D., Gupta, V., Basiliko, N., Dunfield, K., Fulthorpe, R. (unpublished). Improved primers and a pyrosequencing approach recover wide range of nitrogen reductase genes from peatlands.

Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., … Glöckner, F. O. (2013). The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Research, 41(D1), 590–596. http://doi.org/10.1093/nar/gks1219

R Core Team. (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Ramette, A. (2007). Multivariate analyses in microbial ecology. FEMS Microbiology Ecology, 62(2), 142–160. http://doi.org/10.1111/j.1574-6941.2007.00375.x

Rasconi, S., Grami, B., Niquil, N., Jobard, M., & Sime-Ngando, T. (2014). Parasitic chytrids sustain zooplankton growth during inedible algal bloom. Frontiers in Microbiology, 5(MAY), 1–19. http://doi.org/10.3389/fmicb.2014.00229

Rastogi, R. P., Madamwar, D., & Incharoensakdi, A. (2015). Bloom dynamics of cyanobacteria and their toxins: Environmental health impacts and mitigation strategies. Frontiers in Microbiology, 6(NOV), 1–22. http://doi.org/10.3389/fmicb.2015.01254

Redfield, A. C. (1934). On the proportions of organic derivations in sea water and their relation to the composition of plankton. James Johnstone Memorial Volume. (ed. R.J. Daniel). University Press of Liverpool, pp. 176–192.

Reeder, J., & Knight, R. (2010). Rapid denoising of pyrosequencing amplicon data: exploiting the rank-abundance distribution. Nature Methods, 7(9), 668–669. http://doi.org/10.1038/nmeth0910-668b

Rees, G. N., Baldwin, D. S., Watson, G. O., Perryman, S., & Nielsen, D. L. (2004). Ordination and significance testing of microbial community composition derived from terminal restriction fragment length polymorphisms: application of multivariate statistics. Antonie van Leeuwenhoek, 86, 339–347. http://doi.org/10.1007/s10482-004-0498-x

Robarts, R. D., & Zohary, T. (1987). Temperature effects on photosynthetic capacity, respiration, and growth rates of bloom‐forming cyanobacteria. New Zealand Journal of Marine and Freshwater Research, 21(3), 391–399. http://doi.org/10.1080/00288330.1987.9516235

Scavia D., Lang G.A., Kitchell J.F. (1988). Dynamics of Lake Michigan plankton: a model evaluation of nutrient loading, competition and predation. Canadian Journal of Fisheries and Aquatic Sciences 45, 165–177

Schindler, D. W. (1977). Evolution of phosphorus limitation in lakes. Science, 195(4275), 260–262.

Schindler, D. W. (2012). The dilemma of controlling cultural eutrophication of lakes. Proceedings. Biological Sciences / The Royal Society, 279(1746), 4322–33. http://doi.org/10.1098/rspb.2012.1032

Schlesner, H., et al. (2004). Taxonomic heterogeneity within the Planctomycetales as derived by DNA-DNA hybridization, description of Rhodopirellula baltica gen. nov., sp. nov., transfer of Pirellula marina to the genus Blastopirellula gen. nov. as Blastopirellula marina comb. nov. and emended description of the genus Pirellula. Int. J. Syst. Evol. Microbiol. 54:1567–1580.

Scott, J. T., & Grantz, E. M. (2013). N2 fixation exceeds internal nitrogen loading as a phytoplankton nutrient source in perpetually nitrogen-limited reservoirs. Freshwater Science, 32(3), 849–861. http://doi.org/10.1899/12-190.1

Scott, J. T., & McCarthy, M. J. (2010). Nitrogen fixation may not balance the nitrogen pool in lakes over timescales relevant to eutrophication management. Limnology and Oceanography, 55(3), 1265–1270. http://doi.org/10.4319/lo.2010.55.3.1265

Skulberg, O. M., & Utkilen, H. (1999). Chapter 2. Cyanobacteria in the environment.

Smith, V. (1983). Low nitrogen to phosphorus ratios favor dominance by blue-green algae in lake phytoplankton. Science. 221: 669-671.

Steffen, M. M., Belisle, B. S., Watson, S. B., Boyer, G. L., Bourbonniere, R. A., & Wilhelm, S. W. (2015). Metatranscriptomic evidence for co-occurring top-down and bottom-up controls on toxic cyanobacterial communities. Applied and Environmental Microbiology, 81(9), 3268–3276. http://doi.org/10.1128/AEM.04101-14

Steffen, M. M., Li, Z., Effler, T. C., Hauser, L. J., Boyer, G. L., & Wilhelm, S. W. (2012). Comparative Metagenomics of Toxic Freshwater Cyanobacteria Bloom Communities on Two Continents. PLoS ONE, 7(8), 1–9. http://doi.org/10.1371/journal.pone.0044002

Steven, B., Dowd, S. E., Schulmeyer, K. H., & Ward, N. L. (2011). Phylum-targeted pyrosequencing reveals diverse planctomycete populations in a eutrophic lake. Aquatic Microbial Ecology, 64(1), 41–49. http://doi.org/10.3354/ame01507

Strom, S. L., & Morello, T. A. (1998). Comparative growth rates and yields of ciliates and heterotrophic dinoflagellates. Journal of Plankton Research, 20(3), 571–584.

Tallberg, P., & Heiskanen, A. (1998). Species-specific phytoplankton sedimentation in relation to primary production along an inshore–offshore gradient in the Baltic Sea. Journal of Plankton Research, 20(11), 2053–2070. http://doi.org/10.1093/plankt/20.11.2053

Teeling, H., & Glöckner, F. O. (2012). Current opportunities and challenges in microbial metagenome analysis-A bioinformatic perspective. Briefings in Bioinformatics, 13(6), 728–742. http://doi.org/10.1093/bib/bbs039

Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22(22), 4673–4680. http://doi.org/10.1093/nar/22.22.4673

Tillett, D., & Neilan, B. A. (2000). Xanthogenate nucleic acid isolation from cultured and environmental cyanobacteria, 251–258.

Trebitz, A. S., Hoffman, J. C., Grant, G. W., Billehus, T. M., & Pilgrim, E. M. (2015). Potential for DNA-based identification of Great Lakes fauna: match and mismatch between taxa inventories and DNA barcode libraries. Scientific Reports, 5(February), 12162. http://doi.org/10.1038/srep12162

Tyrrell, T. (1999). The relative influences of nitrogen and phosphorus on oceanic primary production. Nature, 400: 525–531.

Van Wichelen, J., van Gremberghe, I., Vanormelingen, P., Debeer, A. E., Leporcq, B., Menzel, D., … Vyverman, W. (2010). Strong effects of amoebae grazing on the biomass and genetic structure of a Microcystis bloom (Cyanobacteria). Environmental Microbiology, 12(10), 2797–2813. http://doi.org/10.1111/j.1462-2920.2010.02249.x

Venter, J. C. et al. (2004). Environmental Genome Shotgun Sequencing of the Sargasso Sea. Science, 304(5667), 66–74. http://doi.org/10.1126/science.1093857

Wagner, F., Falkner, R., Falkner, G. (1995) Information about previous phosphate fluctuations is stored via an adaptive response of the high-affinity phosphate-uptake system of the cyanobacterium Anacystis nidulans. Planta. 197, 147–155

Watson, S. B. (2004). Aquatic Taste and Odor: a Primary Signal of Drinking-Water Integrity. Journal of Toxicology and Environmental Health, Part A, 67(20-22), 1779–1795. http://doi.org/10.1080/15287390490492377

Watson, S. B., Ridal, J., & Boyer, G. L. (2008). Taste and odour and cyanobacterial toxins: impairment, prediction, and management in the Great Lakes. Canadian Journal of Fisheries and Aquatic Sciences, 65, 1779–1796. http://doi.org/10.1139/F08-084

Weisburg, W. G., Barns, S. M., Pelletier, D. A., & Lane, D. J. (1991). 16S ribosomal DNA amplification for phylogenetic study. Journal of Bacteriology, 173(2), 697–703.

Wei, T., Simko, V. (2016) R package ‘corrplot’: Visualization of a correlation matrix (Version 0.77).

Wells, M. L., Trainer, V. L., Smayda, T. J., Karlson, B. S. O., Trick, C. G., Kudela, R. M., … Cochlan, W. P. (2015). Harmful algal blooms and climate change: Learning from the past and present to forecast the future. Harmful Algae, 49, 68–93. http://doi.org/10.1016/j.hal.2015.07.009

Wilson, A. E., Sarnelle, O., & Tillmanns, A. R. (2006). Effects of cyanobacterial toxicity and morphology on the population growth of freshwater zooplankton: Meta-analyses of laboratory experiments. Limnology and Oceanography, 51(4), 1915–1924. http://doi.org/10.4319/lo.2006.51.4.1915

Winter, J. (n.d.). Analyzing the trends What are Algae ?

Wu, J., Tsanis, I. K., & Chiocchio, F. (1996). Observed currents and water levels in Hamilton Harbour. Journal of Great Lakes Research, 22(2), 224–240. http://doi.org/10.1016/S0380-1330(96)70951-6

Yoshida T. et al. (2006) Isolation and characterisation of a cyanophage infecting the toxic cyanobacterium Microcystis aeruginosa. Appl. Environ. Microbiol. 72:1239-1247

Zaccaroni, A., & Scaravelli, D. (2008). Toxicity of Fresh Water Algal Toxins To Humans and Animals. Algal Toxins: Nature, Occurrence, Effect and Detection., 399. http://doi.org/10.1007/978-1-4020-8480-5_3

Zhang, J., Zhang, X., Liu, Y., Xie, S., & Liu, Y. (2014). Bacterioplankton communities in a high-altitude freshwater wetland. Annals of Microbiology, 64(3), 1405–1411. http://doi.org/10.1007/s13213-013-0785-8

Appendix A

Table A.1: Sample dates, volumes filtered through Sterivex filters, water-column depths and sites sampled in Hamilton Harbour for this study.

Date            Volume Filtered   Depth   Site
5 June 2014     800 mL            1 m     9031
5 June 2014     1200 mL           1 m     1001
19 June 2014    1000 mL           1 m     9031
19 June 2014    1000 mL           1 m     1001
2 July 2014     800 mL            1 m     9031
2 July 2014     1300 mL           1 m     1001
17 July 2014    500 mL            1 m     9031
17 July 2014    600 mL            1 m     1001
31 July 2014    1000 mL           1 m     9031
31 July 2014    1000 mL           1 m     1001
14 Aug 2014     1650 mL           1 m     9031
14 Aug 2014     2000 mL           1 m     1001
27 Aug 2014     300 mL            1 m     9031
27 Aug 2014     180 mL            1 m     1001
30 July 2015    500 mL            1 m     9031
30 July 2015    500 mL            1 m     1001
13 Aug 2015     500 mL            1 m     9031
13 Aug 2015     500 mL            1 m     1001
27 Aug 2015     500 mL            1 m     9031
27 Aug 2015     500 mL            1 m     1001
10 Sept 2015    500 mL            1 m     9031
10 Sept 2015    500 mL            1 m     1001
24 Sept 2015    500 mL            1 m     9031
24 Sept 2015    500 mL            1 m     1001

Table A.2: Restriction enzymes that were tested to cut 16S SSU rRNA amplicons and corresponding fragments (values represent length in bp). Amplicons of the predicted bacterial community were generated in silico. All bacterial sequences were obtained from GenBank. The enzyme chosen for actual digestion (AluI) is bolded.

Taxon                                 HaeIII  MspI  RsaI  HinfI  AluI  HhaI  Sau3AI
Cyanobacteria
  Microcystis aeruginosa              182     -     425   63     194   -     7
  Lyngbya sp.                         293     151   425   193    241
  Aphanizomenon flos-aquae            292     -     424   103    192   -     7
Alphaproteobacteria
  Methylobacterium sp.                192     150   109   297    143   340   7
  Sphingomonas natatoria              71      150   422   103    208   82    7
  Caulobacter vibrioides              39      150   422   23     143   332   7
Betaproteobacteria
  Limnohabitans                       215     -     446   320    230   203   7
Gammaproteobacteria
  Pseudomonas aeruginosa              39      143   -     83     72    155   7
Actinobacteria
  Candidatus Planktophila limnetica   225     141   452   114    233   -     7
Bacteroidetes
  Flavobacterium sp.                  -       93    314   331    74    98    7
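The in silico digestion behind Table A.2 can be illustrated with a minimal Python sketch. It is not the procedure used in this study, only an outline under stated assumptions: the fluorescent label is taken to sit on the 5' end of the strand supplied, the terminal fragment is taken to end at the first recognition site, and a "-" in the table is assumed to mean that the enzyme has no site in the amplicon. The function name and the toy sequence are hypothetical; the recognition sites shown are the standard ones for these four enzymes.

```python
# Recognition sequences and top-strand cut offsets for four of the enzymes
# in Table A.2 (AluI: AG^CT, HaeIII: GG^CC, MspI: C^CGG, RsaI: GT^AC).
ENZYMES = {
    "AluI": ("AGCT", 2),
    "HaeIII": ("GGCC", 2),
    "MspI": ("CCGG", 1),
    "RsaI": ("GTAC", 2),
}


def terminal_fragment_length(amplicon, enzyme):
    """Return the 5'-terminal fragment length in bp, or None if the enzyme does not cut.

    Assumption: the label is on the 5' end of the strand given, so the
    terminal fragment runs from position 0 to the first cut position.
    """
    site, cut_offset = ENZYMES[enzyme]
    first_site = amplicon.upper().find(site)
    if first_site == -1:
        return None  # no recognition site in this amplicon
    return first_site + cut_offset


# Hypothetical toy amplicon for illustration only, not a real 16S sequence.
toy_amplicon = "TTGACCAGCTGGTACCGGCCAA"
for name in ENZYMES:
    print(name, terminal_fragment_length(toy_amplicon, name))
```

Running each enzyme over the same set of reference amplicons and comparing the resulting lengths is one way to see which enzyme separates the expected taxa best.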

Table A.3: Restriction enzymes that were tested to cut 18S SSU rRNA amplicons and corresponding fragments (values represent length in bp). Amplicons of the predicted eukaryotic community were generated in silico. All eukaryotic sequences were obtained from GenBank. The enzyme chosen for actual digestion (FatI) is bolded.

Taxon                      HaeIII  MspI  RsaI  HinfI  AluI  HhaI  FatI
Cryptomonas erosa          84      -     72    -      15    -     186
Staurastrum gracile        141     94    -     79     15    159   227
Cryptomonas marssonii      89      -     -     -      15    74    133
Chlamydomonas noctigama    254     160   -     247    15    -     228
Actinocyclus curvatulus    -       -     259   -      15    -     232
Gymnodinium catenatum      114     -     159   -      15    111   454
Lagerheimia genevensis     77      82    97    162    15    77    227
Ochromonas sp.             -       -     102   118    15    -     233
Monoraphidium contortum    111     152   101   246    15    -     227
Pandorina morum            208     124   -     158    15    -     229
Fragilaria capucina        77      82    103   -      15    276   230
Cryptomonas reflexa        215     -     328   159    15    -     206
Rhodomonas minuta          89      146   -     -      15    -     223
Rhodomonas lens            209     257   -     -      15    81    200
Cosmarium obtusatum        141     125   -     79     15    -     227
Dinobryon divergens        -       -     106   147    15    -     155
Stephanodiscus niagarae    120     82    -     468    15    90    232
Ceratium sp.               114     -     -     -      15    -     229

Table A.4: Bacterial T-RFs detected after PCR amplicons were cut with AluI and data were quality filtered. T-RFs/fragments (in bp) that were present in *>5%, **>10%, or ***>20% of the relative abundance within the total bacterial community at some point during the sampling year are listed.

T-RF 2014 2015 60 * 69 * * 121 * 140 * * 141 * 150 *** 186 ** ** 188 *** *** 190 * ** 202 * * 203 * 205 * 237 * 239 * 240 *** *** 242 * 244 * 245 * 246 * 247 * 268 * 282 *
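The star coding used in Tables A.4 and A.5 can be sketched as follows. This is an illustrative sketch only: it assumes relative abundance is computed as each fragment's peak area divided by the total peak area of the profile, which may differ from the normalization applied in this study, and the function names and peak areas are hypothetical.

```python
def relative_abundance(peak_areas):
    """Convert raw T-RF peak areas into percent of the total profile.

    Assumption: abundance is peak area over total peak area; the thesis
    may have used a different normalization.
    """
    total = sum(peak_areas.values())
    return {trf: 100.0 * area / total for trf, area in peak_areas.items()}


def abundance_flag(percent):
    """Map a relative abundance (%) to the star code used in the tables."""
    if percent > 20:
        return "***"
    if percent > 10:
        return "**"
    if percent > 5:
        return "*"
    return ""  # fragments at or below 5% are not listed


# Hypothetical peak areas keyed by fragment length in bp (made-up values).
profile = {188: 50.0, 240: 30.0, 186: 12.0, 60: 6.0, 121: 2.0}
flags = {trf: abundance_flag(pct) for trf, pct in relative_abundance(profile).items()}
print(flags)
```

Because the tables report the highest category reached at some point during a sampling year, the flag would be taken from the maximum relative abundance of each fragment across all sampling dates in that year.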

Table A.5: Eukaryotic T-RFs detected after PCR amplicons were cut with FatI and data were quality filtered. T-RFs/fragments (in bp) that were present in *>5%, **>10%, or ***>20% of the relative abundance within the total eukaryotic community at some point during the sampling year are listed.

T-RF 2014 2015 72 ** 84 * 90 * * 92 *** 102 ** 124 * 127 * * 129 *** ** 202 ** 209 *** ** 217 * 218 * 222 ** * 223 * 224 * 225 * 226 *** *** 227 ** *** 228 *** 229 *** *** 233 *** *** 235 *** 237 * 280 * 329 * 456 ** ** 489 *

Table A.6: Averages and standard deviations (in parentheses) of water quality parameters collected from the 1 m deep epilimnion water throughout 2014 and 2015. Units for all chemical parameters are mg/L, whereas temperature is in °C and chlorophyll a is in μg/L. Cells highlighted with red font represent the 2014 sampling season while cells highlighted with blue font represent the 2015 sampling season.

Parameter                                           Site
                                                    Inshore            Offshore           Inshore            Offshore
pH                                                  8.44 (0.06)        8.35 (0.13)        9.15 (0.38)        9.42 (0.24)
Optical dissolved oxygen (ODO)                      10.05 (0.56)       10.4 (0.70)        10.27 (2.78)       10.55 (1.88)
Temperature                                         20.76 (1.51)       20.37 (1.30)       21.27 (1.44)       21.67 (2.16)
Nitrate/nitrite (NO3/NO2)                           2.26 (0.34)        2.35 (0.37)        1.93 (0.30)        1.89 (0.30)
Ammonium (NH4+)                                     0.06 (0.072)       0.056 (0.055)      0.02 (0.01)        0.013 (0.0039)
Fluoride (F)                                        0.28 (0.016)       0.29 (0.020)       0.26 (0.01)        0.26 (0.0196)
Chloride (Cl)                                       113.5 (18.14)      116 (17.55)        96.32 (11.81)      102.02 (9.06)
Sulfate (SO4 2-)                                    45.81 (2.77)       47.81 (5.31)       42.52 (2.97)       41 (6.22)
Soluble Reactive Phosphorus (SRP)                   0.0013 (0.00084)   0.0015 (0.0013)    <0.0002            0.00036 (0.00037)
Total Nitrogen-filtered (TN-N-F)                    2.67 (0.52)        2.82 (0.45)        -                  -
Total Nitrogen-unfiltered (TN-N-UF)                 2.9 (0.51)         3.05 (0.52)        -                  -
Total Nitrogen (TN-Kj)                              -                  -                  0.53 (0.06)        0.49 (0.084)
Total Phosphorus (TP)                               0.028* (0.0091)    0.028* (0.0085)    0.04 (0.01)        0.035 (0.010)
Total particulate phosphorus (TPP)                  -                  -                  0.03 (0.01)        0.025 (0.0057)
Total Phosphorus Particulate-filtered (TP-P-F)      0.014 (0.0032)     0.016 (0.0033)     -                  -
Total Phosphorus Particulate-unfiltered (TP-P-UF)   0.039 (0.0063)     0.040 (0.0049)     -                  -
Phosphorus Total Dissolved (TP-D)                   -                  -                  0.01 (<0.001)      0.011 (0.0018)
Chlorophyll a (Chla)                                20.76 (10.38)      22.18 (9.80)       28.28 (16.69)      23.30 (5.9)

*Labeled “P” in metadata; possible mix-up with another phosphorus-containing compound.

Appendix B

Table B.1: The 2014 16S and 18S pools of DNA samples sent for 454 tag-encoded pyrosequencing. Pool identification (ID), dates/site DNA samples that form each pool and number of reads/sequences after quality-filtering (QC) results are listed.

Pool ID (16S)   Dates/Sites Pooled           Reads After QC
1               June 5                       7917
2               June 19                      8446
3               July 2                       5460
4               July 17-July 31              7180
5               Aug 14-Aug 27                5603

Pool ID (18S)   Dates/Sites Pooled           Reads After QC
1               June 5                       6838
2               June 19                      7253
3               July 2-July 17               6207
4               July 31 (inshore)            6217
5               July 31 (offshore)-Aug 14    14762
6               August 27                    7623

Table B.2: Statistics corresponding to the 2015 metagenome sequences before and after quality filtering and annotations by MG-RAST. Sample identification (ID) is represented by month (J=July; A=August; S=September), date, and site (_I=inshore; _O=offshore). This information is also publicly available using the corresponding MG-RAST ID/accession codes.

Sample ID   MG-RAST ID   # Raw Sequences   % Failed QC   % Unknown Protein   % Annotated Protein   % Ribosomal RNA
J30_I       4683869.3    9,067,477         13.4          44                  33.3                  9
J30_O       4683871.3    9,116,735         15.2          31                  45.8                  8
A13_I       4683866.3    8,646,087         15.1          36.6                40                    8.3
A13_O       4683868.3    8,806,424         14.9          28.4                49.3                  7.4
A27_I       4683867.3    9,839,299         15            45.1                30.8                  9.1
A27_O       4683874.3    9,392,774         35            9.1                 50.5                  5.4
S10_I       4683865.3    10,222,512        14.2          51.2                23.4                  11.2
S10_O       4683872.3    9,916,702         11.5          61.1                16.8                  10.8
S24_I       4683873.3    9,503,045         16.4          45.1                28.4                  10.1
S24_O       4683870.3    8,433,605         27.5          22                  44.4                  6.1


Table B.3: Metagenome sequence statistics output by MEGAN. Sequences assigned taxonomy to cellular organisms, viruses or not assigned are given. Sample ID is represented by month (J=July; A=August; S=September), date, and site (_I=inshore; _O=offshore).

Sample   Cellular Organisms   Viruses   Not assigned
J30_I    1,374,249            3296      14,360
J30_O    2,066,405            3076      14,509
A13_I    1,688,815            3884      10,630
A13_O    2,433,785            5321      16,484
A27_I    1,603,972            5387      10,715
A27_O    4,979,529            2545      27,215
S10_I    1,038,600            4932      293
S10_O    526,620              6486      6,319
S24_I    1,253,207            4468      10,330
S24_O    2,760,453            797       24,629

[Tree image: phylogeny of the bacterial OTU representative sequences (labelled “cluster”) with NCBI reference sequences, grouped into Cyanobacteria, Proteobacteria, Planctomycetes, Verrucomicrobia and Actinobacteria; bootstrap values shown at nodes; scale bar 0.050.]

Figure B.1: Molecular Phylogenetic analysis of the Hamilton Harbour bacterial community with NCBI reference organisms. The evolutionary history was inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model. The bootstrap consensus tree inferred from 100 replicates is taken to represent the evolutionary history of the taxa analyzed. The analysis involved 66 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 213 positions in the final dataset. Evolutionary analyses were conducted in MEGA7. NCBI reference numbers for the reference sequences are listed near their assigned taxonomy names while the query OTU representative sequences are marked by “cluster” followed by their assigned OTU identification number. Colors are used to distinguish different phyla.
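For readers unfamiliar with the Kimura 2-parameter model named in the caption, its standard pairwise distance is given below as a reference point. This is the textbook closed form, not a description of MEGA7's likelihood calculation: the maximum-likelihood tree is fit by optimizing the likelihood under the same substitution model rather than by applying this formula directly. Here P and Q denote the proportions of aligned sites that differ by a transition and by a transversion, respectively.

```latex
d = -\frac{1}{2}\,\ln\!\left[\,(1 - 2P - Q)\,\sqrt{1 - 2Q}\,\right]
```

The model allows transitions and transversions to occur at different rates while assuming equal base frequencies.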

Table B.4: Percent abundance of each bacterial OTU in each 2014 sample pool with Greengenes taxonomy assignment. Sample identification is based on DNA pools: 1=June 5, 2=June 19, 3=July 2, 4=July 17-July 31, 5=August 14-August 27.

                                            Pool
Phylum           Greengenes OTU taxonomy      1    2    3    4    5

Actinobacteria   C111                        17    3   11    5    4
                 Actinomycetales             12   22    7    2    4
                 ACK-M1                      27    6   18    4   11
                 Candidatus Rhodoluna        12   29    7    1    1
Cyanobacteria    Cyanobacteria                0    0    0    4    1
                 Dolichospermum               0    0    0    0    1
                 Limnococcus limneticus       0    0    0    0    1
                 Microcystis                  0    0    0    0    3
                 Lyngbya hieronymusii         0    0    0    0    4
                 Planktothrix                 0    0    0    0    2
                 Synechococcus                1    2   10   53   18
Planctomycetes   Isosphaeraceae               0    0    0    1    0
                 Pirellulaceae                0    0    2    7    3
Proteobacteria   Rhizobiales                  0    0    1    3    3
                 Beijerinckia                 0    0    2    1    1
                 Rhodobacter                  0    5    0    0    1
                 Pelagibacteraceae            0    0    0    0    2
                 Alcaligenaceae               2    3    4    1    1
                 Burkholderiaceae             0    0    1    0    2
                 Limnohabitans                1    3    0    0    2
                 Methylibium                  0    0    0    0    2
                 Xanthomonadaceae             0    0    2    0    0
Verrucomicrobia  Verrucomicrobiaceae          0    2    1    0    0

[Figure B.2: maximum-likelihood tree of eukaryotic OTU representative sequences (“cluster” labels) with NCBI reference sequences. Groups labeled on the tree: Metazoa, Cryptophyceae, Fungi, Holozoa, Chloroplastida, Rhizaria, Stramenopiles, Alveolata. Scale bar: 0.050.]

Figure B.2: Molecular Phylogenetic analysis of the eukaryotic community with NCBI reference organisms. The evolutionary history was inferred by using the Maximum Likelihood method based on the Kimura 2-parameter model. The bootstrap consensus tree inferred from 100 replicates is taken to represent the evolutionary history of the taxa analyzed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (100 replicates) is shown next to the branches. The analysis involved 86 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 165 positions in the final dataset. Evolutionary analyses were conducted in MEGA7. NCBI reference numbers for the reference sequences are listed and the OTU representative sequences are marked by “cluster” followed by their unique OTU identification number. Colors are used to distinguish different phyla.

Table B.5: Percent abundance of each eukaryotic OTU in each 2014 sample pool with SILVA taxonomy assignment. Sample identification is based on DNA pools: 1=June 5, 2=June 19, 3=July 2-July 17, 4=July 31 (inshore), 5=July 31 (offshore)-August 14, 6=August 27.

                                            Pool
Phylum           SILVA OTU taxonomy           1    2    3    4    5    6

Rhizaria         Thecofilosea                 0    0    0    0    0   40
Stramenopiles    Peronosporomycetes           0   12    3    0    0    2
                 Thalassiosira                2    0    0    0    0    0
Cryptomonadales  Cryptomonas                  6    6    0    0    0    6
Chloroplastida   Carteria                     0    0    0    0    0    5
                 Coelastrum                   0    2   14    3    0    3
                 Desmodesmus                  1    3    5    2    0    2
                 Pseudopediastrum             0    1    1    0    0    0
                 Chlorophyceae                0    0    0    0    0    2
                 Tetrachlorella               0    0    5    1    0    1
Holozoa          Ichthyophonae                0    1    1   59    2    0
Metazoa          Rotifera                    10   23   27    6    1    7
                 Maxillopoda                 52   13    1    0    0    0
                 Heteroconchia                0    3    0    0    0    6
Alveolata        Scrippsiella                 0    0    6    0   90    3
                 Oxyrrhis                     0   26    2    1    0    0
                 Oligohymenophorea            0    0    0    0    0    1
                 Codonella                    2    0    0    0    0    0
                 Choreotrichia                0    0    0    8    0    0
Fungi            Uncultured Fungus            0    0    5    0    0    0
                 Chytridiomycota              6    2   11    2    0    1
Unclassified     Unclassified Eukaryote       0    0    0    4    0    0

Table B.6: Percent abundance of dominant phyla and genera representing >1% of the total relative abundance in each of the 2015 metagenomes. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).

Taxonomy                                                              Relative Percentage Within Each Metagenome (%)
Phylum            Genus                                                J30_I J30_O A13_I A13_O A27_I A27_O S10_I S10_O S24_I S24_O

Gemmatimonadetes  Gemmatimonas                                             1     2     0     1     0     0     0     1     1     1
Proteobacteria    Rickettsia                                               0     0     0     0     0     0     0     0     1     4
                  Porphyrobacter                                           2     4     1     1     0     0     0     0     0     0
                  Limnohabitans                                            2     3     2     1     1     1     2     2     2     1
                  unclassified Betaproteobacteria (miscellaneous)          1     1     0     0     0     0     1     1     1     1
                  Aeromonas                                                0     0     0     0     0     0     0     0     0     2
                  Acinetobacter                                            0     0     1     2     0     0     0     0     0     0
                  Silanimonas                                              0     1     1     1     1     1     0     0     0     0
                  Stenotrophomonas                                         0     0     6     0     0     0     0     0     0     0
Planctomycetes    Rhodopirellula                                           2     1     1     1     2     1     4     6     1     3
Actinobacteria    Acidimicrobium                                           9     4     8     5     2     1     1     1     1     1
                  unclassified                                             3     1     3     2     1     1     0     0     0     0
                  [name missing]                                           0     0     1     0     0     0     1     1     0     2
                  ac1 cluster (ACK-M1)                                     5     7     8     6     3     2     3     3     1     1
                  unclassified Actinobacteria (class) (miscellaneous)     13     9    12     7     4     3     3     3     1     1
Cyanobacteria     Microcystis                                              6    30    10    41    19    18     9     6    17    20
                  Synechococcus                                            3     1     1     0     0     0     1     1     0     0
                  Arthrospira                                              0     0     0     0     0     0     0     0     1     4
                  Kamptonema                                               0     0     0     0     0     0     0     0     0     1
                  Limnoraphis                                              2     4    15     9    33    42    26    23    34    14
                  Lyngbya                                                  1     2     6     3    13    16     8     9    13    11
                  Microcoleus                                              0     0     0     0     0     0     0     0     0     2
                  Pseudanabaena                                            0     0     0     1     0     0     0     0     0     0
Eukaryote         Daphnia                                                  6     3     2     0     0     0     3     3     0     5
                  Operophtera                                              0     0     0     0     0     0     0     1     0     0
                  Helobdella                                               0     0     0     0     0     0     1     2     1     0
                  Nematostella                                             0     0     0     0     0     0     1     1     0     0
                  Phaeodactylum                                            0     0     0     0     0     0     1     1     1     3
                  Thalassiosira                                            4     0     1     0     0     0     2     1     1     3
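The >1% cutoff described in the caption of Table B.6 can be sketched as follows. This is only an illustration under assumed inputs: the function names `percent_abundance` and `dominant_taxa`, the taxon labels, and the read counts are hypothetical and are not taken from the 2015 metagenomes or from the annotation pipeline used in this work.

```python
def percent_abundance(counts):
    """Convert raw per-taxon read counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def dominant_taxa(counts, threshold=1.0):
    """Keep only taxa whose relative abundance is strictly above threshold (%)."""
    return {
        taxon: pct
        for taxon, pct in percent_abundance(counts).items()
        if pct > threshold
    }


if __name__ == "__main__":
    # Hypothetical read counts for one metagenome
    example_counts = {"taxon_a": 50, "taxon_b": 49, "taxon_c": 1}
    for taxon, pct in sorted(dominant_taxa(example_counts).items()):
        print(f"{taxon}: {pct:.0f}%")
```

Because the comparison is strict, a taxon at exactly 1% is excluded, which matches the ">1%" wording of the caption.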

Appendix C

Inshore Community SEED Functions

[Panel a): bar chart of SEED functional subsystem categories (Amino Acids and Derivatives through Virulence, Disease and Defense). x-axis: Relative Abundance (%), 0 to 0.16. Series: J30_I, A13_I, A27_I, S10_I, S24_I.]


Offshore Community SEED Functions

[Panel b): bar chart of SEED functional subsystem categories (Amino Acids and Derivatives through Virulence, Disease and Defense). x-axis: Relative Abundance (%), 0 to 0.18. Series: J30_O, A13_O, A27_O, S10_O, S24_O.]

Figure C.1: Functional subsystems plotted with relative abundances for the 2015 metagenomes at the a) inshore site and b) offshore site. Sample identification is represented by month (J=July; A=August; S=September), sampling day, and site (_I=inshore; _O=offshore).