<<

MOLECULAR SYSTEMATICS OF

CYANOBACTERIA

Submitted by Elvina Lee Mei Hui, BSc (Hons)

This thesis is presented for the degree of Doctor of Philosophy of Murdoch University

i

DECLARATION

I declare that this thesis is my own account of my research and contains as its main content work, which has not been previously submitted for a degree at any tertiary education institution.

______

Elvina Lee January 2016

ii

ABSTRACT

Cyanobacteria constitute a phylum of ubiquitous cosmopolitan with the ability to perform oxygenic photosynthesis. Their ancient origins, ecological and economic potential, biotechnological applications and impact on water systems have attracted much interest from the academia, industry, health authorities and regulators. Despite this, cyanobacteria classification and nomenclature still remains difficult. One of the aims of this project was to further our understanding of cyanobacteria systematics by (1) testing polyphasic characterization methods and (2) examining the effect of various phylogenetic reconstruction strategies. Additionally, (3) Next Generation Sequencing (NGS) assays using novel cyanobacteria 16S rDNA targeted primers were implemented to provide better taxa resolution than that offered by “universal” 16S rDNA primers. Cyanobacteria strains from various water sources in Australia were isolated, characterised at four loci commonly used for cyanobacteria molecular classification, and tested for the presence of genes implicated in toxin and terpene (odour) production. A total of 55 novel cyanobacterial strains were isolated and maintained in culture, forming the first known collection of cyanobacteria isolates from Western Australia. Comparison of molecular– and morphology– based identifications not only showed the limitations of the current methods (only 45% of the isolates showed agreement) but also provided the opportunity to suggest guidelines and conceive a way forward towards more effective identification approaches. Examination of alternative phylogenetic markers, workflows and stringencies showed that between alignment algorithms, alignment curations and tree building methods, the latter had the greatest effect on tree topology. This result was consistent regardless of locus, alignment and curation strategy employed. Finally, two sets of novel cyanobacteria-targeted primers were designed for use with NGS technologies. As compared to the universal 16S rRNA primers, these primers showed higher specificity and preferential amplification of cyanobacteria and DNA. Of the sequences obtained using these two new primer pairs, cyanobacteria sequences comprised 50.5% and 54.4%, while proteobacteria sequences comprised 44.5% and 40.3% respectively. In comparison, with the universal 16S rRNA primers, cyanobacteria and proteobacteria comprised 15.3% and 33.4% respectively of the sequences analysed.

iii

Using morphological and molecular methods, this project provides a snapshot of the as yet unstudied freshwater cyanobacterial diversity found in Western Australia using polyphasic methods. The limitations of the current identification approaches, uncovered during the first phase of the project, were harnessed to develop a method to assess the variability of phylogenetic reconstructions. Finally, novel cyanobacteria specific NGS primers demonstrated how adopting the latest NGS technology represents a promising advance in the molecular investigation of cyanobacteria.

iv

TABLE OF CONTENTS

Declaration ...... ii Abstract ...... iii Table of Contents ...... v Acknowledgements ...... vii Table of Figures ...... viii List of Tables ...... xi List of Abbreviations ...... xii CHAPTER 1. General Introduction ...... 1 1.1. Introduction ...... 1 1.2. Characteristics and habitat ...... 2 1.3. Public and environmental implications of cyanobacteria ...... 5 1.3.1. Effect of blooms ...... 5 1.3.2. Toxins and their effects ...... 8 1.3.3. Taste and odour ...... 13 1.4. Management of cyanobacteria ...... 14 1.4.1. Chemical methods ...... 14 1.4.2. Physical methods ...... 15 1.4.3. Biological and management methods ...... 16 1.5. : difficulties in nomenclature and identification...... 16 1.6. Methods used to identify and detect cyanobacteria ...... 20 1.6.1. Morphological identification ...... 20 1.6.2. Molecular methods ...... 21 1.6.3. Polyphasic approaches ...... 27 1.7. Aims ...... 29 CHAPTER 2. General Materials & Methods ...... 30 2.1. Water Sources ...... 30 2.2. Isolation and maintenance of cyanobacteria ...... 30 2.3. Extraction of cyanobacterial DNA ...... 31 2.4. PCR amplification and sequencing of target genes ...... 31 2.4.1. Conventional Standard PCR ...... 31 2.4.2. Quantitative PCR (qPCR) ...... 32 2.5. Sequencing and analysis of amplified products ...... 32 2.5.1. Sanger sequencing ...... 32 2.5.2. Analysis of sequenced chromatograms ...... 32 CHAPTER 3. Polyphasic identification of cyanobacterial isolates from Australia34 3.1. Introduction ...... 34 3.2. Materials and Methods ...... 35 3.2.1. Isolation and cultivation of cyanobacterial strains ...... 35 3.2.2. Morphological identification ...... 36 3.2.3. Molecular and phylogenetic analyses ...... 36 3.3. Results ...... 37 3.3.1. Comparison of morphological data ...... 39 3.3.2. Comparison of molecular identification and phylogenetic analysis at different loci 45 3.3.3. Comparison between morphological and molecular identifications ...... 51 3.3.4. Identification of novel isolates ...... 52 3.4. Discussion ...... 52 CHAPTER 4. Effect of alignment, curation and reconstruction methods on the molecular systematics of Cyanobacteria ...... 57 4.1. Introduction ...... 57 4.2. Materials and Methods ...... 59 4.2.1. Isolation, cultivation and sequencing of cyanobacterial isolates ...... 59 4.2.2. Multiple sequence alignment and curation ...... 59 v

4.2.3. Tree-building and multivariate analyses ...... 60 4.3. Results ...... 61 4.3.1. Comparison of alignment algorithms and effect of curation ...... 61 4.3.2. Comparison of phylogenetic reconstructions ...... 62 4.4. Discussion ...... 70 CHAPTER 5. Novel phylum selective primers for use with next generation sequencing technologies ...... 74 5.1. Introduction ...... 74 5.2. Materials and Methods ...... 75 5.2.1. Cyanobacteria cultures, sampling and DNA extraction ...... 75 5.2.2. Primer design and in silico validation ...... 76 5.2.3. Thermal cycling conditions...... 77 5.2.4. Library Preparation and NGS ...... 78 5.2.5. Bioinformatics analysis ...... 81 5.3. Results ...... 82 5.3.1. Evaluation of new primers ...... 82 5.3.2. NGS microbiome analysis of water samples ...... 84 5.3.3. Bacteria composition of freshwater water samples...... 89 5.4. Discussion ...... 90 CHAPTER 6. General Discussion ...... 94 6.1. Overview ...... 94 6.2. Cyanobacteria classification ...... 94 6.3. Cyanobacteria systematics ...... 96 6.4. Cyanobacteria-targeted primers for NGS technologies ...... 97 6.5. Future directions ...... 99 6.6. Conclusion ...... 101 References ...... 103 APPENDIX ...... 129

vi

ACKNOWLEDGEMENTS

I would like to acknowledge and thank my supervisors Professor Una Ryan and Dr Andrea Paparini for their guidance, experience and time they dedicated to my work. Without your patience and steady guidance, this thesis would not have been possible. Many thanks to the following people for their help, expertise discussions and criticisms, which have helped advanced this PhD project. These people include Dr Paul Monis, W/Prof. Andrew Whiteley, Prof Michael J Wise, Dr Steven Giglio, Dr Glenn B. McGregor, Stuart Helleren, Mitchell Ranger and Ling Hau, and Maninder S Khurana. Thanks to my friends and colleagues in the Vector and Waterborne Pathogen Research Group, and the many people in the State Agricultural and Biotechnology Centre for their help, companionship, support and assistance during the time spent there: Melanie Koinari, Alexandar Gofton, Rongchang Yang, Alireza Zahedi Abdi, Khalid Al Habsi, Frances Briggs, Steven Wylie and all the other fellow postgraduate students who slave away in the labs . I would like to acknowledge financial support from Murdoch University through the ARC linkage associated Murdoch Scholarship and project funding from the Australian Research Council Water Corporation of Western Australia and Murdoch University. Special thanks to my housemates, Sadia Iqbal, Masood Anwar, Jessie Sin, Alicia Lim and Koh Wan Hon. Thank you for putting up with my crazy moods, complaints, vampiric habits and for keeping me fed all these years. Also thanks Jamie, Shu Hui, Narelle, Jo-anne for just being there and prodding me along in my journey. Most importantly, I would like to thank my family for supporting me in journey, helping me through my rough patches and putting up with my constant absence as I worked my way through this PhD.

vii

TABLE OF FIGURES

Figure 1.1: Specialised cyanobacteria cells. A (image taken from (Rippka et al., 2015b)): Phase contrast image of fibrous outer wall layers of parental cells emptied of baeocytes, in a liquid culture of Stanieria (PCC 7437); B and C (images taken from (Sciuto and Moro, 2015)): Light microscopy of filamentous Nostocales cyanobacteria; B: Heterocytes both at the end of the filaments (terminal heterocytes, black arrows) and within the filaments (intercalary heterocytes, white arrows); C: Chain of akinetes (black arrow). Insert: High magnification image of scanning electron comparing the sizes of the vegetative cell (v) and an akinete (a)...... 3 Figure 1.2: Benthic (Phormidium spp.) cyanobacteria mats. Taken from Quiblier et al. (2013) ...... 6 Figure 1.3: Toxic cyanobacteria in the Canning River (WA) (Trust, 2005) ...... 6 Figure 1.4: Cyanobacteria maximum likelihood (ML) species tree produced by concatenation of 31 conserved proteins (modified from Shih et al. (2013); Branches were colour coded according to morphological subsection. Taxa names in red were genomes sequenced in the study by Shih et al. (2013). Black dots indicate nodes with > 70% bootstrap support. The seven major subclades observed were found to show similar distribution to the traditional five subsections (Shih et al., 2013)...... 22 Figure 1.5: Simple schematic displaying the differences between , and (adapted from (Harrison and Langdale, 2006)). Isolates A and B within the green box form a monophyletic clade; Isolates A to G within the blue box form a paraphyletic clade; while Isolates A and E (red circles) are polyphyletic to each other. 26 Figure 3.1: Number of cyanobacteria sequences within GenBank for the 16S, rpoC1 and phycocyanin loci over time. Data obtained from NCBI Nucleotide Database (http://www.ncbi.nlm.nih.gov/nuccore). References indicate papers characterising or reporting changes to cyanobacteria taxonomy and nomenclature. Figure updated from (Lee et al., 2014) ...... 35 Figure 3.2: Number of cyanobacterial isolates identified at order, genus or species-level, using molecular and morphological methods. Depending on the isolate, morphological identifications were carried out by either one (unique identifications), or two independent taxonomists (replicate identifications). Where isolates were examined in duplicate, the number of isolates providing agreement between the replicate identifications is shown...... 40 Figure 3.3: Maximum Likelihood tree based on the 16S rDNA sequences (313 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.05 substitutions per site. The outgroup was removed to facilitate the visualisation of the isolates. *Sequence previously submitted to GenBank with accession number AF247573...... 47 Figure 3.4: Maximum Likelihood tree based on the rpoC1 sequences (409 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.1 substitutions per site. The Synechococcus cluster containing GS6-1, Vasse River types 9 and 13 were removed to facilitate visualisation of the other isolates...... 48 Figure 3.5: Maximum Likelihood tree based on the cpcBA-IGS sequences (423 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.2 substitutions per site...... 49 Figure 3.6: Comparison of the extent of agreement between molecular identifications of cyanobacterial isolates with at least two successfully sequenced loci. All three loci (16S viii

rDNA, rpoC1 and cpcBA-IGS) were included in the analysis, so as to allow visualisation of all isolates that were successfully sequenced at > two loci. The numbers indicate the isolates for which agreement between the loci was found at species level (panel A), or at least genus level (i.e. identification of isolates were the same to either species or genus level) (panel B)...... 50 Figure 4.1: Overview of the workflow performed. Cyanobacterial sequences from four loci (16S rDNA, rpoC1, 16S-23S-ITS or cpcBA-IGS) were aligned using five alignment algorithms (ClustalW, MAFFT, MUSCLE, PAGAN, PRANK), prior to optional alignment curation (i.e. de-gapping). Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI). Multivariate analysis was performed based on a number of metrics computed from each . The output of the multivariate analysis consisted of principal component analysis (PCA) plots highlighting the topological differences between trees, generated using alternative methods adopted during steps 1 to 3...... 60 Figure 4.2: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 63 Figure 4.3: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 63 Figure 4.4: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the cpcBA-IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 64 Figure 4.5: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the 16S-23S ITS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 64 Figure 4.6: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 65 Figure 4.7: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 66 Figure 4.8: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the cpc-BA IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI) ...... 66 Figure 4.9: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the 16S-23S ITS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI). The two trees that clearly intersect the opposite cluster are indicated by the red arrows and have their reconstruction option underlined...... 67 Figure 4.10: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 68 Figure 4.11: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 68 ix

Figure 4.12: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the cpcBA-IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 69 Figure 4.13: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the 16S- 23S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)...... 69 Figure 4.14: Principal component analysis (PCA) plot highlighting the topological differences between trees generated using various combinations of phylogenetic reconstruction options and starting from cyanobacterial sequences from four loci: 16S rDNA, rpoC1, 16S-23S-ITS or cpcBA-IGS. Multivariate analysis was performed based on a number of metrics computed from each phylogenetic tree...... 70 Figure 5.1: Screen capture from Geneious (Biomatters, NZ) showing cyanobacteria 16S rDNA consensus sequence mapped onto E. coli K-12 substrain MG1655 (NR10284) Annotations indicate the positions of the hypervariable regions (blue boxes), and primer binding sites. Black arrows indicate positions of reference primers, green/grey arrows indicate the positions of primers designed in this study...... 80 Figure 5.2: Composition (percentage) of phyla amplified by the primer sets as predicted by in silico testing using TestPrime, against the SILVA database, with the recommended settings (1 mismatch +5 bases). Phyla forming less than 0.01% of the predicted amplifications were omitted...... 83 Figure 5.3: Alpha diversity observed from the sampled subset for each primer set. Grey diamonds represent the median number of OTUs observed by each primer pair; boxes give the 25% and 75% percentiles for each primer pair, error bars indicate standard errors...... 85 Figure 5.4: Proportion of sequences observed by each primer set for each phyla (from the sampled subset). Phyla with < 0.5% abundance were combined...... 86 Figure 5.5: Comparison of taxonomic coverage of cyanobacteria and proteobacteria genera by primer...... 86 Figure 5.6: Percentage phyla composition of the Freshwater samples from sampled dataset, as observed by each primer pair. Phyla with < 0.1% abundance were combined...... 89

x

LIST OF TABLES

Table 1.1: Factors affecting cyanobacteria growth and bloom formation ...... 7 Table 1.2: Summary of toxins produced by cyanobacteria and their effects...... 11 Table 1.3: Cyanobacteria classification and characteristics...... 19 Table 3.1: Collection sites and identities of the 39 cyanobacterial isolates used in the present study...... 38 Table 3.2: Identification of the 39 cyanobacterial isolates used in the study, based on 16S rDNA, rpoC1, cpcBA-IGS sequences and isolate morphology. Percentage similarity at each locus was calculated in MEGA 5 (Tamura et al., 2011), as the pairwise evolutionary divergence, with a p-distance model (Kimura, 1980)...... 41 Table 5.1: Environmental water sample collection sites...... 79 Table 5.2: Primers used in the study...... 80 Table 5.3: Percentage composition of cyanobacteria sequences from total predicted sequences amplified via in silico analyses using the SILVA and RDP databases...... 84 Table 5.4: Taxonomic breakdown to Class/Order observed for cyanobacteria and proteobacteria...... 87 Table 5.5: Illustrative comparison of some potentially problematic taxa detected by the primer pairs used in the present study. Listed genera include those containing medically important organisms of human and and toxic bloom-forming microalgae ...... 88

xi

LIST OF ABBREVIATIONS

16S rDNA 16S ribosomal DNA 16S-23S ITS Internal transcribed spacer region between the 16S and 23S rRNA genes A. Cyanobacterial genus Anabaena An. Cyanobacterial genus Anabaenopsis Ap. Cyanobacterial genus Aphanizomenon AWQC Australian Water Quality Centre BI Bayesian Inference BIC Bayesian Information Criterion bp Base pair cpcBA-IGS Phycocyanin operon and the variable intergenic spacer region C. Cyanobacterial genus Cylindrospermum D. Cyanobacterial genus Dolichospermum DNA Deoxyribonucleic acid EMP Earth Microbiome project G. Cyanobacterial genus Geitlerinema gyrB DNA gyrase subunit β protein gene hetR Heterocyte differentiation gene HGT HMP project ICN International Code for Algae, Fungi and ICBN International Code of Botanical Nomenclature ICNP International Code for Nomenclature of INDELs Insertion or deletion of bases ITS Internal transcribed spacer L. Cyanobacterial genus Limnothrix M. Cyanobacterial genus Microcystis MALDI-TOF Matrix assisted laser desorption/ionisation – time of flight MEGAN MEtaGenome ANalyzer MID Multiplex identifier ML Maximum likelihood MP Maximum parsimony MS Mass spectrometry MSA Multiple sequence alignment MYA Million years ago N. Cyanobacterial genus Nostoc NGS Next Generation Sequencing nifH Gene coding for the dinitrogenase reductase enzyme NJ Neighbour-joining O. Cyanobacterial genus Oscillatoriales OTC Odour threshold concentration OTU Operational taxonomic unit

xii

P. Cyanobacterial genus Planktothrix PC-IGS Phycocyanin intergenic spacer region PCA Principal component analysis PCR Polymerase chain reaction Ps. Cyanobacterial genus Pseudanabaena p-distance Pairwise distance qPCR Quantitative polymerase chain reaction RAPD Randomly amplified polymorphic DNA rbcL Gene coding for the large subunit of ribulose 1, 5 diphosphate enzyme rDNA Ribomosomal deoxyribonucleic acid RDP Ribosomal Database Project RFLP Restriction fragment length polymorphism rpoB β subunit of RNA polymerase rpoC1 γ subunit of the DNA-dependent RNA polymerase rpoD1 σ subunit of RNA polymerase S. Cyanobacterial genus Sphaerospermopsis STRR Short tandemly repeated repetitive sequence T. Cyanobacterial genus Trichormus T/O Taste and odour WGP Whole genome phylogeny WHO World Health Organisation WWTP Waste water treatment

xiii

CHAPTER I: GENERAL INTRODUCTION CHAPTER 1. GENERAL INTRODUCTION

1.1. Introduction Cyanobacteria, also known as blue-green algae, are a group of ubiquitous bacteria (Whitton and Potts, 2000, Fogg et al., 1973). They are important primary producers (Capone et al., 1997, Castenholz and Waterbury, 1989), with the ability to synthesise organic compounds through the process of photosynthesis (Cohen and Gurevitz, 2006, Komárek, 2006, Pitois et al., 2000). Some species also have the ability to fix atmospheric nitrogen (Whitton and Potts, 2000, Kumar et al., 2010), allowing them to colonise nitrogen-poor or nitrogen-depleted environments and play a significant role in the nitrogen cycle (Tomitani et al., 2006, Capone et al., 1997, Severin et al., 2010). Assumed to have evolved 3.5 billion years ago, cyanobacteria are thought to be the first oxygen evolving organisms and are believed to have played a major role in the transition of the Earth’s atmosphere from its initial anaerobic state to the current aerobic conditions (Fogg et al., 1973, Kumar et al., 2010, Wilmotte, 2004). Although debated by some authors (Stiller et al., 2003, Graur and Martin, 2004), phylogenetic analysis (Falcon et al., 2010) suggests that a free-living cyanobacterium with the capacity to store starch through oxygenic CO2 fixation and fix atmospheric N2, would have become the chloroplast of photosynthetic by endosymbiosis in the mid-Proterozoic (~1,300 MYA). The long evolutionary history of this phylum (Blankenship, 1992, Giovannoni et al., 1988, Tomitani et al., 2006) has enabled it to become adapted to virtually all environments, both aquatic and terrestrial (Cohen and Gurevitz, 2006, Fogg et al., 1973, Whitton and Potts, 2000, Bold and Wynne, 1985, Metcalf et al., 2012). This, together with their ecological (oxygen production, atmospheric nitrogen and carbon fixers) and economic (via biotechnology, biofuel, therapeutic compound production) significance, has resulted in increasing interest in members of this phylum (Reynolds, 2006, Sciuto and Moro, 2015). Furthermore, in recent history, the potential impacts of bloom-forming species resulting from the increasing eutrophication of the biosphere (Granéli and Turner, 2006, Whitton and Potts, 2000, Komárek, 2006), have also prompted increased studies on this group of organisms. Despite recent advances in cyanobacteria classification and nomenclature, the ability to accurately identify environmental isolates of cyanobacteria still remains challenging. As such, the two most commonly utilised techniques for cyanobacteria 1 CHAPTER I: GENERAL INTRODUCTION identification will be discussed in this chapter. This chapter will also attempt to discuss how the increase in information available in public databases and newly developed techniques such as next generation sequencing (NGS) can aid in the process of cyanobacteria identification.

1.2. Characteristics and habitat Cyanobacteria are a diverse group of eubacteria (gram-negative bacteria) that are capable of anaerobic metabolism (Cohen and Gurevitz, 2006). In aquatic environments, cyanobacteria occur as either free floating (planktic) or substrate associated (benthic) species (Bellinger and Sigee, 2010), displaying both unicellular and colonial forms such as trichomes (threads of vegetative cells), or as colonial structures enclosed within a sheath or matrix (Bold and Wynne, 1985, Reynolds, 2006). Despite lacking organelles to perform various metabolic functions, cyanobacteria are able to utilise various structures with specialised functions such as gas vesicles, thylakoids, various granules, glycogen granules and carboxysomes (polyhedral-bodies containing carbon fixation enzymes) (Castenholz and Waterbury, 1989). In addition to these specialised internal structures, cyanobacteria also have the ability to form specialised cells, such as baeocytes (reproductive cells), akinetes (resting cells), heterocytes (N2 fixation cells) (Figure 1.1) and hormogonia (motile filaments of cells), for a variety of purposes (Fogg et al., 1973, Sciuto and Moro, 2015).

2 CHAPTER I: GENERAL INTRODUCTION

Figure 1.1: Specialised cyanobacteria cells. A (image taken from (Rippka et al., 2015b)): Phase contrast image of fibrous outer wall layers of parental cells emptied of baeocytes, in a liquid culture of Stanieria (PCC 7437); B and C (images taken from (Sciuto and Moro, 2015)): Light microscopy of filamentous Nostocales cyanobacteria; B: Heterocytes both at the end of the filaments (terminal heterocytes, black arrows) and within the filaments (intercalary heterocytes, white arrows); C: Chain of akinetes (black arrow). Insert: High magnification image of scanning electron microscope comparing the sizes of the vegetative cell (v) and an akinete (a).

Most planktic cyanobacteria usually contain gas vesicles (Castenholz and Waterbury, 1989). In aquatic environments, these structures are used for buoyancy regulation and to maintain the organisms’ position within the water column, where growth conditions are optimal (Ganf and Oliver, 1982, Reynolds et al., 1987, Fogg et al., 1973). This allows cyanobacteria to overcome growth impediments resulting from the separation of light and nutrients, especially in environments where vertical mixing is reduced due to thermal stratification (Steffensen et al., 1999). There is little variation in gas vesicles within cyanobacterial species in the same habitat, however between the different cyanobacterial genera gas vesicles vary in length from 200 nm to 2 µm and diameters between 40 and 110 nm (Reynolds et al., 1987, Walsby, 1994). Under optimal light harvesting conditions, cyanobacteria are able to utilise both chlorophyll a and photosystems I and II for photosynthesis (Cohen and Gurevitz, 2006, Komárek, 2006). Furthermore, the presence of specialised phycobiliproteins- phycocyanin and allophycocyanin, give cyanobacteria their characteristic blue-green colour, while the energy-focusing phycoerythrin gives the cells a red or black pigmentation (Vincent, 2009, Castenholz and Waterbury, 1989). These accessory light- 3 CHAPTER I: GENERAL INTRODUCTION capturing protein pigments are located on phycobilisomes found on the photosynthetic thylakoid membranes within the cells (Vincent, 2009, Castenholz and Waterbury, 1989) and are also important for photosynthesis. Under optimal growth conditions, most cyanobacteria divide through the process of binary fission, whereby either all or almost all of the envelope layers grow inwards until cell separation is complete (Castenholz, 2001, Fogg et al., 1973). Depending on whether cell fission occurs in one or more planes, resultant cyanobacteria colonies may be either orderly or disorderly. In sheathed colonies that divide in a single plane, cyanobacteria form trichomes (chain of cells without an investing sheath (Castenholz, 2015c)). Under favourable and/or stressful conditions, random breakages of trichomes occur, resulting in the formation of motile (gliding) reproductive fragments (hormogonia) (Rippka et al., 1979, Castenholz, 2001). In addition to binary fission, a subset of unicellular cyanobacteria (Pleurocapsales) show alternative reproduction strategies (Angert, 2005). This process releases tiny, unicellular structures called baeocytes (Figure 1.1, Panel A), which upon release from the parent cell, may have momentary gliding motility before initiation of growth to continue the vegetative cycle (Castenholz, 2001, Waterbury, 2006, Rippka, 1988, Angert, 2005). In addition to the formation of specialised reproductive cells, cyanobacteria are diazotrophs with the ability to fix atmospheric gaseous nitrogen (diazotrophs) (Komárek, 2006) and may also form heterocytes. These are cells that provide fixed atmospheric nitrogen for the adjacent vegetative cells in the filament (Kumar et al., 2010). Internally, these highly specialised cells are anoxic to protect the enzyme nitrogenase from inactivation by oxygen, and lack photosystem II and carbon fixation abilities (Kumar et al., 2010). Heterocytes also usually contain cyanophycin granules at their poles and can be distinguished by their slightly larger and rounder shapes, have reduced pigmentation and thicker cell envelopes. They can occur terminally, or at regular intervals along the filament with different species having different developmental patterns (Figure 1.1, Panel B) (Kumar et al., 2010, Pitois et al., 2000). During adverse conditions (nutrient deficiency and/or light limitation), many heterocystous cyanobacteria can also form akinetes (Figure 1.1, Panel C). These specialised cells are granular, bigger than vegetative cells and have additional thickened cell walls; adaptations which allow the cells to withstand freezing, drying and long-term storage and germinate when the external conditions improve (Castenholz and Waterbury, 1989, Reynolds, 2006, Castenholz, 2015b).

4 CHAPTER I: GENERAL INTRODUCTION

1.3. Public and environmental implications of cyanobacteria Cyanobacteria have the potential to cause significant socio-economic implications for the water, tourism and food industries as many species can form thick blooms and are capable of producing harmful toxins and/or odorous metabolites. These effects shall be briefly discussed below.

1.3.1. Effect of blooms Most cyanobacterial blooms produce visible colouration of a water body due to the presence of suspended, subsurface or benthic mats of cells, filaments and/or colonies (Figure 1.2) (Fristachi et al., 2008). Dense populations of bloom forming species are common in freshwater bodies (Cohen and Gurevitz, 2006) and have increased worldwide as a result of (anthropogenic) eutrophication and climate change (O'Neil et al., 2012). In freshwater environments, cyanobacteria from the Dolichospermum, Aphanizomenon, Chrysosporum, Microcystis, Nodularia and Oscillatoria genera are the main bloom causing agents (Ressom et al., 1994). Dense cyanobacteria blooms can suppress macrophytes due to nutrient competition and increased turbidity, adversely affecting invertebrate and fish habitats; fish loss can result in decreased biodiversity and food web changes due to oxygen depletion (hypoxia/anoxia) during bloom decline and decomposition (Paerl and Otten, 2013, Falconer, 2004a). This, in turn, results in economic losses through the negative impacts on recreation, tourism and aquaculture industries (Robson and Hamilton, 2003, Chorus and Bartram, 1999, Paerl and Otten, 2013, Carmichael, 2001, Dionysiou, 2010). It has been estimated that in the USA, an average of $50 million a year is spent on monitoring and managing the problems caused by blooms (Dionysiou, 2010); while in Australia, freshwater algal blooms cost water users more than $200 million annually (Figure 1.3) (Chudleigh et al., 2000, Atech, 2000). An example of cyanobacteria blooms and their effects occurred in January 2000, in the Swan River estuary (Perth, WA), where heavy summer rainfalls caused a bloom of Microcystis aeruginosa (Robson and Hamilton, 2003). During this period, toxic M. aeruginosa densities exceeded 100,000 cells mL-1 (Atkins et al., 2001), twenty times greater than the usual cell density of 5,000 cells mL-1 (Hosja and Deeley, 1994); this resulted in the Swan River being closed from recreational activity for more than ten days (Robson and Hamilton, 2003, Atkins et al., 2001).

5

CHAPTER I: GENERAL INTRODUCTION Furthermore, the possible production of secondary compounds (taste- and- odour and/or toxic compounds) during blooms not only result in increased water treatment and monitoring costs, but can also result in deaths from consumption of contaminated water (Fristachi et al., 2008, Steffensen et al., 1999, Falconer, 2004a, Carmichael, 1998, Quiblier et al., 2013, Paerl and Otten, 2013, Dionysiou, 2010, Gugger et al., 2002, Gugger et al., 2005, Lyra et al., 2001). Studies have shown that bioaccumulation of cyanotoxins can occur in plants (terrestrial and aquatic) and aquatic organisms (fish, shrimps, oysters, mussels, snails) posing a potential risk to human health (Drobac et al., 2013, Chorus and Bartram, 1999, de la Cruz et al., 2013). Various factors have been found to initiate and facilitate cyanobacteria growth and dominance. These are listed in Table 1.2.

Figure 1.2: Benthic (Phormidium spp.) cyanobacteria mats. Taken from Quiblier et al. (2013)

Figure 1.3: Toxic cyanobacteria in the Canning River (WA) (Trust, 2005) 6

CHAPTER I: GENERAL INTRODUCTION Table 1.1: Factors affecting cyanobacteria growth and bloom formation Factor Effect Reference Different species of cyanobacteria have differing optimal light intensities for growth. The ability of cyanobacteria to regulate buoyancy allows them to gain access to optimal light Dokulil and Teubner (2000); Ganf and Light intensity intensities even in turbid or still waters, allowing them to gain a growth advantage over Oliver (1982); Msagati et al. (2006) photosynthetic organisms. Dokulil and Teubner (2000);Heath et al. Cyanobacteria have adapted to grow in various environments, from volcanic hot springs, to the (2011), Loiacono et al. (2012);McGregor Temperature Antarctic, and all conditions in between. Temperature affects different cyanobacteria genera and Rasmussen (2008); O'Neil et al. differently with most exhibiting optimal growth rates as temperatures exceed 20 °C (2012); Paerl and Huisman (2009); Taton et al. (2003) Eutrophication is considered a major factor of cyanobacteria blooms. The concentration, ratio Arfi et al. (2001), Dokulil and Teubner and availability of dissolved nutrients, especially nitrogen (N) and phosphorus (P), have been (2000);Lagus et al. (2007); Moisander et Nutrient shown to drive cyanobacteria growth, succession, population composition and productivity. al. (2012); O'Neil et al. (2012); Paerl and availability However, as many cyanobacteria can utilise both organic and inorganic sources of P and N, Otten (2013); Quiblier et al. (2013); bloom formation may also occur under conditions where inorganic concentrations of these Ressom et al. (1994); Sivonen and Jones compounds are low. (1999), Most species of cyanobacteria tend to proliferate in calm stable waters, usually occurring in Arfi, Bouvy et al. (2001); Dokulil and Hydrological lakes and reservoirs when vertical mixing is reduced due to thermal stratification. The ability of Teubner (2000); Ganf and Oliver (1982); disturbances cyanobacteria to regulate buoyancy allows them to overcome the separation of light and Moisander et al. (2012); Steffensen et al. nutrients, providing them with an advantage over other phytoplankton. (1999); Cyanobacteria have been found to be naturally tolerant/less sensitive to a variety of herbicides. Lürling and Roessink (2006); Powell et Furthermore, herbicide (especially glyphosate) use has been found to have the potential to Herbicide use al. (1991); Quiblier et al. (2013); Vera et affect the microbial community composition, possibly promoting the proliferation and al. (2010); Villeneuve et al. (2011) dominance of cyanobacteria, at the expenses of vulnerable macrophytes.

7

CHAPTER I: GENERAL INTRODUCTION

1.3.2. Toxins and their effects Many cyanobacteria produce a range of bioreactive secondary metabolic compounds, some of which are toxic to animals and humans (cyanotoxins) (Humpage, 2008, Sivonen and Jones, 1999) (Table 1.3). The list of known freshwater toxin producing cyanobacteria is constantly growing, making them organisms of concern to water authorities (Humpage et al., 2012). In freshwater systems, pelagic (i.e. water- column dwelling) cyanobacteria are the main cause of toxic blooms (Carmichael, 2001, O'Neil et al., 2012). To date, 40 genera of cyanobacteria have been identified to be toxin producing, of these fewer than 10 are commonly associated with toxic bloom formations (Bernard et al., 2011, Ressom et al., 1994). In Australia, the four commonly reported species are Microcystis aeruginosa (Kützing) Lemmermann, Anabaena circinalis Rabenhorst ex Bornet & Flauhault (now Dolichospermum circinalis), Nodularia spumigena Mertens and Cylindrospermopsis raciborskii (Wołoszyńska) Seenaya & Subba Raju (Falconer, 2001). Chemically, the main cyanobacteria toxins are either cyclic peptides or alkaloids, for which new structural analogues of known toxins are being continually identified (Humpage, 2008, Bernard et al., 2011, Humpage et al., 2012). According to their mode of toxicity in animals, they have been grouped into hepatotoxins, neurotoxins, cytotoxins and dermatotoxins (Codd et al., 2005, van Apeldoorn et al., 2007). Although toxin production and toxic water blooms are not necessarily constant (Carmichael, 2001), differing both temporally and spatially (Shirai et al., 1991), the ability of these toxic metabolites to persist and remain active in water after the producing cyanobacteria have disappeared is of increasing concern (Codd et al., 2005, Saker et al., 2009). In Australia, the first scientific record of presumed cyanobacteria poisoning was by George Francis in 1878, at Lake Alexandrina, South Australia (Falconer, 2004a, van Apeldoorn et al., 2007, Stewart et al., 2008). Stock animals died as a result of drinking from water thought to be contaminated with Nodularia spumigena (Pitois et al., 2000). Since then, there have been regular reports of livestock and wildlife deaths due to cyanobacteria contamination of waters, both in Australia and from around the world. It is believed that stock losses resulting from cyanobacterial blooms are underestimated as the majority of these reports come from South Africa where the toxic Microcystis genus is abundant in (eutrophic) water storage sites (Pitois et al., 2000,

8

CHAPTER I: GENERAL INTRODUCTION Carmichael, 2001, Steffensen et al., 1999, Carmichael, 1998, Kemp and John, 2006, Quiblier et al., 2013, Falconer, 2004a). In contrast, human fatalities associated with ingestion of cyanotoxin- contaminated waters have been suspected, but not documented due to the lack of information and methods to confirm the presence of toxins (Falconer, 2004a, Carmichael, 2001, Carmichael et al., 2001). Other routes of exposure to cyanotoxins include exposure through recreational activities (such as swimming, diving, canoeing), consumption of crops and/or aquatic animals in which bioaccumulation of toxins have occurred and consumption of cyanobacteria as dietary supplements (van Apeldoorn et al., 2007, Drobac et al., 2013, Watson et al., 2008, McCarron et al., 2014). The largest confirmed cyanotoxin-related human incident occurred in 1996 in Brazil where 76 patients died after haemodialysis with cyanotoxin (microcystin-LR and potentially cylindrospermopsin) contaminated waters (Carmichael et al., 2001, Falconer, 2004a). Patients affected were found to have an average microcystin concentration of 223 ng/g in liver tissue and an estimated source water concentration of 19.5 μg/L, which was almost 20 times higher than the 1μg/L safe level proposed by the World Health Organisation (WHO) (Carmichael et al., 2001, Chorus and Bartram, 1999). Other reports of gastrointestinal illnesses, gastroenteritis, generic poisoning and liver damage in humans have been documented (Falconer, 2004a). One such instance occurred in 1979 at Palm Island, Queensland, Australia. One hundred and fifty people were hospitalised with severe hepato-enteritis from exposure to cylindrospermopsin, resulting from treatment to the water supply, implemented to remove a cyanobacteria bloom in response to complaints of unpleasant drinking water taste and odour by the consumers. The treatment resulted in lysis of the cyanobacterial cells and release of the toxin into the source waters (Griffiths and Saker, 2003, Carmichael et al., 2001, Hawkins et al., 1985, Falconer, 2001). Common symptoms of cyanotoxin poisoning include illnesses such as hay- fever like symptoms, vomiting, diarrhoea, abdominal pain, headache, sore throat, dry cough, blistering of the mouth, atypical pneumonia, myalgia, elevated liver enzymes and skin rashes (Chorus and Bartram, 1999, Carmichael, 2001, Koreivienė et al., 2014). It has been suggested that mild cyanoytoxicosis usually goes unreported, as such symptoms are usually minor and are also symptoms of common gastrointestinal illnesses; with medical diagnosis and treatments only being sought when they become prolonged (Falconer, 2004a, Newcombe, 2009b). Although inconclusive, studies indicate that long-term exposure to certain cyanotoxins can be linked to some medical 9

CHAPTER I: GENERAL INTRODUCTION conditions. These include primary liver cancer by microcystin (Drobac et al., 2013, Hitzfeld et al., 2000, Falconer, 2004a, Humpage, 2008, van Apeldoorn et al., 2007, Chorus and Bartram, 1999) and amyotrophic lateral sclerosis/Parkinson-dementia (ALS/PDC) by β-Methylamino alanine (BMAA) (Cox et al., 2005, Weiss et al., 1989).

10

CHAPTER I: GENERAL INTRODUCTION Table 1.2: Summary of toxins produced by cyanobacteria and their effects. Toxin type Toxin Name (toxin LD Known Producers Site of 50 Effect (based on group) (ip, 24 h) (Genus) action effects) Dolichospermum,

Anabaenopsis, Inhibit protein phosphatases 1 and 2A. Disruption of liver Microcystis, cells, loss of sinusoidal/cytoskeletal structure leading to Microcystin (cp) 60 μg/kg Nodularia, Nostoc, uncontrolled intrahepatic haemorrhaging, resulting in Liver Oscillatoria, haemodynamic shock and heart failure. Carcinogenesis Phormidium and genotoxicity have also been linked to chronic low- Hepatotoxins Nodularia level exposure. Nodularin (cp) 60 μg/kg (spumigena) Anatoxin-a (alk) 200 μg/kg Post synaptic, cholinergic neuromuscular blocking alkaloids causing death due to paralysis of peripheral and Homoanatoxin (alk) 250 μg/kg respiratory muscles Nicotinic and Inhibits acetylcholinesterase, causes marked salivation Anatoxin-a(s) (alk) 20 μg/kg Dolichospermum, muscarinic and possible death within minutes of ingestion.

Saxitoxin (alk) 10 μg/kg Chrysosporum , receptors Inhibit nerve conduction by blocking voltage gated Nostoc, Oscillatoria, sodium (Na+) channels without affecting potassium Neosaxitoxin (alk) 3 μg/kg Phormidium, permeability, trans-membrane resting potential or Trichodesmium, membrane resistance Microseira (wollei),

Neurotoxins Glutamate mimic causing prolonged N-methyl-D- β-Methylamino - Scytonema (cf. aspartate (NMDA) receptor channel opening, leading to alanine (BMAA) (aa) crispum) neuronal excitotoxicity; causing increased intracellular Glutamate calcium (Ca2+) concentrations, cell depolarisation and receptors 2, 4-diaminobutyric initiation of apoptosis. Implicated in gradual 300 mg/kg acid (DBA) (CA) neurodegenerative diseases such as amyotrophic lateral sclerosis/Parkinson’s disease and Alzheimer’s disease. 11

CHAPTER I: GENERAL INTRODUCTION

Mainly targets the liver, inhibiting of glutathione, protein Dolichospermum , and cytochrome P450 synthesis in hepatocytes leading to Chrysosporum, cell death and hepatotoxicity. Genotoxic-causing Cylindrospermopsis, Cylindrospermopsin chromosome loss and DNA strand breakage. Toxin also 2.0 mg/kg Microseira wollei, (alk) results in renal toxicity/multiple organ damage. Has been

Oscillatoria sp., found to be toxic to developing foetuses and concluded to Raphidiopsis, Liver, be a possible carcinogen. Umezakia natans intestine and kidneys Cytotoxins ? Inhibition of protein synthesis and reduction of adenosine (2,500 mg triphosphate (ATP) concentrations, leading to progressive Liminothrixin (?) freeze-dried Liminothrix AC0243 cellular necrosis in the liver, kidneys and gastrointestinal cell tract. Possible neurotoxic effects also indicated.

material/kg)

Microseria wollei, Moorea producens Contact dermatitis causing skin irritation, rashes, Lyngbyatoxin (terp) 0.3 mg/kg Skin (previously L. gastrointestinal problems and potent tumour promoter

majuscula) Dermatotoxins

References: Carmichael (2001), Codd (2000), Dittmann et al. (2013), Falconer (2004a), Humpage (2008), Paerl and Otten (2013), Quiblier et al. (2013), Ressom et al. (1994), Sivonen and Jones (1999), Weiss et al. (1989), Koreivienė et al. (2014), Humpage et al. (2012), Daniels et al. (2014), Goto et al. (2012), de la Cruz et al. (2013), Runnegar et al. (2002), Griffiths and Saker (2003), Codd et al. (2005), Falconer (2008), Munday et al. (2013), Mazmouz et al. (2010), McGregor and Sendall (2015). Abbbreviations: alk = alkaloid; cp = cyclic peptide; aa = amino acid; terp = terpenoid; CA = carboxylic acid; ? = unknown.; LD50 as determined via intraperitonal (ip) injection in mice.

12

CHAPTER I: GENERAL INTRODUCTION

1.3.3. Taste and odour In addition to the production of toxins, cyanobacteria can also produce malodourous taste and odour (T/O) compounds. These earthy/musty/decay-like, woody or mouldy compounds are major sources of malodour outbreaks in drinking water systems, lakes, dams, water treatment plants and running water; and can taint the taste of seafood, wine, cheese and shellfish. T/O episodes have been primarily linked to the presence of two modified terpenes: geosmin (trans-1, 10-dimethyl-trans-9-decalol) and 2-methylisborneol (2-MIB) (Jüttner and Watson, 2007, Watson, 2003). Cyanobacteria are significant producers of these compounds, with the majority of the producers being benthic, filamentous and varying in N2 fixation ability (Smith et al., 2008, Watson, 2004, Jüttner and Watson, 2007). Studies have also identified two other cyanobacteria- produced odourous metabolites – hydroxyketones and norcarotenoids; the latter of which has been identified in environmental strains of Microcystis sp. (Zimba and Grimm, 2003, Höckelmann and Jüttner, 2005). Other malodorous compounds such as 2,4,6-trichloroanisole and methoxypyrazines (2-isopropyl-3-methoxypyrazine, 2- isobutyl-3-methoxypyrazine), produced mainly by bacteria, fungi (especially actinomycetes) are also known to produce such odours but are not as frequently identified as the cause for T/O incidents in water (Watson, 2004). In Australia, D. circinale blooms have been found to be the major cause of T/O issues in reservoirs, even on occasions when benthic cyanobacteria have been identified to be the cause of the issue. Detectable at low odour threshold concentrations (OTC) of ~10 ng/L or less (Young et al., 1996, Watson et al., 2000), geosmin and 2-MIB cause drinking water (or seafood grown in tainted waters) to be deemed unpalatable (Watson, 2003, Schrader and Rimando, 2003). This in turn results in increased management and treatment costs and also in the loss of aqua-cultural and recreational revenues. Implementation of effective water monitoring, management and remedial strategies is costly: England and Wales were estimated to spend between £75 – 114 million annually on algal bloom treatment (Murray et al., 2010), while catfish producers in USA may lose up to $60 million annually due to additional costs stemming from T/O incidents in aquaculture (Tucker and Hargreaves, 2003, Schrader and Dennis, 2005). Although geosmin and 2-MIB are frequently produced in conjunction with toxins, this trait is neither stable nor reliably correlated (Quiblier et al., 2013, Codd et al., 2005). Furthermore, while they are not known to be harmful to humans,

13

CHAPTER I: GENERAL INTRODUCTION experimental studies have shown activity (at mg/L concentrations) of geosmin and/or 2- MIB against other bacteria and algae, however only at concentrations greatly exceeding those usually found in the natural environments (≤1 mg/L) (Watson, 2003, Ho et al., 2012, Watson, 2004).

1.4. Management of cyanobacteria Although nutrient availability and increased temperatures have been linked to causing cyanobacterial blooms, other factors contributing to cyanobacteria outbreaks remain unclear and unanticipated blooms still occur (Quiblier et al., 2013), various remediation methods and management strategies have been developed to mitigate the problems posed by cyanobacteria secondary metabolites. These include chemical, physical and biological removal and detoxification of water supplies together with various management strategies (Koreivienė et al., 2014, Sierp et al., 2009, Hitzfeld et al., 2000, Burch et al., 2002).

1.4.1. Chemical methods Commonly used chemical methods to manage cyanobacteria and their secondary metabolites include application of algicides, oxidation, photocatalysis and decomposing barley straw (Koreivienė et al., 2014, Murray et al., 2010, Hitzfeld et al., 2000, Steffensen et al., 1999). Algicides, of which copper sulphate is the most common, are frequently used to prevent, control and disperse algal blooms. However, this controversial treatment usually results in the lysing of cyanobacteria cells and the subsequent release of secondary metabolites into the water system (Tucker and Hargreaves, 2003, Schrader et al., 2004, Burch et al., 2002, Drikas et al., 2001). Furthermore, copper sulphate, being a broad-spectrum algicide, has a potentially negative impact on the surrounding ecosystem (Burch et al., 2002, Koreivienė et al., 2014). Additionally, resistance to copper has been exhibited by M. aeruginosa that survived low-level copper sulphate treatments (Smith et al., 2008, arc a-Villada et al., 2004). Hydrogen peroxide, a known algaecide, has been shown to target cyanobacteria and are less toxic to other phytoplankton. Furthermore, it also reduces microcystin toxicity due to hydroxyl radicals cleaving the toxin molecule (Bauza et al., 2014, Barrington and Ghadouani, 2008). Decomposing barley straw has been used as an alternative to chemical algicides. Although the mode of action is still unclear, decomposing barley straw has been demonstrated to inhibit cyanobacteria growth (Murray et al., 2010, Martin and Ridge,

14

CHAPTER I: GENERAL INTRODUCTION 1999, Burch et al., 2002, Steffensen et al., 1999, Koreivienė et al., 2014, Newcombe et al., 2010). Oxidation processes using potassium permanganate, ozone or chlorine are also frequently used to control cyanobacteria and to reduce or remove toxins present in the source waters (Burch et al., 2002, Newcombe, 2009a, Falconer, 2004b). Chlorine is able to effectively oxidise cylindrospermopsin, microcystin and nodularin, however its effectiveness is highly dependent on the dose, pH and time given for the process to occur (Nicholson et al., 1994, Keijola et al., 1988, Himberg et al., 1989). Hence, chlorination is usually used in conjunction with other water treatment methods. Ozonation is another oxidative method to remove cyanobacteria and metabolites present; however this process may increase cell lysis and is more costly than chlorination (Koreivienė et al., 2014, Hitzfeld et al., 2000, Zouboulis et al., 2007). Photochemical degradation (photocatalysis) of cyanobacteria toxins and terpenes has been found to be an effective method of treating affected waters (Lawton et al., 2003, Lindner et al., 1997). This process utilises UV irradiation to successfully degrade organic metabolites present in water supplies (Griffiths and Saker, 2003, Jüttner 3+ 4+ and Watson, 2007). When coupled with catalysts such as TiO2/Fe /Fe , the degradation rate of these compounds is greatly increased without the formation of harmful by-products (Robertson and Lawton, 1997, Lindner et al., 1997, Koreivienė et al., 2014, Pantelić et al., 2013, Falconer, 2004b).

1.4.2. Physical methods Physical methods such as flocculation, sedimentation and sand filtration have traditionally been used to treat water supplies. Although these methods are generally successful in removing cells (Drikas et al., 2001, Griffiths and Saker, 2003), they are usually unable to remove dissolved (e.g. extracellular) compounds such as toxins or terpenes (Pitois et al., 2000, Burch et al., 2002). Hence, these methods are usually used in combination with activated carbon filters (which can absorb a range of organic compounds), oxidation/ozonation, or UV irradiation (Drikas et al., 2001, Burch et al., 2002, Steffensen et al., 1999, Hitzfeld et al., 2000, Falconer, 2004b). In recent years, ultrasound/sonication has also been shown to be an effective method of controlling cyanobacterial growth, however field application of this method has yet to be achieved (Wu et al., 2011, Rajasekhar et al., 2012).

15

CHAPTER I: GENERAL INTRODUCTION 1.4.3. Biological and management methods Biological methods of controlling and managing cyanobacteria have been suggested and tested (Griffiths and Saker, 2003, Steffensen et al., 1999, Sierp et al., 2009, Kong et al., 2013, Mehner et al., 2004, Ho et al., 2012). These include the use of cyanobacteria-feeding zooplankton or phytoplanktivorous fish, introduction of specific types of fishes, reducing the amount of dissolved nutrients (namely nitrogen and/or phosphorus) and vertical mixing of water columns (Kong et al., 2013, Van der Molen and Portielje, 1999, Sierp et al., 2009, Steffensen et al., 1999). These strategies aid the control of cyanobacteria populations by acting directly on growth and/or predation (by zooplankton and phytoplanktivores), or indirectly by controlling planktivore population through introduction of higher predators (Steffensen et al., 1999, Sierp et al., 2009, Mehner et al., 2004, Brookes et al., 2002).

1.5. Taxonomy: difficulties in nomenclature and identification Due to adaptations that allow them to survive under various environmental conditions, cyanobacteria have been shown to be a morphologically diverse group of organisms. They were traditionally classified under the International Code of Botanical Nomenclature (ICBN) based on morphological and physiological characteristics (Whitton and Potts, 2000, Castenholz, 2001, Oren, 2004, Rippka et al., 1979). On the basis of this Code, there are 150 genera and 2,000 species of cyanobacteria (Hoek et al., 1995). In 1978, the prokaryotic (or bacterial, see (Pace, 2006)) nature of cyanobacteria was officially recognised by Stanier et al. (1978) and the adoption of the International Code of Nomenclature of Prokaryotes (ICNP) for cyanobacteria nomenclature was proposed (Castenholz and Waterbury, 1989, Pinevich, 2015). Since then, effort has been made to standardise and revise cyanobacteria nomenclature and classification (Castenholz and Norris, 2005, McNeill et al., 2012, Oren and Tindall, 2005, Otsuka et al., 2001, Rajaniemi et al., 2005, Suda et al., 2002, Gaget et al., 2015, Oren, 2015). However, only a small number of names and genera have been officially revised and published (Oren, 2004, Oren, 2011, Hauer et al., 2014). Furthermore, new taxa have been described using a combination of both codes (Komárek, 2010b), with some organisms having multiple names (Gaget et al., 2011), causing much confusion in literature. On-going molecular studies has resulted in reclassification and renaming of a number of cyanobacterial strains, however, no major revisions to cyanobacteria classification have been made, this is reflected in the latest edition of Bergey’s Manual 16

CHAPTER I: GENERAL INTRODUCTION of Systematic Bacteriology (Vol. 1) and in Bergey’s Manual of Systematics of and Bacteria (Castenholz, 2015a). As such, the traditional, (mainly) morphology based classification of cyanobacteria is still in use (i.e. subsection, or order under bacterial classification scheme) (Vincent, 2009). Cyanobacteria are broadly classified into five subsections (Castenholz, 2001, Castenholz, 2015a): Subsection I (formerly Chroococcales), Subsection II (Pleurocapsales), Subsection III (formerly Oscillatoriales), Subsection IV (formerly Nostocales) and Subsection V (formerly Stigonematales) (Castenholz, 2001, Fogg et al., 1973, Rippka et al., 1979, Castenholz et al., 2015, Herdman et al., 2015, Hoffmann and Castenholz, 2015, Rippka et al., 2015a, Rippka et al., 2015b). Preliminary 16S ribosomal RNA (rRNA) phylogenetic analyses by (Wilmotte and Herdman, 2001), indicated that the clustering patterns obtained for subsections VI and V were congruent with the traditional classification (Castenholz, 2001, Wilmotte and Herdman, 2001). As a result of the complexity of cyanobacteria identification, various studies have been done to update cyanobacteria taxonomy to allow for more uniform identification criteria (Anagnostidis, 2001, Otsuka et al., 2001, Rajaniemi et al., 2005, Suda et al., 2002). Table 1 provides a brief summary of the features of each Subsection. Furthermore, as a result of convergent evolution, where similar features develop in independent lineages, and the exchange in genetic material between bacteria, a growing number of polyphyletic genera have been recently identified. This has also brought the classification of cyanobacteria into species using the assumption of a monophyletic origin into question, as they tend to lack ecologically or genetically coherent groups (Young, 2001, Dvořák et al., 2015). Even so, some studies have shown geographical differences in the 16S rDNA (as reviewed in Dvořák et al. (2015)), which is considered to be a relatively conserved gene. Furthermore, the increasing use of alternative loci such as the 16S-23S ITS region have allowed for identification of cryptic taxa, which were previously unidentified based solely on morphological characteristics. As a result of the difficulties in delimiting genera and species, attempts have been made to develop a special code valid only for cyanobacteria (http://www.cyanodb.cz/files/CyanoGuide.pdf), however, this code has yet to be accepted (Dvořák et al., 2015). For now cyanobacteria may still be described under the ICN or ICNP, although the Botanical Code is still the preferred code used for cyanobacterial classification as it allows for the description of new taxa without cultures (Dvořák et al., 2015, Oren, 2011). Hence until a single code for cyanobacteria classification can be decided upon, the definition of a cyanobacterial species will differ 17

CHAPTER I: GENERAL INTRODUCTION depending on whether the or species definition is being used (Dvořák et al., 2015).

18

CHAPTER I: GENERAL INTRODUCTION Table 1.3: Cyanobacteria classification and characteristics. DNA composition Subsection Size Morphology Characteristics (approx. mol% G + C) Unicellular, spherical, Cells reproduce by equal binary fission or budding in one, two or three ellipsoidal or rod-shaped cells, I successive planes. Planktonic forms usually contain gas vesicles. Majority 0.5-30 µm 31-71 occurring as individual cells or (Chroococcales) synthesise phycobiliproteins, while two genra (Prochloron and Prochlorococcus) colonial aggregates of various synthesise chlorophyll a and b instead sizes. Reproduce either exclusively by multiple fission through the formation of Unicellular coccoids that occur II baeocytes, or in combination with binary fission. Vegetative cells are < 20 µm 38-47 as individual cells or in (Pleurocapsales) permanently non-motile while baeocytes may be capable of temporary gliding colonies. motility. Cell division occurs by binary fission in a single plane, while reproduction Trichome occurs via trichome fragmentation, or hormogonia production. False branching Rods (longer than wide) or disk III width organisms with a continuous sheath, only the cytoplasmic membrane and 40-59 shaped cells (wider than long), (Oscillatoriales) 0.5-~100 peptoglycan are involved in the formation of daughter cells. Cells of individual filamentous µm organisms are undifferentiated, lacking heterocytes and akinetes. Exhibit gliding motility, O2 sensitivity and inability to fix N2. Cell division occurs exclusively via binary fission in a single plane. Cells differentiate to form specialised cells such as akinetes, terminal or intercalary Trichome Variable cell morphology IV heterocytes, reproductive trichomes (hormogonia) and tapered trichromes in width 38-47 (cylindrical, barrel-shaped, (Nostocales) response to nitrogen gradients resulting from terminally located heterocytes. ~ 2-15 µm spherical), filamentous Some genera produce sheaths and exhibit false branching. Motility amongst organisms is variable. Variable Variable cell morphology Cell division occurs via binary fission in multiple planes, resulting in occasional V trichome 42-44 (cylindrical, barrel-shaped, true branching of filaments. Cells differentiate to form specialised cells as in (Stigonematales) width spherical ), filamentous subsection IV. Some species form hormogonia. References: Castenholz (2001), Komárek (2010b), Komárek (2010a), Rippka et al. (1979), Vincent (2009), Waterbury (2006), Komárek and Anagnostidis (1989), Castenholz and Waterbury (1989), Herdman et al. (2015) 19

CHAPTER I: GENERAL INTRODUCTION

1.6. Methods used to identify and detect cyanobacteria In addition to treating water sources for the presence of secondary metabolites, there is also the need to be able to detect and identify cyanobacteria. This will allow for the implementation of measures to control their growth before blooms and potential production of secondary compounds occur. The following section will discuss methods commonly used to detect and identify cyanobacteria in water sources.

1.6.1. Morphological identification Cyanobacteria have traditionally been identified morphologically using light microscopy, with risk assessments made based on cell counts and strain identification (Newcombe et al., 2010). Morphological and physiological characteristics such as colony shape, pigmentation, sheath characteristics, cell shape and size have been used for morphological classification of cyanobacteria (Soares et al., 2011, Lane et al., 1985, Litvaitis, 2002). This process is time consuming, costly (labour), subjective and requires trained taxonomists (Steven et al., 2012). Furthermore, these characteristics can vary as a result of morphological plasticity, with distinctive phenotypic features varying within species, or even being lost, as a result of environmental or cultural conditions, growth phase (Saker et al., 2009, Lyra et al., 2001, Whitton and Potts, 2000, Pearson and Kingsbury, 1966, Litvaitis, 2002). The manifestation of ecotypes, or microbial pleomorphism during long-term cultivation, has also resulted in a large number of strains in culture being misidentified, with taxonomic names and morphological descriptions that disagree (Komárek, 2006, Komárek, 2010b). As a result, Komárek and Anagnostidis (1989) estimated that as many as 50% of cyanobacteria strains in culture collections have been misidentified or assigned to the wrong taxonomic group. Furthermore, only a small portion (1%) of bacterial strains have been successfully cultured (Stewart, 2012). Finally, morphological identification cannot distinguish toxin/terpene producers from non-producers (Neilan, 2002, Ouellette et al., 2006, Rantala et al., 2006). As a result of these limitations, DNA based approaches to cyanobacteria identification have been promoted and integrated into cyanobacteria identification (Marquardt and Palinska, 2007, Saker et al., 2007, Sánchez-Baracaldo et al., 2005, Valério et al., 2009, Willame et al., 2006, Komárek, 2006).

20

CHAPTER I: GENERAL INTRODUCTION 1.6.2. Molecular methods Various molecular markers have been used for cyanobacteria identification; these include the small ribosomal subunit (16S rDNA) (McGregor and Rasmussen, 2008, Nübel et al., 1997), DNA-dependent RNA polymerase γ subunit (rpoC1) (Palenik and Haselkorn, 1992), phycocyanin intergenic spacer region (PC-IGS) (Neilan et al., 1995) and the internal transcribed spacer region between the 16S and 23S rRNA genes (16S-23S ITS) (Dyble et al., 2002, Man-Aharonovich et al., 2007, Manen and Falquet, 2002, Kim et al., 2006, Neilan et al., 1995, Robertson et al., 2001, Boyer et al., 2001, Johansen et al., 2011, Johansen et al., 2009, Gugger et al., 2005, Gaget et al., 2015). Since it was first described by Woese in 1987, the 16S rRNA gene has been the most widely used marker for community profiling and taxonomic purposes (Coenye and Vandamme, 2003, Komárek, 2006, Castenholz, 2001). This molecule is characterised by regions of high conservation, alternating with regions of variability, allowing for relatively unequivocal nucleotide alignments between conserved regions (Litvaitis, 2002). Although displaying sequence variability, no single hypervariable region within the 16S rDNA is able to distinguish between all bacteria (Chakravorty et al., 2007). A study by Chakravorty et al. (2007) examining the degrees of sequence diversity between the different variable regions of the 16S rDNA found that the V2, V3 and V6 regions of the 16S rDNA contained the most variability and were able to provide the best targets for distinguishing between the 110 bacterial species used in their study. Although containing useful variable regions (Woese, 1987), recent studies have found the 16S rDNA to be too conserved to differentiate between closely related species and does not have an evolutionary pattern reflective of the entire genome (Seo and Yokota, 2003, Lyra et al., 2001, Coenye and Vandamme, 2003). Restriction fragment length polymorphism (RFLP) studies of full-length 16S rRNA genes have shown that the filamentous heterocystous cyanobacteria can be divided into three discrete clades that did not reflect toxicity characteristics, and that Synechoccocus spp. 16S rDNA, although forming their own clade, were highly divergent (Lyra et al., 1997, Turner et al., 1999). This is supported by whole genome phylogenetic analyses which show that Subsection IV (Nostocales) and Subsection V (Stigonematales) intermingle and separate into heterocyte-producing and non-producing clades, while Subsection I (Chroococcales) and Subsection III (Oscillatoriales) are paraphyletic (Figure 1.4) (Litvaitis, 2002, Wilmotte and Herdman, 2001, Seo and Yokota, 2003, Valério et al., 2009).

21

CHAPTER I: GENERAL INTRODUCTION

Figure 1.4: Cyanobacteria maximum likelihood (ML) species tree produced by concatenation of 31 conserved proteins (modified from Shih et al. (2013); Branches were colour coded according to morphological subsection. Taxa names in red were genomes sequenced in the study by Shih et al. (2013). Black dots indicate nodes with > 70% bootstrap support. The seven major subclades observed were found to show similar distribution to the traditional five subsections (Shih et al., 2013).

22

CHAPTER I: GENERAL INTRODUCTION Another frequently utilised gene for cyanobacteria identification is the rpoC1 locus; this gene encodes the γ subunit of RNA polymerase and is found within cells as a single copy (Bergsland and Haselkorn, 1991). Phylogenetic analyses using this locus usually provides taxonomic groupings similar to that observed using the 16S rDNA (Tomitani et al., 2006, Valério et al., 2009, Seo and Yokota, 2003), while providing more discrimination than that of the 16S rDNA at the species level (Palenik and Haselkorn, 1992, Lin et al., 2010). Even so, this locus is still too conserved to differentiate cyanobacteria from within individual genera (intrageneric differentiation) or for geographically restricted cyanobacteria (Han et al., 2009, Dall'agnol et al., 2012, Gugger et al., 2005). The interspacer region of the phycocyanin operon is highly variable amongst genotypes, is associated with observed morphologies and is capable of differentiating below the generic level (Bittencourt-Oliveira and Piccin-Santos, 2012, Bolch et al., 1996, Dyble et al., 2002, Neilan et al., 1995, Saker et al., 2009). This locus is frequently amplified in conjunction with the α (cpcA) and β (cpcB) subunits of the phycocyanin gene that flank the intergenic spacer region (cpcBA-IGS) (Neilan et al., 1995). Although studies have shown that exchange of genetic material occurs for this region, with the length of the intergenic spacer region being highly variable, phylogenetic analyses are largely consistent with 16S rDNA sequence analysis (Berrendero et al., 2008, Manen and Falquet, 2002, Neilan et al., 1995). In order to overcome the taxonomic limitations of the 16S rDNA, the 16S-23S ITS has been increasingly used (Janse et al., 2003). The 16S-23S ITS is variable in both length and the number of tRNA genes coded for (Iteman et al., 2000). Although length polymorphisms potentially cause alignment difficulties, alignment is still possible using a combination of primary and secondary structural features and has allowed the study of subgeneric relationships (Iteman et al., 2000, Rocap et al., 2002, Neilan et al., 1997b, Janse et al., 2003). Other studies have also suggested secondary structure predictions can improve alignments as these structures can base pair with the conserved regions of the 16S rRNA gene found upstream at the 5’ end and the 23S gene found downstream at the 3’ end respectively (Han et al., 2009, Boyer et al., 2001, Johansen et al., 2011, Johansen et al., 2009). Other cyanobacteria loci used for phylogenetic studies include the cyanobacteria RNA polymerase β and σ subunits (rpoB and rpoD1) (Gaget et al., 2011, Seo and Yokota, 2003, Rajaniemi et al., 2005); the rbcL gene, which codes for the large subunit of the ribulose 1, 5 diphosphate enzyme of the Calvin cycle (Gugger et al., 23

CHAPTER I: GENERAL INTRODUCTION 2002); the nifH gene, found only in diazotrophic cyanobacteria, that codes for the enzyme dinitrogenase reductase (Zani et al., 2000); hetR, a gene essential for heterocyte differentiation; and the DNA gyrase subunit β protein gene (gyrB) (Tomitani et al., 2006, Han et al., 2009, Rajaniemi et al., 2005). Prior to direct sequencing of specific regions, fingerprinting techniques based on DNA polymorphisms were more commonly used for cyanobacteria identification and characterization. These techniques have been described (Mazel et al., 1990) and used in studies as short tandemly repeated repetitive sequences (STRR) (Valério et al., 2009, Wilson et al., 2000), randomly amplified polymorphic DNA (RAPD) (Singh and Dhar, 2011, Casamatta et al., 2003, Neilan, 1995), restriction fragment length polymorphism (RFLP) (Iteman et al., 2002, Lyra et al., 2001), highly iterated palindromes (HIP1) (Robinson et al., 1995, Bittencourt-Oliveira et al., 2007a, Bittencourt-Oliveira et al., 2007b) and M13 fingerprinting (Valério et al., 2005, Muralitharan and Thajuddin, 2010, Valério et al., 2009). Briefly, STRRs are widespread in the cyanobacteria genome and thought to be of possible evolutionary significance (Rasmussen and Svenning, 1998); while the octameric HIP1 5’ C ATC C 3’sequence has been found to be over represented in many (but not all) cyanobacteria genomes (Robinson et al., 1995). M13 fingerprinting uses a 15 bp repeat motif of bacteriophage M13 DNA, which is naturally present in many organisms (Ryskov et al., 1988, Muralitharan and Thajuddin, 2010). Using these techniques, researchers have produced characteristic banding patterns from various cyanobacteria strains, suggesting the possible application of these techniques for identification and subtyping (Whitton and Potts, 2000, Robinson et al., 1995, Valério et al., 2009, Rasmussen and Svenning, 1998).

1.6.2.1. Limitation of molecular methods As molecular methods of cyanobacteria identification have become increasingly utilised, so too has the use of molecular phylogeny to determine homology and characteristics of the organism. In order to determine relationships between isolates and divergence between taxa, phylogenetic reconstructions are needed (Misof et al., 2014). To construct molecular phylogenies, the alignment of molecular sequences or amino acids is required (Morrison and Ellis, 1997). However, multiple sequence alignments (MSA) are based on molecular homology, hence they are, by necessity, only approximations of the perfect assignment of homologous sequence positions (Misof et al., 2014, Morrison and Ellis, 1997).

24

CHAPTER I: GENERAL INTRODUCTION To address the difficulties in performing sequence alignment, various alignment methods have been developed, with ClustalW (Thompson et al., 1994) being one of the most widely used classical progressive aligner (Landan and Graur, 2009). Other methods available include MAFFT (Katoh et al., 2002), MUSCLE (Edgar, 2004), PRANK (Löytynoja and Goldman, 2005) and PAGAN (Löytynoja et al., 2012). Due to computational limitations, most alignment methods initially construct a guide tree to estimate phylogeny and determine the order of taxa treatment. Assumptions (e.g. order in which the sequences were processed) made during the construction of this initial guide tree may persist, resulting in a phylogeny that is biased against the true (unknown) phylogenetic tree (Misof et al., 2014). Although possibly often overlooked (Löytynoja, 2014), alignment lengths, gap treatment options and alignment errors are known to affect phylogenetic reconstructions and the resolution of some areas of the tree (Morrison and Ellis, 1997, Lindgren and Daly, 2007, Liu et al., 2009, Löytynoja, 2012, Ogden and Rosenberg, 2006, Landan and Graur, 2009). Consequently, to improve phylogenetic performance, error prone sites can be removed either manually or automatically (Talavera and Castresana, 2007). To this end, automatic alignment editing-tools, such as Gblocks (Talavera and Castresana, 2007, Castresana, 2000), REAP (Hartmann and Vision, 2008) and NOISY (Dress et al., 2008), have been developed. Finally, phylogenetic reconstructions are performed in order to visualise the differences in taxa. Widely used methods of phylogenetic reconstruction include distance (Neighbour-joining/NJ by Saitou and Nei (1987) or Unweighted Pair Group Method (UPGMA) by Sokal and Sneath (1963)), Maximum Parsimony (MP), Maximum Likelihood (ML) (Felsenstein, 1981) and Bayesian Inference (BI) (Ronquist et al., 2009). Each of these methods use a different algorithm to generate phylogenies and have their individual advantages and drawbacks (Van de Peer, 2009, Harrison and Langdale, 2006, Ronquist et al., 2009). Furthermore, when building phylogenetic trees, it is also necessary to consider the substitution model used – this is especially true for trees constructed using the ML algorithm (Harrison and Langdale, 2006), with studies showing that the application of mixed substitution models being superior to standard DNA models in analyses based on rRNA sequences (Dohrmann et al., 2008, Letsch et al., 2010). Other artefacts that can contribute to bias in phylogenetic reconstructions include poor taxon sampling, long-branch attraction due to variability in rates of evolution and variance in evolutionary rate (heterotachy) (Som, 2015).

25

CHAPTER I: GENERAL INTRODUCTION As a result, different combinations of each step of the phylogenetic reconstruction can produce different inferences, hence contributing to the difficulties encountered when attempting to identify and characterise cyanobacteria solely by molecular methods. Additionally, several terms are frequently encountered during the discussion of constructed phylogenies. These include monophyly, paraphyly and polyphyly. Schematics explaining the differences between these terms are shown in Figure 1.5.

Figure 1.5: Simple schematic displaying the differences between monophyly, paraphyly and polyphyly (adapted from (Harrison and Langdale, 2006)). Isolates A and B within the green box form a monophyletic clade; Isolates A to G within the blue box form a paraphyletic clade; while Isolates A and E (red circles) are polyphyletic to each other.

In addition to the limitations to molecular identification of cyanobacteria posed by the different choices for phylogenetic reconstruction, the presence of horizontal gene transfer (HGT) between cyanobacterial strains further complicates their identification. This transfer of genes between unrelated bacteria has the ability to complicate phylogenetic interpretations (Young, 2001). An example of such an occurrence within cyanobacteria is the observation of toxicity in C. raciborskii strains that do not show any relation to their phylogeny (Stucken et al., 2009). Speciation models have shown that at the beginning of speciation, mixed phylogenetic signals from different loci are obtained due to HGT and homologous recombination; and it is only later into the process of speciation that strong phylogenetic signals are obtained (Dvořák et al., 2015, Polz et al., 2013, Dvořák et al., 2014).

26

CHAPTER I: GENERAL INTRODUCTION 1.6.3. Polyphasic approaches In recent years, molecular data has increasingly formed the basis for taxonomic classification of environmental cyanobacterial strains. However without careful combination of genetic data with morphological diversity and variation, ecological and ecophysiological characteristics and ultrastructural information, it is not possible to develop an accurate cyanobacteria classification system (Komárek, 2006). In addition, the increasing integration of biochemical and molecular data have resulted in the renaming of many species (Wacklin et al., 2009, Zapomělová et al., 2009, Gugger et al., 2002, Rajaniemi et al., 2005). However these corrections cannot be easily incorporated into databases, causing confusion when attempting to identify environmental cyanobacteria isolates (Komárek, 2006). Furthermore, as not all commonly used loci have similar rates of evolution, it is necessary to take into account the different evolutionary rates in order to obtain accurate identification (Han et al., 2009). To overcome the issue of misidentification, a number of studies (Berrendero et al., 2008, Valério et al., 2009, Rajaniemi et al., 2005, Lee et al., 2014, Engene et al., 2011) have adopted a polyphasic approach for identifying newly isolated cyanobacteria strains. Despite this, several factors still hamper accurate cyanobacteria identification. These include incorrectly named sequences available in public databases (Komárek, 2010b), strains with insufficient (or incorrect) morphological description (Komárek, 2010b, Willame et al., 2006), arbitrary/incorrect naming of strains in culture collections and the absence of, or the use of a non-representative strains as the reference strains for taxonomic identification (Komárek, 2006, Komárek, 2010b, Rajaniemi et al., 2005). Therefore, there is an expectation that whole genome sequencing data from an ever increasing number of species, should make misidentified sequences in GenBank increasingly recognisable (e.g. a sequence identified to belong to Anabaena sp. amongst a majority of Nostoc sp. hits when a sequence similarity search is performed), allowing for a more prompt exclusion or revision. Finally, the advent of Next Generation Sequencing (NGS) in recent years, has allowed many culture-independent studies using the 16S rDNA (Caporaso et al., 2011, Kolmakova et al., 2014, Soo et al., 2014, Claesson et al., 2010, Mizrahi-Man et al., 2013). This has greatly increased our awareness of the phylogenetic breadth of the cyanobacteria with the identification of additional major lines of descent (Soo et al., 2014) and the detection of rarer taxa within complex communities. Using NGS, many of these novel sequences were identified as cyanobacteria in spite of being derived from 27

CHAPTER I: GENERAL INTRODUCTION aphotic (light absent) sites. This observation led to the description of Melainabacteria, a new candidate phylum, sister to the cyanobacteria (Di Rienzi et al., 2013, Hofer, 2013) or a class within cyanobacteria (Soo et al., 2014). As a result of the increasing affordability and accessibility to this technology, multiple projects to sequence entire genomes of various reference cyanobacteria strains have been undertaken (Frangeul et al., 2008, Sinha et al., 2014, Kaneko et al., 2007) and to date, 39 complete cyanobacteria genomes are publically available in CyanoBase (http://genome.microbedb.jp/cyanobase) (Nakao et al., 2010) (Last checked, 24th Dec, 2015). It is expected that in the near future, whole genome phylogeny (WGP), as initially demonstrated in a preliminary study by (Shih et al., 2013), will become a widely adopted phylogenetic reconstruction method. Studies on the Prochlorococcus marinus group (Prabha et al., 2014) and the class Melainabacteria (Soo et al., 2014) have already successfully applied this technology to determine phylogenetic relationships. However, until more genome sequences are collected, cyanobacteria identification and classification is likely to remain problematic.

28

CHAPTER I: GENERAL INTRODUCTION

1.7. Aims With the complexity involved in morphological and molecular identification of cyanobacteria, it is hypothesised that the use of different loci will affect the molecular identification and phylogenetic placement of environmental cyanobacteria isolates. Therefore, to aid in further establishing proper cyanobacteria identification and classification, this PhD study aimed to determine the effect of the various loci and phylogenetic reconstruction on cyanobacteria identification. Specifically the aims of this thesis were: 1. To isolate strains of cyanobacteria from various water sources and generate genetic profiles using previously described genetic markers (16S rDNA, rpoC1, cpcBA operon and 16S-23S ITS) and to identify them. 2. To identify and quantify the effect of the options used at each individual step during the process of phylogenetic reconstruction on the resultant phylogeny. 3. To design cyanobacteria-targeted primers for use with Next Generation Sequencing (NGS) and test their utility for the detection and identification of cyanobacteria from complex environmental microbial communities.

29

CHAPTER II: GENERAL MATERIALS & METHODS CHAPTER 2. GENERAL MATERIALS & METHODS

2.1. Water Sources Freshwater samples (600 mL – 1 L) were collected mostly in the warmer months (between September-May) from 2011 to 2013. Water samples from protected reservoirs in the Great Southern region were kindly collected and provided by the West Australian Water Corporation. Other samples were collected from various rural and urban public freshwater sources (e.g. lakes, ponds, drains) on an ad hoc basis. All samples were refrigerated upon collection and stored at 4ºC until processing (within 7 days of collection). The locations and number of times each location was sampled are listed in Appendix 1.

2.2. Isolation and maintenance of cyanobacteria Water samples were pre-filtered through qualitative filter paper no.1 (Advantec, ) and sterile 0.2 and 0.45µm mixed cellulose ester (MCE) filter membranes (Advantec, Japan) to concentrate organisms present in samples. Half of each filter was placed in 50 mL of either Artificial Seawater Medium (ASM-1, Appendix 2), or modified ASM-1 medium having reduced (10% of original recipe) or - no nitrates (NO3 ) and algae were allowed to grow for a fortnight at a light intensity of 90 µmol m-2 s-1, and photoperiod of 18h light/6h dark cycles. Cyanobacteria strains were then isolated using the following methods (Rippka, 1988): 1. streaking 2. Serial dilution to extinction in 24 well plates, 3mL media/well (Corning, NY) 3. Micromanipulation (using a MO-102 micromanipulator, Narishige, Japan) 4. Sequential centrifugation at varying speeds using an Allegra X-15R centrifuge (Beckman Coulter, USA) After repeated passages, usually via serial dilutions, to remove unwanted organisms and to obtain cyanobacteria of identical phenotype (as single cells, colonies or filaments), which were presumed to be unialgal, cultures were transferred into either 75 cm3 vented culture flasks (Greiner Bio-One, ), or in 125 cm3 petri dishes (Greiner Bio-One, Germany) for long term culture. The type of culture vessel used was dependent on whether the isolate was planktic or benthic respectively. These cultures were maintained at light intensity 20 µmol m-2 s-1.

30

CHAPTER II: GENERAL MATERIALS & METHODS When cultures were grown for experimental purposes, aliquots of exponentially growing cells were transferred into 1 L sterile glass culture flasks (Corning, NY) and grown at a light intensity of 90 µmol m-2 s-1 for the duration of the experiment (Giglio et al., 2011) so as to promote increased growth rates. All cultures were maintained at 25ºC with a 16h light, 8h dark photoperiod.

2.3. Extraction of cyanobacterial DNA After a stably growing population with homogenous phenotype was obtained; aliquots of cyanobacterial biomass (25mL of cell suspension or ~5g of microbial mats) were collected from the culture flasks into sterile 50 mL polypropylene tubes (Greiner Bio-One, Germany). The tubes were centrifuged at maximum speed (12,000 x g) for 30 minutes using an Allegra X-15R centrifuge (Beckman Coulter, USA) and the supernatant was discarded. Genomic DNA was extracted from the pellet using the Wizard SV Genomic DNA Purification System® (Promega, USA) according to the manufacturers’ instructions. The DNA obtained was quantified using a Nanodrop ND- 1000 Spectrophotometer (Thermo Scientific, USA). For NGS studies, crude DNA extracts were prepared by multiple freeze-thawing cycles of culture suspensions (1- 2mL) as previously described by (Gaget et al., 2011).

2.4. PCR amplification and sequencing of target genes Two types of polymerase chain reaction (PCR) were performed in the process of this study 1) Conventional Standard PCR and 2) Quantitative PCR (qPCR). All PCR reagents were stored frozen, separate from DNA preparations and amplicons. All PCR reactions were prepared in a sterile DNA-free and RNA-free laminar flow hood. The primers used and their target loci are listed in Appendix 3.

2.4.1. Conventional Standard PCR All PCR reactions were performed in 20 µL volumes and run on either a G- Storm GS1® (Kapa Biosystems, USA) or a Veriti 96-Well Fast Thermal Cycler® (Applied Biosystems, USA). Cycling conditions varied depending on the locus targeted. PCR amplification products were visualised by 1% w/v agarose gel electrophoresis containing SYBR Safe Gel Stain® (Invitrogen, USA) and visualised with a dark reader trans-illuminator (Clare Chemical Research, USA). Detailed information on specific PCR protocols are provided in individual chapters.

31

CHAPTER II: GENERAL MATERIALS & METHODS 2.4.2. Quantitative PCR (qPCR) Quantitative PCR (qPCR) reactions were performed in 25 µL reaction volumes and run on a StepOne System® (Applied Biosystems, USA). Fluorescence was measured in real-time using SYTO 9® (Life Technologies, USA) intercalating DNA dye. Cycling conditions varied depending on target region and primer sets used. Detailed information on specific qPCR protocols are provided in individual chapters.

2.5. Sequencing and analysis of amplified products

2.5.1. Sanger sequencing After visualisation of PCR products, amplicons corresponding to the expected length of the targeted gene were excised from the gel using a new scalpel blade for each band excised. The products were then purified using either a MO BIO UltraClean DNA purification kit® (MOBIO Laboratories, USA), or purified using a previously described filter tip method (Yang et al., 2013). Using the primers corresponding to the PCR products, eluted PCR products were then sequenced in each direction with an ABI Prism Terminator Cycle Sequencing kit® (Applied Biosystems, USA). The sequencing conditions used were 96C for 2 min, followed by 35 cycles of 96C for 10 sec, the appropriate annealing temperature for the specific primer used for 5 sec and 60C for 4 min. Sequencing reactions were purified by ethanol precipitation (Sambrook and Maniatis, 2001) and analysed on an Applied Biosystem 3730 DNA Analyser® (USA) according to the manufacturers’ instructions.

2.5.2. Analysis of sequenced chromatograms Sequencing chromatogram files were edited using FinchTV version 1.4 (PerkinElmer, USA: http://www.geospiza.com/Products/finchtv.shtml) and imported into Bioedit Sequence Alignment Editor (Hall, 1999) to generate consensus sequences. These were then imported into MEGA 5 (Tamura et al., 2011) for alignment and phylogenetic analyses. Phylogenetic analyses were conducted on the sequences obtained together with those retrieved from GenBank using the Basic Local Alignment Search Tool (BLAST) algorithm (NCBI, USA: http://blast.ncbi.nlm.nih.gov/Blast.cgi). Models with the lowest Bayesian Information Criterion (BIC) scores were calculated using MEGA 5 (Tamura et al., 2011). Maximum likelihood (ML), maximum parsimony (MP) and distance (neighbor-joining/NJ) trees were constructed using MEGA 5

32

CHAPTER II: GENERAL MATERIALS & METHODS (Tamura et al., 2011) using the options specified by the BIC score, and tree reliability evaluated using bootstrap analysis.

33

CHAPTER III: POLYPHASIC IDENTIFICATION CHAPTER 3. POLYPHASIC IDENTIFICATION OF CYANOBACTERIAL ISOLATES FROM AUSTRALIA

3.1. Introduction Cyanobacteria identification, enumeration and classification have traditionally been based on light-microscopy observations, using morphological characteristics such as cell size, cell fission type, trichome width, shape of the terminal cells, shape, size and position of specialised cells such as akinetes and heterocytes, presence of aerotopes (Castenholz, 2001). However, this approach requires considerable operator skill and time; with distinctive phenotypic characteristics varying significantly within species, or even being lost, due to environmental or culture conditions, growth phase and use of fixatives (Lyra et al., 2001, Whitton and Potts, 2000). Furthermore, manifestation of ecotypes, or microbial pleomorphism during long-term cultivation, has resulted in a large number of strains being misidentified, with disagreeing nomenclature and morphological descriptions (Komárek, 2006, Komárek, 2010b). These well-known limitations to morphology-based identification have promoted the development of DNA-based approaches as a means of reliably identifying cyanobacterial isolates (Valério et al., 2009, Willame et al., 2006). However, identification of unknown isolates using only molecular methods of identification still remain hampered by i) sequence conservation of the frequently used 16S rDNA and; ii) the sequence availability and cyanobacteria variety of alternative loci (e.g. 16S-23S ITS, rpoC1, cpcBA-IGS). With time, the rapid expansion of available sequence data (Figure 3.1) is expected to allow increasingly accurate molecular identification of cyanobacteria and help in resolving the discrepancies between microscopy and molecular approaches. In light of the current maturity of the sequence databases, taxonomy and molecular tools (Komárek et al., 2011, Siegesmund et al., 2008, Strunecky et al., 2011, Wacklin et al., 2009), the aim of this chapter was to determine how well morphological and molecular tools corroborate for the identification of cyanobacteria isolates. To this end, cyanobacteria were isolated from randomly selected locations in Western Australia, from which there have been few cyanobacteria studies performed, for use in the present study.

34

CHAPTER III: POLYPHASIC IDENTIFICATION

Figure 3.1: Number of cyanobacteria sequences within GenBank for the 16S, rpoC1 and phycocyanin loci over time. Data obtained from NCBI Nucleotide Database (http://www.ncbi.nlm.nih.gov/nuccore). References indicate papers characterising or reporting changes to cyanobacteria taxonomy and nomenclature. Figure updated from (Lee et al., 2014)

3.2. Materials and Methods

3.2.1. Isolation and cultivation of cyanobacterial strains Freshwater samples (n=50; 1 L each) were collected between November 2010 and June 2011, from various randomly selected locations in Western Australia, including i) protected freshwater reservoirs in the Great Southern region, with restricted access and excellent water quality (GS samples; n=6); ii) urban lentic systems, in the Perth metropolitan area, with free public access (n=3) and; iii) shallow, rural, lotic systems, which comprised of a drain and a river (n=2) (Table 3.1). Water qualities from the urban and rural systems, although not comparable to that of the reservoirs, were neither excessively poor nor experiencing cyanobacterial blooms. Cyanobacteria were isolated and maintained as described in section 2.2. In addition to the Western Australian isolates, strains from the culture collection at the Australian Water Quality Centre (AWQC), which were mostly collected in the early-mid 1990s during surveys of Australian freshwater sources, were also included in the study. The surveys included protected and open freshwater reservoirs from New South Wales, Victoria and South Australia (n=6), open rural lentic systems from New South Wales (n=2) and open rural lotic systems from New South

35

CHAPTER III: POLYPHASIC IDENTIFICATION Wales and Victoria (n=2) (Table 3.1). AWQC isolates were grown in 50 cm2 tissue culture flasks with vented lids (Greiner Bio-one, Germany), containing 20–30 mL of ASM-1 medium (pH adjusted to 7.6). Cultures were incubated at 20C under a photon irradiance of 20 mol m-2s-1 provided by cool white light (18 h/6 h light/dark cycle).

3.2.2. Morphological identification After obtaining a stable population of cells (either unicellular, as colonies or filaments) exhibiting homogenous phenotype and steady growth, each isolate was subcultured into 50 cm3 tissue culture flasks, containing, approximately, the same number of cells/filaments and the same medium (ASM-1 or modified ASM). In order to identify (to the species or lowest determinable level of identification) the isolates obtained, microscopic identification of these sister cultures was performed by two independent laboratories (Dalcon Environmental, WA and Water Planning Ecology Department, Qld). Observations were performed by light microscopy at various magnifications on a large number of live cells (i.e. un-fixed), either directly in the culture flask, or by making multiple fresh mounts of the cultures. Morphological traits used for isolate identification include presence/absence and shape of akinetes, the location of heterocysts in relation to akinetes and cell shape.

3.2.3. Molecular and phylogenetic analyses DNA was extracted from the West Australian samples as described in section 2.3. Cyanobacterial DNA obtained from the AWQC was kindly provided by Steven Giglio (Healthscope Pathology). The DNA was extracted using the DNeasy kit® (Qiagen, USA) according to the manufacturers’ instructions. All PCR reactions were run on a G-Storm GS1 standard block thermal cycler (Kapa Biosystems, USA). A partial fragment (313 bp) of the hypervariable V4 region of the 16S rDNA was amplified using the cyanobacterial-specific PCR protocol previously described (McGregor and Rasmussen, 2008), except that the reverse primers CYA781R(a) and CYA781R(b) (Nübel et al., 1997) were used instead. Amplification of rpoC1 (409bp) was performed using cyanobacteria-specific primers rpoC1-1 and rpoC1-T (Palenik and Haselkorn, 1992). Partial fragments (423bp) of the cpcBA-IGS and corresponding flanking regions cpcB and cpcA were amplified using the primers cpcBF (UPF) and cpcAR (URP) as described by (Robertson et al., 2001). Visualisation, sequencing and analysis of the amplicons are described in section 2.4 and 2.5.

36

CHAPTER III: POLYPHASIC IDENTIFICATION Phylogenetic analyses were conducted on the sequences obtained during the present study (GenBank accession numbers JQ811771 to JQ811820) as described in section 2.5.2. Sequences retrieved from GenBank were selected using the results of the morphological identifications as a guide, and on the basis on their e value (values closest to zero) and percentage query coverage. MEGA 5 (Tamura et al., 2011) was used for sequence manipulations, ClustalW (Larkin et al., 2007) for alignments and the p-distance model (Kimura, 1980) was used for the calculation of the pair-wise evolutionary distances. Parsimony information and phylogenetic analyses of aligned sequences were conducted using NJ, MP and ML methods in MEGA 5 (Tamura et al., 2011); tree reliability was evaluated with bootstrap analysis of 500 replicates. For the purpose of molecular identification, a percentage sequence similarity cutoff of 98% and 95% for species and genus identification respectively was used for the 16S rDNA (Stackebrandt and Goebel, 1994). As a result of the much smaller number of sequences available (Figure 3.1) for the rpoC1 and cpcBA-IGS loci, 95% and 90% similarity values respectively were used for species and genus identification.

3.3. Results From the 11 Western Australia sampling sites, 29 isolates were obtained; of these, 12 were obtained from protected reservoirs, five from urban lentic systems and 12 from rural lotic waters. The Western Australian isolates were studied together with the 10 cyanobacteria isolates from the AWQC (Table 3.1).

37

CHAPTER III: POLYPHASIC IDENTIFICATION Table 3.1: Collection sites and identities of the 39 cyanobacterial isolates used in the present study. Isolates Type Sampling site Isolate designation obtained Great Southern Region 1 2 GS1-1, GS1-2 Great Southern Region 2 1 GS2-1 Protected freshwater reservoirs (WA) Great Southern Region 3 2 GS3-1, GS3-2, (n=6) Great Southern Region 4 2 GS4-1, GS4-2 Great Southern Region 5 3 GS5-1, GS5-2, GS5-3 Great Southern Region 6 2 GS6-1, GS6-2 Chaffey Dam 1 AWQC-ANA019-BR Protected freshwater reservoirs (NSW) Burrinjuck Reservoir 1 AWQC-ANA150-A (n=4) Copeton Dam 1 AWQC-ANA278-FR Pejar Dam 1 AWQC-ANA318 Protected freshwater reservoir (VIC) Fish Creek Farm Dam 1 AWQC-ANA148-CR (n=1) Protected freshwater reservoir (SA) Millbrook Reservoir 1 AWQC-MIC058-B (n=1) Chelodina Wetland Reserve 1 Chelodina wetland type 1 Open urban lentic systems (WA) (n=3) Frederick Baldwin Park 3 Baldwin park types 1, 2, 3 Hyde Park 1 Hyde Park type 1 Lachlan River, Booligal 1 AWQC-ANA118-AR Open rural lentic systems (NSW) (n=2) Willandra Creek 1 AWQC-ANA196-A Buayanup Drain 2 Buayanup drain types 2, 4 Open rural lotic systems (WA) (n=2) Vasse River 10 Vasse River types 1, 2, 3, 6, 8, 9, 12, 13, 14, 15 Open rural lotic systems (NSW) (n=1) Lake Cargelligo 1 AWQC-ANA131-CR Open rural lotic systems (VIC) (n=1) Murrary River, Swan Hill 1 AWQC-ANA335-C Grand-total 39

38

CHAPTER III: POLYPHASIC IDENTIFICATION

3.3.1. Comparison of morphological data All 39 isolates were examined and identified morphologically: 17 isolates were analysed by two independent taxonomists (replicate identifications), while 22 isolates were analysed by either one of the two taxonomists (unique identifications) (Figure 3.2, Table 3.2). Overall, only 46% of the isolates (18/39) were morphologically identified to species level by at least one taxonomist; the remaining isolates (54%; 21/39) were identified only to the genus level (Table 3.2). For the 17 isolates that were analysed by both taxonomists, morphological identifications were in complete agreement for only three isolates (18%, 3/17): Buayanup drain type 2 (Anabaena torulosa), Hyde park type 1 (Anabaenopsis elenkinii) and Vasse River type 6 (Sphaerospermopsis aphanizomenoides) (Figure 3.2, Table 3.2). Baldwin park type 3 was also identified to the species level by both taxonomists, but as different species (Anabaena oscillaroides versus Sp. aphanizomenoides). For a further six isolates (35%, 6/17), morphological identifications were in agreement at the genus level, while, with the exception of Baldwin Park type 3, the lack of identifying features allowed identification of the remaining seven isolates (41%), to either family, or order– level only (Figure 3.2, Table 3.2).

39

CHAPTER III: POLYPHASIC IDENTIFICATION

40

35 3

30

13

25

20 1

8 1 12 Number Number of Isolates 15 8 8

10 20

14 6 5 9 10

3 0 16S rDNA rpoC1 cpcBA-IGS Unique Agreement by identifications replicate identifications Molecular Identification Morphological Identification

Species Genus Family/Order Unclassified

Figure 3.2: Number of cyanobacterial isolates identified at order, genus or species-level, using molecular and morphological methods. Depending on the isolate, morphological identifications were carried out by either one (unique identifications), or two independent taxonomists (replicate identifications). Where isolates were examined in duplicate, the number of isolates providing agreement between the replicate identifications is shown.

40

CHAPTER III: POLYPHASIC IDENTIFICATION Table 3.2: Identification of the 39 cyanobacterial isolates used in the study, based on 16S rDNA, rpoC1, cpcBA-IGS sequences and isolate morphology. Percentage similarity at each locus was calculated in MEGA 5 (Tamura et al., 2011), as the pairwise evolutionary divergence, with a p- distance model (Kimura, 1980). Molecular Identification Morphological identification Sub-group Isolate Related sequence (percentage similarity) 16S rDNA (n=36) rpoC1 (n=22) cpcBA-IGS (n=19) Taxonomist 1 Taxonomist 2 AWQC- D. affine |FN691906 – – D. circinale N.A. ANA019-BR (100%) AWQC- D. circinale |AF199423 – – D. circinale N.A. ANA118-AR (100%) AWQC- D. circinale |AF199423 – – D. circinale N.A. ANA131-CR (100%) AWQC- D. circinale |AF199423 – – D. circinale N.A. ANA148-CR (100%) AWQC- (D. circinale AWQC150-A D. circinale |AF199423

– D. circinale N.A. ANA150-A* |AF247573) (100%) AWQC- Ap. gracile |HQ157688 Ap. gracile |EU078450 – D. circinale N.A. ANA196-A (100%) (97%) AWQC- D. flos-aquae |AB551438 – – D. circinale N.A. ANA278-FR (100%) AWQC- D. circinale |AF247588 – – D. circinale N.A. Nostocales (n=25) Nostocales ANA335-C (100%) AWQC- D. circinale D. circinale |AF199425 – D. circinale N.A. ANA318 |AF247581(100%) (100%) Anabaena sp. |AF199432 Baldwin Park A. oscillaroides |AJ630428 (86%) – N.A. Anabaena sp. 1 type 2 (94%) Anabaena sp. |AF199433 (86%) Nostoc sp. PCC8976 Baldwin Park Nostoc sp. |AY242997 Sp. |AM711525 (100%) – A. oscillaroides type 3 (88%) aphanizomenoides Nostoc sp. |AB087403 41

CHAPTER III: POLYPHASIC IDENTIFICATION

(100%) Sp. aphanizomenoides Buayanup A. oscillaroides |AJ630428 |FJ234841 (86%) A. sphaerica |DQ439645 A. torulosa A. torulosa drain type 2 (99%) Sp. aphanzomenoides (92%) |FJ234848 (86%) Ap. gracile |FN552318 Chelodina Ap. gracile |EU078532 (97%) wetlands type – Anabaena sp. Anabaena sp. (99%) D. compacta |AY702239 1 (97%) N. punctiforme |GQ287652 A. variabilis |CP000117 GS1-2 – Anabaena sp. Anabaena sp. (97%) (89%) N. commune |DQ185223 (99%) Pseudanabaena sp. Pseudanabaena sp.

GS2-1 Nostocales Nostoc sp. N. commune |AB251863 PCC7367 |CP003592 (76%) |EF680776 (80%) (99%) A. bergii |FJ234863 Sp. GS4-2 A.bergii |FR822617 (100%) – Nostoc sp. (100%) aphanizomenoides Nostocaceae cyanobacterium C. stagnale PCC7417 Cylindrospermum GS5-1 – Anabaena sp.

Nostocales (n=25) Nostocales |GQ389643 (100%) |CP003642 (87%) sp. T. variabilis |AJ630456 (100%) A. cylindrica PCC7144 Anabaena sp. GS5-2 N.A. Anabaena sp. T. variabilis |JQ390607 |CP003659 (84%) |GU935369 (97%) (100%) Nostoc sp. |FJ948088 Nostoc sp. PCC6720 GS5-3 – Anabaena sp. Anabaena sp. (99%) |JF740673 (94%) Hyde Park An. circularis |GQ859629 An. circularis |EU078479 An. elenkinii |FN552383 An. elenkinii An. elenkinii type 1 (100%) (89%) (96%)

Vasse River D. flos-aquae |AY701573 Anabaena sp. |AF199432 Sp. aphanizomenoides Anabaena sp. D. flos-aquae type 1 (100%) (98%) |GU197719 (99%)

42

CHAPTER III: POLYPHASIC IDENTIFICATION

Vasse River Anabaena sp. PCC9109 Anabaena sp. |EU078475 Anabaena sp. PCC 9109

N.A. Anabaena sp. type 2 |AY768408 (99%) (87%) |AY768473 (96%) Vasse River A. sphaerica |GQ466513 A. variabilis |AB074795 N. linckia |AY466120 N.A. Anabaena sp. type 3 (100%) (100%) (99%) Vasse River Sp. aphanizomenoides Sp. aphanizomenoides Sp. aphanizomenoides Sp. Sp. type 6 |FM177473 (100%) |FJ830555 (96%) |GU197719 (97%) aphanizomenoides aphanizomenoides Vasse River Calothrix sp. |GQ859627 Nostocales (n=25) Nostocales – – Gloeotrichia sp. Rivularia sp. type 8 (98%) Planktothrix sp. |AF212922 (100%) Baldwin Park L. redekei |EU078512 G. amphibium Planktolyngbya sp. – Oscillatoriales type 1 (100%) |FJ545644 (100%) 1 Limnothrix sp. |EF088338 (100%) Cyanobacterium Buayanup Leptolyngbya sp. – txid129981 |AJ401183 N.A. Trichocoleus sp drain type 4 |EU729062 (98%) (75%) Leptolyngbya sp. Spirulina laxissima GS1-1 – N.A. Planktolyngbya sp. |HM217044 (100%) |DQ393286 (84%) Pseudanabaena sp. Pseudanabaena sp. PCC GS3-2 – Pseudanabaena sp. Ps. galeata |GU935355 (100%) 7409 |M99426 (82%) Planktothrix sp. |AF212922

Oscillatoriales (n=6) Oscillatoriales (100%) L. redekei |EU078512 G. amphibium Planktolyngbya sp. GS4-1 – Geitlerinema sp. (100%) |FJ545644 (100%) 1 Limnothrix sp. |EF088338 (100%) Pseudanabaena sp. Ps. mucicola |GQ859642 GS6-2 – PCC7409 |M99426 Pseudanabaena sp. Ps. galeata (99%) (83%)

43

CHAPTER III: POLYPHASIC IDENTIFICATION

M. flos-aquae |AF139328 GS3-1 – – N.A. Microcystis sp. (98%) AWQC- M. flos-aquae |AF139328 M. aeruginosa |AP009552 – M. flos-aquae N.A. MIC058-B (100%) (98%) Synechococcus sp. Synechococcus sp. GS6-1 PCC7920 |AF245158 – N.A. Aphanothece sp.

|HE975005 (100%) (96%) Synechococcus sp.

(n=8) Vasse River Synechococcus sp. PCC7920 |AF245158 – Aphanothece sp. Aphanothece sp. type 9 |HE975005 (100%) (96%) Synechococcus sp. Vasse River Synechococcus sp. – PCC7918 |AF223462 N.A. Aphanothece sp. 1 type 12 |HE975005 (100%) (97%)

Chroococcales Chroococcales Synechococcus sp. Synechococcus sp. Vasse River Synechococcus sp. PCC7920 |AF245158 PCC7918 |AF223462 N.A. Aphanothece sp. 1 type 13 |HE975005 (100%) (96%) (96%) Vasse River Synechococcus sp. – – N.A. Aphanothece sp. type 14 |HE975005 (100%) Vasse River Synechococcus sp. – – N.A. Aphanothece sp. 1 type 15 |HE975005 (100%) *16S rDNA sequence previously available (GenBank Acc. No AF247573); – = No sequence data obtained; N.A.= Not examined; A.= Anabaena; An.=Anabaenopsis; Ap.=Aphanizomenon; C.=Cylindrospermum; D.=Dolichospermum; G.=Geitlerinema; L.=Limnothrix; M.=Microcystis; N.=Nostoc; O.=Oscillatoria; P.=Planktothrix; Ps.=Pseudanabaena; Sp.=Sphaerospermopsis; T.=Trichormus. Sequences with percentage molecular similarity above the threshold used for identification are in bold. For species identification, the threshold was 98% for the 16S rDNA and 95% for the rpoC1 and cpcBA-IGS loci. For genus, the threshold was 95% for the 16S rRNA and 90% for rpoC1 and cpcBA-IGS.

44

CHAPTER III: POLYPHASIC IDENTIFICATION

3.3.2. Comparison of molecular identification and phylogenetic analysis at different loci Partial sequences were successfully obtained for the 16S rDNA (n=36), rpoC1 (n=22) and cpcBA-IGS (n=19) loci. Amplification at all three loci was successful for 23% (9/39) of the isolates (Table 3.2). Of the remaining isolates, 33% (10/30) amplified successfully at both the 16S rDNA and cpcBA-IGS loci, 33% (10/30) were successfully amplified at the 16S rDNA and rpoC1 loci, while the remaining 33% (10/30) successfully amplified at only one locus (Table 3.2). Tree topologies for all three loci were relatively similar (Figures 3.3, 3.4 and 3.5), with the isolates included in distinct clusters according to their orders (Table 3.2, Figures 3.3, 3.4 and 3.5). Based on the percentage similarity threshold set (in the Materials and Methods), when all three loci were successfully sequenced from any given isolate and compared, molecular identifications agreed at the species level for only one isolate (Vasse River type 6) and at the genus level for one other isolate (Vasse River type 13). Where successful amplification was obtained for only two loci, agreement at the species level for 25% (5/20) of the isolates was obtained. A further 50% (10/20) of the isolates agreed at the genus level, while the remaining 25% either agreed at the order level, or had no agreement at the loci amplified (Table 3.2, Figure 3.6). Analysis of the 16S rDNA data (313 characters; 110 parsimony informative sites), showed the existence of six major clusters, within the three major cyanobacterial orders: the Chroococcales (subsection I; n=8; two clusters), Oscillatoriales (subsection III; n=6; three clusters) and Nostocales (subsection IV; n=22; a single large cluster) (Figure 3.3) (Castenholz, 2001). The Chroococcales included sequences from planktic, unicellular coccoids, isolated mainly from the Vasse River and from two isolates (GS3- 1 and AWQC-MIC058-B), which were multicellular planktic colonies. The Oscillatoriales included benthic, non-heterocystous, filamentous (non-branching) strains, from a variety of locations. All other 16S rDNA sequences were obtained from cultures of filamentous (non-branching) strains clustering within the Nostocales, of which several genera were polyphyletic (Figure 3.3). At the rpoC1 locus (409 characters; 335 parsimony informative sites), the genotypes identified belonged to the Nostocales (n=17), Oscillatoriales (n=1) and Chroococcales (n=4) (Figure 3.4). Interestingly, GS2-1 grouped with Pseudanabaena sp. (Oscillatoriales) (bootstrap value > 50%), while Baldwin park type 2 clustered with A. cylindrica, within the Nostocales subsection. This is in contrast to the 16S rDNA 45

CHAPTER III: POLYPHASIC IDENTIFICATION locus, where GS2-1 and Baldwin park type 2, respectively grouped with Nostoc commune (bootstrap > 70%), or formed a clearly distinct branch, basal to the order (Figure 3.3). As with the 16S rDNA tree (Figure 3.3), multiple clusters of the Chroococcales were also evident from the rpoC1 tree (Figure 3.4), with four sequences grouped within this order: AWQC-MIC058-B, Vasse River types 9 and 13 and GS6-1. The remaining 17 sequences clustered within the Nostocales, which was characterised by the distinct positions of Vasse River type 2, Buayanup drain type 2 and GS5-2. Analysis of the cpcBA-IGS locus (423 characters; 396 parsimony informative sites) showed that, apart from GS2-1, which grouped strongly (bootstrap value >80%) with Pseudanabaena sp. (Oscillatoriales) on an isolated branch, the overall topology was similar to that obtained for the 16S rDNA (Figure 3.5). As with the 16S and rpoC1 loci, the Nostocales (10 sequences) and the Chroococcales (2 sequences) formed monophyletic groups, while the Oscillatoriales (7 sequences) were paraphyletic (Figure 3.5). As the majority of cpcBA-IGS sequences available from GenBank to date mainly belong to relatively few genera (e.g. Arthrospira, Synechococcus, Phormidium ), large distance values and the presence of isolated branches were observed for the tree based on this locus (Figure 3.5). Baldwin park type 2 could not be successfully amplified at this locus and only GS4-1, GS4-2, Baldwin park type 1 and Vasse River type 3 exhibited almost complete (>99%) homology with available sequences (Figure 3.5, Table 3.2).

46

CHAPTER III: POLYPHASIC IDENTIFICATION

Figure 3.3: Maximum Likelihood tree based on the 16S rDNA sequences (313 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.05 substitutions per site. The outgroup was removed to facilitate the visualisation of the isolates. *Sequence previously submitted to GenBank with accession number AF247573.

47

CHAPTER III: POLYPHASIC IDENTIFICATION

Figure 3.4: Maximum Likelihood tree based on the rpoC1 sequences (409 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.1 substitutions per site. The Synechococcus cluster containing GS6-1, Vasse River types 9 and 13 were removed to facilitate visualisation of the other isolates. .

48

CHAPTER III: POLYPHASIC IDENTIFICATION

Figure 3.5: Maximum Likelihood tree based on the cpcBA-IGS sequences (423 bp) showing the clustering of isolates obtained. Branch support values greater than 50% for ML, MP and NJ analyses respectively are indicated left of the nodes. Bar, 0.2 substitutions per site.

49

CHAPTER III: POLYPHASIC IDENTIFICATION

Figure 3.6: Comparison of the extent of agreement between molecular identifications of cyanobacterial isolates with at least two successfully sequenced loci. All three loci (16S rDNA, rpoC1 and cpcBA-IGS) were included in the analysis, so as to allow visualisation of all isolates that were successfully sequenced at > two loci. The numbers indicate the isolates for which agreement between the loci was found at species level (panel A), or at least genus level (i.e. identification of isolates were the same to either species or genus level) (panel B).

50

CHAPTER III: POLYPHASIC IDENTIFICATION

3.3.3. Comparison between morphological and molecular identifications Discrepancies between morphological and molecular identifications were observed for several isolates (Table 3.2). For example, Hyde park type 1 was identified on the basis of the sequence at the cpcBA-IGS locus and morphologically as An. elenkinii. However, at the 16S rDNA locus it was closest to An. circularis. Further molecular identification of Hyde park type 1 was hampered by the paucity of An. circularis and An. elenkinii sequences at both the rpoC1 and cpcBA-IGS loci which prevented confident identification based on data from these loci. The isolate Buayanup drain type 2, which was identified morphologically as A. torulosa was most similar to A. oscillaroides at the 16S rDNA locus and to A. spherica at the cpcBA-IGS locus (no A. oscillaroides sequences were available at this locus). Isolates GS6-1 and Vasse River types 9, 12 and 13 were identified to Aphanothece sp. by morphology. However, using molecular methods, they were phylogenetically more similar to Synechococcus sp. (HE975005) than to Aphanothece minutissima (FM177488) (Figure 3.3). Morphologically, AWQC-ANA196-A was identified as Dolichospermum circinale, but was phylogenetically placed with Aphanizomenon gracile, using the 16S rDNA and rpoC1 sequence data (Table 3.2). This was also observed for GS4-2, which was identified morphologically as a Nostoc sp. or Sp. aphanizomenoides, but was found to be most closely related to An. bergii (100% similarity) at both the 16S rDNA and cpcBA-IGS loci. Similarly, although Baldwin park type 1 and GS4-1 were identified morphologically to the Oscillatoriales (either to order or genus level), they showed 100% similarity to various Limnothrix spp. and Planktothrix spp., at the 16S rDNA locus and to Geitlerinema amphibium (FJ545644), at the cpcBA-IGS locus. Overall, for the nine isolates for which sequence data was obtained for all loci studied, microscopic and molecular data from at least one locus were in agreement at genus level for all isolates, except Vasse River type 13. However, agreement between morphological and molecular identifications, from all three loci, was obtained for only one isolate (Vasse River type 6) (Table 3.2). When morphology was compared with two loci for all isolates (16S rDNA plus, either rpoC1, or cpcBA-IGS), species identities for AWQC-ANA150-A (D. circinale) and AWQC-ANA318 (D. circinale) were in agreement. When morphological data was combined with molecular identification from

51

CHAPTER III: POLYPHASIC IDENTIFICATION any one locus, a further seven isolates could be identified to the species level. These included AWQC-ANA118-AR, AWQC-ANA131-CR, AWQC-ANA148-CR, AWQC- ANA335-C (D. circinale), Hyde park type 1 (An. elenkinii), AWQC-MIC-058-B (Microcystis flos-aquae) and Vasse River type 1 (D. flos-aquae). Of the remaining isolates, 31% (12/39) were in agreement at the genus level, 28% (11/39) to the order level, while no agreement was obtained for the remaining 15% (6/39) (Table 3.2).

3.3.4. Identification of novel isolates Based on their unique/variable phylogenetic positions and large difference in percentage identity from available sequences, two potentially new species from the Nostocales were identified. For Baldwin park type 2, analyses at the 16S rDNA showed only 94% similarity to A. oscillaorides, and 77% similarity to Anabaena sp. at the rpoC1 locus; while for GS2-1, high sequence identity to N. commune was seen at the 16S rDNA locus (99%), but sequences at the rpoC1 and cpcBA-IGS indicated that the strain was a Pseudanabaena sp. These new strains could only be morphologically identified to genus level (Baldwin park type 2 – morphologically identified as Anabaena sp. 1 and GS2-1 – morphologically identified as Nostoc sp.). Of particular note, 17 novel sequences with <95% similarity to previously published sequences were obtained during this study, with the majority of the novel sequences being observed at the rpoC1 locus.

3.4. Discussion The purpose of this study was to determine how well molecular and morphological methods corroborated in identifying cyanobacterial isolates. Furthermore, although there have been investigations on cyanobacterial diversity in Australia (Fergusson and Saint, 2000, McGregor and Rasmussen, 2008, Papineau et al., 2005, Saker et al., 2009), this is the first study involving and focusing on phenotypic and genotypic characterisations of freshwater cyanobacteria from Western Australia. Despite major revisions to the taxonomy and systematics of cyanobacteria, of the 17 isolates microscopically examined in duplicate, agreement at species level was obtained for only three isolates, with a further seven in agreement to the genus level. Examples of genera that are difficult to differentiate between include small unicellular cyanobacteria such as Aphanothece, Synechocystis and Synechococcus, and thin filamentous strains such as Leptolyngbya, Limnothrix and Pseudanabaena, which share similar morphology. This clearly highlights the current difficulties in morphological

52

CHAPTER III: POLYPHASIC IDENTIFICATION identification of cyanobacteria from environmental samples. As such, during monitoring of water bodies, it may be beneficial to utilise more than a single taxonomist. However, in order to maintain the consistency and integrity of the morphological identifications, having morphological identifications done by the same taxonomist for the duration of the study is also of importance in order to maintain identification consistency. Overall, tree topologies for the 16S rDNA and rpoC1 genes were similar to previous publications (Valério et al., 2009, Tomitani et al., 2006, Lyra et al., 2001, Litvaitis, 2002, Fergusson and Saint, 2000). Although the 16S rDNA alignment was relatively short (313 characters), using full-length reference sequences from GenBank did not alter the clustering patterns and tree topology (data not shown). Despite producing similar clustering patterns to previously published trees, the support values for the 16S rDNA phylogeny was generally lower than those of the rpoC1 and cpcBA- IGS loci. However, this low branch support can also be observed in other publications (Lukešová et al., 2009, Tomitani et al., 2006), providing further indication that the short sequence lengths used did not adversely affect the phylogenetic inferences drawn. Moreover, an rpoC1 amino acid alignment produced a similar tree to that obtained in Figure 3.5 (data not shown). The strictly qualitative nature of this study discourages the application of a statistically meaningful analysis or inference of ecological and water quality parameters of the sampled sites. In particular, the isolation methods implemented may have favoured isolation of particular species and cannot therefore be used to comprehensively survey the original cyanobacteria communities. This is evidenced by the majority of the isolates obtained belonging to the Nostocales, when the majority of previous cyanobacteria surveys from Western Australia indicate that Oscillatoriales and/or Chroococcales predominate in the water bodies of the region (Lund and Davis, 2000, Gordon et al., 1981, Garby et al., 2013, Kemp and John, 2006). Even so, 22% of the total number of sequences obtained had less than 95% similarity to previously published sequences (at any of the three loci studied), confirming that many genotypes of freshwater cyanobacteria in Western Australia are still unexplored. The present study identified two potentially new species of cyanobacteria; of these, GS2-1 was the most interesting, grouping with either the Nostocales or Oscillatoriales, depending on the locus considered. Generally speaking, these observed differences can be due to: i) the lack of sequences available in GenBank for the rpoC1 and cpcBA-IGS loci; ii) preferential amplification of contaminating strains; iii) horizontal gene transfer, or iv) presence of ecologically specialised genotypes 53

CHAPTER III: POLYPHASIC IDENTIFICATION (Komárek, 2010b, Komárek, 2006). Baldwin park type 2 was also of interest, as it was basal to the Nostocales at the well-studied 16S rDNA locus, but grouped with A. cylindrica on the rpoC1 tree. Minimum genetic distances for this isolate at either locus were also considerable (6% and 14%, for the 16S rDNA and rpoC1 locus respectively, from A. oscillaroides and Anabaena sp. respectively), indicating a potentially previously uncharacterised Anabaena species. Interestingly, both these two strains were grown in nitrogen deficient ASM-1 media, and had corroborating morphological identifications. Hence, future work such as re-extraction of DNA from new cultures followed by sequencing, at the loci studied or together with alternative loci should be considered in order determine if these two isolates are potentially new cyanobacterial species belonging to the Nostocales. Despite its wide usage, the 16S rDNA has been found to be too conserved to reliably differentiate closely related bacterial species (Coenye and Vandamme, 2003, Lyra et al., 2001) and to have an evolutionary pattern that is not reflective of the entire genome (Seo and Yokota, 2003). Consequently, alternative loci such as the protein coding rpoC1, the cpcBA-IGS, the nitrogenase genes (Kumari et al., 2009) and the 16- 23S ITS region (and its structure) have also been used for phylogenetic reconstructions, identification and discrimination of species (Palenik and Haselkorn, 1992, Johansen et al., 2011). Although polyphasic approaches, combining morphological and molecular identifications, have been proposed (Robertson et al., 2001, Litvaitis, 2002, Seo and Yokota, 2003, Komárek, 2006), a number of authors have demonstrated inconsistencies between phylogenetic and morphological classifications (Litvaitis, 2002, Robertson et al., 2001, Seo and Yokota, 2003). Thus, incorporating data from multiple sources (e.g. morphology, nucleotide sequences from multiple loci, biochemical composition) for the identification of unknown environmental genotypes has been recommended as a standard taxonomic practice (Komárek, 2006). Furthermore, apart from the 97%-98% sequence similarity for the 16S rDNA (Stackebrandt and Goebel, 1994), there is no consensus percentage sequence similarity for species delimitation using other loci. Hence, as a reflection of the relatively few rpoC1 and cpcBA-IGS sequence data available (using Anabaena/Dolichospermum as an example, there were approximately 214 rpoC1 and 240 phycocyanin sequences for this genera), a less stringent criterion was used for the determination of species and genus for these loci. Despite this, molecular agreement between three loci was still generally lower than between two loci. The combination of cpcBA-IGS and rpoC1, however, showed no agreement among all 54

CHAPTER III: POLYPHASIC IDENTIFICATION pairs and only with the inclusion of a third locus (i.e. 16S rDNA) was conformity between the loci found. These trends can be explained by the effects of two intertwined factors (which are ultimately responsible for the successful molecular identification of a given isolate): the total number of sequences available at a specific locus and the number of species represented at that same locus. Amplification efficiency of certain primer sets can potentially be affected by large genetic variations, possibly explaining why none of the isolates belonging to the Oscillatoriales successfully amplified at the rpoC1 locus. Furthermore, alignment length and gap treatment options adopted are known to affect phylogenetic reconstructions including the resolution of some areas of the tree (Lindgren and Daly, 2007). Where alignments of protein-coding genes should present fewer gaps, regions which are subject to less stringent genetic constraints (e.g. 16S hyper-variable regions and intergenic spacers) may produce numerous positions with gaps, requiring ad hoc strategies different from protein-coding data sets (Talavera and Castresana, 2007). This, potentially, accounts for some of the topological differences observed in the trees produced. To overcome such limitations stemming from sequence variability, the use of cyanobacterial 16S-23S ITS secondary structure may provide an alternative method to identifying intergenic and possibly intragenic, diversity which is congruent with that of the 16S rDNA as demonstrated by (Johansen et al., 2011). Similarly, the choice of algorithms for generating trees and alignments, as well as the nucleotide substitution models used, can potentially affect the successful identification of isolates (Lindgren and Daly, 2007, Eddy and Durbin, 1994). This could explain, for instance, the high similarity of AWQC-MIC058-B-derived 16S and rpoC1 sequences with M. aeruginosa and M. flos-aquae reference sequences, despite the morphological identification as M. flos-aquae. On the other hand, however, this observation is in agreement with the recommendation by (Otsuka et al., 2001) that M. flos-aquae be regarded as a morphological variant of M. aeruginosa. It is known that cyanobacteria frequently undergo morphological changes during cultivation (Lyra et al., 2001, Gugger et al., 2002, Komárek and Anagnostidis, 1989), resulting in potential loss of taxon-defining features (Gugger et al., 2002). Not only have the discrepancies between molecular and microscopic characterisations been well documented (Komárek, 2010b, Willame et al., 2006), it has also been shown that DNA sequences from incorrectly identified species exist in GenBank (Komárek, 2010b) (e.g. A. variabilis EF488831). Strains within culture collections have also been incorrectly named and are present in both GenBank and various culture collections 55

CHAPTER III: POLYPHASIC IDENTIFICATION under different designations (e.g. A. variabilis ATCC29413 which also appears as Nostoc sp. PCC7937, A. variabilis UTCC 105, Anabaena PCC 7937, A. flos-aquae UTEX144). Furthermore, with the many revisions to cyanobacteria taxonomy and nomenclature, there is no method for these corrections to be easily incorporated into databases (Komárek, 2006), causing further confusion when attempting to identify environmental cyanobacteria isolates. Finally, as pointed out by Castenholz and Norris (2005), species identification is still blurred and this together with the difficulties discussed when identifying cyanobacteria, questions the need for cyanobacteria identification to the species level. This is especially true in water monitoring situations where identifications to the genus level would usually be sufficient for initiation of remediation strategies. As highlighted by various authors (Komárek, 2006, Oren, 2011, Castenholz and Norris, 2005), a polyphasic approach is still currently the most reliable option for identifying cyanobacteria. The present study however has highlighted that despite the application of molecular techniques and the subsequent increase in publicly available sequences, the ability to accurately and definitively identify environmental cyanobacteria isolates is still challenging. It is expected that whole genome sequencing data from an ever increasing number of species, should make misidentified sequences in GenBank increasingly recognisable, allowing for more prompt exclusion or revision. This combination of whole genome sequences, together with alternative methods of cyanobacteria identification (e.g. mass spectrometry via secondary metabolite analysis c.f. Erhard et al. (1997)) will facilitate the process of cyanobacteria identification. However, until then, accurate identification of these organisms will remain problematic.

56

CHAPTER IV: CYANOBACTERIA SYSTEMATICS CHAPTER 4. EFFECT OF ALIGNMENT, CURATION AND RECONSTRUCTION METHODS ON THE MOLECULAR SYSTEMATICS OF CYANOBACTERIA

4.1. Introduction To alleviate the limitations to identification and classification of cyanobacteria based on phenotypic properties, DNA- or protein-based methods complementary to morphological analyses have been developed (Valério et al., 2009, Willame et al., 2006). The various loci which were covered in Section 1.6.2 are either commonly used phylogenetic markers (Castenholz, 2001, Coenye and Vandamme, 2003, Komárek, 2006, Lee et al., 2014, Fergusson and Saint, 2000, Neilan et al., 1995), or have been suggested to be useful, for lower level discrimination of cyanobacterial taxa (16S-23S ITS) (Otsuka et al., 1999, Boyer et al., 2001, Premanandh et al., 2006). Once the sequences from the chosen loci are obtained from an unknown isolate, the evolutionary relationships between the unknown and reference isolates can be inferred by means of phylogenetic reconstructions. This process consists of three main steps: alignment, alignment curation and tree building (De Bruyn et al., 2014, Harrison and Langdale, 2006, Holder and Lewis, 2003, Yang and Rannala, 2012). During the first step, gaps are added to a matrix of data so that the nucleotides in one column are related to each other by descent from a common ancestral residue (Holder and Lewis, 2003). ClustalW (Thompson et al., 1994) is probably the most widely used classical progressive alignment algorithm (Landan and Graur, 2009); MAFFT (Katoh et al., 2002) and MUSCLE (Edgar, 2004) are faster, progressive alignment methods that include iteration and refinement, while PRANK (Löytynoja, 2014) and PAGAN (Löytynoja, 2012) also distinguish between insertions, deletions and vary the costs between opening and extending gaps based on phylogenetic information. Highly variable regions between coding genes may be characterised by high rates of insertion or deletion of bases (INDELs) in the aligned sequences, which the algorithms resolve by introducing gaps of various lengths and frequencies. Alignment errors/artefacts and/or discrepancies are known to accumulate at these regions (Misof et al., 2014) and have been shown to significantly change the outcome of the phylogenetic reconstruction (Landan and Graur, 2009, Liu et al., 2009, Morrison and Ellis, 1997, Ogden and Rosenberg, 2006). Consequently, these error prone sites may be removed 57

CHAPTER IV: CYANOBACTERIA SYSTEMATICS either manually or automatically by programs like Gblocks (Castresana, 2000, Talavera and Castresana, 2007), REAP (Hartmann and Vision, 2008) or NOISY (Dress et al., 2008). During the final step, a phylogenetic tree (a graph simulating the ancestor– descendant relationships between organisms or gene sequences) is constructed using a variety of methods based on either distance or characters (Yang and Rannala, 2012). Computationally-fast distance-based methods such as neighbour-joining (NJ), calculate the pairwise distances between sequences and group the most similar sequences accordingly (Van de Peer, 2009). As the molecular clock hypothesis, with the assumption that branch length data can be used as a proxy of time intervals (Mooers and Heard, 1997), does not always hold, the simplicity of this method underestimates the complexity of the phylogenetic-inference problem. To overcome this, approaches like maximum likelihood (ML) or Bayesian inference (BI), that take into account rate variation across lineages, were introduced to obtain better estimates of divergence times (Holder and Lewis, 2003). Phylogenetic reconstructions can be used to identify unknown isolates, track infections or contaminations back to their sources, discover novel taxa or support classifications, formulated on the basis of characters of a different nature (e.g. molecular‒ and morphological‒systematics). It should be noted however, that the obtained trees (consisting of relationships and divergence times) are not directly observed, but are instead statistically-inferred from available data. This implies that the robustness of the reconstruction can vary and that, starting from a given dataset, multiple (sometimes equally plausible) scenarios can be obtained. While tree scores can be used to identify the most probable tree (e.g. the plausibility of the mutations that a particular tree would require to explain the data), the congruency between the inferred (unobservable) molecular phylogeny of a given taxon and the generated systematics cannot be easily tested. For this reason, noting only tree scores and branch statistical supports gives only partial and/or skewed indications of the “true” evolutionary relationships between the various lineages. In this context, accepting a tree without assessing the variation of all the topologies obtained experimentally can be inadequate (Morrison and Ellis, 1997). With the wide array of phylogenetic reconstruction tools available to the researcher using molecular methods to either 1) determine the identity of an unknown sample, or 2) to trace evolutionary relationships between different organisms, selecting the appropriate strategy to obtain the most accurate outcome usually becomes a choice 58

CHAPTER IV: CYANOBACTERIA SYSTEMATICS of what is most easily accessible. As such, using easily accessible algorithms, the goal of the present study was to investigate the combined effect of alignment algorithm, alignment curation option and tree-building method on the reproducibility of phylogenetic reconstructions of cyanobacterial taxa with the most commonly used and validated phylogenetic markers: 16S rDNA, rpoC1, cpcBA-IGS and 16S-23S ITS loci.

4.2. Materials and Methods

4.2.1. Isolation, cultivation and sequencing of cyanobacterial isolates Forty-nine cyanobacteria isolates from a variety of freshwater habitats (n = 20) surveyed in Western Australia were isolated, cultured and DNA was extracted as described in Chapter 2. Majority of these isolates were examined in Chapter 3, the additional isolates used in this study were only examined via the molecular methods used as covered in this chapter. Partial fragments of the 16S rDNA hypervariable region (467 bp), rpoC1 (612 bp), 16S-23S interspacer region (variable length) (Janse et al., 2003). and the cpcBA-IGS (approximately 585 bp) were amplified as previously described Amplicons corresponding to the expected length were excised from the electrophoresis gel, sequenced and analysed as described in section 2.5. The isolates and loci sequenced are listed in Appendix 4.

4.2.2. Multiple sequence alignment and curation Figure 4.1 gives an overview of the study workflow. The input set consisted of globally-trimmed sequences (16S rDNA: n = 112; 16S-23S ITS: n = 87; cpcBA-IGS: n = 95; rpoC1: n = 94), generated as part of this thesis or retrieved from GenBank by BLAST-searches. Global trimming of the input set was performed manually in MEGA5 (Tamura et al., 2011) to eliminate terminal gaps, by constructing temporary ClustalW alignments (Larkin et al., 2007) before global trimming. The temporary alignments were then dissolved and five new alignments were generated under the default settings of each program: ClustalW v.1.82, MAFFT v.6.712, MUSCLE v.3.7, PAGAN v.0.44 and PRANK v.100223 (Edgar, 2004, Katoh et al., 2002, Larkin et al., 2007, Löytynoja, 2014, Löytynoja et al., 2012). Alignments were curated (i.e. degapped), remotely (Dereeper et al., 2008), by Gblocks v.0.91b (Talavera and Castresana, 2007) on default settings. The pairwise distance (p-distance) scores for all (uncurated and curated) alignments were then computed by the PAST software v.3.08 (Hammer et al., 2001).

59

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.1: Overview of the workflow performed. Cyanobacterial sequences from four loci (16S rDNA, rpoC1, 16S-23S-ITS or cpcBA-IGS) were aligned using five alignment algorithms (ClustalW, MAFFT, MUSCLE, PAGAN, PRANK), prior to optional alignment curation (i.e. de-gapping). Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI). Multivariate analysis was performed based on a number of metrics computed from each phylogenetic tree. The output of the multivariate analysis consisted of principal component analysis (PCA) plots highlighting the topological differences between trees, generated using alternative methods adopted during steps 1 to 3. 4.2.3. Tree-building and multivariate analyses The appropriate models of nucleotide substitution for each alignment was determined using JModelTest2 (Darriba et al., 2012) on the CIPRES Science Gateway v.3.3 (Miller et al., 2010). Phylogenetic trees were constructed using NJ, ML and Bayesian inference (BI), using MEGA5, RAxML v.8.0 or SiMBa v.1.0 (a graphical interface for MrBayes) (Tamura et al., 2011, Stamatakis et al., 2008, Ronquist et al., 2012, Mishra and Thines, 2014). Optimal nucleotide substitution models and gamma‒ and invariable‒rates were chosen for each reconstruction, based on the JModelTest2 results (Darriba et al., 2012). Where possible, the best substitution model (with the lowest BIC score) was used; otherwise, the second best (and so forth) model choice was used to generate phylogenies. For all the alignments used, there was always a suitable model which could be used to generate phylogenies before a large increase in BIC score was observed. For the NJ and ML reconstructions, tree reliability was evaluated with bootstrap analysis of 500 replicates, while default settings were used for BI (No. generations: 1,000,000; burnin fraction: 0.25; No. of runs: 2; No. of chains: 4; outgroup 60

CHAPTER IV: CYANOBACTERIA SYSTEMATICS set to G. violaceous), with substitution model and gamma‒ and invariable‒rates based on JModelTest2 results (Darriba et al., 2012). The trees generated were visualised using FigTree v.1.4.2 (Rambaut, 2014). From each of the trees generated, a set of five tree metrics were calculated using TreeStat v.1.2 (Rambaut, 2008). These figures were then imported into PAST v.3.08 (Hammer et al., 2001) to compute multivariate analyses (principal component analyses‒PCA) and generate PCA plots. The tree metrics considered describe general features of the tree, its shape and topology. These were: 1) tree length (i.e. sum of branch lengths), 2) tree height (i.e. height of the root of the tree), 3) treeness (i.e. proportion of total length of tree taken up by internal branches; interpreted as a signal/signal+noise measure) (Phillips et al., 2001), 4) N_bar (i.e. mean number of nodes above an external node) (Kirkpatrick and Slatkin, 1993) and 5) cherry count (i.e. number of internal nodes that have only tips/terminal branches as children (McKenzie and Steel, 2000). Although these measures are somewhat inter-correlated, they capture different aspects of the tree shape and may be used for comparisons (Agapow and Purvis, 2002).

4.3. Results

4.3.1. Comparison of alignment algorithms and effect of curation Depending on the alignment algorithm used, the final alignment length varied: at all four loci, PAGAN always produced the longest alignments, while ClustalW alignments were consistently the shortest. The effect of alignment algorithm and curation were more evident at the loci containing variable regions (16S-23S ITS and cpcBA-IGS) than at the conserved or protein coding loci (16S rDNA and rpoC1). For the uncurated alignments at the 16S rDNA and rpoC1 loci, there were negligible differences between the five algorithms, in: alignment length and fraction of conserved‒, variable‒ and parsimony-informative‒sites (data not shown). These alignments remained very similar, even after curation (there was < 20% change in alignment length, compared to pre-curation). This was reflected in the negligible differences between the five algorithms for the p-distance values, measured both pre‒ and post‒curation (data not shown). Conversely, large differences were seen when comparing the output of the various alignment algorithms at the hypervariable, highly-gapped 16S-23S ITS and cpcBA-I S loci, where the longest alignments were three‒ and two‒times longer than

61

CHAPTER IV: CYANOBACTERIA SYSTEMATICS the shortest alignments, respectively. Curation also had a significant effect (especially for 16S-23S ITS) in reducing the alignment length by at least 65%. These large fluctuations in alignment length, fraction of conserved,- variable‒ and parsimony- informative‒sites , were reflected in the computed p-distance values. At the16S-23S ITS (in particular) and cpcBA-IGS loci, the various alignment algorithms had different p- distance values both pre‒ and post‒curation (data not shown).

4.3.2. Comparison of phylogenetic reconstructions For each of the four loci, a total of 30 trees were generated, starting from 5 curated and 5 uncurated alignments (n = 120 trees in total). Values obtained for the metrices used have been included in Appendix 5-8.

4.3.2.1. Effect of alignment algorithm on tree topology Multivariate analysis based on the five tree metrices used allowed for the comparison of the shape and topology of trees produced by different alignments. For 16S rDNA, there was little difference between the trees produced using the same alignment algorithms, as shown by the areas connecting the trees produced by the same algorithm, which were clearly overlapping in the PCA plot (Figure 4.2). Only two trees (PRANK/uncurated/BI and MAFFT/curated/ML) appeared slightly spatially segregated from the others on the PCA plot, based on their metrics. Also for rpoC1, the differences were very small, with possibly up to seven trees showing a minor spatial segregation on the PCA plot (MAFFT/uncurated/NJ, PRANK/curated/NJ; PRANK/uncurated/NJ, MUSCLE/uncurated/NJ, MAFFT/uncurated/ML, MAFFT/curated/ML, MUSCLE/curated/BI) (Figure 4.3). On the other hand, for the cpcBA-IGS and 16S-23S ITS loci, some differences were noticeable and outlying tree topologies were relatively more evident (these trees appeared spatially segregated on the PCA plot, based on the metrics considered) (Figure 4.4 and 4.5). For cpcBA-IGS, there was a core of trees with comparable topologies, surrounded by a large number of trees (≈10) that were segregated from the others based on their metrics (PAGAN/curated/BI, PRANK/curated/BI, ClustalW/curated/BI, ClustalW/uncurated/BI, MUSCLE/uncurated/ML, ClustalW/uncurated/ML, PRANK/curated/ML, ClustalW/curated/ML, MUSCLE/curated/NJ, PRANK/uncurated/NJ) (Figure 4.4). For 16S-23S ITS, trees with possibly peculiar topologies were: MUSCLE/curated/BI, MUSCLE/uncurated/BI and ClustalW/uncurated/ML (Figure 4.5).

62

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.2: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

Figure 4.3: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

63

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.4: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the cpcBA-IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

Figure 4.5: Principal component analysis (PCA) highlighting the topological differences between the trees generated using different alignment algorithms using sequences from the 16S-23S ITS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

64

CHAPTER IV: CYANOBACTERIA SYSTEMATICS 4.3.2.2. Effect of alignment curation on tree topology Alignment curation affected tree topology in a clearly locus-specific manner. Differences between topologies of the trees produced from curated or uncurated alignments ranged from insignificant to extreme, increasing from 16S rDNA (curated alignment = uncurated alignment), to rpoC1, cpcBA-IGS and 16S-23S ITS (curated alignment ≠ uncurated alignment) (Figures 4.6 – 4.9). For the highly-gapped 16S-23S ITS hypervariable locus, the effects of curation were very obvious, with only two trees (PRANK/uncurated/NJ and MAFFT/curated/BI) clearly intersecting the opposite cluster in the PCA plot (thus showing a topology more similar to the opposite cluster) (Figure 4.9).

Figure 4.6: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

65

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.7: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

Figure 4.8: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the cpc-BA IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI)

66

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.9: PCA plots highlighting the topological differences generated using curated or uncurated alignments from the 16S-23S ITS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI). The two trees that clearly intersect the opposite cluster are indicated by the red arrows and have their reconstruction option underlined. 4.3.2.3. Effect of tree-building method on tree topology The tree-building method had a strong effect on tree topology, based on the dataset and metrics examined in this study; the tree building method (particularly for the 16S rDNA) is by far the more critical than the choice of alignment algorithm or curation option. Although locus specific, the extent of this effect was obvious for all loci considered (Figures 4.10 – 4.13). The 16S-23S ITS was the only marker where similar tree shapes were obtained from different tree-building methods (Figure 4.13). At this locus, NJ and ML trees largely coincided, whilst most trees obtained by BI were different. Conversely, at all other loci, the three tree-building methods produced considerably different topologies, with divergences between the methods progressively increasing from cpcBA-IGS, to rpoC1 and 16S rDNA. Interestingly, for the 16S rDNA locus, there was also high consistency among trees produced by the same method. For instance, all NJ trees (or ML or BI) showed virtually identical topology, irrespective of alignment algorithm or curation option (Figure 4.10).

67

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.10: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the 16S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

Figure 4.11: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the rpoC1 locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

68

CHAPTER IV: CYANOBACTERIA SYSTEMATICS

Figure 4.12: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the cpcBA-IGS locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI).

Figure 4.13: PCA plots highlighting the topological differences generated using different tree building methods from both curated and uncurated sequences of the 16S-23S locus. Trees were built using maximum likelihood (ML), neighbour joining (NJ) and Bayesian inference (BI). 4.3.2.4. Effect of different phylogenetic reconstruction strategies on the topology trees generated from the loci studied The tight clustering of trees for the rpoC1 protein-coding locus showed that the reconstruction methods used had relatively little impact on the trees generated (Figure

69

CHAPTER IV: CYANOBACTERIA SYSTEMATICS 4.14). However, as the characteristic of the loci shifted from strictly coding to loci containing both coding and non-coding regions, the effects of the phylogenetic reconstruction options become increasingly obvious with increasing topological differences between topologies constructed for each locus. This is most obvious at the 16S-23S ITS locus, which had the largest variation in tree metrices examined, as evidenced by the large area covered on the plot (Figure 4.14). This can be attributed to the nature of the sequences obtained from this locus, which was mainly made up of variable sequences flanked by short conserved regions, that would in turn produce greatly varied alignments, which would in turn affect the resultant phylogenies.

Figure 4.14: Principal component analysis (PCA) plot highlighting the topological differences between trees generated using various combinations of phylogenetic reconstruction options and starting from cyanobacterial sequences from four loci: 16S rDNA, rpoC1, 16S-23S-ITS or cpcBA-IGS. Multivariate analysis was performed based on a number of metrics computed from each phylogenetic tree.

4.4. Discussion Initially conceived to infer evolutionary relationships based on traditional (morphological and physiological) characters, phylogenetic reconstructions have become extremely popular with the advent of culture-independent typing techniques and DNA sequencing technologies. The molecular systematics of a collection of organisms is proposed through phylogenetic trees (graphical representations of the relationships among taxa) substantiated by the data currently available in databases, with the aim of inferring the “true” (un-observable) phylogeny of the taxon. In this context, the limitations shown by the more traditional approaches used for identification (e.g. morphological, biochemical et cetera), have contributed to and justified, the

70

CHAPTER IV: CYANOBACTERIA SYSTEMATICS explosion of molecular classification of cyanobacteria based on DNA and/or protein analyses. However, due to the differences in amplification of the different loci, and the availability of sequences for the individual locus within GenBank, the number of sequences used to reconstruct the phylogenies for each locus differed. This in turn would have affected the phylogenies generated, and hence the tree metrices obtained. In most of the 120 trees produced via the different combinations alignment, curation, and phylogenetic reconstruction methods during the study, the Nostocales generally formed a monophyletic shallow-branching cluster, irrespective of the locus. The Chroococcales and Oscillatoriales were more deep-branching but their evolutionary relationships varied, depending on the reconstruction workflow and locus (the number of taxa per locus differed due to the variety and number of cyanobacterial sequences available in GenBank). As only three orders were included in the analysis, the degrees of resolution (soft polytomies) and of paraphyly of the orders yielded the major difference between the trees obtained from the same locus. The focus of this study was to quantify and visualise the uncertainties associated with alternative reconstruction methods and at assessing the consequences of alternative analytical choices. As cyanobacteria sequences were used, the output of this study can still be used to suggest guidelines for investigating the molecular phylogeny of cyanobacteria. The combination of conserved and variable regions within the 16S rDNA, have made this marker the gold standard for the systematics of bacteria and archaea since the molecule was first sequenced in the late 70’s (Brosius et al., 1978). This locus is important also for the phylum cyanobacteria (Seo and Yokota, 2003, Rehakova et al., 2014, Lee et al., 2014) and the greater availability of 16S rDNA-based studies should be exploited to identify a “prototype tree”, representing the currently accepted systematics (Tomitani et al., 2006, Rehakova et al., 2014, Howard-Azzeh et al., 2014). This step is essential to pinpoint the spatial position in the PCA plot of the “prototype tree”, in comparison to trees with alternative topologies under test. During the present study, coherence with the topology of trees published in previous seminal papers (Tomitani et al., 2006, Rehakova et al., 2014, Howard-Azzeh et al., 2014) was found in several (similar) trees, produced by the ML tree-building method. This suggests that, with our dataset, this method was adequate to obtain reliable cyanobacteria systematics based on the 16S rDNA. When a benchmark “prototype tree” cannot be chosen a priori, or when more trees are equally plausible, the use of PCA plots, as described in the present study, can 71

CHAPTER IV: CYANOBACTERIA SYSTEMATICS only assess the reproducibility of the reconstructions, obtained from a given dataset using alternative options. However, optional display of metrics-associated vectors in the PCA plots (these are called biplots) can be very useful to identify possible relationships between the metrics’ value and the spatial pattern in the PCA plot. For instance, one ML 16S rDNA tree, generated based on a curated MUSCLE alignment, was characterised by numerous monophyletic clades and fully resolved taxa and this is a combination of options that appeared successful in our hands for the 16S rDNA locus (Appendix 9). Paraphyletic clades of Oscillatoriales appeared basal to the tree, while the Chroococcales (e.g. Microcystis spp., Synechococcus sp.) fell into two strongly supported clades (100% bootstrap support). The Nostocales formed a large monophyletic group with strong bootstrap support (96%) (data not shown). Interestingly, the topology of this very robust tree (MUSCLE/curated/ML) produced a treeness value higher than all other trees (Appendix 9). During our study, high values of treeness (and possibly N_bar and cherry counts) were generally associated with fewer unresolved clades and (soft) polytomies. It is tempting to speculate that, once the systematics of a particular taxon is well-known, one metric (or a few) could be used as a proxy of “quality” of newly generated trees. Future implementations of the approach presented in this paper should also include metrics associated with the statistical support of branches. The genetic constraints, due to the need to maintain functionality, of protein coding loci results in the conservation of sequences. This in turn produces alignments with fewer gaps and increased alignment reproducibility regardless of the options chosen when reconstructing phylogenies. The present investigation clearly confirms that, because compared to 16S-23S ITS, trees at the rpoC1 locus (and to a lesser extent cpcBA-IGS) clustered tightly showing relatively little influence by the reconstruction method. In contrast, the set of topologies for the first locus were highly divergent. These findings imply that for rpoC1 and cpcBA-IGS loci the combination of analytical options made for the reconstruction should be less critical than for 16S-23S ITS locus. As multiple plausible trees can be obtained, the comparison of topologies should be treated with caution as none of the tree metrics used in the present paper has been tested as a universal proxy of accuracy. When other tree-building methods are used (e.g. minimum evolution) trees showing minimum tree length (shortest tree length) have been proposed as a proxy to select the “true” trees (Van de Peer, 2009). However, previous studies have also shown that, although curation results in shorter trees, this does not necessarily yield improved accuracy for other tree-building methods (Liu et 72

CHAPTER IV: CYANOBACTERIA SYSTEMATICS al., 2009). Alternative methods to overcome the difficulties and inconsistencies presented include performing both alignment and phylogenetic reconstructions simultaneously (e.g. POY (Varon et al., 2010)), or alignment free methods (e.g. “google DNA” method described by (Gautier and Lund, 2013)) (Land et al., 2015), may potentially provide better reconstruction accuracy (Hohl and Ragan, 2007). One finding from the present analysis somewhat contrary to previous reports (Morrison and Ellis, 1997, Löytynoja, 2012, Lindgren and Daly, 2007, Liu et al., 2009), is that alignment parameters did not appear to affect phylogenetic reconstructions and evolutionary inferences. This difference may be due to the present study directly comparing the topology of the trees generated with various alignment algorithms; conversely, previous studies directly scored the alignments generated, in comparison to a “prototype alignment”. Earlier reports (Varon and Wheeler, 2012, Sedaghatinia et al., 2009, Ogden and Rosenberg, 2007, Löytynoja et al., 2012, Liu et al., 2009), employed either simulated data or simulated differences from a known data set, such that a “true- alignment” and a “true-tree” were known. These provided benchmark tools to score any experimenatlly-obtained output against. However, in the present study, the correct alignment was unknown and therefore we were unable to benchmark the accuracy of the alignment algorithm, its curation and the tree-building method. For the first time, a simple and universal method, applicable to any locus or organism, shows the different effects produced by alternative alignment algorithms, gap-treatment options and tree-building methods, on the tree topology. While choosing one tree against another can be difficult, especially when a “reference tree” is unavailable (like in the case of markers less validated than 16S rDNA), this paper highlights the consequences of choosing one workflow over another. Although molecular methods and the resultant classifications (molecular systematics) should be still considered invaluable tools, pitfalls and limitations of these approaches must be kept in mind.

73

CHAPTER V: CYANOBACTERIA NGS CHAPTER 5. NOVEL PHYLUM SELECTIVE PRIMERS FOR USE WITH NEXT GENERATION SEQUENCING TECHNOLOGIES

5.1. Introduction Massively parallel high throughput molecular technologies, in particular next generation sequencing (NGS), have allowed us to profile complex microbial communities of biological samples at ever-increasing resolutions (Claesson et al., 2010, Mizrahi-Man et al., 2013). The increasing accessibility of this technology has greatly improved our understanding of environmental microbial diversity (Kolmakova et al., 2014, Soo et al., 2014, Caporaso et al., 2011, Shih et al., 2013). Being culture independent, this technology has the advantage of being able to detect rare/low abundance taxa (Eiler et al., 2013, Claesson et al., 2010). The typical approach is to produce large numbers of sequences from individual samples either by targeting selected genes by PCR (Creer et al., 2010) or by sequencing entire genomes (Machida and Knowlton, 2012). These sequences are then assigned taxonomic classifications based on their similarity to known sequences in existing databases (Mizrahi-Man et al., 2013). However, obtaining data that accurately reflects microbial composition relies on the choice and region of the locus targeted, amplification and sequencing strategy, primer mismatch and bias and the choice of taxonomic classifier used (Claesson et al., 2010, Mizrahi-Man et al., 2013, Hadziavdic et al., 2014, Lanzén et al., 2011). As such, representation of specific groups or members of the community may be skewed by the PCR primers used in the amplification step of the microbial profiling pipeline (the series of steps taken to obtain sequencing data) (Milani et al., 2014, Hong et al., 2009). Due to its unique properties (see Section 1.6.2), the 16S rDNA gene has largest reference library available for comparison (Kermarrec et al., 2013). Consequently, for NGS studies of microbial diversity, much research has focused on designing universal 16S rDNA primers that amplify all species with equal efficiency (Caporaso et al., 2012, Takahashi et al., 2014, Klindworth et al., 2013). These primers have allowed successful characterisation of microbial diversity from a wide variety of habitats (Whiteley et al., 2012, Caporaso et al., 2012, Sinclair et al., 2015, Menchaca et al., 2013, Milani et al., 2014). Although primers with broad taxonomic coverage and unbiased amplification provide the most comprehensive assessment of the microbiome, their use results in

74

CHAPTER V: CYANOBACTERIA NGS shallow/reduced resolution and sequencing depth (the number of sequences obtained for each sample/taxon) for each taxon, in the case of complex communities. Consequently, the use of universal primers can result in low abundance taxa going undetected (Eiler et al., 2013). In contrast, primers with increased specificity, by favouring the amplification of targeted groups, can increase sequence coverage for the specific taxon of interest (Hadziavdic et al., 2014, Lanzén et al., 2011). To overcome the limitations of universal 16S primers mentioned above, cyanobacteria-targeted 16S rDNA primers were designed for use with NGS technologies. In the context of water management and monitoring, successful application of such primers would allow us to obtain a deeper insight into cyanobacteria diversity from complex communities and have an increased coverage of cyanobacteria sequences from such communities. As mentioned previously, cyanobacteria are of concern due to the presence of bloom forming and/or secondary metabolite producing strains (Welker and von Dohren, 2006) that can adversely affect public health (Humpage et al., 2012, Paerl and Otten, 2013). Although PCR based methods for identification of this phylum are being increasingly utilised (Valério et al., 2009, Dall'agnol et al., 2012, Wood et al., 2013, Eiler et al., 2013, Lee et al., 2014), the majority of the primers designed to date either produce DNA fragments with product lengths incompatible with NGS, or are too specific for effective use with this technology (e.g. Microcystis specific 16S rDNA qPCR primers, toxin gene specific qPCR primers) (Kataoka et al., 2013, Rasmussen et al., 2008, Al-Tebrineh et al., 2011, Fergusson and Saint, 2000, Neilan et al., 1995, Man-Aharonovich et al., 2007, Manen and Falquet, 2002). The novel primers designed in this chapter were tested on both cyanobacterial cultures and environmental water samples of unknown species composition. The results obtained were compared against a widely adopted universal 16S rDNA primer set to determine the potential utility of these new primers to detect cyanobacteria, especially in the context of water monitoring.

5.2. Materials and Methods

5.2.1. Cyanobacteria cultures, sampling and DNA extraction Two types of water sources were used in this study: cyanobacterial cultures isolated as described in Section 2.2 and heterogeneous environmental samples obtained

75

CHAPTER V: CYANOBACTERIA NGS from a variety of locations and ecosystems around Perth, WA (Table 5.1). Urban freshwater samples were collected in the winter of 2014 (Piney Lakes Reserve) or summer 2015. Grab samples were collected in the winters of 2014 and 2015, from the final effluents of three wastewater treatment plants (WWTP) in Western Australia. These WWTP collection sites had previously been selected to cover different construction design, treatment technologies, and operating regimes. DNA extractions from the six (three Nostocales and three Oscillatoriales), previously characterised, cyanobacteria cultures were carried as previously described (Iteman et al., 2000, Gaget et al., 2011). Briefly, cultures (2 mL) were centrifuged at 12,000 x g for 10 mins at 20°C and washed twice with sterile MilliQ water. The final pellets were resuspended in 200 μL sterile MilliQ water and frozen in liquid nitrogen. Lysates of frozen samples were prepared by a minimum of ten alternating freeze-thaw cycles. These DNA extracts were used for initial NGS sequencing runs. Whole DNA was extracted from 10 environmental water samples using the MOBIO PowerWater Sterivex DNA Isolation kit (MOBIO, USA) according to the manufacturers’ instructions. Two extraction blanks were included with every batch of DNA extractions. Controls for the culture extractions involved mock-extractions being done using the same sterile MilliQ water used to wash the cultures and resuspend the extracted DNA. As controls for the environmental water sample extractions, fresh Sterivex filters were put through the same DNA isolation process, at the same time as the sample DNA extractions were being done. Extracted DNA were quantified using the Qubit 2.0 Fluorometer® to determine amount and qualtity of extracted DNA. To minimise potential contamination from external sources, all reactions were carried out using water and consumables guaranteed to be DNA free, with amplification and sequencing reactions being set up in DNA free environments (clean rooms).

5.2.2. Primer design and in silico validation Cyanobacteria specific primers targeting the V3 and mainly the V6 region of the 16S rDNA were designed using the Primer 3 add-on in Geneious (Untergasser et al., 2012, Kearse et al., 2012). The primers were designed from an alignment of 6,000 cyanobacteria 16S rDNA sequences retrieved from NCBI GenBank. The sequences and regions targeted by the primers used in this study are shown in Figure 5.1 and Table 5.2. In silico testing of primer specificity was conducted using BLAST (NCBI) searches and modified to increase specificity to cyanobacteria. In silico re-evaluation of the specificity of the modified primers, together with the universal 16S rDNA 515F_806R

76

CHAPTER V: CYANOBACTERIA NGS (Caporaso et al., 2011) primers were performed against all available bacteria sequences in the SILVA (SSURef-122 NR database) and Ribosomal Database Project (RDP) databases using TestPrime (http://www.arb-silva.de/search/testprime/) and ProbeMatch (https://rdp.cme.msu.edu/probematch) respectively. The recommended mismatch settings were used for the SILVA TestPrime analysis, while one and two differences were allowed for the RDP ProbeMatch analysis.

5.2.3. Thermal cycling conditions

5.2.3.1. Conventional PCR Thermal cycling conditions for the new primers were preliminarily optimised, for annealing temperature, in 25 μL reactions containing the 1U PerfectTaq DNA polymerase (5 Prime), 2 mM MgCl2, 200 μM of each deoxynucleotide triphosphate (dNTP, Promega), 0.2 μM of each primer and 1 μL cyanobacteria culture DNA obtained in Chapter 3. Amplification reactions were run in a Veriti 96-Well Thermal Cycler (Applied Biosystems) with the following conditions: initial denaturation at 95°C for 3 min, followed by 35 cycles of denaturation at 95°C for 45 s, annealing at a range of temperatures (50°C – 60°C) for 45 s, 72°C for 90 s and a final extension of 72°C for 5 min. Amplicons were visualised on 2% agarose gel, with amplification reactions deemed successfully optimised when a single bright band of the expected size was seen on the gel. These amplification products were cleaned up and confirmed by the sequencing of the band as described in section 2.4 to ensure that only cyanobacterial sequences were obtained.

5.2.3.2. Real-time PCR (qPCR) Final amplification reactions were carried out on the DNA extracted as described in Section 5.2.1 using the primers designed in the present study together with the universal 16S rDNA 515F_806R primer pair (Caporaso et al., 2011) to enrich for target sequences. Three separate amplification reactions were carried out, with one reaction per primer pair, for each culture DNA extract and environmental sample. All amplification reactions were carried out in triplicate in 25μl reactions, with reaction mixtures similar to the preliminary optimisation, except that 0.01 mg BSA (Fisher Biotech, Australia) and 3.3 µM SYTO 9® (Thermo Fisher Scientific, USA) were added. The same volume (1 μL) of DNA was used for all reactions. No-template control reactions and extraction reagent blank controls were included in every run and incorporated in the subsequent sequencing libraries. A longer initial denaturation time

77

CHAPTER V: CYANOBACTERIA NGS of 5 mins at 95°C was performed to endure complete denaturation of sample DNA, subsequent thermal cycling conditions consisted of 45 cycles of denaturation at 95°C for 30 s, annealing at the appropriate temperature for 30 s, an extension at 72°C for 45 s and a final extension at 72 °C for 7 min. Amplification was followed by a melt curve analysis to determine the specificity of the amplification product. All PCR amplifications were performed on a Step-One real-time qPCR machine (Applied Biosystems, USA).

5.2.4. Library Preparation and NGS Library preparation was carried out in a similar manner to that of the final amplification reactions, except 1 x ROX® dye (Life Technologies, USA) was added to the reactions and fusion primers containing the original primer sequence, a six to eight base pair multiplex identifier (MID) sequence and Ion Torrent sequencing adapters A and P1 (Life Technologies, USA) were used. Again, three separate amplification reactions, each corresponding to one reaction per primer, for each sample was performed in triplicate. A unique combination of forward and reverse primers containing MID sequences was used for each sample to allow multiplex sequencing and discrimination of sequences to samples in downstream analysis. Thermal cycling conditions used were identical to that of the qPCR amplification reaction, except that 35 amplification cycles were used. 16S rDNA amplicons from all samples and controls were pooled into one of two sequencing libraries in equimolar amounts. Amplicon libraries were then purified twice using 1.2 volumes of Agencourt Ampure XP® beads (Agilent Technologies, USA) and quantified by qPCR using a known concentration of a serially diluted 152 bp synthetic oligonucleotide as a standard. qPCR reactions contained 1X Power Syber reen mastermix (Life Technologies, USA), 0.4 μM Ion Torrent primers A and P1 and 2 μl DNA template, amplified using the fusion primers, were run using the following cycling conditions: initial denaturation at 95°C for 5 min followed by 30 cycles of denaturation at 95°C (30 s), annealing and extension at 60°C (45 s). Template emulsion PCR and enrichment were performed according to the manufactures’ recommendations on the One-Touch 2 and One-Touch ES instruments (Life Technologies, USA). Sequencing was performed on an Ion Torrent PGM (Life Technologies, USA) using 400 bp chemistry and 316-V2 semiconductor chips, following the manufacturers’ recommendations.

78

CHAPTER V: CYANOBACTERIA NGS Table 5.1: Environmental water sample collection sites. Number of Sampling Site Location Sample Description samples 32°1'13.4"S 115°49'39.0"E Alfred Cove/Tompkins Park (drain runoff) Urban runoff 1 (-32.020379, 115.827490) 32°2'17.6"S 115°50'48.5"E Blue Gum Lake Freshwater 1 (-32.038215, 115.846807) 32°2'39.1"S 115°50'34.3"E Booragoon Lake Freshwater 1 (-32.044182, 115.842863) 32°3'35.2"S 115°48'48.3"E Frederick Baldwin Park Freshwater 1 (-32.059782, 115.813410) 32°2’20.2”S, 115°48’40.8”E Marmion Reserve Freshwater 1 (-32.038938, 115.811322) 32°4'00.1"S 115°49'56.9"E Murdoch University Chinese Garden Koi Pond Freshwater 1 (-32.066707, 115.832480) 32°3’12.4”S, 115°50’25.6”E Piney Lakes Reserve Freshwater 1 (-32.053449, 115.840453) (-32.508669, 115.758771) WWTP* 4 Wastewater 1 South-Western Australia (-17.974228, 122.221855) WWTP* 5 Northwest Region of Western Wastewater 1 Australia (-32.626383, 115.908005) WWTP* 6 Wastewater 1 South-Western Australia Total 10 *WWTP: wastewater treatment plant

79

CHAPTER V: CYANOBACTERIA NGS

Figure 5.1: Screen capture from Geneious (Biomatters, NZ) showing cyanobacteria 16S rDNA consensus sequence mapped onto E. coli K-12 substrain MG1655 (NR10284) Annotations indicate the positions of the hypervariable regions (blue boxes), and primer binding sites. Black arrows indicate positions of reference primers, green/grey arrows indicate the positions of primers designed in this study.

Table 5.2: Primers used in the study. Location on Target Annealing Amplicon Name Sequence (5’-3’) E.coli K-12 Reference region temperature size (NR102804) 293F AGCCACACTGGGRCTGAGA 312-331 V3 50°C 255 bp This study 751R TGCGGACGCTTTACGCCCA 572-590 515F GTGCCAGCMGCCGCGGTAA 523-541 (Caporaso et al., V4 55°C ~253 bp 806R GGACTACHVGGGTWTCTAAT 796-816 2011) 1328F GCTAACGCGTTAAGTATCCCGCCTGG 870-896 Mainly V6 55°C ~298 bp This study 1664R GTCTCTCTAGAGTGCCCAACTTAATG 1166-1185

80

CHAPTER V: CYANOBACTERIA NGS

5.2.5. Bioinformatics analysis Sequences were first processed using Geneious Pro 8.0.4 software (Kearse et al., 2012) by retaining only reads with perfect forward and reverse primer sequences and MID sequences (no mismatches allowed). Sequences were then de-multiplexed into individual samples based on their unique combination MID sequences. Primer sequences and distal bases were trimmed from each read. Reads shorter than the minimum estimated length for each primer pair (Table 5.2) were discarded. The remaining reads were quality filtered using the USEARCH software (Edgar, 2010), allowing only reads with a < 1% error rate to remain and singletons were removed on a per-sample basis. In order to have equal sampling depth for each primer set, every sample was subsampled to an equal number of sequences in QIIME (Caporaso et al., 2010) and the sequences pooled together according to primer prior to further analysis. Operational taxonomic units (OTUs) were selected by clustering sequences at 97% similarity with the UPARSE algorithm (Edgar, 2013) and bacterial genera present were identified. OTUs were checked against the ChimeraSlayer Gold reference database with the UCHIME algorithm (Edgar et al., 2011) to ensure OTUs were not the result of chimeric reads. Genus level taxonomy was assigned to OTUs against the GreenGenes 16S rDNA database (August 2013 release) (DeSantis et al., 2006) in QIIME 1.9.0 (Caporaso et al., 2010) using the UCLUST algorithm (Edgar, 2010), with default parameters. Bacterial genera that were identified in extraction reagent blanks and no- template controls were removed from the dataset to eliminate background and contaminating bacterial sequences. Diversity analyses were then performed, the number of sequences analysed for each primer set was determined by the sample with the least number of sequences. 16S sequences from the most abundant phyla obtained by all three primer sets (cyanobacteria and proteobacteria), were compared against the NCBI GenBank Nucleotide database using the BLAST algorithm to obtain increased taxonomic resolution. The results were then visualised using the MEtaGenome ANalyzer (MEGAN) version 5 (Huson and Weber, 2013) (LCA default settings except Min Score =150 and Min Complexity = 0.3).

81

CHAPTER V: CYANOBACTERIA NGS

5.3. Results

5.3.1. Evaluation of new primers The results from both the TestPrime and ProbeMatch in silico analyses indicated that the new primer pairs targeted a larger proportion of cyanobacterial sequences, amplifying at least three times more cyanobacterial sequences than the universal 16S rDNA primers (Table 5.3). Figure 5.2 shows the percentage composition of each phylum as a portion of the total number of sequences amplified by each primer set by the SILVA TestPrime analysis. The primers designed in this study amplified a much smaller number of phyla as compared to the universal 16S primer pair (7 and 19 phyla for the 1328F_1664R and 293F_751R pairs respectively versus 50 phyla for 515F_806R), with cyanobacteria sequences forming a larger proportion of predicted sequences amplified. Furthermore, cyanobacteria sequences formed 95% of the predicted amplicons of the 1328F_1664R primers, indicating the increased specificity of these primers for the Cyanophyceae as compared to the universal primers. The new primers were then tested on DNA from cultured cyanobacteria to confirm their utility and to determine optimal amplification conditions. The final optimal annealing temperature used for the primers designed in this study are listed in Table 5.2.

82

CHAPTER V: CYANOBACTERIA NGS

Figure 5.2: Composition (percentage) of phyla amplified by the primer sets as predicted by in silico testing using TestPrime, against the SILVA database, with the recommended settings (1 mismatch +5 bases). Phyla forming less than 0.01% of the predicted amplifications were omitted. 83

CHAPTER V: CYANOBACTERIA NGS Table 5.3: Percentage composition of cyanobacteria sequences from total predicted sequences amplified via in silico analyses using the SILVA and RDP databases. RDP Primer Pair SILVA TestPrime 1 difference 2 differences 293F_751R 15% 23% 12% 515F_806R 2% 4% 4% 1328F _664R 95% 95% 60%

5.3.2. NGS microbiome analysis of water samples Detailed breakdown of the phyla obtained can be found in Appendix 10-12. After the initial sequence processing and quality filtering the sequences remaining for the primer sets were: 339,008 sequences for the 293F_751R primers; 852,293 sequences for the universal 16S 515F_806R primers; and 237,860 sequences for the 1328F_1664R primers. Even subsampling of the individual samples, and pooling of the reads according to primer sets, resulted in the selection of 27,410, 27,209 and 27,355 for the three primer sets being used for downstream analyses. After removal of background taxa, exclusion of unassigned OTUs and those classified to the bacterial root, the universal 515F_806R primer pair had the greatest alpha diversity (diversity of species within each sample) while that of the cyanobacteria-targeted 1328F_1664R primer set was the smallest (average of 44 OTUs versus 23 OTUs respectively) (Figure 5.3). This increased specificity is also reflected in the number of (especially cyanobacteria) detected by the primer pairs (Figure 5.4). Amplification with the 293F_751R, 515F_806R and 1328F_1664R primer sets allowed for the detection of 8, 15 and 7 bacterial phyla respectively. As expected, the universal 16S primers (515F_806R) also picked up sequences that were assigned to the Archaea (), but at only a very low abundance (0.04%, data not shown). Figure 5.4 shows a breakdown of the phyla observed, with only those forming > 1% of total number of sequences for each primer pair being represented. Although only cyanobacteria sequences were used as template to design the new primers, they also amplified members of the phylum proteobacteria. Consequently, proteobacteria and cyanobacteria sequences formed at least 90% of the total subsampled sequences observed by both the 293F_751R and 1328F_1664R primer pairs; this is in comparison to the universal 16S primers, where cyanobacteria and proteobacteria sequences together comprised less than half (49%) of all the sequences obtained (Figure 5.4,

84

CHAPTER V: CYANOBACTERIA NGS detailed breakdown in Appendix 10). Two other phyla ( and ) were also detected and formed varying proportions of the subsampled data for each primer set (Actinobacteria forming 0.2%, 5.9% and 11.1% and Verrucomicrobia 1.7%, 0.2% and 8.2% for 293F_751R, 1328F_1664R and universal 16S primers respectively).

120

100

80

60 No. of OTUs of No. 40

20

0 293F_751R 515F_806R 1328F_1664R

Figure 5.3: Alpha diversity observed from the sampled subset for each primer set. Grey diamonds represent the median number of OTUs observed by each primer pair; boxes give the 25% and 75% percentiles for each primer pair, error bars indicate standard errors.

85

CHAPTER V: CYANOBACTERIA NGS

1 1.5% 5.3% 0.9

0.8 40.3% 44.5% 33.4% 0.7

0.6 2.0% 0.5 15.3%

0.4

0.3 54.4% 30.2% Percentage Percentage composition 50.8% 0.2

0.1 10.2% 0 2.6% 293F_751R 515F_806R 1328F_1664R Actinobacteria Cyanobacteria OD1 Proteobacteria Verrucomicrobia Unassigned Other Phyla < 0.5%

Figure 5.4: Proportion of sequences observed by each primer set for each phyla (from the sampled subset). Phyla with < 0.5% abundance were combined.

The discriminating power of the primer sets allowing for assignation of sequences to lower taxonomic level (genus) identification was compared. For the cyanobacteria, the 293F_751R primer set had the largest number of successfully assigned organisms, followed by the 1328F_1664R primers then by the universal 16S primers. For the detection of proteobacteria, the 293F_751R and the universal 16S primers performed approximately the same (Figure 5.5).

Figure 5.5: Comparison of taxonomic coverage of cyanobacteria and proteobacteria genera by primer.

Breakdown of the results for each phylum showed that each of the primer sets preferentially amplified different taxa from within the cyanobacteria and proteobacteria 86

CHAPTER V: CYANOBACTERIA NGS phyla (Table 5.4, detailed breakdown for both phyla in Appendixes 5 and 7). For the cyanobacteria, the genus Calothrix (Order Nostocales) and genus Pseudanabaena (Order Pseudanabaenales) were successfully amplified by all three primer pairs. Apart from melainabacteria, representative sequences from all the cyanobacterial class/orders were obtained using the cyanobacteria-targeted primers. In comparison, the universal 515F_806R primers did not amplify any sequences from the Chroococcales and orders. Similarly for the proteobacteria, only two genera (Limnohabitans and Glaciecola) were amplified by all three primer pairs. The universal 515F_806R primers amplified all classes of proteobacteria. However, apart from the and eplisonproteobacteria, the number of proteobacteria orders amplified from within the other classes was intermediate when compared to the 293F_751R and 1328F_1664R primer pairs (Table 5.4).

Table 5.4: Taxonomic breakdown to Class/Order observed for cyanobacteria and proteobacteria. Classification Phylum 293F_751R 515R_806R 1328F_1664R (; number of taxa) 4C0s-2/Chloroplast (C; n=9) 9 5 1

Nostocales (O; n=6) 3 3 4 Stigonematales (O; n=1) 1 1 1 Oscillatoriales (O; n=2) 1 1 1 Chroococcales (O, n=3) 2 0 3 Pseudanabaenales (O; n=5) 5 2 4

Cyanobacteria Synechococcales (O; n=2) 1 0 2

Alphaproteobacteria (C; n=44) 22 26 20 * Betaproteobacteria (C; n=32) 27 17 7 Deltaproteobacteria (C; n=17) 1 8 12 Eplisonproteobacteria (C; n=4) 1 4 0

Proteobacteria (C; n=40) 27 23 7 Taxonomic ranks: C= Class; O= Order * Zetaproteobacteria was not detected by any of the primer sets used in the study, and hence excluded from the table.

In order to determine the presence of potentially important taxa, taxonomic assignment to the species level for the sequences assigned to the Cyanobacteria and Proteobacteria was performed. However, the majority of the OTUs were classified to either the order or genus level, with few having conclusive species assignation. Even so, potentially problematic taxa detected include canis, Coxiella burnetii, Planktothrix spp., Limnothrix spp. (Table 5.5).

87

CHAPTER V: CYANOBACTERIA NGS Table 5.5: Illustrative comparison of some potentially problematic taxa detected by the primer pairs used in the present study. Listed genera include those containing medically important organisms of human and animals and toxic bloom-forming microalgae Primer Pair 293F_751R 515F_ 806R 1328F_1664R Calothrix sp. Calothrix sp. Anabaena oscillariodes Limnothrix planktonica Planktothrix sp. Aphanizomenon gracile

Phormidium sp. Trichormus variabilis Limnothrix sp. Planktothrix sp. Nostoc linckia Trichormus variabilis Phormidium sp.

Cyanobacteria Calothrix sp. Pseudanabaena gelata Pseudanabaena limnetica Sphaerospermopsis aphanizomenoides Alcaligenes sp. Acrobacter sp

Aeromonas sp. Coxiella burnetii

Burkholderia Pasteurella testundinis Coxiella burnetii Pseudamonas sp. Endozoicomonas elysicola Rickettsiales sp. Proteobacteria Neisseria canis Ralstonia sp.

88

CHAPTER V: CYANOBACTERIA NGS

5.3.3. Bacteria composition of freshwater water samples As the water samples tested were a heterogeneous mix from various sources, in order to obtain an insight into the cyanobacteria composition of environmental water samples, reads from these samples were examined further. Proteobacteria formed the largest proportion (> 40%) of the bacteria observed by all primer pairs (Figure 5.6). As predicted by the in silico analyses (Figure 5.2), the selectivity of the new primers is apparent, with cyanobacteria and proteobacteria forming 98% and 95% of the bacteria observed by the 1328F_1664R and 293F_751R primers respectively. This is in comparison to the universal 16S primers where these two phyla formed only 60% of the total bacteria observed, with actinobacteria and bacteroidetes and verrucomicobia forming major proportions of the remaining 40% of bacteria. Although other bacteria phyla were also observed using the two new primer sets designed in this study, apart from verrucomicrobia (3% of the bacteria observed by the 293F_751R primers), all other bacterial phyla formed less than 1% of the total bacteria observed (Figure 5.6).

1 1.7% 8.2% 0.9

0.8

54.9% 0.7 44.7% 63.2% 0.6

0.5

0.4 14.2%

0.3

Percentage Percentage composition 38.3% 20.1% 0.2 34.1% 0.1 11.1% 5.9% 0 293F_751R 515F_806R 1328F_1664R Actinobacteria Armatimonadetes Bacteroidetes Cyanobacteria Proteobacteria Verrucomicrobia Unassigned;Other Other Phyla < 0.5% Figure 5.6: Percentage phyla composition of the Freshwater samples from sampled dataset, as observed by each primer pair. Phyla with < 0.1% abundance were combined.

89

CHAPTER V: CYANOBACTERIA NGS

5.4. Discussion The purpose of the present study was to design cyanobacteria specific 16S rDNA primers for use with the Ion Torrent PGM sequencer, to allow for rapid detection and identification of cyanobacteria strains from a variety of environmental water samples. To this end, two different primer sets targeting different regions (V3 and V6 regions) of the 16S rDNA were designed and tested to compare their utility. This was done in order to overcome the limitations posed by the use of universal 16S rDNA primers and to increase the depth of coverage of cyanobacteria specific sequences from environmental samples. The availability of a large database (Dall'agnol et al., 2012), makes the 16S rDNA locus ideal for the design of preliminary phyla specific primers for use with N S. In bacteria, this gene generally has nine “hypervariable regions” interspersed within stretches of DNA conserved in most or all species (Chakravorty et al., 2007, Liu et al., 2007, Van de Peer et al., 1996). The primers in this study were designed to span both conserved and variable regions. In silico tests of the cyanobacteria-targeted primers showed high specificity for cyanobacterial sequences, with predicted amplification of at least threefold greater proportion of cyanobacteria sequences than the universal 16S rDNA primers. Thus these primers were expected to provide a greater depth of coverage for cyanobacterial sequences than the universal primers using NGS technology. In addition to the potential to amplify a greater proportion of cyanobacteria DNA, the cyanobacteria-targeted primers developed in this study also amplified a smaller range of phyla. This reduced taxonomic coverage was also apparent in the sampled data from the NGS analysis. In the context of detecting potentially problematic cyanobacteria (or proteobacteria), the increased specificity of the primers would amplify sequences from these taxa when present at lower concentrations, hence allowing for increased detection sensitivity as compared to the universal 16S primers. Breakdown of the phyla amplified experimentally via NGS demonstrated the increased selectivity of the new primers for cyanobacterial sequences as compared to the universal 16S rDNA primers. This increased specificity is essential as in water monitoring situations where there is the need to detect the presence of problematic (toxin or T/O producing) cyanobacteria present at low levels. The ability to enrich for sequences from this phylum may be desirable in such situations, as this will also amplify DNA of such cyanobacteria while they are present in low concentrations. 90

CHAPTER V: CYANOBACTERIA NGS Furthermore, the selectivity of these new primers also increases the sequencing depth of cyanobacteria specific sequences, allowing for the detection of a broader range of orders, as compared to the universal 16S primers. This in turn provides a better picture of the types of cyanobacteria present in the system of interest and the potential to identify potentially problematic cyanobacteria within said system. In addition to the preferential amplification of cyanobacteria sequences, the primers designed in this study were also highly successful in amplifying proteobacteria sequences. This resulted in cyanobacteria and proteobacteria sequences forming the bulk of the sequences amplified by the new primers. Although the new primers showed increased selectivity towards proteobacteria sequences, they did not appear to have increased taxonomic coverage of proteobacteria as compared to the universal primers. This bias for proteobacteria sequences, was evident based on the SILVA TestPrime results, but was not observed in the initial primer specificity test,s or when the primers were tested on cyanobacterial cultures (i.e. sequencing results of the primers did not produce mixed signals) (data not shown). Proteobacteria are the largest and most phenotypically diverse division of prokaryotes, accounting for the majority of known Gram-negative bacteria (Gupta, 2000, Imhoff, 2001, Rosenberg et al., 2014). They form a complex phylum arbitrarily divided into five subdivisions - alpha, beta, gamma, delta and epsilon based on their 16S and 23S rDNA, have varying physiological and phenotypic characteristics and include a large number of known human and animal pathogens (Gupta, 2000, Rosenberg et al., 2014, Parte, 2014, Stackebrandt et al., 1988). A new class of proteobacteria, with a highly specific ecological niche – the zetaproteobacteria, has been described only fairly recently (Emerson et al., 2007). As it is a novel proteobacterial class, despite best efforts to ensure that the latest 16S rDNA database was used, it is possible that the zetaproteobacteria were not included. Additionally, zetaproteobacteria are mainly linked to hydrothermal activity and exposed marine crusts (Rubin-Blum et al., 2014), so it is not unexpected that they were not observed from the samples sequenced, and were thus excluded from our analyses. As such, the potential to utilise these new primers as a tool for detecting potentially pathogenic proteobacteria from mixed communities also exists. For this study, the freshwater samples were obtained mainly from lakes/ponds (lotic water systems). Although this differs from majority of the freshwater metagenomic studies which have been done on river systems (Kolmakova et al., 2014, Schultz Jr et al., 2013, Ghai et al., 2011, Fortunato et al., 2012, Staley et al., 2013, Oh 91

CHAPTER V: CYANOBACTERIA NGS et al., 2011), comparison of microbiota between studies still appears possible, with the data from the universal 16S primers collected during the present study showing a similar composition to that reported by (Staley et al., 2013). Although the primer sets used in the present study targeted different regions of the 16S rDNA, data from all three primer sets corroborated with the results obtained by Fortunato et al. (2012) from the Columbia River, where Proteobacteria dominated the freshwater bacterial communities. Future studies testing an even wider diversity of water sources (e.g. marine samples) can be performed to better determine the applicability and limitations of these primers. Additional studies challenging these new primers using known, mock cyanobacteria communities in a study similar to that performed by (Kermarrec et al., 2013) may be considered. The design and use of specific blocking primers (a modified DNA oligo that preferentially binds to the DNA of a specific organism, preventing the amplification of the targeted DNA) can potentially further increase the detection abilities of the new primers in detecting either cyanobacteria or proteobacteria specific sequences (Vestheim and Jarman, 2008, Gofton et al., 2015). Alternatively, enrichment of the samples for cyanobacteria before DNA extraction, by filtration of the samples through a higher pore size (> 0.2 μm) could help reduce the number of proteobacterial sequences obtained. This may also be used in combination with an initial amplification step using cyanobacteria specific 16S rDNA primers, however, the potential for PCR biases resulting from this additional amplification step cannot be excluded. Additionally, studies have shown that the Ion Torrent has a higher rate of error when sequencing microbial samples (Stewart, 2012, Quail et al., 2012), this may have had an impact on the bacterial compositions observed in the present study. As such, consideration should be made to attempt testing these new primers on an alternative platform (e.g. Illumina’s MiSeq platform) to determine if they are able to provide better taxonomic resolution than what was achieved in the present study. Both primers designed in the study showed increased selectivity for cyanobacteria sequences. However, the 1328F_1664R primer set showed an increased specificity for cyanobacterial sequences, especially sequences from the Nostocales and Oscillatoriales orders as compared to the 293F_751R primer pair. Furthermore, as the hypervariable regions (V3 and V6) targeted by the primer pairs designed in this study are supposed to provide better resolution then the V4 region that the universal 16S 515F_806R target, it is possible with further refinements, these primers could be used to detect specific species of cyanobacteria (or proteobacteria) which are of concern to the water industry. As they are, the increased selectivity and narrower range of phyla 92

CHAPTER V: CYANOBACTERIA NGS targeted by the primers designed in this study, already allows for the pooling of samples from multiple sources (combining DNA from multiple different samples) into a single NGS run, while still maintaining the increased sequencing depth and range of targeted phyla. As compared to traditional methods of cyanobacteria isolation and characterisation, NGS also allows for the identification of cyanobacteria strains that are difficult to obtain in culture. When combined with faster sample processing, this makes NGS significantly cheaper than current traditional morphology-based taxonomic methods of determining species composition of environmental water samples. The ever reducing cost per base of sequencing (Caporaso et al., 2011, Hajibabaei, 2012), together with the reproducibility and potential for standardisation (Eiler et al., 2013) make NGS an ideal tool for bio-monitoring (Bik et al., 2012, Kermarrec et al., 2013, Hadziavdic et al., 2014, Tan et al., 2015). Hence, with the addition of the cyanobacteria-targeted primers designed in this study, the potential of the NGS platform as a means of studying (and monitoring) cyanobacterial diversity is demonstrated.

93

CHAPTER VI: GENERAL DISCUSSION CHAPTER 6. GENERAL DISCUSSION

6.1. Overview Although there have been extensive studies on cyanobacteria in Australia, this has predominantly occurred in Eastern and Southern Australia where cyanobacteria blooms are a regular occurrence (Baker and Humpage, 1994, Mitrovic et al., 2011, Steffensen et al., 1999, Griffiths and Saker, 2003, Hawkins et al., 1985). In contrast, in Western Australia, apart from studies on stromatolites (Garby et al., 2013, Smith et al., 2010) and the Swan River bloom in 2000 (Robson and Hamilton, 2003, Atkins et al., 2001), majority of the studies on cyanobacteria have focused on phytoplankton, nutrient content and toxins (Gordon et al., 1981, Hosja and Deeley, 1994, Huisman et al., 2011, Lund and Davis, 2000, Kemp and John, 2006), with few molecular studies on cyanobacteria conducted leaving an unfilled knowledge gap. From the studies that have been done (Gordon et al., 1981, Hosja and Deeley, 1994, Kemp and John, 2006, Lund and Davis, 2000, Robson and Hamilton, 2003), Oscillatoriales and Chroococcales appear to dominate the inland freshwater bodies in the region. Hence, the aim of this this was to obtain a snapshot of cyanobacteria diversity in Western Australia using conventional methods of isolating, culturing and maintaining cyanobacteria from various freshwater sources in the region and characterising the isolates via polyphasic methods, of which a subset of the results were presented in Chapter 3. As a consequence of the irregularities encountered in implementing widely adopted phylogeny assignment methods, it was decided to assess the effects of the different strategies (alignment method, curation, tree reconstruction algorithms) employed to construct phylogenetic trees (Chapter 4). Finally, in order to facilitate future cyanobacterial studies, bypassing the need for time consuming cyanobacteria isolation and allowing for detection of unculturable cyanobacterial strains, novel cyanobacteria-targeted primers were designed and evaluated using the latest state-of- the-art NGS technologies (Chapter 5).

6.2. Cyanobacteria classification The work presented in chapter three examined the concordance between molecular and morphological methods of cyanobacteria identification. The difficulties and discrepancies involved in morphological identification of cyanobacteria were demonstrated, with duplicate examinations providing agreement for a total of ten isolates at the species (3 isolates) and genus (7 isolates) levels. 94

CHAPTER VI: GENERAL DISCUSSION At the 16S rDNA locus, the use of 98% sequence similarity for species and 95% sequence similarity for genus delimitation as suggested by Stackebrandt and Goebel (1994) was further supported by the findings of Gaget et al. (2015) who reported that the species barrier for cyanobacteria was at 98.5% similarity. Using either of these criterions, Baldwin Park type 2 would still be of interest as it had only a 94% sequence similarity to its closest matched isolate (A. oscillariodes). This is further supported by the low percentage similarity observed at the rpoC1 locus (86% similarity to Anabaena sp.) and genus level morphological identification. The potential presence of unexplored cyanobacteria varieties in Western Australia is also indicated by the presence of sequences (at the rpoC1 or cpcBA-IGS loci) with less than 95% similarity to sequences available on GenBank (22% of the total number of sequences obtained). However, as the low percentage similarities between isolate and reference sequences were found in the less studied rpoC1 and cpcBA-IGS loci, this can also be attributed to sequence paucity and limited variety of cyanobacteria for which theses loci have been sequenced. Hence, the possibility that these cyanobacteria isolates are simply cosmopolitan strains, which have yet to be well studied (because they are neither toxic nor bloom causing), cannot be discounted. The use of a polyphasic approach for the identification of cyanobacteria as performed in Chapter 3 has been proposed and advocated for by other researchers, and is currently the most accepted method of cyanobacteria identification (Robertson et al., 2001, Litvaitis, 2002, Seo and Yokota, 2003, Komárek, 2006, Sciuto and Moro, 2015). This study, together with a number of others, demonstrates the inconsistencies between phylogenetic and morphological classifications using a polyphasic approach (Litvaitis, 2002, Robertson et al., 2001, Seo and Yokota, 2003, Moro et al., 2010). Reasons for the discrepancies observed include the absence of/variation in identifying features used for conclusive morphological identification of cyanobacteria strains; the differences in number of sequences available for each of the loci studied; the variety of cyanobacteria species represented for each locus; and the effects of the alignment and phylogenetic strategies used. Additionally, as multiple loci were utilised in the study, the option for concatenation of the sequences exists. However, this could not be performed in the study as majority of the cyanobacteria strains only had sequences for two of the three loci studied. Hence, until comprehensive datasets for all (or commonly utilised) loci used for phylogenetic reconstructions of cyanobacteria are available, both polyphasic and solely molecular based methods of cyanobacterial classification and identification will potentially remain difficult. 95

CHAPTER VI: GENERAL DISCUSSION 6.3. Cyanobacteria systematics Given the importance of knowing the systematics of organisms and the inaccuracies found in uncurated public sequence repositories, phylogenetic reconstruction strategies are essential for determining the evolutionary relationships between isolates. However, options chosen during the reconstruction process (alignment algorithm and tree building method) may affect the inferences obtained. This was briefly discussed in Chapter 3 and further examined in the following chapter. As part of this study, an easily applicable method to determine the effects of the options employed for each step of the reconstruction was developed. Although distinct separations as observed on the PCA plots, which were used as indicators of tree congruence between phylogenetic reconstruction strategies, when the effect of tree reconstruction method was examined, the more conserved loci (16S rDNA and rpoC1) were sufficiently robust to produce consistent results regardless of strategy used. However, the variable regions, which have the potential to provide better (species or strain rather than genus) resolution, requires prior information and greater consideration when used to reconstruct phylogenies. A possible strategy to obtain the maximum amount of information from these variable regions may be to limit the reference sequences included for phylogeny reconstruction to those within the same order or genera as the query sequences (Otsuka et al., 1999, Suda et al., 2002, Moro et al., 2010, Sciuto et al., 2012), as opposed to the inclusion of majority of the cyanobacterial orders within a single tree, as done in the thesis. In addition, contrary to prior studies, the choice of tree building method, rather than the alignment method utilised, had the greatest effect on the resultant phylogeny (Morrison and Ellis, 1997, Lindgren and Daly, 2007, Liu et al., 2009, Löytynoja, 2012). However, this may have been due to the strategy employed in the study – whereby the phylogenies generated were compared against each other, rather than against a known “prototype tree. Even so, the differences in resultant tree topologies using alternative strategies, demonstrate how each option can affect the inferences made. As multiple plausible tree topologies can be obtained from a selection of homologous sequences, the implicit reliance of most researchers on the alignments obtained from a single algorithm (Landan and Graur, 2009) means that many other possibilities are missed. Furthermore, most MSAs available are designed to align the DNA sequences for globular and folded protein (conserved) domains (Thompson et al.,

96

CHAPTER VI: GENERAL DISCUSSION 2011), making alignments of ITS (variable) regions problematic due to the limited number of character states available (4) when aligning DNA sequences. Although default penalty scores may introduce biases and reconstructions that do not necessarily reflect the true phylogenetic relationships, alignment algorithm default parameters are usually suitable for aligning a wide range of loci and, are reasonable options where no prior evolutionary knowledge is available (Landan and Graur, 2009). Alternatively, rather than comparing reconstructed phylogenies of individual genes, concatenated gene trees may provide a possible alternative to resolving conflicting phylogenetic reconstructions. However, although trees constructed using concatenated genes have high bootstrap values, they have been shown to be inaccurate, or have weak agreement with individual gene trees (Thiergart et al., 2014, Seo, 2008). Hence, rather than concatenation of sequences, “combining log-likelihood scores” from separate analyses, together with the use of two-step bootstrap procedures for both ML and distance phylogenetic analyses have been suggested to be more suitable when concatenating sequence data (Seo, 2008, Seo et al., 2005). Despite recent advances in reconstruction methods and strategies, until comprehensive datasets for commonly utilised loci are available, molecular methods of cyanobacteria identification, as determined by agreement between multiple loci, will potentially remain difficult regardless of strategies adopted.

6.4. Cyanobacteria-targeted primers for NGS technologies NGS is being increasingly utilised to catalogue microbial diversity from various environments through projects such as the Earth Microbiome project (EMP) (Gilbert et al., 2010) and the (HMP) (2012). Although universal, universal primers also have biases (Gilbert et al., 2014), hence using universal profiling primers for detecting specific taxa (e.g. in the case of drinking water monitoring, where early detection of problematic organisms is essential) may result in the targeted organisms going undetected for reasons such as i) low abundance, ii) inefficient amplification and/or iii) limited sequencing depth. In order to overcome these limitations cyanobacteria-specific 16S rDNA primers for use with the Ion Torrent PGM sequencer (or other platforms) were designed. Both in silico and experimental data (Chapter 5) indicated that the new primers performed as intended, amplifying a smaller range of phyla (mainly cyanobacteria and proteobacteria) and consequently 97

CHAPTER VI: GENERAL DISCUSSION providing a greater sequencing depth for these phyla (with cyanobacteria and proteobacteria forming > 50% and 40% respectively of the sequences recovered), when compared to the universal 16S primers (where cyanobacteria and proteobacteria combined formed less than 50% of the sequences recovered). As such, the increased selectivity of the 16S rDNA primers designed in the study allow for detection of cyanobacteria (and proteobacteria) even when present in low abundance, while simultaneously increasing the depth coverage of the targeted phyla. These primers, hence, provide a starting point for the further development and refinement of cyanobacteria-targeted primers for future NGS studies. Further refinements to the primers (either via modifying primer sequence or amplification conditions) can be done to increase their specificity for cyanobacteria (or proteobacteria). Additionally, other considerations for the use of these primers in water monitoring situations that require further investigation include: i) Determining biases due to DNA extraction and PCR amplification methods, especially, when attempting to determine taxonomic diversity of environmental samples (Deiner et al., 2015). ii) Inclusion of controls to determine thresholds for constructing OTUs, to obtain reliable estimates of sequencing errors and to determine a threshold number of sequences for accepting a species presence in a sample. iii) Challenging the new primers on a wider diversity of water sources (e.g. marine samples) to better determine their applicability and limitations. Finally for water monitoring purposes, strategies allowing for increased inter-laboratory comparisons of NGS data, long-term NGS data collection to obtain baseline microbial community compositions, and establishment of microbial community databases for metagenomic studies should also be considered (Tan et al., 2015). The potential of NGS as a bio-monitoring tool is still in its infancy, with majority of the studies performed using universal primers (usually 16S or 18S rDNA primers) for detection of organisms present within the ecosystems studied (Sinclair et al., 2015, Takahashi et al., 2014, Zhan et al., 2013, Kermarrec et al., 2013, Xiao et al., 2014, Hugerth et al., 2014, Hadziavdic et al., 2014, Klindworth et al., 2013, Gofton et al., 2015, Bik et al., 2012, Caporaso et al., 2011), it is hoped that the introduction of the phyla specific primers designed in the study will not only allow for an increased level of 98

CHAPTER VI: GENERAL DISCUSSION detection of the targeted phyla, but would also potentially increase our understanding of the factors governing the seasonal changes in species composition of cyanobacteria (or proteobacteria) via the ability to carry out larger scaled (and potentially more intensive) sampling strategies.

6.5. Future directions Despite the many revisions to cyanobacteria classification and nomenclature (Komárek, 2010b, Komárek, 2010a, Strunecky et al., 2011, Wacklin et al., 2009), species identification of cyanobacteria still remains difficult with further changes anticipated (Castenholz and Norris, 2005, Komárek et al., 2014, Dvořák et al., 2015). As with the eukaryotes (Bik et al., 2012), inconsistencies in lower taxonomic classification of cyanobacteria limit the accurate identification of new/environmental isolates. Although highly recommended and being accepted as the preferred method of cyanobacteria identification (Sciuto and Moro, 2015), polyphasic identification is still difficult. The reliance on preformed databases together with the paucity and/or accuracy of sequences from other (non 16S rDNA) loci, limits the accuracy of BLAST-search- based taxonomy and phylogenetic inferences obtainable for cyanobacteria genera that have not been as well studied, consequently, consensus identification with the inclusion of these loci is difficult, as demonstrated in Chapter 3. However, with the expected expansion and curation of sequences available in databases, allowing for increased identification of misidentified sequences, together with the increasing number of genomes being sequenced (e.g. Shih et al. (2013)) and improved phylogenetic reconstruction strategies, identification of cyanobacteria via polyphasic methods can only become much easier in the future. Furthermore, the applicability of alternative technologies (e.g. mass spectrometry (MS) or satellite remote sensing) to various aspects of cyanobacteria identification, bloom and toxin detection have been demonstrated (Zwiener and Frimmel, 2004, Welker and Moore, 2011, Welker et al., 2002, Welker and Erhard, 2007, Fenselau and Demirev, 2001, Esquenazi et al., 2008, de Albuquerque et al., 2007, Kahru and Elmgren, 2014, Moreira et al., 2014). Integration of these technologies into future polyphasic cyanobacteria characterisation studies, together with the development of a standardised and regularly updated sequence database(s) can greatly improve our ability to accurately identify and monitor environmental cyanobacteria. As mentioned previously, traditional methods of cyanobacteria isolation and characterisation is time consuming. Furthermore, this process only allows for 99

CHAPTER VI: GENERAL DISCUSSION determination of cyanobacterial strains that can grow in culture, leaving a vast majority of cyanobacteria undiscovered and undescribed. In contrast, NGS is culture independent, thereby providing the opportunity to directly analyse communities as they exist under in situ conditions (Claesson et al., 2010, Eiler et al., 2013). Additionally, NGS also has the ability to rapidly process multiple samples (Caporaso et al., 2011, Hajibabaei, 2012), making entire genome approaches to cyanobacteria phylogeny reconstruction increasingly affordable (Shih et al., 2013, Soo et al., 2014) and allowing metagenomic profiling of complex microbial communities to be increasingly cost- effective, enabling its use as a screening tool to complement or replace traditional diagnostic methods (Tan et al., 2015). Despite the great potential for NGS in the monitoring, detection and identification of cyanobacteria, some limitations still exist. Primary amongst the limitations to NGS is the short amplicon size obtainable (Hugerth et al., 2014). The short reads currently obtainable place a limit on the degree of identity resolution, as longer amplicon length usually allow for increased phylogenetic resolution, due to the presence of species-specific variations in the variable regions. However, with the new upgrades to current NGS technology and introduction of third generation sequencers (e.g. MinION, Oxford Nanopore Technologies; Sequel System, Pacific BioScience), read lengths exceeding 1,000bp are becoming possible. Hence, short read lengths may no longer be a limitation in the future. Even so, due to the nature of NGS and the use of universal primers, targeted taxa present in low abundance can still easily remain undetected, or only have very few sequencing reads. Hence, in order to detect organisms present in low abundance, alternative strategies are needed. The use of blocking primers as demonstrated by Gofton et al. (2015) can be used to eliminate overabundant sequences-organisms, allowing for detection of other sequences present in the community studied. Alternatively, when a target phylum is desired, the design and use of phylum-specific primers (such as the primers developed in Chapter 5) may be the solution to increasing sequence coverage. However, as with characterisation and identification of unknown cyanobacteria via traditional PCR methods, NGS is also dependent on the availability of sequences present in the database against which sequence reads are matched. As such, the ability of this technology to accurately and confidently assign taxonomic identifications to the sequences obtained is dependent on the information available – drawing parallels to the difficulties faced when performing polyphasic characterisation of cyanobacteria. On the other hand, whole genome sequencing of reference sample using NGS provides a way 100

CHAPTER VI: GENERAL DISCUSSION for researchers to easily sequence multiple loci of interest (as performed by Shih et al. (2013)), this in turn would rapidly increase the amount of data available in reference databases, benefiting both traditional and NGS methods of cyanobacteria identification. The introduction of various technologies (NGS, MS, satellite remote sensing) has provided the ability to easily detect cyanobacteria, even when they are present in low numbers, allowing for the mitigation of cyanobacterial blooms. However, as these technologies are still relatively new, with immature databases, traditional methods of identification still remains the tool of choice. In due time, the potential for these technologies to supersede traditional methods of identification exists. Even so, this does not mean that the lengthy process of isolating cyanobacteria from environmental sources, followed by both morphological and molecular characterisation is no longer necessary Ultimately, in order to fully understand cyanobacteria diversity, evaluation of both morphological and molecular variability is necessary. Additionally, molecular data alone has limited capacity to recognise ecological importance of different genotypes (Komárek, 2006). Cyanobacteria are also known to produce bioactive compounds with promising anti-carcinogenic and anti-disease applications (Sciuto et al., 2012, Gerwick et al., 2008); and allochemicals with the potential for use in the development of algaecides, herbicides and insecticides (Berry et al., 2008, Sciuto and Moro, 2015). In order to fully access the potential of these compounds, it is still necessary to isolate and perform further studies on the cyanobacteria strains found to produce such chemicals. As such, polyphasic methods of novel cyanobacteria characterisation is likely to remain the standard, however, these step could, in the future, potentially be preceded by NGS analysis of the environmental sample to determine if the organisms of interest are present before the process of cyanobacteria isolation is carried out.

6.6. Conclusion The work presented provides a preliminary snapshot into molecular and morphological diversity of cyanobacteria found in Western Australia. The identification of two potentially novel strains of cyanobacteria, together with sequences with relatively low similarity to previously published sequences, demonstrates the as yet undiscovered cyanobacteria diversity in the region. Further studies using the cyanobacteria strains (e.g. sequencing of the entire 16S rDNA, whole genome sequencing or morphologically identifying/characterising all the strains isolated) will give us a better idea of the cyanobacteria within this region. 101

CHAPTER VI: GENERAL DISCUSSION Although well documented by other researchers, the work presented in this thesis demonstrated the difficulties encountered when using the traditional (morphological) method of cyanobacteria identification. When combined with the use of multiple DNA loci to characterise individual cyanobacteria strains, the limitations to the polyphasic approach is further demonstrated. Hence, calling into question the need for species level identification (Komárek, 2003), especially in the context of water monitoring situations where genus level identification is usually sufficient. The difficulties in obtaining consensus identification from multiple loci were further studied, with the effect of phylogenetic reconstruction strategies examined. In the process, a method to quantify the differences in phylogeny obtained using different reconstruction strategies was developed and provides a potential tool to score experimentally –obtained outputs, especially at less validated loci. The initial, successful development and validation of cyanobacteria specific NGS primers provides a rapid and cost effective tool for monitoring cyanobacteria in environmental samples. The primers designed in this study provide a starting point for increasing the selectivity and sequencing depth of cyanobacteria sequences environmental samples. With further refinements, the specificity of these primers for cyanobacteria sequences can be improved upon, potentially paving the way for multiplexed sequencing through the combination of the primers with currently available (or new) function specific (e.g. toxin or terpene gene targeted) primers. This would thus, allow for the detection of problematic organisms, and implementation of pre-emptive measures before the problem occurs.

102

REFERENCES REFERENCES

2012. A framework for human microbiome research. Nature, 486, 215-221. AGAPOW, P.-M. & PURVIS, A. 2002. Power of eight tree shape statistics to detect nonrandom diversification: A comparison by simulation of two models of cladogenesis. Systematic Biology, 51, 866-872. AL-TEBRINEH, J., GEHRINGER, M. M., AKCAALAN, R. & NEILAN, B. A. 2011. A new quantitative PCR assay for the detection of hepatotoxigenic cyanobacteria. Toxicon, 57, 546-554. ANAGNOSTIDIS, K. 2001. Nomenclatural changes in cyanoprokaryotic order Oscillatoriales. Preslia (Prague), 73, 359-375. ANGERT, E. R. 2005. Alternatives to binary fission in bacteria. Nature Reviews , 3, 214-224. ARFI, R., BOUVY, M., CECCHI, P., PAGNAO, M. & THOMAS, S. 2001. Factors limiting phytoplankton productivity in 49 shallow reservoirs of North Côte d'Ivoire (West Africa). Aquatic Ecosystem Health & Management, 4, 123-138. ATECH 2000. Cost of algal blooms, Canberra, Australia, Land and Water Resources Research and Development Corporation. ATKINS, R., ROSE, T., BROWN, R. S. & ROBB, M. 2001. The Microcystis cyanobacteria bloom in the Swan River - February 2000. Water Science & Technology, 43, 107-114. BAKER, P. & HUMPAGE, A. 1994. Toxicity associated with commonly occurring cyanobacteria in surface waters of the Murray-Darling Basin, Australia. Marine and Freshwater Research, 45, 773-786. BARRINGTON, D. J. & GHADOUANI, A. 2008. Application of Hydrogen Peroxide for the Removal of Toxic Cyanobacteria and Other Phytoplankton from Wastewater. Environmental Science & Technology, 42, 8916-8921. BAUZA, L., AGUILERA, A., ECHENIQUE, R., ANDRINOLO, D. & GIANNUZZI, L. 2014. Application of hydrogen peroxide to the control of eutrophic lake systems in laboratory assays. Toxins (Basel), 6, 2657-75. BELLINGER, E. G. & SIGEE, D. C. 2010. Freshwater Algae: identification and use as bioindicators, Wiley-Blackwell. BERGSLAND, K. J. & HASELKORN, R. 1991. Evolutionary relationships among eubacteria, cyanobacteria, and chloroplasts: evidence from the rpoC1 gene of Anabaena sp. strain PCC 7120. Journal of Bacteriology, 173, 3446-3455. BERNARD, C., FROSCIO, S., CAMPBELL, R., MONIS, P., HUMPAGE, A. & FABBRO, L. 2011. Novel toxic effects associated with a tropical Limnothrix/Geitlerinema-like cyanobacterium. Environmental Toxicology, 26, 260-270. BERRENDERO, E., PERONA, E. & MATEO, P. 2008. Genetic and morphological characterization of Rivularia and Calothrix (Nostocales, Cyanobacteria) from running water. International Journal of Systematic and Evolutionary Microbiology, 58, 447-460. BERRY, J., GANTAR, M., PEREZ, M., BERRY, G. & NORIEGA, F. 2008. Cyanobacterial Toxins as Allelochemicals with Potential Applications as Algaecides, Herbicides and Insecticides. Marine Drugs, 6, 117. BIK, H. M., PORAZINSKA, D. L., CREER, S., CAPORASO, J. G., KNIGHT, R. & THOMAS, W. K. 2012. Sequencing our way towards understanding global eukaryotic biodiversity. Trends in Ecology & Evolution, 27, 233-243. BITTENCOURT-OLIVEIRA, M. D., MASSOLA, N. S., HERNANDEZ-MARINE, M., ROMO, S. & MOURA, A. D. 2007a. Taxonomic investigation using DNA 103

REFERENCES fingerprinting in Geitlerinema species (Oscillatoriales, Cyanobacteria). Phycological Research, 55, 214-221. BITTENCOURT-OLIVEIRA, M. D., MOURA, A. D., GOUVEA-BARROS, S. & PINTO, E. 2007b. HIP1 DNA fingerprinting in Microcystis panniformis (Chroococcales, Cyanobacteria). Phycologia, 46, 3-9. BITTENCOURT-OLIVEIRA, M. D. & PICCIN-SANTOS, V. 2012. Genetic Diversity of Brazilian Cyanobacteria Revealed by Phylogenetic Analysis. In: CALISKAN, M. (ed.) Genetic Diversity in Microorganisms. InTech. BLANKENSHIP, R. E. 1992. Origin and early evolution of photosynthesis. Photosynthesis Research, 33, 91-111. BOLCH, C. J. S., BLACKBURN, S. I., NEILAN, B. A. & GREWE, P. M. 1996. Genetic characterisation of strains of cyanobacteria using PCR-RFLP of the cpcBA intergenic spacer and flanking reagions. Journal of Phycology, 32, 445- 451. BOLD, H. C. & WYNNE, M. J. 1985. Introduction to the algae: structure and reproduction, Englewood Cliffs, Prentice-Hall. BOYER, S. L., FLECHTNER, V. R. & JOHANSEN, J. R. 2001. Is the 16S-23S rRNA internal transcribed spacer region a good tool for use in molecular systematics and population genetics? A case study in cyanobacteria. Molecular Biology and Evolution, 18, 1057-1069. BROOKES, J. D., BAKER, P. D. & BURCH, M. D. 2002. Ecology and management of cyanobacteria in rivers and reservoirs. Blue-green algae: their significance and management within water supplies. 4 ed. Salisbury: Cooperative Research Centre for Water Quality and Treatment BROSIUS, J., PALMER, M. L., KENNEDY, P. J. & NOLLER, H. F. 1978. Complete nucleotide sequence of a 16S ribosomal RNA gene from . Proceedings of the National Academy of Sciences of the United States of America, 75, 4801-4805. BURCH, M., CHOW, C. W. K. & HOBSON, P. 2002. Algicides for control of toxic cyanobacteria. Blue-green algae : their signific ance and management within water supplies. 4 ed. Salisbury: Cooperative Research Centre for Water Quality Treatment. CAPONE, D. G., ZEHR, J. P., PAERL, H. W., BERGMAN, B. & CARPENTER, E. J. 1997. Trichodesmium, a globally significant marine cyanobacterium. Science, 276, 1221-1229. CAPORASO, J. G., KUCZYNSKI, J., STOMBAUGH, J., BITTINGER, K., BUSHMAN, F. D., COSTELLO, E. K., FIERER, N., PENA, A. G., GOODRICH, J. K., GORDON, J. I., HUTTLEY, G. A., KELLEY, S. T., KNIGHTS, D., KOENIG, J. E., LEY, R. E., LOZUPONE, C. A., MCDONALD, D., MUEGGE, B. D., PIRRUNG, M., REEDER, J., SEVINSKY, J. R., TUMBAUGH, P. J., WALTERS, W. A., WIDMANN, J., YATSUNENKO, T., ZANEVELD, J. & KNIGHT, R. 2010. QIIME allows analysis of high- throughput community sequencing data. Nature Methods, 7, 335-336. CAPORASO, J. G., LAUBER, C. L., WALTERS, W. A., BERG-LYONS, D., HUNTLEY, J., FIERER, N., OWENS, S. M., BETLEY, J., FRASER, L., BAUER, M., GORMLEY, N., GILBERT, J. A., SMITH, G. & KNIGHT, R. 2012. Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME Journal, 6, 1621-1624. CAPORASO, J. G., LAUBER, C. L., WALTERS, W. A., BERG-LYONS, D., LOZUPONE, C. A., TURNBAUGH, P. J., FIERER, N. & KNIGHT, R. 2011. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences, 108, 4516-4522. 104

REFERENCES CARMICHAEL, W. W. 1998. Algal Poisoning In: AIELLO, S. (ed.) The Merck Veterinary Manual. Whitehouse Station, New Jersey: Merck & Co., Inc. CARMICHAEL, W. W. 2001. Health Effects of Toxin-Producing Cyanobacteria: The CyanoHABs. Human and Ecological Risk Assessment, 7, 1393-1407. CARMICHAEL, W. W., AN, J. S., AZEVEDO, S. M. F. O., EAGLESHAM, G. K., JOCHIMSEN, E. M., LAU, S., MOLICA, R. J. R., RINEHART, K. L. & SHAW, G. R. 2001. Human Fatalities from Cyanobacteria: Chemical and Biological Evidence for Cyanotoxins. Environmental Health Perspectives, 109, 663. CASAMATTA, D. A., VIS, M. L. & SHEATH, R. G. 2003. Cryptic species in cyanobacterial systematics: a case study of Phormidium retzii (Oscillatoriales) using RAPD molecular markers and 16S rDNA sequence data. Aquatic Botany, 77, 295-309. CASTENHOLZ, R. W. 2001. Phylum BX. Cyanobacteria. In: BOONE, D. R., CASTENHOLZ, R. W. & GARRITY, G. M. (eds.) Bergey’s Manual of Systematic Bacteriology. 2 ed. New York: Springer. CASTENHOLZ, R. W. 2015a. Cyanobacteria. In: WHITMAN, W. B. (ed.) Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. CASTENHOLZ, R. W. 2015b. General Characteristics of the Cyanobacteria. In: WHITMAN, W. B. (ed.) Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. CASTENHOLZ, R. W. 2015c. Oxygenic Photosynthetic Bacteria. In: WHITMAN, W. B. (ed.) Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. CASTENHOLZ, R. W. & NORRIS, T. B. 2005. Revisionary concepts of species in the cyanobacteria and their applications. Archiv für Hydrobiologie Supplement, 159, 53-69. CASTENHOLZ, R. W., RIPPKA, R., HERDMAN, M. & WILMOTTE, A. 2015. Subsection III. Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. CASTENHOLZ, R. W. & WATERBURY, J. 1989. Introduction to Cyanobacteria. In: KRIEG, N. R. & HOLT, J. G. (eds.) Bergey's Manual of Systematic Bacteriology Baltimore, Md: Williams & Wilkins. CASTRESANA, J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular Biology and Evolution, 17, 540- 552. CHAKRAVORTY, S., HELB, D., BURDAY, M., CONNELL, N. & ALLAND, D. 2007. A detailed analysis of 16S ribosomal RNA gene segments for the diagnosis of . Journal of Microbiological Methods, 69, 330- 339. CHORUS, I. & BARTRAM, J. 1999. Toxic Cyanobacteria in Water: A Guide to Their Public Health Consequences, Monitoring and Management, London, World Health Organisation, E & FN Spon. CHUDLEIGH, P. D., CROOME, R. & SIMPSON, S. 2000. The National Eutrophication Management Program: A Review, Land and Water Resources Research and Development Corporation. CLAESSON, M. J., WANG, Q., O'SULLIVAN, O., GREENE-DINIZ, R., COLE, J. R., ROSS, R. P. & O'TOOLE, P. W. 2010. Comparison of two next-generation sequencing technologies for resolving highly complex microbiota composition using tandem variable 16S rRNA gene regions. Nucleic Acids Research, 38, e200.

105

REFERENCES CODD, G. A. 2000. Cyanobacterial toxins, the perception of water quality, and the prioritisation of eutrophication control. Ecological Engineering, 16, 51-60. CODD, G. A., MORRISON, L. F. & METCALF, J. S. 2005. Cyanobacterial toxins: risk management for health protection. Toxicology and Applied Pharmacology, 203, 264-272. COENYE, T. & VANDAMME, P. 2003. Intragenomic heterogeneity between multiple 16S ribosomal RNA operons in sequenced bacterial genomes. FEMS Microbiology Letters, 228, 45-49. COHEN, Y. & GUREVITZ, M. 2006. The Cyanobacteria, Ecology, Physiology and Molecular Genetics. In: DWORKIN, M., FALKOW, S., ROSENBERG, E., SCHLEIFER, K.-H. & STACKEBRANDT, E. (eds.) The Prokaryotes. New York: Springer USA. COX, P. A., BANACK, S. A., MURCH, S. J., RASMUSSEN, U., TIEN, G., BIDIGARE, R. R., METCALF, J. S., MORRISON, L. F., CODD, G. A. & BERGMAN, B. 2005. Diverse taxa of cyanobacteria produce beta-N- methylamino-L-alanine, a neurotoxic amino acid. Proceedings of the National Academy of Sciences of the United States of America, 102, 5074-5078. CREER, S., FONSECA, V. G., PORAZINSKA, D. L., GIBLIN-DAVIS, R. M., SUNG, W., POWER, D. M., PACKER, M., CARVALHO, G. R., BLAXTER, M. L., LAMBSHEAD, P. J. D. & THOMAS, W. K. 2010. Ultrasequencing of the meiofaunal biosphere: practice, pitfalls and promises. Molecular Ecology, 19, 4- 20. DALL'AGNOL, L. T., GHILARDI-JUNIOR, R., MCCULLOCH, J. A., SCHNEIDER, H., SCHNEIDER, M. P. C. & SILVA, A. 2012. Phylogenetic and gene trees of Synechococcus: Choice of the right marker to evaluate the population diversity in the Tucuruí Hydroelectric Power Station Reservoir in Brazilian Amazonia. Journal of Plankton Research, 34, 245-257. DANIELS, O., FABBRO, L. & MAKIELA, S. 2014. The Effects of the Toxic Cyanobacterium Limnothrix (Strain AC0243) on Bufo marinus Larvae. Toxins, 6, 1021-1035. DARRIBA, D., TABOADA, G., DOALLO, R. & POSADA, D. 2012. jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9, 772. DE ALBUQUERQUE, E. C., MELO, L. F. C. & FRANCO, T. T. 2007. Use of solid- phase extraction, high-performance liquid chromatography, and MALDI-TOF identification for D-Leu(1) MCYST-LR analysis in treated water: Validation of the analytical methodology. Canadian Journal of Analytical Sciences and Spectroscopy, 52, 1-10. DE BRUYN, A., MARTIN, D. P. & LEFEUVRE, P. 2014. Phylogenetic Reconstruction Methods: An Overview. In: BESSE, P. (ed.) Molecular : Methods and Protocols. Totowa: Humana Press Inc. DE LA CRUZ, A. A., HISKIA, A., KALOUDIS, T., CHERNOFF, N., HILL, D., ANTONIOU, M. G., HE, X., LOFTIN, K., O'SHEA, K., ZHAO, C., PELAEZ, M., HAN, C., LYNCH, T. J. & DIONYSIOU, D. D. 2013. A review on cylindrospermopsin: the global occurrence, detection, toxicity and degradation of a potent cyanotoxin. Environmental Science: Processes & Impacts, 15, 1979- 2003. DEINER, K., WALSER, J. C., MACHLER, E. & ALTERMATT, F. 2015. Choice of capture and extraction methods affect detection of freshwater biodiversity from environmental DNA. Biological Conservation, 183, 53-63. DEREEPER, A., GUIGNON, V., BLANC, G., AUDIC, S., BUFFET, S., CHEVENET, F., DUFAYARD, J.-F., GUINDON, S., LEFORT, V., LESCOT, M.,

106

REFERENCES CLAVERIE, J.-M. & GASCUEL, O. 2008. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Research, 36, W465-W469. DESANTIS, T. Z., HUGENHOLTZ, P., LARSEN, N., ROJAS, M., BRODIE, E. L., KELLER, K., HUBER, T., DALEVI, D., HU, P. & ANDERSEN, G. L. 2006. Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB. Applied and Environmental Microbiology, 72, 5069- 5072. DI RIENZI, S. C., SHARON, I., WRIGHTON, K. C., KOREN, O., HUG, L. A., THOMAS, B. C., GOODRICH, J. K., BELL, J. T., SPECTOR, T. D., BANFIELD, J. F. & LEY, R. E. 2013. The human gut and groundwater harbor non-photosynthetic bacteria belonging to a new candidate phylum sibling to Cyanobacteria. eLife, 2, e01102. DIONYSIOU, D. 2010. Overview: Harmful algal blooms and natural toxins in fresh and marine waters – Exposure, occurrence, detection, toxicity, control, management and policy. Toxicon, 55, 907-908. DITTMANN, E., FEWER, D. P. & NEILAN, B. A. 2013. Cyanobacterial toxins: biosynthetic routes and evolutionary roots. FEMS Microbiology Reviews, 37, 23-43. DOHRMANN, M., JANUSSEN, D., REITNER, J., COLLINS, A. G. & WÖRHEIDE, G. 2008. Phylogeny and Evolution of Glass Sponges (Porifera, Hexactinellida). Systematic Biology, 57, 388-405. DOKULIL, M. & TEUBNER, K. 2000. Cyanobacterial dominance in lakes. Hydrobiologia, 438, 1-12. DRESS, A. W. M., FLAMM, C., FRITZSCH, G., GRUENEWALD, S., KRUSPE, M., PROHASKA, S. J. & STADLER, P. F. 2008. Noisy: Identification of problematic columns in multiple sequence alignments. Algorithms for Molecular Biology, 3. DRIKAS, M., CHOW, C. W. K., HOUSE, J. & BURCH, M. D. 2001. Toxic cyanobacteria. Journal American Water Works Association, 93, 100-111. DROBAC, D., TOKODI, N., SIMEUNOVIĆ, J., BALTIĆ, V., STANIĆ, D. & SVIRČEV, Z. 2013. Human Exposure to Cyanotoxins and Their Effects on Health. Arhiv za higijenu rada i toksikologiju, 64, 305-315. DVOŘÁK, P., HINDÁK, F., HAŠLER, P., HINDÁKOVA, A. & POULICOVÁ, A. 2014. Morphological and molecular studies of Neosynechococcus sphagnicola, gen. et sp. nov. (Cyanobacteria, Synechococcales). 2014, 170, 11. DVOŘÁK, P., POULÍČKOVÁ, A., HAŠLER, P., BELLI, M., CASAMATTA, D. A. & PAPINI, A. 2015. Species concepts and speciation factors in cyanobacteria, with connection to the problems of diversity and classification. Biodiversity and Conservation, 24, 739-757. DYBLE, J., PAERL, H. W. & NEILAN, B. A. 2002. Genetic characterization of Cylindrospermopsis raciborskii (Cyanobacteria) isolates from diverse geographic origins based on nifH and cpcBA-IGS nucleotide sequence analysis. Applied and Environmental Microbiology, 68, 2567-2571. EDDY, S. R. & DURBIN, R. 1994. RNA sequence-analysis using covariance-models. Nucleic Acids Research, 22, 2079-2088. EDGAR, R. C. 2004. MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 1-19. EDGAR, R. C. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26, 2460-2461. EDGAR, R. C. 2013. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods, 10, 996-998.

107

REFERENCES EDGAR, R. C., HAAS, B. J., CLEMENTE, J. C. & KNIGHT, R. 2011. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27, 2194- 21200. EILER, A., DRAKARE, S., BERTILSSON, S., PERNTHALER, J., PEURA, S., ROFNER, C., SIMEK, K., YANG, Y., ZNACHOR, P. & LINDSTRÖM, E. S. 2013. Unveiling Distribution Patterns of Freshwater Phytoplankton by a Next Generation Sequencing Based Approach. PLoS ONE, 8, e53516. EMERSON, D., RENTZ, J. A., LILBURN, T. G., DAVIS, R. E., ALDRICH, H., CHAN, C. & MOYER, C. L. 2007. A Novel Lineage of Proteobacteria Involved in Formation of Marine Fe-Oxidizing Microbial Mat Communities. PLoS ONE, 2, e667. ENGENE, N., CHOI, H., ESQUENAZI, E., ROTTACKER, E. C., ELLISMAN, M. H., DORRESTEIN, P. C. & GERWICK, W. H. 2011. Underestimated biodiversity as a major explanation for the perceived rich secondary metabolite capacity of the cyanobacterial genus Lyngbya. Environmental Microbiology, 13, 1601- 1610. ERHARD, M., VONDOHREN, H. & JUNGBLUT, P. 1997. Rapid typing and elucidation of new secondary metabolites of intact cyanobacteria using MALDI- TOF mass spectrometry. Nature Biotechnology, 15, 906-909. ESQUENAZI, E., COATES, C., SIMMONS, L., GONZALEZ, D., GERWICK, W. H. & DORRESTEIN, P. C. 2008. Visualizing the spatial distribution of secondary metabolites produced by marine cyanobacteria and sponges via MALDI-TOF imaging. Molecular Biosystems, 4, 562-570. FALCON, L. I., MAGALLON, S. & CASTILLO, A. 2010. Dating the cyanobacterial ancestor of the chloroplast. ISME Journal, 4, 777-783. FALCONER, I. R. 2001. Toxic cyanobacterial bloom problems in Australian waters: Risks and impacts on human health. Phycologia, 40, 228-233. FALCONER, I. R. 2004a. Cyanobacterial Poisoning of Livestock and People. Cyanobacterial Toxins of Drinking Water Supplies. Florida: CRC Press. FALCONER, I. R. 2004b. Water Treatment. Cyanobacterial Toxins of Drinking Water Supplies. Florida: CRC Press. FALCONER, I. R. 2008. Health effects associated with controlled exposures to cyanobacterial toxins. In: HUDNELL, H. K. (ed.) Cyanobacterial Harmful Algal Blooms: State of the Science and Research Needs. Springer New York. FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. Journal of Molecular Evolution, 17, 368-376. FENSELAU, C. & DEMIREV, P. A. 2001. Characterization of intact microorganisms by MALDI mass spectrometry. Mass Spectrometry Reviews, 20, 157-171. FERGUSSON, K. M. & SAINT, C. P. 2000. Molecular phylogeny of Anabaena circinalis and its identification in environmental samples by PCR. Applied and Environmental Microbiology, 66, 4145-4148. FOGG, G. E., STEWART, W. D. P., FAY, P. & WALSBY, A. E. 1973. The Blue- Green Algae, London, Academic Press. FORTUNATO, C. S., HERFORT, L., ZUBER, P., BAPTISTA, A. M. & CRUMP, B. C. 2012. Spatial variability overwhelms seasonal patterns in bacterioplankton communities across a river to ocean gradient. ISME Journal, 6, 554-563. FRANGEUL, L., QUILLARDET, P., CASTETS, A.-M., HUMBERT, J.-F., MATTHIJS, H., CORTEZ, D., TOLONEN, A., ZHANG, C.-C., GRIBALDO, S., KEHR, J.-C., ZILLIGES, Y., ZIEMERT, N., BECKER, S., TALLA, E., LATIFI, A., BILLAULT, A., LEPELLETIER, A., DITTMANN, E., BOUCHIER, C. & TANDEAU DE MARSAC, N. 2008. Highly plastic genome

108

REFERENCES of Microcystis aeruginosa PCC 7806, a ubiquitous toxic freshwater cyanobacterium. BMC Genomics, 9, 274. FRISTACHI, A., SINCLAIR, J. L., HALL, S., BERKMAN, J. A. H., BOYER, G., BURKHOLDER, J., BURNS, J., CARMICHAEL, W., DUFOUR, A., FRAZIER, W., MORTON, S. L., O’BRIEN, E. & WALKER, S. 2008. Occurrence of Cyanobacterial Harmful Algal Blooms Workgroup Report. In: HUDNELL, H. K. (ed.) Cyanobacterial Harmful Algal Blooms: State of the Science and Research Needs. New York: Springer. GAGET, V., GRIBALDO, S. & DE MARSAC, N. T. 2011. An rpoB signature sequence provides unique resolution for the molecular typing of cyanobacteria. International Journal of Systematic and Evolutionary Microbiology, 61, 170- 183. GAGET, V., WELKER, M., RIPPKA, R. & DE MARSAC, N. T. 2015. A polyphasic approach leading to the revision of the genus Planktothrix (Cyanobacteria) and its type species, P. agardhii, and proposal for integrating the emended valid botanical taxa, as well as three new species, Planktothrix paucivesiculata sp. nov.ICNP, Planktothrix tepida sp. nov.ICNP, and Planktothrix serta sp. nov.ICNP, as genus and species names with nomenclatural standing under the ICNP. Systematic and applied microbiology, 38, 141-158. GANF, G. G. & OLIVER, R. L. 1982. Vertical Separation of Light and Available Nutrients as a Factor Causing Replacement of Green Algae by Blue-Green Algae in the Plankton of a Stratified Lake. Journal of Ecology, 70, 829-844. GARBY, T. J., WALTER, M. R., LARKUM, A. W. D. & NEILAN, B. A. 2013. Diversity of cyanobacterial biomarker genes from the stromatolites of Shark Bay, Western Australia. Environmental Microbiology, 15, 1464-1475. ARCÍA-VILLADA, L., RICO, M., ALTAMIRANO, M. A., SÁNCHEZ-MARTÍN, L., LÓPEZ-RODAS, V. & COSTAS, E. 2004. Occurrence of copper resistant mutants in the toxic cyanobacteria Microcystis aeruginosa: characterisation and future implications in the use of copper sulphate as algaecide. Water Research, 38, 2207-2213. GAUTIER, L. & LUND, O. 2013. Low-Bandwidth and Non-Compute Intensive Remote Identification of Microbes from Raw Sequencing Reads. PLoS ONE, 8, e83784. GERWICK, W. H., COATES, R. C., ENGENE, N., GERWICK, L., GRINDBERG, R. V., JONES, A. C. & SORRELS, C. M. 2008. Giant marine cyanobacteria produce exciting potential pharmaceuticals. Microbe, 3, 277-284. GHAI, R., RODRIGUEZ-VALERA, F., MCMAHON, K. D., TOYAMA, D., RINKE, R., CRISTINA SOUZA DE OLIVEIRA, T., WAGNER GARCIA, J., PELLON DE MIRANDA, F. & HENRIQUE-SILVA, F. 2011. Metagenomics of the Water Column in the Pristine Upper Course of the Amazon River. PLoS ONE, 6, e23785. GIGLIO, S., SAINT, C. P. & MONIS, P. T. 2011. Expression of the Geosmin Synthase gene in the cyanobacterium Anabaena circinalis AWQC318. Journal of Phycology, 47, 1338-1343. GILBERT, J. A., JANSSON, J. K. & KNIGHT, R. 2014. The Earth Microbiome project: successes and aspirations. BMC Biology, 12, 1-4. GILBERT, J. A., MEYER, F., JANSSON, J., GORDON, J., PACE, N., TIEDJE, J., LEY, R., FIERER, N., FIELD, D., KYRPIDES, N., GLÖCKNER, F.-O., KLENK, H.-P., WOMMACK, K. E., GLASS, E., DOCHERTY, K., GALLERY, R., STEVENS, R. & KNIGHT, R. 2010. The Earth Microbiome Project: Meeting report of the “1st EMP meeting on sample selection and

109

REFERENCES acquisition” at Argonne National Laboratory October 6th 2010. Standards in Genomic Sciences, 3, 249-253. GIOVANNONI, S. J., TURNER, S., OLSEN, G. J., BARNS, S., LANE, D. J. & PACE, N. R. 1988. Evolutionary relationships among cyanobacteria and green chloroplasts. Journal of Bacteriology, 170, 3584-3592. GOFTON, A., OSKAM, C., LO, N., BENINATI, T., WEI, H., MCCARL, V., MURRAY, D., PAPARINI, A., GREAY, T., HOLMES, A., BUNCE, M., RYAN, U. & IRWIN, P. 2015. Inhibition of the endosymbiont "Candidatus Midichloria mitochondrii" during 16S rRNA gene profiling reveals potential pathogens in Ixodes ticks from Australia. Parasites & Vectors, 8, 345. GORDON, D. M., FINLAYSON, C. M. & MCCOMB, A. J. 1981. Nutrients and phytoplankton in three shallow, freshwater lakes of different trophic status in Western Australia. Australian Journal of Marine and Freshwater Research, 32, 541-553. GORHAM, P. R., MCLACHLAN, R. W. & HAMMER, U. T. 1964. Isolation and culture of toxic strains of Anabaena flos-aquae. Verhandlungen der Internationalen Vereinigung für Theoretische und Angewandte Limnologie, 19, 796-804. GOTO, J. J., KOENIG, J. H. & IKEDA, K. 2012. The physiological effect of ingested β-N-methylamino-L-alanine on a glutamatergic synapse in an in vivo preparation. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology, 156, 171-177. GRANÉLI, E. & TURNER, J. T. 2006. Ecology of harmful algae, Springer, Germany. GRAUR, D. & MARTIN, W. 2004. Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends in Genetics, 20, 80- 86. GRIFFITHS, D. J. & SAKER, M. L. 2003. The Palm Island mystery disease 20 years on: A review of research on the cyanotoxin cylindrospermopsin. Environmental Toxicology, 18, 78-93. GUGGER, M., LYRA, C., HENRIKSEN, P., COUTÉ, A., HUMBERT, J.-F. & SIVONEN, K. 2002. Phylogenetic comparison of the cyanobacterial genera Anabaena and Aphanizomenon. International Journal of Systematic and Evolutionary Microbiology, 52, 1867-1880. GUGGER, M., MOLICA, R., LE BERRE, B., DUFOUR, P., BERNARD, C. & HUMBERT, J.-F. 2005. Genetic Diversity of Cylindrospermopsis Strains (Cyanobacteria) Isolated from Four Continents. Applied and Environmental Microbiology, 71, 1097-1100. GUPTA, R. S. 2000. The phylogeny of proteobacteria: relationships to other eubacterial phyla and eukaryotes. FEMS Microbiology Reviews, 24, 367-402. HADZIAVDIC, K., LEKANG, K., LANZEN, A., JONASSEN, I., THOMPSON, E. M. & TROEDSSON, C. 2014. Characterization of the 18S rRNA Gene for Designing Universal Specific Primers. PLoS ONE, 9, e87624. HAJIBABAEI, M. 2012. The golden age of DNA metasystematics. Trends in Genetics, 28, 535-537. HALL, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series, 41, 95-98. HAMMER, Ø., HARPER, D. A. T. & RYAN, P. D. 2001. PAST: Paleontological statistics software package for education and data analysis. Palaeontologia Electronica 4, 9.

110

REFERENCES HAN, D., FAN, Y. & HU, Z. 2009. An Evaluation of Four Phylogenetic Markers in Nostoc: Implications for Cyanobacterial Phylogenetic Studies at the Intrageneric Level. Current Microbiology, 58, 170-176. HARRISON, J. C. & LANGDALE, J. A. 2006. A step by step guide to phylogeny reconstruction. The Plant Journal, 45, 561-572. HARTMANN, S. & VISION, T. J. 2008. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment? BMC Evolutionary Biology, 8. HAUER, T., BOHUNICKA, M., JOHANSEN, J. R., MARES, J. & BERRENDERO- GOMEZ, E. 2014. Reassessment of the cyanobacterial family Microchaetaceae and establishment of new families Tolypothrichaceae and Godleyaceae. Journal of phycology, 50, 1089-1100. HAWKINS, P. R., RUNNEGAR, M. T., JACKSON, A. R. & FALCONER, I. R. 1985. Severe hepatotoxicity caused by the tropical cyanobacterium (blue-green alga) Cylindrospermopsis raciborskii (Woloszynska) Seenaya and Subba Raju isolated from a domestic water supply reservoir. Applied and Environmental Microbiology, 50, 1292-1295. HEATH, M. W., WOOD, S. A. & RYAN, K. G. 2011. Spatial and temporal variability in Phormidium mats and associated anatoxin-a and homoanatoxin-a in two rivers. Aquatic , 64, 69-79. HERDMAN, M., CASTENHOLZ, R. W., ITEMAN, I., WATERBURY, J. B. & RIPPKA, R. 2015. Subsection I. Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. HIMBERG, K., KEIJOLA, A. M., HIISVIRTA, L., PYYSALO, H. & SIVONEN, K. 1989. The effect of water-treatment processes on the removal of hepatoxins from microcystis and oscillatoria cyanobacteria - A laboratory study. Water Research, 23, 979-984. HITZFELD, B. C., HÖGER, S. J. & DIETRICH, D. R. 2000. Cyanobacterial Toxins: Removal during Drinking Water Treatment, and Human Risk Assessment. Environmental Health Perspectives, 108, 113-122. HO, L., SAWADE, E. & NEWCOMBE, G. 2012. Biological treatment options for cyanobacteria metabolite removal – A review. Water Research, 46, 1536-1548. HÖCKELMANN, C. & JÜTTNER, F. 2005. Off-flavours in water: hydroxyketones and β-ionone derivatives as new odour compounds of freshwater cyanobacteria. Flavour and Fragrance Journal, 20, 387-394. HOEK, C., VAN DEN MANN, D. G. & M., J. H. (eds.) 1995. Algae An Introduction to Phycology, Cambridge: Cambridge University Press. HOFER, U. 2013. Bacterial evolution: Getting to the bottom of Cyanobacteria. Nature Reviews Microbiology, 11, 818-819. HOFFMANN, L. & CASTENHOLZ, R. W. 2015. Subsection V. In: WHITMAN, W. B. (ed.) Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. HOHL, M. & RAGAN, M. A. 2007. Is multiple-sequence alignment required for accurate inference of phylogeny? Systematic Biology, 56, 206-221. HOLDER, M. & LEWIS, P. O. 2003. Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics, 4, 275-284. HONG, S., BUNGE, J., LESLIN, C., JEON, S. & EPSTEIN, S. S. 2009. Polymerase chain reaction primers miss half of rRNA microbial diversity. ISME Journal, 3, 1365-1373. HOSJA, W. & DEELEY, D. 1994. Harmful Phytoplankton Surveillance in Western Australia. Perth: Waterways Commission.

111

REFERENCES HOWARD-AZZEH, M., SHAMSEER, L., SCHELLHORN, H. E. & GUPTA, R. S. 2014. Phylogenetic analysis and molecular signatures defining a monophyletic clade of heterocystous cyanobacteria and identifying its closest relatives. Photosynthesis Research, 122, 171-185. HUGERTH, L. W., MULLER, E. E. L., HU, Y. O. O., LEBRUN, L. A. M., ROUME, H., LUNDIN, D., WILMES, P. & ANDERSSON, A. F. 2014. Systematic Design of 18S rRNA Gene Primers for Determining Eukaryotic Diversity in Microbial Consortia. PLoS ONE, 9, e95567. HUISMAN, J. M., KENDRICK, A. J. & RULE, M. J. 2011. Benthic algae and seagrasses of the Walpole and Nornalup Inlets Marine Park, Western Australia. Journal of the Royal Society of Western Australia, 94, 29-44. HUMPAGE, A. 2008. Toxin types, toxicokinetics and toxicodynamics. In: HUDNELL, H. K. (ed.) Cyanobacterial Harmful Algal Blooms: State of the Science and Research Needs. Springer New York. HUMPAGE, A., FALCONER, I., BERNARD, C., FROSCIO, S. & FABBRO, L. 2012. Toxicity of the cyanobacterium Limnothrix AC0243 to male Balb/c mice. Water Research, 46, 1576-1583. HUSON, D. H. & WEBER, N. 2013. Chapter Twenty-One - Microbial Community Analysis Using MEGAN. In: EDWARD, F. D. (ed.) Methods in Enzymology. Academic Press. IMHOFF, J. F. 2001. The Anoxygenic Phototrophic Purple Bacteria. In: BOONE, D. R., CASTENHOLZ, R. W. & GARRITY, G. M. (eds.) Bergey’s Manual of Systematic Bacteriology. 2 ed. New York: Springer. ITEMAN, I., RIPPKA, R., DE MARSAC, N. T. & HERDMAN, M. 2000. Comparison of conserved structural and regulatory domains within divergent 16S rRNA-23S rRNA spacer sequences of cyanobacteria. Microbiology, 146, 1275-1286. ITEMAN, I., RIPPKA, R., DE MARSAC, N. T. & HERDMAN, M. 2002. rDNA analyses of planktonic heterocystous cyanobacteria, including members of the genera Anabaenopsis and Cyanospira. Microbiology, 148, 481-496. JANSE, I., MEIMA, M., KARDINAAL, W. E. A. & ZWART, G. 2003. High- resolution differentiation of cyanobacteria by using rRNA-internal transcribed spacer denaturing gradient gel electrophoresis. Applied and Environmental Microbiology, 69, 6634-6643. JOHANSEN, J. R., KOVACIK, L., CASAMATTA, D. A., FUCIKOVA, K. & KAŠTOVSKÝ, J. 2011. Utility of 16S-23S ITS sequence and secondary structure for recognition of intrageneric and intergeneric limits within cyanobacterial taxa: Leptolyngbya corticola sp. nov (Pseudanabaenaceae, Cyanobacteria). Nova Hedwigia, 92, 283-302. JOHANSEN, J. R., MARTIN, M. P., REHAKOVA, K. & CASAMATTA, D. A. 2009. Using secondary structure of the 16S rRNA molecule and associated 16S-23S ITS to examine phylogenetic relationships of Aulosira (Nostocaceae, Cyanobacteria). Plant Biology (Rockville), 2009, 62. JÜTTNER, F. & WATSON, S. B. 2007. Biochemical and ecological control of geosmin and 2-methylisoborneol in source waters. Applied and Environmental Microbiology, 73, 4395-4406. KAHRU, M. & ELMGREN, R. 2014. Multidecadal time series of satellite-detected accumulations of cyanobacteria in the Baltic Sea. Biogeosciences, 11, 3619- 3633. KANEKO, T., NAKAJIMA, N., OKAMOTO, S., SUZUKI, I., TANABE, Y., TAMAOKI, M., NAKAMURA, Y., KASAI, F., WATANABE, A., KAWASHIMA, K., KISHIDA, Y., ONO, A., SHIMIZU, Y., TAKAHASHI, C., MINAMI, C., FUJISHIRO, T., KOHARA, M., KATOH, M., NAKAZAKI, N., 112

REFERENCES NAKAYAMA, S., YAMADA, M., TABATA, S. & WATANABE, M. M. 2007. Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843. DNA Research, 14, 247-256. KATAOKA, T., HOMMA, T., NAKANO, S., HODOKI, Y., OHBAYASHI, K. & KONDO, R. 2013. PCR primers for selective detection of intra-species variations in the bloom-forming cyanobacterium, Microcystis. Harmful Algae, 23, 46-54. KATOH, K., MISAWA, K., KUMA, K. & MIYATA, T. 2002. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30, 3059-3066. KEARSE, M., MOIR, R., WILSON, A., STONES-HAVAS, S., CHEUNG, M., STURROCK, S., BUXTON, S., COOPER, A., MARKOWITZ, S., DURAN, C., THIERER, T., ASHTON, B., MEINTJES, P. & DRUMMOND, A. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28, 1647-1649. KEIJOLA, A. M., HIMBERG, K., ESALA, A. L., SIVONEN, K. & HIIS-VIRTA, L. 1988. Removal of cyanobacterial toxins in water treatment processes: Laboratory and pilot-scale experiments. Toxicity Assessment, 3, 643-656. KEMP, A. & JOHN, J. 2006. Microcystins associated with microcystis dominated blooms in the southwest wetlands, Western Australia. Environmental Toxicology, 21, 125-130. KERMARREC, L., FRANC, A., RIMET, F., CHAUMEIL, P., HUMBERT, J. F. & BOUCHEZ, A. 2013. Next-generation sequencing to inventory taxonomic diversity in eukaryotic communities: a test for freshwater diatoms. Molecular Ecology Resources, 13, 607-619. KIM, S. G., RHEE, S. K., AHN, C. Y., KO, S. R., CHOI, G. G., BAE, J. W., PARK, Y. H. & OH, H. M. 2006. Determination of Cyanobacterial Diversity during Algal Blooms in Daechung Reservoir, Korea, on the Basis of cpcBA Intergenic Spacer Region Analysis. Applied and Environmental Microbiology, 72, 3252-3258. KIMURA, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111-120. KIRKPATRICK, M. & SLATKIN, M. 1993. Searching for evolutionary patterns in the shape of a phylogenetict tree. Evolution, 47, 1171-1181. KLINDWORTH, A., PRUESSE, E., SCHWEER, T., PEPLIES, J., QUAST, C., HORN, M. & GLÖCKNER, F. O. 2013. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Research, 41, e1. KOLMAKOVA, O. V., GLADYSHEV, M. I., ROZANOV, A. S., PELTEK, S. E. & TRUSOVA, M. Y. 2014. Spatial biodiversity of bacteria along the largest Arctic river determined by next-generation sequencing. FEMS Microbiology Ecology, 89, 442-450. KOMÁREK, J. 2003. Problem of the taxonomic category "species" in cyanobacteria. Archiv für Hydrobiologie Supplement, 148, 281-297. KOMÁREK, J. 2006. Cyanobacterial taxonomy: Current problems and prospects for the integration of traditional and molecular approaches. Algae, 21, 349-375. KOMÁREK, J. 2010a. Modern taxonomic revision of planktic nostocacean cyanobacteria: a short review of genera. Hydrobiologia, 639, 231-243. KOMÁREK, J. 2010b. Recent changes (2008) in cyanobacteria taxonomy based on a combination of molecular background with phenotype and ecological consequences (genus and species concept). Hydrobiologia, 639, 245-259.

113

REFERENCES KOMÁREK, J. & ANAGNOSTIDIS, K. 1989. Modern approach to the classification system of Cyanophytes 4 - Nostocales. Archiv für Hydrobiologie - Algological Studies, Supplement Volumes, 56, 247-345. KOMÁREK, J., KAŠTOVSKÝ, J. & JEZBEROVA, J. 2011. Phylogenetic and taxonomic delimitation of the cyanobacterial genus Aphanothece and description of Anathece gen. nov. European Journal of Phycology, 46, 315-326. KOMÁREK, J., KAŠTOVSKÝ, J., MAREŠ, J. & JOHANSEN, J. R. 2014. Taxonomic classification of cyanoprokaryotes (cyanobacterial genera) 2014 according to the polyphasic approach. Preslia, 86, 295-335. KONG, Y., XU, X., ZHU, L. & MIAO, L. 2013. Control of the Harmful Alga Microcystis aeruginosa and Absorption of Nitrogen and Phosphorus by Candida utilis. Applied Biochemistry and Biotechnology, 169, 88-99. KOREIVIENĖ, J., ANNE, O., KASPEROVIČIENĖ, J. & BURŠKYTĖ, V. 2014. Cyanotoxin management and human health risk mitigation in recreational waters. Environmental Monitoring and Assessment, 186, 4443-4459. KUMAR, K., MELLA-HERRERA, R. A. & GOLDEN, J. W. 2010. Cyanobacterial Heterocysts. Cold Spring Harbor Perspectives in Biology, 2. KUMARI, N., SRIVASTAVA, A. K., BHARGAVA, P. & RAI, L. C. 2009. Molecular approaches towards assessment of cyanobacterial biodiversity. African Journal of Biotechnology, 8, 4284-4298. LAGUS, A., SUOMELA, J., HELMINEN, H., LEHTIMAKI, J. M., SIPURA, J., SIVONEN, K. & SUOMINEN, L. 2007. Interaction effects of N : P ratios and frequency of nutrient supply on the plankton community in the northern Baltic Sea. Marine Ecology Progress Series, 332, 77-92. LAND, M., HAUSER, L., JUN, S. R., NOOKAEW, I., LEUZE, M. R., AHN, T. H., KARPINETS, T., LUND, O., KORA, G., WASSENAAR, T., POUDEL, S. & USSERY, D. W. 2015. Insights from 20 years of bacterial genome sequencing. Functional & Integrative Genomics, 15, 141-161. LANDAN, G. & GRAUR, D. 2009. Characterization of pairwise and multiple sequence alignment errors. Gene, 441, 141-147. LANE, D. J., PACE, B., OLSEN, G. J., STAHL, D. A., SOGIN, M. L. & PACE, N. R. 1985. Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proceedings of the National Academy of Sciences of the United States of America, 82, 6955-6959. LANZÉN, A., JØRGENSEN, S. L., BENGTSSON, M. M., JONASSEN, I., ØVREÅS, L. & URICH, T. 2011. Exploring the composition and diversity of microbial communities at the Jan Mayen hydrothermal vent field using RNA and DNA. FEMS Microbiology Ecology, 77, 577-589. LARKIN, M. A., BLACKSHIELDS, G., BROWN, N. P., CHENNA, R., MCGETTIGAN, P. A., MCWILLIAM, H., VALENTIN, F., WALLACE, I. M., WILM, A., LOPEZ, R., THOMPSON, J. D., GIBSON, T. J. & HIGGINS, D. G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947-2948. LAWTON, L. A., ROBERTSON, P. K. J., ROBERTSON, R. F. & BRUCE, F. G. 2003. The destruction of 2-methylisoborneol and geosmin using titanium dioxide photocatalysis. Applied Catalysis B: Environmental, 44, 9-13. LEE, E., RYAN, U. M., MONIS, P., MCGREGOR, G. B., BATH, A., GORDON, C. & PAPARINI, A. 2014. Polyphasic identification of cyanobacterial isolates from Australia. Water Research, 59, 248-261. LETSCH, H. O., KÜCK, P., STOCSITS, R. R. & MISOF, B. 2010. The Impact of rRNA Secondary Structure Consideration in Alignment and Tree Reconstruction: Simulated Data and a Case Study on the Phylogeny of Hexapods. Molecular Biology and Evolution, 27, 2507-2521. 114

REFERENCES LIN, S., WU, Z., YU, G., ZHU, M., YU, B. & LI, R. 2010. Genetic diversity and molecular phylogeny of Planktothrix (Oscillatoriales, cyanobacteria) strains from China. Harmful Algae, 9, 87-97. LINDGREN, A. R. & DALY, M. 2007. The impact of length-variable data and alignment criterion on the phylogeny of Decapodiformes (Mollusca: Cephalopoda). Cladistics, 23, 464-476. LINDNER, M., THEURICH, J. & BAHNEMANN, D. W. 1997. Photocatalytic degradation of organic compounds: Accelerating the process efficiency. Water Science and Technology, 35, 79-86. LITVAITIS, M. K. 2002. A molecular test of cyanobacterial phylogeny: inferences from constraint analyses. Hydrobiologia, 468, 135-145. LIU, K., NELESEN, S., RAGHAVAN, S., LINDER, C. R. & WARNOW, T. 2009. Barking up the wrong treelength: The impact of gap penalty on alignment and tree accuracy. IEEE-ACM Transactions on Computational Biology and Bioinformatics, 6, 7-21. LIU, Z. Z., LOZUPONE, C., HAMADY, M., BUSHMAN, F. D. & KNIGHT, R. 2007. Short pyrosequencing reads suffice for accurate microbial community analysis. Nucleic Acids Research, 35. LOIACONO, S. T., MEYER-DOMBARD, D. A. R., HAVIG, J. R., PORET- PETERSON, A. T., HARTNETT, H. E. & SHOCK, E. L. 2012. Evidence for high-temperature in situ nifH transcription in an alkaline hot spring of Lower Geyser Basin, Yellowstone National Park. Environmental Microbiology, 14, 1272-1283. LÖYTYNOJA, A. 2012. Alignment methods: Strategies, challenges, benchmarking, and comparative overview. In: ANISIMOVA, M. (ed.) Evolutionary Genomics. New York: Humana Press. LÖYTYNOJA, A. 2014. Phylogeny-aware alignment with PRANK. In: RUSSELL, D. J. (ed.) Multiple Sequence Alignment Methods. New York: Humana Press. LÖYTYNOJA, A. & GOLDMAN, N. 2005. An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences of the United States of America, 102, 10557-10562. LÖYTYNOJA, A., VILELLA, A. J. & GOLDMAN, N. 2012. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics, 28, 1684-1691. LUKEŠOVÁ, A., JOHANSEN, J. R., MARTIN, M. P. & CASAMATTA, D. A. 2009. Aulosira Bohemensis sp. nov.: Further Phylogenetic Uncertainty at The Base of The Nostocales (Cyanobacteria). Phycologia, 48, 118-129. LUND, M. A. & DAVIS, J. A. 2000. Seasonal dynamics of plankton communities and water chemistry in a eutrophic wetland (Lake Monger, Western Australia): implications for biomanipulation. Marine and Freshwater Research, 51, 321- 332. LÜRLING, M. & ROESSINK, I. 2006. On the way to cyanobacterial blooms: Impact of the herbicide metribuzin on the competition between a green alga (Scenedesmus) and a cyanobacterium (Microcystis). Chemosphere, 65, 618-626. LYRA, C., HANTULA, J., VAINIO, E., RAPALA, J., ROUHIAINEN, L. & SIVONEN, K. 1997. Characterization of Cyanobacteria by SDS-PAGE of whole-cell proteins and PCR/RFLP of the 16S rRNA gene. Archives of Microbiology, 168, 176-184. LYRA, C., SUOMALAINEN, S., GUGGER, M., VEZIE, C., SUNDMAN, P., PAULIN, L. & SIVONEN, K. 2001. Molecular characterization of planktic cyanobacteria of Anabaena, Aphanizomenon, Microcystis and Planktothrix

115

REFERENCES genera. International Journal of Systematic and Evolutionary Microbiology, 51, 513-526. MACHIDA, R. J. & KNOWLTON, N. 2012. PCR Primers for Metazoan Nuclear 18S and 28S Ribosomal DNA Sequences. PLoS ONE, 7, e46180. MAN-AHARONOVICH, D., KRESS, N., ZEEV, E. B., BERMAN-FRANK, I. & BÉJÀ, O. 2007. Molecular ecology of nifH genes and transcripts in the eastern Mediterranean Sea. Environmental Microbiology, 9, 2354-2363. MANEN, J.-F. & FALQUET, J. 2002. The cpcB-cpcA locus as a tool for the genetic characterization of the genus Arthrospira (Cyanobacteria): evidence for horizontal transfer. International Journal of Systematic and Evolutionary Microbiology, 52, 861-867. MARQUARDT, J. & PALINSKA, K. A. 2007. Genotypic and phenotypic diversity of cyanobacteria assigned to the genus Phormidium (Oscillatoriales) from different habitats and geographical sites. Archives of Microbiology, 187, 397-413. MARTIN, D. & RIDGE, I. 1999. The relative sensitivity of algae to decomposing barley straw. Journal of Applied Phycology, 11, 285-291. MAZEL, D., HOUMARD, J., CASTETS, A. M. & TANDEAU DE MARSAC, N. 1990. Highly repetitive DNA sequences in cyanobacterial genomes. Journal of Bacteriology, 172, 2755-2761. MAZMOUZ, R., CHAPUIS-HUGON, F., MANN, S., PICHON, V., MÉJEAN, A. & PLOUX, O. 2010. Biosynthesis of Cylindrospermopsin and 7- Epicylindrospermopsin in Oscillatoria sp. Strain PCC 6506: Identification of the cyr Gene Cluster and Toxin Analysis. Applied and Environmental Microbiology, 76, 4943-4949. MCCARRON, P., LOGAN, A. C., GIDDINGS, S. D. & QUILLIAM, M. A. 2014. Analysis of β-N-methylamino-L-alanine (BMAA) in spirulina-containing supplements by liquid chromatography-tandem mass spectrometry. Aquatic Biosystems, 10, 1-7. MCGREGOR, G. B. & RASMUSSEN, J. P. 2008. Cyanobacterial composition of microbial mats from an Australian thermal spring: A polyphasic evaluation. FEMS Microbiology Ecology, 63, 23-35. MCGREGOR, G. B. & SENDALL, B. C. 2015. Phylogeny and toxicology of Lyngbya wollei (Cyanobacteria, Oscillatoriales) from north-eastern Australia, with a description of Microseira gen. nov. J Phycol, 51, 109-19. MCKENZIE, A. & STEEL, M. 2000. Distributions of cherries for two models of trees. Mathematical Biosciences, 164, 81-92. MCNEILL, J., BARRIE, F. R., BUCK, W. R., DEMOULIN, V., GREUTER, W., HAWKWORTH, D. L., HERENDEEN, P. S., KNAPP, S. & TURLAND, N. J. 2012. International Code of Botanical Nomenclature for Algae, Fungi and Plants (Melbourne Code): Adopted by the 18th International Botanical Congress Melbourne Australia July 2011, Oberreifenberg, Koeltz Scientific Books. MEHNER, T., ARLINGHAUS, R., BERG, S., DÖRNER, H., JACOBSEN, L., KASPRZAK, P., KOSCHEL, R., SCHULZE, T., SKOV, C., WOLTER, C. & WYSUJACK, K. 2004. How to link biomanipulation and sustainable fisheries management: a step-by-step guideline for lakes of the European temperate zone. Fisheries Management and Ecology, 11, 261-275. MENCHACA, A. C., VISI, D. K., STREY, O. F., TEEL, P. D., KALINOWSKI, K., ALLEN, M. S. & WILLIAMSON, P. C. 2013. Preliminary Assessment of Microbiome Changes Following Blood-Feeding and Survivorship in the Amblyomma americanum Nymph-to-Adult Transition using Semiconductor Sequencing. PLoS ONE, 8, e67129. 116

REFERENCES METCALF, J. S., RICHER, R., COX, P. A. & CODD, G. A. 2012. Cyanotoxins in desert environments may present a risk to human health. Science of The Total Environment, 421-422, 118-123. MILANI, C., LUGLI, G. A., TURRONI, F., MANCABELLI, L., DURANTI, S., VIAPPIANI, A., MANGIFESTA, M., SEGATA, N., VAN SINDEREN, D. & VENTURA, M. 2014. Evaluation of bifidobacterial community composition in the human gut by means of a targeted amplicon sequencing (ITS) protocol. FEMS Microbiology Ecology, 90, 493-503. MILLER, M. A., PFEIFFER, W. & SCHWARTZ, T. 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. Gateway Computing Environments Workshop (GCE), 14 November 2010 New Orleans. New Orleans, 1-8. MISHRA, B. & THINES, M. 2014. siMBa—a simple graphical user interface for the Bayesian phylogenetic inference program MrBayes. Mycological Progress, 13, 1255-1258. MISOF, B., MEUSEMANN, K., VON REUMONT, B. M., KUCK, P., PROHASKA, S. J. & STADLER, P. F. 2014. A priori assessment of data quality in molecular phylogenetics. Algorithms for Molecular Biology, 9. MITROVIC, S. M., HARDWICK, L. & DORANI, F. 2011. Use of flow management to mitigate cyanobacterial blooms in the Lower Darling River, Australia. Journal of Plankton Research, 33, 229-241. MIZRAHI-MAN, O., DAVENPORT, E. R. & GILAD, Y. 2013. Taxonomic Classification of Bacterial 16S rRNA Genes Using Short Sequencing Reads: Evaluation of Effective Study Designs. PLoS ONE, 8, e53608. MOISANDER, P. H., CHESHIRE, L. A., BRADDY, J., CALANDRINO, E. S., HOFFMAN, M., PIEHLER, M. F. & PAERL, H. W. 2012. Facultative diazotrophy increases Cylindrospermopsis raciborskii competitiveness under fluctuating nitrogen availability. FEMS Microbiology Ecology, 79, 800-811. MOOERS, A. O. & HEARD, S. B. 1997. Inferring evolutionary process from phylogenetic tree shape. Quarterly Review of Biology, 31-54. MOREIRA, C., RAMOS, V., AZEVEDO, J. & VASCONCELOS, V. 2014. Methods to detect cyanobacteria and their toxins in the environment. Applied Microbiology and Biotechnology, 98, 8073-8082. MORO, I., RASCIO, N., LA ROCCA, N., SCIUTO, K., ALBERTANO, P., BRUNO, L. & ANDREOLI, C. 2010. Polyphasic characterization of a thermo-tolerant filamentous cyanobacterium isolated from the Euganean thermal muds (Padua, Italy). European Journal of Phycology, 45, 143-154. MORRISON, D. A. & ELLIS, J. T. 1997. Effects of nucleotide sequence alignment on phylogeny estimation: A case study of 18S rDNAs of Apicomplexa. Molecular Biology and Evolution, 14, 428-441. MSAGATI, T. A. M., SIAME, B. A. & SHUSHU, D. D. 2006. Evaluation of methods for the isolation, detection and quantification of cyanobacterial hepatotoxins. Aquatic Toxicology, 78, 382-397. MUNDAY, R., THOMAS, K., GIBBS, R., MURPHY, C. & QUILLIAM, M. A. 2013. Acute toxicities of saxitoxin, neosaxitoxin, decarbamoyl saxitoxin and gonyautoxins 1&4 and 2&3 to mice by various routes of administration. Toxicon, 76, 77-83. MURALITHARAN, G. & THAJUDDIN, N. 2010. M13-based genotyping of marine cyanobacterial strains from the Indian subcontinent and maintained in the NFMC germplasm collection. Journal of Applied Phycology, 22, 709-716.

117

REFERENCES MURRAY, D., JEFFERSON, B., JARVIS, P. & PARSONS, S. A. 2010. Inhibition of three algae species using chemicals released from barley straw. Environmental Technology, 31, 455-466. NAKAO, M., OKAMOTO, S., KOHARA, M., FUJISHIRO, T., FUJISAWA, T., SATO, S., TABATA, S., KANEKO, T. & NAKAMURA, Y. 2010. CyanoBase: the cyanobacteria genome database update 2010. Nucleic Acids Research, 38, D379-D381. NEILAN, B. 1995. Identification and Phylogenetic Analysis of Toxigenic Cyanobacteria by Multiplex Randomly Amplified Polymorphic DNA PCR. Applied and Environmental Microbiology, 61, 2286-2291. NEILAN, B. A. 2002. The application of genetics and molecular biology for detecting toxic cyanobacteria and the basis for their toxicity. Blue-green algae : their signific ance and management within water supplies 4ed. Salisbury: Cooperative Research Centre for Water Quality and Treatment. NEILAN, B. A., JACOBS, D. & GOODMAN, A. E. 1995. Genetic diversity and phylogeny of toxic cyanobacteria determined by DNA polymorphisms within the phycocyanin locus. Applied and Environmental Microbiology, 61, 3875- 3883. NEILAN, B. A., JACOBS, D., THERESE, D. D., BLACKALL, L. L., HAWKINS, P. R., COX, P. T. & GOODMAN, A. E. 1997a. rRNA Sequences and Evolutionary Relationships among Toxic and Nontoxic Cyanobacteria of the Genus Microcystis. International Journal of Systematic Bacteriology, 47, 693-697. NEILAN, B. A., STUART, J. L., GOODMAN, A. E., COX, P. T. & HAWKINS, P. R. 1997b. Specific amplification and restriction polymorphisms of the cyanobacterial rRNA operon spacer region. Systematic and Applied Microbiology, 20, 612-621. NEWCOMBE, G. 2009a. Chapter 5: Treatment Options. International Guidance Manual for the Management of Toxic Cyanobacteria. London: Global Water Research Coalition. NEWCOMBE, G. 2009b. Chapter 7: Implications for Recreational Waters. International Guidance Manual for the Management of Toxic Cyanobacteria. London: Global Water Research Coalition. NEWCOMBE, G., HOUSE, J., HO, L., BAKER, P. D. & BURCH, M. D. 2010. Management Strategies for Cyanobacteria (Blue-Green Algae) and their Toxins: a Guide for Water Utilities. Adelaide: Water Quality Research Australia. NICHOLSON, B. C., ROSITANO, J. & BURCH, M. D. 1994. Destruction of cyanobacterial peptide hepatotoxins by chlorine and chloramine. Water Research, 28, 1297-1303. NÜBEL, U., GARCIA-PICHEL, F. & MUYZER, G. 1997. PCR primers to amplify 16S rRNA genes from cyanobacteria. Applied and Environmental Microbiology, 63, 3327-3332. O'NEIL, J. M., DAVIS, T. W., BURFORD, M. A. & GOBLER, C. J. 2012. The rise of harmful cyanobacteria blooms: The potential roles of eutrophication and climate change. Harmful Algae, 14, 313-334. OGDEN, T. H. & ROSENBERG, M. S. 2006. Multiple sequence alignment accuracy and phylogenetic inference. Systematic Biology, 55, 314-328. OGDEN, T. H. & ROSENBERG, M. S. 2007. Alignment and topological accuracy of the direct optimization approach via POY and traditional phylogenetics via ClustalW plus PAUP*. Systematic Biology, 56, 182-193. OH, S., CARO-QUINTERO, A., TSEMENTZI, D., DELEON-RODRIGUEZ, N., LUO, C., PORETSKY, R. & KONSTANTINIDIS, K. T. 2011. Metagenomic Insights into the Evolution, Function, and Complexity of the Planktonic Microbial 118

REFERENCES Community of Lake Lanier, a Temperate Freshwater Ecosystem. Applied and Environmental Microbiology, 77, 6000-6011. OREN, A. 2004. A proposal for further integration of the cyanobacteria under the Bacteriological Code. International Journal of Systematic and Evolutionary Microbiology, 54, 1895-1902. OREN, A. 2011. Cyanobacterial systematics and nomenclature as featured in the International Bulletin of Bacteriological Nomenclature and Taxonomy/ International Journal of Systematic Bacteriology/ International Journal of Systematic and Evolutionary Microbiology. International Journal of Systematic and Evolutionary Microbiology, 61, 10-15. OREN, A. 2015. Cyanobacteria in hypersaline environments: biodiversity and physiological properties. Biodiversity and Conservation, 24, 781-798. OREN, A. & TINDALL, B. J. 2005. Nomenclature of the cyanophyta/ cyanobacteria/ cyanoprokaryotes under the International Code of Nomenclature of Prokaryotes. Archiv für Hydrobiologie Supplement, 159, 39-52. OTSUKA, S., SUDA, S., LI, R., WATANABE, M., OYAIZU, H., MATSUMOTO, S. & WATANABE, M. M. 1999. Phylogenetic relationships between toxic and non-toxic strains of the genus Microcystis based on 16S to 23S internal transcribed spacer sequence. FEMS Microbiology Letters, 172, 15-21. OTSUKA, S., SUDA, S., SHIBATA, S., OYAIZU, H., MATSUMOTO, S. & WATANABE, M. M. 2001. A proposal for the unification of five species of the cyanobacterial genus Microcystis Kutzing ex Lemmermann 1907 under the rules of the Bacteriological Code. International Journal of Systematic and Evolutionary Microbiology, 51, 873-879. OUELLETTE, A. A., HANDY, S. & WILHELM, S. 2006. Toxic Microcystis is Widespread in Lake Erie: PCR Detection of Toxin Genes and Molecular Characterization of Associated Cyanobacterial Communities. Microbial Ecology, 51, 154-165. PACE, N. R. 2006. Time for a change. Nature, 441, 289-289. PAERL, H. W. & HUISMAN, J. 2009. Climate change: a catalyst for global expansion of harmful cyanobacterial blooms. Environmental Microbiology Reports, 1, 27- 37. PAERL, H. W. & OTTEN, T. G. 2013. Harmful Cyanobacterial Blooms: Causes, Consequences, and Controls. Microbial Ecology, 65, 995-1010. PALENIK, B. & HASELKORN, R. 1992. Multiple evolutionary origins of prochlorophytes, the chlorophyll b-containing prokaryotes. Nature, 355, 265- 267. PANTELIĆ, D., SVIRČEV, Z., SIMEUNOVIĆ, J., VIDOVIĆ, M. & TRAJKOVIĆ, I. 2013. Cyanotoxins: Characteristics, production and degradation routes in drinking water treatment with reference to the situation in Serbia. Chemosphere, 91, 421-441. PAPINEAU, D., WALKER, J. J., MOJZSIS, S. J. & PACE, N. R. 2005. Composition and structure of microbial communities from stromatolites of Hamelin Pool in Shark Bay, Western Australia. Applied and Environmental Microbiology, 71, 4822-4832. PARTE, A. C. 2014. LPSN—list of prokaryotic names with standing in nomenclature. Nucleic Acids Research, 42, D613-D616. PEARSON, J. E. & KINGSBURY, J. M. 1966. Culturally Induced Variation in Four Morphologically Diverse Bluegreen Algae. American Journal of Botany, 53, 192-200. PHILLIPS, M. J., LIN, Y. H., HARRISON, G. L. & PENNY, D. 2001. Mitochondrial genomes of a bandicoot and a brushtail possum confirm the monophyly of 119

REFERENCES australidelphian marsupials. Proceedings of the Royal Society B-Biological Sciences, 268, 1533-1538. PINEVICH, A. V. 2015. Proposal to consistently apply the International Code of Nomenclature of Prokaryotes (ICNP) to names of the oxygenic photosynthetic bacteria (cyanobacteria), including those validly published under the International Code of Botanical Nomenclature (ICBN)/International Code of Nomenclature for algae, fungi and plants (ICN), and proposal to change Principle 2 of the ICNP. International Journal of Systematic and Evolutionary Microbiology, 65, 1070-1074. PITOIS, S., JACKSON, M. H. & WOOD, B. J. B. 2000. Problems associated with the presence of cyanobacteria in recreational and drinking waters. International Journal of Environmental Health Research, 10, 203-218. POLZ, M. F., ALM, E. J. & HANAGE, W. P. 2013. Horizontal gene transfer and the evolution of bacterial and archaeal population structure. Trends in Genetics, 29, 170-175. POWELL, H. A., KERBBY, N. W. & ROWELL, P. 1991. Natural tolerance of cyanobacteria to the herbicide glyphosate. New Phytologist, 119, 421-426. PRABHA, R., SINGH, D., GUPTA, S. & RAI, A. 2014. Whole genome phylogeny of Prochlorococcus marinus group of cyanobacteria: genome alignment and overlapping gene approach. Interdisciplinary Sciences: Computational Life Sciences, 6, 149-157. PREMANANDH, J., PRIYA, B., TENEVA, I., DZHAMBAZOV, B., PRABAHARAN, D. & UMA, L. 2006. Molecular characterization of marine cyanobacteria from the Indian subcontinent deduced from sequence analysis of the phycocyanin operon (cpcB-IGS-cpcA) and 16S-23S ITS region. Journal of Microbiology, 44, 607-616. QUAIL, M. A., SMITH, M., COUPLAND, P., OTTO, T. D., HARRIS, S. R., CONNOR, T. R., BERTONI, A., SWERDLOW, H. P. & GU, Y. 2012. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics, 13, 1-13. QUIBLIER, C., WOOD, S., ECHENIQUE-SUBIABRE, I., HEATH, M., VILLENEUVE, A. & HUMBERT, J. F. 2013. A review of current knowledge on toxic benthic freshwater cyanobacteria - Ecology, toxin production and risk management. Water Research, 47, 5464-5479. RAJANIEMI, P., HROUZEK, P., KLÁRA, K., WILLAME, R., RANTALA, A., HOFFMANN, L., KOMÁREK, J. & SIVONEN, K. 2005. Phylogenetic and morphological evaluation of the genera Anabaena, Aphanizomenon, Trichormus and Nostoc (Nostocales, Cyanobacteria). International Journal of Systematic and Evolutionary Microbiology, 55, 11-26. RAJASEKHAR, P., FAN, L., NGUYEN, T. & RODDICK, F. A. 2012. A review of the use of sonication to control cyanobacterial blooms. Water Research, 46, 4319- 4329. RAMBAUT, A. 2008. TreeStat. 1.2 ed. RAMBAUT, A. 2014. FigTree. 1.4.2 ed. RANTALA, A., RAJANIEMI-WACKLIN, P., LYRA, C., LEPISTÖ, L., RINTALA, J., MANKIEWICZ-BOCZEK, J. & SIVONEN, K. 2006. Detection of Microcystin- Producing Cyanobacteria in Finnish Lakes with Genus-Specific Microcystin Synthetase Gene E (mcyE) PCR and Associations with Environmental Factors. Applied and Environmental Microbiology, 72, 6101-6110. RASMUSSEN, J. P., GIGLIO, S., MONIS, P. T., CAMPBELL, R. J. & SAINT, C. P. 2008. Development and field testing of a real-time PCR assay for

120

REFERENCES cylindrospermopsin-producing cyanobacteria. Journal of Applied Microbiology, 104, 1503-1515. RASMUSSEN, U. & SVENNING, M. M. 1998. Fingerprinting of cyanobacteria based on PCR with primers derived from short and long tandemly repeated repetitive sequences. Applied and Environmental Microbiology, 64, 265-272. REHAKOVA, K., JOHANSEN, J. R., BOWEN, M. B., MARTIN, M. P. & SHEIL, C. A. 2014. Variation in secondary structure of the 16S rRNA molecule in cyanobacteria with implications for phylogenetic analysis. Fottea, 14, 161-178. RESSOM, R., FOONG, S. S., FITZGERALD, J., TURCZYNOWICZ, L., EL SAADI, O., RODER, D., MAYNARD, T. & FALCONER, I. R. 1994. Health Effects of Toxic Cyanobacteria (Blue-Green Algae), Canberra, National Health and Medical Research Council. REYNOLDS, C. S. 2006. Ecology of phytoplankton, New York, Cambridge University Press. REYNOLDS, C. S., OLIVER, R. L. & WALSBY, A. E. 1987. Cyanobacterial dominance: The role of buoyancy regulation in dynamic lake environments. New Zealand Journal of Marine and Freshwater Research, 21, 379-390. RINTA-KANTO, J. M., OUELLETTE, A. J. A., BOYER, G. L., TWISS, M. R., BRIDGEMAN, T. B. & WILHELM, S. W. 2005. Quantification of Toxic Microcystis spp. during the 2003 and 2004 Blooms in Western Lake Erie using Quantitative Real-Time PCR. Environmental Science & Technology, 39, 4198- 4205. RIPPKA, R. 1988. Isolation and purification of cyanobacteria. Methods in Enzymology, 167, 3-27. RIPPKA, R., CASTENHOLZ, R. W. & HERDMAN, M. 2015a. Subsection IV. Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. RIPPKA, R., DERUELLES, J., WATERBURY, J. B., HERDMAN, M. & STANIER, R. Y. 1979. Generic Assignments, Strain Histories and Properties of Pure Cultures of Cyanobacteria. Journal of General Microbiology, 111, 1-61. RIPPKA, R., WATERBURY, J. B., HERDMAN, M. & CASTENHOLZ, R. W. 2015b. Subsection II. In: WHITMAN, W. B. (ed.) Bergey's Manual of Systematics of Archaea and Bacteria. John Wiley & Sons, Ltd. ROBERTSON, B. R., TEZUKA, N. & WATANABE, M. M. 2001. Phylogenetic analyses of Synechococcus strains (cyanobacteria) using sequences of 16S rDNA and part of the phycocyanin operon reveal multiple evolutionary lines and reflect phycobilin content. International Journal of Systematic and Evolutionary Microbiology, 51, 861-71. ROBERTSON, P. J. & LAWTON, L. 1997. Destruction of cyanobacterial toxins by semiconductor photocatalysis. Chemical Communications, 393-394. ROBINSON, N. J., ROBINSON, P. J., GUPTA, A., BLEASBY, A. J., WHITTON, B. A. & MORBY, A. P. 1995. Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria. Nucleic Acids Research, 23, 729-735. ROBSON, B. J. & HAMILTON, D. P. 2003. Summer flow event induces a cyanobacterial bloom in a seasonal Western Australian estuary. Marine and Freshwater Research, 54, 139-151. ROCAP, G., DISTEL, D. L., WATERBURY, J. B. & CHISHOLM, S. W. 2002. Resolution of Prochlorococcus and Synechococcus Ecotypes by Using 16S-23S Ribosomal DNA Internal Transcribed Spacer Sequences. Applied and Environmental Microbiology, 68, 1180-1191.

121

REFERENCES RONQUIST, F., TESLENKO, M., VAN DER MARK, P., AYRES, D. L., DARLING, A., HÖHNA, S., LARGET, B., LIU, L., SUCHARD, M. A. & HUELSENBECK, J. P. 2012. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology, 61, 539-542. RONQUIST, F., VAN DER MARK, P. & HUELSENBECK, J. P. 2009. Bayesian phylogenetic analysis using MrBayes. In: LEMEY, P., SALEMI, M. & VANDAMME, A.-M. (eds.) The Phylogenetic Handbook: a Practical Approach to Phylogenetic Analysis and Hypothesis Testing. 2n edition ed. New York: Cambridge University Press. ROSENBERG, E., DELONG, E. F., LORY, S., STACKEBRANDT, E. & THOMPSON, F. (eds.) 2014. The Prokaryotes, Springer Berlin Heidelberg. RUBIN-BLUM, M., ANTLER, G., TSADOK, R., SHEMESH, E., AUSTIN, J. A., JR., COLEMAN, D. F., GOODMAN-TCHERNOV, B. N., BEN-AVRAHAM, Z. & TCHERNOV, D. 2014. First Evidence for the Presence of Iron Oxidizing Zetaproteobacteria at the Levantine Continental Margins. PLoS ONE, 9, e91456. RUNNEGAR, M. T., XIE, C., SNIDER, B. B., WALLACE, G. A., WEINREB, S. M. & KUHLENKAMP, J. 2002. In Vitro Hepatotoxicity of the Cyanobacterial Alkaloid Cylindrospermopsin and Related Synthetic Analogues. Toxicological Sciences, 67, 81-87. RYSKOV, A. P., JINCHARADZE, A. G., PROSNYAK, M. I., IVANOV, P. L. & LIMBORSKA, S. A. 1988. M13 phage DNA as a universal marker for DNA fingerprinting of animals, plants and microorganisms. FEBS Letters, 233, 388- 392. SAITOU, N. & NEI, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406-425. SAKER, M., MOREIRA, C., MARTINS, J., NEILAN, B. & VASCONCELOS, V. M. 2009. DNA profiling of complex bacterial populations: toxic cyanobacterial blooms. Applied Microbiology and Biotechnology, 85, 237-252. SAKER, M., VALE, M., KRAMER, D. & VASCONCELOS, V. 2007. Molecular techniques for the early warning of toxic cyanobacteria blooms in freshwater lakes and rivers. Applied Microbiology and Biotechnology, 75, 441-449. SAMBROOK, J. & MANIATIS, T. 2001. Molecular cloning: a laboratory manual, New York, Cold Spring Harbour: ColdSpring Harbour Laboratory. SÁNCHEZ-BARACALDO, P., HAYES, P. K. & BLANK, C. E. 2005. Morphological and habitat evolution in the Cyanobacteria using a compartmentalization approach. Geobiology, 3, 145-165. SCHRADER, K., K. & RIMANDO, A., M. 2003. Off-Flavors in Aquaculture: An Overview. Off-Flavors in Aquaculture. Washington: American Chemical Society. SCHRADER, K. K. & DENNIS, M. E. 2005. Cyanobacteria and earthy/musty compounds found in commercial catfish (Ictalurus punctatus) ponds in the Mississippi Delta and Mississippi–Alabama Blackland Prairie. Water Research, 39, 2807-2814. SCHRADER, K. K., RIMANDO, A. M., TUCKER, C. S., GLINSKI, J., CUTLER, S. J. & CUTLER, H. G. 2004. Evaluation of the Natural Product SeaKleen for Controlling the Musty-Odor-Producing Cyanobacterium Oscillatoria perornata in Catfish Ponds. North American Journal of Aquaculture, 66, 20-28. SCHULTZ JR, G., KOVATCH, J. & ANNEKEN, E. 2013. Bacterial diversity in a large, temperate, heavily modified river, as determined by pyrosequencing. Aquatic Microbial Ecology, 70, 169-179. 122

REFERENCES SCIUTO, K., ANDREOLI, C., RASCIO, N., LA ROCCA, N. & MORO, I. 2012. Polyphasic approach and typification of selected Phormidium strains (Cyanobacteria). Cladistics, 28, 357-374. SCIUTO, K. & MORO, I. 2015. Cyanobacteria: the bright and dark sides of a charming group. Biodiversity and Conservation, 24, 711-738. SEDAGHATINIA, A., ATAN, R. B., ARIFIN, K. & MURAD, M. 2009. Comparison and evaluation of multiple sequence alignment tools in bioinformatics. International Journal of Computer Science and Network Security, 9, 51-56. SEO, P. S. & YOKOTA, A. 2003. The phylogenetic relationships of cyanobacteria inferred from 16S rRNA, gyrB, rpoC1 and rpoD1 gene sequences. Journal of General and Applied Microbiology, 49, 191-203. SEO, T. K. 2008. Calculating bootstrap probabilities of phylogeny using multilocus sequence data. Molecular Biology and Evolution, 25, 960-971. SEO, T. K., KISHINO, H. & THORNE, J. L. 2005. Incorporating gene-specific variation when inferring and evaluating optimal evolutionary tree topologies from multilocus sequence data. Proceedings of the National Academy of Sciences of the United States of America, 102, 4436-4441. SEVERIN, I., ACINAS, S. G. & STAL, L. J. 2010. Diversity of nitrogen-fixing bacteria in cyanobacterial mats. FEMS Microbiology Ecology, 73, 514-525. SHIH, P. M., WU, D., LATIFI, A., AXEN, S. D., FEWER, D. P., TALLA, E., CALTEAU, A., CAI, F., TANDEAU DE MARSAC, N., RIPPKA, R., HERDMAN, M., SIVONEN, K., COURSIN, T., LAURENT, T., GOODWIN, L., NOLAN, M., DAVENPORT, K. W., HAN, C. S., RUBIN, E. M., EISEN, J. A., WOYKE, T., GUGGER, M. & KERFELD, C. A. 2013. Improving the coverage of the cyanobacterial phylum using diversity-driven genome sequencing. Proceedings of the National Academy of Sciences, 110, 1053-1058. SHIRAI, M., OHTAKE, A., SANO, T., MATSUMOTO, S., SAKAMOTO, T., SATO, A., AIDA, T., HARADA, K., SHIMADA, T. & SUZUKI, M. 1991. Toxicity and toxins of natural blooms and isolated strains of Microcystis spp. (Cyanobacteria) and improved procedure for purification of cultures. Applied and Environmental Microbiology, 57, 1241-1245. SIEGESMUND, M. A., JOHANSEN, J. R., KARSTEN, U. & FRIEDL, T. 2008. Coleofasciculus Gen. Nov (Cyanobacteria): Morphological and molecular criteria for the revision of the genus Microcoleus gomont. Journal of Phycology, 44, 1572-1585. SIERP, M., QIN, J. & RECKNAGEL, F. 2009. Biomanipulation: a review of biological control measures in eutrophic waters and the potential for Murray cod Maccullochella peelii peelii to promote water quality in temperate Australia. Reviews in Fish Biology and Fisheries, 19, 143-165. SINCLAIR, L., OSMAN, O. A., BERTILSSON, S. & EILER, A. 2015. Microbial Community Composition and Diversity via 16S rRNA Gene Amplicons: Evaluating the Illumina Platform. PLoS ONE, 10, e0116955. SINGH, N. & DHAR, D. 2011. Phylogenetic relatedness among Spirulina and related cyanobacterial genera. World Journal of Microbiology and Biotechnology, 27, 941-951. SINHA, R., PEARSON, L., DAVIS, T., MUENCHHOFF, J., PRATAMA, R., JEX, A., BURFORD, M. & NEILAN, B. 2014. Comparative genomics of Cylindrospermopsis raciborskii strains with differential toxicities. BMC Genomics, 15, 83. SIVONEN, K. & JONES, G. J. 1999. Cyanobacterial toxins. In: CHORUS, I. & BARTRAM, J. (eds.) Toxic Cyanobacteria in Water: A Guide to Their Public Health Consequences, Monitoring and Management. London: E & FN Spon. 123

REFERENCES SMITH, J. L., BOYER, G. L. & ZIMBA, P. V. 2008. A review of cyanobacterial odorous and bioactive metabolites: Impacts and management alternatives in aquaculture. Aquaculture, 280, 5-20. SMITH, M. D., GOATER, S. E., REICHWALDT, E. S., KNOTT, B. & GHADOUANI, A. 2010. Effects of recent increases in salinity and nutrient concentrations on the microbialite community of Lake Clifton (Western Australia): are the thrombolites at risk? Hydrobiologia, 649, 207-216. SOARES, M. C. S., LOBÃO, L. M., VIDAL, L. O., NOYMA, N. P., BARROS, N. O., CARDOSO, S. J. & ROLAND, F. 2011. Light Microscopy in Aquatic Ecology: Methods for Plankton Communities Studies. In: CHIARINI-GARCIA, H. & MELO, R. C. N. (eds.) Light Microscopy. New York: Humana Press. SOKAL, R. R. & SNEATH, P. H. A. 1963. Principles of numerical taxonomy, San Francisco, W. H. Freeman. SOM, A. A. 2015. Causes, consequences and solutions of phylogenetic incongruence. Briefings in bioinformatics, 16, 536-548. SOO, R. M., SKENNERTON, C. T., SEKIGUCHI, Y., IMELFORT, M., PAECH, S. J., DENNIS, P. G., STEEN, J. A., PARKS, D. H., TYSON, G. W. & HUGENHOLTZ, P. 2014. An Expanded Genomic Representation of the Phylum Cyanobacteria. Genome Biology and Evolution, 6, 1031-1045. STACKEBRANDT, E. & GOEBEL, B. M. 1994. Taxonomic Note: A place for DNA- DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. International Journal of Systematic Bacteriology, 44, 846-849. STACKEBRANDT, E., MURRAY, R. G. E. & TRÜPER, H. G. 1988. Proteobacteria classis nov., a Name for the Phylogenetic Taxon That Includes the “Purple Bacteria and Their Relatives”. International Journal of Systematic and Evolutionary Microbiology, 38, 321-325. STALEY, C., UNNO, T., GOULD, T. J., JARVIS, B., PHILLIPS, J., COTNER, J. B. & SADOWSKY, M. J. 2013. Application of Illumina next-generation sequencing to characterize the bacterial community of the Upper Mississippi River. Journal of Applied Microbiology, 115, 1147-1158. STAMATAKIS, A., HOOVER, P. & ROUGEMONT, J. 2008. A rapid bootstrap algorithm for the RAxML web servers. Systematic Biology, 57, 758-771. STANIER, R. Y., SISTROM, W. R., HANSEN, T. A., WHITTON, B. A., CASTENHOLZ, R. W., PFENNIG, N., GORLENKO, V. N., KONDRATIEVA, E. N., EIMHJELLEN, K. E., WHITTENBURY, R., GHERNA, R. L. & TRUPER, H. G. 1978. Proposal to place nomenclature of cyanobacteria (blue- green-algae) under rules of International Code of Nomenclature of Bacteria. International Journal of Systematic Bacteriology, 28, 335-336. STEFFENSEN, D., BURCH, M., NICHOLSON, B., DRIKAS, M. & BAKER, P. 1999. Management of toxic blue–green algae (cyanobacteria) in Australia. Environmental Toxicology, 14, 183-195. STEVEN, B., MCCANN, S. & WARD, N. L. 2012. Pyrosequencing of plastid 23S rRNA genes reveals diverse and dynamic cyanobacterial and algal populations in two eutrophic lakes. FEMS Microbiology Ecology, 607-615. STEWART, E. J. 2012. Growing Unculturable Bacteria. Journal of Bacteriology, 194, 4151-4160. STEWART, I., SEAWRIGHT, A. A. & SHAW, G. R. 2008. Cyanobacterial poisoning in livestock, wild mammals and birds – an overview . In: HUDNELL, H. K. (ed.) Cyanobacterial Harmful Algal Blooms: State of the Science and Research Needs. New York: Springer.

124

REFERENCES STILLER, J. W., REEL, D. C. & JOHNSON, J. C. 2003. A SINGLE ORIGIN OF PLASTIDS REVISITED: CONVERGENT EVOLUTION IN ORGANELLAR GENOME CONTENT1. Journal of Phycology, 39, 95-105. STRUNECKY, O., ELSTER, J. & KOMÁREK, J. 2011. Taxonomic revision of the freshwater cyanobacterium "Phormidium" murrayi = Wilmottia murrayi. Fottea, 11, 57-71. STUCKEN, K., MURILLO, A. A., SOTO-LIEBE, K., FUENTES-VALDES, J. J., MENDEZ, M. A. & VASQUEZ, M. 2009. Toxicity phenotype does not correlate with phylogeny of Cylindrospermopsis raciborskii strains. Syst Appl Microbiol, 32, 37-48. SUDA, S., WATANABE, M. M., OTSUKA, S., MAHAKAHANT, A., YONGMANITCHAI, W., NOPARTNARAPORN, N., LIU, Y. D. & DAY, J. G. 2002. Taxonomic revision of water-bloom-forming species of oscillatorioid cyanobacteria. International Journal of Systematic and Evolutionary Microbiology, 52, 1577-1595. TAKAHASHI, S., TOMITA, J., NISHIOKA, K., HISADA, T. & NISHIJIMA, M. 2014. Development of a Prokaryotic Universal Primer for Simultaneous Analysis of Bacteria and Archaea Using Next-Generation Sequencing. PLoS ONE, 9, e105592. TALAVERA, G. & CASTRESANA, J. 2007. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic Biology, 56, 564-577. TAMURA, K., PETERSON, D., PETERSON, N., STECHER, G., NEI, M. & KUMAR, S. 2011. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution, 28, 2731-2739. TAN, B., NG, C. M., NSHIMYIMANA, J. P., LOH, L.-L., GIN, K. Y.-H. & THOMPSON, J. R. 2015. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities. Frontiers in Microbiology, 6. TATON, A., GRUBISIC, S., BRAMBILLA, E., DE WIT, R. & WILMOTTE, A. 2003. Cyanobacterial Diversity in Natural and Artificial Microbial Mats of Lake Fryxell (McMurdo Dry Valleys, Antarctica): a Morphological and Molecular Approach. Applied and Environmental Microbiology, 69, 5157-5169. THIERGART, T., LANDAN, G. & MARTIN, W. F. 2014. Concatenated alignments and the case of the disappearing tree. BMC Evolutionary Biology, 14. THOMPSON, J. D., HIGGINS, D. G. & GIBSON, T. J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22, 4673-4680. THOMPSON, J. D., LINARD, B., LECOMPTE, O. & POCH, O. 2011. A comprehensive benchmark study of multiple sequence alignment methods: Current challenges and future perspectives. PLoS ONE, 6, e18093. TOMITANI, A., KNOLL, A. H., CAVANAUGH, C. M. & OHNO, T. 2006. The evolutionary diversification of cyanobacteria: Molecular-phylogenetic and paleontological perspectives. Proceedings of the National Academy of Sciences of the United States of America, 103, 5442-5447. TRUST, S. R. 2005. River Science: The science behind the Swan-Canning Cleanup Program. Perth. TUCKER, C., S. & HARGREAVES, J., A. 2003. Copper Sulfate to Manage Cyanobacterial Off-Flavors in Pond-Raised Channel Catfish. Off-Flavors in Aquaculture. American Chemical Society. 125

REFERENCES TURNER, S., PRYER, K. M., MIAO, V. P. W. & PALMER, J. D. 1999. Investigating deep phylogenetic relationships among cyanobacteria and plastids by small submit rRNA sequence analysis. Journal of Eukaryotic Microbiology, 46, 327- 338. UNTERGASSER, A., CUTCUTACHE, I., KORESSAAR, T., YE, J., FAIRCLOTH, B. C., REMM, M. & ROZEN, S. G. 2012. Primer3-new capabilities and interfaces. Nucleic Acids Research, 40. VALÉRIO, E., CHAMBEL, L., PAULINO, S., FARIA, N., PEREIRA, P. & TENREIRO, R. 2009. Molecular identification, typing and traceability of cyanobacteria from freshwater reservoirs. Microbiology, 155, 642-656. VALÉRIO, E., PEREIRA, P., SAKER, M. L., FRANCA, S. & TENREIRO, R. 2005. Molecular characterization of Cylindrospermopsis raciborskii strains isolated from Portuguese freshwaters. Harmful Algae, 4, 1044-1052. VAN APELDOORN, M. E., VAN EGMOND, H. P., SPEIJERS, G. J. A. & BAKKER, G. J. I. 2007. Toxins of cyanobacteria. Molecular Nutrition & Food Research, 51, 7-60. VAN DE PEER, Y. 2009. Phylogenetic inference based on distance methods. In: LEMEY, P., SALEMI, M. & VANDAMME, A.-M. (eds.) Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing. 2nd Edition ed. New York: Cambridge University Press. VAN DE PEER, Y., CHAPELLE, S. & DE WACHTER, R. 1996. A quantitative map of nucleotide substitution rates in bacterial rRNA. Nucleic Acids Research, 24, 3381-3391. VAN DER MOLEN, D. T. & PORTIELJE, R. 1999. Multi-lake studies in The : trends in eutrophication. Hydrobiologia, 408-409, 359-365. VARON, A., VINH, L. S. & BOMASH, I. 2010. POY version 4: phylogenetic analysis using dynamic homologies. Cladistics, 26, 72-78. VARON, A. & WHEELER, W. C. 2012. The tree alignment problem. BMC Bioinformatics, 13. VERA, M., LAGOMARSINO, L., SYLVESTER, M., PÉREZ, G., RODRÍGUEZ, P., MUGNI, H., SINISTRO, R., FERRARO, M., BONETTO, C., ZAGARESE, H. & PIZARRO, H. 2010. New evidences of Roundup® (glyphosate formulation) impact on the periphyton community and the water quality of freshwater ecosystems. Ecotoxicology, 19, 710-721. VESTHEIM, H. & JARMAN, S. N. 2008. Blocking primers to enhance PCR amplification of rare sequences in mixed samples – a case study on prey DNA in Antarctic krill stomachs. Frontiers in Zoology, 5, 12. VILLENEUVE, A., S, L. & J.F., H. 2011. Herbicide Contamination of Freshwater Ecosystems: Impact on Microbial Communities. In: STOYTCHEVA, M. (ed.) Pesticides - Formulations, Effects, FateI. InTech. VINCENT, W. F. 2009. Cyanobacteria. In: LIKENS, G. E. (ed.) Encyclopedia of Inland Waters. Oxford: Elsevier. WACKLIN, P., HOFFMANN, L. & KOMÁREK, J. 2009. Nomenclatural validation of the genetically revised cyanobacterial genus Dolichospermum (RALFS ex BORNET et FLAHAULT) comb. nova. Fottea, 9, 59-64. WALSBY, A. E. 1994. Gas vesicles. Microbiological Reviews, 58, 94-144. WATERBURY, J. 2006. The Cyanobacteria—Isolation, Purification and Identification The Prokaryotes. In: DWORKIN, M., FALKOW, S., ROSENBERG, E., SCHLEIFER, K.-H. & STACKEBRANDT, E. (eds.). New York: Springer. WATSON, S. B. 2003. Cyanobacterial and eukaryotic algal odour compounds: signals or by-products? A review of their biological activity. Phycologia, 42, 19.

126

REFERENCES WATSON, S. B. 2004. Aauatic Taste and Odour: A Primary Signal of Drinking-water Integrity. Journal of Toxicology and Environmental Health, Part A, 67, 1779- 1795. WATSON, S. B., BROWNLEE, B., SATCHWILL, T. & HARGESHEIMER, E. E. 2000. Quantitative analysis of trace levels of geosmin and MIB in source and drinking water using headspace SPME. Water Research, 34, 2818-2828. WATSON, S. B., RIDAL, J. & BOYER, G. L. 2008. Taste and odour and cyanobacterial toxins: impairment, prediction, and management in the Great Lakes.(Perspective/ Perspective)(Report). Canadian Journal of Fisheries and Aquatic Sciences, 65, 1779-1797. WEISS, J. H., KOH, J. Y. & CHIOI, D. W. 1989. Neurotoxicity of beta-N- methylamino-L-alanine (BMAA) and beta-N-oxalylamino-L-alanine (BOAA) on cultured cortical neurons. Brain Research, 497, 64-71. WELKER, M. & ERHARD, M. 2007. Consistency between chemotyping of single filaments of Planktothrix rubescens (cyanobacteria) by MALDI-TOF and the peptide patterns of strains determined by HPLC-MS. Journal of Mass Spectrometry, 42, 1062-1068. WELKER, M., FASTNER, J., ERHARD, M. & VON DOHREN, H. 2002. Applications of MALDI-TOF MS analysis in cyanotoxin research. Environmental Toxicology, 17, 367-374. WELKER, M. & MOORE, E. R. B. 2011. Applications of whole-cell matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry in systematic microbiology. Systematic and Applied Microbiology, 34, 2-11. WELKER, M. & VON DOHREN, H. 2006. Cyanobacterial peptides - Nature's own combinatorial biosynthesis. Fems Microbiology Reviews, 30, 530-563. WHITELEY, A. S., JENKINS, S., WAITE, I., KRESOJE, N., PAYNE, H., MULLAN, B., ALLCOCK, R. & O'DONNELL, A. 2012. Microbial 16S rRNA Ion Tag and community metagenome sequencing using the Ion Torrent (PGM) Platform. Journal of Microbiological Methods, 91, 80-88. WHITTON, B. A. & POTTS, M. 2000. Introduction to the Cyanobacteria. In: WHITTON, B. A. & POTTS, M. (eds.) The ecology of Cyanobacteria. Their diversity in time and space. Berlin: Springer. WILLAME, R., BOUTTE, C., GRUBISIC, S., WILMOTTE, A., KOMÁREK, J. & HOFFMANN, L. 2006. Morphological and molecular characterization of planktonic cyanobacteria from and Luxembourg Journal of Phycology, 42, 1312-1332. WILMOTTE, A. 2004. Molecular Evolution and Taxonomy of the Cyanobacteria. In: BRYANT, D. (ed.) The Molecular Biology of Cyanobacteria. Netherlands: Springer. WILMOTTE, A. & HERDMAN, M. 2001. Phylogenetic relationships among the cyanobacteria based on 16S rRNA sequences. In: GARRITY, G. M. & CASTENHOLZ, R. W. (eds.) Bergey's Manual of Systematic Bacteriology. Volume One : The Archaea and the Deeply Branching and Phototrophic Bacteria. New York: Springer. WILSON, K. M., SCHEMBRI, M. A., BAKER, P. D. & SAINT, C. P. 2000. Molecular characterization of the toxic cyanobacterium Cylindrospermopsis raciborskii and design of a species-specific PCR. Applied and Environmental Microbiology, 66, 332-338. WOESE, C. R. 1987. Bacterial Evolution. Microbiological Reviews, 51, 221-271. WOOD, S. A., SMITH, K. F., BANKS, J. C., TREMBLAY, L. A., RHODES, L., MOUNTFORT, D., CARY, S. C. & POCHON, X. 2013. Molecular genetic tools for environmental monitoring of New Zealand's aquatic habitats, past, 127

REFERENCES present and the future. New Zealand Journal of Marine and Freshwater Research, 47, 90-119. WU, X., JOYCE, E. M. & MASON, T. J. 2011. The effects of ultrasound on cyanobacteria. Harmful Algae, 10, 738-743. XIAO, X., SOGGE, H., LAGESEN, K., TOOMING-KLUNDERUD, A., JAKOBSEN, K. S. & ROHRLACK, T. 2014. Use of High Throughput Sequencing and Light Microscopy Show Contrasting Results in a Study of Phytoplankton Occurrence in a Freshwater Environment. PLoS ONE, 9, e106510. YANG, R., MURPHY, C., SONG, Y., NG-HUBLIN, J., ESTCOURT, A., HIJJAWI, N., CHALMERS, R., HADFIELD, S., BATH, A., GORDON, C. & RYAN, U. 2013. Specific and quantitative detection and identification of Cryptosporidium hominis and C. parvum in clinical and environmental samples. Experimental Parasitology, 135, 142-147. YANG, Z. & RANNALA, B. 2012. Molecular phylogenetics: principles and practice. Nature Reviews Genetics, 13, 303-314. YOUNG, J. M. 2001. Implications of alternative classifications and horizontal gene transfer for bacterial taxonomy. International Journal of Systematic and Evolutionary Microbiology, 51, 945-953. YOUNG, W. F., HORTH, H., CRANE, R., OGDEN, T. & ARNOTT, M. 1996. Taste and odour threshold concentrations of potential potable water contaminants. Water Research, 30, 331-340. ZANI, S., MELLON, M. T., COLLIER, J. L. & ZEHR, J. P. 2000. Expression of nifH Genes in Natural Microbial Assemblages in Lake George, New York, Detected by Reverse Transcriptase PCR. Applied and Environmental Microbiology, 66, 3119-3124. ZAPOMĚLOVÁ, E., JEZBEROVÁ, J., HROUZEK, P., HISEM, D., ŘEHÁKOVÁ, K. & KOMÁRKOVÁ, J. 2009. Polyphasic characterisation of three strains of Anabaena reniformis and Aphanizomenon aphanizomenoides (cyanobacteria) and their reclassification to Sphaerospermum Gen. Nov. (incl. Anabaena kisseleviana). Journal of Phycology, 45, 1363-1373. ZHAN, A., HULÁK, M., SYLVESTER, F., HUANG, X., ADEBAYO, A. A., ABBOTT, C. L., ADAMOWICZ, S. J., HEATH, D. D., CRISTESCU, M. E. & MACISAAC, H. J. 2013. High sensitivity of 454 pyrosequencing for detection of rare species in aquatic communities. Methods in Ecology and Evolution, 4, 558-565. ZIMBA, P. V. & GRIMM, C. C. 2003. A synoptic survey of musty/muddy odor metabolites and microcystin toxin occurrence and concentration in southeastern USA channel catfish (Ictalurus punctatus Ralfinesque) production ponds. Aquaculture, 218, 81-87. ZOUBOULIS, A., SAMARAS, P., NTAMPOU, X. & PETALA, M. 2007. Potential Ozone Applications for Water/Wastewater Treatment. Separation Science and Technology, 42, 1433-1446. ZWIENER, C. & FRIMMEL, F. H. 2004. LC-MS analysis in the aquatic environment and in water treatment technology - a critical review - Part II: Applications for emerging contaminants and related pollutants, microorganisms and humic acids. Analytical and Bioanalytical Chemistry, 378, 862-874.

128

APPENDIX APPENDIX

Appendix 1: Water collection sites, from which cyanobacteria isolation was attempted, and the number of times/samples collected. Number of samples Source type Location collected Great Southern Region 1 6 Great Southern Region 2 18 Great Southern Region 3 2 Great Southern Region 4 13 Protected freshwater Great Southern Region 5 9 reservoirs (n=10) Great Southern Region 6 10 Great Southern Region 7 6 Great Southern Region 8 17 Great Southern Region 9 6 Great Southern Region 10 1 Open rural lentic systems Hillgrove, Queensland 4 (n=1) Buayanup Drain 1 Glenn Brook 1 Open rural lotic systems Lesmurdie Brook 1 (n=5) Vasse River 3 Williams River 1 Blue Gum Lake 1 Booragoon Lake 1 Chelodina Wetlands 2 City of Swan lake 1 Open urban freshwater Emu Lake 1 systems Frederick Baldwin Park 2 (n=10) Hyde Park 3 Lake Monger 2 North Lake 1 Piney Lakes 2 WWTP 1 1 Wastewater treatment WWTP 2 1 plants (n=3) WWTP 3 1 119

129

APPENDIX Appendix 2: Preparation of ASM-1 media (Gorham et al., 1964) and modified ASM-1 medium. Final Conc. (mg/L) ASM-1 ASM-N- ASM- N0 Compound Formula - - (10% NO3 ) (no NO3 )

Sodium nitrate NaNO3 170 17 0 Potassium Phosphate K2HPO4 17.4 17.4 17.4 Di-sodium hydrogen Phosphate Na2HPO4 14.2 14.2 14.2 Magnesium chloride MgCl2.H2O 40.62 40.62 40.62 Magnesium sulfate MgSO4.7H2O 49.33 49.33 49.33 Calcium chloride CaCl2.2H2O 29.4 29.4 29.4 Ferric chloride FeCl3.6H2O 1.0835 1.0835 1.0835 Boric acid H3BO3 2.47 2.47 2.47 Manganese (II) chloride MnCl2.4H2O 1.3686 1.3686 1.3686 Zinc chloride ZnCl2 0.44 0.44 0.44 Sodium EDTA Na2EDTA 6.64 6.64 6.64 Cobalt sulfate CoSO4.7H2O 0.0216 0.0216 0.0216 Copper (II) chloride CuCl2.2H2O 0.00013 0.00013 0.00013

130

APPENDIX Appendix 3: List of reference primers used in the study. Primer Annealing Fragment Reference Region/Gene Sequence (5’-3-) Name temperature length (bp) CGC CCG CCG CGC CCC GCG CCC GTC CCG CCG (McGregor and 338F 11 cycles 16S rDNA CCC CCG CCC TCC YAC CGG GAG GCA G touchdown, Rasmussen, 2008) hypervariable region 4 781R (a) GAC TAC TGG GGT ATC TAA TCC CAT T 65C - 55C, 467 (V4) 40 cycles at (Nübel et al., 1997) 781R (b) GAC TAC AGG GGT ATC TAA TCC CTT T 55C

MICR 184F GCC GCR AGG TGA AAM CTA A (Neilan et al., 1997a) 16S qPCR primers MICR 431R AAT CCA AAR ACC TTC CTC CC 55C 230 MICR 228F 56FAM-AAG AGC TTG -/ZEN/-CGT CTG ATT AGC (Rinta-Kanto et al., (Taq) TAG T-/3IABkFQ 2005) CSIF G(T/C)CACG CCC GAA GTC (G/A)TT AC 20 cycles (Janse et al., 2003) touchdown, 16-23S ITS 62C - 52C, variable (Neilan et al., 23ULR CCT CTG TGT GCC TAG GTA TC 10 cycles at 1997b) 52C1.6*

rpoC1 GAG CTC YAW NAC CAT CCA YTC NGG (Palenik and rpoC1 50C 612 rpoC1-T GGT ACC NAA YGG NSA RRT NGT TGG Haselkorn, 1992) TAG TGT AAA ACG ACG GCC AGT TGY YTK CGC cpcBF GAC ATG (Robertson et al., Phycocyanin operon 56.4C 585 TAG CAG GAA ACA GCT ATG ACG TGG TGT ARG 2001) cpcAR GGA AYT T

131

APPENDIX Appendix 4: List of isolates and loci successfully sequenced and used in Chapter 4. 16S-23S Isolate ID 16S rDNA rpoC1 cpcBA-IGS ITS Baldwin Pond Type 1   Baldwin Pond Type 2   Baldwin Pond Type 3      Blue Gum Lake Type 1    Buayanup Drain Type 2     Buayanup Drain Type 4    Chelodina Wetlands Type 1    Chelodina Wetlands Type 2     Emu Lake Type 2   GS1-1    GS1-2     GS2-1     GS3-1   GS3-2   GS4-1    GS4-2    GS5-1     GS5-2     GS5-3    GS5-4    GS5-5  GS6-1     GS6-2    GS6-3      GS8-1    GS9-1     GS9-2    GS9-3    Hyde Park Type 1      Hyde Park Type 3  Hyde Park Type 4   Lake Monger Type 1  Lesmurdie Brook Type 1       Piney Lakes Type 1   Vasse River Type 1      Vasse River Type 2    Vasse River Type 3    Vasse River Type 6     Vasse River Type 7    Vasse River Type 8   Vasse River Type 9      Vasse River Type 12   Vasse River Type 13       Vasse River Type 14 

132

APPENDIX

Vasse River Type 15  Williams River Type 1       WWTP1-1   WWTP1-2   WWTP1-3    

133

APPENDIX Appendix 5: Tree matric values obtained for the phylogenies constructed using the 16S rDNA locus Tree construction Treeness N_bar Cherry Tree Tree Alignment Curation? Tree strategy count Height length method construction ClustalW_BI 0.64 8.38 29.00 2.17 13.83 ClustalW N BI ClustalW_Cur_BI 0.66 9.31 27.00 2.31 15.19 ClustalW Y BI ClustalW_Cur_ML 0.71 15.89 34.00 1.41 6.92 ClustalW Y ML ClustalW_Cur_NJ 0.69 12.90 28.00 0.48 2.35 ClustalW Y NJ ClustalW_ML 0.70 15.32 39.00 1.55 7.85 ClustalW N ML ClustalW_NJ 0.68 13.05 30.00 0.56 2.94 ClustalW N NJ MAFFT_BI 0.66 11.46 27.00 3.05 16.18 MAFFT N BI MAFFT_Cur_BI 0.65 9.18 26.00 2.55 16.77 MAFFT Y BI MAFFT_Cur_ML 0.72 17.45 32.00 0.99 4.91 MAFFT Y ML MAFFT_Cur_NJ 0.69 12.91 28.00 0.52 2.61 MAFFT Y NJ MAFFT_ML 0.69 14.07 38.00 1.51 7.82 MAFFT N ML MAFFT_NJ 0.68 12.49 30.00 0.55 2.93 MAFFT N NJ MUSCLE_BI 0.67 11.61 26.00 3.01 16.27 MUSCLE N BI MUSCLE_Cur_BI 0.66 9.89 27.00 2.67 16.28 MUSCLE Y BI MUSCLE_Cur_ML 0.73 13.85 33.00 1.48 7.43 MUSCLE Y ML MUSCLE_Cur_NJ 0.64 11.06 28.00 0.29 2.35 MUSCLE Y NJ MUSCLE_ML 0.70 14.68 34.00 1.54 8.03 MUSCLE N ML MUSCLE_NJ 0.68 12.48 31.00 0.55 2.93 MUSCLE N NJ PAGAN_BI 0.66 8.90 28.00 2.47 15.02 PAGAN N BI PAGAN_Cur_BI 0.65 9.13 26.00 2.52 16.60 PAGAN Y BI PAGAN_Cur_ML 0.71 14.64 36.00 1.48 7.30 PAGAN Y ML PAGAN_Cur_NJ 0.69 12.91 28.00 0.53 2.63 PAGAN Y NJ PAGAN_ML 0.71 14.69 36.00 1.47 7.27 PAGAN N ML PAGAN_NJ 0.63 10.43 30.00 0.33 2.87 PAGAN N NJ PRANK_BI 0.64 7.35 28.00 1.98 13.30 PRANK N BI PRANK_Cur_BI 0.65 9.46 26.00 2.32 15.17 PRANK Y BI PRANK_Cur_ML 0.72 14.96 34.00 1.74 8.37 PRANK Y ML PRANK_Cur_NJ 0.64 11.06 28.00 0.30 2.35 PRANK Y NJ PRANK_ML 0.70 13.57 38.00 1.59 8.21 PRANK N ML PRANK_NJ 0.68 12.72 30.00 0.53 2.78 PRANK N NJ

134

APPENDIX Appendix 6: Tree matric values obtained for the phylogenies constructed using the rpoC1 locus Tree construction Cherry Tree Tree Alignment Tree strategy Treeness N_bar count Height length method Curation? construction ClustalW_BI 0.47 11.25 28.00 2.01 13.94 ClustalW N BI ClustalW_Cur_BI 0.47 11.24 29.00 1.96 13.89 ClustalW Y BI ClustalW_Cur_ML 0.47 15.33 27.00 2.61 17.73 ClustalW Y ML ClustalW_Cur_NJ 0.47 16.49 26.00 1.53 9.32 ClustalW Y NJ ClustalW_ML 0.47 14.07 29.00 2.44 18.21 ClustalW N ML ClustalW_NJ 0.47 15.72 28.00 2.01 10.34 ClustalW N NJ MAFFT_BI 0.47 11.00 28.00 2.02 13.96 MAFFT N BI MAFFT_Cur_BI 0.47 10.55 28.00 2.05 13.94 MAFFT Y BI MAFFT_Cur_ML 0.47 14.54 30.00 2.79 18.01 MAFFT Y ML MAFFT_Cur_NJ 0.47 16.98 26.00 1.61 9.65 MAFFT Y NJ MAFFT_ML 0.48 14.27 31.00 2.68 19.09 MAFFT N ML MAFFT_NJ 0.46 14.30 27.00 0.95 10.06 MAFFT N NJ MUSCLE_BI 0.46 10.83 26.00 2.00 13.65 MUSCLE N BI MUSCLE_Cur_BI 0.46 10.16 26.00 2.06 13.66 MUSCLE Y BI MUSCLE_Cur_ML 0.48 13.86 30.00 2.56 18.72 MUSCLE Y ML MUSCLE_Cur_NJ 0.48 15.41 26.00 1.54 9.70 MUSCLE Y NJ MUSCLE_ML 0.48 13.64 29.00 1.46 11.15 MUSCLE N ML MUSCLE_NJ 0.48 15.72 28.00 2.03 10.58 MUSCLE N NJ PAGAN_BI 0.46 11.00 28.00 1.96 13.95 PAGAN N BI PAGAN_Cur_BI 0.47 11.26 29.00 1.86 13.83 PAGAN Y BI PAGAN_Cur_ML 0.46 14.26 26.00 2.50 17.34 PAGAN Y ML PAGAN_Cur_NJ 0.47 15.04 27.00 1.46 9.23 PAGAN Y NJ PAGAN_ML 0.48 13.55 30.00 2.62 18.94 PAGAN N ML PAGAN_NJ 0.47 14.47 27.00 1.52 9.67 PAGAN N NJ PRANK_BI 0.47 11.73 28.00 2.09 13.61 PRANK N BI PRANK_Cur_BI 0.46 10.25 25.00 2.26 14.25 PRANK Y BI PRANK_Cur_ML 0.47 14.19 28.00 2.59 17.90 PRANK Y ML PRANK_Cur_NJ 0.48 15.09 25.00 1.51 9.26 PRANK Y NJ PRANK_ML 0.48 15.16 30.00 2.72 18.36 PRANK N ML PRANK_NJ 0.48 15.98 27.00 1.57 9.68 PRANK N NJ

135

APPENDIX Appendix 7: Tree matric values obtained for the phylogenies constructed using the cpcBA-IGS locus Tree construction Cherry Tree Tree Alignment Tree strategy Treeness N_bar count Height length method Curation? construction ClustalW_BI 0.43 9.66 23.00 3.71 13.77 ClustalW N BI ClustalW_Cur_BI 0.43 8.71 22.00 3.05 11.89 ClustalW Y BI ClustalW_Cur_ML 0.53 12.28 29.00 1.41 8.25 ClustalW Y ML ClustalW_Cur_NJ 0.53 13.64 25.00 1.37 7.02 ClustalW Y NJ ClustalW_ML 0.56 11.21 23.00 2.61 12.12 ClustalW N ML ClustalW_NJ 0.53 11.47 21.00 1.93 9.54 ClustalW N NJ MAFFT_BI 0.51 11.47 23.00 1.83 11.83 MAFFT N BI MAFFT_Cur_BI 0.47 9.21 21.00 2.41 12.46 MAFFT Y BI MAFFT_Cur_ML 0.47 12.90 28.00 1.82 8.98 MAFFT Y ML MAFFT_Cur_NJ 0.49 15.14 23.00 1.74 7.65 MAFFT Y NJ MAFFT_ML 0.50 12.11 25.00 2.20 12.44 MAFFT N ML MAFFT_NJ 0.48 12.73 22.00 1.14 8.44 MAFFT N NJ MUSCLE_BI 0.47 11.24 23.00 2.11 14.51 MUSCLE N BI MUSCLE_Cur_BI 0.48 8.57 22.00 2.26 13.82 MUSCLE Y BI MUSCLE_Cur_ML 0.49 11.90 27.00 1.51 8.15 MUSCLE Y ML MUSCLE_Cur_NJ 0.51 18.89 24.00 1.67 6.70 MUSCLE Y NJ MUSCLE_ML 0.46 11.97 25.00 2.69 16.76 MUSCLE N ML MUSCLE_NJ 0.46 11.51 20.00 1.66 9.82 MUSCLE N NJ PAGAN_BI 0.51 11.37 22.00 1.38 8.94 PAGAN N BI PAGAN_Cur_BI 0.39 7.17 17.00 2.52 12.29 PAGAN Y BI PAGAN_Cur_ML 0.48 15.63 22.00 2.24 8.50 PAGAN Y ML PAGAN_Cur_NJ 0.47 16.13 22.00 2.13 6.69 PAGAN Y NJ PAGAN_ML 0.51 11.85 25.00 1.50 9.54 PAGAN N ML PAGAN_NJ 0.47 11.78 24.00 1.16 7.48 PAGAN N NJ PRANK_BI 0.43 11.72 23.00 1.99 10.54 PRANK N BI PRANK_Cur_BI 0.42 5.12 20.00 2.22 12.13 PRANK Y BI PRANK_Cur_ML 0.54 13.79 28.00 2.82 9.53 PRANK Y ML PRANK_Cur_NJ 0.50 15.04 21.00 1.56 7.38 PRANK Y NJ PRANK_ML 0.48 12.67 24.00 1.69 9.97 PRANK N ML PRANK_NJ 0.43 11.99 23.00 0.64 6.90 PRANK N NJ

136

APPENDIX Appendix 8: Tree matric values obtained for the phylogenies constructed using the16S- 23S ITS locus Tree construction Cherry Tree Tree Alignment Tree strategy Treeness N_bar count Height length method Curation? construction ClustalW_BI 0.54 13.48 23.00 3.98 20.35 ClustalW N BI ClustalW_Cur_BI 0.18 3.38 3.00 0.44 1.60 ClustalW Y BI ClustalW_Cur_ML 0.39 15.10 21.00 0.78 1.15 ClustalW Y ML ClustalW_Cur_NJ 0.34 31.33 7.00 0.62 1.02 ClustalW Y NJ ClustalW_ML 0.55 17.99 23.00 5.38 24.16 ClustalW N ML ClustalW_NJ 0.52 16.33 24.00 2.88 15.67 ClustalW N NJ MAFFT_BI 0.48 10.60 25.00 2.29 18.61 MAFFT N BI MAFFT_Cur_BI 0.44 9.76 36.00 1.62 12.16 MAFFT Y BI MAFFT_Cur_ML 0.33 21.60 19.00 0.38 1.15 MAFFT Y ML MAFFT_Cur_NJ 0.35 25.55 8.00 0.31 0.88 MAFFT Y NJ MAFFT_ML 0.48 14.66 26.00 3.35 23.28 MAFFT N ML MAFFT_NJ 0.41 15.03 22.00 1.46 12.40 MAFFT N NJ MUSCLE_BI 0.40 5.18 29.00 1.49 20.85 MUSCLE N BI MUSCLE_Cur_BI 0.08 2.11 4.00 1.87 8.00 MUSCLE Y BI MUSCLE_Cur_ML 0.29 15.18 24.00 0.31 0.75 MUSCLE Y ML MUSCLE_Cur_NJ 0.32 23.54 8.00 0.31 0.69 MUSCLE Y NJ MUSCLE_ML 0.41 14.32 29.00 2.23 24.22 MUSCLE N ML MUSCLE_NJ 0.38 13.18 23.00 1.04 13.05 MUSCLE N NJ PAGAN_BI 0.49 14.78 31.00 1.39 12.32 PAGAN N BI PAGAN_Cur_BI 0.01 1.99 1.00 0.16 1.46 PAGAN Y BI PAGAN_Cur_ML 0.15 36.40 8.00 0.17 0.33 PAGAN Y ML PAGAN_Cur_NJ 0.14 39.41 3.00 0.20 0.35 PAGAN Y NJ PAGAN_ML 0.48 14.77 30.00 1.39 12.45 PAGAN N ML PAGAN_NJ 0.42 18.45 28.00 1.30 8.51 PAGAN N NJ PRANK_BI 0.39 10.30 27.00 1.26 10.35 PRANK N BI PRANK_Cur_BI 0.02 2.91 1.00 0.37 1.71 PRANK Y BI PRANK_Cur_ML 0.15 22.55 15.00 0.17 0.33 PRANK Y ML PRANK_Cur_NJ 0.14 43.09 2.00 0.20 0.35 PRANK Y NJ PRANK_ML 0.42 10.03 28.00 1.10 11.66 PRANK N ML PRANK_NJ 0.33 14.49 26.00 0.86 6.55 PRANK N NJ

137

APPENDIX Appendix 9: Representative “prototype” 16S rDNA tree

138

APPENDIX Appendix 10: Relative abundance of phyla detected by each primer set Taxonomy 293F_751R 515F_806R 1328F_1668R Unassigned;Other 0.50% 1.19% 0.17% Archaea;p_Euryarchaeota 0.00% 0.04% 0.00% Bacteria;Other 0.00% 0.00% 0.05% Bacteria;p_Acidobacteria 0.00% 0.70% 0.27% Bacteria;p_Actinobacteria 0.12% 10.18% 2.60% Bacteria;p_Armatimonadetes 0.00% 0.73% 0.00% Bacteria;p_Bacteroidetes 0.00% 30.25% 0.00% Bacteria;p_Chlorobi 0.00% 0.14% 0.00% Bacteria;p_Chloroflexi 0.26% 0.24% 0.00% Bacteria;p_Cyanobacteria 50.76% 15.27% 54.37% Bacteria;p_Elusimicrobia 0.00% 0.06% 0.00% Bacteria;p_Firmicutes 0.36% 0.14% 1.99% Bacteria;p_Fusobacteria 0.00% 0.01% 0.01% Bacteria;p_GN02 0.02% 0.00% 0.00% Bacteria;p_Gemmatimonadetes 0.00% 0.45% 0.00% Bacteria;p_Lentisphaerae 0.00% 0.00% 0.00% Bacteria;p_NKB19 0.00% 0.01% 0.00% Bacteria;p_Nitrospirae 0.00% 0.02% 0.00% Bacteria;p_OD1 1.82% 0.56% 0.00% Bacteria;p_OP3 0.00% 0.05% 0.00% Bacteria;p_Planctomycetes 0.00% 1.18% 0.00% Bacteria;p_Proteobacteria 44.48% 33.38% 40.29% Bacteria;p_Spirochaetes 0.00% 0.00% 0.01% Bacteria;p_Synergistetes 0.02% 0.00% 0.00% Bacteria;p_TM7 0.14% 0.04% 0.23% Bacteria;p_Verrucomicrobia 1.52% 5.32% 0.00% Bacteria;p_WPS-2 0.00% 0.04% 0.00%

139

APPENDIX Appendix 11: Relative abundance (%) of cyanobacterial sequences amplified by each primer set used in the study Taxonomy 293F_ 515F_ 1328F_ 751R 806R 1668R Cyanobacteria;Other;Other;Other;Other 0.2% 0.0% 0.0% Cyanobacteria;c_;o_;f_;g_ 13.1% 6.1% 5.3%

Cyanobacteria;c_4C0d-2;o_SM1D11;f_;g_ 0.2% 0.0% 0.0%

Cyanobacteria;c_Chloroplast;o_;f_;g_ 4.9% 4.4% 0.0% Cyanobacteria;c_Chloroplast;o_Chlorophyta;f_;g_ 8.1% 6.3% 0.0% Cyanobacteria;c_Chloroplast;o_Chlorophyta;f_Chlamydomonad 4.6% 0.0% 0.0% aceae;Other Cyanobacteria;c_Chloroplast;o_Chlorophyta;f_Chlamydomonad 0.6% 0.0% 0.0% aceae;g_

Cyanobacteria;c_Chloroplast;o_Cryptophyta;f_;g_ 0.8% 1.0% 1.6% 2/Chloroplast 9) (class; - Cyanobacteria;c_Chloroplast;o_Euglenozoa;f_;g_ 1.8% 4.8% 0.0%

Cyanobacteria;c_Chloroplast;o_Stramenopiles;f_;g_ 11.3% 26.2% 0.0% 4C0d Cyanobacteria;c_Chloroplast;o_Streptophyta;f_;g_ 0.1% 0.0% 0.0% Cyanobacteria;c_Nostocophycideae;o_;f_;g_ 0.1% 0.0% 0.0%

Cyanobacteria;c_Nostocophycideae;o_Nostocales; 5.8% 0.0% 0.4% f_Nostocaceae;Other Cyanobacteria;c_Nostocophycideae;o_Nostocales; 0.0% 0.3% 6.8% f_Nostocaceae;g_ Cyanobacteria;c_Nostocophycideae;o_Nostocales; 0.0% 8.7% 0.0% f_Nostocaceae;g_Cylindrospermopsis Cyanobacteria;c_Nostocophycideae;o_Nostocales; 0.0% 0.3% 0.0% f_Nostocaceae;g_Dolichospermum Nostoc (order;6) Cyanobacteria;c_Nostocophycideae;o_Nostocales; 0.0% 0.0% 1.7%

Nostocophycideae (class); f_Scytonemataceae;g_Scytonema Cyanobacteria;c_Nostocophycideae;o_Stigonematales; 0.7% 0.3% 0.0% f_Rivulariaceae;g_Calothrix Cyanobacteria;c_Oscillatoriophycideae;o_Chroococcales; 0.1% 0.0% 5.0%

f_;g_ Cyanobacteria;c_Oscillatoriophycideae;o_Chroococcales; 0.2% 0.0% 0.3% f_Cyanobacteriaceae;g_Cyanobacterium Cyanobacteria;c_Oscillatoriophycideae;o_Chroococcales; 0.0% 0.0% 0.2% f_Gomphosphaeriaceae;Other Cyanobacteria;c_Oscillatoriophycideae;o_Chroococcales; 0.1% 0.0% 0.0% f_Xenococcaceae;g_ Cyanobacteria;c_Oscillatoriophycideae;o_Oscillatoriales; 1.0% 0.0% 0.0%

f_Phormidiaceae;Other Oscillatoriales(order; 2) Chroococcales(order; 3); Cyanobacteria;c_Oscillatoriophycideae;o_Oscillatoriales; 4.8% 4.8% 11.2%

Oscillatoriophycideae 5); (class; f_Phormidiaceae;g_Planktothrix Cyanobacteria;c_Synechococcophycideae; 0.4% 0.0% 6.3%

o_Pseudanabaenales;f_Pseudanabaenaceae;Other ); Cyanobacteria;c_Synechococcophycideae; 4.3% 0.0% 16.4% o_Pseudanabaenales;f_Pseudanabaenaceae;g_ Cyanobacteria;c_Synechococcophycideae;o_Pseudanabaenales; 6.2% 6.3% 0.0% f_Pseudanabaenaceae;g_Halomicronema Cyanobacteria;c_Synechococcophycideae; 0.1% 0.0% 0.4% o_Pseudanabaenales;f_Pseudanabaenaceae;g_Leptolyngbya Cyanobacteria;c_Synechococcophycideae; 5.9% 6.9% 12.4% o_Pseudanabaenales;f_Pseudanabaenaceae;g_Pseudanabaena Cyanobacteria;c_Synechococcophycideae; 0.0% 0.0% 0.2%

o_Synechococcales;f_Synechococcaceae;g_PaulinellaSynechoccales(order; 2) Pseudanabaenales(order; 5

Cyanobacteria;c_Synechococcophycideae; 24.6% 23.7% 31.9% Synechococcophycideae7); (class; o_Synechococcales;f_Synechococcaceae;g_Synechococcus

140

APPENDIX Appendix 12: Relative abundance (%) of proteobacterial sequences amplified by each primer set used in this study Taxonomy 293F_ 515F_ 1328F_ 751R 806R 1664R Alphaproteobacteria;Other;Other;Other 0.2% 1.1% Alphaproteobacteria;o_;f_;g_ 0.3% 0.4% 0.2% Alphaproteobacteria;o_BD7-3;f_;g_ 0.0% 0.1% 0.0% Alphaproteobacteria;o_Caulobacterales;f_Caulobacteraceae; 0.0% 0.4% Other Alphaproteobacteria;o_Caulobacterales;f_Caulobacteraceae; 0.0% 1.1% 0.0% g_ Alphaproteobacteria;o_Caulobacterales;f_Caulobacteraceae; 0.0% 0.0% 0.0% g_Mycoplana Alphaproteobacteria;o_Caulobacterales;f_Caulobacteraceae; 0.0% 1.2% 4.6% g_Phenylobacterium Alphaproteobacteria;o_Ellin329;f_;g_ 0.0% 0.0% Alphaproteobacteria;o_Rhizobiales;Other;Other 0.0% 0.1% 11.6% Alphaproteobacteria;o_Rhizobiales;f_;g_ 1.6% 0.5% 11.0% Alphaproteobacteria;o_Rhizobiales;f_Hyphomicrobiaceae;g_ 0.0% 2.4% Alphaproteobacteria;o_Rhizobiales;f_Hyphomicrobiaceae; 0.1% 0.9% g_Hyphomicrobium Alphaproteobacteria;o_Rhizobiales;f_Hyphomicrobiaceae; 0.1% 0.0% g_Pedomicrobium Alphaproteobacteria;o_Rhizobiales;f_Hyphomicrobiaceae; 0.0% 0.1% 0.6% g_Rhodoplanes Alphaproteobacteria;o_Rhizobiales;f_Methylocystaceae;g_ 0.0% 0.6%

Alphaproteobacteria;o_Rhizobiales;f_Methylocystaceae;Other 1.5% 0.0%

Alphaproteobacteria;o_Rhizobiales;f_Methylocystaceae; 0.0% 12.0% g_Methylosinus Alphaproteobacteria;o_Rhizobiales;f_Methylocystaceae; 0.8% 0.0% g_Pleomorphomonas Alphaproteobacteria;o_Rhizobiales;f_Phyllobacteriaceae; 0.0% 0.1% 0.0% Other Alphaproteobacteria;o_Rhizobiales;f_Rhizobiaceae;Other 0.1% 0.4% 0.0%

Alphaproteobacteria (45)Alphaproteobacteria Alphaproteobacteria;o_Rhizobiales;f_Rhizobiaceae; 0.0% 0.1% g_Kaistia Alphaproteobacteria;o_Rhizobiales;f_Xanthobacteraceae; 0.2% 0.3% 0.0% g_Labrys Alphaproteobacteria;o_Rhodobacterales;f_Hyphomonadaceae 0.3% 0.0% ;Other Alphaproteobacteria;o_Rhodobacterales;f_Hyphomonadaceae 2.2% 5.1% 0.0% ;g_ Alphaproteobacteria;o_Rhodobacterales;f_Hyphomonadaceae 0.0% 0.1% 0.0% ;g_Hyphomonas Alphaproteobacteria;o_Rhodobacterales;f_Rhodobacteraceae; 3.2% 0.5% 0.3% g_ Alphaproteobacteria;o_Rhodobacterales;f_Rhodobacteraceae; 0.0% 1.4% 0.0% g_Anaerospora Alphaproteobacteria;o_Rhodobacterales;f_Rhodobacteraceae; 1.5% 2.6% 0.0% g_Rhodobacter Alphaproteobacteria;o_Rhodospirillales;f_Acetobacteraceae;g 0.1% 0.0% 0.0% _Roseococcus Alphaproteobacteria;o_Rhodospirillales;f_;g_ 0.0% 0.1% 0.0% Alphaproteobacteria;o_Rhodospirillales;f_Acetobacteraceae;g 0.0% 3.0% 0.0% _ Alphaproteobacteria;o_Rhodospirillales;f_Acetobacteraceae;g 0.0% 0.2% 0.0% _Roseococcus Alphaproteobacteria;o_Rhodospirillales;f_Rhodospirillaceae; 0.0% 0.0% 0.0% Other Alphaproteobacteria;o_Rhodospirillales;f_Rhodospirillaceae;g 4.2% 2.4% 4.4% _

141

APPENDIX

Alphaproteobacteria;o_Rhodospirillales;f_Rhodospirillaceae;g 0.0% 0.0% 0.0% _Phaeospirillum Alphaproteobacteria;o_Rickettsiales;f_;g_ 0.1% 0.1% 0.0% Alphaproteobacteria;o_Rickettsiales;f_Pelagibacteraceae;g_ 0.7% 0.2% 0.0%

Alphaproteobacteria;o_Rickettsiales;f_mitochondria;g_ 0.2% 0.0% 0.0% Alphaproteobacteria;o_Rickettsiales;f_mitochondria;g_Hetero 0.0% 0.0% 0.0% sigma Alphaproteobacteria;o_Rickettsiales;f_mitochondria;g_Thalas 0.1% 0.0% 0.0% siosira Alphaproteobacteria;o_Sphingomonadales;f_;g_ 0.0% 0.0% 0.1% Alphaproteobacteria;o_Sphingomonadales;f_Erythrobacterace 0.0% 0.0% 0.3% ae;Other

Alphaproteobacteria (45)Alphaproteobacteria Alphaproteobacteria;o_Sphingomonadales;f_Erythrobacterace 9.8% 8.5% 20.9% ae;g_ Alphaproteobacteria;o_Sphingomonadales;f_Sphingomonadac 5.4% 4.8% 0.1% eae;g_ Alphaproteobacteria;o_Sphingomonadales;f_Sphingomonadac 0.0% 0.0% 2.4% eae;g_Novosphingobium Betaproteobacteria;Other;Other;Other 0.0% 0.3% 0.0% Betaproteobacteria;o_;f_;g_ 0.4% 0.1% 0.0% Betaproteobacteria;o_Burkholderiales;Other;Other 0.8% 0.0% 0.0% Betaproteobacteria;o_Burkholderiales;f_Alcaligenaceae;g_ 0.0% 2.1% 0.0% Betaproteobacteria;o_Burkholderiales;f_Alcaligenaceae;g_Sut 0.0% 0.1% 0.0% terella Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;Ot 1.7% 0.0% 1.3% her Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 3.1% 9.6% 5.2% Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.1% 0.0% 0.0% Comamonas Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 2.3% 3.2% 0.0% Hydrogenophaga Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 2.5% 0.0% 0.0% Hylemonella Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.3% 3.3% 0.2% Limnohabitans

Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 1.5% 1.3% 0.0% Methylibium Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.9% 0.0% 0.0% Ramlibacter Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.1% 0.0% 0.0% Rhodoferax Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.3% 0.0% 0.0% Rubrivivax

Betaproteobacteria (32) Betaproteobacteria;o_Burkholderiales;f_Comamonadaceae;g_ 0.1% 0.0% 0.0% Simplicispira Betaproteobacteria;o_Burkholderiales;f_Oxalobacteraceae;Ot 0.0% 2.3% 0.0% her Betaproteobacteria;o_Burkholderiales;f_Oxalobacteraceae;g_ 19.7% 5.6% 0.0% Polynucleobacter Betaproteobacteria;o_Ellin6067;f_;g_ 3.4% 2.5% 0.7% Betaproteobacteria;o_MWH-UniP1;f_;g_ 3.3% 1.7% 0.0% Betaproteobacteria;o_Methylophilales;f_Methylophilaceae;g_ 0.2% 0.2% 0.0% Betaproteobacteria;o_Methylophilales;f_Methylophilaceae;g_ 0.0% 1.0% 0.0% Methylotenera Betaproteobacteria;o_Neisseriales;f_Neisseriaceae;Other 0.0% 0.0% 0.0% Betaproteobacteria;o_Neisseriales;f_Neisseriaceae;g_ 1.0% 0.0% 0.0% Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_ 0.7% 6.1% 0.0% Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_Ca 0.8% 0.6% 0.0% ndidatus Accumulibacter Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_De 2.7% 0.4% 0.0% chloromonas

142

APPENDIX

Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_Pro 0.0% 0.0% 0.0% pionivibrio Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_Uli 0.3% 0.0% 0.0%

ginosibacterium ) Betaproteobacteria;o_Rhodocyclales;f_Rhodocyclaceae;g_Zo 0.2% 0.0% 0.0% (32 ogloea Betaproteobacteria;o_SC-I-84;f_;g_ 0.1% 0.0% 0.0%

Betaproteobacteria Betaproteobacteria;o_Thiobacterales;f_;g_ 0.0% 0.0% 0.0% Deltaproteobacteria;o_;f_;g_ 0.0% 0.3% 1.1% Deltaproteobacteria;o_AF420338;f_;g_ 0.0% 0.2% 0.0% Deltaproteobacteria;o_Bdellovibrionales;f_Bacteriovoracacea 0.0% 0.0% 0.6% e;g_ Deltaproteobacteria;o_Bdellovibrionales;f_Bdellovibrionacea 0.0% 0.0% 2.8% e;g_Bdellovibrio Deltaproteobacteria;o_Desulfovibrionales;f_Desulfovibrionac 0.0% 0.0% 0.4% eae;g_ Deltaproteobacteria;o_Desulfovibrionales;f_Desulfovibrionac 0.0% 0.1% 0.0% eae;g_Desulfovibrio Deltaproteobacteria;o_Desulfuromonadales;f_Geobacteraceae 0.0% 0.0% 1.2% ;g_Geobacter Deltaproteobacteria;o_FAC87;f_;g_ 0.0% 0.0% 0.2% Deltaproteobacteria;o_MIZ46;f_;g_ 0.0% 0.2% 1.7% Deltaproteobacteria;o_Myxococcales;f_;g_ 0.0% 0.1% 0.1%

Deltaproteobacteria;o_Myxococcales;f_0319-6G20;g_ 0.0% 0.0% 0.1% Deltaproteobacteria (17) Deltaproteobacteria;o_Myxococcales;f_Myxococcaceae;g_An 0.0% 0.0% 0.0% aeromyxobacter Deltaproteobacteria;o_Myxococcales;f_OM27;g_ 0.0% 0.2% 0.0% Deltaproteobacteria;o_NB1-j;f_JTB38;g_ 0.0% 0.0% 0.0% Deltaproteobacteria;o_Sva0853;f_JTB36;g_ 0.0% 0.0% 0.1% Deltaproteobacteria;o_Syntrophobacterales;f_Syntrophaceae;g 0.0% 0.0% 0.3% _Desulfomonile Deltaproteobacteria;o_Syntrophobacterales;f_Syntrophobacter 0.0% 0.0% 0.2% aceae;g_Syntrophobacter Epsilonprote Epsilonproteobacteria;o_Campylobacterales;f_Campylobacter 0.0% 0.2% 0.0% obacteria (4) aceae;g_Arcobacter Epsilonproteobacteria;o_Campylobacterales;f_Campylobacter 0.0% 0.2% 0.0% aceae;g_Sulfurospirillum Epsilonproteobacteria;o_Campylobacterales;f_Helicobacterac 0.1% 0.9% 0.0% eae;g_Sulfuricurvum Epsilonproteobacteria;o_Campylobacterales;f_Helicobacterac 0.0% 0.2% 0.0% eae;g_Sulfurimonas Gammaproteobacteria;Other;Other;Other 0.0% 0.0% 0.0% Gammaproteobacteria;o_Alteromonadales;f_Alteromonadace 0.0% 0.0% 2.9% ae;Other Gammaproteobacteria;o_Aeromonadales;f_Aeromonadaceae; 2.4% 0.2% 3.2% g_

Gammaproteobacteria;o_Aeromonadales;f_Aeromonadaceae; 0.0% 0.0% 0.0% g_Tolumonas Gammaproteobacteria;o_Alteromonadales;Other;Other 1.4% 0.0% 0.0% Gammaproteobacteria;o_Alteromonadales;f_Alteromonadace 0.5% 0.0% 0.0% ae;Other Gammaproteobacteria;o_Alteromonadales;f_Alteromonadace 0.1% 0.0% 0.0% ae;g_ Gammaproteobacteria;o_Alteromonadales;f_Alteromonadace 0.0% 0.5% 0.0% ae;g_Alteromonas

Gammaproteobacteria (40)Gammaproteobacteria Gammaproteobacteria;o_Alteromonadales;f_Alteromonadace 0.2% 0.3% 0.1% ae;g_Glaciecola Gammaproteobacteria;o_Alteromonadales;f_Idiomarinaceae;g 0.3% 0.1% 0.0% _Pseudidiomarina Gammaproteobacteria;o_Alteromonadales;f_OM60;g_ 0.0% 0.1% 0.0% Gammaproteobacteria;o_Chromatiales;f_;g_ 0.0% 0.2% 0.0%

143

APPENDIX

Gammaproteobacteria;o_Chromatiales;f_Chromatiaceae;Othe 0.0% 0.0% 0.0% r Gammaproteobacteria;o_Chromatiales;f_Chromatiaceae;g_ 0.0% 0.1% 0.0% Gammaproteobacteria;o_Chromatiales;f_Chromatiaceae;g_Al 4.6% 5.9% 0.0% lochromatium Gammaproteobacteria;o_Chromatiales;f_Chromatiaceae;g_Th 0.4% 0.0% 0.0% iodictyon Gammaproteobacteria;o_Chromatiales;f_Halothiobacillaceae; 0.0% 0.1% 0.0% g_Thiovirga Gammaproteobacteria;o_Enterobacteriales;f_Enterobacteriace 0.0% 0.8% 0.0% ae;g_ Gammaproteobacteria;o_HOC36;f_;g_ 0.4% 0.0% 0.0% Gammaproteobacteria;o_Legionellales;f_;g_ 0.0% 5.1% 0.0% Gammaproteobacteria;o_Legionellales;f_Coxiellaceae;g_ 0.2% 0.8% 0.0% Gammaproteobacteria;o_Methylococcales;f_Crenotrichaceae; 0.1% 0.1% 0.0% g_Crenothrix Gammaproteobacteria;o_Methylococcales;f_Methylococcacea 0.4% 0.0% 0.0% e;Other Gammaproteobacteria;o_Methylococcales;f_Methylococcacea 0.0% 0.0% 0.0% e;g_ Gammaproteobacteria;o_Methylococcales;f_Methylococcacea 0.7% 0.8% 1.3%

e;g_Methylocaldum Gammaproteobacteria;o_Methylococcales;f_Methylococcacea 0.0% 0.5% 0.0% e;g_Methylomonas Gammaproteobacteria;o_Oceanospirillales;Other;Other 0.0% 0.0% 0.0% Gammaproteobacteria;o_Oceanospirillales;f_Oceanospirillace 1.2% 3.5% 0.0% ae;g_ Gammaproteobacteria;o_Oceanospirillales;f_Oceanospirillace 0.3% 0.0% 0.0% ae;g_Marinobacterium Gammaproteobacteria;o_Oceanospirillales;f_Oceanospirillace 0.0% 0.1% 0.0%

Gammaproteobacteria (40)Gammaproteobacteria ae;g_Marinomonas Gammaproteobacteria;o_Pseudomonadales;f_Moraxellaceae;g 0.1% 0.2% 0.0% _Acinetobacter Gammaproteobacteria;o_Thiotrichales;f_;g_ 0.1% 0.0% 0.0% Gammaproteobacteria;o_Thiotrichales;f_Piscirickettsiaceae;g 0.0% 0.0% 0.0% _ Gammaproteobacteria;o_Thiotrichales;f_Piscirickettsiaceae;g 0.0% 0.0% 0.0% _Thiomicrospira Gammaproteobacteria;o_Thiotrichales;f_Thiotrichaceae;g_ 0.0% 0.0% 0.1% Gammaproteobacteria;o_Vibrionales;f_Pseudoalteromonadac 0.1% 0.1% 0.0% eae;g_Pseudoalteromonas Gammaproteobacteria;o_Xanthomonadales;f_Sinobacteraceae 0.1% 0.1% 0.2% ;g_ Gammaproteobacteria;o_Xanthomonadales;f_Xanthomonadac 6.7% 4.1% 0.0% eae;g_ Gammaproteobacteria;o_Xanthomonadales;f_Xanthomonadac 0.0% 0.2% 0.0% eae;g_Dokdonella Gammaproteobacteria;o_Xanthomonadales;f_Xanthomonadac 0.0% 0.1% 0.0% eae;g_Pseudoxanthomonas TA18;o_CV90;f_;g_ 0.0% 0.0% 1.2% TA18;o_PHOS-HD29;f_;g_ 0.0% 0.0% 0.7% ;o_;f_;g_ 0.0% 0.0%

144