UNDERSTANDING THE GENOMIC BASIS OF STRESS ADAPTATION IN
PICOCHLORUM GREEN ALGAE
By
FATIMA FOFLONKER
A dissertation submitted to the
School of Graduate Studies
Rutgers, The State University of New Jersey
In partial fulfillment of the requirements
For the degree of
Doctor of Philosophy
Graduate Program in Microbial Biology
Written under the direction of
Debashish Bhattacharya
And approved by
______
______
______
______
New Brunswick, New Jersey
January 2018
ABSTRACT OF THE DISSERTATION
Understanding the Genomic Basis of Stress Adaptation in Picochlorum Green Algae
by FATIMA FOFLONKER
Dissertation Director:
Debashish Bhattacharya
Gaining a better understanding of adaptive evolution has become increasingly important for predicting the responses of key primary producers to climate change-driven environmental fluctuations. In my doctoral research, the genomes of four taxa of a naturally robust green algal lineage,
Picochlorum (Chlorophyta, Trebouxiophyceae), were sequenced to allow a comparative genomic and transcriptomic analysis. The overarching goal of this work was to investigate environmental adaptations and the origin of halotolerance. Found in environments ranging from brackish estuaries to hypersaline terrestrial environments, this lineage tolerates a wide range of fluctuating salinities, light intensities, and temperatures, and has a robust photosystem II. The small, reduced diploid genomes (13.4-15.1 Mbp) of Picochlorum, indicative of genome specialization to extreme environments, have resulted in an interesting genomic organization, including the clustering of genes in the same biochemical pathway and of coregulated genes. Coregulation of co-localized genes in “gene neighborhoods” is more prominent soon after exposure to salinity shock, suggesting a role in the rapid response to salinity stress in Picochlorum. Despite the pressure for genome reduction, key gene gains are seen through gene family expansion of an important
SOS1 salt transporter and through bacterium-derived horizontal gene transfer
(HGT). Thirteen instances of HGT were identified that display differential acquisition among Picochlorum taxa, indicating an ongoing process in this lineage. The presence of introns, differential expression under salinity shock, and the use of high-quality genomes from closely related species provide robust support for the integration of
HGT candidates into host nuclear genomes. Transferred genes are potentially functionally relevant and encode proteins with roles related to osmolyte production, cell wall metabolism, and metabolic flexibility. A transcriptomic comparison of two sister taxa with very similar genomes, Picochlorum SENEW3 from a brackish lagoon and Picochlorum oklahomensis from a hypersaline salt plain environment, was performed under high (1.5 M NaCl) and low (10 mM NaCl) salinity shock conditions. This work revealed different regulatory responses to salinity shock in terms of osmolyte production, reflecting nitrogen availability in the respective environments and indicating that the habitat-driven regulation of the existing gene inventory is key to environmental adaptation. These diploid sister taxa also differ strikingly in one respect: their levels of haplotype heterozygosity.
RNA-seq expression data support condition-dependent allele-specific gene expression, indicating a functional relevance to maintaining a large divergent allele pool in P. oklahomensis. Overall, Picochlorum has revealed differences in adaptation strategies between species that appear nearly identical in morphology and gene sequence. My study has provided insights into the adaptive strategies used by eukaryotes with reduced gene inventories, which are reflected in selection acting on genome organization, gene regulation, and specialization.
ACKNOWLEDGMENTS
I would like to thank my advisor, Dr. Debashish Bhattacharya, for his guidance, patience, and support; my committee members Dr. Kay Bidle, Dr. Jeff Boyd, and Dr. Lena Struwe; the Microbial Biology graduate program director, Dr. Gerben Zylstra; research collaborators, Dr. G. C. Dismukes, Dr. Gennady Ananyev, Dr. Hwan Su Yoon, and Dr.
Mehdi Javinmard; the members of the Bhattacharya lab for their advice and help; my mom and dad for supporting me and my education; and my family and friends for their support. I would also like to thank Dr. H. Boyd Woodruff for the fellowship support for the first year in the Microbial Biology graduate program, The Phycological Society of
America for research support, and the NSF-IGERT (Integrative Graduate Education and
Research Traineeship) for renewable fuels at Rutgers University (0903675) for fellowship support.
DEDICATION
To the memory of my late husband, Mahmood.
ACKNOWLEDGEMENT OF PUBLICATIONS
Chapter 2 has been published as Foflonker F, Price DC, Qiu H, Palenik B, Wang S &
Bhattacharya D (2015) Genome of the halotolerant green alga Picochlorum sp. reveals strategies for thriving under fluctuating environmental conditions. Environ Microbiol 17:
412-426. F. Foflonker participated in writing the manuscript and is directly responsible for all genomic analyses and figures 2-7 and all tables.
Chapter 3 has been published as Foflonker F, Ananyev G, Qiu H, Morrison A, Palenik B,
Dismukes GC & Bhattacharya D (2016) The unexpected extremophile: Tolerance to fluctuating salinity in the green alga Picochlorum. Algal Research 16: 465-472. F.
Foflonker participated in writing the manuscript and is directly responsible for all analyses, tables, and figures.
Chapter 4 is being prepared for publication as Foflonker F, Mollegard D, Ong M, Yoon
HS, & Bhattacharya D (2018). Genomic analysis of Picochlorum species reveals how microalgae adapt to fluctuating environments. F. Foflonker participated in writing the manuscript and is directly responsible for all analyses, tables, and figures.
TABLE OF CONTENTS
Abstract of the Dissertation ...... ii
Acknowledgments ...... iv
Dedication ...... v
Acknowledgement of Publications ...... vi
Table of Contents ...... vii
List of Tables ...... xi
List of Figures ...... xiv
Chapter 1: Introduction ...... 1
Microalgae as biofuel feedstock ...... 1
Salinity stress on eukaryotic microalgae ...... 3
Picochlorum as a biofuel candidate and model to study salinity stress ...... 7
Scope of the thesis ...... 9
Chapter 2: Genome of the halotolerant green alga Picochlorum sp. reveals strategies for
thriving under fluctuating environmental conditions ...... 11
Abstract ...... 11
Introduction ...... 12
Results ...... 13
Genome features and phylogeny ...... 13
Clusters of functionally related genes ...... 14
Transporter analysis ...... 17
Growth rates in the presence of organic carbon sources ...... 20
HGT analysis ...... 21
Selenoproteins ...... 24
Hydrogenase activity and other genes of interest ...... 25
Discussion ...... 25
Experimental Procedures ...... 27
Strains and culture conditions ...... 27
DNA and RNA extraction and library construction ...... 28
Genome and transcriptome sequencing ...... 28
Construction of multi-protein tree ...... 29
Phylogenomic analysis ...... 30
Transporter analysis ...... 31
Functionally clustered pathways ...... 31
Acknowledgements ...... 32
Chapter 3: Elucidating salinity shock response mechanisms in Picochlorum ...... 45
Abstract ...... 45
Introduction ...... 46
Materials and Methods ...... 56
Salinity shock experimental conditions ...... 56
Transcriptome sequencing ...... 57
Transcriptome analysis ...... 57
Co-localization analysis ...... 58
Photosynthetic measurements ...... 59
Results and Discussion ...... 48
High and low salinity stress elicits separate metabolic responses ...... 48
Co-localization of co-expressed genes in response to salt shock ...... 50
High photorespiration influences carbon and nitrogen flux at high salinity ...... 52
Starch and osmolytes ...... 54
Response of the photosynthetic machinery ...... 54
Conclusions ...... 61
Acknowledgements ...... 62
Supporting Information ...... 67
Cell wall remodeling prevalent at both high and low salinity stress ...... 68
Membrane remodeling key to low salinity shock response ...... 68
Other responses ...... 69
Chapter 4: Characterization of multiple Picochlorum genomes to elucidate the origin of
salt tolerance...... 88
Abstract ...... 88
Introduction ...... 89
Results ...... 92
Physiology ...... 92
Genome features and assembly ...... 93
Phylogeny and genome synteny ...... 95
Horizontal gene transfer ...... 96
Gene gain/loss ...... 98
Picochlorum SENEW3 and P. oklahomensis transcriptome comparison in
response to salinity shock ...... 100
Allele-specific expression ...... 103
Discussion ...... 104
Materials and Methods ...... 108
Strain information and growth rates ...... 108
Genome sequencing ...... 108
Genome Assembly, Gene prediction, and annotation ...... 109
Genome synteny ...... 110
Construction of multi-protein tree ...... 111
Genome comparison ...... 112
Phylogenomic methods ...... 112
RNA-seq ...... 113
Allele-specific expression ...... 114
Acknowledgments ...... 114
Conclusion ...... 139
References ...... 142
LIST OF TABLES
Table 2.1 Clustered genes in shared pathways in Picochlorum SENEW3 ...... 41
Table 2.2 Instances of HGT that were identified in the Picochlorum SENEW3 genome and their putative gene functions. Putative gene annotations, results of the BLAST analysis, phylogenetic domain of gene origin, putative gene function, putative prokaryotic donor, and the number of EST reads that mapped to the genes are shown...... 42
Table S2.1 List of predicted proteins in Picochlorum SENEW3 showing their putative annotations and results of a BLASTP search against a comprehensive in-house database...... 43
Table S2.2 Ostreococcus tauri nitrate assimilation gene clusters and the corresponding Picochlorum SENEW3 genes and their contig locations. Genes located on the same contig are shown in boldface. The genes are ordered according to their physical location in the Ostreococcus tauri genome. We note that the Ostreococcus
Maf4 gene located in the Cnx2-Maf4-Cnx5 cluster described in the original paper is absent from the gene cluster in the latest genome assembly (Ostreococcus tauri v2.0 from JGI)...... 44
Table 3.S1. Number of RNA-seq reads in each experiment...... 78
Table 3.S2. Co-localized genes in clusters in various gene sets. The highlighted cells represent statistically significant results...... 78
Table 3.S3. Examples of co-localized clusters allowing for two intervening genes that do not follow the expression pattern. Highlighted in yellow are conditions under which clusters meet co-localization criteria. Blue indicates genes that are part of a
cluster, red denotes intervening genes. Also included are orthologs in Chlorella vulgaris and Coccomyxa subellipsoidea. Full clustering information available in Excel file Table 3.S8...... 79
Table 3.S4. Gene expression and predicted targeting for genes in pathways involved in salt stress. Accompanying table for Figure 3.3. E.C. number, enzyme commission number; TC number, transporter classification number; NDE, not differentially expressed; C, chloroplast; M, mitochondria; S, signal peptide...... 82
Table S5. Differentially expressed genes of bacterial origin. NDE, not differentially expressed; L2fc, log2 fold change...... 86
Table 4.1. Sequencing and assembly statistics of Picochlorum species ...... 116
Table 4.S1. Summary statistics for variant detection in Picochlorum SENEW3 and P. oklahomensis primary assemblies...... 118
Table 4.S2. Genome Assembly Completeness compared to BUSCO core Eukaryota
...... 120
Table 4.S3. Collinearity between Picochlorum assemblies (# collinear homolog pairs/ # homolog pairs). Maximum gaps allowed = 5...... 120
Table 4.S4. Gene duplications categorized into types in the following priority order: segmental duplication (whole genome duplications) (min genes per block 5, maximum gaps 25), tandem, proximal (< 20 intervening genes), dispersed, and singletons...... 121
Table 4.S6. Gene expression of HGT-derived genes in P. oklahomensis and
Picochlorum SENEW3. Differentially expressed genes are highlighted in green...... 131
Table 4.S7. Organelle genome statistics...... 132
Table 4.S8. Predicted genes in plastid genomes. Present (+); absent (-)...... 132
Table 4.S9. Predicted genes in mitochondrial genomes. Present (+); absent (-); found in nuclear genome (N)...... 135
Table 4.S10. Allele-specific gene expression in P. oklahomensis. Primary and haplotig columns represent percentage of gene pairs with > 90% of reads mapping to one of the two alleles on either the primary or haplotig contigs. Biallelic expression defined as between 40 and 60% of reads mapping to both alleles...... 138
LIST OF FIGURES
Figure 1.1 UV images of Picochlorum SENEW3 stained with BODIPY. Green fluorescence indicates neutral lipid ...... 8
Figure 2.1 Phylogenetic analysis of Picochlorum SENEW3. Multi-gene maximum likelihood tree of ten green algae inferred from an alignment of 480,102 amino acids.. . 32
Figure 2.3 Analysis of metabolite transporters in Picochlorum SENEW3 showing frequency of transporters per transporter family in Picochlorum SENEW3 and O. tauri...... 34
Figure 2.4 Putative distribution and functions of metabolite transporters in the Picochlorum SENEW3 cell showing transporters involved in the salt stress response...... 35
Figure 2.5 Mixotrophic growth of Picochlorum SENEW3. (A) Growth under salt stress and (B) growth under 1.5 M NaCl salt stress with the addition of different amounts of glucose...... 37
Figure 2.7. Putative functions in Picochlorum SENEW3 conferred by HGT...... 40
Figure S2.1. Mixotrophic growth of Picochlorum SENEW3 cultures in the absence of high salt stress (0.4 M NaCl) with the addition of different amounts of glucose...... 43
Figure 3.1. (a) Average chlorophyll variable fluorescence yield (Fv/Fm) and (b) Growth rate of the algal cultures acclimated to 1M NaCl media...... 63
Figure 3.2. (a) Examples of co-expressed and co-localized gene clusters. (b) Number of genes co-localized versus total genes in gene set at the 1.0 and 1.5 L2fc cutoffs ...... 64
Figure 3.3. Summary of the salt shock response in Picochlorum SENEW3 at 1h under (a) high salinity and (b) low salinity conditions...... 66
Figure 3.4. Photoinhibition under 1500 µE m−2 s−1 high light conditions in the presence and absence of the chloroplast protein synthesis inhibitor lincomycin (LIN). Cells adapted to 1 M NaCl media were incubated in media at various salinities...... 67
Figure 3.S1. Venn diagram showing the number of genes that are DE (shared and unique) when comparing high salinity and low salinity at 1h and 5h time points...... 70
Figure 3.S2. Gene expression patterns over the time course (data centered) at (A) high salinity and (B) low salinity...... 72
Figure 3.S3. KEGG metabolic maps comparing the low and high salinity stress response at 1h; revealing little overlap in gene responses (blue). Green: low salinity, red: high salinity, blue: both. (A) Background expression showing all expressed genes (DE and not DE). (B) Up-regulated (C) Down-regulated...... 74
Figure 3.S4. Example of randomization analysis simulating gene clustering of same-sized data (N=1000)...... 74
Figure 3.S5. KEGG pathway analysis of genes involved in the TCA cycle under (A) high salinity and (B) low salinity ...... 76
Figure 3.S6. KEGG pathway analysis of genes involved in protein processing in the endoplasmic reticulum under (A) high salinity and (B) low salinity...... 77
Figure 3.S7. Effect of salinity on quality factor (QF) over 24 hours for cells initially grown in 1 M NaCl incubated in media at various salinities. QF describes the Kok fitting parameters, alpha (misses) and beta (double hits), of Fv/Fm data from figure 1...... 78
Figure 4.1. (A) Acclimated growth rates of Picochlorum species in media with varying salinity (10mM – 1.2 M). (B) Growth rates of P. oklahomensis and Picochlorum SENEW3 acclimated to 1M NaCl and shocked with 10mM and 1.5M NaCl...... 115
Figure 4.2. Phylogeny of Picochlorum and other sequenced chlorophytes. Multi-protein tree constructed from an alignment of 1122 proteins (295,805 characters). Overall gene family gains plus losses noted on the branches...... 117
Figure 4.3. Acquisition of HGT-derived genes in Picochlorum...... 118
Figure 4.S1. Distribution of variant frequencies in (A) P. oklahomensis and (B) Picochlorum SENEW3 primary assemblies...... 119
Figure 4.S2. Synteny between Picochlorum SENEW3 (right) and other species (left). 123
Figure 4.S3. (A) IQ-TREE of HGT candidate peptidase S9. (B) Transcriptome evidence for intron in the gene in Picochlorum SENEW3 under control (1M NaCl) conditions. (C) Collinearity of this candidate with Picochlorum SENEW3 as the reference chromosome. (D) Collinearity of this candidate with P. soleocismus as the reference chromosome. .. 126
Figure 4.S4. (A) IQ-TREE of HGT candidate GDP-mannose 4,6-dehydratase gene. (B) Transcriptome evidence for the gene in P. oklahomensis under control (1M NaCl) conditions. (C) Collinearity of this candidate with Picochlorum SENEW3 as the reference chromosome. (D) Collinearity of this candidate with P. soleocismus as the reference chromosome...... 128
Figure 4.S5. (A) IQ-TREE of HGT candidate indolepyruvate decarboxylase. (B) Transcriptome evidence for the gene in Picochlorum SENEW3 under control (1M NaCl) conditions. (C) Collinearity of this candidate with Picochlorum SENEW3 as the reference chromosome. (D) Collinearity of this candidate with P. soleocismus as the reference chromosome...... 130
Figure 4.S6. Venn diagram of differentially expressed genes in P. oklahomensis under the four conditions tested...... 136
Figure 4.S7. Gene expression comparison between P. oklahomensis and Picochlorum SENEW3. (A) 1.5M NaCl 1h, (B) 10mM NaCl 1h, (C) 1.5M NaCl 5h, (D) 10mM NaCl 5h...... 137
Figure 4.S8. Number of (A) monoallelic (118 pairs total) and (B) biallelically (200 pairs total) expressed allele pairs shared under various salinity treatment conditions...... 138
Figure 4.S9. Number of allele pairs showing high change in ratio of primary: haplotig allele expression under various salinity treatment conditions...... 139
Chapter 1: Introduction
Microalgae as biofuel feedstock
In the quest for renewable and sustainable energy, algae have emerged as promising sources of biofuel. Marine or halotolerant algae are particularly desirable as biofuel feedstock because they can be grown on non-arable land with non-potable brackish water or saltwater. Unlike land-based biofuel crops, algae reduce competition with food crops for land and drinking water (1). Microalgae are favored because of their fast growth rates, large biomass production, and desirable lipid profiles. Finally, algal biomass can be processed into liquid biofuels that can be incorporated directly into existing infrastructure and pipelines (2).
There are several sought-after characteristics that, if combined, would create the ideal microalga for biofuel production. First, the strain should have high biomass productivity and high lipid content. Constitutive lipid production would be best, because the nutrient deprivation technique typically used to induce oil body formation diverts energy from growth (3). The strain should be robust enough to withstand the shear stresses caused by mixing, and the temperature, salinity, and light fluctuations associated with traditional open pond cultivation systems. It should also be resistant to contamination and infections that commonly result in resource competition or decimation of entire pond cultures (4). Considering the effects of high light intensity on a cell, including photoinhibition and high mutation rates, and the fact that open pond culturing facilities have typically been built in desert environments, the ability to withstand high light intensities would also be desirable in a potential biofuel feedstock strain (5). Strains
with reduced sensitivity of Rubisco to high oxygen concentrations are desirable, because oxygen competes with carbon dioxide as a substrate for Rubisco, limiting yield (6). Cells that autoflocculate, cells that are larger or heavier, or cells that have thin membranes would reduce harvest and extraction costs. Moreover, the ideal strain would excrete the desired lipids, avoiding the harvest process altogether (3). Finally, the ability to produce a high-value co-product in conjunction with biofuels, a low-value product, would make the process more economically feasible (7).
The use of halophiles for algal biomass production is a strategic choice: culturing halophiles reduces freshwater usage, and salinity can be used as a crop protection mechanism. When water is lost from an open pond system through evaporation, it can be replaced either by freshwater, thereby maintaining constant salinity, or by seawater, resulting in a gradual increase in the overall salinity of the pond (8). Ideally, saltwater, saline aquifer water, or nutrient-rich wastewater would be the more environmentally sustainable choice for large-scale algal biomass production, requiring a halophilic feedstock organism; however, a substantial amount of freshwater is still required to overcome the effects of evaporation (1). In addition to evaporation, open pond systems are also susceptible to contamination. The use of selective conditions, such as high pH in the case of Spirulina platensis or high salinity for Dunaliella salina, has been successful in reducing contamination and facilitating growth of unialgal cultures (4). Von Alvensleben et al. showed that a salinity of 36‰, about the salinity of seawater, was capable of slowing the establishment of contamination of a Picochlorum atomus culture by the freshwater cyanobacterial contaminant Pseudanabaena limnetica (9). Contamination by
non-target algae or cyanobacteria can also lead to resource competition or the release of potentially toxic allelochemicals into the culture and environment (9).
Salinity stress on eukaryotic microalgae
Microalgae are ubiquitous in the marine and freshwater environment, and as such, are exposed to a variety of salinity stresses, often coupled with osmotic and desiccation stress: river and tidal influx into estuaries, precipitation and evaporation in ponds or terrestrial environments, diel low and high tide exposure to salinity and desiccation stress in intertidal zones, wind-driven salinity fluctuations near the coast, and salinity changes in brine pockets during sea ice freezing and thawing cycles (10).
The majority of the algal salinity response literature focuses on the cell wall-less halotolerant green alga Dunaliella. In Dunaliella, the immediate responses to salt stress are biophysical responses independent of gene expression. These include immediate loss of water and shrinking of the cell under hyperosmotic conditions, followed by the passive influx of external ions and reuptake of water. However, the now-internal excess of ions inhibits many cellular functions, including photosynthesis and translation (11). The short-term response involves the synthesis or expulsion of osmolytes under hyper- and hyposaline conditions, respectively. Osmolytes are small, low molecular weight organic compounds involved in osmoacclimation that do not disrupt cellular processes. This response occurs within 2-3 hours in Dunaliella. The acclimated response, which involves the accumulation of stress-induced proteins, starts around 12 hours after initial salt exposure in
Dunaliella (12).
The cell membrane is an integral component of halotolerance because it acts as a barrier to solutes entering or exiting the cell and may be involved in osmotic sensing.
Salinity affects the plasma membrane of cells in many ways: hyperosmotic stress leads to increased cell rigidity and shrinking, while hypoosmotic stress leads to an increase in cell volume and membrane fluidity (13). In the cell wall-less Dunaliella, the ability to adjust fatty acid composition and organization, thereby maintaining membrane fluidity, is also an important factor in salt tolerance. These molecular responses to hypersalinity stress include the induction of a fatty acid elongase, which is involved in fatty acid elongation and ultimately leads to desaturation in order to maintain membrane fluidity
(14), changes in membrane sterols that may also be involved in osmotic signaling (15), changes in membrane lipid order to maintain elasticity (16), and membrane reservoirs that allow the expansion and contraction of the cell membrane without apoptosis (17).
While not much is known about salinity sensing in microalgae, membrane mechanosensitive ion channels or cytoskeletal elements play a role in sensing changes to turgor pressure in higher plants (18). The retraction of the cell membrane from the cell wall is one way to maintain membrane integrity and is noted in the aeroterrestrial green alga
Zygnema under osmotic stress as well as Asterochloris erici under desiccation stress (19,
20). It has been suggested that cell walls may play a role in osmotic stress tolerance by forming a rigid layer of protection to resist water loss, or that cell wall elasticity is an important strategy in maintaining cell integrity in streptophytes and higher plants (21,
22).
Microalgae apply a ‘salt-out strategy’, which results in the exclusion of salt from the cytoplasm via transporter enhancement in the membrane, in contrast to halophilic
prokaryotes that use a ‘salt-in strategy’ in which they accumulate KCl to maintain ionic and turgor pressure (23). Salt Overly Sensitive (SOS) transporters, which are involved in sodium extrusion in Arabidopsis, are also involved in the salinity response in
Picochlorum. Enhanced transporters include carbonic anhydrases in Dunaliella and
Picochlorum and an iron transporter in Dunaliella potentially involved in overcoming carbon dioxide and iron limitations under high salinity (24-26). In Dunaliella, nitrate transporters are upregulated under high salinity and are coupled to the sodium rather than proton gradient (27). Dunaliella and Porphyra purpurea may also sequester salt in vacuoles, similar to higher plants (10).
Microalgae accumulate a variety of compatible osmolytes, often several in the same organism. Osmolytes include glycerol in Dunaliella and Phaeodactylum, and betaine or proline in Picochlorum and Fragilariopsis cylindrus (28, 29).
Dimethylsulphoniopropionate (DMSP) uptake and rapid expulsion into the environment in Cylindrotheca closterium or Phaeodactylum under salinity shock suggest that it can also serve in osmoacclimation (29, 30). Osmolytes are formed from glucose generated either by photosynthesis or by starch degradation (31). Starch degradation occurs to free up resources for the carbon pool. Carbon flux is redirected to the production of compatible osmolytes, such as glycerol in Dunaliella and proline in Picochlorum, and energy is used for transporters involved in ion exclusion. Salinity tolerance is an energy-intensive process in which energy is diverted from growth to maintaining homeostasis; therefore, pathways such as carbohydrate metabolism are differentially expressed (26, 32).
Salinity stress is associated with the generation of reactive oxygen species (ROS); therefore, antioxidant responses are increased in Dunaliella, including ascorbate and
glutathione peroxidases, and alpha-tocopherol (33). Antioxidant upregulation is also correlated with salinity stress in Chlamydomonas (34).
Some algae, such as Chlamydomonas and Dunaliella, lose their flagella or flagellar activity under stress, including salinity stress, and form aggregates of cells surrounded by an exopolysaccharide matrix that reduces ion influx, called palmelloids or palmella. These structures dissociate upon removal of the stress, accompanied by recovery of flagella (35,
36). Extracellular polysaccharide substances (EPS) were found to protect the photosynthetic apparatus and resulted in enhanced viability and maintenance of Fv/Fm under hypersaline conditions in Cylindrotheca closterium (37).
Salinity may inhibit photosynthesis and the repair of photosystem II after photoinhibition (38). Photosynthesis is reduced while photorespiration is increased under salt stress. Energy is also used under salinity stress to repair photosystem II in order to resume photosynthesis. Kim and co-authors showed that many photosynthesis-related proteins were downregulated under both high and low salinity stress in Dunaliella, and that photosynthetic efficiency (Fv/Fm) was reduced under low salinity stress (39).
Reduced Fv/Fm was also noted in Fragilariopsis cylindrus (28). Light stress in combination with salinity stress results in enhanced photoinhibition in Chlamydomonas and sea ice algae (38, 40). Carotenoids such as beta-carotene in Dunaliella or lutein in
Botryococcus braunii accumulate under salinity stress and may be involved in photoprotection; carotenoid accumulation was also seen in Picochlorum (41, 42).
Gene and protein expression changes in microalgae are similar to the response in higher plants. Protein folding and chaperones are also commonly expressed in response to protein denaturation that can occur at high salt concentrations. CO2 availability
decreases with increased salinity, correlating with upregulation of carbonic anhydrases as a counter-response to convert bicarbonate into CO2 (25, 26, 43). Long-term effects of salinity generally include a reduced growth rate in favor of maintaining cell homeostasis, and increased respiration.
Picochlorum as a biofuel candidate and model to study salinity stress
Bioprospecting for unique, robust, and highly efficient strains is the first step towards developing a suitable biofuel candidate through downstream genetic manipulation or adaptation for increased yield. The species we have chosen to investigate as a potential biofuel candidate, Picochlorum sp. strain SENEW3 (SENEW3), was selected because it is highly robust in the face of environmental fluctuations, specifically salt stress. Picochlorum SENEW3 is a small coccoid green alga
(Trebouxiophyceae, Chlorophyta) that is about 2-3 µm in diameter. It was isolated from a small permanent pond in the San Elijo Lagoon system, one of the largest coastal wetlands in San Diego, California. The lagoon is a shallow-water estuary, and therefore a brackish environment, and is subject to large fluctuations in salinity through evaporation, precipitation, and tidal influx of seawater. Salinities range from 108.3‰ in the dry season to freshwater levels (1.7‰) in the rainy winter season. Nutrient levels (phosphate, nitrate, nitrite, ammonium) vary substantially as well. As is commonplace in brackish environments, this pond exhibits low species diversity. The plankton is typically dominated throughout the year by three major species: the green alga Picocystis sp., the diatom Chaetoceros sp., and Picochlorum SENEW3. Of the three, Picochlorum
SENEW3 was found to have the broadest range of salinity tolerance and was present
year-round in the pond, whereas Picocystis had the highest salinity tolerance.
Picochlorum SENEW3 grows above 16 °C and has a reduced growth rate above 32 °C.
Pigment analysis indicates it contains the carotenoids violaxanthin and zeaxanthin.
Picochlorum SENEW3 also shows significant lipid body accumulation under nitrogen limitation (Figure 1.1) (44).
Figure 1.1 UV images of Picochlorum SENEW3 stained with BODIPY. Green fluorescence indicates neutral lipid (44).
Studies have assessed the potential of various Picochlorum species as candidates for biofuel production. These studies report doubling times of 36-48 hours and maximum biomass concentrations of 1.8-2.1 g/L (45, 46). Zhu et al. showed that
Picochlorum oklahomensis can be easily harvested with common methods of flocculation such as pH adjustment and chitosan addition (46). Most studies reported lipid content around 20-25% (10% fatty acids), however lipid profiles differ vastly between species of the genus, and between Picochlorum and other microalgae (45-47). Picochlorum species also produce several carotenoids that can be used to produce high value co-products, such as eye vitamin supplements and other dietary supplements (44-46). Additionally,
Picochlorum soloecismus has been found to be amenable to genetic manipulation in order to increase lipid production (48).
The Picochlorum lineage has evolved halotolerance from a freshwater ancestral state (49). Halotolerance within the genus varies, with Picochlorum SENEW3 and P. oklahomensis capable of tolerating the hypersaline conditions of a brackish lagoon and a salt plains environment, respectively. Picochlorum species also have highly reduced, specialized genomes, making this lineage a good model to address evolutionary questions about a complex trait like halotolerance.
Scope of the thesis
Using Picochlorum SENEW3 as a model, my thesis aims to explore two basic questions of broad ecological importance: How do microalgae adapt to environmental stresses such as salinity stress? How does habitat-driven adaptation shape small genomes like that of Picochlorum? Addressing these questions from a genomics perspective will provide a better understanding of the molecular mechanisms of salinity and stress tolerance.
Studying the evolution of halotolerance is fundamentally interesting because it is a complex response that employs a variety of mechanisms to adapt to salt stress. Chapter 2 involves the sequencing and characterization of a new green algal isolate, Picochlorum SENEW3, generating hypotheses on salinity tolerance mechanisms and metabolic flexibility. Chapter 3 establishes a model for the salinity shock response in Picochlorum SENEW3 through transcriptome analysis, suggests that genome organization is important for a rapid coordinated response to salinity stress, and explores the robust nature of its photosystem II under salinity and high light stress. Finally, Chapter 4 identifies mechanisms of environmental adaptation and the evolution of salinity tolerance through a comparative genomics analysis of several Picochlorum species. Using robust genome assemblies from multiple microalgae, this chapter identifies the role and extent of horizontal gene transfer (HGT) in this lineage, along with evolutionary strategies including coordinated gene expression and gene family gain, loss, and expansion.
Chapter 2: Genome of the halotolerant green alga Picochlorum sp. reveals strategies for thriving under fluctuating environmental conditions
Fatima Foflonker1, Dana C. Price2, Huan Qiu3, Brian Palenik4, Shuyi Wang4 and
Debashish Bhattacharya1
1Departments of Biochemistry and Microbiology, 2Plant Biology, 3Ecology, Evolution and Natural Resources, Rutgers University, New Brunswick, NJ 08901, USA. 4Scripps
Institution of Oceanography, University of California, San Diego, La Jolla, CA 92093,
USA.
Abstract
An expected outcome of climate change is intensification of the global water cycle, which magnifies surface water fluxes and consequently alters salinity patterns. It is therefore important to understand the adaptations that allow microalgae to survive changing salinities, and the limits of those adaptations. To this end, we sequenced the 13.5 Mbp genome of the halotolerant green alga Picochlorum SENEW3 (SENEW3) that was isolated from a brackish water pond subject to large seasonal salinity fluctuations. The Picochlorum SENEW3 genome encodes 7,367 genes, making it one of the smallest and most gene-dense eukaryotic genomes known.
Comparison with the pico-prasinophyte Ostreococcus tauri, a species with a limited range of salt tolerance, reveals the enrichment of transporters putatively involved in the salt stress response in Picochlorum SENEW3. Analysis of cultures and the protein complement highlights the metabolic flexibility of Picochlorum SENEW3, which encodes genes involved in urea metabolism, acetate assimilation and fermentation, acetoin production, and glucose uptake, many of which form functional gene clusters. Twenty-four cases of horizontal gene transfer from bacterial sources were found in Picochlorum SENEW3, with these genes involved in stress adaptation, including osmolyte production and growth promotion. Our results identify Picochlorum SENEW3 as a model for understanding microalgal adaptation to stressful, fluctuating environments.
Introduction
Climate change is expected to intensify the water cycle at a rate of 7% per degree Kelvin of warming (50). Increased evaporation and precipitation, caused by warmer waters and the ability of warmer air to retain more moisture, are the major driving forces in this cycle (51). The predicted magnification of surface water fluxes from evaporation and precipitation correlates closely with changing salinity patterns (52). Salt concentration in water also affects its density and thereby the vertical mixing patterns of water (53). In addition to the challenge of adapting to salinity variation, phytoplankton communities will also face differences in nutrient and light availability due to changes in turbulence (54).
Picochlorum sp. strain SENEW3 (here SENEW3) is potentially a highly useful model to understand the effects of salinity stress on microalgae because of its wide range of salt tolerance. Picochlorum SENEW3 is a tiny, coccoid (i.e., non-motile) green alga
(Trebouxiophyceae, Chlorophyta) that is 2-3 µm in cell diameter. It was isolated from a small permanent pond in the San Elijo Lagoon system in San Diego County, California.
The pond is subject to large seasonal fluctuations in salinity (1.7-108.3‰ [i.e., parts per thousand]) via evaporation, precipitation, and tidal influx of seawater. Laboratory studies have confirmed the wide salt tolerance range of Picochlorum SENEW3, which extends from at least 3.5 to 105‰ (55). In comparison, other Picochlorum species grow maximally to ca. 75‰ salinity, whereas species from a freshwater sister clade to Picochlorum grow in salinities up to 25‰ (49). Picochlorum SENEW3 tolerates temperatures above
16°C but exhibits a reduced growth rate above 32°C. Carotenoid production and significant lipid body accumulation under nitrogen limitation suggest that Picochlorum
SENEW3 may be a promising species for commercial algal biomass applications (55).
Here we report the genome sequence of the natural isolate, Picochlorum
SENEW3. We analyze possible mechanisms of adaptation to salt stress through comparisons of metabolite transporters, identify genome regions of functionally clustered genes, and investigate the role of horizontal gene transfer (HGT) in potentially enhancing the stress tolerance capabilities of this free-living green alga.
Results
Genome features and phylogeny
A total of 830 Mbp of paired-end (2 x 150 bp) Illumina sequence data were generated from Picochlorum SENEW3 using the Illumina MiSeq Personal Genome Sequencer, and 98.3% of reads mapped to the assembled contigs. The assembly comprised 1,266 contigs with an N50 of 124.5 Kbp and an average genome coverage of 62x (52x median coverage). A total of 2.07 Gbp of RNA-seq data from this species were used to train the ab initio gene predictor Augustus (Stanke and Morgenstern, 2005), resulting in high-quality gene models for downstream analysis. Our data show that the 13.5 Mbp nuclear genome encodes 7,367 protein-coding genes, with 5,795 introns, a G+C content of
46.1%, and a gene density of 1.8 Kbp/gene. These values are comparable to
Ostreococcus tauri (i.e., 12.6 Mbp genome; 1.6 Kbp/gene). Of the 458 shared core genes compiled in the Core Eukaryotic Genes Mapping Approach (CEGMA; http://korflab.ucdavis.edu/datasets/cegma/) database, 454 (99%) are present in the
Picochlorum SENEW3 draft genome suggesting a complete assembly. Putative annotations of all Picochlorum SENEW3 predicted proteins, their contig of origin, top database hit, and other attributes are presented in Table 2.S1. A maximum likelihood tree inferred from a concatenated alignment of 480,102 amino acids unambiguously places
Picochlorum SENEW3 as sister to Chlorella variabilis within Chlorophyta (100% bootstrap support) and reveals that its average protein evolutionary divergence rate (i.e., branch length) is elevated since its split from C. variabilis (Fig. 2.1). Given that it is nested within two other Trebouxiophyceae that have much larger genomes (46 Mbp and 49 Mbp), the relatively small genome of Picochlorum SENEW3 likely indicates genome reduction in this taxon (Fig. 2.1). Streamlined genomes are characteristic of fast-evolving species that live in specialized ecological niches or in extreme environments (e.g., (56-59)).
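The summary statistics in this paragraph can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the function names are hypothetical, and the 13.45 Mbp assembly size is taken from the Experimental Procedures.

```python
# Back-of-envelope checks on the assembly statistics quoted in the text.
# All function names are illustrative; the input values come from the text.
ASSEMBLY_BP = 13.45e6   # total contig length (Experimental Procedures)
READ_DATA_BP = 830e6    # total Illumina genome sequence data
N_GENES = 7367          # predicted protein-coding genes
CEGMA_TOTAL = 458       # core eukaryotic genes in the CEGMA set
CEGMA_FOUND = 454       # core genes detected in the draft genome


def mean_coverage(total_read_bp, genome_bp):
    """Average sequencing depth: sequenced bases per genome base."""
    return total_read_bp / genome_bp


def gene_density_kbp(genome_bp, n_genes):
    """Average genome span per gene, in Kbp."""
    return genome_bp / n_genes / 1e3


def completeness_pct(found, total):
    """Percentage of core genes recovered in the assembly."""
    return 100.0 * found / total


coverage = mean_coverage(READ_DATA_BP, ASSEMBLY_BP)
density = gene_density_kbp(ASSEMBLY_BP, N_GENES)
completeness = completeness_pct(CEGMA_FOUND, CEGMA_TOTAL)
print(f"{coverage:.1f}x coverage, {density:.2f} Kbp/gene, {completeness:.1f}% core genes")
```

These values land close to the reported 62x average coverage, 1.8 Kbp/gene, and 99% CEGMA completeness.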
Clusters of functionally related genes
Recent evidence suggests that eukaryotic gene order may not be random, but rather that some groups of functionally related genes form gene clusters (60). Here we identified gene clusters involved in a shared metabolic pathway, while allowing for the presence of intervening genes with unknown or unrelated functions. A total of 5,795 Picochlorum
SENEW3 proteins were mapped to 482 pathways annotated in the Unipathway database.
This analysis resulted in a list of 633 proteins with BLASTp hits to Unipathway that were manually examined for evidence of functional clustering.
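The clustering criterion described above (genes in the same pathway lying close together on a contig, with a few intervening genes allowed) was applied by manual inspection. As a rough illustration only, it could be automated along the following lines. The function name, the tuple layout, and the `max_gap` threshold are assumptions made for this sketch, and the example data are toy values rather than real annotations.

```python
from collections import defaultdict


def find_pathway_clusters(genes, max_gap=2, min_size=2):
    """Group same-pathway genes that lie close together on one contig.

    genes: iterable of (contig, position_index, gene_id, pathway_or_None).
    max_gap: number of intervening genes tolerated between cluster members
             (an assumed threshold, not a value from the text).
    """
    by_key = defaultdict(list)
    for contig, idx, gene_id, pathway in genes:
        if pathway is not None:
            by_key[(contig, pathway)].append((idx, gene_id))

    clusters = []
    for (contig, pathway), members in sorted(by_key.items()):
        members.sort()
        current = [members[0]]
        for prev, nxt in zip(members, members[1:]):
            if nxt[0] - prev[0] - 1 <= max_gap:
                current.append(nxt)
            else:
                if len(current) >= min_size:
                    clusters.append((contig, pathway, [g for _, g in current]))
                current = [nxt]
        if len(current) >= min_size:
            clusters.append((contig, pathway, [g for _, g in current]))
    return clusters


# Toy input: gene order on two made-up contigs with made-up pathway labels.
toy_genes = [
    ("contigA", 0, "g1", "urea"),
    ("contigA", 1, "g2", None),       # intervening gene, no annotation
    ("contigA", 2, "g3", "urea"),
    ("contigA", 3, "g4", "urea"),
    ("contigA", 9, "g10", "urea"),    # too far away to join the cluster
    ("contigB", 0, "g11", "acetate"),
    ("contigB", 2, "g13", "acetate"),
]
toy_clusters = find_pathway_clusters(toy_genes)
print(toy_clusters)
```

Any candidate produced this way would still need the manual examination described in the text.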
One interesting cluster we uncovered contains genes involved in urea uptake and degradation (Picochlorum_contig_54.g177.t1 – Picochlorum_contig_54.g180.t1 [Table
2.S1]). In contrast to the major route of urea degradation to carbon dioxide and ammonia by the nickel-containing urease present in green algae such as Ostreococcus and
Micromonas species, as well as land plants (58), Picochlorum SENEW3 and some green algae and fungi, including C. reinhardtii and C. vulgaris, use a two-step process involving an ATP-dependent urea carboxylase and allophanate hydrolase/amidase (61,
62). In green algae, these two enzymes are encoded by genes in close proximity, and are
673 bp apart in Picochlorum SENEW3 (Fig. 2.2; (63)). We did not detect any genes that encode subunits of the nickel-dependent urease complex in the Picochlorum SENEW3 genome. The Picochlorum SENEW3 urease gene cluster also includes a high affinity urea:Na+ symporter similar to DUR3 in Arabidopsis thaliana (AtDUR3) that is involved in import of urea during nitrogen starvation. This is the sole urea transporter identified in the Picochlorum SENEW3 genome. The clustering of these genes (not the case in
Ostreococcus) is not surprising because exogenous urea is an important nitrogen source to support amino acid biosynthesis in phytoplankton (64). Consistent with these results, the major nitrate assimilation cluster found in Ostreococcus (57) is also largely conserved in Picochlorum SENEW3 (see Table 2.S2).
Another pathway that shows evidence of clustering is the acetate assimilation pathway that leads to acetyl-CoA biosynthesis. Two genes encoding acetate kinase and phosphate acetyltransferase are located in close proximity
(Picochlorum_contig_155.g703.t1 and Picochlorum_contig_155.g705.t1). These genes are also present in C. reinhardtii but not in Ostreococcus and function in acetate assimilation or, in the reverse direction, in ATP generation through fermentation (65). This evidence suggests that Picochlorum SENEW3 may be able to utilize acetate as a carbon source, enhancing its metabolic flexibility, and may be capable of energy generation through fermentation during anoxic conditions. Also noteworthy is a cluster of two genes encoding alpha-acetolactate decarboxylase and acetolactate synthase
(Picochlorum_contig_58.g128.t1 and Picochlorum_contig_58.g129.t1) that are involved in the conversion of pyruvate to acetoin as part of the (R,R)-butane-2,3-diol biosynthesis pathway. These genes have homologs in C. variabilis and appear to have a bacterial origin, but are absent in other Viridiplantae, including Ostreococcus. In some bacteria, these two genes form an operon (alsSD operon in Bacillus subtilis) and are involved in the fermentative production of acetoin, a neutral four-carbon molecule that serves to maintain cellular pH levels and regulate NAD/NADH ratios, and that acts as a carbon storage molecule that can be excreted or reutilized during stationary phase (66-68). Acetolactate synthase also catalyzes the first step in the biosynthesis of the branched-chain amino acids leucine, isoleucine, and valine (67). The genome of Picochlorum SENEW3 does not, however, encode alsR, the transcription factor essential for the transcription of the B. subtilis alsSD operon, nor does it include (R,R)-butanediol dehydrogenase, the enzyme responsible for the reduction of acetoin to 2,3-butanediol, a common fermentation product of industrial importance in bacteria (69, 70). Other clusters of functionally related genes are listed in Table 2.1.
We also used the program C-Hunter (71) to identify functional clusters based on
Gene Ontology (GO) terms. Several clusters of 5-8 genes were identified that are involved in the following functions: response to abiotic stimulus, transferase activity, hydrolase activity, and nucleotide binding. Clusters of three genes or fewer included those involved in biotin synthesis, the citric acid cycle, inorganic phosphate transport, histone proteins, and response to abscisic acid, a stress indicator (results not shown). These results provide evidence for the clustering of functionally related genes in Picochlorum
SENEW3 and suggest that some of the clusters we identified likely play a role in environmental adaptation. Finally, we note that due to the fragmented nature of the genome assembly and the limited number of genes that have pathway annotations, the gene clusters we have identified likely represent an underestimate of the true number. A more complete assembly is likely to physically connect additional genes with shared pathway functions into single contigs.
Transporter analysis
The number and type of metabolite transporters were compared between Picochlorum
SENEW3 and Ostreococcus tauri, a species with limited halotolerance. We identified putative membrane transport proteins and classified them based on sequence similarity
(BLASTp, E-value cutoff ≤1 × 10⁻¹⁰) using the Transporter Classification Database
(TCDB; http://www.tcdb.org/) (72). This resulted in the identification of 719 putative membrane transport proteins in Picochlorum SENEW3 that were categorized into 124 families, representing 9.8% of the genome. Fewer transport proteins were identified in O. tauri, 660 proteins in 112 families, comprising 8.3% of the genome. The most common
transporter proteins in Picochlorum SENEW3 belong to the ATP-binding cassette (ABC) superfamily (60 proteins), followed by the nuclear mRNA exporter (mRNA-E) family
(39), and the peroxisomal protein importer (PPI) family (37) (see Fig. 2.3).
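The similarity-based classification step can be sketched as a filter over BLASTp tabular output (the standard 12-column format, where column 11 is the E-value and column 12 the bit score). This is a minimal illustration under assumptions: the function name is hypothetical, the subject identifiers are toy labels rather than real TCDB headers, and keeping the single best hit per query is a simplification that the text does not specify.

```python
def best_transporter_hits(blast_tab_lines, max_evalue=1e-10):
    """Keep, for each query protein, the highest-scoring hit that passes
    the E-value cutoff. Input lines follow the 12-column BLAST tabular
    layout: qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    best = {}
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # fails the E-value cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Toy BLAST output; query names, subject labels, and scores are invented.
toy_blast = [
    "protA\tTC_2.A.36.7.3\t45.0\t300\t150\t5\t1\t300\t10\t310\t1e-50\t200",
    "protA\tTC_3.A.1.201.7\t30.0\t280\t180\t8\t5\t285\t12\t290\t1e-12\t80",
    "protB\tTC_2.A.18.2.3\t25.0\t100\t70\t3\t1\t100\t1\t100\t1e-5\t40",
]
toy_hits = best_transporter_hits(toy_blast)
print(toy_hits)
```

In this toy case only `protA` is retained, because the single hit for `protB` does not meet the cutoff.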
Picochlorum SENEW3 and O. tauri share a set of 165 distinct transporter classification (TC) numbers, making up a core set of transporter functions. TC number distinguishes transport proteins at the level of subfamily and substrate range.
Within this shared set, Picochlorum SENEW3 encodes 533 proteins, whereas O. tauri encodes 508, indicating gene expansion in the Picochlorum SENEW3 genome. Putative functional annotations of the overrepresented proteins in the Picochlorum SENEW3 genome include the mitochondrial protein translocase family (3.A.8.1.1) and the sodium/hydrogen
(AtNHX8) exchanger (2.A.36.7.3). Picochlorum SENEW3 has 175 transport proteins, representing 35 distinct TC numbers, that are not present in O. tauri. The most abundant transporters found only in Picochlorum SENEW3 include general amino acid transporters (AAP3)
(2.A.18.2.3) and the multidrug resistance protein 4 involved in efflux of drugs and signaling molecules (3.A.1.201.7). Protein families overrepresented in O. tauri include the Resistance-Nodulation-Cell Division (RND) Superfamily (2.A.6) that functions in drug and lipid efflux, the Voltage-gated Ion Channel (VIC) Superfamily (1.A.1), and potassium transport related proteins. These data show generally that both Picochlorum
SENEW3 and Ostreococcus contain a large number of membrane transporters, many of which are shared and some of which are unique to each taxon. The latter likely reflect adaptations to their different environments, although this needs to be tested using functional analyses.
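The comparison of shared and taxon-specific TC numbers amounts to set operations over per-protein TC assignments. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and the protein counts in the example are toy values, not the counts reported above.

```python
from collections import Counter


def compare_tc_profiles(tc_a, tc_b):
    """Compare two taxa given one TC number per transporter protein."""
    counts_a, counts_b = Counter(tc_a), Counter(tc_b)
    shared = set(counts_a) & set(counts_b)
    unique_a = set(counts_a) - shared
    return {
        "shared_tc": sorted(shared),
        "proteins_a_in_shared": sum(counts_a[t] for t in shared),
        "proteins_b_in_shared": sum(counts_b[t] for t in shared),
        "unique_tc_a": sorted(unique_a),
        "proteins_a_unique": sum(counts_a[t] for t in unique_a),
    }


# Toy assignments: TC numbers are taken from the text, copy numbers are invented.
toy_taxon_a = ["2.A.36.7.3"] * 3 + ["3.A.8.1.1"] * 2 + ["2.A.18.2.3"] * 2
toy_taxon_b = ["2.A.36.7.3", "3.A.8.1.1", "1.A.1"]
toy_profile = compare_tc_profiles(toy_taxon_a, toy_taxon_b)
print(toy_profile)
```

A larger protein count within the shared TC numbers for one taxon is the pattern interpreted in the text as gene expansion.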
Nonetheless, it is likely that some of the transport proteins contribute to the broad range of salt tolerance in Picochlorum SENEW3 (Fig. 2.4). For example, this alga encodes six copies of the AtNHX8/salt overly sensitive 1 (SOS1) gene (compared to one in O. tauri), which encodes a plasma membrane Na+/H+ antiporter involved in the extrusion of sodium from the cell that is essential for salt tolerance (73, 74). Sodium extrusion via the Na+/H+ antiporter is coupled to an H+ gradient formed by an H+-ATPase. Picochlorum SENEW3 also contains one gene annotated as a subunit of Na+/K+-ATPase, involved in the ATP-dependent active extrusion of sodium from the cell, which is particularly useful under high pH conditions when the export of sodium via the Na+/H+ antiporter is rendered ineffective (75). Thought initially to be exclusive to animals, homologs of the Na+/K+-ATPase have been reported in some algae including Dunaliella salina, Heterosigma akashiwo, and Porphyra yezoensis (76-78). Also present is NHX1, a Na+ (K+)/H+ antiporter localized in the vacuolar membrane that is involved in the vacuolar accumulation of K+ for osmotic adjustment (79). NHX1, similar to SOS1, is also driven by a proton gradient, in this case formed by the vacuolar H+-ATPase and the H+-translocating inorganic pyrophosphatase (80).
Other proteins found in Picochlorum SENEW3 but not in O. tauri include three copies of the mechanosensitive channel 1 (MSC1), likely located in the chloroplast.
Mechanosensitive channels are present in most prokaryotes and sense changes in the membrane; they are often involved in sensing osmotic stress (81). Picochlorum SENEW3 also has two inward rectifying potassium channels (IRK). Maintaining a high intracellular potassium level is one strategy used to reduce the toxic effects of Na+ on cells (82).
Picochlorum SENEW3 contains several more amino acid permeases than O. tauri. These
are primarily the general amino acid permease 3 (AAP3) involved in the transport of neutral and acidic amino acids. Amino acids and other nitrogen-containing compounds accumulate in plant cells as osmolytes in response to salt stress (83).
Other environmental adaptations include genes involved in metal uptake.
Picochlorum SENEW3 has several additional transporters for zinc and other metals, including iron and magnesium, in the Zinc (Zn2+)-Iron (Fe2+) Permease (ZIP) Family (2.A.5), the Putative Tripartite Zn2+ Transporter (TZT) Family (9.B.10), and the CorA Metal Ion Transporter (MIT) Family (1.A.35); there are 23 genes in these three categories in Picochlorum SENEW3 compared to 10 in O. tauri. Picochlorum SENEW3 has an abundance of ABC transporters, many of which are multidrug resistance proteins.
In terms of metabolic flexibility, Picochlorum SENEW3 contains seven hexose transporters including one that is homologous to the Hup1 glucose transporter in
Chlorella kessleri (84), consistent with the reported mixotrophic capabilities of
Picochlorum strain S1b in the presence of glucose (85).
Growth rates in the presence of organic carbon sources
Given the discovery of putative glucose transporters in the Picochlorum SENEW3 genome, we tested the impact of glucose on cell growth. For this experiment, we raised
Picochlorum SENEW3 under different levels of salt stress, added glucose to the medium, and then measured the growth rate (Fig. 2.5). This analysis reveals suppressed growth rates and longer acclimation periods between 1.4 and 1.6 M NaCl with no growth being observed at 1.8 M NaCl; i.e., under the conditions used in the laboratory (see Methods).
Mixotrophic growth with the addition of glucose under 1.5 M NaCl stress showed
increased maximum cell density with increasing glucose concentrations. No evidence of heterotrophic growth was observed with the addition of glucose in the dark. Mixotrophic growth on glucose has also been shown to increase growth rates in Picochlorum S1b and
Chlorella vulgaris (85, 86). Unlike Picochlorum SENEW3 and C. vulgaris, C. reinhardtii lacks hexose transporters (87). These results are consistent with our comparative genomic analysis, suggesting that glucose may be taken up and metabolized by the cell, thereby partially mitigating the effects of high salt stress. Preliminary culture experiments in which acetate was added to the medium show that Picochlorum SENEW3, similar to C. reinhardtii, can grow mixotrophically in the presence of this organic carbon source
(Foflonker and Bhattacharya unpublished results).
HGT analysis
We investigated the extent of HGT in the Picochlorum SENEW3 genome using an automated phylogenomic pipeline (88). Here we focused on inter-domain HGT because of the greater sampling depth of prokaryote genomes and their large phylogenetic distance from Picochlorum SENEW3 that provides a clear signal of foreign gene acquisition. We generated 5,871 maximum likelihood (Randomized Axelerated
Maximum Likelihood [RAxML]; (89)) protein trees (containing >3 phyla) using the
Picochlorum SENEW3 predicted proteins as the query. These trees were sorted using the program PhyloSort (90) to search for cases of monophyly with Bacteria, Archaea, or
viruses. Phylogenies of interest were then manually examined to identify candidates for
HGT with >60% bootstrap support for the sister-group relationship between Picochlorum
SENEW3 and prokaryotes or trees that included only prokaryotes with the Picochlorum
SENEW3 protein (Fig. 2.6). Given the rampant history of HGT among prokaryotes and the relatively rich green algal/plant database, we presumed that the absence of eukaryotic proteins in the latter trees (except for Picochlorum SENEW3) was sufficient evidence to implicate HGT.
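The two acceptance criteria described above (a prokaryote sister group with >60% bootstrap support, or a tree containing only prokaryotes besides the query) can be written as a simple predicate. This is a hedged sketch only: the actual analysis relied on RAxML trees, PhyloSort, and manual inspection, and the per-tree summary dictionaries, field names, and function name used here are hypothetical.

```python
PROKARYOTE_DOMAINS = {"Bacteria", "Archaea"}


def is_hgt_candidate(tree_summary, min_bootstrap=60):
    """Apply the two screening criteria to a pre-computed tree summary.

    tree_summary (hypothetical layout):
      "sister_domain": domain of the sister group to the query protein
      "bootstrap":     support (%) for that sister-group relationship
      "other_domains": domains of all non-query sequences in the tree
    """
    others = set(tree_summary["other_domains"])
    only_prokaryotes = bool(others) and others <= PROKARYOTE_DOMAINS
    supported_sister = (
        tree_summary["sister_domain"] in PROKARYOTE_DOMAINS
        and tree_summary["bootstrap"] > min_bootstrap
    )
    return only_prokaryotes or supported_sister


# Toy tree summaries with invented values.
toy_trees = {
    "tree1": {"sister_domain": "Bacteria", "bootstrap": 85,
              "other_domains": ["Bacteria", "Eukaryota"]},
    "tree2": {"sister_domain": "Bacteria", "bootstrap": 40,
              "other_domains": ["Bacteria", "Archaea"]},
    "tree3": {"sister_domain": "Eukaryota", "bootstrap": 95,
              "other_domains": ["Eukaryota", "Bacteria"]},
    "tree4": {"sister_domain": "Bacteria", "bootstrap": 55,
              "other_domains": ["Bacteria", "Eukaryota"]},
}
toy_candidates = sorted(name for name, t in toy_trees.items() if is_hgt_candidate(t))
print(toy_candidates)
```

Such a filter only narrows the list; each retained phylogeny would still be examined by hand, as in the text.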
Using this approach, we found 24 instances of HGT unique to Picochlorum
SENEW3 (i.e., not found in any other sequenced green alga), of prokaryotic, mainly bacterial, origin (Table 2.2). This can be compared to ca. 74 genes of putative prokaryotic origin in the Bathycoccus prasinos genome (91). Fifteen of the 24 are expressed and have at least 20X EST coverage (Table 2.2) under standard culture conditions (see Experimental Procedures). Interestingly, many of the 24 genes in Picochlorum SENEW3 have functions potentially related to stress adaptation, and the majority are related to carbohydrate metabolism. Most are glycosyltransferases, glycoside hydrolases, and polysaccharide lyases that function in polysaccharide synthesis and in the degradation of polysaccharides into their sugar moieties. The polysaccharide-degrading enzymes gained, including a cellulase, may function in cell wall recycling or remodeling, or may be excreted from the cell and function in nutrient acquisition, thereby providing metabolic flexibility (92). Other genes may be involved in cell wall synthesis, for example those of the GDP-mannose to GDP-fucose pathway; both genes in this pathway (GDP-mannose-4,6-dehydratase and GDP-L-fucose synthase) appear to have a bacterial origin. In Arabidopsis, GDP-mannose-4,6-dehydratase mutants are deficient in L-fucose, a precursor of the cell wall constituent rhamnose, leading to weakened cell walls and stunted growth (93, 94). Also noteworthy are several genes of suspected HGT origin involved in carbohydrate modifications in the cell wall in the green algae C. variabilis and Ostreococcus
lucimarinus (57, 95). Several glycosyltransferases of foreign origin involved in cell surface protein modifications were identified in the cyanobacterium Synechococcus, and hypothesized to function as a predation avoidance mechanism (96). Glycosylation was also noted as an enriched category of genes of both prokaryotic and non-Viridiplantae eukaryotic origin in B. prasinos (91). Taken together, these data suggest that HGT in green algae has repeatedly conferred genes involved in cell wall and cell surface modifications.
Several of the HGT-derived genes may contribute to the broad salt tolerance properties of Picochlorum SENEW3 (Fig. 2.7). A gene encoding glycerol dehydrogenase is involved in the synthesis of glycerol, a common osmolyte involved in osmoregulation during salt stress. Proteases, including a glutamyl endopeptidase known to be induced during salt stress, break peptide bonds, thereby freeing amino acids like glutamate, which may act as an osmolyte (97).
Other HGT-derived genes potentially involved in stress adaptation include those involved in sulfate scavenging, growth promoting hormone synthesis, cell cycle control, and DNA methylation. An arylsulfatase gene in Picochlorum SENEW3 is potentially involved in sulfur mineralization by the hydrolysis of sulfate esters to sulfate (98).
Whereas most plants and algae increase inorganic sulfur uptake in response to sulfur limitation stress typical of freshwater environments, periplasmic arylsulfatases provide the means to utilize organic sulfur as an alternative, and are induced in C. reinhardtii and
Volvox carteri under sulfur limitation (99-101). Excess sulfur can be incorporated in sulfur-containing amino acids such as cysteine and methionine or shunted to the synthesis of DMSP, an osmoprotectant that is favored under nitrogen-limiting conditions (102).
Another bacterial gene found in Picochlorum SENEW3, regulator of chromosome condensation 1 (RCC1), encodes a protein that binds the nucleosome and is involved in chromosome segregation during cell division (103). RCC1 may be involved in cell cycle control, which is important in unpredictable environments. It is also among the expanded gene families in the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana (104).
Also of foreign origin is an enzyme involved in the synthesis of a plant growth-promoting hormone, indole-3-acetic acid, typically produced by plants and plant-associated soil bacteria (105). This growth hormone promotes growth in co-cultures of Chlorella vulgaris with the plant growth-promoting rhizobacterium Azospirillum brasilense (106).
Selenoproteins
Selenoproteins are selenocysteine-containing proteins, often oxidoreductases that are widely distributed in all domains of life. Green algae such as O. tauri and C. reinhardtii contain 26 and 10 selenoproteins, respectively, whereas some higher plants lack selenoproteins (57, 107, 108). Selenocysteine is encoded by UGA (typically read as a stop codon) in the mRNA sequence, and its translation as selenocysteine is dependent on the presence of a SECIS element (selenocysteine insertion sequence) located in the 3’
UTR region in eukaryotes (109). We used TBLASTn to search for similarity between
Picochlorum SENEW3 proteins and known selenoproteins in O. tauri (57). Top hits were tested for the occurrence of the conserved stem-loop structure of the SECIS element using SECISearch (110), and the results were then manually verified. Using this approach, surprisingly, no clear evidence was found for the presence of selenoproteins in
Picochlorum SENEW3, although our data do not preclude their potential occurrence in the genome.
Hydrogenase activity and other genes of interest
The Picochlorum SENEW3 genome contains two hydA-like genes encoding [Fe]-hydrogenase, which is involved in maintaining pH homeostasis while releasing H2 gas in response to anoxic stress. Hydrogen evolution is present in green algae such as C. reinhardtii but is absent from others (e.g., D. salina) and provides a potential alternative energy source (111).
Our finding suggests that Picochlorum SENEW3 may have H2-evolution capabilities, adding to its repertoire of stress adaptations. This species may also provide an attractive alternative to C. reinhardtii for H2 production, particularly because a high salt, selective medium could be utilized for cultivation. Finally, similar to C. variabilis NC64A, the genome of Picochlorum SENEW3 contains genes involved in chitin and chitosan biosynthesis (112). A homolog of chitin synthase, four copies of chitin deacetylase, three chitinase genes, and a chitosanase gene were found (see Table 2.S1), suggesting that chitin and chitosan may contribute to the resilience of the Picochlorum SENEW3 cell wall.
Discussion
Microalgae are increasingly being looked upon as indicators of climate change and, as their genomes are determined and metabolic pathways described, as potential models for biotechnology (55, 113, 114). Two traits of particular interest for the biofuel industry are salt tolerance as a means of achieving crop protection in open pond systems
and high lipid production (114). Picochlorum SENEW3 has both of these properties and is therefore a potential algal biofuel candidate (55). Understanding how this alga is able to survive in a stressful and fluctuating environment offers the promise to advance both applied and fundamental research.
Given this interest, it was serendipitous that Picochlorum SENEW3 has a highly reduced and compact genome of 13.5 Mbp (i.e., with a gene present every ca. 2 Kbp) that is highly amenable to comparative analysis. Our work identifies a suite of characters that differentiate Picochlorum SENEW3 from its non-halotolerant sisters and that represent major innovations in this lineage. One of these is the clustering of functionally related genes, such as those for urea metabolism and acetate assimilation, that likely allows a rapid response to nutrient stress. Another adaptation to salt and nutrient stress in Picochlorum
SENEW3 is the expansion of metabolite transporter gene families with 719 members representing nearly 10% of the gene inventory. We found seven transporters putatively capable of glucose uptake with culture work showing that glucose in the medium improved algal growth rate under high salt stress (Fig. 2.5). A third major input into overcoming environmental stress in Picochlorum SENEW3 is HGT of bacterial genes.
The notion that microalgae can, like bacteria, gather genes from the environment to adapt to changing conditions has still not been widely tested with free-living taxa. A recent study of the extremophilic (i.e., found in hot acidic waters near fumaroles)
Cyanidiophytina red alga Galdieria sulphuraria showed that its remarkable metabolic flexibility (e.g., glycerol metabolism, ability to detoxify mercury and arsenic) is explained by HGT from prokaryotic sources. Its sister lineage, the rock-dwelling G. phlegrea has, since its split from G. sulphuraria, regained all of the genes required for
urea hydrolysis through (likely independent) HGTs from bacteria, allowing it to survive in the nitrogen-limited cryptoendolithic environment (58). Our results with Picochlorum
SENEW3 significantly extend these findings and show that an alga that lives in a variety of environmental conditions ranging from mesophilic to halophilic is also able to acquire genes from the environment to extend its metabolic flexibility. This trait is evident in the enhanced repertoire of proteins involved in carbohydrate metabolism, osmolyte regulation, sulfate scavenging, and cell cycle control. The 24 clear cases of HGT we identified in Picochlorum SENEW3 stand in stark contrast to the obvious strong selection in this lineage for shedding genes and reducing genome size and complexity. This observation suggests that the prokaryote-derived genes in Picochlorum SENEW3 must confer an ecological, adaptive advantage.
In summary, although this alga is little known in the general scientific literature, our results identify Picochlorum SENEW3 as a potentially valuable model for investigating the origin of metabolic flexibility in eukaryotic microbes. The next step is to develop genetic tools in Picochlorum SENEW3 to test the hypotheses presented here, with the goal of using this knowledge to improve other microbial strains of interest for basic and applied research.
Experimental Procedures
Strains and culture conditions
Picochlorum strain SENEW3 (SENEW3) was isolated by B.P. and S.W. from the San
Elijo Lagoon system, in San Diego County, California and is described further in (55).
The alga was cultivated in artificial seawater (115) based Guillard’s F/2 medium without
silica at 25°C under continuous light (~100 µE/m²/s) on a rotary shaker at 100 rpm
(Innova 43, New Brunswick Eppendorf).
The high salt stress experiments were done in duplicate cultures by varying the concentration of NaCl in the artificial seawater based F/2 medium. Mixotrophic growth rate experiments under 1.5 M NaCl stress were performed in triplicate cultures with the addition of 1-30 g/L of glucose that was filter-sterilized using 0.2 µm cellulose acetate filters. Heterotrophic growth was tested with the addition of 5 g/L glucose in the dark.
Picochlorum SENEW3 stock solution was used to inoculate 100 mL flasks to a density of 1 × 10⁵ cells/mL. Algal growth was determined by cell counts using a hemacytometer (Neubauer improved, Hausser Scientific) and ImageJ software.
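The arithmetic behind the inoculation and cell-count steps can be sketched as follows. This is a minimal illustration, not the procedure used in this work: it assumes counts are taken from the large (1 mm x 1 mm) squares of a Neubauer improved chamber, each of which holds 0.1 µL, and the function names, the dilution factor, and the 50 mL culture volume in the example are hypothetical.

```python
import math


def cells_per_ml(counts, dilution_factor=1.0):
    """Convert counts from large Neubauer squares to cells/mL.

    Each large square (1 mm x 1 mm, 0.1 mm deep) holds 0.1 uL, so the
    mean count per square times 1e4 gives cells/mL of the counted sample.
    """
    mean_count = sum(counts) / len(counts)
    return mean_count * 1e4 * dilution_factor


def inoculum_volume_ml(stock_density, target_density, final_volume_ml):
    """Volume of stock needed to reach the target density (C1*V1 = C2*V2)."""
    return target_density * final_volume_ml / stock_density


def specific_growth_rate(n0, n1, days):
    """Specific growth rate (per day) assuming exponential growth."""
    return math.log(n1 / n0) / days


if __name__ == "__main__":
    # Hypothetical counts from four large squares of a 10-fold diluted stock.
    stock = cells_per_ml([48, 52, 50, 50], dilution_factor=10)
    # Hypothetical 50 mL culture inoculated to 1 x 10^5 cells/mL.
    print(stock, inoculum_volume_ml(stock, 1e5, 50))
```

With the hypothetical numbers above, the stock density is 5 x 10⁶ cells/mL and 1 mL of stock is needed for a 50 mL culture.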
DNA and RNA extraction and library construction
Approximately 100 mg of cells was harvested by centrifugation at 4,000 rpm for 2 minutes and then immediately frozen in liquid nitrogen. DNA extraction was performed using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA) and total RNA was extracted using the RNeasy Mini Kit (Qiagen). DNA and cDNA libraries were constructed using the Nextera DNA Sample Preparation Kit V2 and TruSeq RNA Sample Preparation Kit, respectively (Illumina Inc., San Diego, CA), following the manufacturer’s protocols.
Genome and transcriptome sequencing
A total of 830 Mbp of paired-end (2 x 150bp) Illumina genome sequence data and 2.07
Gbp (13.8 million reads) of paired-end 2 x 150bp mRNAseq data were generated from
Picochlorum SENEW3 using the Illumina MiSeq Personal Genome Sequencer (Illumina,
Inc., San Diego, CA). The genome assembly was generated using the CLC Genomics
Workbench de novo assembler (CLC Bio, Aarhus, Denmark) and consisted of 1,266 contigs totaling 13.45 Mbp with an N50 of 124,539 bp. The RNA-seq data were aligned to the genome using GSNAP (116) and the output used to train the ab initio gene predictor
Augustus (117), resulting in 7,367 high quality gene models for downstream analysis.
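For readers unfamiliar with the assembly statistics quoted above, the following sketch shows how an N50 value and a nominal sequencing depth can be computed. It is an illustration only: the function names are hypothetical, the contig lengths in the example are toy values, and the depth estimate assumes that all 830 Mbp of raw genome data contribute to the 13.45 Mbp assembly.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths supplied")


def nominal_depth(sequenced_bases, assembly_bases):
    """Raw sequenced bases divided by assembly size (no trimming assumed)."""
    return sequenced_bases / assembly_bases


if __name__ == "__main__":
    # Toy contig lengths, not the SENEW3 contigs.
    print(n50([100, 80, 60, 40, 20]))
    # Figures taken from the text: 830 Mbp of reads, 13.45 Mbp assembly.
    print(round(nominal_depth(830e6, 13.45e6), 1))
    # Consistency check on the RNA-seq figures: 13.8 million reads x 150 bp.
    print(13.8e6 * 150 / 1e9)
```

Under these assumptions the genome data correspond to roughly 62-fold nominal depth, and 13.8 million reads of 150 bp account for the 2.07 Gbp of mRNAseq data.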
The sequence data used to assemble the draft Picochlorum SENEW3 genome and the assembled contigs can be accessed via NCBI BioProject PRJNA245752. The genome assembly, gene models, and phylogenomic output (see below) are also available at: http://cyanophora.rutgers.edu/picochlorum/.
Construction of multi-protein tree
We collected complete proteome data from ten green algae: Picochlorum SENEW3,
Chlamydomonas reinhardtii (118), Volvox carteri (119), Chlorella variabilis (112),
Coccomyxa subellipsoidea (95), Micromonas isolate RCC299 (120), Micromonas pusilla
CCMP1545 (120), Ostreococcus lucimarinus, Ostreococcus tauri (57), and Bathycoccus prasinos (91), from the glaucophyte Cyanophora paradoxa (88), and from the red alga
Porphyridium purpureum (Bhattacharya et al., 2013). These combined data were subjected to an all-vs-all self-BLASTp search (E-value cut-off ≤1e-05). Ortholog groups across the 12 taxa were constructed using OrthoMCL (121) with default settings.
Sequence alignments were constructed for orthologous groups containing only one sequence in each green algal taxon (allowing missing data in a maximum of 2 taxa). The alignment was built using MAFFT (--auto) (122) with the poorly aligned regions being removed using Gblocks (-b4=5; -b5=h) (123). Because Gblocks is unable to remove
badly aligned individual sequences within well-conserved blocks, we applied T-Coffee
(124) to remove poorly aligned residues (i.e., conservation score ≤5). Sequences shorter than one-half of the alignment length and columns with <8 residues were also removed from the alignments. The resulting alignments (≥100 amino acids) were used for single-gene tree construction using PhyML3 (125) under the LG+Γ+F+I model. The trees (and their alignments) in the top 20% by total branch length were removed. The remaining 1,656 alignments were concatenated into a super-alignment (480,102 amino acids). The multi-protein tree was built using RAxML (89) under the PROTGAMMALGF model. Bootstrap values were generated using 100 replicates.
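The post-trimming filters described above (short sequences, sparsely populated columns, and the longest trees) can be sketched as follows. This is a hedged illustration rather than the scripts used here: the function names are hypothetical, "sequence length" is assumed to mean the number of non-gap residues, sequences are assumed to be filtered before columns, and the total branch length of each tree is assumed to have been computed elsewhere.

```python
GAP = "-"


def filter_alignment(aln, min_len_fraction=0.5, min_column_residues=8):
    """Filter an alignment given as {name: aligned_sequence}.

    1. Drop sequences whose non-gap length is below min_len_fraction of the
       alignment width.
    2. Drop columns with fewer than min_column_residues non-gap residues.
    """
    if not aln:
        return {}
    width = len(next(iter(aln.values())))
    if any(len(seq) != width for seq in aln.values()):
        raise ValueError("aligned sequences must have equal length")
    kept = {
        name: seq
        for name, seq in aln.items()
        if sum(c != GAP for c in seq) >= min_len_fraction * width
    }
    columns = [
        i
        for i in range(width)
        if sum(seq[i] != GAP for seq in kept.values()) >= min_column_residues
    ]
    return {name: "".join(seq[i] for i in columns) for name, seq in kept.items()}


def drop_longest_trees(tree_lengths, fraction=0.2):
    """Remove the given fraction of trees with the largest total branch length.

    tree_lengths maps a tree (or alignment) identifier to its total branch
    length; the returned dict keeps only the remaining trees.
    """
    n_drop = int(len(tree_lengths) * fraction)
    ranked = sorted(tree_lengths, key=tree_lengths.get, reverse=True)
    return {tree: tree_lengths[tree] for tree in ranked[n_drop:]}


if __name__ == "__main__":
    # Toy alignment; the column threshold is lowered to 2 to suit three taxa.
    toy = {"a": "MKV-L", "b": "MK--L", "c": "M----"}
    print(filter_alignment(toy, min_column_residues=2))
```

In the toy example, sequence "c" is removed because it covers less than half of the alignment, and only the three columns populated in both remaining sequences are retained.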
Phylogenomic analysis
Automated phylogenomic analysis of individual proteins was done as previously described in (88). Briefly, BLASTp was used to retrieve a set of taxonomically diverse sequences from our in-house protein database. Sequence alignments were constructed using MAFFT v6.864b (122), and RAxML 7.2.8 (PROTGAMMAWAG model; 100 bootstrap replicates) was used to generate 5,871 trees containing greater than or equal to
3 phyla. Trees were sorted for monophyly with bacteria, archaea, and viruses using
PhyloSort (90). Instances of HGT were then manually confirmed with ≥60% bootstrap support for the sister relationship between Picochlorum SENEW3 and prokaryotes or trees containing only prokaryotes.
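The confirmation criteria above can be expressed as a simple pre-screen that precedes manual inspection. The sketch below is an assumption-laden illustration, not the PhyloSort workflow: the TreeSummary record and its fields are hypothetical, and it presumes that each tree has already been summarized into the number of phyla it contains, the domains of the non-query taxa, and the bootstrap support for the clade sister to the Picochlorum SENEW3 sequence.

```python
from dataclasses import dataclass

PROKARYOTES = frozenset({"Bacteria", "Archaea"})


@dataclass(frozen=True)
class TreeSummary:
    """Hypothetical per-tree summary used only for this illustration."""
    gene: str
    n_phyla: int                # number of phyla represented in the tree
    other_domains: frozenset    # domains of all non-query taxa
    sister_domains: frozenset   # domains of the clade sister to the query
    sister_bootstrap: float     # bootstrap support (%) for that relationship


def passes_hgt_screen(tree, min_phyla=3, min_bootstrap=60.0):
    """True if the tree meets the stated thresholds for an HGT candidate."""
    if tree.n_phyla < min_phyla:
        return False
    # Case 1: besides the query, the tree contains only prokaryotes.
    only_prokaryotes = bool(tree.other_domains) and tree.other_domains <= PROKARYOTES
    # Case 2: the query is sister to prokaryotes with sufficient support.
    supported_sister = (
        bool(tree.sister_domains)
        and tree.sister_domains <= PROKARYOTES
        and tree.sister_bootstrap >= min_bootstrap
    )
    return only_prokaryotes or supported_sister
```

A screen of this kind only narrows the candidate list; each retained tree would still need to be examined by eye, as was done here.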
Transporter analysis
Putative membrane transport proteins and their classifications were identified based on sequence similarity searches (BLASTp, E-value cutoff ≤1 × 10⁻¹⁰) against the Transporter
Classification Database (TCDB). TC numbers were used to identify a set of core or shared proteins and those unique to Picochlorum SENEW3 or O. tauri.
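The comparison of TC numbers between two taxa can be sketched as below. This is a minimal illustration under several assumptions that are not stated in the text: BLASTp results are in the standard 12-column tabular format, only the best-scoring hit per query is kept, and the TC number is the last "|"-delimited field of the subject identifier. The function names and the example records are hypothetical.

```python
def best_tc_hits(blast_lines, max_evalue=1e-10):
    """Map each query protein to the TC number of its best-scoring hit.

    Expects 12-column tabular BLAST output: field 1 is the query, field 2 the
    subject, field 11 the E-value and field 12 the bit score.
    """
    best = {}
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip comments or malformed lines
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            # Assumption: TC number is the last pipe-delimited field.
            best[query] = (subject.split("|")[-1], bitscore)
    return {query: tc for query, (tc, _) in best.items()}


def core_and_unique(tc_by_protein_a, tc_by_protein_b):
    """Return (shared, unique to A, unique to B) sets of TC numbers."""
    a = set(tc_by_protein_a.values())
    b = set(tc_by_protein_b.values())
    return a & b, a - b, b - a
```

Calling best_tc_hits on the BLASTp output for each taxon and passing the two results to core_and_unique yields the shared set and the two taxon-specific sets of TC numbers.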
Functionally clustered pathways
We downloaded pathway annotations from the Unipathway database (126). The sequences of the underlying genes were retrieved from UniProtKB/Swiss-Prot that comprises a collection of high quality manually annotated and non-redundant protein sequences (http://www.ebi.ac.uk/uniprot). The resulting database comprises proteins for
907 reactions (gene families) that build 207 pathways (478 sub-pathways). Picochlorum
SENEW3 protein sequences were used as query to search against the database using
BLASTp (E-value cutoff ≤1 × 10⁻¹⁰). This resulted in a list of 633 Picochlorum SENEW3 proteins with significant hits. Physically linked genes involved in the same pathway were then manually identified. C-hunter (71) was also used to identify functional clusters based on Gene Ontology (GO) terms (127) under two parameter sets: (i) minimum number of genes per cluster 2; maximum cluster size 3; E-value cutoff ≤1 × 10⁻³; threshold of cluster overlap 10%; and (ii) minimum number of genes per cluster 2; maximum cluster size 50; E-value cutoff ≤1 × 10⁻⁴; threshold of cluster overlap 50%. GO terms were identified using the Blast2GO program (default settings) (128). The top two levels of the GO scheme (e.g., molecular function and biological process) were not considered because they are too general to provide insight in this analysis.
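Physically linked genes in the same pathway were identified manually in this work; the sketch below shows one way such a search could be automated. It is an illustration under stated assumptions: "physically linked" is taken to mean that consecutive pathway members on a contig are separated by at most max_gap intervening genes, and the function name, input layout, and pathway labels are hypothetical.

```python
from collections import defaultdict


def linked_pathway_genes(gene_order, pathways, max_gap=0):
    """Find runs of neighboring genes that share a pathway annotation.

    gene_order: {contig: [gene ids in genomic order]}
    pathways:   {gene id: set of pathway ids}
    Returns a list of (contig, pathway, [genes]) with at least two genes,
    where consecutive members are separated by at most max_gap other genes.
    """
    clusters = []
    for contig, genes in gene_order.items():
        # Positions of each pathway's member genes along this contig.
        positions = defaultdict(list)
        for index, gene in enumerate(genes):
            for pathway in pathways.get(gene, ()):
                positions[pathway].append(index)
        for pathway, indices in positions.items():
            run = [indices[0]]
            for index in indices[1:]:
                if index - run[-1] - 1 <= max_gap:
                    run.append(index)
                else:
                    if len(run) >= 2:
                        clusters.append((contig, pathway, [genes[i] for i in run]))
                    run = [index]
            if len(run) >= 2:
                clusters.append((contig, pathway, [genes[i] for i in run]))
    return clusters


if __name__ == "__main__":
    # Toy contig and annotations, not SENEW3 data.
    order = {"contig1": ["g1", "g2", "g3", "g4", "g5"]}
    annot = {"g1": {"P1"}, "g2": {"P1"}, "g4": {"P1"}, "g5": {"P2"}}
    print(linked_pathway_genes(order, annot))
    print(linked_pathway_genes(order, annot, max_gap=1))
```

With max_gap=0 only strictly adjacent genes (g1, g2) are reported; allowing one intervening gene extends the cluster to g4.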
Acknowledgements
This work was supported by a grant from the Department of Energy (DE-EE0003373/001). The authors have no conflict of interest with respect to this work.
[Figure: tree of the green algal taxa, annotated with genome size and gene number]
Taxon                           Genome size   Gene number
Picochlorum SE3                 13.3 Mb       7367
Chlorella variabilis            46 Mb         9791
Coccomyxa subellipsoidea        49 Mb         9851
Volvox carteri                  138 Mb        14566
Chlamydomonas reinhardtii       121 Mb        15143
Ostreococcus lucimarinus        13 Mb         7805
Ostreococcus tauri              12 Mb         8116
Bathycoccus prasinos            15 Mb         7847
Micromonas strain RCC299        21 Mb         10056
Micromonas pusilla CCMP1545     22 Mb         10575