
Transcriptional Profiling of Chromera velia Under Diverse Environmental Conditions

Thesis by

Annageldi Tayyrov

In Partial Fulfillment of the Requirements

For the Degree of

Master of Science

King Abdullah University of Science and Technology

Thuwal, Kingdom of Saudi Arabia

May, 2014


EXAMINATION COMMITTEE APPROVALS FORM

The thesis of Annageldi Tayyrov is approved by the examination committee.

Committee Chairperson: Dr. Arnab Pain

Committee member: Dr. Liming Xiong

Committee member: Dr. Christian Voolstra


© 2014

Annageldi Tayyrov

All Rights Reserved


ABSTRACT

Transcriptional Profiling of Chromera velia Under Diverse Environmental Conditions

Annageldi Tayyrov

Since its description in 2008, Chromera velia has drawn profound interest as the closest free-living photosynthetic relative of apicomplexan parasites, which are significant pathogens causing enormous health and economic problems. This newly described species therefore holds great potential for understanding the evolutionary basis of how photosynthetic algae evolved into the fully pathogenic Apicomplexa and how their common ancestors may have lived before they evolved into obligate parasites. Hence, the aim of this work is to understand how C. velia functions and responds to different environmental conditions. This study aims to reveal how C. velia responds to environmental perturbations applied both individually and simultaneously, since studying stress factors in isolation fails to elucidate complex responses to multiple stress factors and the systemic regulation of the genes involved. To extract biologically significant information and to identify genes involved in various physiological processes under a variety of environmental conditions (i.e. combinations of varying temperature, iron availability, and salinity in the growth medium), we prepared strand-specific RNA-seq libraries for 83 samples grown under diverse environmental conditions.

Here, we report the sets of genes that are significantly differentially expressed in response to each condition and to their combinations. Several interesting up-regulated and down-regulated genes were found, and their functions and the pathways in which they are involved were studied. We showed that HSP20 proteins are strongly regulated under stress conditions and hypothesized that these proteins might be involved in the movement of the cells.


ACKNOWLEDGEMENT

First of all, I would like to thank my supervisor Dr. Arnab Pain for all that I have learned from him and his continued guidance and support in all stages of this project.

I am also sincerely grateful to Dr. Yong H. Woo for his valuable advice, discussions, and directions, and for his tremendous help with the computational part of the work.

I also thank the thesis committee members Dr. Liming Xiong and Dr. Christian R. Voolstra.

I would like to thank the Center for Desert Agriculture for letting us use their incubators, and especially Dr. Ali Mahjoub for his assistance with this.

Furthermore, I am sincerely grateful to all of my colleagues in the Pathogen Genomics Group and all my friends at KAUST.

Finally, my sincere gratitude goes to my family for always believing in me, for their continued love and support in my decisions, without which I could not have made it here.


TABLE OF CONTENTS

EXAMINATION COMMITTEE APPROVAL FORM
COPYRIGHT
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Discovery of C. velia
1.2 Life cycle and morphology of C. velia
1.3 C. velia as a “missing link” in the evolution of apicomplexan parasites
1.4 C. velia as a model organism
1.5 Environmental stresses and Chromera velia
1.5.1 Heat shock response
1.5.1.1 Small heat shock proteins in adhesion and movement of cells
1.5.2 Iron deficiency
1.5.3 Effect of salinity
1.6 Transcriptomics with strand specific RNA-seq
1.7 ANOVA
1.7.1 Multi-way analysis of variance
1.7.2 Factorial Experiment
Chapter 2 Methods
2.1 Medium preparation and C. velia culturing
2.2 Heat stress application
2.3 Iron deficiency application
2.4 Combination of stress application
2.5 Sampling and RNA extraction
2.6 Strand specific RNA-seq library preparation
2.7 RNA-seq data processing
Chapter 3 Results and Discussion
3.1 Effect of Iron deficiency and induction
3.2 Heat shock treatment
3.3 Multi-factorial experiment
3.4 Differentially expressed genes and cellular pathways
Chapter 4 Conclusions
Future studies
References
Appendices


ABBREVIATIONS

ANOVA analysis of variance

cDNA complementary DNA

CV Chromera velia

FDR false discovery rate

FPKM Fragments Per Kilobase (of transcript) per Million (mapped reads)

HSPs heat shock proteins

KAAS KEGG Automatic Annotation Server

KEGG Kyoto Encyclopedia of Genes and Genomes

RH relative humidity

RIN RNA integrity number

RNA-seq RNA sequencing

RPM revolutions per minute

RT-qPCR reverse transcription-quantitative polymerase chain reaction


LIST OF FIGURES

Figure 1. Life cycle of C. velia
Figure 2. Transcriptome profiles of iron depressed and induced C. velia
Figure 3. Transcriptome profiles of heat shock response in C. velia
Figure 4. Factorial experiment design
Figure 5. General outcome of the factorial experiment
Figure 6. Correlation of gene expression patterns between individual experiment and factorial experiment after 2 h of heat treatment
Figure 7. Effect of heat shock treatment on genes with HSP domains
Figure 8. Effect of salinity and iron on expression of HSP20 and dynein genes
Figure 9. General representation of KEGG metabolic pathways
Figure 10. Number of significantly up- and down-regulated genes for each condition
Figure 11. Precursory metabolic pathways for synthesis of proteins
Figure 12. Some of the crucial pathways for synthesis and regulation of new RNA molecules
Figure 13. Pathways that are directly involved in multiplication of the cells
Appendix E – Global metabolic map for differentially expressed genes


LIST OF TABLES

Appendix A – Pilot Experiment: Heat Shock treatment Hiseq-2000 sequencing output data summary

Appendix B – Pilot Experiment: Fe deficiency and induction treatments Hiseq-2000 sequencing output data summary

Appendix C – Multi-Factorial Experiment Hiseq-2000 sequencing output data summary (Single end)

Appendix D – Number of DE genes and their involved KEGG pathways


Chapter 1

Introduction

Chromera velia is a newly described marine species that turned out to be the closest known living photosynthetic relative of the apicomplexan parasites [1]. The Apicomplexa are a critical group of parasites that includes Plasmodium spp., the causative agents of malaria, which kills more than a million people annually; Toxoplasma gondii, which causes fatal injuries in immune-deficient pregnant women and their fetuses; and Eimeria and other genera responsible for large losses of poultry and cattle each year [2-5]. In fact, it is believed that virtually every animal on the planet is challenged by at least one species of apicomplexan parasite. This group of parasites has thus been extremely successful in terms of host adaptation.

What makes this newly described species of Alveolata so important is that it is regarded as the “missing link” between photosynthetic algae and the parasitic Apicomplexa [6]. Furthermore, one of the major bottlenecks in finding cures for the diseases caused by Apicomplexa is that these obligate parasites are difficult to study, as they require specific host cells. C. velia, on the other hand, can be cultured simply and cheaply in a laboratory. Since C. velia is closely related to the apicomplexan parasites, it could be a suitable model organism for developing cures against those parasites. Because apicomplexan parasites retain remnants of their evolutionary past as algae, C. velia will also help us understand the evolutionary basis of how photosynthetic algae evolved into the fully parasitic Apicomplexa and how their ancestors may have lived before they evolved into obligate parasites of land animals and humans [1].


1.1 Discovery of C. velia

Chromera velia was first found in 2001 during an attempt to obtain a pure culture of Symbiodinium, one of the major symbionts of stony corals. However, it was described and introduced to the literature only in 2008, by Moore et al. [1]. It was first isolated from stony corals of Sydney Harbour and can live either freely or in association with corals. Later, another group isolated C. velia from another coral species, Montipora digitata, at Magnetic Island, Australia, while aiming to culture Symbiodinium from the corals [7]. Phylogenetic analysis suggests that C. velia belongs to the superphylum Alveolata, but because of its significant differences from the previously known subgroups of Alveolata, it has been placed into a new phylum named Chromerida [1]. Although in nature C. velia has always been found associated with corals, not much is yet known about the relationship between C. velia and stony corals. However, the fact that C. velia can be cultured freely in the laboratory indicates that the presence of corals is not mandatory for its normal growth. Cumbo et al. recently designed a study to test the interaction between C. velia and reef corals and showed that C. velia has the potential to be an endosymbiont in the larvae of reef corals [8].

1.2 Life cycle and morphology of C. velia

The life cycle of Chromera velia has two main phases: immotile vegetative coccoid cells and motile flagellated cells (Fig. 1) [9]. The immotile stage is predominant, and cells in this stage are 5-7 µm in diameter. The life cycle starts with binary division of these vegetative cells into two to four daughter cells, also known as autospores. The daughter cells are held together by a membrane in a structure called a sporangium. When conditions are suitable, the autospores are released from the sporangia. Bi-flagellated cells termed zoospores are formed from these autospores. The transformation from non-motile coccoid cells into the motile stage occurs during the exponential growth phase of the cultures and is triggered by several abiotic factors, including intense light exposure and low salinity [10]. C. velia has one complex plastid and a single large mitochondrion. The plastid is bounded by four membranes and contains chlorophyll a but not chlorophyll c. The plastid of C. velia has attracted particular interest from researchers because of its significant similarity to the apicomplexan apicoplast. In addition to these two endosymbiotic organelles, almost every vegetative cell of C. velia has a unique structure called the chromerosome, whose function remains unknown [9].

Figure 1. Life cycle of C. velia: (a) predominant coccoid vegetative cells; (b, c) dividing cells forming a structure with an additional membrane, the autosporangium; (d, e) released autospores starting a new life cycle; (f) autospores can form bi-flagellated cells called zoospores if conditions are favorable [11]. Oborník et al., 2013.


1.3 C. velia as a “missing link” in the evolution of apicomplexan parasites

According to phylogenetic trees based on nuclear and plastid DNA sequences, as well as its morphology, physiology, and metabolic pathways, Chromera velia is the closest known organism to the apicomplexan parasites, a group of unicellular eukaryotes living as obligate parasites of animals. Taxonomically, apicomplexans belong to a group of protists known as Alveolata, and it is estimated that there are millions of apicomplexan species, including the causative agents of malaria, Plasmodium spp. [12]. Two main unique structural characteristics of apicomplexan parasites are that most of them possess an apical complex, which is used to penetrate host cells, and the apicoplast, which is an evolutionary remnant of a photosynthetic plastid. Many functional studies, especially on Plasmodium spp., have shown that the apicoplast is essential for the survival of the parasites [13]. Therefore, this relic plastid represents a promising potential target in the fight against the diseases caused by these parasites.

Furthermore, the discovery of a plastid in apicomplexan parasites suggested that they must have evolved from a photosynthetic organism, in particular from algae possessing a complex plastid. However, this discovery started another debate on whether the plastid had originated from red or green algae, which was resolved only recently, after the discovery of chromerids. The superphylum Alveolata also includes ciliates, which are heterotrophic free-living protists with numerous cilia on their cell surface, and dinoflagellates, which mostly comprise photosynthetic algae with a complex plastid structure. In fact, before the discovery of C. velia, dinoflagellates were the closest known photosynthetic relatives of the apicomplexan parasites [14]. Since the dinoflagellate plastid genome has lost most of its genes to the nuclear genome, it does not overlap enough with the apicoplast genome to allow a meaningful comparison and phylogenetic analysis [15]. Fortunately, the discovery of Chromera velia has shed light on this question. The new group of photosynthetic alveolates has a well-conserved genome that shares genes with both apicomplexan pathogens and dinoflagellates, suggesting a common lineage. In fact, phylogenetic analysis of the C. velia plastid genome has unambiguously shown that the plastid of Chromera is evolutionarily closer to the apicoplast than any other known photosynthetic plastid [1]. Accumulated information from molecular phylogenetic analyses, together with ultrastructural and biochemical analyses, has revealed the surprising closeness of C. velia to the obligately parasitic Apicomplexa [16]. Hence, in-depth studies of this newly discovered alga, along with Vitrella brassicaformis, another close autotrophic relative of apicomplexan parasites [17], have great potential for understanding the evolution of these parasites, which were once photosynthetic autotrophs.

1.4 C. velia as a model organism

Having the right model organism is one of the key steps in performing difficult, complex research studies. Scientists working on apicomplexan parasites have difficulty growing these parasites in the laboratory because of their obligate parasitism. C. velia, on the other hand, can be cultured without living host cells, in large quantities, easily and cheaply. Furthermore, the biology and morphology of C. velia are very similar to those of its parasitic relatives; its plastid in particular has drawn considerable attention. Identifying the essential metabolic or signal transduction pathways and the genes involved in them, in parallel with the studies that have been done on apicomplexans, might help us discover new therapeutic targets against apicomplexan parasites. Hence, C. velia could be a good model organism for researchers working on cures for infections caused by Apicomplexa.

1.5 Environmental stresses and Chromera velia

Environmental stresses are an integral part of climate change, with a wide range of effects on the growth and survival of most, if not all, organisms. In particular, species that cannot escape from unfavorable conditions must have developed biochemical, physiological, and metabolic strategies to cope with environmental fluctuations such as extreme temperatures, drought, salinity, and deficiencies of nutrients and essential elements. Although there are not many studies on the response of C. velia to environmental stresses, it has been shown that C. velia has an extremely flexible photosynthetic apparatus, which works surprisingly efficiently over a very wide range of light intensities by quickly adjusting the content of photosynthetic and auxiliary pigments [18].

1.5.1 Heat shock response

In nature, temperature changes are more common than any other stressor. Worldwide, marine organisms are threatened by climate change, which can have devastating effects on them. Unfavorable temperatures can significantly reduce the productivity of photosynthesis by interfering with the photosystem II reaction center, the Calvin cycle, and the structure of thylakoid membranes [19]. Furthermore, elevated temperature can increase the level of reactive oxygen species, which are responsible for oxidative stress that causes apoptosis [20]. However, most organisms have developed strategies to cope with these temperature fluctuations. Among the most important components of these mechanisms are the heat shock proteins (HSPs) [21]. HSPs are highly conserved among species and function as chaperones involved in the proper folding/unfolding, precipitation, transport, and degradation of proteins. Furthermore, these proteins are involved in the morphogenesis and differentiation of cells. Although heat shock proteins are always present inside cells because of their normal functions, their expression can be triggered by exposure to high temperature and to other kinds of environmental conditions. These proteins are classified into families based on their molecular mass. The best-known and most thoroughly studied are the Hsp70 and Hsp90 proteins (70 and 90 kDa, respectively) [22]. Induction of these proteins is stimulated by a transcription factor known as heat shock factor (HSF) [23].

1.5.1.1 Small heat shock proteins in adhesion and movement of cells

Heat shock proteins with molecular weights ranging from 12 to 43 kDa belong to the family of small heat shock proteins. The family includes molecular mini-chaperones with a well-conserved alpha-crystallin domain 90 amino acids in length [21, 24]. Studies in diverse model organisms have shown a role for small heat shock proteins (HSPBs) in the regulation of cellular adhesion and locomotion. These small HSPB proteins usually form larger oligomers to be fully functional. They have three main duties within cells. First, like other chaperones, they act to prevent or correct protein damage caused by internal or external stressors. Second, HSPBs function in the formation or maintenance of the native conformation of cytosolic proteins by preventing their aggregation. Third, many studies have shown their association with the lipid bilayers of cellular membranes, where they potentially help to preserve membrane integrity. The universal presence of these proteins in all three domains of life, i.e. Bacteria, Archaea, and Eukarya, underlines their physiological importance.

HSP20 proteins are among these HSPB proteins, with an average molecular weight of 20 kDa. Several recent studies have demonstrated the importance of HSP20 proteins in the cellular migration of apicomplexan parasites. Cellular motility is essential for the life cycle of these obligate intracellular eukaryotic parasites. For instance, one recent study demonstrated that ablation of HSP20 in the infectious and highly motile sporozoites of Plasmodium, the causative agent of malaria, significantly affected the motility of the cells [25]. In addition to the aberrant cell motility displayed by Plasmodium sporozoites, antiserum against HSP20 lessens pathogen invasion in Neospora caninum, and parasite penetration and gliding motility in Toxoplasma gondii; these apicomplexan parasites are the causative agents of neosporosis and toxoplasmosis, respectively [26].

1.5.2 Iron deficiency

Some elements are crucial for the normal growth of cells, and iron is one of them [27, 28]. Most microalgae have a high iron requirement for growth, although marine environments usually contain less of this metal than terrestrial environments. Since C. velia is a photosynthetic organism, iron deficiency is expected to interfere with its photosynthetic ability, because iron is required for the synthesis of chlorophyll pigments and of some components of the photosystems, such as cytochromes. Iron deficiency also affects the electron transfer system (ETS) of respiration, which is the main energy-supplying mechanism for most aerobic organisms, including C. velia. Furthermore, iron functions as a cofactor, or as part of a cofactor, for some enzymes. The nitrogen assimilation enzymes nitrate reductase and nitrite reductase, which reduce nitrate (NO3-) to nitrite (NO2-) and nitrite to ammonia (NH3), respectively, contain iron ions [29].

A study with phytoplankton showed that cell densities and growth rates decreased significantly under Fe stress [30]. Most terrestrial microorganisms and plants use one of two main iron uptake strategies. The first is the reductive mechanism of iron uptake, in which extracellular ferric complexes are dissociated into free iron ions by reduction and imported into the cell through plasma membrane channels. The second is the siderophore-mediated mechanism, in which siderophores released by the cells themselves or by microorganisms such as bacteria and fungi are taken up directly and the iron is released from them inside the cells [31, 32]. Although some studies suggest that some marine organisms may also use one of these well-studied iron uptake mechanisms, we know very little about the iron acquisition strategies of marine phytoplankton. However, it is expected that the strategies used by these marine organisms must have evolved further to operate efficiently under extremely low iron concentrations, since the iron content of most marine habitats is extremely low, i.e. 0.02-1 nM [33]. In fact, a recent study on the iron uptake mechanisms of C. velia found that this alga uses a novel mechanism to acquire iron from its environment [30]. This highly efficient strategy involves a large number of iron binding sites on the cell wall, which allow ferric ions to be concentrated prior to uptake into the cells.

1.5.3 Effect of salinity

Salinity is another crucial abiotic factor for marine organisms. Although marine species are adapted to life in salty water, keeping salinity at an optimum level is important for their osmoregulation. Most marine organisms have developed a variety of strategies to cope with the fluctuations in salinity that occur in the sea from time to time [34, 35]. Based on observations made so far during the culturing of C. velia, it is expected that this alga also has sophisticated mechanism(s) to cope with unfavorable salt concentrations.

However, in nature organisms are exposed to more than one type of stress at a given time. Although there are not yet many studies on the multi-stress responses of organisms, owing to the complexity of the experimental designs and data analysis, several studies have shown that the gene expression profile of an organism under simultaneous multiple stresses is not predictable from individual stress treatments. A study on Arabidopsis thaliana showed that 61% of the transcriptional changes observed under combinations of stresses were not predictable from the single stress treatments [36]. Since multi-factorial stress data are expected to capture more natural and sophisticated stress responses of an organism, they allow us to identify and (re)construct the functional transcriptional networks and metabolic processes involved in the interaction of an organism with its environment [37].


1.6 Transcriptomics with Strand specific RNA-seq

The transcriptome is the complete set of transcripts in a cell at a given time. Understanding the transcriptome of an organism is essential for decoding the functional units of the genome and revealing the molecular constituents of a cell under specific physiological conditions or at a specific developmental stage [38]. Transcriptional profiling has therefore become an important tool for investigating how organisms respond to environmental changes. Organisms modulate and rearrange their physiology and morphology by changing their gene expression patterns to cope with the environmental stresses they encounter. Within the last decade, several methods have been developed for high-throughput transcriptome studies. They fall into two categories: hybridization-based approaches and sequencing-based approaches. Hybridization-based approaches involve hybridizing fluorescently labeled cDNA molecules to specifically designed microarrays [39]. Although microarray technology is high-throughput and relatively cheap, it has several limitations: the need for prior knowledge of the genome sequence (i.e. a fully and accurately annotated genome), laborious and inaccurate normalization techniques, especially when comparing results across experiments, and the inability to provide absolute transcript quantities. In contrast, sequencing-based approaches can overcome almost all of these problems because they determine the cDNA sequence directly. One of the most recent and most advanced sequencing-based methods is RNA sequencing (RNA-seq). Its independence from annotated genomes makes this deep sequencing technology especially attractive for organisms whose genomic sequences are yet to be annotated. In addition, RNA-seq is still under active development. For instance, recently improved experimental protocols allow the production of strand-specific reads. With strand-specific RNA-seq, overlapping transcripts on the sense and antisense strands can be easily resolved, which is especially needed when working with newly annotated reference genomes.

1.7 ANOVA

One of the major challenges when dealing with high-throughput sequencing data is experimental variance. This variance can arise from variability between biological conditions and from variability within biological conditions. Properly identifying the source is crucial for obtaining a list of genes for which there is evidence of differential expression among experimental conditions. Several sophisticated tools have been developed to obtain such a gene list. Analysis of variance (ANOVA) is a statistical method that separates the variation within groups (e.g. among biological replicates) from the variation between groups (e.g. between two different environmental conditions) and tests whether the difference between group means is significant or merely due to chance. ANOVA can be used to test differences between two or more groups. It tests general rather than specific differences among means. If the between-group and within-group variances are equal, the variation is most likely due to chance and is not significant [40].
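The comparison of between-group and within-group variance described above can be illustrated with a minimal one-way ANOVA sketch. This is not the pipeline used in this work; the function name and the expression values are hypothetical and serve only to show how the F ratio is formed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups (conditions)
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical log2 expression values of one gene, three replicates per condition.
control = [5.0, 5.2, 4.8]
heat_2h = [7.1, 6.9, 7.0]
heat_4h = [7.9, 8.1, 8.0]

f_stat, df_b, df_w = one_way_anova_f([control, heat_2h, heat_4h])
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

An F ratio close to 1 means that the between-group variance is comparable to the within-group variance; a large F ratio, judged against the F distribution with the given degrees of freedom, indicates a significant difference among group means.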

1.7.1 Multi-way analysis of variance

Multi-way ANOVA, also known as factorial ANOVA, is used to analyze the effects of multiple independent variables (for instance, temperature and salinity) on a dependent variable (for example, a change in gene expression). This method is more efficient for dissecting the effects of several factors at the same time. In addition to the main effect of each independent variable, it calculates interaction effects, a special kind of effect that can be observed only in factorial experiments and that arises when the effect of one independent variable depends on the level of another. Studying interaction effects therefore usually adds significant value to the completeness of a study, and the set of interaction effects depends on the types and number of independent variables.
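As an illustration of how main effects and an interaction effect are separated, the following sketch partitions the sums of squares for a balanced two-factor design. It assumes equal replication in every cell and a non-zero error term; the function name and the example values are hypothetical and are not taken from the data of this work.

```python
from statistics import mean


def two_way_anova(cells):
    """Sums of squares, degrees of freedom and F ratios for a balanced design.

    cells[i][j] holds the replicate measurements at level i of factor A and
    level j of factor B; every cell needs the same number (>= 2) of replicates.
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    mean_a = [mean([x for cell in row for x in cell]) for row in cells]
    mean_b = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    mean_ab = [[mean(cell) for cell in row] for row in cells]

    ss = {
        # Main effects: deviation of each factor-level mean from the grand mean.
        "A": b * r * sum((m - grand) ** 2 for m in mean_a),
        "B": a * r * sum((m - grand) ** 2 for m in mean_b),
        # Interaction: what the cell means show beyond the two main effects.
        "AxB": r * sum(
            (mean_ab[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        # Error: replicate scatter around each cell mean.
        "error": sum(
            (x - mean_ab[i][j]) ** 2
            for i in range(a) for j in range(b) for x in cells[i][j]
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1), "error": a * b * (r - 1)}
    ms_error = ss["error"] / df["error"]
    f_ratio = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}
    return ss, df, f_ratio


# Hypothetical log2 expression of one gene; rows are temperature (26, 37 degrees C),
# columns are salinity (medium, high), two replicates per cell.
expression = [
    [[5.0, 5.2], [5.1, 4.9]],
    [[7.0, 7.2], [9.0, 9.2]],
]
ss, df, f_ratio = two_way_anova(expression)
print(ss, df, f_ratio)
```

In this made-up example the gene responds to salinity only at the higher temperature, which appears as a non-zero interaction sum of squares; for purely additive effects the interaction term would be zero.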

1.7.2 Factorial Experiment

Conventionally, experiments are designed to test the effect of one factor on one response. In contrast, a factorial design is used to determine the effect of multiple variables on a response. It was argued as early as a century ago that factorial designs imitate nature much better than studying one factor at a time [41]. By combining the study of multiple variables in one factorial experiment, one can significantly reduce the number of experiments that have to be performed. Furthermore, in addition to the main effect of each factor, studying multiple factors simultaneously makes it possible to find interaction effects that cannot be explained by the main effects alone. In a factorial design, each factor may have several levels. Experiments in which all factors have the same number of levels are called symmetrical factorial experiments; otherwise, they are called asymmetrical factorial experiments. The number of unique combinations is determined by the number of levels and the number of factors: it is the product of the numbers of levels of all factors.
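The counting rule for a full factorial design can be sketched in a few lines. The helper names are hypothetical; the example levels are the salinity and iron levels of the growth media described in the Methods, used here only for illustration.

```python
from itertools import product
from math import prod


def design_points(factors):
    """List every unique combination of factor levels (full factorial design)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]


def n_combinations(factors):
    """Number of unique combinations: the product of the numbers of levels."""
    return prod(len(levels) for levels in factors.values())


def is_symmetrical(factors):
    """True if every factor has the same number of levels."""
    return len({len(levels) for levels in factors.values()}) == 1


# Three salinity levels crossed with two iron levels.
media_factors = {
    "salinity_g_per_L": [16.7, 33.3, 66.6],
    "iron": ["Fe(-)", "Fe(+)"],
}
points = design_points(media_factors)
print(n_combinations(media_factors), "media;", "symmetrical:", is_symmetrical(media_factors))
for point in points:
    print(point)
```

Because the two factors have different numbers of levels (three and two), this is an asymmetrical design with 3 x 2 = 6 combinations.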


Chapter 2

Methods

2.1 Medium preparation and C. velia culturing

Chromera velia, isolated from the stony corals of Sydney Harbour, was used throughout this work. The cultivation medium was prepared as follows. First, for one liter of normal medium, 20 ml of Guillard’s (F/2) Marine Water Enrichment Solution, 50X (Sigma-Aldrich), was added to 950 ml of sterilized water. Then, 33.3 g of sea salt and 1 ml of antibiotic solution (ampicillin (50 mg/ml) and kanamycin (50 mg/ml)) were added and mixed well on a magnetic stirrer until the solution became clear. The pH of the medium was adjusted to 8.0-8.4 by adding base (NaOH) or acid (HCl) as required. The newly made medium was then filtered under sterile conditions and stored at 4 ˚C until use. Under normal circumstances, the incubation conditions for C. velia were: temperature, 26 ˚C; humidity, 60 %RH; and light, 4 µmol with a 12/12 h light-dark regime.
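The arithmetic behind the recipe can be checked with a short sketch. It assumes a nominal final volume of 1000 ml per liter of medium (an assumption; the stated volumes add up to slightly less), and the dictionary keys and function names are illustrative only.

```python
# Amounts per liter of normal medium, as stated in the text above.
RECIPE_PER_LITER = {
    "f2_enrichment_50x_ml": 20.0,
    "sterilized_water_ml": 950.0,
    "sea_salt_g": 33.3,
    "antibiotic_solution_ml": 1.0,
}


def scale_recipe(liters, sea_salt_g_per_liter=33.3):
    """Scale the recipe to a batch size; the salt level can be changed per medium."""
    scaled = {name: amount * liters for name, amount in RECIPE_PER_LITER.items()}
    scaled["sea_salt_g"] = sea_salt_g_per_liter * liters
    return scaled


def final_concentration(stock_concentration, stock_volume_ml, final_volume_ml):
    """Dilution rule C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_concentration * stock_volume_ml / final_volume_ml


batch = scale_recipe(0.6)                                # one 600 ml culture flask
enrichment_strength = final_concentration(50, 20, 1000)  # in multiples of 1X
antibiotic_mg_per_ml = final_concentration(50, 1, 1000)  # for each antibiotic
print(batch)
print(enrichment_strength, antibiotic_mg_per_ml)
```

Under the stated assumption, 20 ml of the 50X enrichment solution per liter corresponds to a 1X working strength, and 1 ml of the 50 mg/ml antibiotic stock corresponds to 0.05 mg/ml of each antibiotic.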

2.2 Heat stress application

Two flasks of 600 ml of fresh C. velia culture were maintained in f/2 medium under normal incubation conditions for 11 days. Based on an initial growth rate experiment, the cells were expected to be in logarithmic growth phase on this day. For the heat treatment experiment, the contents of the two flasks were mixed and divided equally into 12 new small flasks, each containing 100 ml of cell culture. There were four heat treatment durations: 0 hours (control), 30 minutes, 2 hours, and 4 hours, with three biological replicates for each. After all of the samples had been numbered randomly, three of them were incubated at 26 ˚C while the others were incubated at 37 ˚C. Everything except the temperature was kept the same between the two incubators. When the heat exposures were complete, all of the flasks were centrifuged at 3500 RPM for 15 min at 4 ˚C to pellet the cells. RNA extraction was then carried out.
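A random assignment of the 12 flasks to the four treatment durations could be generated as in the following sketch. The text does not describe how the random numbering was actually done; the function name, the treatment labels, and the seed are hypothetical.

```python
import random


def assign_flasks(durations, replicates, seed):
    """Randomly assign numbered flasks to treatments, `replicates` flasks each."""
    labels = [d for d in durations for _ in range(replicates)]
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    rng.shuffle(labels)
    # Flask numbers start at 1; each flask gets exactly one treatment label.
    return {flask: label for flask, label in enumerate(labels, start=1)}


durations = ["0 h (control, 26 C)", "30 min (37 C)", "2 h (37 C)", "4 h (37 C)"]
assignment = assign_flasks(durations, replicates=3, seed=2014)
for flask, treatment in assignment.items():
    print(f"flask {flask:2d}: {treatment}")
```

Recording the seed alongside the sample sheet makes it possible to regenerate the same flask-to-treatment mapping later.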

2.3 Iron deficiency application

To test the effect of Fe deficiency on the growth and survival of photosynthetic C. velia, Fe (-) and Fe (+) media were prepared. Since iron is already present in Guillard’s (F/2) Marine Water Enrichment Solution, normal medium serves as the Fe (+) medium. The difficulty lies in preparing the Fe (-) medium. Citric acid was used to remove available Fe3+ ions from the medium, since it has been shown that C. velia can use only ferric ions and that the transfer of these ions into the cells can be efficiently prevented by adding citrate, a strong ligand of ferric ions [42]. Together they form insoluble iron citrate, which cannot be used by C. velia.

For this part of the work, two new C. velia cultures were established. One of the cultures was maintained in 300 ml Fe (+) f/2 medium while the other was maintained in 600 ml Fe (-) f/2 medium. After nine days of incubation under normal conditions, the cultures in the two flasks were divided into nine 100 ml cultures. All cultures were centrifuged. The cells that had been maintained in Fe (+) medium were washed with Fe (+) medium three times, and 100 ml of fresh Fe (+) medium was then added to each of those flasks. Three of the flasks from the Fe (-) culture were also washed with Fe (+) medium three times, and 100 ml of Fe (+) medium was added to each of them (referred to as Fe induction). The remaining three flasks were filled with 100 ml of Fe (-) medium after washing three times with Fe (-) medium. All nine 100 ml cultures were incubated for two more days under the same conditions. On the 11th day, all of the cultures were centrifuged at 3500 RPM for 15 min at 4 ˚C to pellet the cells. RNA extraction was then carried out.

2.4 Combination of stress application

C. velia cells were exposed to combinations of stresses, i.e. heat treatment, cold treatment, Fe deficiency and different salinity levels. First, six different media were prepared from the combinations of low salt (16.7 g/L), medium salt (33.3 g/L) and high salt (66.6 g/L) with iron-deficient and iron-sufficient conditions. Two flasks of 600 ml cultures were seeded for each medium, i.e. twelve flasks in total. After being maintained under the normal incubation conditions for eleven days, they were divided into 100 ml flasks before exposure to the predetermined temperatures.

Since there were five different temperature points and two biological replicates of each, this made sixty-six flasks of cultures in total. Due to technical limitations, the cold treatment and the heat treatment of the samples were performed at different times, each with a time-specific control group that can be used to normalize the data. Once the samples had been exposed to the given temperatures (i.e. 37 ˚C for heat shock and 14 ˚C for cold shock), all of the cultures were centrifuged at 3500 RPM for 15 min at 4 ˚C to pellet the cells as a first step for RNA extraction.

2.5 Sampling and RNA extraction

For the individual stress applications, total RNA was extracted from the cell pellet immediately after the treatments using the Norgen RNA Extraction kit according to the manufacturer’s protocol (Norgen Biotek Corporation, Canada). Briefly, 600 µl of lysis buffer was added to each cell pellet. The lysis mixture was transferred into sterile tubes containing 0.5 mm diameter glass beads. To disrupt the cells, the tubes were shaken on a bead-beater for 5 min at 30 Hz. After brief centrifugation, the supernatant was transferred into DNA removal tubes. The DNA-free eluate was passed through RNA elution tubes and eluted with 40 µl of elution buffer.

For the combination of stress experiment, due to the high number of samples (i.e. 66 samples), all cell pellets were first preserved in 2 ml RNA-later. RNA was then extracted from twelve randomly chosen samples at a time. To minimize variation among the samples, all extractions were done within 1.5 days. RNA quality was assessed using a nanochip on a model 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). RNA concentration was determined with a Qubit™ (Invitrogen, Carlsbad, CA).

2.6 Strand Specific RNA-seq Library preparation

Once total RNA quality (RIN ≥ 8.0 as measured on an Agilent Bioanalyzer) and concentration had been determined, strand-specific RNA-seq library preparation was performed using the TruSeq LT stranded RNA sample kit (Illumina, San Diego, CA) according to the manufacturer’s instructions. Briefly, library preparation started with 1 µg of total RNA. First, poly-A containing mRNA molecules were pulled out from the total RNA using poly-T oligo-attached magnetic beads. During the second elution of the poly-A RNA, the mRNA molecules were fragmented and primed for cDNA synthesis. The cleaved RNA fragments, primed with random hexamers, were reverse transcribed into first-strand cDNA using reverse transcriptase. The second cDNA strands were then synthesized, replacing the RNA templates. To prevent the blunt cDNA fragments from ligating to one another during the adapter ligation reaction, a single ‘A’ nucleotide was added to the 3’ ends of the fragments. Multiple indexing adaptors were then ligated to the ends of the double-stranded cDNA, preparing them for hybridization onto a flow cell. DNA fragments with adaptor molecules on both ends were selectively enriched by PCR to amplify the amount of DNA in the library. Prior to cluster generation, the concentration and size of the libraries were assayed using the Agilent DNA1000 kit. Libraries from all samples were sequenced in a single flow cell (11-12 libraries per lane) on the Illumina HiSeq 2000. Final reads were single-end 50 nt. Image analysis, base-calling and quality filtering were performed by Illumina software. On average, 25.6 million reads were generated from each library (for a detailed summary, refer to Appendices A-B).

2.7 RNA-seq data processing

Efficient and successful processing and analysis of RNA-seq data is crucial to scientific discovery. For that reason, the raw data were cleaned up so that they could be used for mapping against a reference. Low-quality bases and adapter sequences were removed. The trimmed datasets were examined to make sure that the quality scores were high enough [43].

TopHat and Cufflinks are recently developed software programs for comprehensive expression analysis of RNA-seq data [44-46]. First, preprocessed RNA-Seq reads were aligned to the reference genome in order to identify transcript splice sites. These aligned reads were provided as input to Cufflinks, which assembles the mapped RNA-Seq fragments into transcripts, estimates their abundances, and tests for differential expression. RNA-Seq fragment counts can be used as a measure of the relative abundance of transcripts, and Cufflinks reports transcript abundance in Fragments Per Kilobase of exon per Million fragments mapped (FPKM). Cufflinks calculates the relative abundances of these transcripts based on the number of fragments supporting each one.
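The FPKM unit described above can be illustrated with a short calculation. The following is a minimal sketch of the unit definition only, not of the statistical model that Cufflinks uses internally; the function name and the example numbers (500 fragments on a 2 kb transcript in a library of 25.6 million mapped fragments) are hypothetical and chosen purely for illustration.

```python
def fpkm(fragment_count: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of exon per Million fragments mapped.

    Simplified unit definition: the fragment count is scaled by the
    transcript length (in kilobases) and the library size (in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


# Hypothetical example values, not taken from the thesis data.
example_fpkm = fpkm(500, 2_000, 25_600_000)
print(round(example_fpkm, 3))
```

Dividing by both length and library size is what makes values comparable between long and short transcripts and between libraries sequenced to different depths.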

Normalization is necessary before analysis in order to ensure that differences in FPKM values are indeed due to differential expression and not experimental artifacts. Proper normalization of the data is therefore a key step for comparing gene counts across samples. We used one of the most common normalization techniques, quantile normalization. Quantile normalization ensures that the counts from all samples have the same distribution by sorting the counts from each data set and imposing a common distribution across all data sets. Analysis of variance was performed with R/MAANOVA [47]. The following equation describes a factorial design with three treatment factors:

$$y_{ijkl} = \mu + t_i + s_j + f_k + (tsf)_{ijk} + \varepsilon_{ijkl}$$

where $y_{ijkl}$ is the observed value of the response variable; $\mu$ is the overall mean; $t_i$, $s_j$ and $f_k$ are the main effects of the temperature, salinity and Fe treatments, respectively; $(tsf)_{ijk}$ is the third-order interaction effect, which also covers all variation due to second-order interactions; and $\varepsilon_{ijkl}$ is the residual.
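The quantile normalization step described before the model can be sketched as follows. This is a generic illustration of the technique, assuming a matrix with genes in rows and samples in columns; it is not the exact implementation used in this work, and ties are resolved by rank order rather than by averaging. The function name and the small input matrix are hypothetical.

```python
import numpy as np


def quantile_normalize(matrix):
    """Give every column (sample) the same empirical distribution.

    Each column is sorted, the mean across columns is taken at every
    rank, and each original value is replaced by the mean of its rank.
    """
    x = np.asarray(matrix, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")      # rank order per sample
    sorted_x = np.take_along_axis(x, order, axis=0)   # each column sorted
    reference = sorted_x.mean(axis=1)                 # shared distribution
    normalized = np.empty_like(x)
    # Write the reference value of each rank back to its original position.
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized


# Hypothetical expression values: 4 genes (rows) x 3 samples (columns).
counts = [[5, 4, 3],
          [2, 1, 4],
          [3, 4, 6],
          [4, 2, 8]]
normalized_counts = quantile_normalize(counts)
print(normalized_counts)
```

After normalization, sorting any column yields the same set of values, which is the property the text refers to as an equal distribution across samples.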

James-Stein based F statistics (FS) were applied to detect differentially expressed genes [48]. The resulting p-values were adjusted for false discovery rate (FDR) for multiple comparisons [49].
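As an illustration of FDR adjustment, the sketch below implements the Benjamini-Hochberg step-up procedure. The text cites [49] without naming the procedure, so the choice of Benjamini-Hochberg is an assumption made here for illustration; the function name and the example p-values are likewise hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                              # ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)        # p * n / rank
    # Enforce monotonicity, working from the largest p-value downwards.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Hypothetical p-values for four genes.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = adjusted <= 0.05   # threshold used for main effects in the text
print(adjusted, significant)
```

A gene would then be called differentially expressed when its adjusted value falls at or below the chosen threshold (FDR ≤ 0.05 for main effects and FDR ≤ 0.1 for interaction effects in this work).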

For visualization and statistical analysis of the processed RNA-seq data, JMP (Version 11.1.1, SAS Institute, Cary, NC) was used.

Chapter 3

Results and Discussion

3.1 Effect of Iron deficiency and induction

Transcriptome (RNA-seq) sequencing yielded 18.5 million reads for each of the nine libraries, with an output quality of 94% (for a detailed summary, refer to Appendix A). To check the reproducibility of the experiment, we evaluated correlations in gene expression across the three biological replicates. In the hierarchical clustering, each sample clustered together with its replicates, except for one of the “Fe-minus” libraries (Fig. 2A), indicating that there are significant gene expression differences between samples with different Fe conditions. We could not identify the cause of the outlier library, so we excluded it from further analysis.

Thirty-six percent of C. velia coding genes were differentially expressed due to iron deficiency, iron induction, or both when the significance threshold was set at FDR ≤ 0.05 (Fig. 2B). Many genes known to be involved in photosynthesis or in components of the electron transfer system, such as ferredoxin, carbonic anhydrase and cytochrome c, were differentially expressed in iron-deficient samples. Reductions in ferredoxin or in photosynthetic efficiency are amongst the indicators that have been used to infer iron limitation in some marine systems [50]. The majority of genes that were significantly up- or down-regulated were hypothetical genes without functional annotation. Our understanding of how intracellular metabolic processes are differentially regulated during changes in Fe level could be improved with better functional annotation of the C. velia genes.

Iron induction was left out of the factorial experiment because its gene expression profile was very similar to that of iron deficiency, and because of the technical difficulty of incorporating iron induction into a multi-factorial experiment: the induction requires several medium washing and changing steps, which is impractical for a factorial experiment that involves a large number of samples.

Figure 2. Transcriptome profiles of iron-deficient and iron-induced C. velia cells. A. Hierarchical clustering of RNA-seq expression data separating replicates of each condition. B. Venn diagram of the number of significantly differentially expressed features for each condition (in total 9176 genes). The significance threshold was set at a false discovery rate of FDR ≤ 0.05.

3.2 Heat shock treatment

Because the impact of environmental fluctuations on the physiology of an organism is not always clear, what constitutes stress for a given organism is not always well defined. However, this is different when it comes to temperature. Each organism has an optimum temperature range for its life cycle, and outside that range it becomes stressed. Hence, the response to heat shock is well conserved among all domains of life and is a well-studied stress condition [24].

This is the very first study of the heat shock response of C. velia. For this study, two independent cultures of C. velia were exposed to 30 min, 2 h and 4 h of heat at 37 ˚C and were compared to three replicates of the control group (at 26 ˚C). Transcriptome sequencing yielded on average 20 million reads per library, with around 85% output quality each (for a detailed summary, refer to Appendix B). We checked the reproducibility of the data before further analysis. Hierarchical clustering of the libraries based on their gene expression patterns displayed the expected grouping of each library with its corresponding replicates. The 30 min heat treatment tends to cluster with the controls, while the two-hour and four-hour heat treatments tend to cluster together (Fig. 3A), suggesting that C. velia has an initial response and a late response under heat shock [51].

Along with this, we found that many genes (65%) were significantly differentially expressed in response to heat shock. Out of nineteen heat shock proteins, which are well conserved among all organisms and well known for their roles in the heat shock response, fourteen were significantly up-regulated when the significance threshold was set at FDR ≤ 0.05. Because the gene expression profiles of C. velia after two and four hours of heat treatment were similar, the four-hour time point was excluded from the factorial experiment.

Figure 3. Transcriptome profiles of heat shock response in C. velia cells. A. Hierarchical clustering of RNA-seq data of conditions and replicates. B. Venn diagram of the number of significantly expressed genes for each condition. Significance threshold was set at a false discovery rate of FDR < 0.05.

3.3 Multi-factorial experiment

After the initial pilot studies of the individual effects of salinity, iron deficiency and induction, and heat shock on C. velia cells, we started our main experiment. We designed and performed a multi-factorial experiment, which allows simultaneous application of salinity, temperature (i.e. heat and cold) and iron deficiency. The levels of each of these three factors were chosen based on what we learned from the individual stress experiments: the four-hour heat treatment was excluded due to its similar expression pattern to the two-hour treatment, and iron induction was dropped because the expression profiles of Fe-induced samples were similar to those of Fe-deficient samples, so adding a further level would have made the multi-factorial experiment much more challenging without much additional gain. In addition, pilot studies of high and low salt carried out before this project showed sufficient perturbation of C. velia cells, triggering significant differential expression of many genes (data not shown).

There are three factors in this multi-factorial experiment: temperature, salinity and iron deficiency. Each factor has several levels. Temperature has six levels: control, 30 min and two hours of heat treatment at 37 ˚C for the heat shock response, and control, 30 min and two hours of cold treatment at 14 ˚C for the cold shock response. Salinity has three levels: high (66.6 g/L), normal (33.3 g/L) and low (16.7 g/L). Iron has two levels: normal iron and iron deficiency. Together, these factors and their levels make 36 unique combinations (see Figure 4 for a visual representation). Except for the cold shock control, we had biological replicates for each of the combinations, which makes 66 samples in total. Of the 66 samples, 65 had an RNA RIN value greater than 8.0; strand-specific RNA-seq libraries were prepared from those 65 samples and sequenced to a high depth on an Illumina machine.

Every library had 20 to 40 million reads, with on average 93% output quality (for a detailed summary, refer to Appendix C). After preprocessing the RNA-seq data, we first evaluated the variation between the two biological replicates to make sure the experiment had good reproducibility. Hierarchical clustering of the libraries based on their gene expression patterns displayed the expected grouping of each library with its corresponding replicate, indicating that the experiment worked and has exceptionally high reproducibility (Fig. 5A). This gave us high confidence in the results, given that the samples were randomized several times during the process and each part of the experiment was done in batches due to the large number of samples.

Figure 4. Factorial experiment design. Design and visualization of all unique combinations of the factorial experiment.
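The factorial design can also be enumerated programmatically. The sketch below lists the factor levels given in the text and counts the unique combinations. The replicate scheme (one sample for each cold shock control combination, two for every other combination) is an assumption inferred from the stated totals of 36 combinations and 66 samples; all variable names and level labels are hypothetical.

```python
from itertools import product

# Factor levels as described in the text.
temperatures = [
    "heat control", "heat 30 min", "heat 2 h",
    "cold control", "cold 30 min", "cold 2 h",
]
salinities_g_per_l = {"low": 16.7, "normal": 33.3, "high": 66.6}
iron_levels = ["normal iron", "iron deficient"]

# Every unique (temperature, salinity, iron) combination.
combinations = list(product(temperatures, salinities_g_per_l, iron_levels))

# Assumption: cold shock control combinations have a single sample,
# all other combinations have two biological replicates.
replicates = {
    combo: (1 if combo[0] == "cold control" else 2) for combo in combinations
}
total_samples = sum(replicates.values())

print(len(combinations), total_samples)
```

Under this assumption the counts reproduce the 36 unique combinations and 66 samples reported above.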

The observed response of each individual variable was decomposed into the main effects of salt, iron and temperature and their interaction effects by ANOVA [52]. Hence, the numbers of genes significantly differentially expressed due to main effects or interaction effects can be checked separately. We found that 86% of the annotated genes were differentially expressed due to the main effects of iron deficiency, salinity or temperature when the significance threshold was set at FDR ≤ 0.05 (Fig. 5B).

The majority of genes that showed significant differential expression, i.e. 18089, were common to the salinity effect and the temperature effect. However, temperature and salinity each had a relatively large number of uniquely regulated genes: 3802 and 3166, respectively. We found that many genes were not differentially expressed under iron deficiency, consistent with what we found in the pilot study (Fig. 2B). This is not surprising, because iron usually exists at very low concentrations in sea surface water. For instance, the estimated iron concentration in the seawater where Chromera velia was isolated (South Pacific Ocean, 33°S, 151°E) is typically between 0.1 and 0.8 nM [33]. Previous studies have shown that the growth rate of C. velia does not increase significantly when the concentration of iron is greater than 0.1 µM, although the maximum cell yield does increase.

Here, the iron-deficient medium was prepared by adding citrate ions, which are strong ligands of ferric ions. These cations and anions form an insoluble compound that cannot be used by C. velia as an iron source. However, as stated by Sutak et al., the efficiency of this ligand is 98% at its maximum [42]. Given that the iron concentration of the C. velia medium is higher than the concentration of iron in the ocean, the remaining 2% of iron may be sufficient for C. velia to perform most, if not all, of its regular iron-requiring metabolism. In fact, it is assumed that the iron concentration of the oceans decreased significantly when the oceans shifted from a poorly oxygenated state to a well-oxygenated state. Iron thus became the limiting element for the primary producers of the oceans. However, most of the highly adaptable organisms appear to have evolved their morphology and physiology to cope well with Fe-deficient environments [53]. Since the relatives of C. velia (i.e. apicomplexan parasites) are well known for their successful adaptation to new conditions, the reduced Fe requirement of C. velia probably evolved at the time of the shift from the oxygen-poor state to the oxygen-rich state in the oceans.
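The argument about residual iron can be made concrete with a rough calculation. The sketch below assumes that the medium follows the standard f/2 recipe, with a total iron concentration of roughly 11.7 µM; this figure is not stated in the text and is an assumption used only for illustration. The 98% maximum ligand efficiency and the 0.1-0.8 nM seawater range are the values quoted above.

```python
# Assumption: total iron in standard f/2 medium (not stated in the text).
F2_IRON_MICROMOLAR = 11.7
# Maximum ligand efficiency reported for citrate in the text [42].
CHELATION_EFFICIENCY = 0.98
# Estimated iron range in seawater at the isolation site, in nM [33].
SEAWATER_IRON_NM = (0.1, 0.8)

# Iron left available after chelation, converted from µM to nM.
residual_iron_nm = F2_IRON_MICROMOLAR * (1.0 - CHELATION_EFFICIENCY) * 1_000

# How many times higher the residual is than the upper seawater estimate.
fold_over_seawater = residual_iron_nm / SEAWATER_IRON_NM[1]

print(f"residual iron: {residual_iron_nm:.0f} nM")
print(f"fold over upper seawater estimate: {fold_over_seawater:.0f}x")
```

Under this assumption, even the unchelated 2% would leave iron at a concentration well above the natural seawater range, which is consistent with the weak transcriptional response to the Fe (-) medium.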

Pacific Ocean salinity varies from 32 parts per thousand (ppt) to 37 ppt [54]. Although the optimum salinity for growth of C. velia is 33 ppt, we have observed that the cells can display almost normal growth at a salt level as low as 17 ppt (results not shown). However, the same was not observed when the salinity was increased to 66 ppt. In this case, the growth rate and the maximum cell yield of C. velia were significantly affected. We speculate that, since many of the photosynthetic relatives of C. velia are also found in freshwater (such as many species of the phylum Dinoflagellata), it likely still possesses functional conserved genes from freshwater ancestors.

Interestingly, when interaction effects were taken into account, 863 additional genes showed significant differential expression when the significance threshold was set at FDR ≤ 0.1 (Fig. 5C). These are genes that show significant differential expression under specific combinations of multiple conditions and are not predictable from individual stress treatments.

4439 genes that were silent in all control conditions were expressed under stress conditions, while only eleven genes were unique to the control samples (Fig. 5D). However, 6311 features remained silent. Most of these genes are probably transcribed into very few transcripts in the cell. Although we sequenced with quite good coverage, further increasing the number of sequence reads would help us to detect expression of those genes.

Figure 5. General outcome of the factorial experiment. Hierarchical clustering of 65 RNA-seq libraries based on their expression pattern, A. Venn diagrams of number of significantly expressed genes for each condition due to only main effect, B, and interaction effect included, C. Significance threshold was set at false discovery rate of FDR≤0.1 for interaction effects. 4439 genes that were silent in all control conditions were expressed under stress conditions, D.

[Panel D bar chart, number of expressed genes: None, 6311; Only stress, 4439; Only ctrl, 11; All conditions, 19843.]

Gene expression changes in multi-factorial experiments come from main and interaction effects. The main effect of each factor (i.e. temperature, salinity and iron) obtained after decomposition of the factorial experiment is expected to be similar to what we obtained from the individual factor treatments. To see whether this is the case in our experiment, we checked the correlation between the gene expression changes due to the individual and the factorial heat shock applications (i.e. the main effect of the heat shock application after ANOVA decomposition). We observed a high correlation between the two (Fig. 6), which indicates that the overall experimental (factorial design) and computational (ANOVA decomposition) strategy worked well.

Figure 6. Correlation of gene expression patterns between individual experiment and factorial experiment after 2h of heat treatment.
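A comparison of this kind can be sketched as a correlation between per-gene log2 fold changes from the two experiments. The code below is a minimal illustration, assuming a Pearson correlation on log2 fold changes with a pseudocount of 1 to avoid division by zero; the text does not specify either choice, and the function names and example values are hypothetical.

```python
import numpy as np


def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-gene log2 ratio of treated to control expression."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    return np.log2((treated + pseudocount) / (control + pseudocount))


def pearson_r(x, y):
    """Pearson correlation coefficient between two vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical FPKM values for five genes after 2 h of heat treatment.
pilot_lfc = log2_fold_change([31, 7, 3, 15, 0], [0, 3, 7, 15, 1])
factorial_lfc = log2_fold_change([63, 15, 1, 7, 0], [1, 7, 3, 7, 3])

r = pearson_r(pilot_lfc, factorial_lfc)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

On this scale, a 16-fold increase corresponds to a log2 value of 4 and a 32-fold increase to a value of 5.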

Given that C. velia is a relatively newly described species, the genome of the organism has not yet been well annotated. To gain clues to the putative functions of genes, we examined the presence of protein domains in C. velia genes. The Pfam database is a large collection of protein families. A domain is a structural and functional unit of a protein. To search protein sequences against Pfam, we used HMMER, a program package for protein sequence similarity searches using probabilistic methods [55].

Proteins with HSP domains, also called molecular chaperones or heat shock proteins (HSPs), are universal proteins that support accurate protein folding inside a cell. They aid in the proper folding of newly formed proteins and also help with the refolding of proteins denatured by environmental stress [21, 23, 24]. So far, five major HSP families have been described, categorized according to their molecular size (HSP100, HSP90, HSP70, HSP60 and the small HSPs) [22]. Heat shock proteins are also involved in many other cellular processes, including preventing protein aggregation, protein degradation, protein trafficking and maintenance of protein conformation.

Here, we identified 24 proteins with HSP domains: five genes with HSP20 domains, 12 genes with HSP70 domains and seven genes with HSP90 domains. We examined the gene expression profiles of the 24 proteins with HSP domains. Of those, 23 genes were significantly differentially expressed (FDR ≤ 0.05). Nineteen of the genes were up-regulated under heat shock treatment while the other four were down-regulated. Interestingly, expression of all of the HSP20 genes increased 16- to 32-fold in response to heat shock (Fig. 7A). However, two HSP70 genes and two HSP90 genes were down-regulated due to heat treatment. This is expected, because some of these proteins are only expressed under stress conditions (strictly inducible), while others are not heat-inducible (constitutive) and are present in cells under normal growth conditions.

HSP genes displayed a statistically significant correlation between the pilot and the main multi-factorial experiments, R2=0.93 (Fig. 7B).

Figure 7. Effect of heat shock treatment on genes with HSP domains. Log2 values of the fold change of the significantly differentially expressed genes compared to control groups, A. Correlation of HSP gene expression patterns between individual experiment (x-axis) and factorial experiment (y-axis) after 2 h. of heat treatment, R2=0.93, B.

The fact that HSP20 genes are evolutionarily well conserved among apicomplexan parasites suggests the importance of these genes for the parasites. In fact, recent studies showed that HSP20 proteins are involved in the motile life stages of some apicomplexan parasites such as Plasmodium spp., Toxoplasma gondii and Neospora caninum [24-26]. We observed significant up-regulation of all HSP20 genes in Chromera cells in response to heat shock. Hence, in addition to their role in assisting the proper folding of intracellular proteins, we wanted to see whether these genes have essential roles in the movement of the bi-flagellated zoospores of C. velia, as they do in the apicomplexan relatives of the species. Although the motile life phase of C. velia has not yet been studied in detail, a study of the transformation between the immotile and motile life phases of the species showed that motility and salinity have an inverse relationship [10]. Interestingly, the expression of HSP20 genes was also suppressed under high salt, and the same set of genes displayed increased expression under low salt (Fig. 8A). In fact, dynein proteins, which are microtubule motor proteins, showed an expression pattern similar to that of the HSP20 proteins (Fig. 8B). This is exciting because dynein proteins have two main functions in a cell: first, they mediate the movement of vesicles and organelles along microtubules within the cell, and second, they are among the key proteins in the movement and structure of flagella and cilia. Together, these observations lead us to ask whether HSP20 has any role in the movement of C. velia cells. However, we have noted that the particular strain used in this experiment forms very few flagellated cells during its life cycle under the culture conditions used, which raises the possibility that HSP20 has additional roles in C. velia. Lastly, the expression of all HSP20 genes decreased in a similar manner due to iron deficiency (Fig. 8C).

Figure 8. Effect of salinity and iron on expression of HSP20 and dynein transcripts. RNA-seq expression pattern of the HSP20 genes under different salt concentrations (A), and deficiency of iron (C). Expression profiles of dyneins under salinity (B).

3.4 Differentially expressed genes and cellular pathways

For a comprehensive understanding of the biological roles of a gene in a cell, a systems biology approach, which looks at the expression levels of sets of genes at once, needs to be used. However, this approach works best with a genome that has relatively complete gene annotation. The lack of good annotation for the C. velia genes makes it challenging to understand the cellular metabolic processes underlying the numerous significantly differentially expressed genes we discovered (Fig. 5A, B). Nevertheless, C. velia still has numerous well-annotated genes, which encouraged us to look into the systemic distribution and regulation of those genes across various metabolic pathways. Hence, we BLASTed the significantly differentially expressed (up- and down-regulated) genes of each condition against the relatively well-established KEGG (Kyoto Encyclopedia of Genes and Genomes) cellular pathways database [56]. Here, we used the KAAS (KEGG Automatic Annotation Server) pathway reconstruction server, which enables reconstruction of KEGG pathways based on sequence similarity and the bi-directional hit rate of query genes. Figure 9 shows a global view of the changes in metabolic pathways due to the combination of all stresses (for a detailed summary of each individual stress, refer to Appendix E, 1-8).

Figure 9. General representation of KEGG metabolic pathways. Green lines display modified steps of pathways due to one of the three applied stresses, grey lines represent unchanged or absent metabolic processes in C. velia. Generated using KAAS (KEGG Automatic Annotation Server).

We found that most of the differentially expressed genes were down-regulated under salt stresses, while the numbers of up- and down-regulated genes were relatively similar under cold and heat shock treatments (Fig. 10A). This suggests that temperature fluctuations, instead of shutting down the majority of processes within the cell, force the cells to modify the regulation of pathways so that they can cope with the stress. In contrast, the high number of down-regulated genes under salt stresses implies that minimizing fundamental, growth-related cellular processes is the preferred strategy for coping with this type of stress. When the significantly differentially expressed genes (Fig. 10A) were BLASTed against KEGG pathways, between 10% and 20% of them had hits to specific pathways (Fig. 10B and Appendix D).

[Bar charts. A: Total DE genes (FDR≤0.05). B: Annotated genes with hits. x-axis: stress conditions (Cold 2hr, Heat 2hr, High Salt, Low Salt); y-axis: genes; legend: Down, Up.]

Figure 10. Number of significantly up- and down- regulated genes for each condition, A. Number of genes that have both annotations and hit to the at least one of the KEGG pathways, B. To see a list of the pathways those genes were involved in, refer to Appendix-D.

Genes involved in anabolic pathways, such as the biosynthesis of secondary metabolites, the biosynthesis of amino acids and carbon metabolism, were mostly down-regulated (Fig. 11A, B). This is especially the case under temperature fluctuations.

[Bar charts. A: Aminoacyl-tRNA biosynthesis. B: Biosynthesis of amino acids. x-axis: stress conditions (Cold 2hr, Heat 2hr, High Salt, Low Salt); y-axis: genes.]

Figure 11. Precursory metabolic pathways for synthesis of proteins. Synthesis of aminoacyl-tRNA enzymes (A) and of various amino acids (B) is significantly down-regulated due to heat and cold shocks, while almost unchanged under high and low salt conditions.

Interestingly, genes that are directly involved in the regulation and modification of RNA molecules (e.g. RNA polymerases, nucleotide metabolism, spliceosomes) were significantly up-regulated under heat and cold shock stresses, while down-regulated under both high and low salt levels (Fig. 12A, B; for a detailed summary, refer to Appendix D). This may suggest that the initial response of C. velia to growth perturbation happens at the level of RNA metabolism. We speculate that this increases the production and modification of a variety of RNAs and their translation into a diversity of proteins, most likely stress-responsive proteins such as chaperones. After a certain time, however, once the required proteins have been synthesized, the production and modification of RNA returns to normal or even decreases, and new stress-responsive RNAs are likely to be produced only to renew older proteins. This makes sense, since RNA was extracted from the heat- and cold-treated samples within at most 2 hr of treatment, whereas the cultures used for the high and low salt stress studies were grown in the low and high salt media for 11 days. Exposing the cells to the two different stresses for the same period of time would be a simple way to test this hypothesis.

The spliceosome is a large molecular complex that removes the noncoding intron regions of precursor mRNA and splices together the remaining coding exonic regions [57]. In addition to canonical splicing, spliceosomes perform alternative splicing to generate mature mRNAs from various combinations of exons. This is a major way of increasing the diversity of gene products in eukaryotes. Hence, the significant up-regulation of spliceosome components in C. velia under heat and cold shock stresses may indicate that, in addition to expression changes at the transcriptional level, the cells are able to increase mRNA diversity significantly at the post-transcriptional level by constructing different kinds of mature mRNA molecules through alternative splicing. This characteristic of C. velia may mislead us when estimating the total number of expressed features in the C. velia genome. In fact, several RNA molecules formed by alternative splicing are expected to have different exon boundaries, which will make the annotation of those genes much more challenging.

[Figure 12: bar charts of the number of genes (y-axis: Genes) per stress condition (x-axis: Stress conditions). Panels: A, Spliceosomes; B, RNA polymerase; C, Purine metabolism.]

Figure 12. Some of the crucial pathways for synthesis and regulation of new RNA molecules. Genes that are involved in synthesis (B) and splicing (A) of premature RNA molecules are significantly up-regulated due to heat shock treatment. Those sets of genes plus genes that are involved in purine metabolism (C) are down-regulated under high and low salt stresses.

Most genes of the cell cycle and DNA replication pathways were also significantly down-regulated under stress conditions (Fig. 13A-C). This agrees with what we observed while culturing the cells under stress conditions, i.e. a slowdown of cell growth. However, these pathways were almost completely shut down under the salt stresses, which makes sense, since it would take some time for a cell to completely stop proliferating after receiving a stress stimulus (the salt-stressed cultures were grown for 11 days, whereas the heat-shocked samples were collected within 2 hr).

[Figure 13: bar charts of the number of genes (y-axis: Genes) per stress condition (x-axis: Heat 2hr, High Salt, Low Salt). Panels: A, Cell Cycle; B, DNA Replication; C, Mismatch Repair.]

Figure 13. Pathways that are directly involved in multiplication of the cells. C. velia proliferation was significantly down-regulated under the heat shock, and almost completely shut down under the salt stress.


Chapter 4

Conclusions

In this project, we performed transcriptional profiling of C. velia under various environmental conditions. We used a controlled experimental system to apply single stressors and combinations of stressors, producing diverse environmental conditions. We initially performed pilot experiments to optimize the conditions for the heat shock response and for iron deficiency and induction. Based on what we learned from these pilot studies, we set up a multi-factorial experiment.

We determined that the gene expression responses we detected for individual stresses were quite similar to those obtained from the combinatorial experiment. This means that the stresses we applied were comparable, and it supports our strategy of performing combinations of stresses.

A large number of genes were significantly differentially expressed due to interaction effects of stresses. By performing a factorial experiment, we obtained relatively complete sets of the transcript profiles responding to single and combined stresses. Having a relatively complete set of genes is important, especially for studying the interaction effects between genes and the responsible metabolic pathways.
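To make the notion of an interaction effect concrete, the sketch below computes an interaction contrast for a single gene in a 2x2 design (control, stress A, stress B, both). It is only an illustration under stated assumptions: the function name interaction_log2 and the expression values are hypothetical, and this is not the statistical model used in the thesis.

```python
import math


def interaction_log2(control, stress_a, stress_b, combined):
    """Interaction contrast on the log2 scale for a 2x2 design.

    Returns the combined log2 fold change minus the sum of the two
    single-stress log2 fold changes. Zero means the two stresses act
    additively on the log scale; a nonzero value indicates interaction.
    """
    lfc_a = math.log2(stress_a / control)
    lfc_b = math.log2(stress_b / control)
    lfc_ab = math.log2(combined / control)
    return lfc_ab - (lfc_a + lfc_b)


# Hypothetical expression values in arbitrary units (not thesis data).
additive = interaction_log2(control=10, stress_a=20, stress_b=40, combined=80)
synergy = interaction_log2(control=10, stress_a=20, stress_b=40, combined=320)

print("additive case:", additive)
print("synergistic case:", synergy)
```

In a real analysis the contrast would be estimated from replicated samples and tested for significance, for example within an ANOVA-type model, rather than computed from single values.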

We observed that over four thousand genes that are transcriptionally silent under normal growth conditions were expressed upon exposure to at least one stress or a combination of stress conditions.
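The selection of such stress-induced genes can be sketched as a simple filter. This is a minimal illustration only: the gene names, condition names, expression values and the threshold of 1.0 are all hypothetical and are not taken from the thesis.

```python
def stress_induced(expr, control="control", threshold=1.0):
    """Return genes below `threshold` in the control condition but at or
    above it in at least one other (stress) condition."""
    induced = []
    for gene, levels in expr.items():
        silent_in_control = levels[control] < threshold
        on_in_any_stress = any(
            value >= threshold
            for cond, value in levels.items()
            if cond != control
        )
        if silent_in_control and on_in_any_stress:
            induced.append(gene)
    return sorted(induced)


# Hypothetical normalized expression values (not thesis data).
expr = {
    "gene_a": {"control": 0.0, "heat": 12.5, "high_salt": 0.2},
    "gene_b": {"control": 8.1, "heat": 30.0, "high_salt": 7.9},
    "gene_c": {"control": 0.1, "heat": 0.3, "high_salt": 0.0},
    "gene_d": {"control": 0.4, "heat": 0.2, "high_salt": 5.6},
}

result = stress_induced(expr)
print(result)
```

The choice of threshold strongly affects how many genes are called silent, so any count obtained this way should be reported together with the cutoff used.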

We found that members of HSP20, a well-known heat shock protein family, showed significant expression changes under stress conditions, and we discussed their possible involvement in the movement of the cells.


The expression of most metabolic pathways changed under stress conditions. The cells responded to stresses by significantly reducing their proliferation and moderately cutting down most of the anabolic pathways that are routine under normal conditions. Significant up-regulation of spliceosome genes during heat and cold shock stresses suggests that, in addition to regulating gene expression at the transcriptional level, C. velia also increases the diversity of mRNAs through alternative splicing during the initial stages of a stress stimulus.


Future studies

It is known that the transcriptome of an organism is substantially correlated with its proteome. However, this correlation can be modest for some species due to post-transcriptional modifications and the varying stabilities of proteins. Although this association has not been examined comprehensively in algae, previous studies in yeast and mouse suggest an important but modest correlation between the levels of transcripts and the levels of proteins [58, 59]. Hence, this transcriptome study may benefit from a study that examines the proteome of C. velia under the same conditions used here. Furthermore, the expression changes of HSP20 and other promising genes can be validated by reverse transcription-quantitative polymerase chain reaction (RT-qPCR).
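A transcript-protein comparison of the kind proposed above is often summarized with a rank correlation. The sketch below implements Spearman's coefficient in plain Python as one possible way to do this; the transcript and protein values are hypothetical placeholders, and no proteome data exist in this work.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical abundance values for five genes (not thesis data).
transcript = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(transcript, protein)
print(rho)
```

A rank-based measure is used here because transcript and protein abundances are on different scales and are rarely linearly related.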

Also, the same study can be conducted using the second member of Chromerida, Vitrella brassicaformis. Since the current work reveals the characteristics of C. velia, adding the second species would be valuable for distinguishing core conserved features of chromerids from C. velia-specific characteristics.

A systems biology approach is a useful method for gaining a comprehensive understanding of the biological roles of a gene in a cell. This approach works best when a genome with a relatively complete gene annotation is available. Since a significant number of the C. velia genes do not have well-annotated protein domains or orthologous genes from closely related species, different strategies to reannotate these genes are necessary. Hence, the data can be used to further improve the genome annotation of C. velia by clustering genes with similar expression profiles [60].
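The idea of transferring annotation between genes with similar expression profiles can be illustrated with a nearest-neighbour lookup. This is a deliberately simplified stand-in for clustering and is not the method of reference [60]; the gene names, profiles, annotations and the helper nearest_annotated are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def nearest_annotated(profiles, annotations, gene):
    """Find the annotated gene whose expression profile correlates best
    with `gene`; return (that gene, its annotation, the correlation)."""
    best = max(
        (g for g in annotations if g != gene),
        key=lambda g: pearson(profiles[gene], profiles[g]),
    )
    return best, annotations[best], pearson(profiles[gene], profiles[best])


# Hypothetical expression profiles across four conditions (not thesis data).
profiles = {
    "known_hsp": [1.0, 8.0, 7.5, 1.2],
    "known_ribo": [5.0, 2.0, 2.2, 5.1],
    "unknown_1": [0.9, 6.0, 6.4, 1.0],
}
annotations = {
    "known_hsp": "heat shock protein",
    "known_ribo": "ribosomal protein",
}

best, label, r = nearest_annotated(profiles, annotations, "unknown_1")
print(best, label, round(r, 3))
```

A co-expression hit of this kind is only a hypothesis about function; it would need support from sequence similarity or experimental validation before being used as an annotation.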


In future, revisiting the current data for further analysis would shed light on many aspects of metabolic pathways and the regulation of genome-wide gene expression changes. Furthermore, investigating the genes that are involved in transcriptional regulation will help us to see the bigger picture of C. velia cellular processes under stress conditions.


REFERENCES

1. Moore, R.B., et al., A photosynthetic alveolate closely related to apicomplexan parasites. Nature, 2008. 451(7181): p. 959-63.

2. Manguin, S., et al., Review on global co-transmission of human Plasmodium species and Wuchereria bancrofti by mosquitoes. Infect Genet Evol, 2010. 10(2): p. 159-77.

3. Tenter, A.M., A.R. Heckeroth, and L.M. Weiss, Toxoplasma gondii: from animals to humans. Int J Parasitol, 2000. 30(12-13): p. 1217-58.

4. Sharman, P.A., et al., Chasing the golden egg: vaccination against poultry coccidiosis. Parasite Immunol, 2010. 32(8): p. 590-8.

5. Trees, A.J., et al., Towards evaluating the economic impact of bovine neosporosis. Int J Parasitol, 1999. 29(8): p. 1195-200.

6. Weatherby, K. and D. Carter, Chromera velia: The Missing Link in the Evolution of Parasitism. Adv Appl Microbiol, 2013. 85: p. 119-44.

7. Cumbo, V.R., Antimicrobial compounds in the scleractinian corals Montipora digitata and Montipora tortuosa. Honours Thesis. University of New South Wales, p. 86, 2005.

8. Cumbo, V.R., et al., Chromera velia is Endosymbiotic in Larvae of the Reef Corals Acropora digitifera and A. tenuis. Protist, 2013. 164(2): p. 237-244.

9. Obornik, M., et al., Morphology and ultrastructure of multiple life cycle stages of the photosynthetic relative of apicomplexa, Chromera velia. Protist, 2011. 162(1): p. 115-30.

10. Guo, J.T., et al., Effect of Nutrient Concentration and Salinity on Immotile-Motile Transformation of Chromera velia. Journal of Eukaryotic Microbiology, 2010. 57(5): p. 444-446.

11. Obornik, M. and J. Lukes, Cell biology of chromerids: autotrophic relatives to apicomplexan parasites. Int Rev Cell Mol Biol, 2013. 306: p. 333-69.

12. Pawlowski, J., et al., CBOL protist working group: barcoding eukaryotic richness beyond the animal, plant, and fungal kingdoms. PLoS Biol, 2012. 10(11): p. e1001419.


13. Fleige, T., J. Limenitakis, and D. Soldati-Favre, Apicoplast: keep it or leave it. Microbes Infect, 2010. 12(4): p. 253-62.

14. Zhang, Z., B.R. Green, and T. Cavalier-Smith, Phylogeny of ultra-rapidly evolving dinoflagellate chloroplast genes: a possible common origin for sporozoan and dinoflagellate plastids. J Mol Evol, 2000. 51(1): p. 26-40.

15. Keeling, P.J., Evolutionary biology: bridge over troublesome plastids. Nature, 2008. 451(7181): p. 896-7.

16. Obornik, M., et al., Evolution of the apicoplast and its hosts: From heterotrophy to autotrophy and back again. International Journal for Parasitology, 2009. 39(1): p. 1-12.

17. Obornik, M., et al., Morphology, ultrastructure and life cycle of Vitrella brassicaformis n. sp., n. gen., a novel chromerid from the Great Barrier Reef. Protist, 2012. 163(2): p. 306-23.

18. Quigg, A., et al., Photosynthesis in Chromera velia represents a simple system with high efficiency. PLoS One, 2012. 7(10): p. e47036.

19. Warner, M.E., W.K. Fitt, and G.W. Schmidt, Damage to photosystem II in symbiotic dinoflagellates: a determinant of coral bleaching. Proc Natl Acad Sci U S A, 1999. 96(14): p. 8007-12.

20. Tchernov, D., et al., Apoptosis and the selective survival of host animals following thermal bleaching in zooxanthellate corals. Proc Natl Acad Sci U S A, 2011. 108(24): p. 9905-9.

21. De Maio, A., Heat shock proteins: facts, thoughts, and dreams. Shock, 1999. 11(1): p. 1-12.

22. Li, Z. and P. Srivastava, Heat-shock proteins. Curr Protoc Immunol, 2004. Appendix 1: p. Appendix 1T.

23. Wu, C., Heat shock transcription factors: structure and regulation. Annu Rev Cell Dev Biol, 1995. 11: p. 441-69.

24. Montagna, G.N., K. Matuschewski, and C.A. Buscaglia, Small heat shock proteins in cellular adhesion and migration: Evidence from Plasmodium genetics. Cell Adhesion & Migration, 2012. 6(2): p. 78-84.

25. Montagna, G.N., et al., Critical Role for Heat Shock Protein 20 (HSP20) in Migration of Malarial Sporozoites. Journal of Biological Chemistry, 2012. 287(4): p. 2410-2422.


26. Coceres, V.M., et al., Rabbit antibodies against Toxoplasma Hsp20 are able to reduce parasite invasion and gliding motility in Toxoplasma gondii and parasite invasion in Neospora caninum. Experimental Parasitology, 2012. 132(2): p. 274-281.

27. Martin, J.H. and S.E. Fitzwater, Iron-Deficiency Limits Phytoplankton Growth in the Northeast Pacific Subarctic. Nature, 1988. 331(6154): p. 341-343.

28. Price, N.M., B.A. Ahner, and F.M.M. Morel, The Equatorial Pacific-Ocean - Grazer-Controlled Phytoplankton Populations in an Iron-Limited Ecosystem. Limnology and Oceanography, 1994. 39(3): p. 520-534.

29. Milligan, A.J. and P.J. Harrison, Effects of non-steady-state iron limitation on nitrogen assimilatory enzymes in the marine diatom Thalassiosira weissflogii (Bacillariophyceae). Journal of Phycology, 2000. 36(1): p. 78-86.

30. Li, D.X., et al., Effect of iron stress, light stress, and nitrogen source on physiological aspects of marine red tide alga. Journal of Plant Nutrition, 2004. 27(1): p. 29-41.

31. Kosman, D.J., Molecular mechanisms of iron uptake in fungi. Mol Microbiol, 2003. 47(5): p. 1185-97.

32. Philpott, C.C., Iron uptake in fungi: a system for every source. Biochim Biophys Acta, 2006. 1763(7): p. 636-45.

33. Turner, D.R., K.A. Hunter, and H.J.W. de Baar, Introduction, in The Biogeochemistry of Iron in Seawater, D.R. Turner and K.A. Hunter, Editors. 2001, John Wiley & Sons Ltd: Chichester, UK. p. 1-7.

34. Munns, R., Comparative physiology of salt and water stress. Plant Cell Environ, 2002. 25(2): p. 239-250.

35. Abogadallah, G.M., Antioxidative defense under salt stress. Plant Signal Behav, 2010. 5(4): p. 369-74.

36. Rasmussen, S., et al., Transcriptome Responses to Combinations of Stresses in Arabidopsis. Plant Physiology, 2013. 161(4): p. 1783-1794.

37. Prasch, C.M. and U. Sonnewald, Simultaneous Application of Heat, Drought, and Virus to Arabidopsis Plants Reveals Significant Shifts in Signaling Networks. Plant Physiology, 2013. 162(4): p. 1849-1866.


38. Oshlack, A., M.D. Robinson, and M.D. Young, From RNA-seq reads to differential expression results. Genome Biology, 2010. 11(12).

39. Wang, Z., M. Gerstein, and M. Snyder, RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics, 2009. 10(1): p. 57-63.

40. Fernandes, A.D., et al., ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-Seq. PLoS One, 2013. 8(7): p. e67019.

41. Fisher, R., The Arrangement of Field Experiments. Journal of the Ministry of Agriculture of Great Britain, 1926.

42. Sutak, R., et al., Nonreductive Iron Uptake Mechanism in the Marine Alveolate Chromera velia. Plant Physiology, 2010. 154(2): p. 991-1000.

43. Bolger, A.M., M. Lohse, and B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 2014.

44. Trapnell, C., L. Pachter, and S.L. Salzberg, TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 2009. 25(9): p. 1105-11.

45. Roberts, A., et al., Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics, 2011. 27(17): p. 2325-2329.

46. Trapnell, C., et al., Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 2012. 7(3): p. 562-578.

47. Kerr, M.K., M. Martin, and G.A. Churchill, Analysis of variance for gene expression microarray data. J Comput Biol, 2000. 7(6): p. 819-37.

48. Cui, X., et al., Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics, 2005. 6(1): p. 59-75.

49. Storey, J.D. and R. Tibshirani, Statistical significance for genomewide studies. Proc Natl Acad Sci U S A, 2003. 100(16): p. 9440-5.

50. Geider, R.J. and J. Laroche, The Role of Iron in Phytoplankton Photosynthesis, and the Potential for Iron-Limitation of Primary Productivity in the Sea. Photosynthesis Research, 1994. 39(3): p. 275-301.

51. Rosic, N.N., et al., Gene expression profiles of cytosolic heat shock proteins Hsp70 and Hsp90 from symbiotic dinoflagellates in response to thermal stress: possible implications for coral bleaching. Cell Stress & Chaperones, 2011. 16(1): p. 69-80.

52. Kerr, M.K. and G.A. Churchill, Statistical design and the analysis of gene expression microarray data (Reprinted from Genet. Res., Camb., vol 77, pg 123-128, 2001). Genetics Research, 2007. 89(5-6): p. 509-514.

53. Brand, L.E., Minimum Iron Requirements of Marine-Phytoplankton and the Implications for the Biogeochemical Control of New Production. Limnology and Oceanography, 1991. 36(8): p. 1756-1771.

54. "Pacific Ocean: Salinity", Encyclopædia Britannica. Retrieved 2013.

55. Finn, R.D., J. Clements, and S.R. Eddy, HMMER web server: interactive sequence similarity searching. Nucleic Acids Research, 2011. 39: p. W29-W37.

56. Moriya, Y., et al., KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res, 2007. 35(Web Server issue): p. W182-5.

57. Patel, A.A. and J.A. Steitz, Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol, 2003. 4(12): p. 960-70.

58. Foss, E.J., et al., Genetic basis of proteome variation in yeast. Nat Genet, 2007. 39(11): p. 1369-75.

59. Ghazalpour, A., et al., Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet, 2011. 7(6): p. e1001393.

60. Hu, G., et al., Transcriptional profiling of growth perturbations of the human malaria parasite Plasmodium falciparum. Nat Biotechnol, 2010. 28(1): p. 91-8.


APPENDICES


Appendix A – Pilot Experiment: Heat Shock treatment Hiseq-2000 sequencing output data summary

                GC content (%)  Avg. length (bp)   Reads (million)   Bases ≥ Q30 (%)
Sample name        R1       R2       R1       R2       R1       R2       R1       R2
Control-1       53.76    53.54      101      101    15.54    15.54    86.23    83.83
Control-2       52.68     52.4      101      101    15.27    15.27    86.26     83.6
Control-3       52.94    52.64      101      101    12.64    12.64    86.98    84.46
30 min.-1       52.73    52.43      101      101    19.79    19.79    87.05    84.52
30 min.-2       53.77    53.54      101      101    15.77    15.77    86.01    82.03
2 h.-1          52.63    52.57      101      101    19.06    19.06    86.06    82.51
2 h.-2          52.91    52.63      101      101     15.1     15.1    86.84    83.96
2 h.-3          52.56     52.7      101      101    57.62    57.62    86.72    83.61
4 h.-1          51.16    51.37      101      101     7.52     7.52    85.73    85.73
4 h.-2          53.15    52.82      101      101    22.52    22.52    86.78    83.54


Appendix B – Pilot Experiment: Fe deficiency and induction treatments Hiseq-2000 sequencing output data summary

              Avg. length (bp)    GC content (%)   Reads (million)   Bases ≥ Q30 (%)
Sample name        R1       R2       R1       R2       R1       R2       R1       R2
Fe(-)-1           101      101    52.93    52.86    15.26    15.26    94.78    92.56
Fe(-)-2           101      101     53.3    53.11    13.95    13.95    94.73    92.34
Fe(-)-3           101      101    53.46    53.25    20.98    20.98    94.57    91.79
Fe(+)-1           101      101    53.35    53.07    21.99    21.99    94.69    92.31
Fe(+)-2           101      101    53.58     53.3    13.25    13.25    94.65    91.77
Fe(+)-3           101      101    53.54    53.36    13.52    13.52    94.62    92.21
Fe Ind-1          101      101    53.29     53.1    19.23    19.23    94.73     92.3
Fe Ind-2          101      101    53.48    53.27    29.78    29.78    94.59    92.02
Fe Ind-3          101      101    53.51    53.32    26.07    26.07    94.72     92.3


Appendix C – Multi-Factorial Experiment Hiseq-2000 sequencing output data summary (Single end)

All values are for read 1 (R1).

S.No  Sample name                GC content (%)  Avg. length (bp)  Reads (million)  Bases ≥ Q30 (%)
1     C.v_Combination_Stress-11           52.84                51            26.14            95.01
2     C.v_Combination_Stress-14           53.38                51            28.85            94.95
3     C.v_Combination_Stress-26           51.86                51            20.89            95.04
4     C.v_Combination_Stress-2            53.17                51            53.47            92.35
5     C.v_Combination_Stress-41           53.45                51            26.80            93.98
6     C.v_Combination_Stress-48           52.98                51            19.51            93.87
7     C.v_Combination_Stress-57           53.15                51            21.57            93.79
8     C.v_Combination_Stress-61           52.94                51            21.96            94.29
9     C.v_Combination_Stress-18           52.03                51            38.10            93.76
10    C.v_Combination_Stress-28           53.45                51            50.33            94.10
11    C.v_Combination_Stress-34           51.86                51            24.26            94.84
12    C.v_Combination_Stress-38           53.04                51            14.77            93.98
13    C.v_Combination_Stress-40           53.35                51            20.31            95.14
14    C.v_Combination_Stress-46           53.02                51            28.58            93.62
15    C.v_Combination_Stress-47           52.69                51            22.25            94.09
16    C.v_Combination_Stress-49           53.09                51            44.49            92.57
17    C.v_Combination_Stress-52           53.47                51            31.39            94.18
18    C.v_Combination_Stress-56           52.21                51            18.96            93.72
19    C.v_Combination_Stress-59           52.82                51            29.07            93.97
20    C.v_Combination_Stress-63           51.54                51            24.18            93.66
21    C.v_Combination_Stress-9            52.06                51            26.30            93.78
22    C.v_Combination_Stress-10           52.83                51            27.16            94.94
23    C.v_Combination_Stress-13           52.02                51            23.62            94.33
24    C.v_Combination_Stress-15           53.29                51            39.64            93.06
25    C.v_Combination_Stress-16           53.39                51            27.05            95.00
26    C.v_Combination_Stress-19           52.43                51            25.87            93.94
27    C.v_Combination_Stress-20           52.55                51            25.04            94.62
28    C.v_Combination_Stress-21           52.93                51            23.68            95.09
29    C.v_Combination_Stress-22           52.17                51            22.77            95.16
30    C.v_Combination_Stress-25           53.58                51            24.99            94.14
31    C.v_Combination_Stress-27           52.77                51            35.99            93.13
32    C.v_Combination_Stress-29           52.21                51            11.80            92.28
33    C.v_Combination_Stress-30           52.58                51            19.80            94.58
34    C.v_Combination_Stress-31           52.83                51            24.12            94.53
35    C.v_Combination_Stress-33           53.04                51            23.82            93.64
36    C.v_Combination_Stress-3            51.22                51            22.04            95.03
37    C.v_Combination_Stress-39           51.31                51            22.76            93.94
38    C.v_Combination_Stress-42           53.28                51            40.62            92.55
39    C.v_Combination_Stress-4            53.04                51            23.60            94.51
40    C.v_Combination_Stress-53           53.83                51            33.29            93.93
41    C.v_Combination_Stress-54           51.91                51            27.31            94.68
42    C.v_Combination_Stress-7            53.43                51            29.74            93.62
43    C.v_Combination_Stress-12           53.17                51            31.65            93.14
44    C.v_Combination_Stress-17           53.67                51            23.86            95.07
45    C.v_Combination_Stress-1            53.37                51            24.81            94.12
46    C.v_Combination_Stress-23           53.05                51            20.56            94.61
47    C.v_Combination_Stress-24           51.79                51            21.53            94.21
48    C.v_Combination_Stress-32           52.93                51            21.15            95.11
49    C.v_Combination_Stress-35           53.30                51            25.13            93.93
50    C.v_Combination_Stress-36           51.06                51            23.24            94.37
51    C.v_Combination_Stress-37           52.38                51            18.83            94.28
52    C.v_Combination_Stress-43           53.39                51            29.78            94.64
53    C.v_Combination_Stress-44           53.11                51            24.72            93.41
54    C.v_Combination_Stress-45           53.46                51            39.78            93.69
55    C.v_Combination_Stress-51           53.43                51            26.21            93.93
56    C.v_Combination_Stress-55           53.58                51            27.99            94.36
57    C.v_Combination_Stress-58           53.89                51            36.56            92.65
58    C.v_Combination_Stress-5            53.28                51            27.56            94.11
59    C.v_Combination_Stress-60           53.06                51            35.28            94.04
60    C.v_Combination_Stress-62           53.48                51            36.15            91.85
61    C.v_Combination_Stress-64           53.43                51            31.10            93.89
62    C.v_Combination_Stress-65           52.73                51            22.79            94.28
63    C.v_Combination_Stress-6            52.61                51            32.10            87.62
64    C.v_Combination_Stress-8            52.57                51            22.20            94.51
65    C.v_Combination_Stress-66           52.61                51            25.83            93.54
      Average                             52.84             51.00            27.32            93.95


Appendix D – Number of DE genes and their involved KEGG pathways

  Cold 2hr    Heat 2hr    High Salt   Low Salt
  Down    Up  Down    Up  Down    Up  Down    Up   Name of cellular pathway
   507   421  8104  8954  8303  6242  7960  4109   Total # of differentially expressed genes (FDR≤0.05)
   133    94  1058  1070  1254   664  1147   658   Total # of annotated genes with hits
    30    15   283   223   256   170   251   185   Metabolic pathways
    13     9   116    77   100    71   102    85   Biosynthesis of secondary metabolites
     7     1    46    25    38    28    33    33   Biosynthesis of amino acids
     5     1    27    26    34    16    27    18   RNA transport
     5     1    19    10    15    13    16    11   Aminoacyl-tRNA biosynthesis
     5     0    32    32    40     6    39    12   Pyrimidine metabolism
     5     0    10    11    10     9     8    15   Glycerophospholipid metabolism
     4     0    17     7    18     9    11    11   Arginine and proline metabolism
     3     0    10     4     6    10     7     6   Tryptophan metabolism
     3     2    11     7    10     7     6     5   Inositol phosphate metabolism
     3     2    37    35    45    11    50    19   Purine metabolism
     3     0    26    29    42    11    36     8   Protein processing in endoplasmic reticulum
     3     1    19    12    14    15    13    13   Glycine, serine and threonine metabolism
     3     1     9     4     2     5     5     5   Selenocompound metabolism
     3     1    17    15    25     9    20     9   Lysosome
     3     2    10     7    11     5     8     6   Phosphatidylinositol signaling system
     3     0     5     7     6     2     3     3   Basal transcription factors
     3     1    21    12    24     1    21     3   Nucleotide excision repair
     3     1    18    20    27     7    27     9   RNA degradation
     3     1    13    10    17     5    14     7   Pyruvate metabolism
     3     0     9     6    10     6     5     7   beta-Alanine metabolism
     2     1     6     7     3     6     5     3   Folate biosynthesis
     2     0     6    12    12     1     8     4   RNA polymerase
     2     3    15    13    18    10    14    13   Glycolysis / Gluconeogenesis
     2     2    42    26    41    24    33    25   Carbon metabolism
     2     1     9    11     2    16     5    14   Porphyrin and chlorophyll metabolism
     2     0    11    17    23     4    18     8   mRNA surveillance pathway
     2     0     6     1     3     2     5     2   Aminobenzoate degradation
     2     1     6     8     6     7    12     7   Fatty acid degradation
     2     1    11    10     6    12     6     8   Cysteine and methionine metabolism
     2     0     3     8     6     0     4     4   Cytosolic DNA-sensing pathway
     2     2    10     9    13     7    12     7   Valine, leucine and isoleucine degradation
     2     0    11     8    11    11     7    11   Glycerolipid metabolism
     2     0    10     6    11     4     8     3   Lysine degradation
     2     1     5     7    11     3     6     4   Propanoate metabolism
     2     0     5     7     7     3     5     5   Pantothenate and CoA biosynthesis
     2     0     7     2     5     3     4     3   Histidine metabolism
     2     0     9     4     4     5     2     4   Sulfur metabolism
     2     0    10     3    11     5     8     6   Alanine, aspartate and glutamate metabolism
     2     0     9     2    10     5    10     9   2-Oxocarboxylic acid metabolism
     1     0     7     3     5     3     5     5   Steroid biosynthesis
     1     1     9    12     5    10     9     8   Tyrosine metabolism
     1     1    13     4     7     7     9     9   Glutathione metabolism
     1     1    15     7    14     4    19     6   Amino sugar and nucleotide sugar metabolism
     1     1    24    10    24     2    24     4   Cell cycle
     1     2    23    23    30     3    23     6   Ubiquitin mediated proteolysis
     1     0    14    10    16     1    12     1   Base excision repair
     1     2    12    14     6    13    14    15   Fatty acid metabolism
     1     1     9    19    13    12    14    12   Peroxisome
     1     0     9     4     5     4     6     6   Ascorbate and aldarate metabolism
     1     0     3     2     3     3     3     1   Cholinergic synapse
     1     2    34    29    15    37    13    42   Ribosome
     0     5    12    36    63     7    38     9   Spliceosome
     0     1     5     4     3     4     2     5   Biotin metabolism
     0     1     2    10     9     3     9     2   SNARE interactions in vesicular transport
     0     1     5     4     5     0     6     2   Non-homologous end-joining
     0     1     4     6     2     3     6     7   Biosynthesis of unsaturated fatty acids
     0     0    21    11    18     0    23     3   DNA replication
     0     0    13     8    16     0    13     1   Mismatch repair
     0     0    11     5    14     2     9     4   Citrate cycle (TCA cycle)
     0     0    10     6     9     0    13     1   Glycosylphosphatidylinositol (GPI)-anchor biosynthesis
     0     0    10    16    17     6    13     6   Oxidative phosphorylation
     0     0     9    12    32     0    23     0   Proteasome
     0     0     9     5     8     4     7     5   Fructose and mannose metabolism
     0     0     5     6     4     5     2     4   Photosynthesis
     0     0     3     6     6     4     7     2   Apoptosis


Appendix E- Global metabolic map for differentially expressed genes

Figure 1. 2hr heat treatment, up-regulated
Figure 2. 2hr heat treatment, down-regulated
Figure 3. 2hr cold treatment, up-regulated
Figure 4. 2hr cold treatment, down-regulated
Figure 5. High salt treatment, up-regulated
Figure 6. High salt treatment, down-regulated
Figure 7. Low salt treatment, up-regulated
Figure 8. Low salt treatment, down-regulated