Identification of Potential Biomineralization Linked Genes in Emiliania

huxleyi via High-Throughput Sequencing with RT-PCR Verification

By

Chrystal Grace Schroepfer

Research Thesis Submitted for the Masters Degree in Biological Sciences

Department of Biological Science, College of Science and Mathematics

California State University San Marcos

November 2011

Table of Contents

Acknowledgements
Abstract
Introduction
    Coccolithophorids
    Biomineralization and Coccolithogenesis
    Sequence Profiling
Methods and Materials
    Strains and Growth Conditions
    Scanning Electron Microscopy
    Ca2+ Titration Estimates of CaCO3
    Measuring Photosynthesis Rates
    RNA Extraction
    RNA Gel Electrophoresis
    Experion RNA Electrophoresis
    RNA Sequencing
    Primer Design
    Real Time RT-PCR
    Annotation
Results
    Cell Growth
    Scanning Electron Microscopy
    Calcium Titration
    Photosynthesis Rates
    Solexa Profiles
    RNA Extraction & Gel Electrophoresis
    Comparative Reverse Transcriptase Real-Time PCR
    Annotation
Discussion
References
Appendices

Acknowledgements

I would like to thank my committee members Dr. Betsy Read, Dr.

Matthew Escobar and Dr. Jose Mendoza for their guidance and support

throughout the thesis completion process.

I would like to thank my parents, John and Mary Schroepfer, my

closest friends, Steve and Jeanne Bâby, Gearald Denny, Tom Bento and my

Church Family at Bostonia Church of Christ for their love, encouragement and support over the years. To my brothers, David and Jason Schroepfer, thanks for all the laughs.

Lastly, I would like to thank all my friends that I made from the laboratory: Estela Carrasco, James Fuller, Jessica Garza, Karina Gonzalez,

Latha Kannan, Ray Liang, Tien Nguyen, Alyse Prichard, Analisa Sarno,

Andrew Segina, Christina Vanderwerken and William Whalen for their help and all the fun times we shared together. Christina and William: Thank you for being there. Christina: Thank you for being my best friend. William:

Thank you for guidance with the completion of this thesis paper. Andrew:

Peasant day was a blast. Thank You!!!!!

ABSTRACT

Emiliania huxleyi and Isochrysis galbana are two of the three living members of the order Isochrysidales. E. huxleyi produces distinctive shells of calcium carbonate called coccoliths through the process of biomineralization, while I. galbana does not. The study of the genetic underpinnings of biomineralization in coccolithophorids is in its infancy. This research aims to move such study forward by identifying genes involved in biomineralization using transcript profiling via Solexa high-throughput RNA sequencing (RNAseq). Transcriptome profiles were generated from cells grown in normal and calcium-depleted conditions with and without a sodium bicarbonate spike. Cell counts and rates of calcium uptake and photosynthesis were consistent with previous studies showing that calcification and photosynthesis are independent processes. Photosynthesis rates and Ca2+ uptake were monitored during cell growth, and SEM was used to determine the effect of treatments on calcification and coccolith morphology. As previously demonstrated, cells grown in the absence of calcium showed fragments of coccoliths, while cells grown in the presence of calcium showed well-formed and intact coccoliths. A total of 126 differentially expressed genes were identified using a negative binomial distribution to compare transcript levels of E. huxleyi cells grown under calcifying conditions in 9 mM Ca2+ with those of cells grown under non-calcifying conditions in 0 mM Ca2+. Real-time RT-PCR analysis was performed to independently validate the differential expression of 25 potential biomineralization genes, 20 of which were significantly up-regulated and 3 of which were significantly down-regulated. Although many of the 126 genes identified as differentially expressed lacked significant homology to known sequences, others made biological sense with regard to biomineralization and/or coccolithogenesis. These included C-type lectins, titin-like proteins, glycosyltransferases and several proteoglycans.
As part of this thesis work, gene predictions for cyclophilins in E. huxleyi were also manually curated and uploaded to the JGI E. huxleyi genome portal.

Keywords: , , coccolith, coccolithophorid, coccolithogenesis, biomineralization, scanning electron microscopy, high-throughput RNA sequencing, transcriptome profiling, real-time PCR, manual gene model annotation, cyclophilin.

INTRODUCTION

Coccolithophorids

Coccolithophorids are unicellular marine algae categorized in the Haptophyta division under the class Prymnesiophyceae (Edvardsen et al. 2000). Prymnesiophyceae contains four orders: Isochrysidales, Coccosphaerales, Prymnesiales and Pavlovales. The order Isochrysidales has three living members: 1) the coccolith-bearing Emiliania huxleyi and Gephyrocapsa oceanica, and 2) the non-coccolith-bearing Isochrysis galbana.

Coccolithophorids have a dimorphic life cycle, producing cells that alternate between a haploid (or motile) and a diploid (or nonmotile) phase. In the nonmotile phase, E. huxleyi and G. oceanica cells share common traits

such as the presence of two flagellar apparatuses of equal or subequal size,

small delicate organic scale layers external to the plasmalemma, the

production of coccoliths, and a small or nonexistent haptonema (Edvardsen et

al. 2000). E. huxleyi and G. oceanica have similar intracellular structures,

life cycles and coccolith morphology. The similar morphology of the coccoliths

suggests a close phylogenetic relationship; they may even be cryptic species (Fujiwara et al. 1994). As sister species, E. huxleyi and G. oceanica differ by only one nucleotide in the 18S rDNA sequence (Edvardsen et al. 2000) and are identical in the nucleotide sequence of rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase) (Fujiwara et al. 1994). Similarly, the nonmotile cells of I. galbana have body scales similar to the cell coverings of E. huxleyi and G. oceanica, which also supports a close phylogenetic relationship among the three species. E. huxleyi, G. oceanica and I. galbana are unique among the coccolithophorids because of their shared morphological and ultrastructural characteristics.

G. oceanica is slightly larger than E. huxleyi, has structurally different coccoliths and is more sensitive to temperature changes. Both species nevertheless form large blooms and produce calcium carbonate deposits in the world’s oceans (Winter and Siesser 1994; Rhodes 1995). Even though coccolith morphology is distinctly different in the two species, they have similar structures and presumably similar calcification mechanisms (Young et al. 1999). As a non-calcifying sister species, I. galbana will serve as an ideal subject for comparison to unravel the molecular mechanisms of calcification in coccolithophorids.

Biomineralization and Coccolithogenesis

Biomineralization is by definition the formation and deposition of inorganic solids in biological systems (Mann 2001). The most common biominerals are made from calcium carbonate, calcium phosphate or silica (Livingston et al. 2006). Forms of calcium carbonate include calcite, magnesium calcite and aragonite, which are produced by organisms such as coccolithophorids, sea urchins and mollusks, respectively. Coccolithophorids deposit calcium carbonate in the form of elaborate calcite shells called coccoliths that encapsulate the cell. Individual cells are typically surrounded by 10-50 coccoliths (Westbroek et al. 1984; Tyrrell and Taylor 1996). The structure of the coccolith consists of a base plate and radial arrays of calcium carbonate crystalline segments that include a flat, radially oriented lower element, a hammer-shaped upper element, a central element and a medially directed element.

Figure 1: Diagram of coccolith elements, from the International Nannoplankton Association, Natural History Museum, London, England ©2010

While a variety of organisms are capable of carrying out

biomineralization processes, there are two principal methods of mineral

deposition, biologically-induced mineralization and biologically-mediated

mineralization (Mann. 2005). In biologically-induced mineralization, minerals are

deposited adventitiously along the surface of the cell, as a result of various

metabolic activities of the cell. In contrast, biologically-mediated mineralization

is a highly regulated process that occurs intracellularly and results in

materials with specialized functions and structures. Coccolithophorids exhibit

biologically-mediated mineralization, where mineralization occurs within a

vesicle where matrix molecules nucleate and shape the crystalline structure.

Coccoliths of E. huxleyi and G. oceanica are created within a

specialized intracellular compartment called the coccolith vesicle (CV) derived

from the Golgi apparatus (Mann 2001; Marsh 2003; Nguyen et al. 2005).

Coccolithophorids convert inorganic carbon into organic and biomineralized product, whereby Ca2+ + 2HCO3- → CaCO3 + CO2 + H2O

(Marsh 2003). The coccoliths are formed inside the CV in a process called

coccolithogenesis, where the orientation, size and shape of the crystalline

elements are carefully controlled. The CV is attached to a convoluted membrane system called the reticular body, forming the c.v - r.b complex, which is thought to be responsible for the rapid transport of mineral ions to

the CV (Westbroek et al. 1984; Marsh 2003). Although the exact function of

the reticular body has not yet been demonstrated, the delivery of mineral

ions in this manner may create a highly saturated solution to facilitate calcite

nucleation and growth (Westbroek et al. 1984; Marsh 2003). The CV is

located next to the nucleus. Smaller vesicles containing matrix molecules,

calcium inorganic ions and other machinery attach to the CV. The contents

are delivered to the CV where an organic scale or base plate is assembled.

Crystal nucleation of CaCO3 occurs on the base plate. Once the coccolith is mature, the CV migrates to the cell membrane and fuses with it, and the coccolith is released from the cell in a massive exocytotic process. Multiple overlapping coccoliths assemble around the algal cell to form a coccosphere.

The availability of nutrients and certain environmental factors influence

biomineralization and coccolith morphology. Calcification rate is strongly associated with light intensity: although calcification is not strictly light-dependent, almost no calcification occurs in the dark (Trimborn et al. 2007).

Malformed coccoliths are typical of cells grown in high or low concentrations

of magnesium or when cells are starved of calcium (Blackwelder et al. 1976;

Herfort et al. 2004). When E. huxleyi is grown in artificial seawater with little

or no calcium, the rate of calcification is significantly reduced and few if any

coccoliths are produced. Coccoliths that are formed, moreover, are often

incomplete and possess thin crystalline elements (Herfort et al. 2004;

Leonardos et al. 2009). High concentrations of bicarbonate, on the other hand, enhance the production of coccoliths (Tyrrell and Taylor 1996; Paasche 1998; Shiraiwa 2003). Coccolith production and the diameter and volume of the cell increase with the addition of bicarbonate (Shiraiwa 2003). While the physicochemical effects of varying calcium and bicarbonate concentrations have been studied, the identification of genes involved in biomineralization through comparative expression analysis has not been performed at the

transcriptional level. The ability to alter the environmental conditions, such as

nutrients, can be exploited in this manner to study the molecular mechanisms

that power the process of calcification.

Sequence Profiling

The new technology of high throughput sequencing provides insight

into genome-wide expression profiling and gene identification (Fields 2007; Graveley 2008; Shendure and Hanlee 2008). High-throughput sequencing technologies have defined the transcriptional landscape of yeast, Arabidopsis thaliana, chicken, marine fish, pig tissues, nerve-dependent limb regeneration in salamanders and mouse stem cells (Jones-Rhoades et al. 2007; Nagalakshmi et al. 2008; Lister et al. 2008; Cloonan et al. 2008; Hornshoj et al. 2008; Monaghan et al. 2009; Mu et al. 2010; Yao et al. 2011). The

new technologies differ from traditional sequencing methods (Sanger sequencing) in that they do not sequence individual DNA clones but sequence hundreds of thousands (454 sequencing) to tens of millions

(Solexa/Illumina and SOLiD) of DNA fragments in parallel. The gigabytes of data generated from these powerful tools is robust and unbiased, affording

statistical certainty that allows for the identification of even low expressed

transcripts (Cheung et al. 2006; Emrich et al. 2006; Johnson et al. 2007;

Weber et al. 2007). The data can also be used to improve gene models

generated by automated annotation. High throughput sequencing technology

produces large amounts (gigabases) of data using short sequence reads, the

lengths of which vary across platforms. For instance, reads generated by Solexa/Illumina and SOLiD are 25-150 nucleotides long, while reads obtained from 454 sequencing are typically 200-500 nucleotides long. With Solexa/Illumina sequencing, 150-200 gigabases are produced. The shorter Solexa reads have some advantages. First, Solexa/Illumina has greater coverage depth. Second, the terminator chemistry is reversible, which eliminates the problem of homopolymers. Third, the data are robust, with sequence errors close to random. Lastly, no prior knowledge of the transcriptome is required, which makes the method well suited to the analysis of organisms with poorly annotated genomes. In well-annotated genomes, on the other hand, high-throughput sequencing allows the discovery of novel transcripts. The disadvantages are that 1) samples take a long time to sequence and 2) analysis of the data is difficult.

Unraveling the molecular process of biomineralization is still in its infancy. Some biomineralization proteins and novel genes have been identified in previous research, where calcifying and non-calcifying E. huxleyi cells were

grown in phosphate-limiting and phosphate-replete conditions (Wahlund et al.

2004; Quinn et al. 2006). Proteins and genes were identified through the

traditional use of microarrays, EST evidence and other bioinformatic tools.

The direct sequencing of the transcriptome of E. huxleyi grown under

different physiological conditions known to affect biomineralization may be a

powerful means of dissecting the genetic basis of the biomineralization

process. Calcium and sodium bicarbonate are necessary substrates for

proper coccolith formation. Thus, in this study, transcriptome profiles were generated from cells grown in normal and calcium-depleted conditions with and without a sodium bicarbonate spike. This work differs from previous work on biomineralization in that no earlier study has attempted to identify novel biomineralization proteins or genes through gene expression profiles generated by high-throughput sequencing technology.

Methods

Strains and Growth Conditions

Emiliania huxleyi and Isochrysis galbana strains were purchased from

established algal culture collections. The E. huxleyi strain 217 was obtained

from Plymouth Marine Biological Laboratory while I. galbana (CCMP 1323)

was obtained from the Provasoli-Guillard National Center for Culture of Marine Phytoplankton. Cultures were grown in modified F/2 (after Guillard 1975) artificial seawater medium (ASW) with 400 mM NaCl, 10 mM KCl, 20 mM MgSO4·7H2O, 20 mM MgCl2·6H2O, 0.4 mM H3BO3, 0.88 mM NaNO3 and 0.036 mM NaH2PO4·H2O in Nanopure water (Barnstead® Diamond Nanopure™, Dubuque, IA). A “vitamin cocktail” concentrate was added to yield a final concentration of 50 μg biotin, 50 μg vitamin B12 and 100 ng thiamine-HCl per liter of media. The pH was adjusted to 8.0 by the addition of HCl, and the media was then autoclaved at 120°C for 20 minutes. Prior to use, media was supplemented with trace elements from a 1000x stock solution (76.5 μM ZnSO4·7H2O, 39 μM CuSO4·5H2O, 42 μM CoCl2·6H2O, 910 μM MnCl2·4H2O, 26 μM Na2MoO4·2H2O, 5.8 mM FeCl2·6H2O, 11.7 mM Na2EDTA·2H2O) that had been filter sterilized using a 0.1 micron, 25 mm nylon membrane filter cartridge (Corning # 21061-25). Sufficient CaCl2 was also added to bring the final [Ca2+] to 9 mM. Four-liter flasks containing 2 liters of ASW were inoculated at a 1:20 dilution of a mid-log phase stock culture (to ~6 x 10^4 cells/ml) and allowed to grow for 7 days. Cultures were grown photoautotrophically at 17-18°C under cool white fluorescent light (145 µmol m-2 s-1). Growth was monitored by cell counts using a haemocytometer (HYCOR).

E. huxleyi and I. galbana were grown under physiological conditions known to affect calcification in coccolithophorids. These conditions included: 1) typical calcification, when cells were grown with 9 mM Ca2+; 2) reduced calcification, when cells were grown in 0 mM Ca2+; 3) enhanced calcification, when cells were grown in ASW with 9 mM Ca2+ supplemented with 20 mM NaHCO3; and 4) little or no calcification, when cells were grown in 0 mM Ca2+ supplemented with 20 mM NaHCO3. For the bicarbonate treatments, cultures were spiked with 20 mM NaHCO3 on day 6 post-inoculation, twenty-four hours before the RNA extraction.

Scanning Electron Microscopy

SEM was used to monitor calcification and the morphology of the

coccoliths. For SEM, a volume equivalent to 5.7 x 10^6 cells was filtered onto a polycarbonate 0.8 µm, 47 mm membrane (Millipore #ATTPO4700) and allowed to dry. The filtered membranes were sent to Paul Reidel at PhotoMetrics for SEM imaging.

Ca2+ Titration Estimates of CaCO3 Deposition

Complexometric titration with EGTA (ethylene glycol-bis(2-aminoethyl)-tetraacetic acid) has been used as a direct method for determining the Ca2+ content of seawater (Kremiling 1983). The Ca2+ concentration was measured by titration according to Chisholm and Gattuso (1991). Each of the 4 cell cultures was titrated on days 0, 2, 4 and 6 post-inoculation. The metal indicator glyoxal-bis(2-hydroxyanil) (GHA) is colorless in a borax buffer, but in the presence of a metal such as Ca2+, GHA forms a deep red chelate complex. Upon titration with the stronger metal chelator EGTA, GHA is displaced from the Ca2+, rendering the solution colorless. The amount of EGTA required to free the GHA of Ca2+ is directly proportional to the amount of Ca2+ in solution.

To determine the Ca2+ concentration of the seawater samples, a

standard calcium solution (0.0103M calcium carbonate; 0.3 mM hydrochloric

acid; 0.0532 M magnesium nitrate; 0.09 mM strontium chloride; 0.685 M

sodium chloride) was prepared and used to standardize the EGTA solution to

0.01M. To process samples, 10 mls were placed in a beaker. The contents

were mixed; 4 mls of both 0.05% GHA (diluted with 1-propanol) and borax buffer (0.0525 M sodium tetraborate; 1.5 M sodium hydroxide) were added, and the sample was allowed to stir for three minutes to allow the red Ca2+-

GHA complex to form. Seven mls of 1-butanol were added to extract the

red Ca2+- GHA complex into the organic layer. The upper organic layer was

extracted with a separator into a new 25 ml Erlenmeyer flask. Titration was

then carried out with the standardized EGTA solution. The endpoint was

noted when the organic layer changed from red to colorless. The Ca2+

content was calculated with the following equation:

$$\text{mL EGTA} \times \frac{10\ \mu\text{mol EGTA}}{\text{mL EGTA}} \times \frac{1\ \mu\text{mol Ca}^{2+}}{1\ \mu\text{mol EGTA}} = \mu\text{mol Ca}^{2+}$$
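The conversion from titrant volume to Ca2+ can be sketched in a few lines of Python. This is only an illustration: it assumes the EGTA solution is standardized to 0.01 M (10 µmol per mL) and binds Ca2+ 1:1, as described above, and that the titrated extract represents the whole 10 ml sample. The function names and the example volume are hypothetical.

```python
def ca_umol_from_titration(ml_egta, egta_molarity=0.01):
    """Return µmol Ca2+ titrated, assuming 1:1 EGTA:Ca2+ binding."""
    umol_egta_per_ml = egta_molarity * 1000.0  # mol/L -> µmol/mL
    return ml_egta * umol_egta_per_ml          # 1 µmol Ca2+ per µmol EGTA


def ca_mM_in_sample(ml_egta, sample_ml=10.0, egta_molarity=0.01):
    """Convert the titrated amount to a concentration; µmol/mL equals mM."""
    return ca_umol_from_titration(ml_egta, egta_molarity) / sample_ml


# Hypothetical reading: 9 mL of 0.01 M EGTA to reach the endpoint.
print(ca_umol_from_titration(9.0), ca_mM_in_sample(9.0))
```

Under these assumptions, an endpoint at 9 mL of titrant corresponds to about 90 µmol Ca2+, or roughly 9 mM in a 10 ml sample.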

Measuring Photosynthesis Rates

Oxygen evolution occurs during the light-dependent reactions of

photosynthesis when light energy is used to split water molecules into

protons, electrons and O2. The electrons are used to power the electron

transport chain for chemiosmotic ATP synthesis. Oxygen evolution for each algal culture was measured using a Hansatech oxygen electrode system. Oxygen evolution was monitored on days 2, 4 and 6 after inoculation. After calibration of the oxygen electrode, 1 ml samples of algal culture concentrated 100X were placed into the DW1 electrode chamber

where oxygen generated from the samples was recorded in nmol/ml. A dual

fiber optic light source was used with a fiber placed one inch from the top

and left side of the chamber delivering 1360 µmol m-2s-1 of light to the

sample. The oxygen produced was plotted as an indirect measurement of the

rate of photosynthesis (oxygen evolved/unit time). Per-cell O2 evolution was calculated by dividing by the mean number of cells/ml for that sample.
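One way to turn an oxygen trace into a per-cell rate is sketched below. It is an assumption-laden illustration rather than the procedure used here: it takes the rate as the least-squares slope of O2 (nmol/ml) against time (minutes) and assumes the cell density supplied is that of the sample in the chamber. The function name and the example trace are hypothetical.

```python
def o2_rate_per_cell(times_min, o2_nmol_per_ml, cells_per_ml):
    """Least-squares slope of O2 vs time (nmol/ml/min) divided by cells/ml."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_o = sum(o2_nmol_per_ml) / n
    sxy = sum((t - mean_t) * (o - mean_o)
              for t, o in zip(times_min, o2_nmol_per_ml))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    slope = sxy / sxx            # nmol O2 per ml per minute
    return slope / cells_per_ml  # nmol O2 per cell per minute


# Hypothetical trace: O2 rising by 10 nmol/ml each minute at 1e6 cells/ml.
rate = o2_rate_per_cell([0, 1, 2, 3], [100, 110, 120, 130], 1e6)
print(rate)
```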

RNA Extraction

A standard guanidinium isothiocyanate phenol/chloroform extraction was used to extract total RNA from cultures of E. huxleyi and I. galbana. All RNA extractions were performed according to the protocol of Strommer and colleagues (1993). Each extraction was carried out in the middle of the light phase. E. huxleyi and I. galbana cells were decalcified by lowering the pH of the culture with 0.1 M HCl (from pH 8 to pH 5). After one minute the

pH was restored to 8.0 by the addition of 0.1 M NaOH. The dilute acid

treatment dissolves the CaCO3 coccoliths that would otherwise interfere with

the RNA extraction. Cells were harvested (Beckman Coulter Avanti™ J-20 XP

centrifuge) at RT, 7500 RPM for 10 minutes. Cells were lysed by grinding in

liquid nitrogen using a mortar and pestle and resuspended in 10 ml

extraction buffer (4 M guanidinium isothiocyanate, 25 mM sodium nitrate, 0.5% sarkosyl, 0.1 M β-mercaptoethanol). After vortexing the lysate for 30

seconds, 1 ml of 2 M sodium acetate (pH 4.0) was added. After vortexing

for an additional 30 seconds, 10 ml of water-saturated phenol (pH 4.3) was added and the sample was vortexed again for 30 seconds before the addition of 2 ml of chloroform. After vortexing for an additional 30 seconds, samples were centrifuged at RT, 5000 g for 10 minutes. The upper aqueous layer containing the RNA was transferred to a clean tube and a second phenol/chloroform extraction was performed. To precipitate the RNA, an equal volume of cold isopropanol was added to the aqueous phase and the sample was placed in a -20°C freezer for 24 hours. RNA was collected by centrifugation at 10,000 g for 10 minutes at 4°C and resuspended in 500 µl Ultra pure water (GIBSON). After transferring the RNA to a 1.5 ml Eppendorf tube, an equal volume of 4 M lithium chloride was added, and the sample was allowed to stand on ice for 1 hour. RNA was collected again by centrifuging (Biofuge Fresco Heraeus centrifuge) at 13,000 g for 5 minutes at 4°C and washed with cold 70% ethanol. After air-drying for 5 minutes, the RNA was resuspended in 50 µl of RNase-free water. The concentration of RNA was determined by measuring the absorbance at 260 and 280 nm, and the integrity of the RNA was determined by agarose or capillary gel electrophoresis. RNA was aliquoted and stored at -80°C.
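The absorbance readings translate into a concentration and a purity estimate as sketched below. The sketch relies on the common spectrophotometric convention that one A260 unit corresponds to roughly 40 µg/ml of RNA in a 1 cm path, and uses the A260/A280 ratio as a purity indicator. The dilution factor, function names and example readings are hypothetical and not taken from this work.

```python
def rna_concentration_ug_per_ml(a260, dilution_factor=1.0,
                                ug_per_ml_per_a260=40.0):
    """Estimate RNA concentration from A260 (1 cm path length assumed)."""
    return a260 * ug_per_ml_per_a260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 2.0 are usually taken to indicate clean RNA."""
    return a260 / a280


# Hypothetical reading of a 1:50 dilution.
print(rna_concentration_ug_per_ml(0.5, dilution_factor=50))  # µg/ml in the stock
print(purity_ratio(0.5, 0.25))
```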

RNA Gel Electrophoresis

Ribonucleic acid gel electrophoresis separates RNA by size and as

such is used to check for the integrity of the RNA. A 1.5% agarose RNA

gel (36.5 ml water; 5 ml 10x MOPS/EDTA buffer; 8.5 ml 35% formaldehyde)

was placed in an electrophoresis unit with 1X MOPS/EDTA. Each RNA

sample was prepped with RNA loading dye (final concentration 1X), RNA

(1000 ng) and water (total volume 20 µl). Samples were mixed and heated

at 65°C for 10 minutes prior to loading the gel and running for 1 hr at

85 V. A picture of the RNA gel was obtained with the Bio-Rad gel

documentation system.

Experion RNA Electrophoresis

The Bio-Rad Experion is an automated microfluidic electrophoresis

station that combines all electrophoresis steps, including staining, destaining,

band detection, and imaging into a quick 30 minute process. The Experion electrodes were cleaned before and after running the Experion electrophoresis station with the recommended cleaning solution and water in cleaning chips. All contents of the RNA kit were allowed to equilibrate at room temperature for 15 minutes. A gel-stain was prepared by placing 600 µl of RNA gel into a filter tube and centrifuging at 1,500 g for 10 minutes. The filter was discarded and 65 µl of the filtered RNA gel was placed in a new microcentrifuge tube with 1 µl of RNA stain. The gel-stain was vortexed and wrapped in

aluminum foil to prevent light from degrading the photolabile dye.

The RNA Ladder and RNA samples were prepared for chip loading. A

total of 1 µl of RNA ladder was used for each chip. Samples were diluted

to approximately 100 ng/µl and 2 µl of sample was placed in separate micro

centrifuge tubes. The ladder and samples were denatured for 2 minutes at

70°C and then placed on ice.

The Experion RNA StdSens chip was primed with the filtered gel-stain

solution by pipetting 9 µl of the gel-stain solution into the gel priming (GS) well and placing the chip in the priming station. After priming the chip, wells were loaded with 9 µl of gel-stain, while 9 µl of filtered gel was loaded in the well labeled G. Loading buffer was placed into each of the sample wells.

The ladder (1 µl) was loaded into the well labeled L and the samples were loaded into wells 1-12. If there were fewer than 12 samples, the unused sample wells were loaded with 1 µl of water.

After loading, the chip was vortexed and placed into the Experion

electrophoresis station and run on the RNA setting.

RNA Sequencing

Solexa/Illumina sequencing of total RNA from E. huxleyi and I.

galbana was used for transcriptomic analysis. Total RNA was extracted as

described previously and sent to Beijing Genomics (http://www.genomics.cn/en/bgi.php?id=158), where library construction and sequencing were performed according to the following general protocol. A cDNA library was constructed by first digesting total RNA with DNase I, followed by mRNA purification using poly-T oligo-attached magnetic beads. The purified mRNA was then fragmented into smaller pieces using divalent cations under elevated temperature. Next, double-stranded cDNA was synthesized. After cDNA synthesis, the overhanging ends were repaired with T4 and Klenow DNA polymerases by removing the 3’ overhangs and phosphorylating the 5’ ends. An ‘A’ base was then added to the 3’ ends of the blunt, phosphorylated fragments. Adaptors were ligated onto both ends of the fragments, which were then amplified by PCR. The cDNA library was validated by checking the size, purity and concentration of the resuspended constructs. Samples were then cluster amplified on a cBot. Reads were aligned to the genome using Mapping and Assembly with Quality (Maq) software, and a digital counting method was employed to count the number of reads mapping to individual transcripts. Dr. Xiaoyu Zhang of the Computer Science Department at CSUSM performed the data analysis, in which a negative binomial distribution was employed to identify statistically significant differentially expressed genes across treatments.
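The digital counting step can be pictured with a minimal sketch. It assumes the alignments have already been reduced to (read, transcript) pairs with one entry per uniquely mapped read; this input format and the identifiers are hypothetical and do not reproduce Maq output or the analysis pipeline actually used.

```python
from collections import Counter


def count_reads_per_transcript(alignments):
    """Tally reads per transcript from (read_id, transcript_id) pairs."""
    return Counter(transcript_id for _read_id, transcript_id in alignments)


# Hypothetical alignments: three reads on transcript "T1", one on "T2".
example = [("r1", "T1"), ("r2", "T1"), ("r3", "T2"), ("r4", "T1")]
counts = count_reads_per_transcript(example)
print(counts["T1"], counts["T2"])
```

Count tables of this kind, one per treatment, are the usual input to a count-based test of differential expression.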

Primer Design

Primer 3 v. 0.4.0 was used to design primers for real time RT-PCR

validation of Solexa sequencing results of differentially expressed transcripts

(http://frodo.wi.mit.edu/primer3/). Primers of 18-22 bases were designed to yield amplicons of 75-100 base pairs. Primer GC content was between 20% and 80%, the primer Tm range was 57-63°C and the product Tm range was 75-85°C. Primers were ordered from

Integrated DNA Technologies Inc. See Appendix A for primer list.
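The length, GC-content and amplicon-size criteria above can be checked with a short sketch. It deliberately leaves out the Tm criteria, which depend on the melting-temperature model used by the design software, and it is not a substitute for Primer3. The function names and example sequences are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def primer_passes(seq, min_len=18, max_len=22, min_gc=20.0, max_gc=80.0):
    """Check primer length (bases) and GC content (%) against the stated ranges."""
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_percent(seq) <= max_gc)


def amplicon_passes(amplicon_len, min_len=75, max_len=100):
    """Check amplicon size (base pairs) against the stated range."""
    return min_len <= amplicon_len <= max_len


# Hypothetical 20-base primer with 50% GC and a 90 bp amplicon.
print(primer_passes("ATGCATGCATGCATGCATGC"), amplicon_passes(90))
```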

Real Time RT-PCR

Real time RT-PCR analysis was used to validate differential expression

of candidate transcripts using the RNA extracted from E. huxleyi.

Complementary DNA was synthesized from mRNA templates for the purpose of obtaining the expression levels of transcripts. cDNA synthesis was performed using the Verso™ cDNA Kit from Thermo Scientific. For each sample, 2 µg of RNA was mixed with oligo(dT) primer (final concentration 25 ng/µl) and water to a total of 12 µl and heated at 70°C for 5 minutes. To each sample, 5X buffer (final concentration 1X), dNTP mix (500 µM) and 1 µl of RT enhancer and Verso enzyme mix were added. After mixing, samples were incubated at 42°C for 30 minutes. The reverse transcriptase from the Verso enzyme mix was then inactivated by heating the sample to 75°C for 10 minutes, and the cDNA was stored in the -20°C freezer.

The real time RT-PCR was performed according to standard

procedures using SYBR green chemistry on an iCycler iQ system (Bio-Rad). Amplification of each target gene was performed in triplicate in a 96-well plate. Each 25 µl reaction mixture contained 5 µl of diluted (1:25) cDNA, 12.5 µl of 2X SYBR green master mix, and 0.24 µM each of the forward and reverse primers. The cycling conditions included a 10 minute polymerase activation step (95°C) followed by 40 cycles of amplification (95°C for 10 seconds, 60°C for 30 seconds and 72°C for 30 seconds). At the end of each cycle, fluorescence was measured at 82°C. Melt curves were generated after the real time RT-PCR amplification: the sample was denatured for 1 minute at 95°C, cooled for 1 minute at 55°C, and then the temperature was ramped by 0.5°C every 10 seconds from 55°C to a final temperature of 95°C. Expression data were analyzed in Excel 2003 using the 2^-∆∆Ct method (Livak and Schmittgen 2001; Arya et al. 2005). T-tests were used to identify significant differences between treatments in a pairwise manner.

The Benjamini-Hochberg post hoc test was used to compensate for the incidence of type I errors when using a fixed alpha (α). With 25 tests performed at α = 0.05, the expected number of false positives is 0.05 × 25 = 1.25. To correct for this, a more stringent α (or modified p-critical) was used: each p-value was compared to a modified α equal to 0.05/r (r = rank, labeled 1 to n according to p-value). Gene expression was considered statistically significantly different across samples if the observed p-value was smaller than the modified p-critical.
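For orientation, the textbook Benjamini-Hochberg step-up procedure is sketched below. In this standard form the p-values are ranked from smallest to largest and the p-value of rank r is compared to (r/n) × α; every test up to the largest rank that meets its threshold is declared significant. This is a generic sketch, and the rank convention used for the modified α in this work may differ from it. The function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests declared significant."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    # Largest rank r (1-based) whose p-value satisfies p_(r) <= (r / n) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / n * alpha:
            max_rank = rank
    significant = [False] * n
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values from four pairwise tests.
print(benjamini_hochberg([0.001, 0.02, 0.04, 0.5]))
```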

To characterize the magnitude and direction of difference in expression for each gene across the comparison pairs (9 mM Ca2+ to 0 mM Ca2+, 9 mM Ca2+ + NaHCO3 to 9 mM Ca2+, and 0 mM Ca2+ + NaHCO3 to 0 mM Ca2+), the 2^-∆∆Ct method (Arya et al. 2005) was used to calculate a “fold change” for each pair. ∆∆Ct represents the difference between the paired samples in reference-normalized Ct units, and the value 2 is derived from an assumed PCR amplification efficiency of 1. As comparisons were made between different samples but always with the same primers, deviation from the assumed PCR efficiency affects both samples equally. Thus the relative expression pattern derived from analysis of the real-time RT-PCR data holds true, even though the absolute level of expression cannot be determined.

In this way, comparative (2^-∆∆Ct) real-time RT-PCR analysis was employed to validate the RNAseq results and to determine the magnitude of the relative expression differences between the treatment groups.
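The comparative Ct calculation can be written out as a brief sketch. It assumes that each Ct passed in is already the mean of the triplicate wells and that a single reference gene is used for normalization; the reference gene, function name and example Ct values are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct method (efficiency assumed 1)."""
    d_ct_treated = ct_target_treated - ct_ref_treated   # normalize to reference
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the treated sample, with the reference gene unchanged.
print(fold_change(22.0, 18.0, 24.0, 18.0))
```

A value above 1 indicates higher expression in the first sample of the pair, and a value below 1 indicates lower expression.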

Annotation

Automated annotation or gene prediction can be improved by manual

curation, which involves examining intron/exon junctions, inspecting insertions

and deletions and verifying start and stop codons. Annotation began by

identifying genes of interest (cyclophilins) and their relevant domains from the Joint Genome Institute (JGI) E. huxleyi genome portal (http://genome.jgi-psf.org/Emihu1/Emihu1.home.html). Gene models for each of the cyclophilins, identified by JGI automated annotation, were inspected, and the best model for each gene was selected by examining bit scores and multiple sequence alignments. Manual curation of intron/exon boundaries, insertions and deletions, and the start and stop codons was accomplished using the ExPASy translation tool, EST evidence, and/or ClustalW multiple sequence alignments. Upon verifying or modifying gene models, a three-letter gene name was assigned, and information pertaining to the defline (a description of the gene and its function), model notes and EST evidence was recorded and uploaded onto the JGI portal.

RESULTS

Cell Growth

E. huxleyi and I. galbana cultures were grown in ASW (pH 8) at two different Ca2+ concentrations (9 mM Ca2+ and 0 mM Ca2+). Total cell counts were performed on each culture every other day for seven days (Figures 2 and 3).

[Figure 2 plot: cell counts (cells/ml × 10,000) versus days (0 to 6) for cultures in 9 mM and 0 mM Ca2+.]

Figure 2: Growth Curves for I. galbana Cultures in ASW with 9 mM and 0 mM Ca2+.

[Figure 3 plot: cell counts (cells/ml × 10,000) versus days (0 to 6) for cultures in 9 mM and 0 mM Ca2+.]

Figure 3: Growth Curves for E. huxleyi Cultures in ASW with 9 mM and 0 mM Ca2+.

Doubling times (Td) were calculated for each strain over the seven days with the formula:

$$T_d = \frac{(t_2 - t_1)\,\log 2}{\log(q_2/q_1)}$$

where t = time in days and q = quantity of cells at time t.
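As a worked illustration of the doubling-time calculation, the following Python sketch assumes exponential growth between the two counts. The function name `doubling_time` and the cell counts in the usage lines are hypothetical and are not data from this study.

```python
import math


def doubling_time(q1, q2, t1, t2):
    """Doubling time in the units of t, assuming exponential growth."""
    if q1 <= 0 or q2 <= q1:
        raise ValueError("cell counts must be positive and increasing")
    return (t2 - t1) * math.log(2) / math.log(q2 / q1)


# Hypothetical counts: 1.0e5 cells/ml on day 0 and 4.0e5 cells/ml on day 2,
# i.e. two doublings in two days.
td_days = doubling_time(q1=1.0e5, q2=4.0e5, t1=0, t2=2)
print(td_days * 24, "hours")
```

Multiplying by 24 converts the result from days to the hours reported in the text.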

ANOVA of the doubling times showed no significant difference between I. galbana and E. huxleyi grown in 9 and 0 mM Ca2+ (p-value = 0.0965). However, separate 2-tailed t-tests within I. galbana and E. huxleyi grown in 9 and 0 mM Ca2+ indicated significant differences between treatments for E. huxleyi (p-value = 0.0288), but not for I. galbana (p-value = 0.17). The means ± SD (hrs) for I. galbana cells grown in 9 and 0 mM Ca2+ were 31.18 ± 4.5 and 36.6 ± 7.7, while the means ± SD (hrs) for E. huxleyi were 23.8 ± 0.4 and 30.9 ± 0.88, respectively. For E. huxleyi the growth rates obtained in this study approximated those reported in the literature, where a doubling time of 24 hours has been observed (Soto et al. 2006). However, the doubling time for I. galbana differed considerably from the 14 hours reported in the literature (Gopinathan 1984).

Scanning Electron Microscopy

Scanning electron microscopy (SEM) was used to characterize the relative abundance and structural morphology of coccoliths produced by E. huxleyi under the four different growth conditions. Figure 4A is a representative SEM image from a sample of E. huxleyi strain 217 cells grown in ASW with ambient levels of Ca2+ (9 mM Ca2+). Under these conditions well-formed coccoliths with intact coccospheres were observed. When cultures were spiked with 20 mM NaHCO3 and allowed to grow for 24 hrs, calcification was evident, with an abundance of whole and intact coccoliths being produced (Figure 4B). Coccoliths were abundant in micrographs of cells collected from both the 9 mM Ca2+ and the 9 mM Ca2+ + spike cultures, with no marked differences in coccolith morphology. Hence from SEM data alone it was not possible to ascertain whether the addition of sodium bicarbonate had any effect on calcification. When cells were grown in the absence of Ca2+ (0 mM Ca2+), a small number of remnants of broken coccoliths were seen, but no intact coccoliths or coccospheres were observed (Figure 4C). E. huxleyi cells grown in the absence of Ca2+ with the addition of a 20 mM NaHCO3 spike exhibited more coccolith fragments, but again no intact coccoliths or coccospheres were found (Figure 4D).

Figure 4: SEM images of E. huxleyi cells grown in normal and calcium-deplete f/2 ASW media with and without bicarbonate spike. A: 9 mM Ca2+; B: 9 mM Ca2+ with 20 mM NaHCO3 spike; C: 0 mM Ca2+; D: 0 mM Ca2+ with 20 mM NaHCO3 spike.

As a non-calcifying coccolithophorid, I. galbana is not expected to produce coccoliths under any of the four growth conditions. In all instances no coccoliths and only collapsed cells were evident (Figure 5). The collapsed cells may be an artifact of the SEM preparation procedure, in which cells lacking coccoliths are unable to withstand the vacuum required for sputter coating.

Figure 5: Representative SEM images of Isochrysis galbana cells showing no visible coccoliths under any treatment condition. A: 9 mM Ca2+ with 20 mM bicarbonate spike; B: 0 mM Ca2+ with no spike. 5000X field, 20 kV.

Calcium Titration

As an indirect estimate of Ca2+ used during biomineralization, a Ca2+ titration was performed to measure the amount of Ca2+ in spent media, the assumption being that Ca2+ removed from the media was used by the cells. In typical spent ASW from (calcifying) E. huxleyi cultures, there was a greater decrease in Ca2+ compared to (non-calcifying) I. galbana (Figure 6). From day 2 to day 7, the Ca2+ in 9 mM Ca2+ spent media from E. huxleyi decreased 21 µmol while that from I. galbana decreased 5 µmol (Figure 6A, C). Conversely, the Ca2+ in 0 mM Ca2+ media from E. huxleyi decreased less than that from I. galbana. From day 2 to day 7, the Ca2+ in spent ASW from E. huxleyi decreased 5 µmol and that from I. galbana decreased 12 µmol.

[Figure 6 plots: panels A (I. galbana) and C (E. huxleyi) show µmol Ca2+ versus days; panels B (I. galbana) and D (E. huxleyi) show nmol Ca2+/cell versus days; each panel compares 9 mM and 0 mM Ca2+ cultures.]

Figure 6: Amount of Ca2+ in spent f/2 media for I. galbana (left) and E. huxleyi (right). A & C: Ca2+ in spent ASW. B & D: Ca2+ uptake per cell. Day 0 was excluded from these measurements due to the confounding Ca2+ added when cultures were inoculated.

Ca2+ uptake per cell was estimated by dividing the change in Ca2+ per ml per day by the mean number of cells per ml for that day. The initial (day 2) per-cell uptake of Ca2+ was highly variable compared with days 4 and 6 for both species. Counterintuitively, I. galbana appeared to be losing calcium to its environment on day 2 (Figure 6B). On the other hand, E. huxleyi appeared to have a peak calcium uptake on day 2. Both cultures showed a leveling off of per-cell calcium uptake by day 4 persisting through day 6, which is consistent with the expected constant rate of biomineralization during exponential growth.
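The per-cell estimate described above amounts to a single division, sketched here in Python. The function name `ca_uptake_per_cell`, the units and the example values are assumptions made for illustration and are not measurements from this study.

```python
def ca_uptake_per_cell(ca_previous, ca_current, days_elapsed, mean_cells_per_ml):
    """Estimate Ca2+ uptake per cell per day from spent-media measurements.

    ca_previous and ca_current are Ca2+ amounts per ml of media (here nmol/ml).
    A negative result means Ca2+ in the media increased over the interval.
    """
    change_per_ml_per_day = (ca_previous - ca_current) / days_elapsed
    return change_per_ml_per_day / mean_cells_per_ml


# Hypothetical values: media Ca2+ falls from 100 to 80 nmol/ml over 2 days
# in a culture averaging 1.0e5 cells/ml.
print(ca_uptake_per_cell(100.0, 80.0, 2, 1.0e5), "nmol Ca2+/cell/day")
```

A negative value corresponds to the apparent loss of calcium to the environment noted for I. galbana on day 2.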

Photosynthesis Rates

Photosynthesis rates were measured for both I. galbana and E. huxleyi grown in 9 mM and 0 mM Ca2+ using O2 evolution. The purpose of monitoring the amount of oxygen produced per cell was to see whether photosynthesis and biomineralization are coupled processes, since CO2 is released during calcification and calcification has been hypothesized to act as a CO2-concentrating mechanism. Photosynthesis rates for cells grown in 9 mM Ca2+ were higher than for cells grown in 0 mM Ca2+ for both I. galbana and E. huxleyi, while per-cell rates were constant in both species regardless of media treatment (Figure 7).

[Figure 7 plots: panels A (I. galbana) and C (E. huxleyi) show O2 (nmol/ml/min) versus days; panels B (I. galbana) and D (E. huxleyi) show O2 (nmol/min/cell) versus days; each panel compares 9 mM and 0 mM Ca2+ cultures.]

Figure 7: Photosynthesis Assays for I. galbana (left) and E. huxleyi (right). A & C: Photosynthesis rates as determined by O2 evolution. B & D: Oxygen production per cell.

Solexa Profiles

To identify genes potentially involved in biomineralization, transcript profiles were generated from each of the four experimental conditions in E. huxleyi using high-throughput Solexa RNA sequencing (RNAseq). RNA sequence reads that matched predicted E. huxleyi gene transcripts were tabulated. This produced a list of approximately 20,000 unique genes that were detected in the transcriptome. The frequency of reads was statistically analyzed with a negative binomial digital count method (Anders and Huber. 2010; Werner. 2010). This resulted in a total of 103 genes that were listed as being significantly up-regulated (Appendix B) and a total of 22 genes listed as being significantly down-regulated (Appendix C) in response to conditions expected to promote biomineralization (normal calcium concentration vs. calcium deplete). The up-regulated genes were sorted according to the highest expression in normal biomineralization conditions (9 mM Ca2+) and the down-regulated genes were sorted according to the lowest p-value. The top 20 up-regulated genes and the 5 most strongly down-regulated genes were chosen for real-time PCR analysis (Table 2).

Up-Regulated in Calcification Promoting Conditions

| Prot. ID | Homology | Function | 0 mM Ca | 0 mM Ca + S | 9 mM Ca | 9 mM + S | P-value |
|---|---|---|---|---|---|---|---|
| 41308 | T. adhaerens hyp. prot. | Hypothetical protein | 60 | 110 | 470 | 740 | 0.000206 |
| 433416 | | Hypothetical protein | 95 | 161 | 721 | 980 | 0.00031 |
| 416147 | Dictyostelium discoideum (Q9XYS3.1) | NADPH oxidase | 49 | 68 | 276 | 505 | 0.000331 |
| 356539 | | Serine kinase | 65 | 48 | 290 | 391 | 0.000607 |
| 211653 | | Zinc metallopeptidase | 65 | 79 | 286 | 505 | 0.00105 |
| 423164 | | Hypothetical protein | 48 | 18 | 311 | 61 | 0.001103 |
| 107508 | Branchiostoma floridae (EEN58705.1) | C-type lectin | 111 | 144 | 565 | 737 | 0.001548 |
| 195816 | Ectocarpus siliculosus | Conserved unknown protein | 66 | 101 | 244 | 559 | 0.002276 |
| 440390 | Chlamydomonas reinhardtii | Predicted protein | 99 | 74 | 310 | 507 | 0.002482 |
| 205013 | Saccoglossus kowalevskii | Titin-like multifibronectin | 64 | 140 | 291 | 619 | 0.003505 |
| 353141 | | Hp/ Titin-like (fibronectin) | 88 | 122 | 405 | 507 | 0.00404 |
| 366340 | | Hypothetical protein | 50 | 51 | 214 | 232 | 0.004097 |
| 456555 | | Zinc metallopeptidase | 82 | 121 | 285 | 579 | 0.004587 |
| 370886 | | Hypothetical protein | 72 | 118 | 323 | 471 | 0.005126 |
| 432839 | | Nuclear antigen | 1099 | 143 | 3423 | 1621 | 0.005248 |
| 456347 | Physcomitrella patens (EDQ66621.1) | Protease inhibitor/myosin like | 61 | 257 | 305 | 1007 | 0.005329 |
| 366493 | Sparus aurata | Guanine binding α subunit | 49 | 70 | 218 | 275 | 0.00575 |
| 438208 | Arabidopsis lyrata (EFH54886.1) | Hypothetical protein | 596 | 732 | 2433 | 2789 | 0.006523 |
| 218690 | Phaeodactylum tricornutum | Predicted protein | 92 | 24 | 388 | 58 | 0.008819 |
| 199401 | Vibrio coralliilyticus (EEX33214.1) | Amine oxidase | 1408 | 240 | 5238 | 795 | 0.00951 |

Down-Regulated in Calcification Promoting Conditions

| Prot. ID | Homology | Function | 0 mM Ca | 0 mM Ca + S | 9 mM Ca | 9 mM + S | P-value |
|---|---|---|---|---|---|---|---|
| 54552 | | Periplasmic binding protein | 369 | 1157 | 70 | 119 | 0.000148 |
| 120394 | | ZIP family transporter | 169 | 244 | 27 | 39 | 0.000996 |
| 67936 | Ectocarpus siliculosus | Short dehydrogenase | 4 | 13 | 0 | 0 | 0.001117 |
| 113427 | | Oligopeptide binding protein | 125 | 588 | 58 | 72 | 0.001714 |
| 413815 | | Flavin-containing monooxygenase | 1026 | 2108 | 175 | 482 | 0.003546 |

Table 2: JGI Emiliania huxleyi protein identification numbers, best homology organism, protein function, digital counts from RNAseq data by culture condition, and P-values for significance of difference between 9 mM and 0 mM digital counts according to negative binomial analysis.

Among the significantly differentially expressed genes were those coding for proteins that contain functional domains of, or are suspected on the basis of homology to correspond to: 1) superoxide-generating NADPH oxidase flavocytochrome, 2) serine/protein kinase, 3) zinc binding site, 4) zinc finger ring type, 5) C-type lectin, 6) nuclear antigen, 7) protease inhibitor, 8) amine oxidase, 9) titin-like, 10) guanine nucleotide binding protein, 11) periplasmic binding protein, 12) ZIP family transporter, 13) short chain dehydrogenase, 14) oligopeptide binding protein, 15) flavin-containing monooxygenase, and 16) hypothetical proteins. Most of the significantly differentially expressed genes coded for putative proteins, some with homology to proteins of unknown function in other organisms.

RNA Extraction & RNA Gel Electrophoresis

Total RNA was extracted in order to obtain Solexa sequencing profiles for I. galbana. A total of four sets of independent RNA extractions with four samples per set were performed, and two sets were rejected due to quality control issues. One extraction was for E. huxleyi and three were for I. galbana cells, grown and treated with different nutrient conditions. The E. huxleyi extraction was used to replenish the supply of RNA for RT-PCR. The best of the three sets of I. galbana extractions was sent to Beijing to be sequenced. The concentration and purity of the RNA samples were determined before sequencing (Table 3). The RNA concentrations ranged from 1697.7 to 3083.4 ng/µl, with total RNA extracted varying from 61.3 to 555 µg. The 260/280 ratios were generally within the range of high purity, from 1.7 to 2.1.

| Sample | Treatment | Nanodrop ng/µL | Nanodrop 260/280 | Nanodrop 260/230 | Experion™ 28S:18S | Experion™ RQI |
|---|---|---|---|---|---|---|
| Ehux 1 | 9 mM | 2065.2 | 2.09 | 2.20 | 1.34 | 9.1 |
| Ehux 1 | 9 mM + S | 2012 | 1.84 | 1.96 | 1.35 | 9.1 |
| Ehux 1 | 0 mM | 2077.3 | 2.12 | 2.23 | 1.2 | 8.9 |
| Ehux 1 | 0 mM + S | 2313 | 2.08 | 2.09 | 1.41 | 8.7 |
| Igal 1 | 9 mM | 2269.7 | 2.02 | 1.86 | 1.46 | 8.4 |
| Igal 1 | 9 mM + S | 2550.8 | 1.98 | 1.74 | 0.96 | 7.3 |
| Igal 1 | 0 mM | 1225.4 | 1.74 | 1.14 | 1.71 | 8.6 |
| Igal 1 | 0 mM + S | 2235.3 | 1.94 | 1.47 | 1.56 | 8.3 |
| *Igal 2 | 9 mM | 2263.4 | 1.89 | 1.52 | 1.43 | 8.3 |
| *Igal 2 | 9 mM + S | 3083.4 | 1.96 | 1.66 | 1.54 | 8.8 |
| *Igal 2 | 0 mM | 1697.7 | 1.96 | 1.6 | 1.55 | 9 |
| *Igal 2 | 0 mM + S | 2178.3 | 1.97 | 1.65 | 1.54 | 8.9 |
| Igal 3 | 9 mM | 2476.2 | 2.14 | 2.22 | 1.14 | 8.4 |
| Igal 3 | 9 mM + S | 2690.9 | 2.09 | 2.15 | 1.47 | 8.9 |
| Igal 3 | 0 mM | 2548.7 | 2.1 | 2.17 | 1.33 | 8.2 |
| Igal 3 | 0 mM + S | 2620.5 | 2.14 | 2.18 | 1.33 | 8.8 |

Table 3: RNA quantity, purity and quality analysis for a representative E. huxleyi extraction and all three I. galbana RNA extractions. * = samples sent to Beijing Genomics, S = addition of 20 mM NaHCO3, RQI = RNA Quality Index (see text).

Traditional RNA gels

RNA samples were subjected to denaturing agarose gel electrophoresis in order to check RNA integrity. This was done by visualizing the distinct RNA bands corresponding to the 28S and 18S ribosomal RNA subunits. The best set of I. galbana RNA samples and the E. huxleyi samples are presented in Figure 8. Distinct 28S and 18S rRNA bands were present for I. galbana and E. huxleyi in all four conditions in the RNA gel, along with other bands likely to be tRNA, chloroplast or mitochondrial RNA (Figure 8).

[Figure 8 gel images: panels A and B, each with lanes labeled 9, 9S, 0 and 0S and with the 28S and 18S bands marked.]

Figure 8: Agarose/EtBr gel image of I. galbana and E. huxleyi RNA extracts. A) I. galbana; B) E. huxleyi. Intact bands indicate integrity of the RNA samples from the four culture conditions (lane labels). Also note the prominent 28S and 18S ribosomal subunit bands. A volume equivalent to 1 µg total RNA according to the Nanodrop concentration was loaded into each well. Equal loading was assessed by examining the brightness of the bands. In panel A, lanes 1, 2 and 4 appear to have the same intensity, which indicates an equal amount of RNA. Lane 3, on the other hand, was brighter, which indicates that more RNA was loaded than in the rest of the lanes. In panel B, lanes 1-4 have the same intensity and an equal amount of loaded RNA. 9 = 9 mM Ca2+; 9S = 9 mM Ca2+ + 20 mM NaHCO3; 0 = 0 mM Ca2+; 0S = 0 mM Ca2+ + 20 mM NaHCO3.

Bio-Rad Experion® microfluidic chip

In addition to quantifying the 28S to 18S ratio, the Experion™ system outputs an RNA Quality Index (RQI) score for each sample (Table 3). A 28S:18S rRNA ratio close to 2 indicates high integrity, while the RQI score is derived from a comparison of the electropherogram of each sample against a series of electropherograms from standardized degraded RNA samples. The scores are scaled from 1 to 10, with 1 being totally degraded and 10 being fully intact (Bio-Rad tech note 5761). Experion™ software also produces ‘virtual gel’ images from the electropherograms, which are suitable for comparison to traditional RNA gels (Figure 9). Distinct 28S and 18S rRNA bands were present in all 4 conditions for all the sets of I. galbana and the set of E. huxleyi RNA extractions. The RQI ranged between 7.3 and 9.1 (Table 3).

Figure 9: Experion™ virtual gel image outputs from the RNA chip. A) Isochrysis galbana RNA samples; L = Ladder; lanes 1-4, extraction one (7 Sep 2010): 9 mM Ca2+, 9 mM Ca2+ + 20 mM HCO3-, 0 mM Ca2+, and 0 mM Ca2+ + 20 mM HCO3-, respectively; likewise lanes 5-8 for extraction two (6 Oct 2010) and lanes 9-12 for extraction three (9 Nov 2010). B) E. huxleyi RNA samples; L = Ladder; lanes 1-4 correspond to 9 mM Ca2+, 9 mM Ca2+ + 20 mM HCO3-, 0 mM Ca2+, and 0 mM Ca2+ + 20 mM HCO3-, respectively. Note the prominent bands for 28S and 18S rRNA.

Comparative Reverse Transcriptase Real-Time PCR

The total RNA samples of E. huxleyi were used for cDNA synthesis. These cDNA samples were used at a 1:25 dilution as template for real-time PCR analysis. Reactions for each primer set were run in triplicate along with a reference gene [353329] for normalization. The reference gene was selected from a set of genes that showed the least difference in expression. Each plate also included a template-free control.

Observed and expected product melting temperatures (Tm), as well as other primer characteristics, are shown in Appendix A. Most of the observed Tms approximated the expected values. The occasional variation from the expected Tm is likely to be a result of poor gene models or incorrectly called intron-exon junctions, allelic variants and/or single nucleotide polymorphisms.

Ct values were used to verify the statistical significance of differences in expression for the 25 genes selected from the RNAseq candidates and to determine the magnitude and ‘direction’ of the difference (up- or down-regulated). Reference-normalized Ct values were used as input for a set of pairwise t-tests comparing expression of genes in cells grown in: 1) 9 mM Ca2+ to 0 mM Ca2+, 2) 9 mM Ca2+ + NaHCO3 to 9 mM Ca2+, and 3) 0 mM Ca2+ + NaHCO3 to 0 mM Ca2+ (Table 4). Of the 25 genes identified as being significantly differentially expressed by RNAseq methodology (9 vs 0 mM Ca2+), real-time RT-PCR comparisons showed that 22 were validated by a 2-tailed t-test and Benjamini-Hochberg post hoc test (Table 5). In accordance with both RNAseq and real-time RT-PCR, nineteen were significantly up-regulated and three were significantly down-regulated in cells cultured in calcium-replete (9 mM Ca2+) versus calcium-deplete ASW.

Table 4: Magnitude and direction, plus statistical significance of differential expression for the 25 genes tested with Real-Time PCR (blue p-values are not significant).

| Protein ID | Description | 9 mM vs 0 mM fold difference | 9 mM vs 0 mM p-value | 9 mM + S vs 9 mM fold difference | 9 mM + S vs 9 mM p-value | 0 mM + S vs 0 mM fold difference | 0 mM + S vs 0 mM p-value |
|---|---|---|---|---|---|---|---|
| 456347 | Putative homology to GenBank EDQ66621.1 (Physcomitrella patens subsp. patens). Function to proteinase inhibitor | 119.8+ | 5.23E-08 | 0.8 | 0.03 | 52 | 2.81E-08 |
| 416147 | Putative homology to GenBank AAD22057.1 (Dictyostelium discoideum). Function to superoxide-generating NADPH oxidase flavocytochrome | 38.7+ | 6.15E-07 | 2 | 2.26E-05 | 24.1 | 3.25E-05 |
| 432839 | Putative homology to probable nuclear antigen | 33.3+ | 5.55E-06 | 0.7 | 0.01 | 2 | 0.00602 |
| 433416 | Putative to hypothetical protein | 24.8+ | 1.53E-05 | 1.9 | 1.70E-03 | 15.9 | 9.14E-05 |
| 195816 | Putative conserved unknown protein (Ectocarpus siliculosus)* | 20.0+ | 0.00011 | 2.3 | 3.16E-03 | 12.1 | 0.000125 |
| 353141 | Putative homology to hypothetical protein | 19.7+ | 7.07E-05 | 0.3 | 0.02 | 11.6 | 0.000144 |
| 356539 | Putative function is serine/protein kinase. | 14.5+ | 2.12E-05 | 10.1 | 1.17E-05 | 14.7 | 3.63E-05 |
| 205013 | Putative function is titin-like (Saccoglossus kowalevskii) | 13.7+ | 9.29E-05 | 0.3 | 0.03 | 5.4 | 7.94E-04 |
| 211653 | Putative function Peptidase M, neutral zinc metallopeptidases, zinc-binding site | 13.0+ | 6.10E-05 | 2.9 | 1.81E-03 | 5.5 | 5.02E-05 |
| 340390 | Putative homology to GenBank EDP01940.1 (Chlamydomonas reinhardtii). Function Zinc Finger ring | 12.7+ | 1.64E-07 | 4.8 | 1.78E-05 | 19.7 | 3.46E-07 |
| 370886 | Putative to hypothetical protein | 12+ | 5.50E-05 | 0.3 | 0.17 | 7.3 | 0.000362 |
| 107508 | Putative homology to GenBank EEN58705.1 (Branchiostoma floridae). Function to C-type lectin | 11.7+ | 0.000102 | 0.9 | 0.66 | 8.6 | 0.000157 |
| 456555 | Putative function to peptidase M, neutral zinc metallopeptidases, zinc-binding site | 10.1+ | 3.82E-06 | 4.9 | 0.16 | 5.4 | 2.09E-05 |
| 41308 | Putative homology to GenBank EDV25835.1 (Trichoplax adhaerens). | 5.1+ | 0.09593 | 1 | 0.66 | 2.3 | 0.235 |
| 366493 | Putative function guanine nucleotide binding protein (G-protein), alpha subunit (Sparus aurata) | 6.6+ | 3.20E-05 | 1 | 0.51 | 2.4 | 0.00016 |
| 218690 | Putative homology to GenBank EEC43129.1 (Phaeodactylum tricornutum)* | 4.2+ | 0.000225 | 0.3 | 2.66E-04 | 0.5 | 0.007966 |
| 438208 | Putative to hypothetical protein | 3.5+ | 1.62E-05 | 2.8 | 6.42E-05 | 6.2 | 3.83E-06 |
| 199401 | Putative homology to GenBank EEX33214.1 (Vibrio coralliilyticus). Function to amine oxidase | 3.5+ | 0.000756 | 0.11 | 2.46E-05 | 0.11 | 0.000514 |
| 366340 | Putative to hypothetical protein | 2.9+ | 0.000218 | 3.6 | 7.21E-05 | 2.1 | 0.00428 |
| 423164 | Putative to hypothetical protein | 1.8+ | 0.001701 | 0.3 | 3.32E-03 | 0.8 | 0.1108 |
| 67936 | Putative homology to GenBank EEE45627.1 (Labrenzia alexandrii). Function to short dehydrogenase* | -0.91 | 0.624 | -1.45 | 0.04 | 2 | 0.000115 |
| 113427 | Putative homology to GenBank EEY34986.1 (Leptotrichia goodfellowii). Function to oligopeptide binding protein* | -1.7 | 0.101 | 0.9 | 0.85 | 10.5 | 1.41E-05 |
| 120394 | Putative homology to GenBank ABP00071.1 (Ostreococcus lucimarinus). Function to ZIP family transporter* | -5.55 | 2.52E-05 | -3.17 | 1.13E-04 | 2.4 | 0.00016 |
| 413815 | Putative homology to GenBank EDP05199.1 (Chlamydomonas reinhardtii). Function to flavin-containing monooxygenase | -8.33 | 3.43E-06 | 4.3 | 4.74E-06 | 8.4 | 4.73E-05 |
| 54552 | Putative homology to GenBank ABF65666.1 (Ruegeria sp.). Function periplasmic binding protein* | -58.8 | 0.000115 | 4.3 | 5.35E-03 | 13.1 | 1.00E-05 |

Table 5: t-test and Benjamini-Hochberg scaled alpha/critical value results for RT-PCR comparing 9 mM to 0 mM without bicarbonate spike.

| Protein ID | Description | Two-sample t-test (2-tail) 9 vs 0 p-value | Benjamini-Hochberg rank | Mod p-crit | Result |
|---|---|---|---|---|---|
| 456347 | Putative homology to GenBank EDQ66621.1 (Physcomitrella patens subsp. patens). Function to proteinase inhibitor | 5.23E-08 | 25 | 0.002 | SIGNIFICANT |
| 340390 | Putative homology to GenBank EDP01940.1 (Chlamydomonas reinhardtii). Function Zinc Finger ring | 1.64E-07 | 24 | 0.00208333 | SIGNIFICANT |
| 416147 | Putative homology to GenBank AAD22057.1 (Dictyostelium discoideum). Function to superoxide-generating NADPH oxidase flavocytochrome | 6.15E-07 | 23 | 0.00217391 | SIGNIFICANT |
| 413815 | Putative homology to GenBank EDP05199.1 (Chlamydomonas reinhardtii). Function to flavin-containing monooxygenase | 3.43E-06 | 22 | 0.00227273 | SIGNIFICANT |
| 456555 | Putative function to peptidase M, neutral zinc metallopeptidases, zinc-binding site | 3.82E-06 | 21 | 0.00238095 | SIGNIFICANT |
| 432839 | Putative homology to probable nuclear antigen | 5.55E-06 | 20 | 0.0025 | SIGNIFICANT |
| 433416 | Putative to hypothetical protein | 1.53E-05 | 19 | 0.00263158 | SIGNIFICANT |
| 438208 | Putative to hypothetical protein | 1.62E-05 | 18 | 0.00277778 | SIGNIFICANT |
| 356539 | Putative function is serine/protein kinase. | 2.12E-05 | 17 | 0.00294118 | SIGNIFICANT |
| 120394 | Putative homology to GenBank ABP00071.1 (Ostreococcus lucimarinus). Function to ZIP family transporter* | 2.52E-05 | 16 | 0.003125 | SIGNIFICANT |
| 370886 | Putative to hypothetical protein | 5.50E-05 | 15 | 0.00333333 | SIGNIFICANT |
| 211653 | Putative function Peptidase M, neutral zinc metallopeptidases, zinc-binding site | 6.10E-05 | 14 | 0.00357143 | SIGNIFICANT |
| 353141 | Putative homology to hypothetical protein | 7.07E-05 | 13 | 0.00384615 | SIGNIFICANT |
| 366493 | Putative function guanine nucleotide binding protein (G-protein), alpha subunit (Sparus aurata) | 7.20E-05 | 12 | 0.00416667 | SIGNIFICANT |
| 205013 | Putative function is titin-like (Saccoglossus kowalevskii) | 9.29E-05 | 11 | 0.00454545 | SIGNIFICANT |
| 107508 | Putative homology to GenBank EEN58705.1 (Branchiostoma floridae). Function to C-type lectin | 0.0001016 | 10 | 0.005 | SIGNIFICANT |
| 195816 | Putative conserved unknown protein (Ectocarpus siliculosus)* | 0.0001098 | 9 | 0.00555556 | SIGNIFICANT |
| 54552 | Putative homology to GenBank ABF65666.1 (Ruegeria sp.). Function periplasmic binding protein* | 0.0001147 | 8 | 0.00625 | SIGNIFICANT |
| 366340 | Putative to hypothetical protein | 0.0002175 | 7 | 0.00714286 | SIGNIFICANT |
| 218690 | Putative homology to GenBank EEC43129.1 (Phaeodactylum tricornutum)* | 0.0002245 | 6 | 0.00833333 | SIGNIFICANT |
| 199401 | Putative homology to GenBank EEX33214.1 (Vibrio coralliilyticus). Function to amine oxidase | 0.000756 | 5 | 0.01 | SIGNIFICANT |
| 423164 | Putative to hypothetical protein | 0.001701 | 4 | 0.0125 | SIGNIFICANT |
| 41308 | Putative homology to GenBank EDV25835.1 (Trichoplax adhaerens). | 0.09593 | 3 | 0.01666667 | NOT |
| 113427 | Putative homology to GenBank EEY34986.1 (Leptotrichia goodfellowii). Function to oligopeptide binding protein* | 0.101 | 2 | 0.025 | NOT |
| 67936 | Putative homology to GenBank EEE45627.1 (Labrenzia alexandrii). Function to short dehydrogenase* | 0.624 | 1 | 0.05 | NOT |

A graphical comparison of the digital count ratios from RNAseq and the ‘fold difference’ data from real-time RT-PCR is shown in Figure 10. Ct values and digital counts were normalized to 0 mM Ca2+.

Figure 10: Validation of Solexa data using RT-PCR. A) Real-time RT-PCR gene expression of the 9, 9S and 0S mM Ca2+ treatments normalized to 0 mM Ca2+. B) Solexa digital count ratio of the 9, 9S and 0S mM Ca2+ treatments normalized to 0 mM Ca2+. Black arrows: discord between RNAseq and RT-PCR.

Further analysis of the congruence between the RNA sequencing and real-time RT-PCR data revealed that the direction of the real-time RT-PCR fold change and the RNAseq digital count ratio was the same for all but one of the 25 genes in the 9 mM Ca2+ vs 0 mM Ca2+ comparison (Table 6), consistent with the use of significant differences in the 9 mM vs 0 mM RNAseq data as the selection criterion for additional analysis using PCR. In eight instances of comparison between bicarbonate spike treatments, the direction of gene regulation inferred from the RNAseq data was contradicted by comparative real-time RT-PCR analysis.

Table 6: Relative expression pattern comparison between RNAseq and Real-Time RT-PCR analysis: Down = less expression of that gene in the listed sample; Up = more expression of that gene in the listed sample; Same = no difference in expression; None = count of zero; Bold = contradictory expression patterns.

| ID | Description | 9mM vs 0mM RNA | 9mM vs 0mM PCR | 9mM+S vs 9mM RNA | 9mM+S vs 9mM PCR | 0mM+S vs 0mM RNA | 0mM+S vs 0mM PCR |
|---|---|---|---|---|---|---|---|
| 199401 | Amine oxidase | Up | Up | Down | Down | Down | Down |
| 432839 | Nuclear antigen | Up | Up | Down | Up | Down | Up |
| 438208 | Hypothetical protein | Up | Up | Up | Up | Up | Up |
| 433416 | Hypothetical protein | Up | Up | Up | Up | Up | Up |
| 107508 | C-type lectin | Up | Up | Up | Same | Up | Up |
| 41308 | Hom. T. adhaerens hyp. prot. | Up | Up | Up | Same | Up | Up |
| 353141 | Hp/ Titin-like (fibronectin) | Up | Up | Up | Down | Up | Up |
| 218690 | Hom. P. tricornutum hyp. prot. | Up | Up | Down | Down | Down | Down |
| 370886 | Hypothetical protein | Up | Up | Up | Same | Up | Up |
| 423164 | Hypothetical protein | Up | Up | Down | Down | Down | Down |
| 340390 | Zinc Finger (RING type) | Up | Up | Up | Up | Down | Up |
| 456347 | Protease inhibitor/myosin like | Up | Up | Up | Down | Up | Up |
| 205013 | Titin-like multifibronectin | Up | Up | Up | Down | Up | Up |
| 356539 | Serine kinase | Up | Up | Up | Up | Down | Up |
| 211653 | Zinc metallopeptidase | Up | Up | Up | Up | Up | Up |
| 456555 | Zinc metallopeptidase | Up | Up | Up | Up | Up | Up |
| 416147 | NADPH oxidase | Up | Up | Up | Up | Up | Up |
| 195816 | Hom. E. siliculosus hyp. prot. | Up | Up | Up | Up | Up | Up |
| 366493 | Guanine binding α subunit | Up | Up | Up | Same | Up | Up |
| 366340 | Hypothetical protein | Up | Up | Same | Up | Same | Up |
| 120394 | ZIP family transporter | Down | Down | Up | Down | Up | Up |
| 413815 | Flavin-containing monooxygenase | Down | Down | Up | Up | Up | Up |
| 113427 | Oligopeptide binding protein | Down | Down | Up | Same | Up | Up |
| 67936 | Short dehydrogenase | Down | Same | None | Down | Up | Up |
| 54552 | Periplasmic binding protein | Down | Down | Up | Up | Up | Up |

Of the 25 genes identified via analysis of RNAseq data as being significantly differentially expressed, 19/20 up-regulated (95%) and 3/5 down-regulated (60%) genes were found to be significantly different according to analysis of real-time RT-PCR data. The lower rate of detection for down-regulated genes may be due to their low expression levels, which are masked by variability in the data. In addition to verifying the significance of the difference in expression, comparative real-time RT-PCR analysis was used to confirm the regulation pattern (up- vs down-regulated) seen in the RNAseq data. The pattern was consistent 96% of the time for the comparison between calcium-replete and calcium-deplete conditions, with the only exception arising from a lack of significant difference in the real-time RT-PCR data for the short chain dehydrogenase (67936), which appears to be expressed at a very low level judging from its digital counts. For the bicarbonate spike treatment, which is expected to enhance biomineralization, results were mixed, with 14% of the comparisons being contradictory and an additional 18% being statistically inconclusive among all of the groups, for an overall verification rate of 68%.
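The 96% agreement figure for the calcium-replete versus calcium-deplete comparison can be reproduced from the direction calls in Table 6 with a short Python sketch. The function name `classify` and the category labels are hypothetical; the list of calls below encodes only the 9 mM vs 0 mM columns of Table 6 (20 Up/Up, 4 Down/Down and 1 Down/Same).

```python
from collections import Counter


def classify(rnaseq_call, pcr_call):
    """Compare the direction of regulation called by RNAseq and by RT-PCR."""
    if "Same" in (rnaseq_call, pcr_call) or "None" in (rnaseq_call, pcr_call):
        return "inconclusive"
    return "agree" if rnaseq_call == pcr_call else "contradict"


# Direction calls (RNAseq, PCR) for the 25 genes, 9 mM vs 0 mM comparison.
calls = [("Up", "Up")] * 20 + [("Down", "Down")] * 4 + [("Down", "Same")]

tally = Counter(classify(rna, pcr) for rna, pcr in calls)
percent_agree = 100 * tally["agree"] / len(calls)
print(dict(tally), percent_agree)
```

The same function could be applied to the bicarbonate spike columns, where "contradict" and "inconclusive" calls both occur.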

The most significantly differentially expressed gene among the 25 transcripts tested codes for a protease inhibitor (456347). When cells were grown in calcifying vs non-calcifying conditions, this protease inhibitor was up-regulated more than 100-fold. Among the remaining differentially expressed transcripts, 13 exhibited a change of between 10- and 100-fold, and 8 of between 1- and 10-fold.

Annotation

Cyclophilins belong to a group of proteins with peptidyl-prolyl cis-trans isomerase activity. These enzymes catalyze the isomerization of peptide bonds from the trans form to the cis form at proline residues to facilitate protein folding. In plants cyclophilins also bind to cyclosporine and immunosuppressants and show mild antifungal and antibiotic activity (Fruman et al. 1994; Romano et al. 2005; Wang and Heitman. 2005; Fu et al. 2007; Perez and Weiz. 2008). In the sea urchin, cyclophilins have been implicated as candidate biomineralization proteins (Livingston et al. 2006). In E. huxleyi (http://genome.jgi-psf.org/Emihu1/Emihu1.home.html), cyclophilins are a rather large gene family with over 89 predicted members. Of these, 25 were selected for manual annotation. Selection was based on: 1) amino acid sequence similarity across species and 2) the presence of a well-defined peptidyl-prolyl cis-trans isomerase domain as defined by the InterPro database (Table 7). Seven of the 25 annotated genes were found to be allelic variants.

Table 7: Manually Annotated Cyclophilins. A new protein ID is assigned to corrected uploaded gene models. The Domain column reads "Peptidyl-prolyl cis-trans isomerase" for all entries. Deflines are shown on the row where they begin in the original table; blank defline cells follow the layout of the original.

| Protein ID (Old/New) | Gene Name | Defline |
|---|---|---|
| 316904/557962 | CYPA4 | Cyclophilin-type, putative cyclophilin isoform A involved in protein folding, posttranslational modification. |
| 451960/557972 | CYP1 | Putative cyclophilin-like protein involved in protein folding, posttranslational modification; Sorting and Degradation; Ubiquitin mediated proteolysis [PATH:ath04120] |
| 64857 | CYP2 | Putative rotamase and protein folding, cyclophilin type |
| 78487 | CYP2' | |
| 422936/557975 | CYP3 | Putative protein folding, RNA interacting protein and posttranscriptional processes |
| 363448/557976 | CYP4 | Putative isomerase F, putative cyclophilin protein involved in protein folding, posttranslational modification |
| 69317 | CYP6 | Putative cyclophilin protein involved in protein folding, posttranslational modification; Putative isomerase A |
| 73735 | CYP7 | |
| 369290 | CYP4' | |
| 441261 | CYP5 | |
| 75296 | CYP5' | |
| 416224 | CYP8 | |
| 419089 | CYP8' | |
| 205479/558120 | CYP9 | |
| 457808/558133 | CYP10 | Putative cyclophilin involved in protein folding and posttranslational modification |
| 450103 | CYP10' | |
| 363223 | CYP12 | |
| 373193 | CYP12' | |
| 365117/558163 | CYP14 | |
| 468248 | CYP15 | |
| 108238 | CYP16 | |
| 56686/558170 | CYP17 | |
| 315160/558154 | CYP11 | Putative cyclophilin (rotamase) involved in protein folding and posttranslational modification |
| 78889 | CYP11' | |
| 309909 | CYP13 | Hypothetical protein; Putative cyclophilin involved in protein folding and posttranslational modification |

Most of the cyclophilin gene models were largely accurate. In some instances, however, the intron/exon junctions were predicted incorrectly, resulting in the insertion or deletion of between 3 and 67 base pairs (Table 8). In a couple of cases exons were skipped, and in other cases the start codon was incorrectly predicted or absent. Manual annotation of cyclophilins required a variety of bioinformatic tools. NCBI blastp, accessed through the JGI website, provided phylogenetic relationships or the closest neighbor. The ExPASy translation tool was used to rectify inserted or deleted base pairs. ClustalW multiple sequence alignment was used to locate intron/exon junctions and to examine sequence conservation and homology. After manual curation of the genes, 9 out of the 25 gene models were uploaded to the gene catalogue and the annotation page was completed by assigning a gene name, defline and description, and by indicating the presence or absence of EST evidence.

Characteristics of cyclophilin gene models are shown in Table 9A and Table 9B. The average bit score was 640, with the highest score at 1177 and the lowest at 375, indicating that cyclophilins are highly conserved across species. In all but two cases an identifiable ATG start codon was detected. The termination codon, on the other hand, varied: 40% of the cyclophilin gene models use TAG, 48% use TGA, 4% use TAA, and for 8% a stop codon could not be identified. The closest neighbor for 20 of the annotated cyclophilin proteins was some form of algae. Cyclophilin proteins ranged in length from 137 to 551 amino acids, with an average length of 241 amino acids. The average length of all of the JGI predicted proteins in E. huxleyi was 259 amino acids. The identity score for the annotated cyclophilins ranged from 37.5% to 79.5% with an average of 57.4%, while amino acid sequence homology ranged from 17.7% to 84.6% with an average of 63.1%. The number of introns varied from 0 to 9, with an average of 3. Of the exon/intron junctions, two thirds were canonical junctions with GT and AG at the 5' and 3' ends, respectively. The average GC content of the protein coding region was 66.22%, which mirrors that of the entire genome. The average GC content of introns, however, was slightly higher at 74.18%.
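The stop codon percentages quoted above can be reproduced by tallying the Stop column of Table 9A. The short Python sketch below does this; the dictionary is transcribed from Table 9A, while the function name is an illustrative assumption and not part of the annotation pipeline used in this study.

```python
from collections import Counter

# Stop codons transcribed from the Stop column of Table 9A, one per gene model.
# None marks the two models (CYP14, CYP17) with no identifiable stop codon.
stop_codons = {
    "CYP1": "TGA", "CYP2": "TAG", "CYP2'": "TAG", "CYP3": "TGA",
    "CYP4": "TAG", "CYP4'": "TAG", "CYP5": "TAG", "CYP5'": "TAG",
    "CYP6": "TGA", "CYP7": "TGA", "CYP8": "TGA", "CYP8'": "TGA",
    "CYP9": "TAA", "CYP10": "TAG", "CYP10'": "TGA", "CYP11": "TGA",
    "CYP11'": "TGA", "CYP12": "TGA", "CYP12'": "TGA", "CYP13": "TAG",
    "CYP14": None, "CYP15": "TAG", "CYP16": "TAG", "CYP17": None,
    "CYPA4": "TGA",
}


def stop_codon_usage(codons):
    """Return the percentage of gene models using each stop codon."""
    counts = Counter(codons.values())
    total = len(codons)
    return {codon: round(100 * n / total) for codon, n in counts.items()}


usage = stop_codon_usage(stop_codons)
for codon, percent in usage.items():
    print(codon or "not identified", f"{percent}%")
```

With 25 gene models, each model contributes 4 percentage points, which is why all four reported values are multiples of 4.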

A cladogram from the clustalΩ alignment of the 18 unique cyclophilins

is shown in Figure 11. Notable in this graphic is a division into two major

clades, one that has far less recent branching than the other. In comparison

to other species, cyclophilins have branched into 2 clades in the red alga

Griffithsia japonica (Vallon 2005). However, 3 clades have been seen in the

unicellular alga Chlamydomonas reinhardtii and the vascular plant Arabidopsis

thaliana (Lee et al. 2001; Ramano et al. 2004).

A percent identity matrix for the gap-excluded ClustalΩ alignment is shown in Figure 12. One point of interest is that CYP6 and CYP7 are 100% identical when gaps are excluded. It is possible that CYP6 and CYP7 are allelic variants. Another point of interest is that CYP15 is the most distant cyclophilin in both the cladogram and the percent identity matrix (PIM).
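To make the notion of gap-excluded identity concrete, the following Python sketch computes percent identity between two rows of an alignment while skipping every column in which either sequence has a gap. It is a minimal illustration under stated assumptions: the function name and the toy sequences are hypothetical, and the exact procedure used by the Clustal tools may differ in detail.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap-excluded: this column is not scored at all
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment rows, not real cyclophilin sequences.
seq_x = "MVNP-RVFFDI"
seq_y = "MVNPKRVYFD-"
print(round(percent_identity(seq_x, seq_y), 1))

# Two sequences that differ only by an insertion still score 100%.
print(percent_identity("AC-GT", "ACTGT"))
```

The second call shows how two sequences of different length can be reported as 100% identical when gaps are excluded.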

Clustal Omega (ClustalΩ) alignments of the 18 cyclophilins exclusive of the

allelic variants:

Figure 11: Phylogram generated by Clustal2 (neighbor joining with distance correction and gap exclusion) for the 18 cyclophilins exclusive of the allelic variants based on a ClustalΩ alignment, both via the European Bioinformatics Institute http://www.ebi.ac.uk.


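Figure 12: Percent identity matrix for the gap-excluded alignment. Colors indicate how close sequences are to each other: green = high identity, yellow = intermediate identity, red = low identity.

Gene | CYP1 | CYP2 | CYP3 | CYP4 | CYP5 | CYP6 | CYP7 | CYP8 | CYP9 | CYP10 | CYP11 | CYP12 | CYP13 | CYP14 | CYP15 | CYP16 | CYP17 | CYPA4
CYP1 | 100 | 43 | 36 | 42 | 40 | 40 | 40 | 43 | 34 | 35 | 39 | 36 | 39 | 38 | 25 | 30 | 42 | 40
CYP2 | 43 | 100 | 37 | 40 | 39 | 37 | 37 | 35 | 43 | 42 | 41 | 39 | 39 | 44 | 25 | 35 | 38 | 37
CYP3 | 36 | 37 | 100 | 35 | 34 | 37 | 37 | 31 | 34 | 35 | 35 | 31 | 37 | 36 | 25 | 24 | 35 | 40
CYP4 | 42 | 40 | 35 | 100 | 93 | 77 | 77 | 36 | 40 | 30 | 52 | 52 | 62 | 62 | 23 | 39 | 63 | 74
CYP5 | 40 | 39 | 34 | 93 | 100 | 78 | 78 | 34 | 39 | 29 | 55 | 50 | 64 | 62 | 22 | 37 | 62 | 70
CYP6 | 40 | 37 | 37 | 77 | 78 | 100 | 100 | 38 | 38 | 29 | 53 | 46 | 61 | 63 | 25 | 35 | 61 | 66
CYP7 | 40 | 37 | 37 | 77 | 78 | 100 | 100 | 38 | 38 | 29 | 53 | 46 | 61 | 63 | 25 | 35 | 61 | 66
CYP8 | 43 | 35 | 31 | 36 | 34 | 38 | 38 | 100 | 40 | 31 | 35 | 35 | 36 | 38 | 35 | 60 | 39 | 41
CYP9 | 34 | 43 | 34 | 40 | 39 | 38 | 38 | 40 | 100 | 31 | 41 | 34 | 39 | 43 | 24 | 38 | 38 | 37
CYP10 | 35 | 42 | 35 | 30 | 29 | 29 | 29 | 31 | 31 | 100 | 29 | 31 | 31 | 36 | 26 | 27 | 30 | 31
CYP11 | 39 | 41 | 35 | 52 | 55 | 53 | 53 | 35 | 41 | 29 | 100 | 47 | 55 | 53 | 22 | 33 | 48 | 49
CYP12 | 36 | 39 | 31 | 52 | 50 | 46 | 46 | 35 | 34 | 31 | 47 | 100 | 49 | 46 | 23 | 31 | 52 | 47
CYP13 | 39 | 39 | 37 | 62 | 64 | 61 | 61 | 36 | 39 | 31 | 55 | 49 | 100 | 54 | 23 | 39 | 59 | 65
CYP14 | 38 | 44 | 36 | 62 | 62 | 63 | 63 | 38 | 43 | 36 | 53 | 46 | 54 | 100 | 25 | 37 | 54 | 56
CYP15 | 25 | 25 | 25 | 23 | 22 | 25 | 25 | 35 | 24 | 26 | 22 | 23 | 23 | 25 | 100 | 24 | 28 | 25
CYP16 | 30 | 35 | 24 | 39 | 37 | 35 | 35 | 60 | 38 | 27 | 33 | 31 | 39 | 37 | 24 | 100 | 36 | 37
CYP17 | 42 | 38 | 35 | 63 | 62 | 61 | 61 | 39 | 38 | 30 | 48 | 52 | 59 | 54 | 28 | 36 | 100 | 63
CYPA4 | 40 | 37 | 40 | 74 | 70 | 66 | 66 | 41 | 37 | 31 | 49 | 47 | 65 | 56 | 25 | 37 | 63 | 100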


Table 8: Problems in cyclophilin gene models.

Gene name | Protein ID | Best Model | Description of Annotated Errors
316904 | 557962 | fgenesh_newKGs_pm.592__2 | Start codon predicted incorrectly; Exon 1 skipped; 12 bp at 3' end not predicted
CYP1 | 557972 | estExtDG_Genemark1.C_530087 | Additional 6 bp at 3' end of exon 2 predicted incorrectly; Additional 3 bp at 5' end of exon 3 predicted incorrectly
CYP3 | 557975 | estExtDG_fgenesh_newKGs_pm.C_1040003 | 1 bp at 3' end of exon 2 not predicted; additional 4 bp at 5' end of exon 3 predicted incorrectly
CYP4 | 363448 | fgenesh_newKGs_kg.148__31__2695551:1 | Additional exon predicted (exon 2) incorrectly
CYP9 | 205479 | Gm1.2400171 | 13 bp at 3' end of exon 2 not predicted; additional 22 bp at 5' end of exon 3
CYP10 | 457808 | estExtDG_Genemark1.C_2760015 | 9 bp at 3' end of exon 3 not predicted
CYP11 | 315160 | fgenesh_newKGs_pm.240__7 | Additional exon predicted (exon 2) incorrectly
CYP11' | 78889 | E_gw1.10914.4.1 | No start codon
CYP14 | 365117 | fgenesh_newKGs_kg.179__30__EVC00354_Peptidyl-prolyl | Start codon predicted incorrectly; Exon 1, Exon 2 and Exon 3 skipped
CYP17 | 56686 | Gw1.125.57.1 | additional 38 bp at 3' end of exon 1 predicted incorrectly; additional 16 bp at 5' end of exon 2 predicted incorrectly; additional 67 bp at 3' end of exon 2 predicted incorrectly; 15 bp at 3' end of exon 3 not predicted; additional 21 bp at 5' end of exon 4; No start & stop codon


Table 9A: Characteristics of Cyclophilin gene models

Protein ID (Old/New) | Gene Name | Description | Bits Score | Identity Score | % Homology | Start | Stop | Average Protein Length (a.a.) | Length of Present Models in JGI Website | Difference of Average Length of E. huxleyi model vs average Model length in JGI website | Closest neighbor in a tree (taxa with class info)
451960/557972 | CYP1 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 1177 | 45.4 | 64.1 | ATG | TGA | 552 | 534 | 18 | Apicomplexans (Protist)
64857 | CYP2 | peptidyl-prolyl cis-trans isomerase | 581 | 63 | 73.1 | ATG | TAG | 168 | 163 | 5 | Green Algae
78487 | CYP2' | peptidyl-prolyl cis-trans isomerase | 581 | 63 | 77.2 | ATG | TAG | 168 | 162 | 6 | Green Algae
422936/557975 | CYP3 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 1017 | 45.5 | 55.4 | ATG | TGA | 460 | 484 | -24 | Green Algae
363448/557976 | CYP4 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 675 | 76 | 81.2 | ATG | TAG | 165 | 179 | -14 | Aureococcus anophagefferens (algae)
369290 | CYP4' | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 679 | 77 | 81.1 | ATG | TAG | 165 | 171 | -6 | Aureococcus anophagefferens (algae)
441261 | CYP5 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 745 | 79 | 83.1 | ATG | TAG | 173 | 172 | 1 | Pelagophytes (micro algae)
75296 | CYP5' | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 753 | 79.5 | 83.2 | ATG | TAG | 173 | 174 | -1 | Pelagophytes (micro algae)
69317 | CYP6 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 728 | 71.5 | 80.6 | ATG | TGA | 176 | 173 | 3 | Green Algae/Diatoms
73735 | CYP7 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 728 | 71.5 | 80.6 | ATG | TGA | 176 | 173 | 3 | Green Algae/Diatoms
416224 | CYP8 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 608 | 42.9 | 36.8 | ATG | TGA | 175 | 182 | -7 | Pelagophytes (micro algae)
419089 | CYP8' | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 609 | 43.1 | 36.8 | ATG | TGA | 175 | 182 | -7 | Pelagophytes (micro algae)
205479/558120 | CYP9 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 499 | 52.1 | 31.5 | ATG | TAA | 328 | 169 | 159 | Oikopleura dioica (sea-squirts)
457808/558133 | CYP10 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 606 | 49.1 | 70.2 | ATG | TAG | 238 | 465 | -227 | Aureococcus anophagefferens (algae)
450103 | CYP10' | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 666 | 42.3 | 46.6 | ATG | TGA | 293 | 459 | -166 | Aureococcus anophagefferens (algae)
315160/558154 | CYP11 | peptidyl-prolyl cis-trans isomerase, cyclophilin type; Signal peptide | 459 | 51.7 | 66 | ATG | TGA | 177 | 173 | 4 | Monocots
78889 | CYP11' | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 375 | 52.2 | 64.7 | none | TGA | 137 | 172 | -35 | Monosiga brevicollis/green algae
363223 | CYP12 | peptidyl-prolyl cis-trans isomerase, cyclophilin type; Signal peptide | 499 | 38.1 | 37.5 | ATG | TGA | 241 | 536 | -295 | Aureococcus anophagefferens (algae)
373193 | CYP12' | peptidyl-prolyl cis-trans isomerase, cyclophilin type; Signal peptide | 499 | 38.1 | 37.5 | ATG | TGA | 241 | 536 | -295 | Aureococcus anophagefferens (algae)
309909 | CYP13 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 570 | 65 | 64.5 | ATG | TAG | 173 | 261 | -88 | Green Algae
365117/558163 | CYP14 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 517 | 69.6 | 84.6 | ATG | none | 165 | 182 | -17 | Ectocarpus siliculosus (brown algae)
468248 | CYP15 | peptidyl-prolyl cis-trans isomerase, cyclophilin type; Signal peptide | 487 | 38.2 | 67.1 | ATG | TAG | 244 | 235 | 9 | Green Algae
108238 | CYP16 | peptidyl-prolyl cis-trans isomerase, cyclophilin type; Signal peptide | 713 | 45.5 | 17.7 | ATG | TAG | 391 | 173 | 218 | Pelagophytes (micro algae)
56686/558170 | CYP17 | peptidyl-prolyl cis-trans isomerase, cyclophilin type | 591 | 66.1 | 81.3 | none | none | 171 | 214 | -43 | Perkinsus marinus (protozoan parasite)
316904/557962 | CYPA4 | peptidyl-prolyl cis-trans isomerase, cyclophilin A; This isomerase activity has been hypothesized to be important during the stress response of organisms | 616 | 70.6 | 76.1 | ATG | TGA | 165 | 168 | -3 | Thalassiosira pseudonana (diatom)


Table 9B: Characteristics of cyclophilin gene models in E. huxleyi

Protein ID (Old/New) | Gene Name | # of Introns | Shortest Intron (a.a) | Longest Intron (a.a) | Average Intron (a.a) | # of Exons | Shortest Exon (a.a) | Longest Exon (a.a) | Average Exon (a.a) | Number of Intron/Exon Junctions | Canonical Junctions | Noncanonical Junctions | Intron G+C content | Coding G+C content
451960/557972 | CYP1 | 3 | 24 | 99 | 28 | 4 | 34 | 267 | 138 | 3 | 1 | 1 | 78.70% | 67.80%
64857 | CYP2 | 1 | 75 | 75 | 75 | 2 | 73 | 95 | 84 | 1 | 1 | 0 | 71.70% | 65.50%
78487 | CYP2' | 1 | 75 | 75 | 75 | 2 | 73 | 95 | 84 | 1 | 1 | 0 | 71.70% | 65.50%
422936/557975 | CYP3 | 4 | 23 | 75 | 45 | 5 | 39 | 208 | 92 | 4 | 3 | 0 | 70.20% | 70.10%
363448/557976 | CYP4 | 2 | 25 | 54 | 40 | 3 | 7 | 135 | 55 | 2 | 2 | 0 | 78.20% | 65.30%
369290 | CYP4' | 2 | 22 | 57 | 40 | 3 | 7 | 135 | 55 | 2 | 2 | 0 | 78.60% | 65.70%
441261 | CYP5 | 2 | 33 | 55 | 44 | 3 | 8 | 142 | 58 | 2 | 2 | 0 | 75.00% | 65.70%
75296 | CYP5' | 2 | 33 | 55 | 44 | 3 | 8 | 142 | 58 | 2 | 2 | 0 | 74.20% | 65.90%
69317 | CYP6 | 0 | 0 | 0 | 0 | 1 | 176 | 176 | 176 | 0 | 0 | 0 | 0% | 63.30%
73735 | CYP7 | 0 | 0 | 0 | 0 | 1 | 176 | 176 | 176 | 0 | 0 | 0 | 0% | 63.30%
416224 | CYP8 | 2 | 30 | 31 | 30.5 | 3 | 21 | 112 | 58 | 2 | 1 | 1 | 67.20% | 64.40%
419089 | CYP8' | 2 | 30 | 31 | 30.5 | 3 | 21 | 112 | 58 | 2 | 1 | 1 | 67.20% | 64.00%
205479/558120 | CYP9 | 2 | 96 | 552 | 324 | 3 | 23 | 179 | 109 | 2 | 1 | 0 | 75.50% | 68.20%
457808/558133 | CYP10 | 7 | 21 | 403 | 102 | 8 | 3 | 98 | 29.75 | 7 | 5 | 1 | 68.88% | 65.13%
450103 | CYP10' | 6 | 23 | 91 | 52 | 7 | 3 | 98 | 42 | 6 | 4 | 1 | 70.24% | 65.30%
315160/558154 | CYP11 | 1 | 796 | 796 | 796 | 2 | 5 | 172 | 88.5 | 1 | 0 | 1 | 66.58% | 69.87%
78889 | CYP11' | 0 | 0 | 0 | 0 | 1 | 137 | 137 | 137 | 0 | 0 | 0 | 0% | 69.83%
363223 | CYP12 | 0 | 0 | 0 | 0 | 1 | 32 | 241 | 241 | 0 | 0 | 0 | 0% | 66.67%
373193 | CYP12' | 0 | 0 | 0 | 0 | 1 | 241 | 241 | 241 | 0 | 0 | 0 | 0% | 66.67%
309909 | CYP13 | 1 | 32 | 32 | 32 | 2 | 47 | 126 | 87 | 1 | 1 | 0 | 78.95% | 68.40%
365117/558163 | CYP14 | 9 | 8 | 154 | 65.43 | 10 | 5 | 30 | 16.5 | 9 | 1 | 6 | 69.04% | 62.22%
468248 | CYP15 | 5 | 22 | 87 | 47 | 6 | 18 | 109 | 41 | 5 | 2 | 3 | 62.86% | 65.44%
108238 | CYP16 | 4 | 12 | 478 | 149.4 | 5 | 8 | 175 | 78.33 | 4 | 4 | 0 | 66.54% | 66.92%
56686/558170 | CYP17 | 4 | 22 | 33 | 26 | 5 | 23 | 44 | 34.33 | 4 | 1 | 2 | 66.67% | 60.23%
316904/557962 | CYPA4 | 4 | 24 | 43 | 31.23 | 5 | 5 | 67 | 33 | 4 | 2 | 1 | 69.65% | 64.64%
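G+C percentages such as those in Table 9B are obtained by counting G and C bases over the sequence of interest. The Python sketch below shows one common way to do this; the function name and the toy sequence are illustrative assumptions, and this is not necessarily the tool used to produce the values in this study.

```python
def gc_content(sequence):
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted, so ambiguity
    codes such as N do not distort the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical toy coding sequence, not taken from the E. huxleyi genome.
toy_cds = "ATGGCGCGCTGA"
print(round(gc_content(toy_cds), 2))
```

Returning 0.0 for an empty input mirrors the 0% entries that Table 9B reports for gene models without introns.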


DISCUSSION

The molecular underpinnings of coccolith biomineralization remain

elusive. Previous work aimed at identifying genes involved in biomineralization used EST expression profiles, suppressive subtractive hybridization (SSH), cDNA microarrays, and real time RT-PCR (Wahlund et al. 2004; Nguyen et al. 2005; Quinn et al. 2006). The microarray results of Quinn et al. (2006) were the most similar to this study, although the microarray screened far fewer sequences than the RNAseq data. The microarray data reported 127 unique differentially expressed sequences. Of these, 82 sequences were tested with real time RT-PCR. Of those 82, only 46 (38 up- and 8 down-regulated) had a concordant direction of differential expression, for a single-treatment validation rate of 56%. In this study, the RNAseq analysis identified a total of 123 genes significantly affected by the 9 mM Ca2+ vs 0 mM Ca2+ treatment, with 24 out of 25 genes concordant in direction according to real time RT-PCR, for a single-treatment validation rate of 96%.
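The two validation rates are simply the fraction of tested genes whose direction of change agreed between the profiling method and real time RT-PCR. The arithmetic can be checked with the short Python sketch below; the function name is an illustrative assumption, and the counts are those quoted in the text.

```python
def validation_rate(concordant, tested):
    """Percentage of tested genes with a concordant direction of expression."""
    if tested <= 0:
        raise ValueError("at least one gene must be tested")
    return 100.0 * concordant / tested


# Counts quoted in the text for the microarray study and for this study.
microarray_rate = validation_rate(46, 82)
rnaseq_rate = validation_rate(24, 25)

print(f"microarray: {microarray_rate:.0f}%")
print(f"RNAseq: {rnaseq_rate:.0f}%")
```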

Additionally, the available GenBank sequences from the microarray

study were searched with BLAST against the E. huxleyi genome to generate a list of hits to the top three models for each sequence. None of the resulting 95 protein

IDs were among those identified in this study (Appendix Q).


In Nguyen et al. 2005, SSH was used to identify E. huxleyi genes

expressed when cells were grown in phosphate-limited vs phosphate-replete

media. A total of 168 gene sequences were differentially expressed. Of these

transcripts, 63 were specific to the phosphate-limited condition, while 105 were

specific to the phosphate-replete condition. Furthermore, the effectiveness of

SSH was demonstrated by the validation of the specific expression of a

subset of 8 genes using real time RT-PCR, tested over a 14 day time

course. It was not feasible to compare the genes identified via SSH with

those obtained in this study.

In Wahlund et al. 2004, EST profiles of E. huxleyi grown in

phosphate-limiting and phosphate-replete media were generated. The outcome

of the EST study was a unigene set of 4,057 unique sequences.

Additionally, over 40% of those expressed genes had similarity to known

genes from other organisms and were functionally categorized.

Other studies that have advanced knowledge of the biomineralization process in E. huxleyi include Westbroek et al. 1984, Leonardos et al. 2009 and Langer et al. 2010. Westbroek et al. (1984) proposed a working hypothesis of coccolith formation in E. huxleyi centered on a complex polysaccharide that is associated with calcium carbonate. This coccolith polysaccharide contains a mannose backbone with highly branched side chains. The side chains are composed of at least 13 different monosaccharides which are methylated, dimethylated or contain carboxyl or sulphate ester groups.

With a polysaccharide staining technique, the polysaccharide of interest was located in 4 places in the c.v. – r.b. complex: 1) the outline of the membrane, 2) fine threads in the lumen, 3) the base plate and 4) a thin film surrounding the calcium carbonate crystals. The polysaccharide is believed to be anchored in the membranes of the c.v. – r.b. complex and to extend into the lumen before and at the early stages of coccolith formation. Eventually, the lumen becomes dilated (forms space) and the first crystallites are formed along the rim of the base plate. The coccolith continues to grow by the coordinated pull of the cytoskeleton on the coccolith vesicle, which causes further dilation.

Eventually, the polysaccharide adheres to the surface of the growing crystals, detaches from the membrane and forms a protective cover on the coccolith.

The polysaccharide has three proposed functions: 1) inhibiting crystallization when anchored in the membrane of a narrow vesicle, 2) inducing crystal nucleation and formation when the vesicle becomes dilated and 3) completely inhibiting further growth once the coccolith is formed, by detaching from the membrane and covering the crystal surface.


The 126 genes identified as up-regulated under calcification promoting conditions in this study consisted mostly of polysaccharide-associated proteins, such as proteoglycans from the aggrecan family. The aggrecan family is a class of extracellular matrix proteoglycans used in organizing collagen fibers in the biomineralization process (Arias & Fernandez 2008). They have small leucine-rich repeats in their core protein, hence the name small leucine-rich proteoglycans (SLRP), and highly negatively charged linear polysaccharide side chains called glycosaminoglycans (GAGs).

In Langer et al. 2010, components of the cytoskeleton (actin microfilaments and microtubules) were tested to see if they were involved in coccolith morphogenesis. Different concentrations of a microtubule inhibitor (colchicine) and an actin inhibitor (cytochalasin B) were added to E. huxleyi cultures, and samples were analyzed by SEM to determine the morphology of the coccoliths. At the higher concentrations of both inhibitors, more malformed coccoliths were seen. It was concluded that colchicine and cytochalasin B disrupted the microtubules and actin microfilaments involved in shaping the coccoliths.

Furthermore, lectins and cytoskeleton components were present in the list of up-regulated genes in this study. Three C-type lectins and several actin cross-linking and bundling transcripts, as well as myosin and two titin-like proteins with multiple fibronectin domains, were among the genes that were up-regulated under conditions that favor biomineralization. The C-type lectins encode homologues of ovocleidin, lithostathine and aggrecan, which interact with calcite crystals to either inhibit or promote nucleation and growth, while the cytoskeleton exerts the force necessary for the mechanical control of the precise shaping of the coccolith crystals.

Many of the genes found in the up-regulated list make biological sense in terms of biomineralization involvement because other polysaccharides, such as PS-2 and PS-3, and cytoskeleton elements have been found to be involved in biomineralization (Arias and Fernandez 2008; Langer et al. 2010). PS-2 facilitates crystal nucleation and PS-3 is thought to control calcite growth and morphology. It is therefore not surprising to see genes encoding polysaccharide-associated proteins or cytoskeleton elements on the list. C-type lectins also play an important role in biomineralization of the purple sea urchin’s spicule matrix (Livingston et al. 2006; Mann et al. 2010). Since C-type lectin domains have been found in spicule matrix proteins of the purple sea urchin, it is not surprising to see a few on the list of up-regulated genes for E. huxleyi.

Leonardos et al. 2009 addressed long-term adaptive responses to a range of [Ca2+] to see if calcification was linked to photosynthesis. E. huxleyi


cultures were grown in ASW under various Ca2+ concentrations (9 mM, 5

mM, 3 mM, 1 mM and 0 mM Ca2+) and in varying light levels.

Photosynthesis measurements were monitored using an oxygen electrode and

pigment contents were quantified by high performance liquid chromatography

(HPLC) analysis. SEM images were also used to analyze the structures of

the coccoliths. Normal well formed coccoliths were seen at 9 mM Ca2+.

Whole intact coccoliths, as well as intact coccoliths with an increase of slits in the proximal shield were observed in 5 mM Ca2+. At 3 mM Ca2+

coccoliths had broken fragments and were less calcified. Thinner and

undercalcified coccoliths were seen in 1 mM Ca2+, as well as disrupted

coccospheres and few intact coccoliths. Lastly, at 0 mM Ca2+ few coccolith

fragments were seen, as well as naked and collapsed cells. A similar

relationship between photosynthetic performance and light levels was seen in

all five Ca2+ treatments. The pigment data showed that pigment content was light

dependent and not affected by varying [Ca2+]. Leonardos et al. 2009

concluded calcification and photosynthesis are independent processes.

The main goal of this project was to obtain a more thorough

understanding of coccolith formation (biomineralization) in Isochrysidales by

identifying genes potentially involved in the process. Gene candidates were

selected by being expressed to a significantly greater or lesser extent under


conditions that promote or inhibit biomineralization. A key underpinning of this

was to ascertain that the conditions intended to enhance (bicarbonate spike)

or retard (calcium depletion) coccolith formation actually did so in practice.

An elegant demonstration of this was provided by the Scanning Electron

Micrographs produced from cells grown in these conditions.

The differences in the micrographs between the E. huxleyi strain 217

cells grown in normal (9 mM Ca2+) and calcium deplete (0 mM Ca2+, 0 mM

Ca2+ + S) conditions are striking. Those grown in media replete with calcium

feature robust, symmetrical, well-formed coccoliths, while those grown in

calcium deplete condition are nearly devoid of even coccolith fragments.

Any effect of bicarbonate spike on coccolith formation was not evident

in the scanning electron micrographs. The bicarbonate ‘spike’ refers to the

addition of 20 mM more bicarbonate in addition to what was already present

in the media. Protons react with the bicarbonate and spontaneously

equilibrate with CO2 and H2O at room temperature.

$$\mathrm{CO_2 + H_2O \leftrightarrow HCO_3^- + H^+}$$

The spike helps to trigger biomineralization but is not a substrate for it. The

concentration of HCO3- is unknown and the effects are temporary. The

effects of the Ca2+ were obvious but the effects of the bicarbonate could not be resolved using SEM.


The expected result of the Ca2+ titration was that E. huxleyi grown in normal calcifying conditions would consume the most Ca2+ from the surrounding media over the 7 day time period. E. huxleyi grown in 0 mM Ca2+ should mirror I. galbana, which was used as a negative control, because the absence of Ca2+ in the media is expected to stop the production of coccoliths. I. galbana was expected to show little or no change in Ca2+ consumption in both normal and depleted Ca2+ conditions. As expected, in normal calcifying conditions E. huxleyi showed the greatest decrease in Ca2+ while I. galbana showed very little change (21 and 5 µmol, respectively). On the other hand, in Ca2+ deplete conditions I. galbana appeared to consume more Ca2+ than E. huxleyi (12 and 5 µmol, respectively). In addition, Figure 9 A and C show an increase in Ca2+ from Day 0 to Day 2 instead of a flat or gradually declining line. It is speculated that this Ca2+ was introduced during the initial inoculation with cultures grown in normal Ca2+ conditions. Titration measurements were also more difficult to perform in media meant to contain no calcium ions. Another possible, though less prominent, source is Ca2+ residue inside the glassware. Overall, cells that were producing coccoliths were expected to use more Ca2+ during active coccolithogenesis, as is demonstrated here for E. huxleyi.


Photosynthesis and calcification are independent processes according to

Leonardos et al. 2009. This study supports those findings in that per-cell photosynthesis rates were constant for both species and did not correspond to the differences in Ca2+ uptake.

The Ca2+ titration results reflect more than simply calcification.

Although [Ca2+] possibly might affect photosynthesis, there are undoubtedly many other processes that are altered by calcium depletion. Ca2+ functions as a ubiquitous intracellular messenger. Several extracellular signals induce an increase of [Ca2+] in the cytosol. The Ca2+ concentration in the cytosol of eukaryotic cells is kept low (5 nM) so that rapid signaling can occur through the transient opening of Ca2+ channels in the plasma membrane or in an intracellular membrane.

Ca2+ acts as a cofactor to many proteins by binding tightly to proteins. Ca2+ binds to multiple ligands and its physical properties allow it to cross-link at different segments of a protein and induce the protein to undergo a significant conformational change.

One such ligand is calmodulin. Calmodulin mediates many Ca2+ regulated processes, such as the activation of Ca2+ pumps in the plasma membrane. When Ca2+ binds to calmodulin, calmodulin becomes active and binds to enzymes and other target proteins. Those enzymes and target proteins can in turn activate, deactivate, phosphorylate, and dephosphorylate other target proteins.


So it is also expected that the relative expression of many genes not directly involved in biomineralization will be altered by calcium deprivation, but not the bicarbonate spike.

Transcriptome-wide expression profiling captures differentially expressed genes. Among those, genes involved in biomineralization may be in the minority, and thus most of the genes differentially expressed between treatments may not actually be involved in biomineralization. Genes that are less likely to be involved in biomineralization could be eliminated from the list of all genes differentially expressed in 9 mM vs 0 mM Ca2+ (RNAseq) by subtracting those that were not significant in qRT-PCR. Genes could in theory also have been eliminated from the candidate list if they were not similarly affected by the bicarbonate spike, but in practice a very large number of genes (~4000) were differentially expressed in the bicarbonate spike treatment.

Seventeen of the top twenty-five genes that were identified as being differentially expressed were assigned a probable function via homology or the presence of domains. Many of these made biological sense with regard to biomineralization and/or coccolithogenesis. For example, 107508 is one of the three genes closely related to known C-type lectins, which derive the ‘C’ from their reliance on calcium for carbohydrate binding. Multiple proteins were most similar to zinc-binding proteins or had zinc-binding domains (211653, 456555, 440390). These included probable peptidases, and 440390 also has a RING domain, which is normally associated with ubiquitination. One of the down-regulated genes, 120394, is most closely related to a zinc transporter (ZIP family) identified in the sequenced marine alga Ostreococcus spp. Differential expression of genes for zinc binding and transporting proteins under calcium restriction might be due to interference between calcium ions and zinc (de França 2002). This alone does not account for there being multiple proteases as well as a protease inhibitor, 356347, among the differentially expressed genes.

The differential expression of these genes suggests that some proteins were being synthesized while others were being degraded. Another gene of interest, 205013, is one of the two titin-like proteins; it has multiple fibronectin type III domains and is therefore likely to be elastic. It may be involved in the process whereby E. huxleyi extrudes the coccoliths yet springs right back.

The balance of the differentially expressed genes coded for proteins that lacked known functional domains or significant homology to known


sequences; the latter may change as more organisms are sequenced and

proteins characterized.

Future studies could include performing RNAseq/real time RT-PCR on I. galbana and eliminating genes that are differentially expressed in both I. galbana and E. huxleyi in 9 vs 0 mM Ca2+. As I. galbana does not calcify and presumably does not possess, or at least does not express, the necessary biomineralization transcripts, changes in its gene expression across treatments will reflect the pleiotropic effects of manipulating calcium and sodium bicarbonate levels. It will therefore serve as an ideal control for narrowing the list of genes possibly involved in biomineralization (Figure 13).


Figure 13: Flow chart for the successive elimination of genes unlikely to be involved in biomineralization from the candidates identified in this study.
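The elimination step proposed here amounts to a set difference: genes that respond to calcium depletion in the non-calcifying control are removed from the E. huxleyi candidate list. The Python sketch below illustrates the idea under stated assumptions: the gene labels are hypothetical placeholders rather than protein IDs from this study, and it assumes the two species' genes have already been mapped to a shared set of identifiers (for example by orthology), a step that is not shown.

```python
def eliminate_shared_responders(candidates, control_responders):
    """Keep candidate genes that are not also differentially expressed in the control."""
    return sorted(set(candidates) - set(control_responders))


# Hypothetical labels for genes differentially expressed in 9 vs 0 mM Ca2+.
ehux_9_vs_0 = {"gene_A", "gene_B", "gene_C", "gene_D"}
igal_9_vs_0 = {"gene_B", "gene_D", "gene_E"}  # non-calcifying control

remaining = eliminate_shared_responders(ehux_9_vs_0, igal_9_vs_0)
print(remaining)
```

Each further filter in the flow chart could be applied in the same way, by passing the surviving list and the next set of genes to exclude.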

Additionally, if a Ca2+ gradient (15, 9, 5, 1 mM) were used instead of

the 9 mM vs 0 mM Ca2+ employed in this study, genes with expression that

varied with Ca2+ concentration would make good candidates for

biomineralization.
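One way such a gradient experiment could be screened is to rank-correlate each gene's expression with the Ca2+ concentration and retain genes with a strong monotonic trend. The Python sketch below computes a Spearman rank correlation without external libraries. It is offered only as an illustration: the expression values are invented, the function names are assumptions, and with only four concentrations any real analysis would need replicates and a proper statistical test.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


calcium_mM = [15, 9, 5, 1]                 # gradient proposed in the text
dose_responsive = [120.0, 80.0, 45.0, 10.0]  # invented expression values
unresponsive = [50.0, 52.0, 49.0, 51.0]      # invented expression values

print(spearman(calcium_mM, dose_responsive))
print(spearman(calcium_mM, unresponsive))
```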


Furthermore, if the studies of Quinn, Nguyen and Wahlund were repeated using RNAseq, the results from those experiments could eliminate genes not differentially expressed in phosphate deplete conditions and in calcifying vs non-calcifying strains. The efficacy of RNAseq transcript profiling for E. huxleyi (HiSeq sequencing and the E. huxleyi genome) was supported by the similarity between the RNAseq and real time RT-PCR results. Thus, the

RNAseq method could be used to not just replicate, but greatly enhance the

microarray, SSH and EST studies to develop a more complete picture of

biomineralization genomics.

The annotation of cyclophilins was based on the bit score, homology and the presence of a peptidyl-prolyl cis-trans isomerase domain. The annotation itself was not difficult: cyclophilins are highly conserved across species, and for 80% of the transcripts annotated in this study the closest neighbor was a type of alga.

The phylogenetic relationship between the cyclophilin proteins gives an opportunity for predicting how sequence divergence contributes to functional specialization. Two distinct clades were shown in Figure 11. One clade showed more recent branching than the other, which suggests that its members are more closely related and possibly have similar functions. The other clade showed more distant branching, suggesting that the functions of its members are more specific. CYP15 was the most distant and possibly has a unique function in E. huxleyi, which warrants further study. The branching could also be due to specific localization within the cell. Other studies indicate that distinct cyclophilin clades are located in different organelles or parts of the cell, such as the mitochondria, chloroplast, secretory pathway and cytosol (Lee et al. 2001; Ramano et al. 2004; Gan et al. 2009).

Twenty-five cyclophilins were annotated but none of them were

checked for expression using real time RT-PCR. The next step would be to

run a real time RT-PCR on the 25 cyclophilins and see if these genes are

expressed in calcifying and/or non-calcifying strains of E. huxleyi. Furthermore,

the 25 annotated cyclophilins could be further characterized by determining

their exact functions and locations in E. huxleyi.

The data presented here represent an important step toward

identifying genes potentially involved in biomineralization, as well as providing

support that calcification and photosynthesis are independent processes. In addition, the use of RNAseq with real time RT-PCR validation has been shown to be

an accurate, reproducible molecular technique for study of gene expression in

Isochrysidales. The work in this study will be the baseline for future research

studies with other members of the Isochrysidales order and further build on

identifying genes that are potentially involved in coccolithogenesis.


References

Anders, S. and Huber, W. 2010. Differential expression analysis for sequence count data. Genome Biol. 11:R106.

Arias, J. and Fernandez, M. 2008. Polysaccharides and proteoglycans in calcium carbonate-based biomineralization. Chem. Rev. 108:4475-4482.

Arya, M., Shergill, I. S., Williamson, M., Gommersall, L., Arya, N., and Patel, H. 2005. Basic principles of real-time quantitative PCR. Expert Rev. Mol. Diagn. 5:209-219.

Blackwelder, P. L., Weiss, R. E., and Wilbur, K. M. 1976. Effects of calcium, strontium and magnesium on the coccolithophorid Cricosphaera (Hymenomonas) carterae. I. Calcification. Mar. Biol. 34:11-16.

Cheung, F., Haas, B. J., Goldberg, S. M., May, G. D., Xiao, Y., and Town, C. D. 2006. Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics 7:272.

Chisholm, J. and Gattuso, J. 1991. Validation of the alkalinity anomaly technique for investigating calcification and photosynthesis in coral reef communities. Limnol. Oceanogr. 36(6):1232-1239.

Cloonan, N., Forrest, A. R. R., Kolle, G., Gardiner, B. B. A., Faulkner, G. J., Brown, M. K., Taylor, D. F., Steptoe, A. L., Wani, S., Bethel, G., Robertson, A. J., Perkins, A. C., Bruce, S. J., Lee, C. C., Randade, S. S., Peckham, H. E., Manning, J. M., McKernan, K. J., and Grimmond, S. M. 2008. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat. Methods 7:613-619.


Edvardsen, B., Eikrem, W., Green, J. C., Anderson, R. A., Moon-van der Staay, S., and Medlin, L. K. 2000. Phylogenetic reconstructions of the Haptophyta inferred from 18S ribosomal DNA sequences and available morphological data. Phycologia 39(1):19-35.

Fields, D. 2007. MOLECULAR BIOLOGY: Site-seeing by sequencing. Science 316: 1441-1442.

Fujiwara, S., Sawada, S., and Someya, J. 1994. Molecular Phylogenetic Analysis of rbcL in the prymnesiophyta. J. Phycol. 30: 863-871.

Fruman, D. A., Burakoff, S. J. and Bierer, B. E. 1994. Immunophilins in protein folding and immunosuppression. FASEB J. 8:391-400.

Fu, A., He, Z., Cho, H., Lima, A., Buchanan, B., and Luan, S. 2007. A chloroplast cyclophilin functions in the assembly and maintenance of photosystem II in Arabidopsis thaliana. PNAS 104:15947-15952.

Graveley, B. R. 2008. Power sequencing. Nature 453(26):1197-1198.

Gopinathan, C. P. 1984. Growth characteristics of some nannoplankters. J. Mar. Biol. Ass. India 26:89-94.

Guillard, R. R. L. 1975. Culture of phytoplankton for feeding marine invertebrates, pp. 26-60. In Smith, W. L. and Chanley, M. H. (Eds.), Culture of Marine Invertebrate Animals. Plenum Press, New York, USA.

Johnson, D. S., Mortazavi, A., Myers, R. M., and Wold, B. 2007. Genome-wide mapping of in vivo protein-DNA interactions. Science 316:1497-1502.


Jones-Rhoades, M., Borevitz, J., and Preuss, D. 2007. Genome-wide expression profiling of the Arabidopsis female gametophyte identifies families of small, secreted proteins. PLoS Genetics 3:e171.

Herfort, L., Loste, E., Meldrum, F., and Thake, B. 2004. Structural and physiological effects of calcium and magnesium in Emiliania huxleyi (Lohmann) Hay and Mohler. J. Struct. Biol. 148:307-314.

Hornshoj, H., Bendixen, E., Conley, L. N., Andersen, P. K., Hedegaard, J., Panitz, F., and Bendixen, C. 2009. Transcriptomic and proteomic profiling of two porcine tissues using high-throughput technologies. BMC Genomics 10:30.

Kremling, K. 1983. Determination of the major constituents, p. 229-251. In K. Grasshoff et al. (eds.), Methods of sea water analysis. Verlag Chemie.

Langer, G., Nooijer, L., and Oetjen, K. 2010. On the role of cytoskeleton in coccolith morphogenesis: the effect of cytoskeleton inhibitors. J. Phycol. 46:1252-1256.

Lee, Y. K., Hong, C. B., Suh, Y., and Lee, I. K. 2001. A cDNA clone for cyclophilin from Griffithsia japonica and phylogenetic analysis of cyclophilins. Mol. Cells 13:12-20.

Leonardos, N., Read, B., Thake, B., and Young, J. R. 2009. No mechanistic dependence of photosynthesis on calcification in the coccolithophorid Emiliania huxleyi (Haptophyta). J. Phycol. 45:1046-1051.

Lister, R., O’Malley, R. C., Tonti-Filippini, J., Gregory, B. D., Berry, C. C., Millar, A. H., and Ecker, J. R. 2008. Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133:523-528.


Livak, K. J. and Schmittgen, T. D. 2001. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔCt) method. Methods 25:402-408.

Livingston, B. T., Killian, C. E., Wilt, F., Cameron, A., Landrum, M. J., Ermolaeva, O., Sapojnikov, V., Maglott, D. R., Buchanan, A. M., and Ettensohn, C. A. 2006. A genome-wide analysis of biomineralization-related proteins in the sea urchin Strongylocentrotus purpuratus. Dev. Biol. 300:335-348.

Mann, S. 2001. Biomineralization: Principles and Concepts in Bioinorganic Materials Chemistry. Oxford University Press Inc., New York, p. 6-8.

Mann, K., Wilt, F., and Poustka, A. 2010. Proteomic analysis of sea urchin (Strongylocentrotus purpuratus) spicule matrix. Proteome Sci. 8:33-46.

Marsh, M. E. 1992. Isolation and characterization of a novel acidic polysaccharide containing tartrate and glyoxylate residues from the mineralized scales of a unicellular coccolithophorid alga. J. Biol. Chem. 267(28):20507-20512.

Monaghan, J. R., Epp, L. G., Putta, S., Page, R. B., Walker, J. A., Beachy, C. K., Zhu, W., Pao, G. M., Verma, I. M., Hunter, T., Bryant, S. V., Gardiner, D. M., Harkins, T. T., and Voss, S. R. 2009. Microarray and cDNA sequence analysis of transcription during nerve-dependent limb regeneration. BMC Biol. 7:1-19.

Mu, Y., Ding, F., Peng, C., Ao, J., Hu, S., and Chen, X. 2010. Transcriptome and expression profiling analysis revealed changes of multiple signaling pathways involved in immunity in the large yellow croaker during Aeromonas hydrophila infection. BMC Genomics 11:506-520.

Nagalakshmi, U., Wang, Z., Waern, K., Shou, C., Raha, D., Gerstein, M., and Snyder, M. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320:1344-1349.

Nguyen, B., Bowers, R., Wahlund, T. M., and Read, B. 2005. Suppressive subtractive hybridization of and differences in gene expression content of calcifying and noncalcifying cultures of Emiliania huxleyi strain 1516. Appl. Environ. Microbiol. 71(5):2564-2575.

Paasche, E. 1998. Roles of nitrogen and phosphorus in coccolith formation in Emiliania huxleyi (Prymnesiophyceae). Eur. J. Phycol. 33:33-42.

Pessôa de França, F., Mora Tavares, A. P., and Augusto da Costa, A. C. 2002. Calcium interference with continuous biosorption of zinc by Sargassum sp. Bioresour. Technol. 83(2):159-163.

Perez, S. and Weis, V. 2008. Cyclophilin and the regulation of symbiosis in Aiptasia pallida. Biol. Bull. 215:63-72.

Quinn, P. R., Bowers, M., Zhang, X., Wahlund, T. M., Fanelli, M. A., Olszova, D., and Read, B. A. 2006. cDNA microarrays as a tool for identification of biomineralization proteins in the coccolithophorid Emiliania huxleyi (Haptophyta). Appl. Environ. Microbiol. 72:5512-5526.

Romano, P., Horton, P., and Gray, J. 2004. The Arabidopsis cyclophilin gene family. Plant Physiol. 134:1268-1282.


Romano, P., Gray, J., Horton, P., and Luan, S. 2005. Plant immunophilins: functional versatility beyond protein maturation. New Phytol. 166:753-769.

Rhodes, L. L. 1995. Gephyrocapsa oceanica and Emiliania huxleyi (Prymnesiophyceae = Haptophyceae) in New Zealand's coastal waters: characteristics of blooms and growth in laboratory culture. New Zealand J. Mar. Fresh. Res. 29:345-357.

Shendure, J. and Ji, H. 2008. Next-generation DNA sequencing. Nat. Biotechnol. 26(10):1135-1145.

Shiraiwa, Y. 2003. Physiological regulation of carbon fixation in the photosynthesis and calcification of coccolithophorids. Comp. Biochem. Physiol. Part B 136:775-783.

Soto, A. R., Zheng, H., Shoemaker, D., Rodriguez, J., Read, B. A., and Wahlund, T. M. 2006. Identification and preliminary characterization of two cDNAs encoding unique carbonic anhydrases from the marine alga Emiliania huxleyi. Appl. Environ. Microbiol. 72:5500-5511.

Strommer, J., Gregerson, R., and Vayda, M. 1993. Isolation and characterization of plant mRNA, p. 49-66. In B. R. Glick and J. E. Thompson (eds.), Methods in plant molecular biology and biotechnology. CRC Press, Boca Raton, Fla.

Trimborn, S., Langer, G., and Rost, B. 2007. Effect of varying calcium concentration and light intensities on calcification and photosynthesis in Emiliania huxleyi. Limnol. Oceanogr. 53:2285-2293.


Tyrrell, T. and Taylor, A. H. 1996. A modeling study of Emiliania huxleyi in the NE Atlantic. J. Mar. Syst. 9:83-112.

Vallon, O. 2005. Chlamydomonas immunophilins and parvulins: survey and critical assessment of gene models. Eukaryotic Cell 4:230-241.

Wahlund, T. M., Zhang, X. Y., and Read, B. A. 2004. Expressed sequence tag profiles from calcifying and non-calcifying cultures of Emiliania huxleyi. Micropal. 50:145-155.

Weber, A. P. M., Weber, K. L., Carr, K., Wilkerson, C., and Ohlrogge, J. B. 2007. Sampling the Arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiol. 144:32-42.

Westbroek, P., De Jong, E. W., Van Der Wal, P., Borman, A. H., de Vrind, J. P. M., Kok, D., de Bruijn, W. C., and Parker, S. B. 1984. Mechanisms of calcification in the marine alga Emiliania huxleyi. Phil. Trans. R. Soc. Lond. B 304:435-444.

Winter, A. and Siesser, W. G. 1994. Coccolithophores. Cambridge University Press, New York, pp. 17-21, 29-32, 39-40, 46, 63-69, 179-180.

Werner, T. 2010. Next generation sequencing in functional genomics. Brief. Bioinform. 11(5):499-511.

Yao, J., Yuxiang, W., Weishi, W., Ning, W., Hui, Li. 2011. Solexa sequencing analysis of chicken pre-adipocyte microRNAs. Biosci. Biotechnol. Biochem. 75:54-61.


Young, J. R., Davis, S. A., Bown, P. R., and Mann, S. 1999. Coccolith ultrastructure and biomineralisation. J. Struct. Biol. 126:195-215.


Appendix A: Primers for Real-Time RT-PCR

Protein ID  Forward (5' to 3')    Reverse (5' to 3')    Prod. GC%  Prod. Length (bp)  Expected Tm (°C)  Actual Tm (°C)
199401      CCGGCATATTGTAACCAACC  CACTCGATCAAGGTCGTCAA  53         100                75.63             73.13
432839      TACAGCGGTCGGAACTACCT  TGATTGGCACTGCACGTATT  56         96                 76.71             75.73
438208      GCTACCGCAATACCGAACAT  CTCCTTGTCGCTGAACTCG   61         99                 78.69             76.62
433416      AAGTCTCAACGAGGGGCTTT  CCTCTCCGAGATCTCCTTCA  64         88                 79.17             77.3
107508      GGCCTAGAGATGGTGCAGAG  ATGACGTCGCAAGAAAGGAG  66         83                 79.84             76.08
41308       CGAAGAGACATCCTCCGAAC  GCGGTGTACAGACCCTTGTT  60         99                 78.27             73
353141      CAAAAGGGCCAAAGACAAGA  TTCCGCTTACGCTCTCATTT  59         98                 78.04             76.23
218690      CCAGAAGTATCGTCGCAACA  GGTCGTATCCAAGGCTGAGA  56         95                 76.46             77.62
370886      CAAAAGGGCCAAAGACAAGA  TTCCGCTTACGCTCTCATTT  59         98                 78.04             77.52
423164      GACCACCTCCACTTCGACAT  AACGGAAGTTCAGCATCTCC  58         95                 77.32             77.26
440390      CGAACCATCCGTTGAGATG   TGGTCCGGTATCTTGAGAGC  66         79                 79.29             74.49
456347      ACGGCGAGTGGAAGTACAAC  AGAACGTGCCCGTGTAGAAG  63         81                 78.31             76.33
205013      CCTTCTTCAAGTGGCTCGTC  GGGCTTGAAGACTGAGATCG  61         77                 77.13             76.18
356539      GATGTCCAAGGTGCTGGTG   CTGCTTGAGCTAGTGCCAGA  69         99                 82                80.2
211653      CATCGAGTGGATTCCGAACT  TCACCGAAAACATGATCGAG  56         86                 75.81             76.57
456555      AGCTGTGCTGGTACCTCGAT  ATCCGCATGAAGAGGACAAC  57         88                 76.38             73.78
416147      CATTCACGCTCTCCTCCTCT  GCGTAGGTCCAGTCCATGTC  65         82                 79.08             76.69
195816      CATCAACGTGTGCAACCTCT  TGCCAAAGATCACAAACACC  59         95                 77.75             77.16
366493      TTCCACAACACGTCGATCAT  CTCCTTGTCCCACACAGTGA  57         96                 77.14             76.41
366340      CGCAGCCATTCTTCTTCTCT  CGGAGCACTTCTCGTCTAGC  60         83                 77.37             75.73
120394      ACCTCGTCCTGACCAACAAG  GACATGCCCGAGAGGAAGT   61         79                 77.22             74.01
413815      ACACCACCAAGAGCTCCAAC  TACTTTCTGCCCGAGAGCAT  60         100                78.5              76.99
113427      GAACAAGCGCAAGTTTGTCA  GTCGAAGAGCGTGTCCATCT  65         96                 80.55             78.13
67936       GCGAGGCGCTCTACTACCTA  CGCGTACTGACGGATATGGT  66         89                 80.34             77.72
54552       GATGACGGGAAGGGCTACTT  GAGTCCTTAACGCCGAACAG  61         98                 78.88             75.71
353329-ref  CATCTACAAGCCCGAGGAGA  GTGTGCCTGTTGTCGTTCC   64         87                 79.39             77.28


Appendix B: List of Ca2+ up-regulated genes

Gene ID  X0.217  X0S.217  X9.217  X9S.217  PValue      Description  evalue
199401   1408    240      5238    795      0.0095105   amine oxidase [Vibrio coralliilyticus ATCC BAA-450]gb|EEX33214.1| amine oxidase [Vibrio coralliilyticus ATCC BAA-450]  3.00E-31
432839   1099    143      3423    1621     0.00524769
438208   596     732      2433    2789     0.0065231   hypothetical protein ARALYDRAFT_343817 [Arabidopsis lyrata subsp. lyrata]gb|EFH54886.1| hypothetical protein ARALYDRAFT_343817 [Arabidopsis lyrata subsp. lyrata]  1.00E-46
433416   95      161      721     980      0.00031043
107508   111     144      565     737      0.00154792  hypothetical protein BRAFLDRAFT_210423 [Branchiostoma floridae]gb|EEN58705.1| hypothetical protein BRAFLDRAFT_210423 [Branchiostoma floridae]  1.00E-05
41308    60      110      470     740      0.00020558  hypothetical protein TRIADDRAFT_11340 [Trichoplax adhaerens]gb|EDV25835.1| hypothetical protein TRIADDRAFT_11340 [Trichoplax adhaerens]  1.00E-90
353141   88      122      405     507      0.00404042
218690   92      24       388     58       0.00881886  predicted protein [Phaeodactylum tricornutum CCAP 1055/1]gb|EEC43129.1| predicted protein [Phaeodactylum tricornutum CCAP 1055/1]  4.00E-17
370886   72      118      323     471      0.00512635
423164   48      18       311     61       0.00110328
440390   99      74       310     507      0.00248222  predicted protein [Chlamydomonas reinhardtii]gb|EDP01940.1| predicted protein [Chlamydomonas reinhardtii]  2.00E-05
456347   61      257      305     1007     0.00532886  predicted protein [Physcomitrella patens subsp. patens]gb|EDQ66621.1| predicted protein [Physcomitrella patens subsp. patens]  4.00E-93
205013   64      140      291     619      0.00350497  PREDICTED: titin-like [Saccoglossus kowalevskii]  1.00E-25
356539   65      48       290     391      0.00060711  putative serine/threonine kinase [Emiliania huxleyi]  9.00E-22
211653   65      79       286     505      0.00104969
456555   82      121      285     579      0.0045866
416147   49      68       276     505      0.00033109  superoxide-generating NADPH oxidase flavocytochrome [Dictyostelium discoideum AX4]sp|Q9XYS3.1|NOXA_DICDI RecName: Full=Superoxide-generating NADPH oxidase heavy chain subunit A; AltName: Full=NADPH oxidase A; AltName: Full=Superoxide-generating NADPH oxidase flavocytochrome Agb|AAD22057.1| superoxide-generating NADPH oxidase flavocytochrome [Dictyostelium discoideum]gb|EAL62538.1| superoxide-generating NADPH oxidase flavocytochrome [Dictyostelium discoideum AX4]  2.00E-58
195816   66      101      244     559      0.0022764   conserved unknown protein [Ectocarpus siliculosus]  2.00E-10
366493   49      70       218     275      0.00574977  transducin alpha subunit [Sparus aurata]  8.00E-74
366340   50      51       214     232      0.00409665
75525    46      71       199     256      0.00801878  putative surface antigen bspa [Emiliania huxleyi]  2.00E-49
443621   27      37       161     389      7.59E-05
114434   32      95       149     344      0.00787388  GE22206 [Drosophila yakuba]gb|EDW94821.1| GE22206 [Drosophila yakuba]  4.00E-19
112630   16      38       140     298      0.00011797  predicted protein [Thalassiosira pseudonana CCMP1335]gb|EED87108.1| predicted protein [Thalassiosira pseudonana CCMP1335]  1.00E-17
233264   42      36       140     166      0.00799613  unnamed protein product [Vitis vinifera]  2.00E-11
454871   46      16       136     129      0.00519388  3-O-acyltransferase [Stigmatella aurantiaca DW4/3-1]gb|EAU67913.1| 3-O-acyltransferase [Stigmatella aurantiaca DW4/3-1]  2.00E-07
437466   15      49       134     348      0.00017967
104922   17      34       130     184      0.00070699  proteophosphoglycan ppg4 [Leishmania major strain Friedlin]gb|AAZ14280.1| proteophosphoglycan ppg4 [Leishmania major strain Friedlin]  4.00E-10
471024   24      31       130     205      0.00074429  NADPH oxidase [Uncinocarpus reesii 1704]gb|EEP75682.1| NADPH oxidase [Uncinocarpus reesii 1704]  7.00E-39
114115   36      24       129     149      0.00367121
65333    36      80       126     332      0.00723934  Pc13g00220 [Penicillium chrysogenum Wisconsin 54-1255]emb|CAP91091.1| Pc13g00220 [Penicillium chrysogenum Wisconsin 54-1255]  8.00E-78
245902   16      9        121     4        0.00342399
123711   11      19       98      178      6.68E-05    hypothetical protein SCHCODRAFT_16162 [Schizophyllum commune H4-8]gb|EFI96091.1| hypothetical protein SCHCODRAFT_16162 [Schizophyllum commune H4-8]  2.00E-07
227253   10      15       91      133      0.00011812  hypothetical protein TRIADDRAFT_11340 [Trichoplax adhaerens]gb|EDV25835.1| hypothetical protein TRIADDRAFT_11340 [Trichoplax adhaerens]  2.00E-18
103514   8       22       89      145      0.00024444  PREDICTED: similar to predicted protein, partial [Hydra magnipapillata]  1.00E-14
456463   17      18       80      97       0.00256663  hypothetical protein SELMODRAFT_20797 [Selaginella moellendorffii]gb|EFJ25239.1| hypothetical protein SELMODRAFT_20797 [Selaginella moellendorffii]  1.00E-05
366611   26      21       77      120      0.00664216  putative V8-like Glu-specific endopeptidase [Bdellovibrio bacteriovorus HD100]emb|CAE79812.1| putative V8-like Glu-specific endopeptidase [Bdellovibrio bacteriovorus HD100]  5.00E-27
231872   5       33       68      156      0.00117923  predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  3.00E-46
351211   11      4        63      28       0.00228699  predicted protein [Phaeodactylum tricornutum CCAP 1055/1]gb|EEC47159.1| predicted protein [Phaeodactylum tricornutum CCAP 1055/1]  2.00E-97
435741   11      19       63      64       0.00712493
123950   13      24       62      102      0.00516618  hypothetical protein BRAFLDRAFT_210423 [Branchiostoma floridae]gb|EEN58705.1| hypothetical protein BRAFLDRAFT_210423 [Branchiostoma floridae]  1.00E-06
114548   14      20       61      86       0.00619932  glutamate receptor-like protein [Adineta vaga]  8.00E-08
211745   11      28       59      178      0.00082625  PREDICTED: hypothetical protein, partial [Ornithorhynchus anatinus]  7.00E-20
423536   11      10       59      60       0.00235893  hypothetical protein [Monosiga brevicollis MX1]gb|EDQ86701.1| predicted protein [Monosiga brevicollis MX1]  8.00E-05
233421   5       9        55      14       0.00737431  conserved unknown protein [Ectocarpus siliculosus]  2.00E-12
121492   18      33       54      165      0.00550715  NLRC3 receptor [Ictalurus punctatus]  1.00E-27
56174    17      21       53      118      0.00474362  Fatty acid desaturase subfamily [Methylophaga thiooxidans DMS010]gb|EEF79658.1| Fatty acid desaturase subfamily [Methylophaga thiooxidans DMS010]  7.00E-30
68211    7       6        52      12       0.00795261  ubiquinol:cytochrome c oxidoreductase biogenesis factor [Chlamydomonas reinhardtii]gb|EDO97908.1| ubiquinol:cytochrome c oxidoreductase biogenesis factor [Chlamydomonas reinhardtii]  6.00E-63
447768   17      25       52      187      0.00119338
107969   10      15       47      82       0.00336639  1-phosphatidylinositol-4,5-bisphosphate phosphodiesterase delta-3-A [Danio rerio]sp|A5D6R3.1|PLD3A_DANRE RecName: Full=1-phosphatidylinositol-4,5-bisphosphate phosphodiesterase delta-3-A; AltName: Full=Phosphoinositide phospholipase C-delta-3-A; AltName: Full=Phospholipase C-delta-3-A; Short=PLC-delta-3-Agb|AAI39850.1| Plcd3a protein [Danio rerio]  4.00E-35
218501   13      20       46      122      0.00251221  Procollagen-proline dioxygenase [Paenibacillus sp. JDR-2]gb|ACS99249.1| Procollagen-proline dioxygenase [Paenibacillus sp. JDR-2]  4.00E-05
354003   12      4        45      56       0.00162069  Sel1 repeat-containing protein [Oxalobacter formigenes OXCC13]gb|EEO29791.1| Sel1 repeat-containing protein [Oxalobacter formigenes OXCC13]  2.00E-05
446753   13      33       45      138      0.0084801
107528   2       19       44      69       0.00298996  RecName: Full=Mucin-5AC; Short=MUC-5AC; AltName: Full=Gastric mucin; AltName: Full=Lewis B blood group antigen; Short=LeB; AltName: Full=Major airway glycoprotein; AltName: Full=Mucin-5 subtype AC, tracheobronchial; AltName: Full=Tracheobronchial mucin; Short=TBM; Flags: Precursor  1.00E-04
358042   7       1        43      2        0.00714419
210712   2       7        42      17       0.00277337
354682   10      23       42      101      0.005951    hypothetical protein CC1G_06124 [Coprinopsis cinerea okayama7#130]gb|EAU81913.2| hypothetical protein CC1G_06124 [Coprinopsis cinerea okayama7#130]  0.002
441276   6       19       41      135      0.00052782  hypothetical protein Cyan7425_0727 [Cyanothece sp. PCC 7425]gb|ACL43114.1| conserved hypothetical protein [Cyanothece sp. PCC 7425]  1.00E-10
445604   1       19       40      71       0.00261313  hypothetical protein DSM3645_13228 [Blastopirellula marina DSM 3645]gb|EAQ77003.1| hypothetical protein DSM3645_13228 [Blastopirellula marina DSM 3645]  0.003
223149   7       1        40      18       0.00191566  PREDICTED: similar to GREB1 protein isoform a [Canis familiaris]  3.00E-04
238020   4       17       38      201      2.70E-05
222071   5       1        37      4        0.00416985  predicted protein [Micromonas sp. RCC299]gb|ACO61431.1| predicted protein [Micromonas sp. RCC299]  2.00E-07
71849    8       13       36      80       0.0025916   predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  2.00E-61
458958   9       18       35      124      0.00148389  cell wall surface anchor family protein [Streptococcus agalactiae 515]gb|EAO70983.1| cell wall surface anchor family protein [Streptococcus agalactiae 515]  3.00E-07
458791   9       27       34      121      0.0071634   conserved unknown protein [Ectocarpus siliculosus]  1.00E-36
220256   5       12       33      99       0.00041391  predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  9.00E-35
246646   13      17       33      125      0.00213847  PREDICTED: im:7041156 [Danio rerio]  1.00E-10
232262   5       17       32      113      0.00088299  conserved hypothetical protein [Bacteroides sp. D20]gb|EFA19082.1| conserved hypothetical protein [Bacteroides sp. D20]  6.00E-07
65122    7       27       32      138      0.00320315  actin bundling protein [Naegleria gruberi]gb|EFC45441.1| actin bundling protein [Naegleria gruberi]  8.00E-28
67717    12      13       32      99       0.00233155  PREDICTED: hypothetical protein [Vitis vinifera]  4.00E-60
212373   12      19       32      112      0.00419376
459728   4       12       28      104      0.00030452  putative diguanylate cyclase/phosphodiesterase [Leptospirillum ferrodiazotrophum]  6.00E-06
363130   4       15       28      82       0.00215128
230507   4       16       28      79       0.00318935  predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  1.00E-24
101622   4       19       27      73       0.00861331  zinc metalloprotease [Coprinopsis cinerea okayama7#130]gb|EAU93329.2| zinc metalloprotease [Coprinopsis cinerea okayama7#130]  1.00E-09
425652   6       19       26      82       0.00852564  conserved unknown protein [Ectocarpus siliculosus]  4.00E-41
433502   3       2        25      117      7.56E-07    autotransporter-associated beta strand repeat protein (3 repeats) [Yersinia pestis KIM D27]gb|EFA46897.1| autotransporter-associated beta strand repeat protein (3 repeats) [Yersinia pestis KIM D27]  2.00E-08
220527   4       15       23      100      0.00109313  conserved hypothetical protein [Bacteroides sp. D20]gb|EFA19082.1| conserved hypothetical protein [Bacteroides sp. D20]  9.00E-07
235469   3       2        22      23       0.00138327  PREDICTED: similar to Macrophage mannose receptor 1 precursor (MMR) (CD206 antigen) [Gallus gallus]  2.00E-05
113066   4       8        22      53       0.00230552
231095   6       5        21      44       0.00342797
105827   5       2        19      21       0.00794441  hypothetical protein THAPSDRAFT_21161 [Thalassiosira pseudonana CCMP1335]gb|EED94629.1| hypothetical protein THAPSDRAFT_21161 [Thalassiosira pseudonana CCMP1335]  1.00E-34
457035   8       28       19      149      0.00454754  calmodulin [Triticum aestivum]  3.00E-05
221946   0       2        15      43       1.77E-05    proteophosphoglycan 5 [Leishmania major strain Friedlin]gb|AAZ14281.1| proteophosphoglycan 5 [Leishmania major strain Friedlin]  2.00E-05
115493   0       7        15      31       0.00355224  predicted protein [Arabidopsis lyrata subsp. lyrata]gb|EFH61481.1| predicted protein [Arabidopsis lyrata subsp. lyrata]  1.00E-08
213145   1       2        15      9        0.00732396  hypothetical protein [Trichomonas vaginalis G3]gb|EAY23768.1| Leucine Rich Repeat family protein [Trichomonas vaginalis G3]  4.00E-06
366235   0       2        14      31       8.42E-05
244456   1       1        14      10       0.00293465  hypothetical protein CY0110_08201 [Cyanothece sp. CCY0110]gb|EAZ90641.1| hypothetical protein CY0110_08201 [Cyanothece sp. CCY0110]  1.00E-27
459700   3       4        14      66       0.00014634  proteophosphoglycan ppg4 [Leishmania major strain Friedlin]gb|AAZ14280.1| proteophosphoglycan ppg4 [Leishmania major strain Friedlin]  6.00E-06
238496   3       4        14      32       0.00396799  hypothetical protein Hoch_6300 [Haliangium ochraceum DSM 14365]gb|ACY18771.1| conserved hypothetical protein [Haliangium ochraceum DSM 14365]  1.00E-40
223522   0       3        13      34       0.00019133  hypothetical protein CLOBOL_02904 [Clostridium bolteae ATCC BAA-613]gb|EDP16760.1| hypothetical protein CLOBOL_02904 [Clostridium bolteae ATCC BAA-613]  2.00E-06
62288    2       8        11      38       0.0096674   predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  2.00E-70
51793    5       5        11      47       0.00420892  predicted protein [Nematostella vectensis]gb|EDO43196.1| predicted protein [Nematostella vectensis]  6.00E-68
228648   1       0        10      9        0.00271287
208032   1       4        9       33       0.00176711  PREDICTED: similar to LOC100149441 protein, partial [Hydra magnipapillata]  3.00E-40
434058   0       0        8       17       0.0001164
204945   0       3        8       15       0.00886079  PREDICTED: 1-phosphatidylinositol-4,5-bisphosphate phosphodiesterase delta-1-like [Xenopus (Silurana) tropicalis]  3.00E-06
239461   2       2        8       19       0.00900161
206815   2       1        7       25       0.00181955  GE22206 [Drosophila yakuba]gb|EDW94821.1| GE22206 [Drosophila yakuba]  1.00E-17
226365   0       0        6       7        0.00279041
220382   0       0        6       5        0.00546645  Ribonuclease [Giardia lamblia ATCC 50803]gb|EDO79807.1| Ribonuclease [Giardia lamblia ATCC 50803]  2.00E-06
431789   2       0        6       20       0.00198451
110884   2       7        4       43       0.00798307  MICAL-like protein [Phytophthora infestans T30-4]gb|EEY61774.1| MICAL-like protein [Phytophthora infestans T30-4]  5.00E-26


Appendix C: List of Ca2+ down-regulated genes

Gene ID  X0.217  X0S.217  X9.217  X9S.217  PValue      Description  evalue
214483   18      23       1       0        5.01E-05
54552    369     1157     70      119      0.00014816  periplasmic binding protein [Ruegeria sp. TM1040]gb|ABF65666.1| periplasmic binding protein [Ruegeria sp. TM1040]  1.00E-38
200858   5       23       1       0        0.00047147  hypothetical protein [Paramecium tetraurelia strain d4-2]emb|CAK88371.1| unnamed protein product [Paramecium tetraurelia]  2.00E-06
250673   13      20       1       1        0.00067593
438875   47      54       6       7        0.00072909
120394   169     244      27      39       0.00099639  ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]gb|ABP00071.1| ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]  5.00E-75
256730   44      153      14      16       0.00105849  periplasmic binding protein [Streptosporangium roseum DSM 43021]gb|ACZ87848.1| periplasmic binding protein [Streptosporangium roseum DSM 43021]  3.00E-23
67936    4       13       0       0        0.00111669  oxidoreductase, short chain dehydrogenase/reductase family [Labrenzia alexandrii DFL-11]gb|EEE45627.1| oxidoreductase, short chain dehydrogenase/reductase family [Labrenzia alexandrii DFL-11]  5.00E-43
113427   125     588      58      72       0.00171411  oligopeptide-binding protein OppA [Leptotrichia goodfellowii F0264]gb|EEY34986.1| oligopeptide-binding protein OppA [Leptotrichia goodfellowii F0264]  3.00E-65
103027   160     282      38      45       0.00232899  ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]gb|ABP00071.1| ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]  8.00E-75
113418   83      226      37      21       0.00240286  PREDICTED: similar to collagen, type XXIX, alpha 1, partial [Hydra magnipapillata]  2.00E-21
245108   13      13       0       2        0.00240793
413815   1026    2108     175     482      0.00354609  flavin-containing monooxygenase [Chlamydomonas reinhardtii]gb|EDP05199.1| flavin-containing monooxygenase [Chlamydomonas reinhardtii]  2.00E-68
233702   5       8        0       0        0.00387889  conserved unknown protein [Ectocarpus siliculosus]  6.00E-18
59740    3       10       0       0        0.00387889  small multidrug resistance protein [Desulfitobacterium hafniense DCB-2]gb|ACL20120.1| small multidrug resistance protein [Desulfitobacterium hafniense DCB-2]  5.00E-17
99134    151     93       24      28       0.00546139  predicted protein [Physcomitrella patens subsp. patens]gb|EDQ78002.1| predicted protein [Physcomitrella patens subsp. patens]  4.00E-15
194295   3       9        0       0        0.00546645
209406   7       37       3       4        0.00557742
240483   296     1148     123     210      0.005974    ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]gb|ABP00071.1| ZIP family transporter: zinc ion [Ostreococcus lucimarinus CCE9901]  4.00E-82
373129   505     87       103     33       0.00671601
352708   23      2        2       1        0.00732396  regulator of chromosome condensation (RCC1)-like protein [Phytophthora infestans T30-4]gb|EEY56311.1| regulator of chromosome condensation (RCC1)-like protein [Phytophthora infestans T30-4]  1.00E-10
62944    9       11       1       1        0.00849107  hypothetical protein [Monosiga brevicollis MX1]gb|EDQ88693.1| predicted protein [Monosiga brevicollis MX1]  3.00E-70


Appendix Q: comparison between Quinn 2006 microarray and this study

Paper ID | GenBank accession | Paper homology | JGI 1516 browser hit 1 | Hit 1 desc | Hit 2 | Hit 2 desc | Hit 3 | Hit 3 desc
77-2-14-G05.r.1 | DQ658319.1 | No hit | 441958 | put minor tail | 414096 | put minor tail | 467640 | put minor tail
77-2-5-E02.r.5 | not found | No hit | | | | | |
Contig 16 | DQ658258.1 | No hit | 442120 | putative protein | none | | none |
77-2-15-F09.r.2 | DQ658322.1 | No hit | 440435 | hyp DNA bind nuclease | 420673 | hyp DNA bind nuclease | 231370 | unnamed DNA bind nuclease
Contig 7 | DQ658249.1 | No hit | 414308 | alk phosphatase | 433041 | alk phosphatase | none |
77-2-7-G03.r.2 | DQ658285.1 | Similar to ankyrin 2 | 201109 | ankyrin 1-like | 359327 | ankyrin 2 (B) like | 312928 | het. ribonuclear prot. like
77-2-35-D05.r.1 | DQ658351.1 | No hit | 79680 | predicted | 414165 | predicted | 433918 | HUMAN NLRC5 like
77-2-19-C01.r.1 | not found | Gamma-carbonic anhydrase | | | | | |
Contig 20 | DQ658262.1 | No hit | 441958 | put minor tail | 467640 | put minor tail | 464162 | put minor tail
77-2-1-A06.r.1 | DQ658367.1 | Hypothetical protein | 440358 | Hypothetical protein | 122064 | Hypothetical protein | 96445 | Hypothetical protein
77-2-6-H01.r.2 | DQ658275.1 | No hit | none | | none | | none |
Contig 3 | DQ658245.1 | No hit | 435909 | glycine-rich protein | 440197 | glycine-rich protein | 435910 | glycine-rich protein
77-2-7-A10.r.5 | not found | IMP dehydrogenase/GMP reductase | | | | | |
Contig 14 | DQ658256.1 | No hit | 433940 | predicted | 432445 | predicted | 435759 | predicted
77-2-32-E05.r.1 | DQ658348.1 | No hit | 369968 | hypothetical Oryza sativa | 111551 | mucin 19/ put memb pro * | 118278 | CG3654 (drosphila) like
Contig 18 | DQ658260.1 | No hit | 446839 | M23/M37 family peptidase | 433931 | M23/M37 family peptidase | none |
77-2-41-E04.r.1 | DQ658363.1 | No hit | 359885 | predicted | none | | none |
77-2-8-C05.r.1 | DQ658289.1 | No hit | 465895 | nischarin like | 465884 | nischarin like | 123293 | mucin associated surface pro
Contig 17 | DQ658259.1 | No hit | 433041 | alk phosphatase | 414308 | alk phosphatase | none |
77-2-40-E03.r.1 | DQ658364.1 | No hit | 432271 | glycotransferase fam 23 | 438013 | glycotransferase fam 23 | none |
Contig 10 | DQ658254.1 | Predicted esterase | 336130 | Lhcf65** | 443878 | Lhcf32*** | 358662 | Lhcf33****
Contig 24 | DQ658266.1 | Alpha-soluble NSF attachment protein | 431677 | predicted | 438834 | predicted | 258834 | put. alpha sol NSF attachment prot.
77-2-39-G04.r.1 | not found | No hit | | | | | |
77-2-24-F03.r.1 | DQ658337.1 | Cytochrome b5 domain-containing protein like | 444544 | putative cytochrome b5 domain | 432985 | putative cytochrome b5 domain | 308954 | putative cytochrome b5 domain
Contig 22 | not found | Hypothetical protein | | | | | |
77-2-9-C12.r.1 | DQ658294.1 | Surface antigen BspA, Bacteroides forsythus | 107748 | cell surface protein | 317543 | put surface antigen PspA | 251672 | put surface antigen PspA
77-2-36-F03.r.2 | DQ658353.1 | L-3-phosphoserine phosphatase | 116987& | Phosphoserine phosphatase SerB | 70007 | Phosphoserine phosphatase SerB | none |
Contig 21 | DQ658263.1 | No hit | 444941 | predicted | 435682 | predicted | 443952 | predicted
77-2-4-B06.r.1 | DQ658270.1 | Poss elongation of very-long-chain fatty acid protein | 369636 | predicted | 371020 | predicted | 414268 | putative fatty acid elongase
77-2-41-E09.r.1 | DQ658366.1 | Arachidonate 15-lipoxygenase, second type | 455707 | putative arachidonate 15-lipoxygenase | 452272 | putative arachidonate 15-lipoxygenase | none |
77-2-19-F06.r.1 | not found | Phosphate-repressible phosphate permease | | | | | |
77-2-24-H06.r.1 | DQ658339.1 | No hit | none | | | | |
Contig 15 | DQ658257.1 | Hypothetical protein | 444546 | hypothetical | 436089 | hypothetical | 365388 | hypothetical
77-2-4-G05.r.1 | DQ658272.1 | No hit | 413928 | predicted | 442803 | hypothetical, conserved | 440188 | hypothetical
77-2-19-E12.r.1 | DQ658330.1 | Small nuclear ribonucleoprotein polypeptide D2 | 459337 | gamma-2 COP (At) | 439549 | Small nuclear ribonucleoprotein | none |
77-2-8-B04.r.1 | DQ658288.1 | 4 Fe-S ferredoxin | 413973 | CE06704 | 441835 | ankyrin repeat protein nuc-2 | 258266 | ankyrin repeat domain 6
77-2-7-B11.r.5 | DQ658281.1 | Hypothetical protein | 350739 | hypothetical | 362754 | hypothetical | 432280 | hypothetical
77-2-3-B03.r.2 | DQ658268.1 | Cysteine protease | 437163 | putative cysteine protease | 461333 | putative cysteine protease | 78350 | putative cysteine protease
77-2-5-H09.r.3 | DQ658274.1 | No hit | 441618 | predicted | 441119 | predicted | 438690 | predicted
77-2-10-C04.r.2 | DQ658297.1 | No hit | 440001 | predicted | 440709 | predicted | 440477 | predicted
77-2-14-F02.r.1 | DQ658318.1 | No hit | 437447 | predicted | 432382 | predicted | none |
Contig 4 | DQ658246.1 | Fucoxanthin chlorophyll a/c protein | 431664 | predicted | 235468 | put TF: HTH, AraC type | 95820 | C-type lectin
Contig 23 | DQ658265.1 | Ac1147 | 350790 | putative catalase | 310662 | Y4iL [rizobium] | 315901 | Y4iL [rizobium]
77-2-29-H11.r.1 | not found | No hit | | | | | |
77-2-30-E06.r.1 | DQ658341.1 | No hit | 444469 | predicted | 441522 | predicted | 223420 | put prot kinase

Notes

* manual annotation (Pagarate): best hit EHV 176 putative membrane protein

** manual annotation (LeFebvre): Light harvesting chlorophyll a/c binding protein. Part of diatoms & group IV.

*** manual annotation (LeFebvre): Light harvesting chlorophyll a/c binding protein. Part of diatoms & haptophytes group IV. Other name: 16B09. Chloroplast localised.

**** manual annotation (LeFebvre): Light harvesting chlorophyll a/c binding protein. Part of diatoms & haptophytes group IV. Other name: 16B09. Chloroplast localised.

& best hit is to scaffold 112143951, E value 0

AV colored alike