
THE COMPLETE GENOME OF THE NON-PHOTOSYNTHETIC ALGA CRYPTOMONAS

by

Natalie A. Donaher

Submitted in partial fulfillment of the requirements for the degree of Master of Science

at

Dalhousie University Halifax, Nova Scotia April 2009

© Copyright by Natalie A. Donaher, 2009

Library and Archives Canada, Published Heritage Branch, 395 Wellington Street, Ottawa ON K1A 0N4, Canada

ISBN: 978-0-494-50256-3

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.

While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

DALHOUSIE UNIVERSITY

To comply with the Canadian Privacy Act the National Library of Canada has requested that the following pages be removed from this copy of the thesis:

Preliminary Pages: Examiners Signature Page (p. ii), Dalhousie Library Copyright Agreement (p. iii)

Appendices: Copyright Releases (if applicable)

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

LIST OF ABBREVIATIONS USED

ACKNOWLEDGEMENTS

CHAPTER 1: INTRODUCTION

Photosynthetic eukaryotes

Loss of photosynthesis

Cryptomonads - the genera

Genomic complexity

Loss of photosynthesis in cryptomonads

Non-photosynthetic plastid metabolism

Project goals

CHAPTER 2: MATERIALS AND METHODS

Culturing conditions

Total DNA extraction

Cesium chloride density gradient centrifugation

Southern hybridizations

Pyrosequencing and assembly

PCR, cloning, sequencing and genome annotation

CHAPTER 3: RESULTS AND DISCUSSION

SEM and cell morphology

Cesium chloride density gradient and Southern hybridizations

Pyrosequencing data

Plastid genome structure

Gene presence and absence analysis

Codon usage and tRNA complement

Loss of photosynthesis and metabolic shift

CHAPTER 4: CONCLUSION

REFERENCE LIST

APPENDIX A: DETAILED GENE TABLE

LIST OF TABLES

Table 1: Summary of probes used for Southern hybridization

Table 2: Summary of plastid genomes sequenced to date

Table 3: Gene presence or absence in eight plastid genomes

LIST OF FIGURES

Figure 1: Diagram of secondary endosymbiosis

Figure 2: Consensus phylogeny of the cryptomonad clade

Figure 3: Scanning electron microscopy images of Cryptomonas paramecium

Figure 4: Cesium chloride gradient centrifugation and Southern hybridizations

Figure 5: A circular mapping diagram of the complete plastid genome

Figure 6: Schematic representation of the loss of photosynthetic genes

ABSTRACT

The cryptomonads are a group of unicellular algae that have acquired their plastid through the engulfment of a red algal cell ("secondary endosymbiosis"). There is evidence for multiple, independent losses of photosynthesis within the group, including in the species Cryptomonas paramecium. These lineages join a growing number of organisms that have become secondarily non-photosynthetic (including land plants and apicomplexans) but nonetheless retain a plastid genome. Here I present the completely sequenced plastid genome of the non-photosynthetic alga Cryptomonas paramecium. The 78 kb genome contains 83 genes including a single rRNA region, 29 tRNAs, and one pseudogene. Compared to the completely sequenced plastid genomes of the photosynthetic cryptomonads Guillardia theta and Rhodomonas salina, it is approximately 50 kb smaller in size, and is completely missing the photosynthetic gene families psa and psb. The GC content (38%) is higher than that of the other cryptomonad plastid genomes, as is the coding capacity (87%).

LIST OF ABBREVIATIONS USED

bp        base pair
kb        kilobase pair
DNA       deoxyribonucleic acid
CCAP      Culture Collection of Algae and Protozoa
EDTA      ethylene-diamine-tetra-acetic acid
EtBr      ethidium bromide
LB        Luria-Bertani
rRNA      ribosomal ribonucleic acid
tRNA      transfer ribonucleic acid
GC/AT     guanine, cytosine / adenine, thymine
ATP       adenosine triphosphate
ERAD      endoplasmic reticulum-associated degradation
SSU       small sub-unit
LSU       large sub-unit
ITS       internal transcribed spacer
EST       expressed sequence tag
ddH2O     double distilled water
UV        ultra-violet
TE        tris-EDTA
PCR       polymerase chain reaction
V         volts
O/N       overnight
SDS       sodium dodecyl sulfate
SSC       sodium chloride and sodium citrate solution
CCD       charge-coupled device
TAE       tris-acetate-EDTA
TBE       tris-borate-EDTA
NCBI      National Center for Biotechnology Information
v/v       volume to volume

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Dr. J. M. Archibald, as well as my committee members Dr. F. Doolittle and Dr. A. Roger. Their patience with my last minute meetings and insight into my research have been invaluable. It is easy to prioritize right and look good doing it when you are standing on the shoulders of giants.

To my lab mates - I don't think I could come up with something clever, biting or witty enough to do you all justice. Chris, Eunsoo, Hameed, Tia, Anna and more recently Christa, Julia, Bruce, Rob, Naoko and Goro - I kept telling everyone who would listen when I first started how lucky I was to find such a cohesive and inviting lab. As hard as I tried, I could only stretch my time here to 2.5 years and it still seems too short.

And finally to my family members, who didn't blink an eye when I asked how straight they could cut tile or lay hardwood all while trying to balance in the lab. "Thanks for the support" seems a bit trite and underwhelming - it's more like "Thanks for propping me up." To my strongest pillar and best friend, thank you Daniel. This is for you.


CHAPTER 1: INTRODUCTION

Photosynthetic eukaryotes

Eukaryotes are believed to have evolved photosynthesis ~1.5 billion years ago, when a cyanobacterium was engulfed by a phagotrophic cell (Yoon et al., 2004). The endosymbiosis integrated the cyanobacterium into the cell as the chloroplast (also known as the plastid). Proof of this long-ago event has come in the form of irrefutable molecular phylogenies, as well as biochemical and ultrastructural evidence linking the plastid to an endosymbiotic cyanobacterium (Archibald, 2009). Three lineages evolved from that primary endosymbiosis: the glaucophytes, the red algae and the green algae (Moreira et al., 2000, Palmer, 2003, Rodriguez-Ezpeleta et al., 2005). The glaucophytes may be the most basal lineage of primary phototrophs, containing a peptidoglycan wall surrounding the plastid that very clearly resembles the cell wall of cyanobacteria (Steiner et al., 2005). The glaucophytes and red algae share pigment types: phycobilins and chlorophyll a are present in both lineages, whereas the green lineage contains chlorophylls a and b. Algae are found almost everywhere water and sunlight occur together, even in relatively nutrient-poor areas such as the open ocean (Parker et al., 2008). Despite their small physical size, these marine organisms are nonetheless responsible for 40%-50% of the Earth's yearly photosynthetic output (Falkowski et al., 1998).

Photosynthesis occurs in the plastid, where imported, nuclear-encoded proteins combine with plastid-encoded proteins to drive the fixation of environmental inorganic carbon into storage molecules. The basic apparatus involves Photosystem I (PSI) and Photosystem II (PSII). PSII oxidizes water, passing electrons through a cytochrome b6/f complex to PSI while creating a proton gradient to produce reducing equivalents for autotrophy (Falkowski et al., 2004). The ATP harvested from light yields energy storage in the form of polysaccharides, created from carbon which has diffused into the chloroplast via the stroma (Ishida et al., 2008). Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCo) is capable of using either CO2 or O2 to cleave the double bond of its RuBP substrate (ribulose 1,5-bisphosphate) (Parker et al., 2008). The use of oxygen, a process called photorespiration, is less efficient at producing energy equivalents than the direct use of CO2 (Roberts et al., 2007).

RuBisCo substrate specificity factors vary among the photosynthetic organisms - red algae and share a form of RuBisCo that is less prone to oxygen fixation and achieves greater carbon incorporation in low carbon conditions compared to green algae (Tortell, 2000). As carbon can be scarce in the open ocean, marine algae use a CCM (carbon concentrating mechanism) to gather inorganic carbon. Carbonic anhydrases transport carbon, in the form of bicarbonate, to the active site of RuBisCo. The bicarbonate is less likely to diffuse out of cellular compartments, so it is easier to concentrate it in the stroma of the plastid, where RuBisCo is also concentrated (Spalding, 2008, Kaplan and Reinhold, 1999). A second concentrating mechanism, the C4 pathway, may also operate in unicellular algae. Unlike the biophysical CCM, the C4 pathway is biochemical - converting inorganic carbon to a C4 compound (either malic or aspartic acid) which is decarboxylated to generate the carbon for RuBisCo (Reinfelder et al., 2000). Although several genome sequencing projects have found genes for putative C4 enzymes in the aquatic unicells, there remains some question as to the localization of the enzymes, as they lack plastid-targeting leader peptides (Edwards et al., 2004, Kroth et al., 2008).

The distribution of photosynthesis across the eukaryotic tree of life is wide. Currently, photosynthesis exists in four of the six hypothesized supergroups: Rhizaria, Chromalveolata, Excavata and the Archaeplastida (Lane and Archibald, 2008). Since photoautotrophs are preferred food for some heterotrophic predators ( in particular), higher food chain organisms are regularly exposed to plastids from prey species (Weisse, 2002, Pedros-Alio, 1995). The combination of easy access and lucrative energy output highlights the evolutionary pressure on heterotrophic organisms to adopt a photosynthetic lifestyle.

The distinguishing feature separating the known diversity of photosynthetic eukaryotes is the method of plastid acquisition: "primary plastid" containing organisms versus "secondary plastid" containing organisms (Archibald, 2009). Whereas primary endosymbiosis describes the enslavement of a photosynthetic prokaryote by a eukaryotic cell, secondary endosymbiosis is the engulfment and retention of a photosynthetic eukaryotic cell by another eukaryotic cell (Fig. 1). In fact, the majority of algal lineages acquired their plastid via secondary endosymbiosis (Falkowski et al., 2004). Their abundance is a testament to nature's ability to transform and adapt established cellular processes to new circumstances; in this case, the integration of host and endosymbiont to form an entirely new organism. Examples of secondarily derived, plastid-containing organisms include heterokonts, haptophytes, dinoflagellates, apicomplexans, euglenids, chlorarachniophytes, and cryptophytes (Archibald, 2009).

Compared to green plastids, red plastids have retained the capacity to function more autonomously, with retention of core plastid genes such as rbcS (which is nuclear-encoded in green lineages), RuBisCo regulators (e.g., cfxQ), and replication/chaperone genes (e.g., groEL and dnaB). It has been hypothesized that the presence of these genes in the plastid genome paved the way for a smoother incorporation of a secondarily-acquired plastid into otherwise heterotrophic cells. Since the red plastids were more "portable" (containing more genes) than the green plastids, more heterotrophs converted via engulfment of a red alga, and we now have an abundance of secondary (and tertiary) red-derived photosynthesizers in the ocean (Grzebyk et al., 2003). Of course this scenario is dependent on the acquisition of red plastids occurring more than once, which is still contentious and discussed further below (Keeling et al., 2004).

The process of secondary endosymbiosis is exceedingly complex from a genetic and cell biological perspective. In many secondary plastid-containing organisms, the only obvious feature of the plastid suggesting a secondary endosymbiotic origin is the additional membranes surrounding the organelle, as all other cellular features of the original endosymbiont have been phased out (Hashimoto, 2005). Yet genetic evidence for the added complexity of the endosymbiont abounds. Phylogenetic incongruencies, conspicuous protein targeting mechanisms and, in a select few species, the remnant nucleus of the endosymbiont (present as a nucleomorph) are all taken as evidence for the secondary acquisition of the plastid by one eukaryote from another eukaryote (Douglas and Penny, 1999, Archibald, 2009).

Figure 1: Diagram of secondary endosymbiosis. (A) Secondary endosymbiosis involving a phagotrophic and a photosynthetic eukaryote. (B) In most lineages, the endosymbiont nucleus is lost. (C) In chlorarachniophytes and cryptomonads, a relict endosymbiont nucleus persists as a nucleomorph. Labels in the figure: secondary host phagosomal membrane, primary host plasma membrane, cyanobacterial outer membrane, cyanobacterial inner membrane, primary host cytosol, secondary host endomembrane lumen, primary host relict nucleus. N = nucleus, P = plastid, M = mitochondrion, Nm = nucleomorph.

Apicomplexans, cryptophytes, dinoflagellates, heterokonts and haptophytes all contain at least some members that possess a plastid derived from red algae. Whether these plastids derive from the same red alga (as a single event) or not is hotly contested (reviewed in Archibald, 2009). The published literature provides multiple examples of poorly supported phylogenetic trees linking the five groups (Daugbjerg and Andersen, 1997, Oliveira and Bhattacharya, 2000, Muller et al., 2001). Cavalier-Smith (1999) first suggested uniting all five groups into the supergroup called Chromalveolata (a group consisting of the chromists - including cryptophytes, haptophytes and heterokonts, and the alveolates - containing the apicomplexans and dinoflagellates). Such a hypothesis assumes multiple instances of plastid loss. If instead two or more independent red algal plastids were acquired, it would explain the patchy distribution among the Chromista and Alveolata. Precedent for multiple independent acquisitions of plastids occurs with green algal-derived plastids, as the euglenids and the chlorarachniophytes are thought to have acquired their plastids in separate events (Archibald, 2009).

As a side-effect of genome integration and reduction occurring among the organisms mentioned above, many plastid-specific genes (and, if present, nucleomorph genes) have migrated to the host nucleus by endosymbiotic gene transfer (EGT) (Martin et al., 1998, Soll and Schleiff, 2004). Once in the nucleus, genes are under regulatory control, as well as being "protected" from the potentially oxidizing metabolic environment of the organelle (Allen and Raven, 1996, Pfannschmidt et al., 2001, Barbrook et al., 2006). Targeting mechanisms are required to transfer nucleus-encoded, cytosol-translated proteins into the plastid compartment, which in secondarily derived organisms can be surrounded by up to four membranes. An amino (N-) terminal transit peptide based system is used to transfer proteins through the two membranes belonging to the original cyanobacterial endosymbiont (facilitated by the TIC/TOC transmembrane protein complexes), whereas an additional N-terminal signal sequence is required to move proteins through the additional membranes of a secondarily-derived lineage (McFadden and van Dooren, 2004, Gould et al., 2008). In some cases, this includes passage to the organelle via the endoplasmic reticulum, and then potentially shuttling through the additional membranes using vesicles, pores or the ERAD system to finally arrive at the plastid stroma (Gould et al., 2008, Hempel et al., 2007).

The theme of a multi-layered "complex-cell-within-a-complex-cell" is expanded even further in the case of tertiary endosymbiosis, where a photosynthetic eukaryotic cell swaps its plastid for one from another eukaryotic photosynthesizer (dinoflagellates being the most conspicuous example) (Patron et al., 2006). The genetic mayhem caused by the single (or continual) introduction of plastids, with the ensuing evolution of targeting mechanisms and regulatory control, is not thoroughly understood (Kim and Archibald, 2008). Many of the most genetically complex cells known have evolved the ability to photosynthesize by stealing the plastids of other eukaryotic organisms (Chesnick et al., 1996, Schnepf and Elbrachter, 1999).

At the top of the list of complex plastid-containing organisms, by virtue of their direct impact on human health, are the apicomplexans. Parasitic organisms with tiny genomes, these cells are entirely non-photosynthetic during their life cycle (McFadden et al., 1997). And yet, there is evidence that at least some of the members of this lineage (such as and Toxoplasma) contain a remnant plastid called an apicoplast. Electron micrographs show membrane-bound organelles that are believed to be reduced plastids, metabolic studies have shown plastid-specific proteins present in the cell, and several apicoplast genomes have been sequenced (Kohler et al., 1997, Hopkins et al., 1999, Tomova et al., 2006). Heterokonts are another group containing secondarily derived photosynthetic organisms. They are notable for their morphological diversity: sizes in the group range from pico-sized unicellular organisms to huge, multi-cellular kelp (Kim and Archibald, 2008). Haptophytes, like the coccolithophore Emiliania huxleyi, are an ecologically important secondary plastid-containing lineage that can outnumber the combined total of all other phytoplankton cells during massive, cyclical oceanic blooms (Jeong, 1999). The dinoflagellates are incredibly adept at adopting new plastids from their food sources (examples include engulfment of diatoms, cryptophytes, and haptophytes), and this has resulted in a lineage with many tertiary plastid-containing photosynthesizers with equally complex genetic movement among the cellular compartments (Wilcox and Wedemayer, 1984, Takishita et al., 2002, Schnepf and Elbrachter, 1999, Koike et al., 2005). Albeit complicated, their hyper-active organelle acquisition may prove useful in determining how genes transfer from the plastid to the endosymbiont nucleus and then on to the host nucleus (Hackett et al., 2004b). Cryptophytes, the group containing my organism of interest, will be discussed in further detail later in this chapter. They are unusual among the red algal-derived lineages mentioned above because of the presence of a nucleomorph (Douglas et al., 2001). Also containing a nucleomorph (but of green algal ancestry) are the chlorarachniophytes. Comparisons among the nucleomorphs of both groups have already shed light on the process of genome reduction, and may one day answer why the endosymbiont nucleus of other secondary plastid-containing organisms has disappeared entirely (Silver et al., 2007, Lane et al., 2007).

Finally, the euglenids are the only photosynthetic lineage found in the supergroup Excavata. Phylogenetic studies show the secondary endosymbiosis of a green alga among euglenids was a separate event from the event giving rise to the chlorarachniophyte plastid (Rogers et al., 2007). This relatively well-studied group could yield further insight into the ecological requirements for adoption of photosynthesis in a group of organisms that are otherwise entirely non-photosynthetic.

Combined with the Archaeplastida, which include the primary plastid-containing glaucophytes, red algae, green algae and land plants, the pool of plastid-containing organisms is enormous. In the next section, I will discuss the subsequent loss of photosynthesis by plastid-bearing organisms.

Loss of photosynthesis

The loss of photosynthesis in previously autotrophic organisms has occurred in the heterokonts, apicomplexans, dinoflagellates, haptophytes, land plants and the cryptophytes (the subject of this study). Since all of these distinct lineages have photosynthetic relatives, there is evidence for multiple, independent losses of photosynthesis (Kim and Archibald, 2008). The list includes primary and secondary plastid-containing organisms, both uni- and multi-cellular. In most cases studied to date, the plastid itself appears to be retained, regardless of the photosynthetic ability of the host (Bodyl, 2005). The evidence so far suggests that even in organisms that have completely reverted to a non-photosynthetic lifestyle, the plastid cannot be eliminated entirely, raising interesting questions regarding what, in fact, the plastid's function in non-photosynthetic organisms might be (as in the parasitic Perkinsus atlanticus; Teles-Grilo et al., 2007).

Non-photosynthetic flowering plants represent naturally-occurring photosynthesis mutants, and can provide insight into the evolutionary pressures at play for other plastid metabolic functions. Land plants that have lost photosynthesis and have become either mycoheterotrophic or parasitic have evolved in at least 11 angiosperm lineages, representing 1% of all angiosperm species (Barkman et al., 2007, Krause, 2008). They sometimes develop special feeding organs instead of photosynthetic leaves, and are attached to the host by either the shoots or the roots. They are divided into hemi-parasites (which remove only inorganic nutrients from their host) and holo-parasites (which remove organic and inorganic substances from the host plant). The genus Cuscuta contains over 150 holoparasitic species and displays a continuum of angiosperm parasitism. The plastid genomes of this genus likewise show a varied ability to express photosynthetic genes, with some species having lost more photosynthetic genes than others (Krause, 2008).

Studies on Cuscuta as well as other non-photosynthetic land plants (whose plastid genomes tend to be more uniform in composition) have identified a number of "core" plastid proteins that appear to be encoded even in non-photosynthetic organellar genomes (Martin et al., 2002, Martin et al., 1998). The plastid genomes of the non-photosynthetic liverwort Aneura mirabilis (Wickett et al., 2008), the haustorial parasite Epifagus virginiana (dePamphilis and Palmer, 1990), and the holoparasitic plants Cuscuta spp. (McNeal et al., 2007) have been published. Although these land plants are primary plastid-containing organisms, the comparative studies are nonetheless worthwhile in a larger context, for example, in identifying trends in plastid genome reduction or re-arrangement that are applicable to both primary and secondary plastids. The relaxed selective constraints, with the subsequent pseudogenization of photosynthesis-specific genes, are apparent in these plant plastids at various levels of degradation.

More applicable to this study, which presents the completed plastid genome of the non-photosynthetic cryptomonad Cryptomonas paramecium, are the unicellular heterotrophs. There are multiple plastid genomes either partially or completely sequenced that belong to organisms whose most closely related phylogenetic neighbours are photoautotrophs (whether primary or secondary). To date, a partial plastid genome sequence for the (predominantly free-living) green alga Prototheca wickerhamii has been obtained (approx. 28 kb of 54 kb; Knauf and Hachtel, 2002), as well as a complete plastid genome for the pathogenic green alga Helicosporidium sp. (de Koning and Keeling, 2006). Both of these organisms have primary plastids. The plastid genomes of multiple apicomplexans have been completed, and although they contain secondary plastids, they are no longer free-living (Kissinger et al., direct GenBank submission in 1997, Cai et al., 2003, Wilson et al., 1996). Apicomplexans import many metabolites and proteins via their host and thus generally have very reduced plastid genomes. While a typical plastid genome from a land plant is ~150 kb, and that belonging to a non-photosynthetic land plant is ~75 kb, the smallest apicomplexan plastid genome is a mere 35 kb (Krause, 2008). That leaves a single (complete) plastid genome sequence belonging to a secondarily-derived, free-living, secondarily non-photosynthetic organism with which to compare the genome studied here - that of the euglenoid Astasia longa. A. longa is a colorless heterotrophic flagellate whose close relative is the photosynthetic euglenoid Euglena gracilis (Gockel and Hachtel, 2000). It has a chloroplast genome of 73 kb, which contains tandemly arranged rDNA repeat regions, three genes for RNA polymerase, 27 tRNA genes and, interestingly, the photosynthesis-related gene rbcL. The comparative analysis of the green algal-derived E. gracilis versus its non-photosynthetic counterpart A. longa has paved the way for a similar comparison in the red algal plastid-containing cryptomonad clade completed in this study.

Cryptomonads - the genera

The unicellular cryptomonads have been isolated from marine, brackish and freshwater environments (Klaveness, 1988). Most are photosynthetic (these species are called "cryptophytes") but there are some secondarily non-photosynthetic species and one distantly related aplastidic genus (Goniomonas) (McFadden et al., 1994). The shift from marine to freshwater ecosystems has taken place at least two, and possibly three times independently during cryptomonad evolution (Shalchian-Tabrizi et al., 2008). There are over 200 cryptomonad strains in the culture collections held around the world (Hoef-Emden et al., 2002). The cryptomonads can be separated into five multi-species clades and two single-species groups: a Rhodomonas-and-related clade, a Plagioselmis-and-related clade, a Guillardia-and-related clade, a Hemiselmis-and-related clade, the Cryptomonas clade and the single species Proteomonas sulcata and Falcomonas daucoides (Fig. 2). Only the Cryptomonas, Hemiselmis and Guillardia clades have had any genomes completely sequenced.

Apart from the presence of complexes for chlorophyll a and c2 outside the plastid thylakoid structure, all cryptomonad cells contain the pigment phycoerythrin acquired from the ancestral red algal endosymbiont (MacColl et al., 1976, Apt et al., 1995). The phycoerythrin evolved into 7 "biliproteins" (3 red phycoerythrin and 4 blue phycocyanin types) that are found within the thylakoid lumen of different species (Gantt et al., 1971, Hill and Rowan, 1989, Ludwig and Gibbs, 1985). The modified biliprotein is thought to enhance the ability of cells to photosynthesize in low light conditions (Gervais, 1998, Hammer et al., 2002). Cryptophytes are found in the lower layers of the photolimnion in the ocean, where they can exploit the low-light niche (Salonen et al., 1984). Even freshwater species prefer the lower epilimnion of freshwater lakes, where they can be found in greater abundance than their marine counterparts (Gervais, 1997). Cryptomonads are also able to travel along the vertical water column in search of nutrients or to escape predation (Pedros-Alio et al., 1995). The Cryptomonas clade, to which Cryptomonas paramecium belongs, contains brown/red/blue pigmented cells, as well as colourless non-photosynthetic species (Hoef-Emden and Melkonian, 2003).

Cryptomonad cells have an asymmetrical shape, with a characteristic groove along the ventral cell wall (called the furrow-gullet), and a pair of anterior flagella that are used in locomotion (Kugrens et al., 1994). The cell is surrounded by a periplast - a protein/membrane structure that is comprised of two outer layers of polygonal protein plates sandwiching an interior plasma membrane (Brett et al., 1994). Ejectosomes, coiled projectiles released during stress, can be seen under the light microscope lining the furrow-gullet. A large vacuole is used for osmotic pressure control in the freshwater genus Cryptomonas (Morrall and Greenwood, 1980, Clay et al., 1999).

Figure 2: Consensus phylogeny of the cryptomonad clade. Cryptomonas paramecium, whose plastid genome is presented in this thesis, is part of the Cryptomonas clade. Based on Hoef-Emden et al. (2002). Tree labels recoverable from the figure: Proteomonas sulcata, Falcomonas daucoides, Guillardia clade, outgroup.

The cryptomonads are thought to be related to two other groups of aquatic photosynthesizers: the haptophytes and the heterokonts (Khan et al., 2007b). An early hypothesis suggested a specific relationship between these three lineages based on shared pigment types and membrane structure, raising the possibility that they share a red algal endosymbiosis event (Cavalier-Smith, 1986). More recently though, concatenated phylogenies consistently resolve a specific relationship between the cryptomonads and the haptophytes, to the exclusion of heterokonts (Hackett et al., 2007a, Patron et al., 2007, Burki et al., 2008). This finding casts doubt on the monophyly of the Chromista lineage (cryptomonads-haptophytes-heterokonts) as well as the more broad-scale Chromista-Alveolata group (the Chromalveolates) (Lane and Archibald, 2008).

Genomic complexity

Since cryptophytes acquired their plastid via an endosymbiotic relationship with a red alga, the cells are complex amalgamations of genetic compartments with different evolutionary histories that must work as a coherent whole. The plastid in particular is a feat of integration, with four membranes (two from the original cyanobacterial endosymbiont, one from the secondary endosymbiont and one from the host vacuole) and a relict nucleus (called the nucleomorph) (Gould et al., 2008). The endosymbiont cytosol is called the periplastidial space, and must be traversed by any plastid-targeted proteins synthesized elsewhere in the cell. Some core biochemical processes occur in the periplastidial space, denoted by the presence of ribosomes and starch storage, but they are poorly understood (Haferkamp et al., 2006). In cryptomonads, the outer-most membrane surrounding the plastid is continuous with the host nuclear envelope and the rough endoplasmic reticulum (Cavalier-Smith, 1999).

Due to the integration of multiple genomes in a single cell, the cryptomonads can offer insight into the process of genome reduction and EGT from the plastid and nucleomorph into the host nucleus (Lane and Archibald, 2008). The genome reduction of the plastid appears to be even more pronounced in the non-photosynthetic species. Estimates of genome size in the cryptophyte plastid range from 130-150 kb, while the non-photosynthetic C. paramecium strain examined in this study has been shown by karyotyping to contain a plastid genome of a mere 70 kb (Goro Tanifuji, Doctorate thesis, 2006).

To date, three cryptomonads have each had two organellar genomes sequenced: Rhodomonas salina's plastid (Khan et al., 2007b) and mitochondrial genome (Hauth et al., 2005) are complete, while Hemiselmis andersenii had its nucleomorph (Lane et al., 2007) and mitochondrial genome sequenced (Kim et al., 2008). The model cryptomonad Guillardia theta has a sequenced nucleomorph (Douglas et al., 2001) as well as plastid genome (Douglas and Penny, 1999). All are available in the GenBank database.

Loss of photosynthesis in cryptomonads

Within the photosynthetic genus Cryptomonas, at least three lineages have lost photosynthesis independently (Hoef-Emden, 2005). The colourless cryptomonads still contain plastids, sometimes called leucoplasts. Evidence for the polyphyly of leucoplast-containing cryptomonads came in the form of phylogenetic trees of concatenated as well as single nucleomorph and nuclear genes (SSU, LSU, and ITS regions) (Hoef-Emden, 2005). The non-photosynthetic cryptomonads have accelerated evolutionary rates, which is consistent with previous studies that have shown higher substitution rates and AT biases in organelle/endosymbiont genomes in plants and algae that have switched modes of nutrition. It is unclear whether a change in trophic strategy from autotrophic to heterotrophic is a result of genetic changes, or whether it is a cause of them. dePamphilis and Palmer (1990) suggested that genetic changes precede the loss of photosynthesis, but it is conceivably advantageous to adopt mixotrophy, which would relax the selective pressure on the plastid. Hoef-Emden et al. (2005) also propose a relaxation of constraints among the cryptomonads prior to the loss of photosynthesis in several lineages. It has been suggested that the loss of sexual reproduction could also reduce selective constraints. Research into different life-cycles among the cryptomonads is ongoing, but to date, several species have been found to exist only as a single (presumably asexually reproducing) morphotype (Hoef-Emden and Melkonian, 2003).

Non-photosynthetic plastid metabolism

The presence, and relative conservation, of a minimal plastid in non-photosynthetic organisms strongly suggests a functional existence rather than purely selfish propagation. In the euglenoid Astasia longa and the heterotrophic green alga Prototheca wickerhamii, there is evidence for transcription occurring in the leucoplast (Northern blotting in the former and sequencing of ESTs in the latter) (Gockel and Hachtel, 2000, Borza et al., 2005). The genetic characteristics identifying a functional leucoplast are unclear: while the leucoplast of A. longa lacks pseudogenes, the plastid DNA of the parasitic liverwort Aneura mirabilis has an elevated number of pseudogenes compared to other leucoplasts (van der Kooij et al., 2000, Wickett et al., 2008). It is likely that pseudogenes are just the first step in the process of genome reduction, regardless of , and that Astasia longa has been non-photosynthetic for longer than has Aneura mirabilis.

The first glimpse into leucoplast functional significance was gleaned from studying the extremely small (35 kb) leucoplast genome of the apicomplexan parasites. Studies disrupting leucoplast function caused delayed death of the apicomplexan cell, suggesting a vital function in cell apoptosis (McConkey et al., 1997, Fichera and Roos, 1997). Additionally, some metabolic processes were identified in the apicoplast, including fatty acid synthesis and tetrapyrrole biosynthesis. In the parasitic green alga Helicosporidium sp., additional pathways for amino acid synthesis were identified. In the free-living Prototheca wickerhamii, non-photosynthetic pathways were identified, including carbohydrate metabolism (Borza et al., 2005).

Project goals

In this thesis, I present the sequenced plastid genome of the non-photosynthetic cryptomonad Cryptomonas paramecium. Studies into the plastid evolution of secondarily non-photosynthetic organisms will benefit from the complete sequencing of the plastid genome of C. paramecium, which differs from many other sequences currently available in that it is derived from a free-living organism. While there have been multiple sequencing projects for parasitic land plants and parasitic apicomplexans, to date there has been just a single plastid genome published for a free-living, non-photosynthetic organism (Astasia longa).

The independent evolutionary acquisition of the (green algal) plastid in the euglenoid A. longa differentiates it from the (red algal-plastid containing) cryptomonad C. paramecium. In fact, this plastid is the first free-living, red-algal derived, secondarily acquired leucoplast genome to be sequenced. Perhaps more important to the understanding of plastid evolution in secondarily non-photosynthetic organisms is the abundance of photosynthetic species closely related to C. paramecium. Comparative analysis between C. paramecium and the plastids of Guillardia theta and Rhodomonas salina highlights the genomic changes associated with the loss of photosynthesis in these unicellular algae.

CHAPTER 2: MATERIALS AND METHODS

Culturing conditions

Cultures of Cryptomonas paramecium were obtained from the Culture Collection of Algae and Protozoa (CCAP). The strain designation for the culture used in this study is 977/2a. Cultures were maintained at room temperature, in either 1 L glass bottles or 750 mL disposable plastic culture flasks. Media was prepared with 1 g sodium acetate trihydrate plus 1 g Lab-Lemco powder per 1 L of ddH2O.

Scanning electron microscopy (SEM)

SEM samples were prepared from Cryptomonas paramecium cultures as follows: a 5% glutaraldehyde solution was added 1:1 (v/v) to a pellet obtained from liquid culture, and the cells were fixed for 30 min (final concentration of glutaraldehyde 2.5%). The cells were centrifuged at 800 g for 2 min and rinsed with filtered seawater three times (centrifugation performed after each step). The samples were suspended in a 2% osmium tetroxide (OsO4) solution for 30 min, washed 3x with ddH2O and diluted in ddH2O in preparation for the next step.

The cell preparation was filtered through a syringe filter (0.25 µm; Millipore, Billerica, MA, USA), and rinsed with an increasing concentration of ethanol. The first rinse was done with a 25% ethanol solution and allowed to incubate 5 min. Similar incubations were done using 35%, 50%, 70%, 80%, 90% and then 3x 100% ethanol rinses. The samples were critical-point dried, and the filter was sectioned into 1 cm x 1 cm pieces for attachment to the coating stub with carbon tape. The coating with a gold/palladium target was done with a SC7620 mini Sputter Coater (Quorum Technologies, New Haven, East Sussex, UK). Samples were viewed with a S-4700 cold field emission scanning microscope (Hitachi, Tokyo, Japan).

Total DNA extraction

Total cellular DNA was extracted from large-scale liquid cultures (3-4 L) when cell density was approximately 22,000 cells/mL. Cells were pelleted by centrifugation at 2,100 g for 15 min (Beckman Model J2-21, Beckman Coulter Inc., Fullerton, CA, USA). Pellets were gently resuspended in 100 mL of growth media, then transferred to 2x 50 mL Blue Max polypropylene conical Falcon tubes (Becton/Dickinson Labware, Franklin Lakes, NJ, USA). The Falcon tubes were centrifuged at 1850 g in a Sorvall Legend RT centrifuge (Mandel Scientific Company Inc., Guelph, Ont.). The supernatant was poured off and replaced with 2x 2.5 mL of Extraction Buffer (1 M Tris-HCl (pH 7.5), 5 M NaCl, 0.5 M EDTA, 20% sodium dodecyl sulfate (SDS)) and the white pellet was resuspended by vortexing (MiniVortexer, VWR Scientific, Batavia, IL, USA). The resulting 6 mL solution was aliquoted equally among 6x 2.0 mL microcentrifuge tubes and heated for 10 min (agitated at the five min mark) in a standard heatblock at 50 °C (VWR Scientific, Batavia, IL, USA). The tubes were then spun at maximum speed in a microcentrifuge (13,000 g; Centrifuge 5415 D, Eppendorf, Mississauga, Ont.). The supernatant was pipetted into 1 mL of phenol:chloroform:isoamyl alcohol (PCI) mixture (25:24:1) without disturbing the pellet. The microcentrifuge tube was shaken by hand for 5 min, then centrifuged at maximum speed for 5 min to separate the aqueous and non-aqueous layers. The upper aqueous layer containing nucleic acids was removed without disturbing the interface, and was subjected to a second 1 mL PCI wash. The 5 min shaking/5 min centrifugation was repeated. A single chloroform:isoamyl alcohol (24:1) wash was done on the sample supernatant, further removing unwanted cellular components from the nucleic acids. The mixture was inverted for 2 min, and centrifuged at maximum speed for 2 min. The aqueous layer from this final separation (between 0.75 mL and 1 mL) was transferred to a labeled microcentrifuge tube and to it was added 1/6 volume of 100% molecular grade isopropanol and 1/10 volume of 3 M sodium acetate. After inverting, the DNA precipitate was often visible immediately, although samples were always stored at -20 °C for at least 24 h. Precipitated DNA was pelleted by centrifugation at maximum speed for 30 min, after which the visible pellets were washed with 80% ethanol and centrifuged at maximum speed for 5 min. Ethanol was decanted off, and the pellets were dried under vacuum for 5-8 min in a desiccator (Bel-Art, Pequannock, NJ, USA) and were resuspended in 250 uL of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). The resuspended DNA was either combined in preparation for further separation or diluted to serve as PCR template.

Cesium chloride density gradient centrifugation

Total cellular DNA was subjected to cesium chloride (CsCl) density gradient centrifugation in order to purify AT-rich organellar DNA for pyrosequencing. Resuspended total cellular DNA from approximately 25 L of C. paramecium culture was combined in a 15 mL Corex tube (42 microcentrifuge tubes at 250 uL TE each) to yield a final volume of 10.5 mL of TE. To this liquid was added 11 g of cesium chloride, and the mixture was shaken until the solution, cooled by the endothermic dissolution of the salt, returned to room temperature. In order to visualize the separated bands under long wave UV light, 2 mg of Hoechst Dye 33258 was added to the Corex tube, which was shaken horizontally (S-500 Orbital Shaker, VWR Scientific, Batavia, IL, USA) for at least 1 h (ideally, distributing the dye homogeneously). In order to prepare the sample for high force ultracentrifugation, the TE-dye-CsCl mixture was transferred to a Quick-Seal centrifuge tube (Beckman Coulter Inc., Fullerton, CA, USA) that was sealed without any air bubbles using the Beckman sealer (Beckman Coulter Inc., Fullerton, CA, USA).

Samples were spun at 40,000 rpm under vacuum for 44 h using a Ti-75 rotor in an L8-M ultracentrifuge (L870M Ultra, Beckman Coulter Inc., Fullerton, CA, USA). Visualization of the DNA was achieved with long wave ultraviolet light (BlakRay UVL-21, Ultra-violet Products Inc., San Gabriel, CA, USA). The sealed tube was punctured with a 30 gauge needle, and the plastic top was cut away with a heated blade to expose the gradient. Bands were removed from the top down with the aid of a 10 mL syringe (BD 10 mL Syringe, Becton/Dickinson Labware, Franklin Lakes, NJ, USA). Each syringe, containing DNA from one band, was labeled and set aside for dye removal.

The intercalated Hoechst dye was removed from the DNA by mixing the gradient fractions 1:1 (v/v) with water-saturated butanol. Butanol was added to the sample via the syringe tip, shaken vigorously for 1 min by hand, and then allowed to separate at room temperature for 5 min (until two well resolved phases appeared). The butanol was then removed with the syringe needle, and the process repeated until the aqueous CsCl mixture was free of Hoechst dye (verified under UV light). In order to free the DNA from the cesium salts, an overnight ethanol precipitation was done using a 1x volume of TE buffer, a 2x volume of 100% ethanol, a 1/10 volume of sodium acetate and 2-4 ug of linear polyacrylamide (for organellar fractions). Samples were stored O/N at 4 °C and spun down in the ultracentrifuge (Thermo IEC B-22M, International Equipment Company, Needham Heights, MA, USA) for 15 min. Between 3 and 6 washes were done on the invisible pellets with 80% ethanol. Each rinse was followed by a 5 min spin at 10,000 g. Samples were rehydrated in 400 uL of TE buffer, and were used for Southern hybridizations or sent for 454 pyrosequencing.

Southern hybridizations

Visualization of purified DNA from the isolated CsCl density gradient bands was done using a 0.8% agarose gel. About 100 ng of DNA was electrophoresed per organellar fraction from the gradient, at 80 V for 45 min (Mini-Sub Cell GT & PowerPac Basic, Biorad Laboratories, Hercules, CA, USA). The gels were stained with a [2 mg/mL] ethidium bromide solution for 4 min and destained for 15 min in ddH2O. Digital images of the gel under UV light (High Performance Ultraviolet Transilluminator, UVP Ltd., Upland, CA, USA) were taken using an in-house gel documentation system (DigiDocIt, VWR Scientific, Batavia, IL, USA).

Gels were depurinated in 0.25 M HCl for 10 min, and then washed 2x (15 min each) with a denaturation solution (1.5 M NaCl, 0.5 M NaOH), followed by 2x (15 min each) of neutralization solution (1.5 M NaCl, 0.5 M Tris-Cl (pH 7.0)). Equilibration of the gel for a minimum of 10 min in 20X SSC (3 M NaCl, 0.3 M Na3C6H5O7·2H2O (pH 7.0)) was required before the gel was stacked into a capillary-based DNA transfer set-up. The DNA was transferred to a positively charged nylon membrane O/N (Roche Diagnostics Corp., Indianapolis, IN, USA) and stabilized on the membrane through cross-linking with UV light (UV Stratalinker, Stratagene, La Jolla, CA, USA). The membrane was incubated O/N at 45-55 °C with either nucleomorph, plastid, or mitochondrial rRNA gene probes to determine the relative purity of the organellar fractions from the cesium chloride gradient (see Table 1 for probe information). Membranes were pre-treated with heated hybridization buffer (5X SSC, 0.1% N-laurylsarcosine, 0.02% SDS, 1% blocking reagent (Roche Diagnostics Corp., Indianapolis, IN, USA)) for 30 min, then exposed to hybridization buffer containing organelle-specific rRNA gene probes created using the PCR digoxigenin (DIG) Synthesis Kit (Roche Diagnostics Corp., Indianapolis, IN, USA). Overnight hybridizations were done in a Model 5420 incubator (VWR Scientific, Batavia, IL, USA).

After 12-16 h of hybridization, membranes were washed 2x with low stringency buffer (2X SSC, 0.1% SDS) for 5 min at room temperature, then washed 2x in high stringency buffer (0.5X SSC, 0.1% SDS) for 15 min at hybridization temperature. Following the stringency washes, membranes were rinsed in Washing Buffer for 2 min (0.1 M maleic acid, 0.1 M NaCl, 0.3% Tween 20 (pH 7.5)) then 30 min in Blocking Buffer (0.1 M maleic acid, 0.15 M NaCl, 1X blocking reagent (pH 7.5)). A 30 min rinse in DIG kit Antibody-containing Blocking Reagent (0.1 M maleic acid, 0.15 M NaCl, 1X blocking reagent (pH 7.5) with 3 uL of antibody) was followed by 2x 15 min rinses of Washing Buffer. A 3 min incubation with Detection Buffer (0.1 M Tris-HCl, 0.1 M NaCl (pH 9.5)) allowed visualization of the DIG-labelled probe using 1-2 mL of CDP-Star substrate (Roche Diagnostics Corp., Indianapolis, IN, USA) sealed with the membrane. Hybridization bands were visualized by exposing the membrane to X-ray film for 2-20 min. Membranes were reused by stripping the existing probe off by incubating with Stripping Buffer (0.2 M NaOH, 0.1% SDS) for 2x 15 min at 37 °C.
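The wash and hybridization buffers above are all stated as multiples of the 20X SSC stock (3 M NaCl, 0.3 M sodium citrate). As a minimal sketch, the salt concentration at each strength and the stock volume needed for a given batch can be worked out as follows; the function names and the 500 mL batch size in the usage note are illustrative assumptions, not part of the protocol.

```python
# Composition of the 20X SSC stock as given in the text (mol/L).
STOCK_20X = {"NaCl": 3.0, "sodium citrate": 0.3}

def ssc_molarity(fold):
    """Component molarities of fold-X SSC (e.g. fold=2 for 2X SSC)."""
    return {name: conc * fold / 20 for name, conc in STOCK_20X.items()}

def stock_volume_ml(fold, final_volume_ml):
    """Volume of 20X stock needed to prepare final_volume_ml of fold-X SSC."""
    return final_volume_ml * fold / 20

# Strengths used above: hybridization buffer (5X), low stringency (2X),
# high stringency (0.5X).
for fold in (5, 2, 0.5):
    print(f"{fold}X SSC:", ssc_molarity(fold))
```

For example, `stock_volume_ml(2, 500)` gives the volume of 20X stock to dilute for a hypothetical 500 mL batch of low stringency buffer.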

Pyrosequencing and assembly

A total of 4.7 ug of enriched organellar DNA (representing ~75 L of culture) was sent to Patrick Chain and Stefanie Malfatti at the Lawrence Livermore National Laboratory for pyrosequencing at the DOE Joint Genome Institute in Walnut Creek, CA, USA. The pyrosequencing was done on a Roche 454 GS-FLX standard system (released in 2007, 454 Life Sciences, Branford, CT, USA), with a total of 1.5 "runs" on the machine. At the Joint Genome Institute (JGI), the genomic DNA was fragmented and blunt end "polished" in preparation for ligation to an adaptor sequence and streptavidin beads. A titration step was done to determine the optimal concentration of C. paramecium DNA per bead prior to emulsification with oil. Once the emulsification concentrations were determined, the single stranded, streptavidin coated beads were added to a DNA polymerase + sulfurylase + luciferase enzyme mixture and deposited on a PicoTiterPlate. Reagents (dNTPs and buffer) were administered to the plate sequentially, and the light signal generated during the incorporation of nucleotides was recorded by a CCD camera. The resulting sequence data were assembled with the 454 Life Sciences Newbler Assembly Software v1.1.03.24.

Table 1: Summary of the probes used for Southern hybridization. Included are the primer pairs required to create the 3 organelle-specific Southern probes, with amplicon length.

Probe        Forward                      Reverse                       Amplicon
Plastid 16S  5'-GGCTCAGGATGAACGCTGGC-3'   5'-CCTCACGCGGTATTGCTCCG-3'    350 bp
Mito. 16S    5'-TTYGTGCCAGCAGCYGCGG-3'    5'-CGARCTGACGACARCCATGC-3'    550 bp
Nm. 18S      5'-TTACCAGGTCCGGACATAGG-3'   5'-TGACTCACGCTTACAAGGC-3'     400 bp
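The mitochondrial 16S primers in Table 1 contain degenerate positions (Y and R), so each is really a small pool of oligonucleotides. The sketch below, written only as an illustration, expands a degenerate primer into its concrete sequences using the standard IUPAC nucleotide codes; the function name `expand` is an assumption of this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

# Mitochondrial 16S primers from Table 1.
mito_fwd = "TTYGTGCCAGCAGCYGCGG"
mito_rev = "CGARCTGACGACARCCATGC"

for name, primer in (("forward", mito_fwd), ("reverse", mito_rev)):
    variants = expand(primer)
    print(name, len(variants), variants)
```

Each of the two primers carries two twofold-degenerate positions, so each expands to four sequences; the plastid and nucleomorph primers contain no ambiguity codes and expand to themselves.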

PCR, cloning, sequencing and genome annotation

The 454 sequence data were sifted using the Artemis 10.0 computer program (Rutherford et al., 2000), which marked open reading frames encoding predicted proteins of 40+ amino acids for manual comparison to the NCBI database (Bethesda, MD, USA) to determine the organellar origin of the 454-generated contiguous fragments ("contigs"). In this way, four contigs were identified as plastid in origin.
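To illustrate the kind of screen described above, the following sketch scans all six reading frames of a contig for stop-free stretches of at least 40 codons. It is not the Artemis implementation: the stop-to-stop definition of an open frame, the use of the standard stop codons (TAA, TAG, TGA), and the function names are assumptions made for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}            # assumed: standard stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def open_frames(seq, min_codons=40):
    """Yield (strand, frame, start, end) for stop-free stretches.

    Coordinates are 0-based, end-exclusive, and refer to the strand being
    read (the reverse complement for strand '-').
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            last = frame + (len(s) - frame) // 3 * 3   # end of last full codon
            run_start = frame
            for i in range(frame, last, 3):
                if s[i:i + 3] in STOPS:
                    if (i - run_start) // 3 >= min_codons:
                        yield strand, frame, run_start, i
                    run_start = i + 3
            if (last - run_start) // 3 >= min_codons:
                yield strand, frame, run_start, last
```

The predicted translations of stretches found this way would then be compared against a protein database, as was done manually here.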

In order to close the gaps between the four 454 pyrosequencing contigs, region-specific primers were designed to the end reads of the contigs. Additionally, verification of ambiguous internal sequences (e.g., stop codons within open reading frames) and of the 454 assembly was done using primer pairs designed to amplify 2-4 kb fragments across the entire genome (over 50 primer pairs in total). The concentration of template DNA was determined using a spectrophotometer. Polymerase chain reactions (PCR) were completed in a thermocycler (PTC-150 MiniCycler, MJ Research) with a 50 uL reaction volume (34.3 uL water, 5 uL Taq buffer, 4 uL dNTPs, 4 uL MgCl2, 0.7 uL forward primer [50 mM], 0.7 uL reverse primer [50 mM], 0.3 uL Taq polymerase (Takara ExTaq, Takara Bio, Madison, WI, USA) and 1 uL template DNA). An initial 94 °C denaturation step (5 min) was followed by 38-40 cycles of denaturation at 94 °C for 30 sec, annealing at 50 °C for 30 sec, and extension at 72 °C for 1-4 min (varied with size of amplicon). A final extension of 5 min at 72 °C was completed before storage at 4 °C.
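As a quick arithmetic check on the reaction recipe above, the component volumes can be summed to confirm the 50 uL total and scaled into a master mix for several reactions. This is only a sketch: the 10% pipetting overage and the `master_mix` helper are assumptions of the example, not part of the protocol.

```python
# Per-reaction volumes (uL) as listed in the text.
reaction_ul = {
    "water": 34.3,
    "Taq buffer": 5.0,
    "dNTPs": 4.0,
    "MgCl2": 4.0,
    "forward primer": 0.7,
    "reverse primer": 0.7,
    "Taq polymerase": 0.3,
    "template DNA": 1.0,
}

total_ul = round(sum(reaction_ul.values()), 1)

def master_mix(n_reactions, overage=0.10):
    """Scale every component except template for n reactions.

    overage is a hypothetical allowance for pipetting loss.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in reaction_ul.items()
            if name != "template DNA"}

print("total per reaction (uL):", total_ul)
print("master mix for 10 reactions:", master_mix(10))
```

Template is left out of the master mix because a different primer pair or template dilution would normally be added per tube.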

PCR products were electrophoresed on 0.8% agarose TAE gels, and any strong bands of the correct size were excised for purification with the Qiagen MinElute Gel Extraction Kit (Qiagen Inc., Mississauga, Ont.). The cleaned products were either direct sequenced using the amplification primers or cloned using the TOPO-TA PCR IV vector, the pGEM Easy vector or the TOPO-XL vector (Invitrogen, Burlington, Ont.; Promega Corp., Madison, WI, USA), depending on size. The products were ligated into the vector as per manufacturer's instructions (ligation times were usually extended to overnight at 4 °C). The vectors were introduced to chemically competent OneShot Top10 E. coli cells (Invitrogen, Burlington, Ont.) and grown overnight at 37 °C on LB (Luria broth) containing X-Gal (galactosidase substrate for blue-white screening) and ampicillin (for antibiotic screening). White colonies were selected and grown O/N in aerated 4 mL liquid LB media containing [100 ug/mL] ampicillin at 37 °C. The plasmids were isolated using the QiaQuick Spin Kit (Qiagen, Mississauga, Ont.).

The presence of inserts was verified using O/N EcoRI restriction digestion, and insert-containing clones were sequenced using T7F and M13F primers. Sequencing reactions were completed on a Beckman-Coulter CEQ 8000 capillary DNA sequencer.

In order to combine 454 data with in-house capillary sequences, the computer program Sequencher 4.5 (GeneCodes Inc., USA) was used. Once a single plastid genome contig was obtained from the four original contigs, genes were identified using the NCBI ORFinder and syntenic comparisons with plastid genomes of the closely related cryptomonads Rhodomonas salina CCMP1319 and Guillardia theta. Open reading frames were compared to the non-redundant database on the NCBI website using the BLASTX function (Altschul et al., 1990). tRNA sequences were identified using tRNAscan-SE (Lowe and Eddy, 1997, http://lowelab.ucsc.edu/tRNAscan-SE/ accessed March 2009) using the option "search for organellar tRNAs (-O)". Small and large ribosomal subunit RNA genes were identified by BLASTN. The program Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.submit.options.html, accessed March 2009) was used to identify any potential repeat regions. The circular genome map was constructed using CIRDNA (http://emboss.imb.nrc.ca, accessed March 2009). GC content was determined using the program Artemis 10.0 (Rutherford et al., 2000, http://mac.softpedia.com/get/Math-Scientific/Artemis.shtml, accessed Feb. 2008).

CHAPTER 3: RESULTS AND DISCUSSION

SEM and cell morphology

The results of the cell fixation procedure for scanning electron microscopy (SEM) were encouraging, as many intact cells were visualized, often with both flagella present. The furrow-gullet (fig. 3A), although visible, is less pronounced than in other cryptomonads. In C. paramecium, the periplast is not as distinctively polygonal as in some other cryptomonads. This feature was not an artefact of fixation, as the Rhodomonas cells undergoing the same procedure show much more pronounced polygonal plates (data not shown).

The size dimorphism reported in some cultures of C. paramecium viewed under the light microscope was not observed within this limited sample of SEM-fixed cells (fig. 3C). The evidence for dimorphic life stages in Cryptomonas includes the presence of two "size morphotypes" in a single clonal culture (Hoef-Emden and Melkonian, 2003). Cell division was very rarely observed under the light microscope for these cultures. It has been previously reported that division occurs predominantly at night, and very quickly (mitotic division occurs in ten minutes). The SEM samples of cells from my culture were all roughly the same size (8-10 µm).

Unlike in the other cryptomonad sample prepared at the same time, C. paramecium cells in my culture were often covered in a fibrous coating, visible in multiple images taken from that sample (fig. 3D). The material, able to withstand the harsh fixation process, may be part of the mucilage (or palmella) created by C. paramecium to protect itself from predation while growing in colonies.

Figure 3: Scanning electron microscopy images of Cryptomonas paramecium (strain CCAP 977/2A). (A) Ventral view; (B) dorsal view; (C) two cells, with periplast plates visible; (D) fibrous net surrounding cells after the fixation procedure. The scale bar is shared by all four photos.

Cesium chloride density gradient and Southern hybridizations

From an initial cesium chloride gradient generated from the C. paramecium total DNA extract, four bands were identified and were putatively assigned organellar content based on previous work done on other cryptomonads (the first band as putative nucleomorph DNA, the second band as plastid DNA, the third, fuzzy band as mitochondrial DNA and the fourth band as nuclear DNA). Upon completing the first set of Southern hybridizations with organellar gene-specific probes, it became apparent that the third band did not contain organellar DNA, as no hybridization was detected with any of the probes. This led me to conclude that the (undefined) third band found between the well-defined second band and the intense fourth band was a "satellite" band consisting of sheared pieces of nuclear DNA whose overall AT content was slightly higher than the remainder of the nuclear genome found in band 4. In further gradient centrifugations, only three bands were isolated and tested: band 1 (very faint), band 2 (higher intensity than band 1), and band 3 (the thickest band, nearest to the RNA pellet at the bottom of the tube) (fig. 4A).

Preliminary spectrophotometer readings of the cleaned and purified cesium chloride fractions were used to determine gel-loading quantities. Unfortunately, Hoechst dye absorbs at approximately the same wavelength as DNA (260 nm), which may artificially increase the apparent DNA concentration.

Figure 4: Cesium chloride gradient centrifugation and Southern hybridizations. Sample preparation prior to the pyrosequencing involved (A) a cesium chloride gradient centrifugation, which separates total cellular DNA based on AT richness (seen as bands 1, 2, and 3 along the left). To verify fraction purity and quantity, an agarose gel was stained with ethidium bromide (B), and the DNA was transferred to a membrane for Southern hybridization with three organelle-specific gene probes (Nm = nucleomorph; Mito = mitochondrial; Plastid).

[Figure 4 image labels: Hoechst dye gradient of C. paramecium DNA; gel ladder (10,000 bp, 8,000 bp); Southern panels Nm 18S, Mito 16S, Plastid 16S]

As can be seen in the ethidium bromide stained agarose gel (fig. 4B), fraction 3 contains approximately twice as much DNA as the other fractions (as determined by the mass ladder also visible in the image). The organellar probes used in this study correspond to the SSU RNA genes specific to each organelle, and were verified after synthesis to contain the correct target sequence by cloning the fragments and sequencing the inserts. All the probe sequences were verified by BLASTX searches, and received highest scores to either the previously sequenced C. paramecium genes (in the case of the nucleomorph and plastid) or the Rhodomonas salina sequence (for the mitochondrial 16S rRNA gene).

The results of the hybridizations are shown in fig. 4B. The first probe examined (the nucleomorph-specific probe) shows faint signal in all three fractions. The similarity between nucleomorph and nuclear 18S rRNA sequence suggests that we should expect some amount of cross-hybridization with fraction three (therefore signal in lane three does not necessarily mean nucleomorph DNA is present in this fraction). The mitochondrial and plastid probes have the strongest signal against fraction two, and do not cross-hybridize with other fractions. The fact that the organelle DNAs co-migrate in the gradient to fraction 2 is consistent with the relative strength of fluorescence in band two versus band one, and suggests that the GC contents of the mitochondrial and plastid genomes in C. paramecium are similar. The total lack of plastid or mitochondrial signal in fraction 1 denotes a relatively clean (albeit small) amount of nucleomorph DNA. The sample sent to the Lawrence Livermore Laboratory consisted of all of the DNA in fraction one and half of the DNA in fraction two. We anticipated, based on the estimated size of the genomes, that a relatively enriched sample of nucleomorph DNA would ensure that the larger genome received adequate sequencing coverage without sacrificing coverage of the smaller plastid and mitochondrial genomes.

Pyrosequencing data

The CsCl fractions of C. paramecium DNA were sent for 454 sequencing. The sequencing and assembly were performed by the Lawrence Livermore National Laboratory. One and a half plates were run, which yielded 158,500 reads with an average length of 206 bp. Although Roche, parent company to 454 Life Sciences, claims in online literature that the reads can number 400,000 per run, the number generated by experimental data in previous studies was closer to 200,000 (Huse et al., 2007). At 200,000 per run, we would expect around 300,000 reads for our run and a half, but only received 158,500 after quality filtering. This discrepancy may be caused by the large number of homopolymers in the AT-rich fraction of C. paramecium DNA sent to the Lawrence Livermore Laboratory. These homopolymers are known to cause errors in sequential pyrosequencing, and the affected reads would be flagged by the quality-verification program and removed from the assembly (Huse et al., 2007). Four contigs with lengths of 3,534 bp, 10,357 bp, 12,594 bp and 50,557 bp were identified as corresponding to the plastid genome, and totaled ~77 kb.
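The yield figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (1.5 runs, roughly 200,000 reads per run from Huse et al. (2007), 158,500 reads of mean length 206 bp, and the four plastid contig lengths); the variable names are illustrative.

```python
runs = 1.5
reads_per_run_empirical = 200_000      # figure cited from Huse et al. (2007)
reads_obtained = 158_500
mean_read_length_bp = 206

expected_reads = runs * reads_per_run_empirical
fraction_of_expected = reads_obtained / expected_reads
total_bases = reads_obtained * mean_read_length_bp

plastid_contigs_bp = [3_534, 10_357, 12_594, 50_557]
plastid_total_bp = sum(plastid_contigs_bp)

print(f"expected reads:       {expected_reads:,.0f}")
print(f"fraction obtained:    {fraction_of_expected:.1%}")
print(f"total sequence:       {total_bases / 1e6:.1f} Mb")
print(f"plastid contig total: {plastid_total_bp:,} bp")
```

On these numbers the run returned a little over half of the reads expected from the empirical rate, and the four plastid contigs sum to 77,042 bp before gap closure.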

Based on average GC content, the reads were separated into a "high GC" category with GC > 45%, and a "low GC" category with GC < 45%. Of the C. paramecium contigs, 25% fell into the "high GC" category (which suggests nuclear genome contamination) and 75% were in the "low GC" category. This latter category included the contigs making up the 77 kb plastid genome, the 38 kb mitochondrial genome, and the incomplete nucleomorph chromosomes.
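A minimal sketch of this kind of GC binning is given below. It is not the procedure used by the sequencing centre; the function names are assumptions, and because the text does not say how a sequence at exactly 45% GC was treated, the sketch arbitrarily places it in the "low GC" bin.

```python
def gc_fraction(seq):
    """GC fraction of a sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0

def gc_category(seq, threshold=0.45):
    """Assign 'high GC' above the threshold, otherwise 'low GC'.

    Assumption: a sequence at exactly the threshold is called 'low GC'.
    """
    return "high GC" if gc_fraction(seq) > threshold else "low GC"

def summarize(seqs, threshold=0.45):
    """Fraction of sequences falling in each category."""
    counts = {"high GC": 0, "low GC": 0}
    for s in seqs:
        counts[gc_category(s, threshold)] += 1
    n = len(seqs) or 1
    return {k: v / n for k, v in counts.items()}
```

Applied to a list of contig sequences, `summarize` returns the proportions in each bin, comparable to the 25% and 75% split reported above.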

There are numerous examples of de novo sequencing of entire genomes using newer pyrosequencing technologies, although they tend to be prokaryotes rather than eukaryotes (Swaminathan et al., 2007). Huse et al. (2007) tested the de novo sequencing accuracy of the GS20 pyrosequencing machine and found 99.5% accuracy. They noted that a small percentage of low quality reads accounted for a majority of the errors. Regardless, Goldberg et al. (2006) still suggest that most de novo sequencing projects should rely on Sanger sequencing for a scaffold and use 454 for supplementary sequencing of non-clonable as well as "hard stop" regions. Here I have done the opposite: a combination of 454-generated contigs linked by smaller sections sequenced from end-specific primers, as well as internal verification of ambiguities. The gene order was verified in a similar way, with gene-specific primers used to make amplicons of known size, and the PCR results visualized on an agarose gel.

The inter-contig regions of the four 454 contigs (ranging from a 0 bp gap to a 130 bp gap) were closed with an average of 7x coverage by amplifying and then sequencing multiple independent clones. Using the overlap of Sanger and 454 contigs in these regions, I found ~1 base mismatch per ~1250 bp of sequence. These mismatches predominantly occurred in homopolymeric regions, usually after lengthy stretches of A or T (i.e., longer than six in a row). I targeted intra-contig regions for further amplification based on a preliminary genome annotation, including 9 open reading frames (orfs) which contained a single or double nucleotide indel (resulting in a frame shift). These orfs were considered putative pseudogenes, but in all cases but one these frame shifts were resolved with additional sequencing. Additionally, two genes were found to contain an indel of 3 base pairs (resulting in no frame shift), while two more contained in-frame stop codons. These ambiguities were resolved by designing primers near to the region in question, and amplifying 2-4 kb sections for "end-read" Sanger sequencing.
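Since the mismatches clustered after long A or T stretches, one way to pick regions for Sanger verification is to list every homopolymer run longer than six bases. The following is a sketch of that idea rather than the procedure actually followed; the function name is an assumption. It also converts the observed mismatch rate (about 1 per 1250 bp) into a percentage.

```python
import re

def long_at_runs(seq, min_len=7):
    """Return (start, base, length) for each run of A or T of at least min_len.

    min_len=7 corresponds to 'longer than six in a row'. Start is 0-based.
    """
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.group()[0], len(m.group()))
            for m in pattern.finditer(seq.upper())]

# Mismatch rate between Sanger and 454 sequence reported in the text.
mismatch_rate = 1 / 1250
agreement_percent = 100 * (1 - mismatch_rate)

print(f"mismatch rate: {mismatch_rate:.2%}; agreement: {agreement_percent:.2f}%")
```

A rate of 1 in 1250 corresponds to 0.08% mismatches, or 99.92% agreement between the two sequencing methods in the overlapping regions.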

Plastid genome structure

The plastid genome of the non-photosynthetic cryptomonad Cryptomonas paramecium was determined to be 77,690 bp. This is just over half the size of the plastid genome belonging to the closely related Rhodomonas salina, a photosynthetic member of the cryptomonad clade, which is 135,854 bp in size (Khan et al., 2007b). The genome of the Guillardia theta plastid is 121,524 bp (Douglas and Penny, 1999). While red algal-derived plastids tend to have more genes than their green algal counterparts (McFadden, 2001), the smallest plastid genomes known are from the red algal-derived apicoplasts of apicomplexans.

The plastid genome presented here is not reduced to such an extent, suggesting that the loss of photosynthesis in C. paramecium has occurred more recently than in apicomplexans. On the other hand, the C. paramecium plastid genome is larger than those of the green algal parasite Helicosporidium (37.4 kb) and Prototheca wickerhamii (54.1 kb), but comparable to those of the parasitic plant Epifagus virginiana (70.0 kb) and the free-living Astasia longa (73.3 kb). While the C. paramecium plastid genome shows a ~1.7-fold reduction when compared to the plastid genomes of its closest photosynthetic relatives (whose average is 128 kb), it is one of the least-reduced non-photosynthetic plastid genomes described thus far. As a comparison, the parasitic Helicosporidium genome has undergone a 4-fold reduction compared to its closest photosynthetic relative Chlorella vulgaris, Prototheca a 3-fold reduction (as compared to the same green alga), and E. virginiana and A. longa both around a 2-fold reduction compared to their closest photosynthetic relatives. The plastid genome of a close photosynthetic relative of the apicomplexans has yet to be sequenced (although such an organism has recently been found; Moore et al., 2008), so it is difficult to determine whether it would fit the trend of the oldest apparent loss of photosynthesis resulting in the smallest plastid genome size. The 78 kb genome size determined here is consistent with previous work characterizing the size and content of the C. paramecium plastid genome: Southern hybridizations completed on pulsed-field electrophoresis gels show cross-hybridization to a ~70 kb band when an rbcL probe is used (G. Tanifuji, Doctoral Thesis, 2006).
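The size comparison with the photosynthetic cryptomonads can be reproduced from the genome sizes quoted in the text. The following is a minimal sketch of that arithmetic; the dictionary and function names are hypothetical and chosen only for the example.

```python
# Plastid genome sizes in bp, as quoted in the text.
SIZES_BP = {
    "Cryptomonas paramecium": 77_690,
    "Rhodomonas salina": 135_854,
    "Guillardia theta": 121_524,
}

def fold_reduction(reduced_bp, reference_bps):
    """Ratio of the mean reference genome size to the reduced genome size."""
    mean_reference = sum(reference_bps) / len(reference_bps)
    return mean_reference / reduced_bp

photosynthetic = [SIZES_BP["Rhodomonas salina"], SIZES_BP["Guillardia theta"]]
mean_photosynthetic = sum(photosynthetic) / len(photosynthetic)
fold = fold_reduction(SIZES_BP["Cryptomonas paramecium"], photosynthetic)

print(f"mean photosynthetic plastid genome: {mean_photosynthetic:.0f} bp")
print(f"fold reduction in C. paramecium: {fold:.2f}")
```

The same function could be applied to the other lineages mentioned above, provided the size of each photosynthetic reference genome is supplied; those reference sizes are not given in this section and are therefore left out of the sketch.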

Figure 5: A circular mapping diagram of the complete plastid genome. The 78 kb genome of the non-photosynthetic alga Cryptomonas paramecium contains a single rRNA operon, 83 protein coding genes, and 29 tRNAs. Genes are shaded by functional category: transcription/replication, biosynthesis, photosynthesis, orfs/ycfs, translation, and misc.

The GC content of the C. paramecium plastid genome is 38%, whereas the other cryptomonads R. salina and G. theta have values of 34% and 33% respectively. The unexpectedly high GC content of the C. paramecium plastid genome (reduced genomes are normally AT-rich) may be due to the higher proportional amount of degraded coding region currently annotated as intergenic region. GC richness generally differs between the protein coding regions and the rRNA operon, with the latter being higher in GC content in C. paramecium (49% GC versus 38% GC). A GC skew analysis (data not shown) shows a marked change in the direction of skew directly after the single rRNA operon (just prior to the coding sequence for chlI). Previous work has suggested that such changes in GC skew mark potential origins of replication (de Koning and Keeling, 2006).
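For readers unfamiliar with the two measures used here, GC content is the fraction of G and C bases in a sequence, and GC skew is (G - C) / (G + C) computed in windows along it. The sketch below is a generic illustration under simple assumptions (non-wrapping windows on a linear string, hypothetical function names and window sizes); it is not the analysis pipeline used for the data described above.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def gc_skew(seq, window, step):
    """(G - C) / (G + C) for successive windows along seq.

    Windows do not wrap around the end of the sequence, so a circular
    genome would need to be extended by one window length beforehand.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window without G or C has no defined skew; report 0.0.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews

print(gc_content("ATGC"))
print(gc_skew("GGGGCCCC", window=4, step=4))
```

A sign change between neighbouring windows, as in the toy sequence here, is the kind of signal interpreted as a change in the direction of skew.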

A measure of compaction in a genome is the ratio of coding versus non-coding sequence. In the C. paramecium plastid genome, 87% of the sequence is coding. The plastid genome of the photosynthetic cryptomonad G. theta is 78% coding sequence (table 2), while the reduced genomes of non-photosynthetic plastids range from ~95% coding in the apicomplexans and Helicosporidium to only 58% in the angiosperm Epifagus, with the parasitic organisms having less non-coding DNA than their free-living counterparts. The mean intergenic distance in the C. paramecium genome, when determined by giving overlapping genes a value of zero, is 85 nucleotides. This is more similar to the parasitic plants (mean intergenic space of 135 nucleotides) than to the parasitic algae and apicomplexans (ranging from 24 to 36 nucleotides) (de Koning and Keeling, 2006).
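The convention of giving overlapping genes an intergenic distance of zero can be made concrete with a short sketch. The coordinates below are invented toy values, and the function names are hypothetical; the code assumes 1-based inclusive coordinates on a circular genome, with no gene nested inside another or spanning the origin.

```python
def intergenic_distances(genes, genome_length):
    """Gaps between consecutive genes on a circular genome.

    genes: (start, end) pairs, 1-based and inclusive, with start <= end.
    Overlapping neighbours are given a distance of zero.
    """
    ordered = sorted(genes)
    gaps = []
    for i, (_, end) in enumerate(ordered):
        if i + 1 < len(ordered):
            next_start = ordered[i + 1][0]
        else:
            # Wrap around the circle to the first gene.
            next_start = ordered[0][0] + genome_length
        gaps.append(max(next_start - end - 1, 0))
    return gaps

def mean_intergenic_distance(genes, genome_length):
    """Mean of the gaps, counting overlaps as zero."""
    gaps = intergenic_distances(genes, genome_length)
    return sum(gaps) / len(gaps)

# Toy annotation: the second and third genes overlap by 11 bp.
toy_genes = [(1, 100), (151, 300), (290, 400)]
print(intergenic_distances(toy_genes, genome_length=500))
print(mean_intergenic_distance(toy_genes, genome_length=500))
```

Treating overlaps as zero rather than as negative distances keeps a few overlapping gene pairs from pulling the mean below the typical spacing.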

Compared to its closest cryptomonad relatives, C. paramecium has reduced its plastid tRNA gene set. The G. theta genome has 30 tRNAs, R. salina has 31, but C. paramecium has only 29. This number is still larger than the minimal set of tRNAs found in the parasitic alga Helicosporidium, with redundant isotypes for the glycine, serine, arginine, leucine and methionine amino acids (the set of three methionine tRNAs encoded in the plastid of C. paramecium includes one for the "initiator" codon and one for the "internal" codon). Just as in the other cryptomonads and in Helicosporidium, the presence of at least a minimal set of tRNAs would seem to preclude the need for the C. paramecium plastid to import tRNAs from outside the organelle.

Table 2: Summary of cryptomonad plastid genomes sequenced to date. The two photosynthetic cryptomonads Guillardia theta and Rhodomonas salina are included with the non-photosynthetic Cryptomonas paramecium. The size, GC percentage, number of protein-coding genes (CDS), tRNAs, presence of inverted repeats (IR) and overall coding capacity are summarized.

Species                  Size (nt)   GC%   #CDS   tRNAs   IR?   coding %
Guillardia theta         121,524     32%   147    30      yes   78%
Rhodomonas salina        135,854     34%   146    31      yes   72%
Cryptomonas paramecium   77,690      38%   83     29      no    87%

One of the most striking features of the C. paramecium plastid genome is the lack of an inverted repeat, which is present in the other known cryptomonad plastid sequences. Inverted repeats (IRs) containing a 16S-trnI-trnA-23S-5S operon (plus a variable set of other genes in some cases) are likely an ancestral state for the plastid genome. The loss of the rRNA repeat structure was not expected in C. paramecium, and to make sure this characteristic was not due to an assembly error, in-house PCR amplicons were generated to verify the region around the single operon. The results obtained from the PCR-generated products were identical to those observed in the initial genome assembly. Unlike the other cryptomonad plastid genomes (which contain inverted repeats with rRNA operons), C. paramecium has no repeat region and a single rRNA operon.

Despite its widespread distribution, the second rRNA operon has on occasion been shown to be absent or rearranged in some exceptional cases (e.g., charophytes, ulvophytes and trebouxiophytes). Within the non-photosynthetic Helicosporidium, not only has the IR been lost but the operon has been split up as well (de Koning and Keeling, 2006). The authors propose that the disruption of the operon (the 16S gene is almost diametrically opposite the 23S and 5S genes) in the Helicosporidium sp. plastid genome is related to the lack of an inverted repeat. Since the rRNA operon is present as a single copy in C. paramecium, and there is no longer an IR arrangement, it is potentially a case of an "operon in limbo" compared to its closest cryptomonad relatives. With Helicosporidium sp. as a test case, this may eventually lead to a disruption of the rRNA operon in the plastid genome of C. paramecium.

Gene order is generally well conserved in C. paramecium compared to G. theta and R. salina. Large tracts of genes are syntenic between all three genomes, including the highly conserved, co-expressed ribosomal proteins L2, S19, S3, L16, L14, L5, S8, L36, S11. The atp genes also occur in syntenic order when compared to G. theta and R. salina. The large difference in gene complement is nearly entirely made up of photosynthetic genes (fig. 6, table 2). The rearrangements that have occurred appear to be contained within the region of the rRNA operon, a remnant of the genome shuffling required to lose the inverted repeats in the first place. Smaller rearrangements have also occurred within the larger framework, as evidenced by the loss of synteny in this region compared to the other cryptomonads. Nonetheless, 78% of the genome is syntenic compared to G. theta and R. salina, accounting for 61 kb of the full 78 kb genome.

Figure 6: Schematic representation of the loss of photosynthetic genes. A syntenic region of the plastid genome shared between the photosynthetic cryptomonad Guillardia theta (genes on the left) and the non-photosynthetic cryptomonad Cryptomonas paramecium (on the right). Genes are coloured according to their functional category (misc., translation/transcription/replication, biosynthesis, photosynthesis, ycfs/orfs), and represent approximately 10 kb of the Cryptomonas paramecium genome.

Gene presence and absence analysis

Overall, there are 83 protein-coding genes present in the Guillardia theta and Rhodomonas salina plastid genomes that are missing in C. paramecium. There are three open reading frames in C. paramecium that share no similarity with anything in GenBank (orf555 and orf147) and two open reading frames that share sequence composition with other cryptomonad orfs (orf335 and orf91). There are 5 genes missing in both the C. paramecium and G. theta plastid genomes compared to R. salina (dnaX, orf142, orf146, RT, ycf26), along with three genes that are present only as pseudogenes in R. salina (chlB, chlN, chlL). There are no genes shared between the R. salina and C. paramecium genomes that are missing in the G. theta genome.
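Comparisons of this kind reduce to set operations on per-genome gene lists. The sketch below uses a small toy subset of genes named in the text and in Table 3, not the full gene complements; the variable names are hypothetical, and the gene sets are deliberately incomplete.

```python
# Toy subset of plastid gene content; NOT the complete gene lists.
plastid_genes = {
    "C. paramecium": {"acpA", "cfxQ", "dnaK"},
    "R. salina": {"acpA", "cfxQ", "dnaK", "rps6", "rpl32", "dnaX", "ycf26"},
    "G. theta": {"acpA", "cfxQ", "dnaK", "rps6", "rpl32"},
}

cp = plastid_genes["C. paramecium"]
rs = plastid_genes["R. salina"]
gt = plastid_genes["G. theta"]

# Present in both photosynthetic cryptomonads but missing in C. paramecium.
lost_in_cp = (rs & gt) - cp

# Present only in R. salina.
rs_only = rs - gt - cp

# Shared by R. salina and C. paramecium but missing in G. theta.
cp_rs_not_gt = (cp & rs) - gt

print(sorted(lost_in_cp))
print(sorted(rs_only))
print(sorted(cp_rs_not_gt))
```

With the full annotations loaded in place of the toy sets, the same three expressions would yield the categories of gene loss discussed in this section.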

Compared to the 26 rpl genes in G. theta, there are only 25 genes for the 50S ribosomal sub-unit proteins in the C. paramecium plastid genome. Of the 18 30S ribosomal sub-unit genes present in G. theta, 17 are present in C. paramecium. The rps6 gene, nestled just next to one of the rRNA operons in both G. theta and R. salina, is conspicuously missing in the C. paramecium plastid DNA, as is the gene for rpl32. Many ribosomal protein genes occur in operons as in other cryptomonads, the largest of which contains 17 consecutive genes (rpl 3, 4, 23, 2, 22, 16, 29, 14, 24, 5, and 6, and rps 19, 3, 17, 8, and 5), and is 7.8 kb long.

As was noted in the first cryptomonad plastid genome sequence (Douglas and Penny, 1999), transcription is performed by a multi-sub-unit RNA polymerase referred to as PEP (plastid-encoded polymerase). The suite of four genes of the eubacterial-like polymerase (rpoA, rpoB, rpoC1 and rpoC2) is present in the C. paramecium plastid genome. Although it is possible for non-photosynthetic plastids to rely entirely on nuclear-encoded polymerase proteins (as in Cuscuta species), this is not the case in C. paramecium, whose plastid still encodes polymerase enzymes. It is likely that additional nuclear-encoded polymerase enzymes are imported to supplement the transcription of the plastid genes (Douglas and Penny, 1999). There are two post-transcriptional regulatory proteins present in C. paramecium plastid DNA: ycf29 (a tctD regulator) and, interestingly, the cfxQ gene (a RuBisCo regulator found in the rbcLS operon in most non-green algae) (Simpson and Stern, 2002). Two missing regulators are ycf27, which codes for an OmpR osmotic regulator, and ycf30, which is a lysR regulator. While other post-transcriptional control genes, like rne (encoding an enzyme for post-transcriptional mRNA degradation), are missing, other "housekeeping" genes are maintained in the C. paramecium leucoplast (including infB, tsf and tufA).

Many of the cell division proteins found encoded in the G. theta and R. salina genomes are missing from the leucoplast, such as hlpA (a chromatin-associated architectural protein), dnaB (a DNA helicase), minD and minE (which prevent the creation of DNA-less "minicells" during division but have been transferred to the nucleus in other plastid-containing organisms) and ftsH (a metalloprotease that also acts as a protein chaperone) (Simpson and Stern, 2002). Other chaperone proteins like groEL (encoding a heat-shock protein) and dnaK (a polypeptide of the hsp70 family) are encoded in the plastid DNA to help with protein folding (Wang and Liu, 1991). While secG (a protein translocation gene) is absent, other components of the sec transport system are maintained (secA, secY). As well, the gene encoding the sec-independent transport protein tatC is maintained. The proteolytic degradation pathway gene clpC is also present. The leucoplast in C. paramecium has thus retained its ability to import necessary proteins from the cytoplasm (e.g., proteins linked to cell division) and can mediate their degradation.

Table 3: Gene presence or absence in eight plastid genomes. This subset of completely sequenced plastid genomes is gathered from green and red lineages (secondary and primary plastids). Abbreviations: C.p., Cryptomonas paramecium; R.s., Rhodomonas salina; G.t., Guillardia theta; P.f., Plasmodium falciparum; A.l., Astasia longa; H.s., Helicosporidium sp.; E.v., Epifagus virginiana; A.m., Aneura mirabilis. Symbols: +, present; -, absent; ψ, pseudogene.

Gene            C.p. R.s. G.t. P.f. A.l. H.s. E.v. A.m.
acpA            + + + -
atpA            + + + - - - ψ +
atpB            + + + - - - ψ +
atpD            + + + - - - - -
atpE            + + + - - - - +
atpF            ψ + + - - - - +
atpG            + + + -
atpH            + + + - - - - +
atpI            + + + - - - - +
ccsA (ycf5)     - + + - - - - ψ
ccs1 (ycf44)    - + + -
cemA (ycf10)    + + + - - - - +
cfxQ            + + + -
chlB            - ψ - +
chlI            + + + -
chlL            - ψ - +
chlN            - ψ - - - - - +
clpC            + + + -
cpeB            - + + -
dnaX            - + -
dnaB            - + + -
dnaK            + + + -
ftrB            - + + -
ftsH (ycf25)    - + + - - + - -
groEL           + + + - - - - -
hlip (ycf17)    - + + - - - - -
hlpA            - + + -
ilvB            + + + - - - - -
ilvH            + + + - - - - -
infB            + + + -
minD            - + + - - - - -
minE            - + + -
ORF142          - + -
ORF146          - + - - - - - -
ORF403          + + -
ORF75
ORF99
pbsA
petA
petB
petD
petF
petG
petL (ycf7)
petM (ycf31)
petN (ycf6)
psaA
psaB
psaC
psaD
psaE
psaF
psaI
psaJ
psaK
psaL
psaM
psbA
psbB
psbC
psbD
psbE
psbF
psbH
psbI
psbJ
psbK
psbL
psbN
psbT
psbV
psbW
psbX
psbY (ycf32)
psbZ (ycf9)
rbcL
rbcS
rbcR (ycf30)    + + - - - - - -
rne             - + + - - - - -
rpl1            + + +
rpl11           + + + - - - - -
rpl12           + + + - + + - -
rpl13           + + +
rpl14           + + + + + + ψ +
rpl16           + + + + + + + +
rpl18           + + + - - - - -
rpl19           + + +
rpl2            + + + + + + + +
rpl20           + + + - + + + +
rpl21           + + + - - - - +
rpl22           + + + - + - - +
rpl23           + + + + + - ψ +
rpl24           + + + - - - -
rpl27           + + +
rpl29           + + +
rpl3            + + +
rpl31           + + + - - - - -
rpl32           - + + - + + - +
rpl33           + + + - - - + +
rpl34           + + +
rpl35           + + +
rpl36           + + + + + + + +
rpl4            + + + + - - - -
rpl5            + + + - + + -
rpl6            + + + + - - - -
rpoA            + + + - - + ψ +
rpoB            + + + + + + - +
rpoC1           + + + + + + - +
rpoC2           + + + + + + - +
rps10           + + +
rps11           + + + + + + + +
rps12           + + + + + + + +
rps13           + + +
rps14           + + + - + + + +
rps16           + + + - - - - -
rps17           + + + + - - - -
rps18           + + + - - - + +
rps19           + + + + + + + +
rps2            + + + + + - + +
rps20           + + + - - - - -
rps3            + + + + + + + +
rps4            + + + + + + + +
rps5            + + + + - - - -
rps6            - + +
rps7            + + + - + + + +
rps8            + + + + + + + +
rps9            + + + - + - - -
RT              - + -
secA            + + + - - - - -
secG (ycf47)    - + + - - - - -
secY            + + +
sufB (ycf24)    + + +
sufC (ycf16)    + + +
tatC (ycf43)    + + + - - - - -
tsf             + + + - - - - -
tufA            + + + + + + - -
ycf12           - + + - - - - +
ycf19           + + +
ycf20           + - +
ycf26           - + - - - - - -
ycf27           - + + - - - - -
ycf29           + + + - - - - -
ycf3            - + + - - - - ψ
ycf33           - + +
ycf35           - + + - - - - -
ycf36           - + +
ycf37           - + +
ycf39           - + +
ycf4            - + + - - - - +
ycf46           - + +

Present in the C. paramecium plastid DNA is a nearly full complement of the ATP synthase sub-units normally found in photosynthetic cryptomonads. While these 6 genes are present in the heterotrophic green alga Prototheca wickerhamii (where they were found to be expressed), they are missing in the non-photosynthetic euglenoid Astasia longa, the parasitic alga Helicosporidium, the parasitic plant Epifagus virginiana, and the apicomplexan Plasmodium falciparum. The genes have varying degrees of similarity between the cryptomonads, with atpF (the only pseudogene in the C. paramecium genome) and atpG among those with the weakest degree of conservation, whereas atpB and atpH are among the strongest. Additionally, the overlap of sub-unit genes atpF and atpD, observed in both R. salina and G. theta, is not seen in C. paramecium.

The suf system is one of three systems involved in Fe-S cluster biosynthesis (the others being the NIF and ISC systems), and is conserved among eukaryotes, archaea, plants and parasites (Nachin et al., 2003). As was previously discovered in the photosynthetic cryptomonad G. theta, the suf genes are compartmentalized in cryptomonads, with sufB and sufC encoded in the plastid and sufD in the nucleomorph (Hjorth et al., 2005). SufC, found in the leucoplast genome of C. paramecium, encodes an ATP-binding cassette which interacts with the sufB and sufD gene products. The latter is presumably a nuclear- or nucleomorph-encoded sufD protein, as only sufB and sufC were found in the plastid genome in this study. The reduction of the plastid DNA in the colourless C. paramecium has not affected this novel genetic arrangement. The actual Fe-S cluster assembly is plastid-localized, and is likely a key metabolic function in the leucoplast (Borza et al., 2005).

There are two genes involved in chlorophyll biosynthesis that are maintained in the C. paramecium leucoplast genome, the first being a magnesium chelatase called chlI. Plastid-encoded chlI is closely related to a nuclear copy of chlD (via a gene duplication) and, along with chlH, is responsible for the addition of Mg2+ to protoporphyrin IX. ChlI is found encoded in all plant plastids and in all algal plastid genomes examined to date except for those of peridinin-containing dinoflagellates (Li et al., 2006). The second gene involved in chlorophyll biosynthesis is pbsA, a heme oxygenase that is a key enzyme required for the synthesis of photosynthetic antennae (Richard & Zabulon, 1997). In the non-photosynthetic C. paramecium, these genes may have some alternate chelating function or are potentially involved in plastid-to-nucleus signaling (as predicted in P. wickerhamii; Borza et al., 2005).

The gene cemA and its homologs cotA/ycf10 have been identified both in the chloroplast genomes of higher-order land plants and in cyanobacteria. Studies show this gene encodes a heme, b-type protein and is functional in CO2 transport in some organisms (Rochaix, 1997). It, along with the ilv duo of genes (ilvB and ilvH), is present in the C. paramecium plastid genome. The ilv genes work in conjunction to create an acetohydroxyacid synthase, which catalyzes the first step in the synthesis of the branched-chain amino acids valine, leucine and isoleucine (Singh et al., 1995).

While the R. salina plastid genome encodes remnants of three protochlorophyllide reductase genes (chlL, chlN, and chlB) that are responsible for the light-independent synthesis of chlorophyll (Fong and Archibald, 2008; Khan et al., 2007b), neither the photosynthetic G. theta nor the non-photosynthetic C. paramecium plastids encode them. Similarly, the lateral gene transfer-derived DNA polymerase found in R. salina (and other plastid genomes in the Rhodomonas clade) is present in neither C. paramecium nor G. theta (Khan et al., 2007b).

In the parasitic liverwort Aneura mirabilis, the gene controlling heme attachment to cytochrome c (ccsA) is a pseudogene. This gene is lacking entirely in C. paramecium but is found in the two photosynthetic cryptomonad plastid genomes. It may also be implicated in non-cyclic electron transport, and thus be unnecessary where photosynthesis is no longer occurring. A related gene, ccs1, whose product is thought to interact physically with the gene product of ccsA as an apo-protein chaperone in the synthesis of holocytochromes (Hamel et al., 2003), is also missing in C. paramecium but is present in the other cryptomonad genomes.

Not surprisingly, the largest amount of gene loss occurs in the category of photosynthesis. The β sub-unit of phycoerythrin (cpeB), part of the phycobiliprotein complex in cryptomonads, is missing in C. paramecium as well as in the genomes of the photosynthetic heterokonts and haptophytes (Khan et al., 2007b). The photosynthetic regulator and electron transfer gene ftrB is gone in the leucoplast, as is the hlip gene (also known as ycf17). As in A. longa and other dramatically reduced plastid genomes, a long rbcL (large sub-unit of RuBisCo) gene is present, presumably functioning in a non-photosynthetic capacity (Wickett et al., 2008), as well as the rbcS (small sub-unit of RuBisCo) gene.

A previous study claimed to have amplified a plastid-encoded psbA gene (Yoon et al., 2002) from C. paramecium, yet in the complete genome sequence presented in this study, no psbA gene was found. The lack of a psbA gene is consistent with unpublished results of G. Tanifuji (pers. comm.). I suggest that the psbA sequence currently in the NCBI database identified as C. paramecium plastid may in fact be contamination. The psb family of genes encodes subunits of photosystem II. In the photosynthetic cryptomonad plastid genomes, there are 18 psb genes. In the parasitic liverwort A. mirabilis, 5 of these genes are pseudogenes. In the substantially reduced plant E. virginiana, which is thought to have lost photosynthesis earlier than A. mirabilis, all PSII genes are gone except psbA and psbB (which are pseudogenes). It is hypothesized that the psb family of genes is the first to disappear from a non-photosynthetic plastid (an exception being the ndh genes, if they are present) (Wickett et al., 2008). Similarly, the A. longa plastid is completely devoid of psb genes, in contrast to its closest photosynthetic relative E. gracilis, which has 11 psb genes. In total, the loss of the 18 psb genes accounts for approximately 7.5 kb of missing plastid DNA in C. paramecium.

The psa family of genes encodes sub-units of the PSI photosynthetic complex. In R. salina and G. theta (both photosynthetic cryptomonads), there are 11 psa genes that account for approximately 6.5 kb of DNA sequence. The A. longa plastid has lost all 5 psa genes compared to E. gracilis. The parasitic plant E. virginiana has also lost all 5 genes, and while 4 intact psa genes remain in A. mirabilis, there are 2 pseudogenes. While the case of A. mirabilis makes the loss of the genes appear gradual, no such evidence of pseudogenization in cryptomonads has been found, which is consistent with the large evolutionary distance between the C. paramecium clade and the other cryptomonads (fig. 2).

Finally, the last gene family associated with photosynthesis that has nearly entirely disappeared from the C. paramecium leucoplast genome is the pet family. In photosynthetic organisms, the pet gene products form a complex required for oxygenic photosynthesis, in particular the noncyclic electron flow mediated by the b6f complex. Eight pet sub-unit genes are present in the G. theta and R. salina plastid genomes. In other non-photosynthetic organisms (like E. virginiana, A. longa and A. mirabilis), the pet genes are all either missing or pseudogenes. The one exception in C. paramecium to this otherwise massive gene loss is the petF gene. In the cyanobacterium Synechococcus, petF encodes a ferredoxin that shuttles electrons while cross-linked to the psaD and psaE gene products (neither of which is encoded in the C. paramecium plastid genome). The loss of nearly all the pet genes in the C. paramecium plastid genome accounts for approximately 1.5 kb of sequence.

Codon usage and tRNA complement

As in most other organellar genomes, the nucleotide composition of the C. paramecium plastid genome is biased towards AT nucleotides. The reasons for this compositional bias include the spontaneous mutational bias of G and C bases to A and T bases, which is likely exacerbated by missing DNA repair systems in organelles (Hoef-Emden et al., 2005). The bias is often noticeable when surveying the complement of tRNAs present in the genome.

Previous work by Morton et al. (2002) has identified two amino acid codon bias trends in various plastid genes. The first group of genes shows a "mutation pattern", which is indicated by a high frequency of A+T bases at all silent codon positions. For every amino acid, there exists a group of codons that are compositionally similar except for the third base (the silent codon position). In genes that show the mutation pattern, there is a preference for using the codons that end in A or T over those ending in G or C. This usage pattern is thought to correlate with weakly expressed genes (Hoef-Emden et al., 2005). In contrast, the second group of genes shows a different codon usage, counter to the first group of genes and, in fact, counter to the mutational drift occurring in the whole genome. The "adaptation pattern" in these (presumably highly expressed) genes points to a selective force opposite to the overall genome composition (Hoef-Emden et al., 2005).

Assessing the combination of all plastid coding regions in C. paramecium, the overall bias in the amino acids alanine (A), glutamic acid (E), lysine (K), leucine (L), proline (P), glutamine (Q), arginine (R), threonine (T) and valine (V) is towards the NNT and NNA codons over the NNC and NNG options, often by a margin of 2-to-1. Taken overall, this bias in coding sequence codons suggests that the "mutation pattern" is more important than the "adaptation pattern" in determining codon usage in most of the C. paramecium non-photosynthetic plastid genome. A previous study of C. paramecium (and other non-photosynthetic cryptomonads) assessing the photosynthetic gene rbcL showed accelerated evolutionary rates, which correlated with a change in codon usage from an "adaptive" pattern to a "mutational" pattern. As mentioned above, this is likely indicative of reduced expression of this protein in the non-photosynthetic plastids and of relaxed constraints on rbcL (Hoef-Emden et al., 2005).
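The preference for NNA and NNT codons can be quantified by counting codons and asking what fraction end in A or T. The following is a minimal sketch with hypothetical function names; it pools all codons in a coding sequence, whereas a fuller analysis would tally each amino acid's synonymous codons separately and exclude methionine, tryptophan and stop codons.

```python
from collections import Counter

def codon_counts(cds):
    """Count in-frame codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def third_position_at_fraction(cds):
    """Fraction of codons whose third base is A or T (NNA or NNT)."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    at_ending = sum(n for codon, n in counts.items() if codon[2] in "AT")
    return at_ending / total

# Three alanine codons: GCT and GCA end in A/T, GCC does not.
print(codon_counts("GCTGCAGCC"))
print(third_position_at_fraction("GCTGCAGCC"))
```

A value near two thirds across a gene corresponds to the roughly 2-to-1 margin described above, while values at or below one half would point away from the mutation pattern.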

Loss of photosynthesis and metabolic shift

In normal photosynthetic cells, the plastid is responsible for transcribing and translating genes required for photosynthesis. There are multiple hypotheses on why core genes have remained in the plastid. One is that the hydrophobic membrane proteins cannot be imported from the cytosol (Daley and Whelan, 2005), while another hypothesis links the redox state of the plastid so closely with the synthesis of the core proteins that the genes must be in the plastid genome (Allen, 2003; Race et al., 1999). A third hypothesis is that the genes are part of a tightly regulated complex of co-expressed genes, so transfer to the nucleus of just one of the genes would disrupt the tight relationship (Barbrook et al., 2006). Yet aside from the core set of photosynthetic genes (plus the machinery required to express them), the remainder of the genes have all been observed to have been transferred to the nucleus in some organisms, or have successfully been knocked out in the lab (Daley and Whelan, 2005). So why do plastid genomes remain in secondarily non-photosynthetic organisms?

A hypothesis, expanded in the paper by Barbrook et al. (2006), can be broken down into two parts. The first has to do with the potential metabolic role of one of the tRNAs found in all of the plastid genomes sequenced to date, trnE, while the other part involves the potential export of a specific tRNA for use in another organelle. The trnE gene has been found in every plastid genome sequenced to date, even those belonging to the most pared-down non-photosynthetic organisms. In plants and algae, heme and chlorophyll biosynthesis requires trnE (Glu) to create the components for the cytochromes and oxidative enzymes (Howe and Smith, 1991). Functional studies show that the cytosolic version of trnE cannot replace the plastid homolog. Barbrook et al. suggest that in many non-photosynthetic organisms, supplying trnE is the sole metabolic function keeping the plastid genome around. The exception occurs in apicomplexans, whose trnE is not essential for heme biosynthesis. Their plastid genomes have been maintained in order to supply a different tRNA, namely tRNA-fMet (Barbrook et al., 2006).

In apicomplexans, the tRNA-fMet that reads the methionine start codon is not encoded in the mitochondrion, but is instead shared between the plastid and mitochondrion. Thus the proposed reason apicomplexans have maintained a plastid genome, despite being non-photosynthetic, is so that it may supply the tRNA necessary to translate the oxidative proteins encoded in the mitochondrion. In certain stages of Plasmodium at least, the two organelles are always seen to be in physical contact (Hopkins et al., 1999; van Dooren et al., 2005), supporting the idea that the organelles share tRNAs. Both versions of this hypothesis can be summarized as stating that the plastid genomes of some non-photosynthetic organisms have been maintained in order to supply essential tRNAs.

Additionally, Barbrook et al. (2006) argue that many plastid genomes linger simply because the chance to transfer the necessary material to the nuclear genome has already passed. In the "limited transfer window" hypothesis, organisms with a single (but essential) plastid organelle per cell can no longer tolerate the lysis of the organelle and subsequent "escape" of genes into the nuclear genome once the endosymbiont has been stably incorporated into the cell. In organisms that have multiple plastids in each cell (such as tobacco, which can have hundreds of plastids), every organelle lysis presents an opportunity for successful gene integration into the host genome. Accordingly, rates of gene transfer to the nucleus in tobacco are higher than in those organisms with a single plastid that have been studied to date (like apicomplexans and the green alga Chlamydomonas reinhardtii), where lysis of the single plastid would presumably be lethal. The window of opportunity for these single-plastid organisms has long since passed: in the early stages of endosymbiosis, endosymbiont division would not have been so highly regulated and thus plastid copy number would have been higher. As it is now, one potential reason for the retention of plastid genomes in non-photosynthetic cells is simply that the genes no longer have the chance to escape the plastid for the greener (or redder?) pastures of the nucleus.

So what sort of metabolism is occurring in the plastids of secondarily non-photosynthetic organisms, and what would we expect for C. paramecium? Plastids in photosynthetic organisms carry out a variety of other metabolic processes such as nitrogen assimilation and amino acid, fatty acid and starch synthesis, but these processes can be overshadowed by the carbon-assembly factory known as photosynthesis. In non-photosynthetic organisms, these processes jump to the metabolic forefront (Neuhaus and Emes, 2000; Wolfe et al., 1992). Leucoplasts in apicomplexans in particular are thought to exist as fatty acid biosynthesis centres, and evidence for fatty acid synthesis in other secondary endosymbiotic organisms exists (Borza et al., 2005; Krause, 2008).

One of the first complete genes isolated from the Guillardia theta plastid genome was for an acyl carrier protein called acpA. It is a required cofactor in the synthesis and metabolism of fatty acids (Wang and Liu, 1991), and it is found in the plastid genome of C. paramecium. Further evidence for fatty acid synthesis occurring in the C. paramecium plastid is the discovery of a nuclear-encoded, plastid-targeted fabD protein. FabD encodes the malonyl-CoA:ACP transacylase, which catalyzes the transfer of a malonyl moiety in fatty acid synthesis (Khan et al., 2007a).

Plastid-targeted proteins were surveyed in the non-photosynthetic, predominantly free-living P. wickerhamii through an EST study, which identified enzymes responsible for the conversion of pyruvate to acetyl-CoA as a first step to further carbohydrate metabolism, as well as fatty acid synthesis (Borza et al., 2005). The same sorts of metabolism were identified in Plasmodium and Helicosporidium. The retrograde signaling pathway involving chlI likely occurs in P. wickerhamii and C. vulgaris, and possibly in C. paramecium, because the magnesium chelatase gene is maintained for nuclear cross-communication. Also shared among the distantly related Plasmodium, Helicosporidium, Prototheca and C. paramecium genomes is chaperone activity, with genes or ESTs identified from all four organisms. True metabolic function analysis must take into account not only the proteins encoded in the plastid genome of a non-photosynthetic organism, but also the nuclear-encoded, plastid-targeted proteins, as well as the expression pattern of each category of proteins. This sort of expression analysis has yet to be addressed in the plastid of C. paramecium.

The presence of shared genes and targeted proteins in lineages of primary and secondary, red and green algal plastids suggests a non-random selection of metabolic processes. Not only is the content somewhat conserved in terms of a few basic pathways, but de Koning and Keeling (2006) suggest that the actual genetic arrangement in non-photosynthetic plastids may also be converging on a shared set of traits. They refer to the common outcome of genome reduction in a plastid genome, with a shift in coding strand symmetry and tRNA complement in Helicosporidium (green, primary) and apicomplexans (red, secondary), as "organized reduction". Not only are these genomes smaller than those of their closest photosynthetic relatives, they are also more ordered (in terms of coding strand placement and gene order), suggesting a common evolutionary force affecting the two genomes. If this is true, C. paramecium would fall into the "intermediate" stage (along with E. virginiana and A. longa), having roughly the same structure as its photosynthetic counterparts, just more reduced. Even further along the continuum towards full functionality would be the non-photosynthetic angiosperms whose plastids are just in the process of losing their genes through large-scale deletions and pseudogenization (Wickett et al., 2008).
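One rough way to picture the difference between a genome that is merely reduced and one that is also organized is to count how often the coding strand changes along the gene order. The sketch below is only an illustration under stated assumptions: the two gene orders are hypothetical toy data, the function names are invented here, the gene order is treated as linear rather than circular, and this is not the measure used by de Koning and Keeling (2006).

```python
from typing import Sequence


def strand_bias(strands: Sequence[str]) -> float:
    """Fraction of genes on the majority strand (0.5 = balanced, 1.0 = one strand only)."""
    if not strands:
        raise ValueError("need at least one gene")
    plus = sum(1 for s in strands if s == "+")
    minus = len(strands) - plus
    return max(plus, minus) / len(strands)


def strand_switches(strands: Sequence[str]) -> int:
    """Number of adjacent gene pairs that lie on opposite strands (linear order)."""
    return sum(1 for a, b in zip(strands, strands[1:]) if a != b)


# Hypothetical gene orders, one strand label per gene (not real data).
scattered = ["+", "-", "+", "-", "-", "+", "-", "+"]
organized = ["+", "+", "+", "+", "-", "-", "-", "-"]

for name, order in (("scattered", scattered), ("organized", organized)):
    print(name, strand_bias(order), strand_switches(order))
```

In this toy case both orders place half of their genes on each strand, so strand balance alone does not separate them; only the number of strand switches does.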

The minimal complement of genes present in the plastids of these non-photosynthetic organisms (especially the free-living ones) provides an excellent starting point for functional studies of non-photosynthetic plastids. The redundancy and complexity of a photosynthetic plant plastid genome make gene knock-out studies more difficult (Parker et al., 2008). Unicellular algae, on the other hand, especially those with a minimal plastid genome, could yield a simplified model for studying plastid-encoded proteins of unknown function or association.

CHAPTER 4: CONCLUSION

The complete Cryptomonas paramecium plastid genome presented in this thesis is the first red algal-derived complex plastid genome from a free-living organism that has lost its ability to photosynthesize. The field of comparative genomics of secondarily non-photosynthetic plastids is in its infancy, and has largely consisted of plastids from flowering plants (Krause, 2008). The addition of the C. paramecium genome to the arsenal of complete plastid genome sequences increases the breadth of plastid genomes sampled to date, and will help to identify some common trends present in highly reduced organellar genomes. There does indeed appear to be a "structured reduction" occurring in these plastid genomes, regardless of origin or complexity of the plastid (de Koning and Keeling, 2006).

At 78 kb, the plastid genome of C. paramecium is by far the smallest plastid genome among the unicellular cryptomonad algae sequenced to date. It was sequenced with a combination of Sanger sequencing and pyrosequencing techniques, highlighting the positive and negative attributes of each method. Compared to the photosynthetic plastid genomes, it is approximately 50 kb smaller, with most of the difference due to the loss of entire photosynthetic gene families such as the pet, psb and psa gene families. Interestingly, the rRNA operon, canonically present as inverted repeats in most plastid genomes, occurs as just a single copy in C. paramecium. And like many other species of non-photosynthetic algae and higher plants, C. paramecium maintains the rbcL gene in its plastid genome, suggesting a non-photosynthetic function for the large subunit of RuBisCO.

Although the niche of non-photosynthetic plastid genome analysis is expanding, there remains a wealth of information to be mined. In the cryptomonads alone, it appears that a non-photosynthetic lifestyle has evolved multiple times (as is the case with the land plants) and therefore represents an opportunity to discover larger trends in genome streamlining likely to occur in these plastids. Analysis of the expression levels of the proteins encoded on the plastid genome has yet to be addressed in the cryptomonad clade, and once completed will likely provide much valuable information on the functional significance (if any) of the remaining photosynthesis-related genes.

Looking outside the cryptomonads, it is apparent that the number of sequenced non-photosynthetic plastid genomes of secondary or tertiary origin is still very low. As organisms with secondary plastids are abundant in the marine environment, and are among the most successful colonizers of a variety of ecological niches, we have just barely scratched the surface of leucoplast genome sequences from both parasitic and free-living non-photosynthetic organisms. Determining what genes are maintained in non-photosynthetic plastid genomes may yield insight into the function of some of the unidentified proteins encoded within the genome.

A better understanding of genome reduction associated with a drastic functional shift (such as the loss of photosynthesis) may also help answer the question "can one lose a plastid once it has been acquired?" This question is central to many currently proposed hypotheses dealing with higher-order organization of the eukaryotes containing secondarily-acquired plastids (Archibald, 2009). The exact number of times this secondary acquisition happened, and how many lineages are involved, is still unclear. Increasing our knowledge regarding the continuum of photosynthetic ability may yield clues as to whether some of the chromalveolates (for example) did at one time contain a plastid. With it would come a greater comprehension of the processes behind the acquisition and loss of photosynthesis, one of the most influential metabolic developments on Earth.

REFERENCE LIST

Allen, J. F. 2003. Why chloroplasts and mitochondria contain genomes. Comparative and Functional Genomics 4:31-36.

Allen, J. F., and J. A. Raven. 1996. Free-radical-induced mutation vs redox regulation: Costs and benefits of genes in organelles. Journal of Molecular Evolution 42:482-492.

Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic Local Alignment Search Tool. Journal of Molecular Biology 215:403-410.

Apt, K. E., J. L. Collier, and R. Grossman. 1995. Evolution of the Phycobiliproteins. Journal of Molecular Biology 248:79-96.

Archibald, J. M. 2009. The Puzzle of Plastid Evolution. Current Biology 19:R81-R88.

Barbrook, A. C., N. Santucci, L. J. Plenderleith, R. G. Hiller, and C. J. Howe. 2006. Comparative analysis of dinoflagellate chloroplast genomes reveals rRNA and tRNA genes. BMC Genomics 7:297.

Barkman, T., J. McNeal, S.-H. Lim, G. Coat, H. Croom, N. Young, and C. dePamphilis. 2007. Mitochondrial DNA suggests at least 11 origins of parasitism in angiosperms and reveals genomic chimerism in parasitic plants. BMC Evolutionary Biology 7:248.

Bodyl, A. 2005. Do plastid-related characters support the chromalveolate hypothesis? Journal of Phycology 41:712-719.

Borza, T., C. E. Popescu, and R. W. Lee. 2005. Multiple metabolic roles for the nonphotosynthetic plastid of the green alga Prototheca wickerhamii. Eukaryotic Cell 4:253-261.

Brett, S. J., L. Perasso, and R. Wetherbee. 1994. Structure and Development of the Cryptomonad Periplast - a Review. Protoplasma 181:106-122.

Burki, F., K. Shalchian-Tabrizi, and J. Pawlowski. 2008. Phylogenomics reveals a new 'megagroup' including most photosynthetic eukaryotes. Biology Letters 4:366-369.

Cai, X. M., A. L. Fuller, L. R. McDougald, and G. Zhu. 2003. Apicoplast genome of the coccidian Eimeria tenella. Gene 321:39-46.

Cavalier-Smith, T. 1986. The Chromista: origin and systematics. Pp. 309-347 in F. E. Round and D. J. Chapman, eds. Progress in phycological research. Biopress, Bristol.

Cavalier-Smith, T. 1999. Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. Journal of Eukaryotic Microbiology 46:347-366.

Chesnick, J. M., C. W. Morden, and A. M. Schmieg. 1996. Identity of the endosymbiont of Peridinium foliaceum (Pyrrophyta): Analysis of the rbcLS operon. Journal of Phycology 32:850-857.

Clay, B. L., and P. Kugrens. 1999. Characterization of Hemiselmis amylosa sp nov and phylogenetic placement of the blue-green cryptomonads H. amylosa and Falcomonas daucoides. Protist 150:297-310.

Daley, D. O., and J. Whelan. 2005. Why genes persist in organelle genomes. Genome Biology 6:110.

Daugbjerg, N., and R. A. Andersen. 1997. Phylogenetic analyses of the rbcL sequences from haptophytes and heterokont algae suggest their chloroplasts are unrelated. Molecular Biology and Evolution 14:1242-1251.

de Koning, A. P., and P. J. Keeling. 2006. The complete plastid genome sequence of the parasitic green alga Helicosporidium sp is highly reduced and structured. BMC Biology 4:12.

dePamphilis, C. W., and J. D. Palmer. 1990. Loss of Photosynthetic and Chlororespiratory Genes from the Plastid Genome of a Parasitic Flowering Plant. Nature 348:337-339.

Douglas, S., S. Zauner, M. Fraunholz, M. Beaton, S. Penny, L. T. Deng, X. N. Wu, M. Reith, T. Cavalier-Smith, and U. G. Maier. 2001. The highly reduced genome of an enslaved algal nucleus. Nature 410:1091-1096.

Douglas, S. E., and S. L. Penny. 1999. The plastid genome of the cryptophyte alga, Guillardia theta: complete sequence and conserved synteny groups confirm its common ancestry with red algae. Journal of Molecular Evolution 48:236-244.

Edwards, G. E., V. R. Franceschi, and E. V. Voznesenskaya. 2004. Single-cell C4 photosynthesis versus the dual-cell (Kranz) paradigm. Annual Review of Plant Biology 55:173-196.

Falkowski, P. G., R. T. Barber, and V. Smetacek. 1998. Biogeochemical controls and feedbacks on ocean primary production. Science 281:200-206.

Falkowski, P. G., M. E. Katz, A. H. Knoll, A. Quigg, J. A. Raven, O. Schofield, and F. J. R. Taylor. 2004. The evolution of modern eukaryotic phytoplankton. Science 305:354-360.

Fichera, M. E., and D. S. Roos. 1997. A plastid organelle as a drug target in apicomplexan parasites. Nature 390:407-409.

Fong, A., and J. M. Archibald. 2008. Evolutionary dynamics of light-independent protochlorophyllide oxidoreductase genes in the secondary plastids of cryptophyte algae. Eukaryotic Cell 7:550-553.

Gantt, E., M. R. Edwards, and L. Provasoli. 1971. Chloroplast structure of the Cryptophyceae. Evidence for phycobiliproteins within intrathylakoidal spaces. J Cell Biol 48:280-290.

Gervais, F. 1998. Ecology of cryptophytes coexisting near a freshwater chemocline. Freshwater Biology 39:61-78.

Gervais, F. 1997. Diel vertical migration of Cryptomonas and Chromatium in the deep chlorophyll maximum of a eutrophic lake. Journal of Plankton Research 19:533-550.

Gockel, G., and W. Hachtel. 2000. Complete gene map of the plastid genome of the nonphotosynthetic euglenoid flagellate Astasia longa. Protist 151:347-351.

Goldberg, S. M. D., J. Johnson, D. Busam, T. Feldblyum, S. Ferriera, R. Friedman, A. Halpern, H. Khouri, S. A. Kravitz, F. M. Lauro, K. Li, Y. H. Rogers, R. Strausberg, G. Sutton, L. Tallon, T. Thomas, E. Venter, M. Frazier, and J. C. Venter. 2006. A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes. Proceedings of the National Academy of Sciences of the United States of America 103:11240-11245.

Gould, S. B., R. F. Waller, and G. I. McFadden. 2008. Plastid evolution. Annual Review of Plant Biology 59:491-517.

Grzebyk, D., O. Schofield, C. Vetriani, and P. G. Falkowski. 2003. The mesozoic radiation of eukaryotic algae: The portable plastid hypothesis. Journal of Phycology 39:259-267.

Hackett, J. D., D. M. Anderson, D. L. Erdner, and D. Bhattacharya. 2004a. Dinoflagellates: A remarkable evolutionary experiment. American Journal of Botany 91:1523-1534.

Hackett, J. D., H. S. Yoon, S. Li, A. Reyes-Prieto, S. E. Rummele, and D. Bhattacharya. 2007. Phylogenomic analysis supports the monophyly of cryptophytes and haptophytes and the association of Rhizaria with Chromalveolates. Molecular Biology and Evolution 24:1702-1713.

Hackett, J. D., H. S. Yoon, M. B. Soares, M. F. Bonaldo, T. L. Casavant, T. E. Scheetz, T. Nosenko, and D. Bhattacharya. 2004b. Migration of the plastid genome to the nucleus in a peridinin dinoflagellate. Current Biology 14:213-218.

Haferkamp, I., P. Deschamps, M. Ast, W. Jeblick, U. Maier, S. Ball, and H. E. Neuhaus. 2006. Molecular and biochemical analysis of periplastidial starch metabolism in the cryptophyte Guillardia theta. Eukaryotic Cell 5:964-971.

Hamel, P. P., B. W. Dreyfuss, Z. Y. Xie, S. T. Gabilly, and S. Merchant. 2003. Essential histidine and tryptophan residues in CcsA, a system II polytopic cytochrome c biogenesis protein. Journal of Biological Chemistry 278:2593-2603.

Hammer, A., R. Schumann, and H. Schubert. 2002. Light and temperature acclimation of Rhodomonas salina (Cryptophyceae): photosynthetic performance. Aquatic Microbial Ecology 29:287-296.

Hashimoto, H. 2005. The ultrastructural features and division of secondary plastids. Journal of Plant Research 118:163-172.

Hauth, A. M., U. G. Maier, B. F. Lang, and G. Burger. 2005. The Rhodomonas salina mitochondrial genome: bacteria-like operons, compact gene arrangement and complex repeat region. Nucleic Acids Research 33:4433-4442.

Hempel, F., A. Bozarth, M. S. Sommer, S. Zauner, J. M. Przyborski, and U. G. Maier. 2007. Transport of nuclear-encoded proteins into secondarily evolved plastids. Biological Chemistry 388:899-906.

Hill, D. R. A., and K. S. Rowan. 1989. The Biliproteins of the Cryptophyceae. Phycologia 28:455-463.

Hjorth, E., K. Hadfi, S. Zauner, and U. G. Maier. 2005. Unique genetic compartmentalization of the SUF system in cryptophytes and characterization of a SufD mutant in Arabidopsis thaliana. FEBS Letters 579:1129-1135.

Hoef-Emden, K. 2005. Multiple independent losses of photosynthesis and differing evolutionary rates in the genus Cryptomonas (Cryptophyceae): Combined phylogenetic analyses of DNA sequences of the nuclear and the nucleomorph ribosomal operons. Journal of Molecular Evolution 60:183-195.

Hoef-Emden, K., B. Marin, and M. Melkonian. 2002. Nuclear and nucleomorph SSU rDNA phylogeny in the Cryptophyta and the evolution of cryptophyte diversity. Journal of Molecular Evolution 55:161-179.

Hoef-Emden, K., and M. Melkonian. 2003. Revision of the genus Cryptomonas (Cryptophyceae): a combination of molecular phylogeny and morphology provides insights into a long-hidden dimorphism. Protist 154:371-409.

Hoef-Emden, K., H. D. Tran, and M. Melkonian. 2005. Lineage-specific variations of congruent evolution among DNA sequences from three genomes, and relaxed selective constraints on rbcL in Cryptomonas (Cryptophyceae). BMC Evolutionary Biology 5:56.

Hopkins, J., R. Fowler, S. Krishna, I. Wilson, G. Mitchell, and L. Bannister. 1999. The plastid in Plasmodium falciparum asexual blood stages: a three-dimensional ultrastructural analysis. Protist 150:283-295.

Howe, C. J., and A. G. Smith. 1991. Plants without Chlorophyll. Nature 349:109.

Huse, S. M., J. A. Huber, H. G. Morrison, M. L. Sogin, and D. Mark Welch. 2007. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biology 8:R143.

Ishida, H., K. Yoshimoto, M. Izumi, D. Reisen, Y. Yano, A. Makino, Y. Ohsumi, M. R. Hanson, and T. Mae. 2008. Mobilization of rubisco and stroma-localized fluorescent proteins of chloroplasts to the vacuole by an ATG gene-dependent autophagic process. Plant Physiology 148:142-155.

Jeong, H. J. 1999. The ecological roles of heterotrophic dinoflagellates in marine planktonic community. Journal of Eukaryotic Microbiology 46:390-396.

Kaplan, A., and L. Reinhold. 1999. CO2 concentrating mechanisms in photosynthetic microorganisms. Annual Review of Plant Physiology and Plant Molecular Biology 50:539-570.

Keeling, P. J., J. M. Archibald, N. M. Fast, and J. D. Palmer. 2004. Comment on "The Evolution of Modern Eukaryotic Phytoplankton". Science 306:2191b.

Khan, H., C. Kozera, B. A. Curtis, J. T. Bussey, S. Theophilou, S. Bowman, and J. M. Archibald. 2007a. Retrotransposons and tandem repeat sequences in the nuclear genomes of cryptomonad algae. Journal of Molecular Evolution 64:223-236.

Khan, H., N. Parks, C. Kozera, B. A. Curtis, B. J. Parsons, S. Bowman, and J. M. Archibald. 2007b. Plastid genome sequence of the cryptophyte alga Rhodomonas salina CCMP1319: lateral transfer of putative DNA replication machinery and a test of chromist plastid phylogeny. Molecular Biology and Evolution 24:1832-1842.

Kim, E., and J. M. Archibald. 2008. Diversity and evolution of plastids and their genomes. Pp. 1-40 in A. S. Sandelius and H. Aronsson, eds. The chloroplast: interactions with the environment. Springer-Verlag, Berlin.

Kim, E., C. E. Lane, B. A. Curtis, C. Kozera, S. Bowman, and J. M. Archibald. 2008. Complete sequence and analysis of the mitochondrial genome of Hemiselmis andersenii CCMP644 (Cryptophyceae). BMC Genomics 9:215.

Klaveness, D. 1988. Ecology of the Cryptomonadida: A First Review. Pp. 105-133 in C. D. Sandgren, ed. Growth and Reproductive Strategies of Freshwater Phytoplankton. Cambridge University Press, Cambridge.

Knauf, U., and W. Hachtel. 2002. The genes encoding subunits of ATP synthase are conserved in the reduced plastid genome of the heterotrophic alga Prototheca wickerhamii. Molecular Genetics and Genomics 267:492-497.

Kohler, S., C. F. Delwiche, P. W. Denny, L. G. Tilney, P. Webster, R. J. M. Wilson, J. D. Palmer, and D. S. Roos. 1997. A plastid of probable green algal origin in apicomplexan parasites. Science 275:1485-1489.

Koike, K., H. Sekiguchi, A. Kobiyama, K. Takishita, M. Kawachi, K. Koike, and T. Ogata. 2005. A novel type of kleptoplastidy in Dinophysis (Dinophyceae): presence of haptophyte-type plastid in Dinophysis mitra. Protist 156:225-237.

Krause, K. 2008. From chloroplasts to "cryptic" plastids: evolution of plastid genomes in parasitic plants. Current Genetics 54:111-121.

Kroth, P. G., A. Chiovitti, A. Gruber, V. Martin-Jezequel, T. Mock, M. S. Parker, M. S. Stanley, A. Kaplan, L. Caron, T. Weber, U. Maheswari, E. V. Armbrust, and C. Bowler. 2008. A Model for Carbohydrate Metabolism in the Diatom Phaeodactylum tricornutum Deduced from Comparative Whole Genome Analysis. PLoS ONE 3:e1426.

Kugrens, P., R. E. Lee, and J. O. Corliss. 1994. Ultrastructure, Biogenesis, and Functions of Extrusive Organelles in Selected Nonciliate Protists. Protoplasma 181:164-190.

Lane, C. E., and J. M. Archibald. 2008. The eukaryotic tree of life: endosymbiosis takes its TOL. Trends in Ecology & Evolution 23:268-275.

Lane, C. E., K. van den Heuvel, C. Kozera, B. A. Curtis, B. J. Parsons, S. Bowman, and J. M. Archibald. 2007. Nucleomorph genome of Hemiselmis andersenii reveals complete intron loss and compaction as a driver of protein structure and function. Proceedings of the National Academy of Sciences of the United States of America 104:19908-19913.

Li, S. L., T. Nosenko, J. D. Hackett, and D. Bhattacharya. 2006. Phylogenomic analysis identifies red algal genes of endosymbiotic origin in the chromalveolates. Molecular Biology and Evolution 23:663-674.

Ludwig, M., and S. P. Gibbs. 1985. DNA Is Present in the Nucleomorph of Cryptomonads - Further Evidence That the Chloroplast Evolved from a Eukaryotic Endosymbiont. Protoplasma 127:9-20.

MacColl, R., D. S. Berns, and O. Gibbons. 1976. Characterization of Cryptomonad Phycoerythrin and Phycocyanin. Archives of Biochemistry and Biophysics 177:265-275.

Martin, W., T. Rujan, E. Richly, A. Hansen, S. Cornelsen, T. Lins, D. Leister, B. Stoebe, M. Hasegawa, and D. Penny. 2002. Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proceedings of the National Academy of Sciences of the United States of America 99:12246-12251.

Martin, W., B. Stoebe, V. Goremykin, S. Hansmann, M. Hasegawa, and K. V. Kowallik. 1998. Gene transfer to the nucleus and the evolution of chloroplasts. Nature 393:162-165.

McConkey, G. A., M. J. Rogers, and T. F. McCutchan. 1997. Inhibition of Plasmodium falciparum protein synthesis - Targeting the plastid-like organelle with thiostrepton. Journal of Biological Chemistry 272:2046-2049.

McFadden, G. I. 2001. Chloroplast origin and integration. Plant Physiology 125:50-53.

McFadden, G. I., P. R. Gilson, and D. R. A. Hill. 1994. Goniomonas: Ribosomal RNA Sequences Indicate That This Phagotrophic Flagellate Is a Close Relative of the Host Component of Cryptomonads. European Journal of Phycology 29:29-32.

McFadden, G. I., and G. G. van Dooren. 2004. Evolution: red algal genome affirms a common origin of all plastids. Current Biology 14:R514-R516.

McFadden, G. I., R. F. Waller, M. E. Reith, and N. Lang-Unnasch. 1997. Plastids in apicomplexan parasites. Plant Systematics and Evolution:261-287.

McNeal, J. R., K. Arumugunathan, J. V. Kuehl, J. L. Boore, and C. W. dePamphilis. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant genus Cuscuta (Convolvulaceae). BMC Biology 5:55.

Moore, R. B., M. Obornik, J. Janouskovec, T. Chrudimsky, M. Vancova, D. H. Green, S. W. Wright, N. W. Davies, C. J. S. Bolch, K. Heimann, J. Slapeta, O. Hoegh-Guldberg, J. M. Logsdon, and D. A. Carter. 2008. A photosynthetic alveolate closely related to apicomplexan parasites (vol 451, pg 959, 2008). Nature 452:900.

Moreira, D., H. Le Guyader, and H. Philippe. 2000. The origin of red algae and the evolution of chloroplasts. Nature 405:69-72.

Morrall, S., and A. D. Greenwood. 1980. A Comparison of the Periodic Substructure of the Trichocysts of the Cryptophyceae and Prasinophyceae. Biosystems 12:71-83.

Morton, B. R., U. Sorhannus, and M. Fox. 2002. Codon adaptation and synonymous substitution rate in diatom plastid genes. Molecular Phylogenetics and Evolution 24:1-9.

Muller, K. M., M. C. Oliveira, R. G. Sheath, and D. Bhattacharya. 2001. Ribosomal DNA phylogeny of the Bangiophycidae (Rhodophyta) and the origin of secondary plastids. American Journal of Botany 88:1390-1400.

Nachin, L., L. Loiseau, D. Expert, and F. Barras. 2003. SufC: an unorthodox cytoplasmic ABC/ATPase required for [Fe-S] biogenesis under oxidative stress. EMBO Journal 22:427-437.

Neuhaus, H. E., and M. J. Emes. 2000. Nonphotosynthetic Metabolism In Plastids. Annu Rev Plant Physiol Plant Mol Biol 51:111-140.

Oliveira, M. C., and D. Bhattacharya. 2000. Phylogeny of the Bangiophycidae (Rhodophyta) and the secondary endosymbiotic origin of algal plastids. American Journal of Botany 87:482-492.

Palmer, J. D. 2003. The symbiotic birth and spread of plastids: How many times and whodunit? Journal of Phycology 39:4-11.

Parker, M. S., T. Mock, and E. V. Armbrust. 2008. Genomic insights into marine microalgae. Annual Review of Genetics 42:619-645.

Patron, N. J., Y. Inagaki, and P. J. Keeling. 2007. Multiple gene phylogenies support the monophyly of cryptomonad and haptophyte host lineages. Current Biology 17:887-891.

Patron, N. J., R. F. Waller, and P. J. Keeling. 2006. A tertiary plastid uses genes from two endosymbionts. Journal of Molecular Biology 357:1373-1382.

Pedrosalio, C., R. Massana, M. Latasa, J. Garciacantizano, and J. M. Gasol. 1995. Predation by Ciliates on a Metalimnetic Cryptomonas Population - Feeding Rates, Impact and Effects of Vertical Migration. Journal of Plankton Research 17:2131-2154.

Pfannschmidt, T., K. Schutze, M. Brost, and R. Oelmuller. 2001. A novel mechanism of nuclear photosynthesis gene regulation by redox signals from the chloroplast during photosystem stoichiometry adjustment. Journal of Biological Chemistry 276:36125-36130.

Race, H. L., R. G. Herrmann, and W. Martin. 1999. Why have organelles retained genomes? Trends in Genetics 15:364-370.

Reinfelder, J. R., A. M. Kraepiel, and F. M. M. Morel. 2000. Unicellular C4 photosynthesis in a marine diatom. Nature 407:996-999.

Richaud, C., and G. Zabulon. 1997. The heme oxygenase gene (pbsA) in the red alga Rhodella violacea is discontinuous and transcriptionally activated during iron limitation. Proceedings of the National Academy of Sciences of the United States of America 94:11736-11741.

Roberts, K., E. Granum, R. C. Leegood, and J. A. Raven. 2007. Carbon acquisition by diatoms. Photosynthesis Research 93:79-88.

Rochaix, J. D. 1997. Chloroplast reverse genetics: new insights into the function of plastid genes. Trends in Plant Science 2:419-425.

Rodriguez-Ezpeleta, N., H. Brinkmann, S. C. Burey, B. Roure, G. Burger, W. Loffelhardt, H. J. Bohnert, H. Philippe, and B. F. Lang. 2005. Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Current Biology 15:1325-1330.

Rogers, M. B., P. R. Gilson, V. Su, G. I. McFadden, and P. J. Keeling. 2007. The complete chloroplast genome of the chlorarachniophyte Bigelowiella natans: evidence for independent origins of chlorarachniophyte and euglenid secondary endosymbionts. Molecular Biology and Evolution 24:54-62.

Salonen, K., R. I. Jones, and L. Arvola. 1984. Hypolimnetic Phosphorus Retrieval by Diel Vertical Migrations of Lake Phytoplankton. Freshwater Biology 14:431-438.

Schnepf, E., and M. Elbrachter. 1999. Dinophyte chloroplasts and phylogeny: a review. Grana 38:81-97.

Shalchian-Tabrizi, K., J. Brate, R. Logares, D. Klaveness, C. Berney, and K. S. Jakobsen. 2008. Diversification of unicellular eukaryotes: cryptomonad colonizations of marine and fresh waters inferred from revised 18S rRNA phylogeny. Environmental Microbiology 10:2635-2644.

Silver, T. D., S. Koike, A. Yabuki, R. Kofuji, J. M. Archibald, and K. I. Ishida. 2007. Phylogeny and nucleomorph karyotype diversity of chlorarachniophyte algae. Journal of Eukaryotic Microbiology 54:403-410.

Simpson, C. L., and D. B. Stern. 2002. The treasure trove of algal chloroplast genomes. Surprises in architecture and gene content, and their functional implications. Plant Physiology 129:957-966.

Singh, B. K., and D. L. Shaner. 1995. Biosynthesis of Branched-Chain Amino-Acids - from Test-Tube to Field. Plant Cell 7:935-944.

Soll, J., and E. Schleiff. 2004. Protein import into chloroplasts. Nature Reviews Molecular Cell Biology 5:198-208.

Spalding, M. H. 2008. Microalgal carbon-dioxide-concentrating mechanisms: Chlamydomonas inorganic carbon transporters. Journal of Experimental Botany 59:1463-1473.

Steiner, J. M., F. Yusa, J. A. Pompe, and W. Loffelhardt. 2005. Homologous protein import machineries in chloroplasts and cyanelles. Plant Journal 44:646-652.

Swaminathan, K., K. Varala, and M. E. Hudson. 2007. Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey. BMC Genomics 8:132.

Takishita, K., K. Koike, T. Maruyama, and T. Ogata. 2002. Molecular evidence for plastid robbery (Kleptoplastidy) in Dinophysis, a dinoflagellate causing diarrhetic shellfish poisoning. Protist 153:293-302.

Tanifuji, G. 2006. SybioticGene Re-Organization of Cryptomonads. Pp. 103.

Teles-Grilo, M. L., J. Tato-Costa, S. M. Duarte, A. Maia, G. Casal, and C. Azevedo. 2007. Is there a plastid in Perkinsus atlanticus (Phylum Perkinsozoa)? European Journal of Protistology 43:163-167.

Tomova, C., W. J. C. Geerts, T. Muller-Reichert, R. Entzeroth, and B. M. Humbel. 2006. New comprehension of the apicoplast of Sarcocystis by transmission electron tomography. Biology of the Cell 98:535-545.

Tortell, P. D. 2000. Evolutionary and ecological perspectives on carbon acquisition in phytoplankton. Limnology and Oceanography 45:744-750.

van der Kooij, T. A. W., K. Krause, I. Dorr, and K. Krupinska. 2000. Molecular, functional and ultrastructural characterisation of plastids from six species of the parasitic flowering plant genus Cuscuta. Planta 210:701-707.

van Dooren, G. G., M. Marti, C. J. Tonkin, L. M. Stimmler, A. F. Cowman, and G. I. McFadden. 2005. Development of the endoplasmic reticulum, mitochondrion and apicoplast during the asexual life cycle of Plasmodium falciparum. Molecular Microbiology 57:405-419.

Wang, S. G., and X. Q. Liu. 1991. The Plastid Genome of Cryptomonas-Phi Encodes an Hsp70-Like Protein, a Histone-Like Protein, and an Acyl Carrier Protein. Proceedings of the National Academy of Sciences of the United States of America 88:10783-10787.

Weisse, T. 2002. The significance of inter- and intraspecific variation in bacterivorous and herbivorous protists. Antonie van Leeuwenhoek 81:327-341.

Wickett, N. J., Y. Zhang, S. K. Hansen, J. M. Roper, J. V. Kuehl, S. A. Plock, P. G. Wolf, C. W. dePamphilis, J. L. Boore, and B. Goffinet. 2008. Functional gene losses occur with minimal size reduction in the plastid genome of the parasitic liverwort Aneura mirabilis. Molecular Biology and Evolution 25:393-401.

Wilcox, L. W., and G. J. Wedemayer. 1984. Gymnodinium acidotum Nygaard (Pyrrophyta), a dinoflagellate with an endosymbiotic cryptomonad. Journal of Phycology 20:236-242.

Wilson, R. J. M., P. W. Denny, P. R. Preiser, K. Rangachari, K. Roberts, A. Roy, A. Whyte, M. Strath, D. J. Moore, P. W. Moore, and D. H. Williamson. 1996. Complete gene map of the plastid-like DNA of the malaria parasite Plasmodium falciparum. Journal of Molecular Biology 261:155-172.

Wolfe, K. H., C. W. Morden, and J. D. Palmer. 1992. Function and Evolution of a Minimal Plastid Genome from a Nonphotosynthetic Parasitic Plant. Proceedings of the National Academy of Sciences of the United States of America 89:10648-10652.

Yoon, H. S., J. D. Hackett, and D. Bhattacharya. 2002. A single origin of the peridinin- and fucoxanthin-containing plastids in dinoflagellates through tertiary endosymbiosis. Proceedings of the National Academy of Sciences of the United States of America 99:11724-11729.

Yoon, H. S., J. D. Hackett, C. Ciniglia, G. Pinto, and D. Bhattacharya. 2004. A molecular timeline for the origin of photosynthetic eukaryotes. Molecular Biology and Evolution 21:809-818.

APPENDIX A: DETAILED GENE TABLE

O (N iD cn IV ^-ivivTfrivrnvorsim^omTj-uDTHT-i cn cr> iv iv iv IV ivvovoro'frrooovovom^t-vDvomiv

00 o T-H ro CO rM rM cn •* in IV (N •* •* IV o ro oo m o m ro vo VO ro rv IV (N VO HI/HO vo t ro M- •* (N Cn iH T-H 01

H3 C ro 1X3 JP ID (Q ro ro (0 s f0 ro QJ « to 2 J2 .c •S J? •4-J ro QJ ^ QJ .c

V) salina 5 OJ Ifl JJ +J _ 4-> >-J J_> (0 rsi a. a. e- Q. Q. l u *t= ro Q ro a. to V) Q. (/> £» <0 (0

rsi n n n IH (N ro + • + + +

00 (N 00 ^voorjgvo^m^gg vo ro vo ^t in ro -3" vo o cn cn "tf 00 r\l ro in T-l 'f iH (N (N tv 00 rH 00 00 in o

(N -fr VO oo in TH CM m co w en in 00 •sr 00 IV 00 VO o ro oo ro oo vo IV 00 VO o(No •* o ro 8*(N i cr> in o 00 O Oi lO CM •

00 IV cn Cn O VO o o ^r ro (N i-t (rlin^oo^rvr.vpS|Ci2N ^ cn CM IV IV 00 c^oorvrvrvivrv^oo^i^in^^Lnf^^^^vDoopj^^n ro (N IV , VD°2* 00

o in vo oo m o fM cn vo m o CM n oo o \|- o •* cn oo 'd- oo oo oo i-i rM cn rv m m oo rsi cn cn vo cn m N n cn m o o i-t m •* o (\i oo m cn cn H CM rv IV TH fM VO **• rv in ONOOOtin N ton in co vo 00 fM ^- vo rv oo cn m oo vo TH rv m m in Hnn^ininininiOtfio^ioscomsyy'jHHHHHNNNNo N

•* vo oo rvmrMvovorooofMrovO' * vOi-irvorMOcnvorvmco 00 VO oo TH -I-H in 2 rM vo o IH rv m o T-H rv rv r\i cn vorvo^mcnoofMOoo • f\ ^* 0^r0t CIM *V\ I oo in ro o cn rM rv v_/ u j O \-JO T—THi fi M-^ i -\ T—i >j u j yjj i ^ \^ i -M -si v*j V<-"O. (iN > i r*i rn 00 o rM cn t^-iniooHiflin fMn-

< in E 3 < QJ < 1 /xi ^ ^ m o cn -I C/) X < (J < (/) "S u u " ^vB a. a. iJ- o. w Q. "r c c c c C O ro ro >: ro ro ro Willi o (J U u

(J (J (J (J CJ (J {J e)Ci)(J)tn|— (J(j}(J3e)(JCn(J}l3Cn(r) H ^ £ £ £ 86


[Table page not recoverable from the scanned source; numeric columns are illegible. Legible gene names: ilvB, rpl1, tufA, rpl34, clpC, rpoA, secY, rpl11, secA, rpl19, rpl36, rpl31, rpl12, petF, rps11. Legible taxon names: R. salina, G. theta, P. purpurea, S. elongatus, Lyngbya, Cyanothece.]
