<<


EVOLUTIONARY CONSEQUENCES OF THE LOSS OF PHOTOSYNTHESIS IN THE NONPHOTOSYNTHETIC CHLOROPHYTE ALGA .

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Dawne Vernon, B.S.

★ ★ ★ ★ ★

The Ohio State University

1996

Dissertation Committee:

C. William Birky, Jr.

Caroline A. Breitenberger

Thomas J. Byers

Daniel J. Crawford

Approved by

Advisor

Department of Molecular Genetics

UMI Number: 9620082

Copyright 1996 by Vernon, Dawne

All rights reserved.

Copyright

by

Dawne Vernon

1996

To my parents,

ACKNOWLEDGMENTS

Thanks are inadequate relative to the benefits I received from having Dr. C. W. Birky, Jr. as a mentor, both in terms of knowledge gained and thinking skills acquired. I learned from many fine teachers throughout my , but Dr. Birky and the late Dr. Elton Paddock reminded me how to think (and how to observe), at two times in my adult life when I thought I already knew how to think fairly well. Both professors kept my logic sharp and my perspective broad. I thank Carl Kipp for similar input; no sloppy logic around him. In addition, the positive environment in Dr. Birky's laboratory contributes to the psychological well-being of all the students who pass through it.

Thanks to Dr. Tom Byers and Dr. Dan Crawford for advice on phylogeny and for time spent reviewing this manuscript.

Thanks to Dr. Caroline Breitenberger for her reviewing time, and for her great assistance in analyzing a hypervariable region in the EF-Tu .

Thanks to Pam Katko and Cal Griffith for helping me start this project. Thanks for technical help and friendship from Heather Deiderick, Lori Humphrey, and Robin McBride, and especially from Suzanne Nixon. Thanks to Dr. Steve Kuhl, Bob Rumpf and Pam Mackowski for brainstorming. And thanks, again, to Bob Rumpf for more unselfish help with software than I can ever repay.

Finally, thanks to my supportive and understanding parents. I have had the rare opportunity to work professionally with both of my parents, so they have been parents, mentors, colleagues and friends to me. I couldn't have finished this project without them.

The work reported in Chapters IV and V will be submitted for publication. The joint project reported in Chapter III has already been accepted for publication in the Journal of Phycology.

VITA

October 23, 1948 ...... Born - Akron, Ohio

1971 ...... B.S., The Ohio State University, Columbus, Ohio

1968 - 1972 ...... Mathematician, E.S. Preston & Associates, Civil Engineers Columbus, Ohio

1972 - 1984...... Mathematician, Technical Manager and Business Manager, Right-of-Way Consultants & Associates, Professional Surveyors Marietta, Ohio

1984 - present ..... University Fellow, Graduate Teaching Associate, Graduate Research Associate, or Graduate Student Department of Molecular Genetics The Ohio State University Columbus, Ohio

PUBLICATIONS

Vernon-Kipp, D., Kuhl, S.A. & Birky, C.W. Jr. 1989. Molecular evolution of Polytoma, a non-green chlorophyte. In: Boyer, Charles T., Shannon, Jack C., & Hardison, Ross C. [Eds.] Physiology, Biochemistry, and Genetics of Nongreen Plastids. American Society of Plant Physiologists, Rockville, Maryland, pp. 284-6.

Gordon, J., Rumpf, R., Shank, S.L., Vernon, D. & Birky, C.W. Jr. 1995. Sequences of the Rrn18 of humicola and C. dysosmos are identical, in agreement with their combination in the species C. applanata (). In: J. Phycol. 31:312-3.

Rumpf, R., Vernon, D., Schreiber, D. & Birky, C.W. Jr. 1996. Evolutionary consequences of the loss of photosynthesis in Chlamydomonadaceae: Phylogenetic analysis of Rrn18 (18S rDNA) in 13 Polytoma strains (Chlorophyta). (in press, Feb. 1996, Journal of Phycology).

Technical Editor of textbook: Vernon, R.C. 1996. Professional Surveyor's Manual. (in press, Sept. 1996, McGraw-Hill Publishing Co.)

FIELD OF STUDY

Major Field: Molecular Genetics

Molecular Genetics and Evolution of

Professor: C. William Birky, Jr.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
VITA
LIST OF TABLES
LIST OF FIGURES

CHAPTER

I. INTRODUCTION
    Introduction
    Hypotheses
    Project Goals

II. ISOLATION AND CHARACTERIZATION OF DNA FROM POLYTOMA UVELLA 964
    Introduction
    Materials and Methods
    Results
    Discussion

III. MOLECULAR PHYLOGENETIC ANALYSIS IN POLYTOMA
    General Background
    Accepted Paper: EVOLUTIONARY CONSEQUENCES OF THE LOSS OF PHOTOSYNTHESIS IN CHLAMYDOMONADACEAE: PHYLOGENETIC ANALYSIS OF RRN18 (18S RDNA) IN 13 POLYTOMA SPECIES (CHLOROPHYTA)
        Introduction
        Materials and Methods
        Results and Discussion
        Conclusions
    Additional Information from Phylogenetic Analysis

IV. EVIDENCE OF FUNCTIONALITY IN TWO POLYTOMA LEUCOPLAST GENES
    General Background
    Submittable Paper: EVIDENCE THAT PLASTID RRN16 AND TUFA GENES ARE FUNCTIONAL IN THREE NON-GREEN CHLOROPHYTE ALGAE IN THE GENUS POLYTOMA
        Introduction
        Materials and Methods
        Results
        Discussion

V. ACCELERATED EVOLUTIONARY RATES FOUND IN TWO POLYTOMA LEUCOPLAST GENES
    General Background
    Submittable Paper: ANALYSIS OF EVOLUTIONARY RATES IN TWO PLASTID GENES IN THE NONPHOTOSYNTHETIC ALGA POLYTOMA
        Introduction
        Materials and Methods
        Results
        Discussion

VI. SUMMARY

LIST OF REFERENCES

LIST OF TABLES

TABLE

1. Variations in Gillham & Boynton lysis protocol
2. Primers used for Rrn18, rrn16, tufA, rrn23, rbcL
3. Amplification conditions used for Rrn18, rrn16, tufA, rrn23, rbcL
4. List of Polytoma stocks
5. List of complete Rrn18 sequences and their sources
6. Numbers of substitution and indel differences between complete sequences of Rrn18 genes
7. Single-base substitution rate changes in non-green plastid genes
8. Relative rate tests for rrn16
9. Relative rate tests for tufA
10. Base composition (as % G + C) in chlamydomonad rrn16 and tufA genes
11. Codon usage in chlamydomonad tufA plastid genes
12. Increase in substitution rates (relative to green species) at synonymous versus non-synonymous sites in Epifagus and P. obtusum

LIST OF FIGURES

FIGURE

1. Polytoma lpDNA yield from various lysis protocols
2. Buoyant densities in CsCl gradients
3. CsCl + bis gradient for P.u. 964
4. P.u. 964 DNA digested with Hind III
5. P. uvella 964 nuclear ribosomal RNA (s) identified
6. P. uvella 964 leucoplast ribosomal RNA gene(s) identified
7. Dot blot with rrn16/rrn23
8. Dot blot with tufA
9. Leucoplast DNA digested with various restriction enzymes
10. Phylogram of complete Rrn18 sequences
11. Cladograms based on partial Rrn18 sequences
12. Nucleotide alignment of a hypervariable region in tufA gene, showing chlamydomonad-specific insertions
13. Amino acid alignment of a hypervariable region in tufA gene, showing chlamydomonad-specific insertions
14. Protein structure of EF-Tu with insertion shown
15. Three-taxon tree for relative rate test
16. Codon usage in tufA genes from two green species plus non-green P. 62-21
17. Codon usage in tufA genes from two green and two Polytoma species

CHAPTER I

Introduction

INTRODUCTION TO POLYTOMA

A number of chlorophyte algae and land plants have lost the ability to do photosynthesis and now fill niches as free-living heterotrophs or parasites on other plants (Patterson and Larsen 1991, Pringsheim 1963, Kuijt 1969). These unusual provide a rare opportunity to observe a natural experiment on the evolutionary consequences of loss of a significant function. Several genera of nonphotosynthetic chlorophytes have been identified. The non-photosynthetic genus Prototheca is considered closely related by morphology and biochemistry to photosynthetic Chlorella (Huss et al. 1988, Kerfin and Kessler 1978, Pore 1985). Eight non-photosynthetic genera have been identified within the (Pringsheim 1963), but only two, Polytomella and Polytoma, are represented in culture collections. The non-photosynthetic Polytomella species are considered to be closely related to photosynthetic Carteria (Pringsheim 1963), and the Birky laboratory has recently begun to study several species in that genus. A larger group of non-photosynthetic Polytoma species exists in collection, classified by morphology and biochemistry as close relatives of Chlamydomonas (Pringsheim 1963). The genus Polytoma is the subject of this thesis, which describes studies of the evolutionary effects of the loss of photosynthesis in several species of the genus.

Polytoma algae are salmon-pink, not green, since they produce no chlorophyll but still produce carotenoids (Links, Verloop & Havinga 1963). Like Chlamydomonas, they are ovoid biflagellated unicells with a single cup-shaped plastid. The plastid, called a leucoplast, differs from a Chlamydomonas chloroplast in being devoid of most or all internal thylakoid membranes, but still contains , DNA, rRNA and stored granules (Siu, Chiang and Swift 1975b, 1975c, 1976; Kieras and Chiang 1971; Scherbel, Behn and Arnold 1974; Vernon-Kipp, Kuhl and Birky 1989; D. Vernon and C. W. Birky, Jr., unpubl.). Three Polytoma species (P. obtusum, P. uvella 19, and P. papillatum) have been fairly extensively studied with respect to morphology, ultrastructure and physiology (Siu, Chiang and Swift 1975a, Gaffal 1978, Arnold and Blank 1980, Lang 1963).

Many extant photosynthetic chlamydomonads are facultative heterotrophs that can utilize complex organic molecules as energy sources, and nonphotosynthetic mutants are easily obtained from some of these species (Bold and Wynne 1985, Harris 1989). It is reasonable to assume that the ancestor of a Polytoma clade was a green alga, similar to Chlamydomonas, that incurred a mutation that blocked photosynthesis but survived because it was a facultative heterotroph. Once photosynthesis was lost due to a mutation in any of the many nuclear or chloroplast genes whose products are required for photosynthesis, all of the photosynthetic genes would be dispensable and free to accumulate additional mutations or deletions. These secondary events would then make the loss of photosynthesis irreversible, and the photosynthetic genes would become non-functional pseudogenes or be lost entirely.

Photosynthetic genes are only one of three functional classes of genes in plastid . Another class includes expression genes, which encode some of the RNA and protein components needed for transcription and translation in plastids. There is also evidence for a third class of essential nonphotosynthetic genes, whose products are required for survival but are not involved in either expression or photosynthesis (Gillham and Boynton 1994).

Only a few possible examples of essential non-photosynthetic genes have been located, several in a nonphotosynthetic land plant (Wolfe, Morden and Palmer 1992), several in photosynthetic algae (Hwang and Tabita 1991; Michalowski, Loeffelhardt and Bohnert 1991; Michalowski, Flachmann, Loeffelhardt and Bohnert 1991), and one in a liverwort (Laudenbach and Grossman 1991); but the presence of such a gene or genes can be inferred from compelling evidence (described in Chapter IV) in several green species and non-green mutants.

Chloroplasts have several known functions other than photosynthesis, among them parts of the biochemical pathways involved in fatty acid synthesis, amino acid synthesis, nitrite and sulfate reduction, porphyrin biosynthesis, and the biosynthesis, storage and degradation of starch (Kirk and Tilney-Bassett 1978, Howe and Smith 1991, Weeden 1981, Browse and Somerville 1991, Fiedler and Schultz 1985). Any of these pathways could be the reason why plastid function is required, if any as-yet-unlocated plastid genes code for some portion of these biochemical functions. Or there may be plastid-encoded transmission genes involved in DNA replication or repair functions. Currently, all known genes involved in these pathways and functions are nuclear-encoded and translated in the before being imported into the chloroplast, as are 80% of all used by (Gillham and Boynton 1994).

The chloroplast genomes of three plants have been completely sequenced (Ohyama et al. 1986, Shinozaki et al. 1986, Holwerda et al. 1986), and sequencing of several algal chloroplast genomes is almost finished. Of the approximately 120 genes present in these chloroplast genomes, fewer than half code for proteins involved in photosynthesis. The rest are expression genes and ORFs of unknown function. The expression genes include four rRNAs, four RNA polymerase subunits, 30 tRNAs, and 20 of the 60 ribosomal protein genes.
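As a rough check on these proportions, the expression-gene counts quoted above can be tallied against the approximate genome total. The sketch below uses only the figures given in the text; the variable names are illustrative and the total of 120 is the text's own approximation.

```python
# Gene counts quoted in the text for the sequenced chloroplast genomes.
TOTAL_GENES = 120  # approximate total given in the text

expression_genes = {
    "rRNA": 4,
    "RNA polymerase subunits": 4,
    "tRNA": 30,
    "ribosomal proteins": 20,
}

# Tally the expression genes and express them as a share of the genome.
n_expression = sum(expression_genes.values())
fraction_expression = n_expression / TOTAL_GENES

print(f"expression genes: {n_expression} of ~{TOTAL_GENES} "
      f"({fraction_expression:.0%})")
```

On these figures the listed expression genes alone account for close to half of the genome, which is consistent with the statement that fewer than half of the genes are photosynthetic.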

If leucoplasts in nonphotosynthetic species must still transcribe and translate the essential nonphotosynthetic gene(s), the expression genes should still be subject to selection and conserved in base sequence. The genes that had once coded for photosynthetic proteins might show mutations that are normally eliminated by selection because they are detrimental. These might include mutations that are obviously incompatible with normal gene function, such as frameshifts, premature termination codons, or deletion of essential parts (or all) of the gene. Leucoplast genes studied extensively in three nonphotosynthetic organisms, the beech parasite Epifagus virginiana, the oak parasite Conopholis americana, and the euglenoid Astasia longa, display those characteristics. Most expression genes are still present in the leucoplast genomes and the sequences appear intact, and some leucoplast RNA and protein products have been demonstrated (dePamphilis and Palmer 1990; Wolfe, Morden & Palmer 1991; Morden et al. 1991; Wimpee et al. 1991, 1992; Siemeister and Hachtel 1989, 1990, 1991; Siemeister, Buchholz and Hachtel 1990). Conversely, many leucoplast genes that code for photosynthetic proteins in photosynthetic species appear nonfunctional in key domains (including 5' or 3' untranslated regions), are grossly truncated, or are absent from these leucoplast genomes (dePamphilis and Palmer 1990, Wolfe and dePamphilis 1996, but see Siemeister and Hachtel 1990, showing evidence of an expressed protein product from the photosynthetic rbcL gene).
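Two of the defects named above, frameshifts and premature termination codons, can be screened for directly in a putative coding sequence. The following is a minimal sketch rather than the method used in this thesis: the function name `coding_defects` is hypothetical, and the three standard stop codons (TAA, TAG, TGA) are assumed to apply to the plastid gene being examined.

```python
# Assumption: the standard stop codons apply to the gene under study.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_defects(cds: str) -> list[str]:
    """List obvious defects in a putative coding sequence (start to stop)."""
    seq = cds.upper().replace("-", "")  # drop alignment gaps, if any
    defects = []

    # A length that is not a multiple of 3 suggests a frameshifting indel.
    if len(seq) % 3 != 0:
        defects.append("length not a multiple of 3 (possible frameshift)")

    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]

    # A stop codon before the final codon would truncate the product.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            defects.append(f"premature stop codon {codon} at codon {index + 1}")

    if not codons or codons[-1] not in STOP_CODONS:
        defects.append("no terminal stop codon")
    return defects


print(coding_defects("ATGGCTTAA"))      # intact toy reading frame
print(coding_defects("ATGTAAGCTTGA"))   # toy frame with an internal stop
```

An empty list means only that none of these gross defects was found; it is not evidence that a gene is functional, which is why the later chapters also examine inferred RNA and protein structure.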

There are several exceptions to the expectation that leucoplast expression genes will be retained and conserved. First, the leucoplast genome in Epifagus is missing more than a dozen expression genes, even though several leucoplast RNA products have been located and identified (Wolfe et al. 1992). The leucoplast genome in Conopholis is also missing several tRNA genes (Wimpee et al. 1992). (This problematical situation will be discussed in Chapter IV.) Second, the tempo of evolutionary change (measured as the rate of nucleotide substitutions) has increased in most of the presumably functional expression genes in those three species. In some rRNA genes, rates as much as 40-fold higher than in green relatives were observed (Wolfe et al. 1992, Wimpee et al. 1992). This observation was an important impetus for the studies reported in this thesis; would this occur in leucoplasts in Polytoma species also?

HYPOTHESES

One hypothesis tested in these studies is that the leucoplast genome in Polytoma is still present, functional, and codes for functional expression genes. The second hypothesis is that expression genes in the leucoplast in Polytoma will show accelerated evolution, as they do in other nonphotosynthetic species. While differences caused by accelerated evolution would probably be most apparent in pseudogenes because of the lack of selection, differences may also be apparent when still-functional plastid genes are compared with homologs in green species, although they might be of smaller magnitude than in pseudogenes.

PROJECT GOALS

Chapter II reports results of attempts to make the and its DNA amenable to molecular study. The goals at this early stage were isolation of DNA from Polytoma cells, separation of that DNA into fractions correlated with organellar origin, estimation of the size of a Polytoma leucoplast genome, and location of a nuclear gene to use in the phylogenetic study reported in Chapter III and of leucoplast genes to use in the molecular observation/comparison studies reported in Chapters IV and V.

Chapter III reports results of a phylogenetic analysis. The goal of this study was to get molecular confirmation of the taxonomic positions of a number of Polytoma species, as previously determined by morphology and biochemistry. The goal was not a taxonomic revision, although the present of chlamydomonads based on morphological and biochemical criteria is being revised (Ettl & Schlosser 1992). One purpose of the phylogenetic analysis was to determine whether independent nonphotosynthetic clades arise frequently or infrequently and how long they survive, in order to determine whether obligate heterotroph species are disadvantaged relative to photosynthetic species. The second purpose of this study was to allow selection of appropriate Polytoma and Chlamydomonas species to use in a comparison of substitution rates in plastid genes from green and nongreen species (results reported in Chapter V). These comparisons (called relative rate tests) require the selection of a green and a nongreen species for comparison, plus a second green species that is as closely related as possible but is still definitely an outgroup {CITE}.
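The logic of a three-taxon relative rate comparison can be sketched as follows. This is an illustrative simplification, not the procedure of Chapter V: it uses the uncorrected proportion of differing sites (p-distance) where a real analysis would normally use a corrected distance and a significance test, and the function names and toy sequences are hypothetical. If the green and nongreen lineages have evolved at the same rate since they diverged, their distances to the outgroup should be about equal, so the difference between them should be near zero.

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


def relative_rate_difference(green: str, nongreen: str, outgroup: str) -> float:
    """Distance(nongreen, outgroup) minus distance(green, outgroup).

    A positive value suggests more change on the nongreen lineage.
    """
    return p_distance(nongreen, outgroup) - p_distance(green, outgroup)


# Toy aligned sequences, invented for illustration only.
outgroup = "ACGTACGTAC"
green = "ACGTACGTAA"
nongreen = "ACCTAGGTAA"
print(round(relative_rate_difference(green, nongreen, outgroup), 3))
```

The outgroup requirement in the text follows from this design: the comparison is only meaningful if both test species are equally distant in time from the reference species.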

Chapter IV reports results of molecular observations at the DNA level, and at the structural level of inferred RNA and protein molecules, in two presumably functional leucoplast expression genes (rrn16 and tufA) from three Polytoma species. The goal of these various observations was to look for features that are, or are not, consistent with functionality of the leucoplast genes. Nomenclature for nuclear and plastid genes reported in this thesis is expressed according to the Commission on Plant Gene Nomenclature, established by the International Society of Plant Molecular Biology (Reardon, ed. 1992).

Chapter V reports results of comparisons of substitution rates in these same rrn16 and tufA sequences. Using appropriate species and genes, the goal of these relative rate tests (explained in detail in Chapter V) is to separate the evolutionary effects of selection from those of mutation by comparing sites still under selection to sites not under selection. The rate of substitution at sites under selection is determined by the joint effects of mutation and selection, while the rate at sites not under selection is equal to the mutation rate (Kimura 1983). Consequently a relative rate test involving a pseudogene could determine whether mutation rates are higher in nonphotosynthetic leucoplasts than in photosynthetic chloroplasts. The same is true for synonymous substitutions in protein coding genes under selection, which are neutral or nearly neutral. By comparing synonymous and nonsynonymous substitutions in expression genes, we were able to determine whether changes in substitution rates in nonphotosynthetic leucoplasts are due to changes in selection, or in mutation rates, or both.

Literature Cited

Arnold, C. G. and R. Blank, 1980 Three-Dimensional structure of mitochondria and plastids in Chlamydomonas reinhardtii and Polytoma papillatum. Walter de Gruyter & Co., New York.

Bold, H. C. and M. J. Wynne, 1985 Introduction to the Algae. Prentice-Hall, Inc., Englewood Cliffs, N.J.

Browse, J. and C. Somerville, 1991 Glycerolipid synthesis: biochemistry and regulation. Annu. Rev. Plant Physiol. Plant Mol. Biol. 42: 467-506.

dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Ettl, H. and U. G. Schlosser, 1992 Towards a revision of the systematics of the genus Chlamydomonas (Chlorophyta). 1. Chlamydomonas applanata Pringsheim. Bot. Acta 105: 323-330.

Fiedler, E. and G. Schultz, 1985 Localization, purification, and characterization of shikimate oxidoreductase- dehydroquinate hydrolase from stroma of spinach chloroplasts. Plant Physiol. 79: 212-218.

Gaffal, K. P., 1978 Configural changes in the plastidome of Polytoma papillatum after completion of cytokinesis and during fusion of the gametes. Protoplasma 94: 175-191.

Gillham, N. W., 1994 Organelle Genes and Genomes. Oxford University Press, Oxford.

Harris, E. H., 1989 The Chlamydomonas Sourcebook. Academic Press, Inc., New York.

Holwerda, B. C., S. Jana and W. L. Crosby, 1986 Chloroplast and mitochondrial DNA variation in Hordeum vulgare and Hordeum spontaneum. Genetics 114: 1271-1291.


Huss, V. A. R., K. H. Wein and E. Kessler, 1988 Deoxyribonucleic acid reassociation in the taxonomy of the genus Chlorella. Archives of Microbiology 150:

Hwang, S. R. and F. R. Tabita, 1991 Acyl carrier protein-derived sequence encoded by the chloroplast genome in the marine diatom Cylindrotheca sp. Strain N1. J. Biol. Chem. 266: 13492-13494.

Kerfin, W. and E. Kessler, 1978 Physiological and biochemical contributions to the taxonomy of the genus Prototheca. II. Starch Hydrolysis and base composition of DNA. Archives of Microbiology 116: 105-107.

Kieras, F. J. and K.-S. Chiang, 1971 Characterization of DNA components from some colorless algae. Exp. Cell Res. 64: 89-96.

Kimura, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

Kirk, J. T. O. and R. A. E. Tilney-Bassett, 1978 The Plastids. Elsevier, North Holland, Amsterdam and New York.

Kuijt, J., 1969 The biology of parasitic flowering plants. University of California Press, Berkeley and Los Angeles.

Lang, N. J., 1963 Electron-microscopic demonstration of plastids in Polytoma. J. Protozool. 10: 333-339.

Laudenbach, D. E. and A. R. Grossman, 1991 Characterization and mutagenesis of Sulfur-regulated Genes in a Cyanobacterium: Evidence for Function in Sulfate Transport. J. Bacteriol 173: 2739-2750.

Links, J., A. Verloop and E. Havinga, 1960 The carotenoids of Polytoma uvella. Arch. Microbiol. 36: 306-324.

Michalowski, C. B., R. Flachmann, W. Loeffelhardt and H. J. Bohnert, 1991 Gene nadA, encoding Quinolinate Synthetase, is located on the Cyanelle DNA from Cyanophora paradoxa. Plant Physiol. 95: 329-330.

Michalowski, C. B., W. Loeffelhardt and H. J. Bohnert, 1991 An ORF323 with homology to crtE, specifying Prephytoene Pyrophosphate Dehydrogenase, is encoded by Cyanelle DNA in the eukaryotic alga Cyanophora paradoxa. J Biol Chem 266: 11866-11870.

Morden, C. W., K. H. Wolfe, C. W. dePamphilis and J. D. Palmer, 1991 Plastid translation and transcription genes in a non-photosynthetic plant: intact, missing and pseudo genes. The EMBO Journal 10: 3281-3288.

Ohyama, K., H. Fukuzawa, T. Kohchi, H. Shirai, T. Sano, S. Sano, K. Umesono, Y. Shiki, M. Takeuchi, Z. Chang, S.-I. Aota, H. J. Inokuchi and H. Ozeki, 1986 Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322: 572-574.

Patterson, D. J. and J. Larsen, Ed. (1991). The Biology of Free-Living Heterotrophic Flagellates. Oxford, Clarendon Press.

Pore, R. S., 1985 Prototheca taxonomy. Mycopathologia 90: 129-139.

Reardon, E. M. and C. Price, 1994 Nomenclature of Sequenced Plant Genes. Plant Molecular Biology Report 12: S1-81.

Scherbel, G., W. Behn and C. G. Arnold, 1974 Untersuchungen zur genetischen Funktion des farblosen Plastiden von Polytoma mirum. Arch. Microbiol. 96: 205-222.

Shinozaki, K., M. Ohme, M. Tanaka, T. Wakasugi, N. Hayashida, T. Matsubayashi, N. Zaita, J. Chunwongse, J. Obokata, K. Yamaguchi-Shinozaki, C. Ohto, K. Torazawa, B. Y. Meng, A. Sugita, H. Deno, T. Kamogashira, K. Yamada, J. Kusuda, F. Takaiwa, A. Kato, N. Tohdoh, H. Shimada and M. Sugiura, 1986 The complete nucleotide sequence of tobacco chloroplast genome: its gene organization and expression. EMBO J. 5: 2043-2050.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432.

Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Siemeister, G. and W. Hachtel, 1990 Organization and nucleotide sequence of ribosomal RNA genes on a circular 73 kbp DNA from the colourless flagellate Astasia longa. Curr. Genet. 17: 433-438.

Siemeister, G. and W. Hachtel, 1990 Structure and expression of a gene encoding the large subunit of ribulose-1,5-bisphosphate carboxylase (rbcL) in the colourless euglenoid flagellate Astasia longa. Plant Mol. Biol. 14: 825-833.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. I. Ultrastructural analysis of organelles. J. Cell Biol. 69: 362-370.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. J. Cell Biol. 69: 371-382.

Vernon-Kipp, D., S. A. Kuhl and C. W. Birky, Jr., 1989 Molecular evolution of Polytoma, a non-green chlorophyte, pp. 284-286 in Physiology, Biochemistry, and Genetics of Nongreen Plastids, edited by C. T. Boyer, J. C. Shannon and R. C. Hardison. American Society of Plant Physiologists, Rockville, Maryland.

Weeden, N. F., 1981 Genetic and biochemical implications of the endosymbiotic origin of the chloroplast. J. Mol. Evolution 17: 133-139.

Wolfe, A. D. and C. W. dePamphilis, 1995 Alternate paths of evolution for the photosynthetic gene rbcL in four nonphotosynthetic species of Orobanche. Plant Molecular Biology (submitted).

Wolfe, K. H., D. S. Katz-Downie, C. W. Morden and J. D. Palmer, 1992 Evolution of the plastid ribosomal RNA operon in a nongreen parasitic plant: Accelerated sequence evolution, altered promoter structure, and tRNA pseudogenes. Plant Mol. Biol. 18: 1037-1048.

CHAPTER II

Isolation and Characterization of Plastid DNA

from Polytoma uvella 964

INTRODUCTION

Several lineages of chlorophyte algae have branches with non-photosynthetic, permanently heterotrophic species. These branches include the genus Prototheca, closely related (by morphology and biochemistry) to the photosynthetic Chlorella (Huss et al. 1988, Kerfin and Kessler 1978, Pore 1985); the genus Polytomella, closely related to Carteria (Pringsheim 1963); and the genus Polytoma, whose closest photosynthetic relatives are the genus Chlamydomonas (Pringsheim 1963). We recently completed a molecular phylogenetic study of a large group of Polytoma species, reported in Chapter III. See Table 4 in Chapter III for a list of the Polytoma species studied.

The investigation in Polytoma began with development of the basic methodology necessary to make these algae amenable to molecular study in the laboratory. Appropriate media and growth conditions were determined, as well as methodology to isolate and separate DNA from the three major DNA-containing organelles. Isolation and separation of DNA from two of the three organelles (nuclear DNA from plastid DNA) was successful in Polytoma uvella UTX 964. A search for a suitable gene (Rrn18) for the phylogenetic study reported in Chapter III was also successful. In addition, several conserved genes were located in the leucoplast genomes of several Polytoma species, for use in a study of molecular evolutionary changes in the highly unusual leucoplasts of these nonphotosynthetic algae, reported in Chapters IV and V.

Four leucoplast genes (rrn16, rrn23, tufA and rbcL) were detected in several Polytoma species.

MATERIALS AND METHODS

Cell Lysis

Isolation and separation of DNAs from the various cellular organelles in Chlamydomonas frequently involves a gentle whole-cell lysis technique devised to preserve high molecular weight organellar DNA (Grant, Gillham and Boynton 1980; Gillham and Boynton, pers. comm.). This Gillham & Boynton technique (G/B technique hereafter) begins with algal cells harvested from liquid culture by centrifugation at 10,000 rpm (Beckman J2-21) for 10 minutes at room temperature. Cells are resuspended in 50 mL (per 10 L liquid culture) TEN buffer (10 mM Tris pH 8, 10 mM EDTA, 150 mM NaCl) plus 2% SDS (sodium dodecyl sulfate, Sigma Chemical Co.), 2% Sarcosyl (n-lauroylsarcosine, Sigma) and 10 µg Pronase E (protease P-5147, Sigma) per L of cells, then slowly shaken in a 4° C cold room for 24 hours. Whole-cell DNA is isolated from the lysate by two extractions with equal volumes of phenol and CIA (24:1), and ethanol precipitation with two volumes of 95% ethanol.

Initial attempts to transfer the G/B technique, unmodified, to Polytoma cells gave low yields of leucoplast DNA. Attempts were then made to modify the G/B technique, using different types or amounts of lysing agents and adding freezing or mechanical lysing methods. These modifications are listed in Table 1, in the same order as the yields shown in Figure 1.

Figure 1. Polytoma lpDNA yield from various lysis protocols.

[Figure 1 image: bar graph of µg lpDNA (top band of the CsCl gradient) per g wet weight of whole cells in the pellet, one bar per lysis protocol, beginning with the original Chlamydomonas protocol.]

Table 1. Variations in Gillham and Boynton lysis protocol.

Abbreviations shown on the table:
1X Pronase E = 10 µg per L of cells
1X Proteinase K = 0.3 mg per mL of lysate
1X Lysozyme = 12 mg per mL of lysate
NaPO4 = 10 mM pH 8
DEPC = 50 µL diethylpyrocarbonate per mL of lysate
SDS = sodium dodecylsulfate
SK = sarcosyl
r.t. = room temperature
disc. = discard
" = same as the entry above

Variation attempt | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Yield
------------------|---|---|---|---|---|---|---|---|---|----|----|------
Original G/B | No | TE | 1X Pronase E | No | 2% SDS, 2% SK | No | 24 hrs, 4° C | No | No | No spin, no disc. | r.t. | 0.2
A | No | TE | 1X Pronase E | No | 2% SDS, 2% SK | No | 24 hrs, 4° C | No | Yes | Yes | r.t. | 1.0
B | No | TE | 1X Pronase E | No | 2% SDS, 2% SK | No | 24 hrs, 4° C | No | Yes | No spin, no disc. | r.t. | 9.0
C | No | TE | 1X Prot K | Yes | 2% SDS, 2% SK | No | 15 min, r.t. | No | No | No spin, no disc. | 4° C | 3.8
D | No | TE | 1X Prot K | Yes | 2% SDS, 2% SK | Yes | " | 2X Prot K | No | No spin, no disc. | 4° C | 9.6
E | Yes | TE | 2X Prot K | Yes | 2% SDS, 2% SK | No | " | 2X Prot K | No | No spin, no disc. | 4° C | 15.7
F-1 | No | TEN | 1X Lyso | Yes | 2% SDS, 2% SK | No | " | 2X Lyso | No | No spin, no disc. | 4° C | 4.3
F-2 | No | TEN | 1X Lyso | Yes | 2% Tween 20 | No | " | 2X Lyso | No | No spin, no disc. | 4° C | 5.5
G-1 | No | NaPO4 | 10X Pron E | Yes | 5% SDS, 5% SK | Yes | 15 min, r.t. | No | No | No spin, no disc. | 4° C | 3.8
G-2 | Yes | NaPO4 | 20X Pron E | Yes | 12% Tween 20 | Yes | " | 20X Pron E | No | No spin, no disc. | 4° C | 5.7
G-3 | No | NaPO4 | 10X Pron E | Yes | 5% SDS, 5% SK | Yes | " | No | No | No spin, no disc. | 4° C | 1.5
G-4 | Yes | NaPO4 | 30X Pron E | Yes | 12% Tween 20 | Yes | " | No | No | No spin, no disc. | 4° C | 0.7
G-5 | Yes | NaPO4 | 30X Pron E | Yes | 12% Tween 20 | Yes | " | No | No | No spin, no disc. | 4° C | 1.2
H | No | NaPO4 | 1X Prot K | Yes | 4% SDS, 4% SK | Yes | " | 2X Prot K | No | No spin, no disc. | 4° C | 8.9
I | Yes | NaPO4 | 2X Prot K | Yes | 4% SDS, 4% SK | Yes | " | 2X Prot K | No | No spin, no disc. | 4° C | 12.6

LEGEND
1 = Freeze cells before lysis
2 = Buffer
3 = Enzyme added
4 = 1st enzyme shake, 37° C, 30 min
5 = Detergent added
6 = DEPC added
7 = Duration and temperature of enzyme-detergent shake
8 = 2nd addition of enzyme, followed by detergent shake
9 = Shake with glass beads after chemical lysis
10 = Discard cell debris pellet prior to phenol shake
11 = Temperature for 1 hr phenol shake
12 = Yield in µg lpDNA per g cell pellet
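The yields in Table 1 can be compared directly against the unmodified G/B baseline. The following is a minimal sketch of that arithmetic, not part of the original protocol: the yield values are copied from Table 1 (µg lpDNA per g cell pellet), while the dictionary and function names are illustrative assumptions.

```python
# Yields in micrograms of leucoplast DNA per gram of cell pellet, copied from Table 1.
YIELDS = {
    "Original G/B": 0.2, "A": 1.0, "B": 9.0, "C": 3.8, "D": 9.6, "E": 15.7,
    "F-1": 4.3, "F-2": 5.5, "G-1": 3.8, "G-2": 5.7, "G-3": 1.5, "G-4": 0.7,
    "G-5": 1.2, "H": 8.9, "I": 12.6,
}


def fold_improvement(variation, baseline="Original G/B"):
    """Yield of a variation relative to the unmodified G/B technique."""
    return YIELDS[variation] / YIELDS[baseline]


def fraction_lost_with_pellet(discarded="A", saved="B"):
    """Fraction of recoverable DNA lost when the cell debris pellet is discarded.

    Variations A and B differ only in whether the pellet is discarded (column 10).
    """
    return 1.0 - YIELDS[discarded] / YIELDS[saved]


# Rank the variations from best to worst yield.
for name in sorted(YIELDS, key=YIELDS.get, reverse=True):
    print(f"{name:12s} {YIELDS[name]:5.1f} ug/g  {fold_improvement(name):5.1f}-fold")
```

Under these numbers, Variation E ranks first, and the comparison of A with B gives the fraction of leucoplast DNA that remained in the discarded cell debris pellet.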

Separation of DNAs from Various Organelles

Nuclear, plastid and mitochondrial DNAs in Chlamydomonas can be separated on the basis of their different A+T content by CsCl equilibrium density centrifugation (Chiang and Sueoka 1967). Separation is enhanced by adding bisbenzimide (Hoechst Dye No. 33258, Sigma), which intercalates in runs of A-T pairs at least three bases long and decreases the density of DNA (Muller & Gautier 1975). This technique has also been used successfully with P. obtusum, P. mirum (SAG 62-3), and P. uvella UTX 19 (Kieras and Chiang 1971; Scherbel, Behn and Arnold 1974). Whole-cell DNA is suspended in TE (10 mM Tris pH 8, 1 mM EDTA), to which is added 200 bisbenzimide and CsCl to an O.D. of 1.395, prior to centrifugation in 13 mL Quickseal tubes (Beckman) for 48 hours in a Beckman Ti-50 rotor at 48K rpm. The gradient is fractionated by puncturing the side of the tube just below each band (top, middle, and bottom) and withdrawing the band with a syringe. Buoyant densities of the three bands are determined by measuring the distance from each band to the bottom of the tube, and comparing to a known standard (Beckman). The dye is removed with CsCl-saturated isopropanol, then the CsCl is removed by six 6-hour dialyses in TE buffer.

Test of purity of separated DNA fractions from gradients

Approximately 20 micrograms of Hind III-restricted DNA from each gradient fraction was separated by electrophoresis in 0.7% agarose gels. DNA was transferred from the gel to GeneScreenPlus membranes (DuPont) by electroblotting in an EC230 electroblotter (E-C Apparatus Corp.), using DuPont protocols for GeneScreenPlus.

Southern hybridization used heterologous probes from C. reinhardtii containing known nuclear and chloroplast genes. The nuclear probe contained the nuclear Rrn18 plus Rrn25 genes, and the chloroplast probe contained the plastid rrn16 gene plus the 3' end of rrn23. The probes were labelled with α-32P dATP, using the Multi-Prime hexamer primer kit (Amersham). Optimized hybridization and wash conditions for discriminatory binding of the nuclear and chloroplast probes are described in Figures 5 and 6.

To conserve DNA, some hybridization was done on dot blots of 1 µg and 5 µg of fractionated Polytoma DNA on GeneScreenPlus membranes. The dot blots were hybridized to the rrn16/rrn23 chloroplast probe, plus an additional chloroplast probe containing the 3' two-thirds of the Chlamydomonas tufA gene plus part of the 3' untranslated region. Hybridization conditions for discriminatory binding in each of these two dot blot probings are described in Figures 7 and 8.
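The legends of Figures 7 and 8 express hybridizations in formamide as temperature equivalents (room temperature + 30% formamide as 49° C, and room temperature + 15% formamide as 38° C) without stating the conversion rule. The following is a minimal sketch that back-calculates a rule from those two stated points, assuming the relationship is linear in formamide percentage; the linearity, the function names, and the implied baseline temperature are assumptions, not values given in the text.

```python
def fit_line(p1, p2):
    """Return (slope, intercept) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


# (percent formamide, stated equivalent temperature in deg C) from the figure legends.
SLOPE, BASELINE = fit_line((15, 38), (30, 49))


def equivalent_temperature(percent_formamide):
    """Equivalent hybridization temperature under the assumed linear rule."""
    return BASELINE + SLOPE * percent_formamide


print(f"{SLOPE:.2f} deg C per % formamide, baseline {BASELINE:.1f} deg C")
```

The fitted slope and baseline reproduce the two stated equivalents by construction; any other formamide concentration computed this way is an extrapolation under the linear assumption.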

PCR Amplification

For PCR amplifications, DNA from whole-cell CTAB lysates of 1 L cultures was used, as described in Chapter III. Internal primers near the 5' and 3' ends of the nuclear Rrn18 gene and the plastid rrn16, tufA and rbcL genes were used for amplification. Primers, and amplification reaction conditions used, are shown in Tables 2 and 3.

Figure 2. Buoyant densities in CsCl gradients.

Whole cell DNA separated in CsCl gradients into various cellular fractions.

A): C. reinhardtii fractions identified as chloroplast (cp) DNA in top ("TOP") band; mitochondrial (mito) DNA and nuclear ribosomal DNA (nu rDNA) in the two minor middle ("MID") bands; and nuclear (nu) DNA in the bottom ("BOT") band (Chiang and Sueoka 1967, Bastia et al. 1971, Ryan et al. 1978).

B) & C): fractionation seen in two Polytoma species (Kieras and Chiang 1971). Note: fractionation (not shown) in P. mirum SAG 62-3 gives a very similar gradient (Scherbel et al. 1974).

D): fractionation seen in the study reported in this thesis. (Leucoplast and nuclear fractionation into top and bottom bands confirmed by hybridization to known chloroplast and nuclear probes.)

[Figure 2 image: buoyant densities of the cp DNA, mito DNA, nu rDNA and nu DNA bands (TOP, MID, BOT) in CsCl gradients for A) Chlamydomonas reinhardtii, B) Polytoma obtusum, C) Polytoma uvella 19 and D) Polytoma uvella 964.]

Figure 3. CsCl + bis gradient for P.u.964.

[Figure 3 image: CsCl + bisbenzimide gradient of Polytoma uvella 964 DNA, with TOP, MIDDLE and BOTTOM bands marked.]

Figure 4. P.u.964 DNA digested with Hind III.

Whole cell DNA from Polytoma uvella 964, separated in a CsCl gradient, then restricted with Hind III, and run through a 0.7% agarose gel. Note that the Hind III banding patterns on the gel are different between top (Top) and bottom (Bot) bands, reflecting the successful separation of leucoplast and nuclear DNA. However, the gel lane containing middle (Mid) band DNA shows the same banding pattern as in the nuclear (Bot) band lane, reflecting that some nuclear DNA is part of the middle band fractionation. This mixture of DNAs in the Mid band frequently occurred during fractionation.

[Figure 4 image: Polytoma uvella 964 DNA digested with Hind III; gel lanes REF, Top, Mid, Bot, shown beside the gradient bands labelled putative leucoplast genome (1.66, TOP), mitochondrial genome (1.69) and nuclear genome (BOT).]

Figure 5. P.uvella nuclear ribosomal RNA gene(s) identified.

Hybridization was at 28° C; washes at 28° C in 2X SSC, then at 42° C in 2X SSC + 0.5% SDS. Probe was nuclear Rrn18 and Rrn25 genes. Electroblot transfer contained CsCl-separated DNA from P.u.964 from top, middle and bottom bands in the gradient. The nuclear probe bound to a single Hind III fragment in the nuclear DNA lane (bottom band) but did not bind to DNA in the leucoplast lane (top band). (Note: probe also bound to the same-size fragment in the middle band lane, since the middle band separation from gradients frequently contained a mixture of DNA from top and/or bottom gradient bands.)

Demonstrates successful separation of leucoplast DNA from nuclear DNA using CsCl gradient.

[Figure 5 image: P. uvella nuclear ribosomal RNA gene(s) identified; gel and blot lanes REF, Top, Mid, Bot with size markers. Probe is Chlamydomonas nuclear genes for 18S and 25S rRNAs.]

Figure 6. P.uvella 964 leucoplast ribosomal RNA gene(s) identified.

A) = photograph of the agarose gel before transfer.
B) = X-ray film after hybridization.
C) = Diagram (to scale) of X-ray film.

Hybridization was at 28° C; washes at 28° C in 2X SSC, then at 42° C in 2X SSC + 0.5% SDS. Probe was chloroplast rrn16 gene plus 3' end of rrn23 gene. Electroblot transfer contained CsCl-separated DNA from P.u.964 from top, middle and bottom bands in the gradient. Chloroplast probe bound to two fragments in the leucoplast DNA lane (top band). Probe also bound to the same fragment in the bottom (nuclear) band lane that bound to the nuclear probe used in Figure 5, presumably because nuclear Rrn18 and Rrn25 genes and plastid rrn16 and rrn23 genes (the two probes for Figures 5 and 6) are homologous genes. (Note: probe also bound to the same nuclear fragment in the middle band lane, reflecting the frequent occurrence of nuclear DNA mixed into the middle band fractionation in the CsCl gradient.)

Demonstrates successful separation of leucoplast DNA from nuclear DNA using CsCl gradient.

[Figure 6 image: lanes REF, Top, Mid, Bot, REF.]

Figure 7. Dot blot with rrn16/rrn23.

P.uvella 964 leucoplast ribosomal RNA gene(s) identified on dot blots. Rows C, D and E on the dot blot contained CsCl-separated DNA from P.u.964, from top (leucoplast), middle (mixture) and bottom (nuclear) gradient bands. Probe was the chloroplast rrn16 gene plus the 3' end of the rrn23 gene. Columns 1, 6 and 11 contained 1 µg of DNA. Columns 2, 7 and 12 contained 5 µg of DNA.

Hybridizations I and II were at 28° C, then washed at two different stringencies. Neither condition was stringent enough to show discriminatory binding of the chloroplast probe on top band (leucoplast) DNA dots. More stringent conditions for Hybridization III were: hybridization at the equivalent of 49° C (room temperature + 30% formamide); two initial washes in 2X SSC at r.t.; more stringent wash at 55° C in 2X SSC + 1% SDS. These stringent conditions show discriminatory binding of the chloroplast probe on leucoplast DNA dots (plus middle dots, which are usually a mixture of top and bottom bands in the gradients), but not on nuclear DNA dots.

[Figure 7 image: three dot blots, all probed with chloroplast rrn16 & rrn23 rDNA. Hybr. I: 28° C. Hybr. II: 28° C, with a more stringent wash than in Hybr. I. Hybr. III: 49° C. Neither Hybr. I nor Hybr. II provides discriminatory binding conditions; Hybr. III conditions provide discriminatory binding of chloroplast probe on leucoplast (top band) DNA dots.]

Figure 8. Dot blot with tufA.

P.uvella 964 leucoplast tufA gene identified on dot blots. (See Figure 7 for content of DNA on dots.) Probe was the 3' 2/3 of tufA coding region plus part of 3' UT region. Hybridization III was a control reaction because labelled 1 kb ladder (BRL) was present. Lack of signal shows that 1 kb ladder is not binding to any of the Polytoma DNA.

Hybridization I was at 49° C equivalent (r.t. + 30% formamide); Hybridization II at 38° C equivalent (r.t. + 15% formamide). Both were washed at r.t. in 2X SSC; then at 55° C in 2X SSC + 1% SDS. Both stringency conditions show discriminatory binding of the tufA probe on leucoplast DNA dots (plus middle dots, which are usually mixtures of top and bottom gradient bands), but not on nuclear DNA dots.

[Figure 8 image: three dot blots. Hybr. I (49° C) and Hybr. II (38° C): probe is chloroplast tufA; both conditions show discriminatory binding of chloroplast probe on leucoplast (top band) DNA dots. Hybr. III (49° C): probe is 1 kb ladder; control (see legend).]

Table 2. Primers used for PCR amplification of Rrn18, rrn16, tufA, rrn23, & rbcL genes

Gene | Primer | Direction of priming | Sequence (a) | Location
-----|--------|----------------------|--------------|---------
Rrn18 | SSU1A | 5'-3' | 5'-TGGTTGATCCTGCCAGTAG-3' | 5 - 23 (b)
Rrn18 | SSU2 | 3'-5' | 3'-CACTTGGACGTCTTCCTAGT-5' | 1768 - 1788
rrn16 | A-17 | 5'-3' | 5'-GtttGATCCTGGCTCAC-3' | 12 - 29 (b)
rrn16 | 5005-15 | 3'-5' | 3'-CATGTGTGGCGGGCA-5' | 1330 - 1315
tufA | 1F | 5'-3' | 5'-GGDCAYGTTGAYCAYGG-3' | 55 - 71 (b)
tufA | 5R | 3'-5' | 3'-TGACANCCRCGRCCRCA-5' | 1222 - 1238
tufA | 1130R | 3'-5' | 3'-CCRATACGGDCCACTRGC-5' | 1124 - 1141
rrn23 | A5 | 5'-3' | 5'-AGAGGCGATGAAGGACGUG-3' | 40 - 60 (c)
rrn23 | M3 | 3'-5' | 3'-tcttcacgcttacgact-5' | 1244 - 1260
rrn23 | M5 | 5'-3' | 5'-AGAAGTGCGAATGCTGA-3' | 1244 - 1260
rrn23 | Z3 | 3'-5' | 3'-GGATCATGCTCTCCTGG-5' | 2650 - 2670
rbcL | 2 | 5'-3' | 5'-GCTTACGTAGCTTAC-3' | 1485 - 1499 (b)
rbcL | 9 | 3'-5' | 3'-ACCCCATTGCGAGGT-5' | 2435 - 2421
rbcL | 3 | 5'-3' | 5'-GTAGAACGTGACAAA-3' | 1659 - 1673
rbcL | 4 | 5'-3' | 5'-GCTGGTACTTGTGAA-3' | 1920 - 1934
rbcL | 8 | 3'-5' | 3'-ACCACAAATCCAATT-5' | 2688 - 2674

(a) = forward primers are reported as bases on sense strand, reverse primers on anti-sense strand
(b) = sequence from C. reinhardtii
(c) = sequence from E. coli

Table 3. Amplification conditions used for PCR amplification of Rrn18, rrn16, rrn23, tufA, & rbcL genes in several Chlamydomonads

Gene | Amplification conditions | Primers used | Amplification product size (in bps)
-----|--------------------------|--------------|------------------------------------
Rrn18 | 1 min @ 94° C, 2 min @ 50° C, 2 min @ 72° C, 25 cycles, 3 mM Mg | SSU1A to SSU2 | 1780
rrn16 | (same as above) | A-17 to 5005-15 | 1300
tufA | (same as above, but 55° C anneal, 3 mM Mg) | 1F to 5R | 1200
tufA | (same as above, but 55° C anneal, 3 mM Mg) | 1F to 1130R | 1100
rbcL | (same as Rrn18) | 3 to 9 | 775
rbcL | (same as Rrn18) | 3 to 9 | 600 in P.u.19 (expected 775)
rbcL | (same as Rrn18) | 4 to 8 | 1000 in P.u.964 (expected 775)
rbcL | (same as Rrn18) | 2 to 9 | 950
rrn23 | (same as Rrn18) | A5 to M3 | 1200
rrn23 | (same as Rrn18) | M5 to Z3 | 1400

RESULTS

Conditions for Growing Polytoma Cultures

Polytoma cultures grow either in liquid medium or on solid agar plates. Pringsheim's Polytomella Medium (2 g sodium acetate, 1 g yeast extract, 1 g tryptone in 1 L distilled water) is routinely used, but the cultures also grow in many other algal media, as long as acetate is added as a fixed-carbon source. Long-term survival requires transferring plate cultures to fresh plates, or adding 9-10 volumes of fresh medium to liquid cultures, at monthly intervals.

Methods for Harvesting DNA from Polytoma Cultures

For large-batch high-density growth of liquid cultures prior to harvesting for DNA extractions, liquid cultures should be diluted 1:10 with fresh medium every 3 to 5 days, and harvested after the second or third sequential dilution. No aeration of the liquid cultures is needed. Prior to lysis, a Millipore cell concentrator (or multiple centrifugations in 500 mL tubes for 10 min at 5000 rpm) can be used to concentrate the cells from 20 - 40 L of liquid culture to 50 or 100 mL.

Comparison of Lysis Methods

Figure 1 shows the yield of DNA from Polytoma uvella 964 cells with various modifications of the Gillham & Boynton technique. The short bar at the far left on the graph shows the yield using the original G/B technique, unmodified, which produced only 200 ng of Polytoma leucoplast DNA per gram of cells harvested. An attempt to discard the cell debris pellet after cell lysis, but before phenol extraction, revealed that 90% of the leucoplast DNA was still in the cell debris pellet (compare Variations A and B). For all variations other than A, therefore, the cell debris was included in the phenol extraction and not discarded. Shaking the cells with glass beads improved the yield 45-fold (Variation B), but sheared the DNA badly. Background smears in lanes of restricted leucoplast DNA on agarose gels obscured the leucoplast restriction fragments (data not shown), so mechanical lysis was abandoned. Most variations on the G/B technique improved the yield at least 10-fold, and one (Variation E) showed an 80-fold improvement, yielding 16 micrograms of leucoplast DNA per gram of cells. Proteinase K (protease P-0390, Sigma) lysed much better than Pronase E (protease P-5147, Sigma) or lysozyme. Much more leucoplast DNA was recovered from frozen cells, and the combination of a Proteinase K lysis of frozen cells gave the best yield.

Banding Pattern in CsCl Gradients of P.u. 964 DNA

A CsCl separation of DNA in P.u. 964 produces banding patterns that are analogous to those seen in Chlamydomonas reinhardtii and three other Polytoma species (Figure 2). The number of separated DNA bands after centrifugation, the relative positions of the bands, and the relative intensities of the bands are similar in all five chlamydomonad species. In C. reinhardtii, P. obtusum, P. uvella 19, and P. mirum 62-3, the bottom band in the gradient contains nuclear DNA and the top band contains chloroplast or leucoplast DNA (Chiang and Sueoka 1967; Kieras and Chiang 1971; Scherbel, Behn and Arnold 1974; Siu, Chiang and Swift 1975b). In C. reinhardtii, the lower of the two minor bands in the middle of the gradient consists of the tandem repeats of nuclear rDNA genes, and the upper of the two minor middle bands contains mitochondrial DNA (Bastia et al. 1971, Ryan et al. 1978).

Figure 3 shows a CsCl gradient for Polytoma uvella 964 from my study. This gradient is notable in two respects: the yield of leucoplast DNA (top band) was unusually high, and this is one of very few gradients in which both of the minor bands in the middle fraction were visible. (The distinct separable appearance of these two middle bands is not as apparent in the photograph as it was to the eye, since the photograph was not taken quite perpendicular to the gradient.) The two minor middle bands were quite faint in most P.u. 964 gradients, and the upper middle band was rarely visible; as a result it was difficult to remove a middle fraction from most gradients, and the middle fraction probably was contaminated with remnants of the large bands located above and below it. Visual analysis of Hind III-restricted DNA from middle fractions of the gradient, run in an agarose gel (example in Figure 4), confirmed this mixing, showing Hind III fragments characteristic of either the top or bottom bands, or both. This mixing was also confirmed by Southern hybridization results described below.

Purity of DNA Separation in CsCl gradient

Southern hybridization confirmed that CsCl bands of P.u. 964 DNA corresponded to those in the other chlamydomonad species shown in Figure 2. A nuclear gene probe found a homologous fragment in the bottom band gradient fraction of P.u. 964, but gave no signal on DNA from the top band fraction (Figure 5). A chloroplast DNA probe found homologous fragments in the top-band gradient fraction, but not the bottom band fraction (Figure 6). However, these nuclear and chloroplast probes also confirm that the middle-band DNA fraction extracted from the gradient was a mixture, contaminated with top and bottom band DNA, since both nuclear and plastid probes found homologous fragments in the middle-band fraction. The nuclear Rrn18/Rrn25 probe hybridized to a 4.7 kb Hind III fragment in both bottom and (mixed) middle gradient fractions (Figure 5). The chloroplast rrn16/rrn23 probe hybridized to 6.2 kb and 7.2 kb Hind III fragments in both top and (mixed) middle gradient fractions (Figure 6). This plastid probe also hybridized to the same 4.7 kb fragment in the bottom and (mixed) middle fractions that hybridized with the nuclear Rrn18/Rrn25 probe, presumably because the plastid rrn23 gene and the nuclear Rrn25 gene are homologs with analogous functions in each organelle.

Hybridizations on dot blots also showed preferential binding of two chloroplast probes to DNA from the top fraction in CsCl gradients, and again showed that the middle band fraction contains a mixture of DNAs. The rrn16/rrn23 probe showed a strong signal on top fraction dots and a moderate signal on dots from the (mixed) middle fraction (Figure 7). It showed virtually no signal on dots from the bottom fraction, even at a stringency that showed cross-hybridization to a phage lambda DNA control dot. The tufA probe showed a similar signal pattern on dot blots: strong signal with the top fraction, moderate signal with the (mixed) middle fraction, and virtually no signal with the bottom fraction or any of the negative controls, even at a lower stringency than was discriminatory with the rrn16/rrn23 probe (Figure 8).

Figure 9. Leucoplast DNA digested with various restriction enzymes.

[Figure 9 image: restriction digests (including Hind III) of Polytoma uvella 964 leucoplast DNA, shown beside the gradient bands labelled TOP (1.66), MID (1.69) and BOT (1.73, 1.75).]

Size of the Leucoplast Genome

Sal I and Hind III restriction produced the clearest restriction fragment patterns for leucoplast DNA in P.u. 964 (Figure 9). The sum of the fragment sizes in a Sal I digest is 220 kb; the sum of fragment sizes in a Hind III digest is 170 kb.

Amplification of Conserved Nuclear and Plastid Genes

Leucoplast genes were initially amplified from Polytoma uvella UTX 964, Polytoma uvella UTX 19 and Chlamydomonas humicola SAG 11-9, hereafter called P.u.964, P.u.19 and C. humicola. Three chlamydomonad species (two non-green and one green) were later added to the leucoplast gene study: Polytoma obtusum DH1, Polytoma sp. SAG 62-27 and Chlamydomonas dysosmos UTX 2399, hereafter called P. obtusum, P. 62-27 and C. dysosmos.

PCR products of the expected sizes (1780 bps and 1300 bps, respectively) were obtained, initially, for Rrn18 from P.u.964, and for rrn16 from P.u.964, P.u.19 and C. humicola. Subsequent amplifications produced Rrn18 from the twelve other Polytoma species listed in Table 4 in Chapter III; and rrn16 from P. obtusum, P. 62-27 and C. dysosmos. Products of tufA of the expected sizes (1100 bps for P. obtusum and 1200 bps for the other three) were amplified from P. obtusum, P. 62-27, C. humicola and C. dysosmos. The identity of these genes was confirmed by sequencing (see Chapters III and IV). We obtained PCR products of the expected sizes (1200 bps and 1400 bps, respectively) for the 5' one-half and the 3' one-half of rrn23 from P.u.964 and P.u.19. Internal primers for rbcL generated a PCR product in C. humicola (representing the middle 75% of the gene) that was the expected size (950 bps), but we could not duplicate this amplification in two Polytoma species, since several rbcL primers failed to work in these Polytoma species. A product of the expected size (775 bps) in P.u.964 was obtained using primers for the middle half of the rbcL gene, but a product representing the 3' half of rbcL was 200 bases longer than expected in P.u.964. In P.u.19, the product representing the middle half of rbcL was 200 bases shorter than expected.
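The expected product sizes quoted above can be checked against the primer locations in Table 2. The following is a minimal sketch of that arithmetic, assuming each expected product spans from the first base of the forward primer to the last base of the reverse primer in the reference numbering (C. reinhardtii or E. coli), so sizes in Polytoma are only approximate. The coordinates are copied from Table 2; the dictionary and function names are illustrative.

```python
# First base of each forward primer and last base of each reverse primer (Table 2).
FORWARD_START = {
    "SSU1A": 5, "A-17": 12, "1F": 55, "A5": 40, "M5": 1244,
    "2": 1485, "3": 1659, "4": 1920,
}
REVERSE_END = {
    "SSU2": 1788, "5005-15": 1330, "5R": 1238, "1130R": 1141,
    "M3": 1260, "Z3": 2670, "9": 2435, "8": 2688,
}


def expected_size(forward, reverse):
    """Expected amplicon length in bp, primers included."""
    return REVERSE_END[reverse] - FORWARD_START[forward] + 1


def size_shift(observed, forward, reverse):
    """Observed minus expected product size; positive means longer than expected."""
    return observed - expected_size(forward, reverse)


# Primer pairs as listed in Table 3.
for fwd, rev in [("SSU1A", "SSU2"), ("A-17", "5005-15"), ("1F", "5R"),
                 ("1F", "1130R"), ("A5", "M3"), ("M5", "Z3"),
                 ("3", "9"), ("4", "8"), ("2", "9")]:
    print(f"{fwd} to {rev}: {expected_size(fwd, rev)} bp")
```

The computed values agree with the rounded expected sizes in Table 3, and `size_shift(1000, "4", "8")` and `size_shift(600, "3", "9")` give shifts of roughly 200 bp in opposite directions for the two anomalous rbcL products.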

Amplification reactions with tufA primers gave a product of the expected size (1200 bps) from P.u.964 whole-cell lysates. Multiple amplifications were performed, pooled, and purified using Gene Clean (Bio 101), then sequenced, along with the rest of the leucoplast genes sequenced for the study reported in Chapter IV. The sequence of this tufA amplification product from P.u.964 lysates was aligned to a large array of algal, cyanobacterial and eubacterial tufA sequences (described in Delwiche et al. 1995). It was discovered that the DNA template sequenced was from a contaminating gram-positive bacterium closely related to Bacillus, rather than from the leucoplast gene in P.u. 964. Calculations of divergence differences show that the contaminant sequence is 79% similar to a Bacillus tufA sequence, and only 70% similar to the sequence in Chlamydomonas humicola, the chlorophyte alga to which the contaminant sequence is most closely related. Confirming these divergences, parsimony and neighbor-joining trees show the contaminant sequence grouped with Bacillus among the eubacteria. Finally, the contaminant sequence has only one amino acid between E. coli amino acid positions 179 and 180, and no amino acids between E. coli 348 and 349. This is typical for many eubacteria, but not typical for cyanobacterial and algal sequences, which fill those locations with ten amino acids and 5-16 amino acids, respectively (Delwiche et al. 1995). The stage at which the contamination occurred is unknown, but will be considered and investigated. Whether this situation implies anything concerning the presence or absence of a tufA gene in the leucoplast genome of P.u.964 is debatable. This concern will also be considered, and appropriate experiments designed to try to answer that question.
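The similarity percentages above come from comparing aligned sequences. The following is a minimal sketch of one common way to compute such a figure, assuming similarity means percent identity over aligned positions with gap columns excluded; the text does not state which measure was actually used, and the short sequences below are hypothetical placeholders, not data from this study.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over aligned positions where neither has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments, for illustration only.
query = "ATGGCTAAAG"
reference_1 = "ATGGCAAAAG"   # one mismatch, no gaps
reference_2 = "ATG-CTCGAG"   # one gap column, two mismatches

print(percent_identity(query, reference_1))
print(round(percent_identity(query, reference_2), 1))
```

Ranking a query against several references in this way shows which reference it most resembles, which is the comparison the paragraph above draws between the Bacillus and C. humicola sequences.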

DISCUSSION

Lysis of Polytoma Cells

Much manipulation of Gillham and Boynton's lysis technique allowed its adaptation for use in Polytoma. However, even though modifications improved my initial yield using this technique by 80-fold, the yield is still quite low in terms of number of micrograms of leucoplast DNA. Given the amount of DNA needed for Southern hybridization and restriction mapping, the low yield still represents a bottleneck for future experiments. Other laboratories (e.g. dePamphilis and Palmer 1990) use CTAB lysis (the same technique described in Chapter III to get DNA from 1-L cultures for PCR) to prepare DNA for CsCl fractionation. This alternate use of CTAB will be evaluated in the future.

Separation of Plastid and Nuclear DNA in Polytoma

Leucoplast and nuclear DNAs of P.u.964 were successfully separated from each other on the basis of their different A+T contents by using CsCl-bisbenzimide gradients. All electroblot and dot blot hybridizations at discriminatory stringencies confirmed the presence of nuclear DNA in the bottom gradient fraction but not the top fraction, and confirmed leucoplast DNA in the top gradient fraction but not the bottom fraction. However, middle gradient fractions (expected to contain mitochondrial DNA and nuclear rDNA) were always a mixture of DNA fractions and contained varying amounts of leucoplast and/or nuclear DNA as contaminants.

Location of Conserved Nuclear and Plastid Genes in Several Polytoma Species

Restriction analyses, gradient studies, and hybridizations were confined to one Polytoma species, P.u.964; but PCR products for nuclear Rrn18 and leucoplast rrn16, rrn23, tufA, and rbcL were obtained from several Polytoma species. The identities of nuclear Rrn18 genes in 13 Polytoma species were confirmed by sequencing, as part of the molecular phylogenetic analysis reported in Chapter III. The identities of leucoplast rrn16 and tufA genes were similarly confirmed for two or three Polytoma species, and the sequences formed the databases for the leucoplast functionality and substitution rate studies reported in Chapters IV and V. The identities of the PCR products obtained from several Polytoma species using rrn23 and rbcL primers are not yet confirmed.

Genome Size in P.u.964

Several factors could contribute to the large range, from 170 kb in a Hind III digest to 220 kb in a Sal I digest, in the estimate of leucoplast genome size in P.u.964. The Hind III digest may contain undetected doublet or triplet bands, or some fragments may be so small they ran off the gel, leading to an under-estimate of the size. Either, or both, restrictions may be incomplete digestions, leading to an over-estimate of the size. Since Sal I digestion was only attempted one time, I do not know whether incomplete digestion occurred. However, many Hind III leucoplast digestions were observed in gels over the course of this leucoplast study, and all digestions produced the same fragments. Therefore, the Hind III digestions are almost certainly not incomplete digestions, so 170 kb should be an accurate minimum estimate of the leucoplast genome size in this Polytoma species, although the actual size could be larger.
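The two sources of error described above pull the fragment-sum estimate in opposite directions. The following is a minimal sketch of that reasoning; the fragment sizes are hypothetical values chosen only for illustration (the individual fragment sizes from the Sal I and Hind III digests are not listed in this chapter), and the function name is an assumption.

```python
def genome_size_estimate(fragments_kb, multiplicity=None):
    """Sum restriction fragment sizes, counting co-migrating bands more than once.

    multiplicity maps a band size to the number of distinct fragments it contains
    (2 for a doublet, 3 for a triplet); bands not listed are counted once.
    """
    multiplicity = multiplicity or {}
    return sum(size * multiplicity.get(size, 1) for size in fragments_kb)


# Hypothetical complete digest, with band sizes in kb.
bands = [40, 30, 25, 12, 8]

# Naive sum: every visible band counted once.
naive = genome_size_estimate(bands)

# If the 25 kb band is really a doublet, the naive sum is an under-estimate.
with_doublet = genome_size_estimate(bands, {25: 2})

# A partial-digestion product (here 25 + 12 = 37 kb) adds a band that
# double-counts sequence, giving an over-estimate.
with_partial = genome_size_estimate(bands + [37])

print(naive, with_doublet, with_partial)
```

This is why a reproducible digest pattern, such as the repeated Hind III digests, supports treating its sum as a minimum estimate: partial products would vary between digests, while unresolved doublets would not.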

The size estimates of 170 kb and 220 kb for the leucoplast genome in P.u.964 are similar to, or somewhat smaller than, sizes of chloroplast genomes in three Chlamydomonas species. By restriction mapping, the chloroplast genome size is estimated as 190, 240, and 290 kb in C. reinhardtii, C. moewusii UTX 97, and C. eugametos UTX 9, respectively (Rochaix 1978, Turmel et al. 1987, Lemieux et al. 1987). Siu et al. (1975c) estimated the size of the leucoplast genome in P. obtusum at 450 kb, calculating kinetic complexity using solution renaturation kinetic studies. Since this method also estimated the size of the chloroplast genome in C. reinhardtii at 300 kb (Siu, Chiang and Swift 1975c), whereas restriction mapping found only 190 kb (Rochaix 1978), these kinetic calculations probably overestimated the size of the Polytoma genome.

The range of known chloroplast genome sizes in chlorophyte algae is much broader than the 135 kb to 160 kb range in most angiosperms (Palmer 1991, Birky 1988). Chloroplast genome sizes of chlorophyte algae include 85 kb in Codium fragile (Hedberg et al. 1981), and 175 kb in Chlorella ellipsoidea (Yamada 1983), as well as the larger sizes already cited in the three green chlamydomonads.

The size difference I detected between the leucoplast genome in P.u.964 and chloroplast genomes in green chlamydomonads is not as large as the size differences seen between leucoplast and chloroplast genomes in other non-photosynthetic plants and algae that have been studied. The leucoplast genome in the non-green angiosperm Epifagus virginiana is half the size of the chloroplast genome in tobacco, and the leucoplast genome in Conopholis americana (closely related to Epifagus) is only one third the size of the chloroplast genome in tobacco (dePamphilis and Palmer 1990, Wimpee et al. 1991, Colwell 1994). The leucoplast genome in the non-photosynthetic euglenoid Astasia longa is only half the size of the chloroplast genome in Euglena (Seimeister and Hachtel 1989). Comparisons to the chloroplast genomes in closely-related green species showed that the missing DNA in Epifagus and Astasia contained mostly photosynthetic genes. Also deleted from the genomes in Epifagus and Conopholis were some genes required for protein synthesis: plastid tRNAs, ribosomal protein genes, and RNA polymerase subunits (Wolfe et al. 1992, Wimpee et al. 1992). Epifagus and Astasia leucoplast genomes may represent stages in the progressive deletion of nonfunctional photosynthetic genes that may not yet be finished, since portions of obviously-truncated pseudogenes are still present in the genome. The reason for this reduction in size is unknown. It might be because the rate of deletion exceeds that of duplication or other mechanisms for increasing genome size. In favor of this interpretation is the scarcity of duplicated genes, transposable elements, imported DNA from the nucleus or mitochondria, and nonfunctional DNA in general, in chloroplast genomes, all of which speak to a low rate of events that would increase genome size (Birky 1988, Gillham 1994, E. Harris and R. Spreitzer, pers. comm.).

The circular chloroplast genome in C. reinhardtii replicates after D-loop formation, with both Cairns and rolling circle replication detected (Wu et al. 1986, Bogorad 1991). The particular chloroplast DNA molecules replicated (to achieve the appropriate copy number) are chosen randomly, with the result that some molecules may be chosen for replication more than once per cell cycle, and some molecules may not be replicated at all (Birky 1983, 1994). Given these data, it seems reasonable to surmise that there may be intracellular selection favoring smaller plastid genomes. Smaller molecules may have a replicative advantage, possibly because they finish replicating sooner than larger circles and so are available sooner than larger molecules to be replicated a second time by chance. Detailed mapping and sequencing of the leucoplast genome in P.u.964 will be required to confirm the genome size, identify the retained sequences, and determine if new sequences have been acquired by duplication or other processes.

ACKNOWLEDGMENTS

Several of Dr. Birky's students used the initial amplification searches in this project to learn how to optimize PCR conditions. Sam Shank, Dave Schreiber, and Bob Rumpf helped locate the nuclear Rrn18 gene in P.u.964 and several leucoplast genes in P.u.964, P.u.19 and C. humicola. They then went on to do Rrn18 sequencing in other Polytoma and Chlamydomonas species, reported in Chapter III. I then used our optimized reaction conditions to amplify the leucoplast genes from the particular Polytoma and Chlamydomonas species I chose to analyze for the work reported in Chapters IV and V.

Algal clones for nuclear and chloroplast probes for hybridization were kindly provided by E. Harris, N. Gillham, and J. Boynton at Duke University; J.-D. Rochaix at the University of Geneva; R. W. Lee at Dalhousie University; and J. Palmer at Indiana University. Amplification and sequencing primers were kindly provided by P. Fuerst at the Ohio State University for Rrn18, rrn16, and rrn23; by J. Palmer for tufA; and by R. Spreitzer at the University of Nebraska for rbcL. C. Woese's and G. Olsen's Ribosomal Data Base was invaluable, both for aligning my rrn16 sequences and for modeling rRNA secondary structures. A large array of algal, cyanobacterial and eubacterial tufA sequences, kindly provided by C. Delwiche and J. Palmer at Indiana University, was invaluable for aligning my tufA sequences. Also, C. Delwiche was the first to notice anomalies in the putative P.u.964 tufA sequence that I subsequently concluded was from a bacterial contaminant. We thank him for his sharp eyes.

Literature Cited

Bastia, D., K. S. Chiang, H. Swift and P. Siersma, 1971 Heterogeneity, complexity, and repetition of the chloroplast DNA of Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. USA 68: 1157-1161.

Birky, C. W., Jr., 1983 Relaxed cellular controls and organelle heredity. Science 222: 468-475.

Birky, C. W., 1988 Evolution and variation in plant chloroplast and mitochondrial genomes, in Plant Evolutionary Biology, edited by L. D. Gottlieb and K. J. Subodh. Chapman and Hall, New York.

Birky, C. W., Jr., 1994 Relaxed and stringent genomes: Why cytoplasmic genes don't obey Mendel's laws. J. Hered. 85: 355-365.

Bogorad, L., 1991 Replication and transcription of plastid DNA, in The Molecular Biology of Plastids, edited by L. Bogorad and I. K. Vasil. Academic Press Inc., New York.

Chiang, K.-S. and N. Sueoka, 1967 Replication of chloroplast DNA in Chlamydomonas reinhardtii during vegetative cell cycle: its mode and regulation. Biochemistry 57:

Colwell, A., 1994 Genome evolution in a non-photosynthetic plant, Conopholis americana. Washington University.

Delwiche, C. F., M. Kuhsel and J. D. Palmer, 1995 Phylogenetic analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Molecular Phylogenetics and Evolution 4: 110-128.

dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Gillham, N. W., 1994 Organelle Genes and Genomes. Oxford University Press, Oxford.

Grant, D. M., N. W. Gillham and J. E. Boynton, 1980 Inheritance of chloroplast DNA in Chlamydomonas reinhardtii. Proc. Nat. Acad. Sci. USA 77: 6067.

Hedberg, M. F., Y. S. Huang and M. H. Hommersand, 1981 Size of the chloroplast genome in Codium fragile. Science 213: 445-447.

Huss, V. A. R., K. H. Wein and E. Kessler, 1988 Deoxyribonucleic acid reassociation in the taxonomy of the genus Chlorella. Archives of Microbiology 150: 509-511

Kerfin, W. and E. Kessler, 1978 Physiological and biochemical contributions to the taxonomy of the genus Prototheca. II. Starch Hydrolysis and base composition of DNA. Archives of Microbiology 116: 105-107.

Kieras, F. J. and K.-S. Chiang, 1971 Characterization of DNA components from some colorless algae. Exp. Cell. Res. 64: 89-96.

Lemieux, C., M. Turmel, V. L. Seligy and R. W. Lee, 1985 The large subunit of rubisco is encoded in the IR sequence of the Chlamydomonas eugametos chloroplast genome. Current Genetics 9: 139-145.

Muller, W. and F. Gautier, 1975 Interactions of heteroaromatic compounds with nucleic acids. A*T-specific non-intercalating DNA ligands. Eur. J. Biochem 54: 385-394.

Palmer, J. D., 1991 Plastid chromosomes: Structure and evolution, in The Molecular Biology of Plastids, edited by L. Bogorad and I. K. Vasil. Academic Press, Inc., New York.

Patterson, D. J. and J. Larsen (Editors), 1991 The Biology of Free-Living Heterotrophic Flagellates. Clarendon Press, Oxford.

Pore, R. S., 1985 Prototheca taxonomy. Mycopathologia 90: 129-139.

Pringsheim, E. G., 1963 Farblose Algen. Gustav Fischer Verlag, Stuttgart.

Rochaix, J. D., 1978 Restriction endonuclease map of the cpDNA of Chlamydomonas reinhardtii. J. Mol. Biol 126:

Ryan, R., D. Grant, K. S. Chiang and H. Swift, 1978 Isolation and characterization of mitochondrial DNA from Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. USA 75: 3268-3272.

Scherbel, G., W. Behn and C. G. Arnold, 1974 Untersuchungen zur genetischen Funktion des farblosen Plastiden von Polytoma mirum. Arch. Microbiol. 96: 205-222.

Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382.

Turmel, M., G. Bellemare and C. Lemieux, 1987 Physical mapping of differences between the chloroplast DNAs of the interfertile algae Chlamydomonas eugametos and Chlamydomonas moewusii. Current Genetics 11: 543-552.

Wimpee, C. F., R. Morgan and R. L. Wrobel, 1992 Loss of transfer RNA genes from the plastid 16S-23S ribosomal RNA gene spacer in a parasitic plant. Curr. Genet. 21: 417-422.

Wimpee, C. F., R. L. Wrobel and D. K. Garvin, 1991 A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol. Biol. 17: 161-166.

Wolfe, K. H., C. W. Morden, S. C. Ems and J. D. Palmer, 1992 Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304-317.

Wu, M., J. K. Lou, C. H. Chang, Z. Q. Nie and X. M. Wang, 1986 Initiation of chloroplast DNA replication, in Extrachromosomal Elements in Lower Eukaryotes, edited by R. B. Wickner, A. Hinnebusch, A. M. Lambowitz, I. C. Gunsalus and A. Hollaender. Plenum Press, New York.

CHAPTER III

GENERAL BACKGROUND

Choice of Gene for Molecular Phylogenetic Analysis

Based on phenotypic criteria, several Polytoma species were placed taxonomically near Chlamydomonas species in the family Chlamydomonadaceae among the chlorophyte algae. The phenotypic data used included basic cell morphology, sexual cycle characteristics, and biochemical properties of cell walls, carotenoids, and stored starch (Pringsheim 1963; Links, Verloop and Havinga 1963). The molecular study of 13 Polytoma species reported in Chapter III used nucleotide sequences of the Rrn18 gene. Rrn18 codes for the cytoplasmic small subunit ribosomal RNA that participates in translating messenger RNA from the nucleus. Rrn18 is one of the most slowly evolving genes and thus might not be expected to show significant differences among species or genera belonging to the same family. However, Chlamydomonadaceae is an ancient family, showing more divergence within it than exists between angiosperms and gymnosperms (Chapman and Buchheim 1991). Also, many more sequences are available from Chlamydomonas species or other genera in the Chlamydomonadales and Volvocales for Rrn18 than for the faster-evolving Rrn25 gene. An analysis of Rrn18 sequences was able to answer the questions posed, including the main questions concerning the number of origins of Polytoma species and their relationships to certain Chlamydomonas species; only the relationships within a large clade of very closely related Polytoma species could not be resolved.

Ribosomal RNA genes have unique attributes that contribute to the wide use of their sequences in phylogenetic analysis, not only for global "trees of life" (Gray et al. 1984, Pace et al. 1986, Sogin et al. 1986, Cedergren et al. 1988, Gunderson et al. 1987, Woese and Pace 1993), but also for elucidating relationships within the chlorophytes and chlamydomonads (Rausch et al. 1989; Buchheim et al. 1990; Chapman and Buchheim 1991; Buchheim and Chapman 1991, 1992; Turmel et al. 1993; Buchheim et al. 1994). Ribosomal RNA genes are ubiquitous, and their rRNA products perform equivalent functions in all organisms and in mitochondria, chloroplasts, and the cytoplasm. The rRNA consists of alternating regions of high and low conservation. In the conserved regions the primary structure of rRNA is extremely well conserved, and is invariant at enough locations to allow unambiguous alignment of most of the molecule from almost any organism. Alignment is further aided by conservation of the secondary structure of the rRNA, which provides additional support for unambiguous alignment of functionally equivalent positions in the folded rRNA molecule. Other regions of the sequence evolve at much higher rates and are highly variable, both in nucleotide sequence and in length. Differences in these regions can define relationships among closely related species (Hill et al. 1990, Sogin 1991, Hillis and Dixon 1991, Olsen and Woese 1993).

Phylogenetic Inference Using Sequence Data

Once aligned, sequence data are usually transformed into inferred phylogenetic relationships by one or more of three philosophically and mathematically distinct methods (Olsen 1988, Swofford and Olsen 1990, Hillis and Moritz 1990, Li and Graur 1991): maximum parsimony (Fitch 1971), maximum likelihood (Felsenstein 1981), and distance methods. Both parsimony and likelihood extract information separately from each site (nucleotide or amino acid position) in the sequence; the distance methods calculate a single number reflecting the proportion of sites in an entire sequence that differ between any two species.

Maximum parsimony looks for character states (in sequences, either nucleotides or amino acids) shared between extant species. Parsimony screens the dataset for phylogenetically informative sites, i.e. those at which there are at least two different bases, each present in at least two different species. Using that subset of the data, parsimony groups together species that have the most characters in common, and tries to reconstruct the genealogical relationships that might have produced the extant species in the dataset. This reconstruction includes inferring the ancestral states at all nodes (branch points) of a potential genealogy. Depending on the number of species in the dataset, either all possible trees, or just a subset, are evaluated. Potential tree topologies are evaluated using the philosophy of parsimony (Occam's Razor), and the tree that requires the smallest number of character substitutions, summed over all branches in the tree, is considered to be the best reconstruction of the true tree.
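The definition of a phylogenetically informative site given above can be written as a short screening routine. This is a minimal sketch, not the procedure implemented in any particular parsimony program; the function names and the four toy sequences are hypothetical, and gaps or ambiguous bases are simply ignored when counting.

```python
from collections import Counter


def is_informative(column: str) -> bool:
    """True if at least two different bases each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(alignment: list[str]) -> list[int]:
    """Return 0-based indices of informative columns in an aligned set of sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [
        index
        for index, column in enumerate(zip(*alignment))
        if is_informative("".join(column))
    ]


if __name__ == "__main__":
    toy_alignment = ["ACGTA", "ACGTC", "ATGAA", "ATCAC"]  # hypothetical data
    print(informative_sites(toy_alignment))
```

In the toy alignment, the first column is invariant and the third has a base found in only one sequence, so neither is informative; the remaining three columns each split the sequences two against two.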

Distance methods use the information at all sites in the sequence. For each pair of species, the proportion of sites that differ is calculated and corrected for multiple hits, resulting in a matrix of pairwise genetic distances between all pairs of species. Most distance methods then involve various goodness-of-fit analyses to test different tree topologies, until a topology is found that best fits the matrix of all pairwise differences observed between the extant species. The neighbor-joining method (Saitou and Nei 1987) forms trees by sequentially grouping together least-distant nodes. Since the mathematical algorithms of the neighbor-joining method do not require equal rates of substitution along all lineages, and since rate differences were expected (and observed) in some plastid genes in the studies reported herein, our studies include this method.

A maximum likelihood method begins by assuming a particular model of the evolution of base sequences (or amino acid sequences). Ideally, each possible tree is then tested to determine the likelihood that its topology fits the dataset. The tree with the maximum likelihood is taken as the best approximation of the true tree. Likelihood methods involve intensive calculations, because every possible reconstruction of ancestral states (not just the most parsimonious reconstruction) is evaluated. Common assumptions in maximum likelihood models include a Poisson distribution of substitutions at each site, different substitution rates for transition versus transversion substitutions, different base compositions in different species, and a gamma distribution of rates at different sites. These were the assumptions used in the nuclear gene analysis to be described.
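As an illustration of the pairwise distance step described above for distance methods, the sketch below computes a distance corrected for multiple hits using the classic Kimura two-parameter formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the observed proportions of transitions and transversions. It estimates P and Q directly from the two sequences rather than fixing a transition/transversion ratio, so it should be read as an assumption-laden sketch and not as the calculation performed by any specific package; the function names and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log(1.0 - 2.0 * p - q) - 0.25 * math.log(1.0 - 2.0 * q)


def distance_matrix(sequences: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise K2P distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = k2p_distance(sequences[i], sequences[j])
    return matrix


if __name__ == "__main__":
    # One transition (A/G) and one transversion (A/C) in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```

The corrected distance for the toy pair is larger than the raw proportion of differing sites (0.2), which is the intended effect of a multiple-hit correction.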

Parsimony, distance, and maximum likelihood methods make different assumptions and are subject to different sources of error (Nei 1987, Felsenstein 1978, Sober 1988, Stewart 1988, Penny et al. 1992). Consequently, a tree topology recovered by all three methods is especially robust, because it shows that the signal in the dataset was strong enough not to be affected by these different kinds of error. For that reason, the nuclear analyses reported in Chapter III employed all three tree-reconstruction methods, and the plastid analysis reported in Chapter V employed all methods but maximum likelihood. For both analyses, all methods produced trees with similar topologies.

Acknowledgments

Chapter III is a slightly modified version of an article accepted for publication in the February 1996 issue of Journal of Phycology, submitted by Robert Rumpf, Dawne Vernon, David Schreiber, and C. William Birky, Jr. Chapter III reports the results of a joint project to sequence and phylogenetically analyze 13 Polytoma species and two Chlamydomonas species. I sequenced the Rrn18 gene from two of the Polytoma species (P.u.964 and P. obtusum). Dave Schreiber sequenced part of P.u.19. Two undergraduate students, Julie Gordon and Sam Shank, sequenced part of Chlamydomonas dysosmos, and participated in preparing a published article describing the close relationship of C. dysosmos and C. humicola (Gordon et al. 1995). Bob Rumpf sequenced the remaining 10 Polytoma Rrn18 genes plus Rrn18 from Chlamydomonas humicola. I aligned our sequences to a chlamydomonad database, and prepared secondary structure models for all 15 genes. With planning and analysis assistance from myself and Dr. Birky, Bob Rumpf ran the various tree-making programs to reconstruct the probable genealogy.

We thank Mark A. Buchheim (University of Tulsa) for providing us with his algal rRNA sequence data; Paul Fuerst and his collaborators (Ohio State University) for primers; and Mark A. Buchheim, Russell L. Chapman, Nicholas W. Gillham, and Jeffrey A. Palmer for helpful suggestions with the manuscript. Carl Woese's and Gary Olsen's Ribosomal Data Base was an invaluable source of information for 18S rRNA secondary structure models.

CHAPTER III

Evolutionary Consequences of the Loss of Photosynthesis in Chlamydomonadaceae: Phylogenetic Analysis of Rrn18 (18S rDNA) in 13 Polytoma Species (Chlorophyta)

INTRODUCTION

Most chlorophyte algae are photosynthetic and contain chlorophyll in their plastids, as suggested by the common appellation "green algae". However, a number of non-green heterotrophic species are also classified in the Chlorophyta on the basis of morphological and biochemical similarities to various photosynthetic groups. One of the best-known genera of heterotrophic chlorophytes is Polytoma, comprising oval to elongated biflagellated unicells with a single large plastid (leucoplast) that mostly surrounds the centrally located nucleus (Lang 1963, Pringsheim 1963). The leucoplast contains ribosomes and DNA (Scherbel et al. 1974, Siu et al. 1975; Siu et al. 1975a, b, c, Vernon-Kipp et al. 1989). Polytoma is believed to be closely related to Chlamydomonas on morphological grounds (Pringsheim 1963). The genus Chlamydomonas is diverse in terms of morphology, habitat, and metabolism and has been classified into over 450 species on the basis of morphology and sensitivity to autolysins (Ettl 1976, Schlosser 1984, Ettl and Schlosser 1992). We previously used DNA sequence data to verify the close relationship between one isolate of Polytoma and Chlamydomonas reinhardtii (Vernon-Kipp et al. 1989); here we show that 12 additional isolates of Polytoma are also members of the Chlamydomonas clade.

It is generally assumed that nonphotosynthetic chlorophytes arose from photosynthetic ancestors. Polytoma species are obligate heterotrophs capable of utilizing acetate as their sole carbon and energy source, while many chlamydomonads are facultative heterotrophs that can utilize acetate, photosynthesis, or both. Nonphotosynthetic mutants are easily obtained in C. reinhardtii and can result from mutations in any of a number of different genes in the chloroplast or nucleus (Harris 1989). Many of them are easily maintained on acetate in the light or dark, but detailed comparisons of growth rates to wild-type cells have been made for only two mutants (Boynton et al. 1972). One of these, ac-20, grows only slightly slower than wild type under heterotrophic conditions (in the dark, with acetate), in spite of having a reduced rate of plastid protein synthesis. Thus it is reasonable to suppose that some nonphotosynthetic mutants (perhaps a small minority) might be selectively neutral or nearly neutral in nature, especially in an environment rich in organic nutrients and under less intense and discontinuous illumination. Such a mutant could give rise to a Polytoma species by the following steps: 1) the mutation is fixed in the population (probably by random drift), or the mutant cell establishes a new population in a new habitat; 2) the heterotrophic population becomes isolated from photosynthetic members of the same species by geography and/or the acquisition of genetic isolation mechanisms; and 3) additional mutations in the photosynthetic pathway accumulate in the absence of selective pressure, making the loss of photosynthesis irreversible. The population will be classified as Polytoma when chlorophyll synthesis is lost. Chlorophyll loss can result either from the initial mutation or from a subsequent mutation; the latter is more likely, given the large number of photosynthetic genes in which the initial mutation might occur.

The loss of photosynthesis is likely to have important consequences for evolution of the plastid genome. These include the loss of most or all genes coding for proteins involved in photosynthesis (Wolfe et al. 1992b; but see Siemeister and Hachtel 1990) and the accelerated evolution of genes coding for rRNAs and proteins involved in protein synthesis (Wolfe et al. 1992a). In addition, it is likely that selection operating at the level of species or populations favors photosynthetic over nonphotosynthetic organisms in some nutritional niches. Although an obligate heterotroph might be well adapted and reproduce as rapidly as a photosynthesizer in an environment rich in organic nutrients, it will be restricted to that niche and be unable to occupy the nutrient-poor environments that are open to photosynthesizers.

As part of our investigations of the evolutionary consequences of the loss of photosynthesis in algae, we determined the evolutionary relationships of 13 of the approximately 20 available stock cultures of Polytoma to species of Chlamydomonas. The nuclear Rrn18 gene, and the small subunit ribosomal RNA for which it codes, have been widely used to reconstruct evolutionary histories. The sequences of these genes have been used to clarify the phylogeny of 28 Chlamydomonas species and 12 species from closely related genera (Jupe et al. 1988, Buchheim et al. 1990, Buchheim and Chapman 1991, Chapman and Buchheim 1991, Buchheim and Chapman 1992, Larson et al. 1992). The maximum sequence divergence within this group is about 10%, greater than that between gymnosperms and angiosperms. Parsimony trees generated with these data consistently group these organisms into six or seven clades. To which, if any, of the members of this group are the various Polytoma isolates related?

We have now sequenced the complete Rrn18 gene from 13 Polytoma and two Chlamydomonas species. We constructed phylogenetic trees using our sequences and those of other species of Chlamydomonas and closely related green algae provided by other researchers. Our goals were to determine 1) the number of distinct heterotrophic lineages, i.e. the number of times extant nonphotosynthetic species arose from photosynthetic ancestors; 2) the age of the heterotrophic lineages; and 3) the extent of speciation and dispersion of the nonphotosynthetic lineages. Our results show only two origins, one represented by a single isolate, the other by 12 widely distributed isolates that probably represent a single species.

MATERIALS AND METHODS

Polytoma stocks

All isolates were obtained from culture collections or other laboratories and subcloned (with the exception of SAG 62-2c, which we were unable to subclone successfully). Their sources are given in Table 4. Some isolates have never been assigned specific names, while others were given provisional names that appear to have been dropped subsequently; therefore we will mainly refer to them in this Chapter by their respective collection stock numbers. Stocks were grown in 10 mL liquid PM medium (2.0 g sodium acetate (hydrate), 1.0 g yeast extract, 1.0 g tryptone, doubly distilled water to 1 L) and transferred monthly using a 1:10 dilution of existing stock into fresh medium.

Table 4. List of Polytoma stocks.

Stock number^a  Species^b                     Where collected               Collector          Accession #^c
UTEX 19         P. uvella Ehrenberg           ?                             E. G. Pringsheim   U22940
UTEX 964        P. uvella Ehrenberg           Woods Hole, Mass., USA        F. Moewus          U22943
ATCC 30963      P. obtusum                    ?                             ?                  U22938
SAG 195.60      (Tetrablepharis sp.)          South Africa                  E. G. Pringsheim   U22937
CCAP 62/2M      P. uvella Ehrenberg           Leiden, Holland               E. G. Pringsheim   U22942
SAG 62-2c       P. uvella Ehrenberg           Brixen, Tirol, Italy          E. G. Pringsheim   U22941
SAG 62-13       (P. mirum Pringsheim)         Solling, Germany              E. G. Pringsheim   U22934
SAG 62-16       (P. difficile Pringsheim)     Heidelberg, Germany           E. G. Pringsheim   U22932
SAG 62-18       (P. ellipticum Pringsheim)    Klasen-Brixen, Tirol, Italy   E. G. Pringsheim   U22933
SAG 62-20       Polytoma sp.                  Prutz, Tirol, Italy           E. G. Pringsheim   U22939
SAG 62-21       (P. anomale Pringsheim)       California, USA               E. G. Pringsheim   U22931
SAG 62-27       (P. oviforme Pringsheim)      Tenerife, Spain               E. G. Pringsheim   U22936
DH1             P. obtusum                    ?                             ?                  U22935

^a ATCC = American Type Culture Collection; CCAP = Culture Collection of Algae and Protozoa; SAG = Sammlung von Algenkulturen Göttingen; UTEX = University of Texas Culture Collection of Algae; DH1 was supplied by David Herrin and came originally from the collection of L. Provasoli at Yale University; strain designation is that of the authors.
^b Names in parentheses are provisional or former names of strains now designated Polytoma sp.
^c GenBank accession number for Rrn18 (18S) gene sequence.

Table 5. List of complete Rrn18 sequences and their sources.

Taxon                         GenBank^a   Reference
Asteromonas gracilis          M95614      Unpublished
Chlamydomonas reinhardtii                 Gunderson et al. 1987
Chlamydomonas humicola^b      U13984      Gordon et al. 1995
Chlorella vulgaris            X13688      Huss and Sogin 1989
Dunaliella parva              M62998      Unpublished
Dunaliella salina             M84320      Wilcox et al. 1992
Glycine max                   M20017      Eckenrode et al. 1985
Oryza sativa                  X00755      Takaiwa et al. 1984
Saccharomyces cerevisiae      M27607      Mankin et al. 1986
Volvox carteri                X53904      Rausch et al. 1989
Zamia pumila                  M20017      Nairn and Perl 1988

^a Accession numbers.
^b Combined with other strains under C. applanata by Ettl and Schlosser (1992).

Table 6. Numbers of substitution and indel differences between complete sequences of Rrn18 genes from pairs of strains. Indels are above the diagonal, substitutions below.

                             1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
 1 Chlorella                 -  13  10  11  11  14  11  11  19  14  15  13  13  19  14  16  16  19  15
 2 Volvox                  110   -   7  16  16  19  19  16  22  17  18  16  16  22  17  19  19  22  18
 3 C. reinhardtii          106  14   -  15  14  18  15  15  19  14  15  13  13  20  14  16  16  19  15
 4 D. parva                 88  88  81   -   0   3   8   4  12   7   8   6   6  13   7   9   9  12   8
 5 D. salina                90  90  84  12   -   3   8   4  12   7   8   6   6  13   7   9   9  12   8
 6 Asteromonas             106 102  94  40  43   -  11   7  15  10  11   9   9  16  10  12  12  15  11
 7 Polytoma 62-27           91  87  81  53  54  69   -  18  26  19  19  18  18  26  19  23  21  24  20
 8 C. humicola              80  78  70  29  28  49  51   -  12   9  10   8   8  12   9   9   9  14   8
 9 Polytoma 964             88  82  75  32  31  54  56  18   -   9  10   8   8   4   7   9   7  12   8
10 Polytoma 19              89  83  75  32  31  54  55  18   0   -   3   1   1   9   2   6   4   7   3
11 Polytoma 62-2c           89  82  75  32  31  54  55  18   0   0   -   2   2  10   2   7   5   8   3
12 Polytoma [illegible]     89  82  75  32  31  54  55  18   0   0   0   -   0   8   1   5   3   6   2
13 Polytoma 62-18           89  82  75  32  31  54  55  18   0   0   0   0   -   8   1   5   3   6   2
14 Polytoma DH1             89  82  75  32  31  54  55  18   0   0   0   0   0   -   7   7   7  12   8
15 Polytoma 195.60          90  82  76  33  32  55  56  19   1   1   1   1   1   1   -   6   2   5   3
16 Polytoma [illegible]     88  83  76  32  31  54  54  19   1   1   1   1   1   1   0   -   6  11   5
17 Polytoma [illegible]     90  83  76  33  32  55  56  19   1   1   1   1   1   1   0   0   -   7   3
18 Polytoma 62-20           89  83  76  33  32  55  56  19   1   1   1   1   1   1   2   2   2   -   8
19 Polytoma [illegible]     89  83  76  33  32  55  56  19   1   1   1   1   1   1   2   2   2   2   -
20 Polytoma 62-2M           91  84  77  34  33  56  57  20   2   2   2   2   2   2   3   3   3   3   3

DNA preparation

Cells were grown to log phase in 1-2 L PM. Cells were harvested by centrifugation (Beckman JA10, 14K x g, 10 min), and the resulting pellet was frozen in liquid nitrogen for storage at -80 °C. The cell pellet was ground with sand in a cold mortar, transferred into 5 mL CTAB buffer (100 mM Tris-HCl pH 8, 1.4 M NaCl, 20 mM EDTA, 2% cetyltrimethyl ammonium bromide, and 0.2% β-mercaptoethanol) in a mortar warmed to 65 °C, and incubated at 65 °C for 45-60 min. The resulting lysate was extracted once with 24:1 chloroform:isoamyl alcohol, and the DNA was precipitated with ethanol and resuspended in 1 mL TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0).

Polymerase Chain Reaction (PCR) amplifications

The Rrn18 gene was amplified from the crude lysates with primers located at the ends of the gene. The 5' and 3' primers were, respectively, SSU1A = 5'-TGGTTGATCCTGCCAGTAG-3' (sense strand sequence) and SSU2 = 3'-CACTTGGACGTCTTCCTAGT-5' (antisense sequence). The optimal amplification conditions for each lysate were determined empirically. Then multiple amplifications under these conditions were performed and pooled for use as template in subsequent sequencing reactions. The pooling of multiple individual amplifications minimizes sequencing errors due to misincorporation of nucleotides by Taq polymerase during individual amplifications. The pooled DNA was purified via the GeneClean kit (Bio 101).
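Because SSU2 is written above in the 3'-to-5' direction, it can be helpful to convert it to the conventional 5'-to-3' orientation and to recover the sense-strand site it anneals to. The sketch below does this with plain string operations; the helper names are hypothetical, and only the two primer sequences come from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse(seq: str) -> str:
    """Reverse the reading direction of a sequence without complementing it."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the written order."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement a sequence and reverse it."""
    return reverse(complement(seq))


SSU1A_5_TO_3 = "TGGTTGATCCTGCCAGTAG"   # as given in the text (sense strand)
SSU2_3_TO_5 = "CACTTGGACGTCTTCCTAGT"   # as given in the text (antisense, 3'->5')

# The same SSU2 oligonucleotide written in the usual 5'->3' direction.
ssu2_5_to_3 = reverse(SSU2_3_TO_5)

# Sense-strand sequence (5'->3') at the 3' end of the gene that SSU2 pairs with.
ssu2_sense_site = complement(SSU2_3_TO_5)

if __name__ == "__main__":
    print("SSU2 5'->3':     ", ssu2_5_to_3)
    print("sense-strand site:", ssu2_sense_site)
```

Complementing the 3'-to-5' string in place gives the sense strand in 5'-to-3' order, which is the same result as taking the reverse complement of the primer once it has been rewritten 5' to 3'.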

Accidental amplification of DNA from the wrong organism was extremely unlikely because 1) algal DNA molecules greatly outnumbered any contaminating DNA molecules, since multiple amplifications were done, each using about 1 µL of the crude lysate from a 2-L culture; and 2) all sequences clearly belong to the Chlamydomonas clade, of which the only cultures in the lab were the Polytoma species, Chlamydomonas reinhardtii, C. humicola, C. dysosmos, and three species of Polytomella that have different sequences from those shown here (Pam Mackowski, unpubl.).

Direct sequencing of dsDNA

The pooled PCR products were directly sequenced using the dsDNA Cycle Sequencing kit and protocols (BRL), with two modifications: 200-500 ng of PCR product and 1.5-2 µL of 10 U·µL⁻¹ Taq DNA polymerase were used for each reaction. Primers for both strands were constructed for conserved sites at approximately 200-bp intervals along the gene and used for consecutive sequencing reactions. Each isolate was sequenced at least twice on one strand; several were sequenced twice on both strands. The high level of conservation of this gene allowed for an additional proofreading step: sequences from two closely related species were compared, and sites of divergence were checked on the autoradiographs to positively identify differences, which were then sequenced again for verification. All sequence autoradiograms were read independently by at least two people.

Other DNA sequences

The complete Rrn18 sequences of other algae, yeast, and plants were obtained from GenBank; accession numbers and references are given in Table 5. Partial sequences of 18S rRNAs from additional Chlamydomonadaceae species shown in Figure 11 were provided by Mark A. Buchheim (University of Tulsa).

Sequence Alignment

DNA sequences were entered directly in the SeqApp computer application package (Gilbert 1992) running on Macintosh computers. SeqApp, through links to CAP (Contig Assembly Program, Huang 1992) and ClustalV (Higgins and Sharp 1988, 1989, Higgins et al. 1992), was also used to assemble the sequencing fragments of each species into a contig, and then to produce the initial alignment of six green Chlamydomonadaceae and 13 Polytoma isolates. We refined this alignment by hand, guided by primary sequences from three land plants (Zamia pumila, Glycine max and Oryza sativa) and a yeast (Saccharomyces cerevisiae). We further refined the alignment using conserved patterns in the secondary structures for this gene in one land plant (Zea mays, Gutell et al. 1985), three green algae (Chlorella vulgaris, Huss and Sogin 1989; Volvox carteri, Rausch et al. 1989; Chlamydomonas reinhardtii, Gutell et al. 1985), and S. cerevisiae (Gutell et al. 1985).

We were able to align 98.9% of the Rrn18 gene from 12 of the 13 Polytoma isolates (the isolates in the P. uvella clade), leaving out only the first four 5' positions and the last sixteen 3' positions for lack of data in all species. For the entire array of complete algal sequences for the taxa shown on the accompanying trees, we were able to align 96.6% of the gene, omitting only two segments of 6 and 35 bases where the alignment was ambiguous, as well as the short sections at the 5' and 3' ends for lack of data; this left 1,775 base pairs.

For a second dataset of partial rRNA sequences from 29 additional Chlamydomonadaceae and 30 other chlorophytes (kindly provided by Mark Buchheim), our complete sequences were truncated to match the partial sequences and aligned as above. This data set had 963 total sites.

Phylogenetic analysis

Three types of phylogenetic analysis programs were run

on Macintosh and Iris Indigo computers. The DNAboot,

DNADist, and Neighbor-Joining algorithms of Phylip v. 3.56

(Felsenstein 1989, 1993) were used to construct neighbor-joining trees; the Kimura 2-parameter method with a transition/transversion ratio of 2.0 was used to correct for multiple hits. PAUP v. 4.0d25 (Swofford 1993, results published with permission of the author) was used to generate maximum parsimony trees with three kinds of searches: exhaustive and branch-and-bound searches, which give the most parsimonious tree, and a full heuristic search, from which the 50%-majority consensus tree was used. All trees (except the one from the exhaustive-search maximum parsimony analysis) were bootstrapped 100 times. Branch swapping by tree-bisection-reconnection (TBR) was used with accelerated transformation (ACCTRAN); ten random sequence addition replicates were used for each of the 100 bootstraps in the heuristic searches, for a total of 1,000 replicates. There were 91 informative sites in the complete sequences, and 157 in the partial sequences. Finally, fastDNAml (Olsen et al. 1994) was used to construct maximum likelihood trees.

Two data sets were used in the phylogenetic analysis: 1) the complete sequences from 6 green Chlamydomonadaceae, 13 Polytoma isolates, and Chlorella; and 2) a subset of the partial sequences including 24 green Chlamydomonadaceae, Polytoma 964 and 62-21, and Chlorella. Chlorella vulgaris was used as an outgroup because Chlorella species are consistently separated from Chlamydomonas species in trees based on 5S and 18S rRNA sequences (Chapman and Buchheim 1991). Distance matrices (Table 6) were constructed with the aid of The Indel/Substitution Machine, a Hypercard-based program (Robert Rumpf, unpubl.). The program constructs matrices of pairwise base substitutions and indels (gaps due to either insertions or deletions).
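The tabulation of pairwise substitutions and indels described above can be illustrated with a minimal sketch. This is not the Indel/Substitution Machine itself, whose internals are not described here. The function names, the toy sequences, and the choice to count each contiguous run of gapped columns as one indel are all assumptions made for illustration.

```python
from itertools import combinations


def count_differences(seq_a, seq_b, gap="-"):
    """Count base substitutions and indel events between two aligned sequences.

    Assumption: a contiguous run of columns gapped in one sequence
    is counted as a single indel event.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    substitutions = 0
    indels = 0
    in_gap = False  # True while we are inside a run of gapped columns
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # gapped in both sequences: uninformative for this pair
        if a == gap or b == gap:
            if not in_gap:
                indels += 1
            in_gap = True
        else:
            in_gap = False
            if a != b:
                substitutions += 1
    return substitutions, indels


def pairwise_matrix(alignment):
    """Return {(name1, name2): (substitutions, indels)} for every pair of names."""
    return {
        (n1, n2): count_differences(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = {"a": "ACGTACGT", "b": "ACGTAC-T", "c": "ATGTACGA"}
    for pair, counts in pairwise_matrix(toy).items():
        print(pair, counts)
```

Counting a gap run as one event is only one possible convention; counting every gapped column separately would give the same result for the single-base indels reported in this chapter.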

RESULTS AND DISCUSSION

Choice of genes

Several factors make the Rrn18 gene an excellent candidate for phylogenetic analysis. The gene product (18S rRNA) is essential for protein synthesis and therefore can be found in all organisms. The rRNA product forms a secondary structure rich in stems and loops; the structure is sufficiently conserved that it can be used to improve alignments. Finally, this gene has already been used to determine the phylogenetic relationships of numerous organisms (e.g. Fernholm et al. 1989), and a large database of 18S sequences and phylogenetic trees is available for comparison (Maidak et al. 1994). For the photosynthetic Chlamydomonadaceae in particular, complete sequences of the Rrn18 gene are available for eight species and partial sequences of the 18S rRNA are available for 18 additional species (Vernon-Kipp et al. 1989; Buchheim et al. 1990; Buchheim and Chapman 1991; Chapman and Buchheim 1991; Buchheim and Chapman 1992; Mark A. Buchheim, pers. comm.).

Alignments

The secondary structure of the 18S rRNA is highly conserved, in part by compensating base substitutions (Hillis et al. 1994). This conservation can be used to verify the accuracy of alignments: if a region in one sequence has a particular secondary structure, the same region in another sequence should have a similar structure. For example,

Polytoma sp. 62-27 exhibited a large number of base substitutions compared to the other Chlamydomonadaceae; nevertheless, the secondary structure in all regions of the rRNA was conserved.

Sequence differences

The numbers of pairwise base substitutions and indels in the Rrn18 gene were tabulated for the isolates used in tree reconstruction (Table 6). Stocks 19, 964, 62-2c, 62-3, 62-18, and DH1 do not differ in base substitutions, and were combined for the purpose of phylogenetic analysis as the Polytoma 964 clade. Stocks 195.80, 62-16, and 62-21 are also identical (except for indels) and were combined as the Polytoma 195.80 clade. Most of the Polytoma isolates can be differentiated by indels; only stocks 62-3 and 62-18 have Rrn18 sequences that are identical in all respects.

Figure 10. Phylogram of complete Rrn18 sequences. Numbers on the branches are branch lengths from the neighbor-joining algorithm; phylogram branch lengths are proportional to these numbers. The tree topology is the same as that of the most parsimonious tree obtained with an exhaustive or branch-and-bound search, the 50% majority consensus tree obtained with a full heuristic search, and the maximum likelihood tree. Numbers at the nodes are numbers of bootstrap replicates (out of 100) containing the corresponding clade; from top to bottom, these are from the neighbor-joining, maximum parsimony, and maximum likelihood algorithms. The most parsimonious tree had a tree length of 265, a consistency index of 0.8528, and a rescaled consistency index of 0.6436.


[Figure 10. Tree image not reproduced. Node labels give bootstrap numbers for neighbor-joining, maximum parsimony, and maximum likelihood. Taxa shown: Polytoma 964 clade, Polytoma 62-2M, Polytoma 195.80 clade, Polytoma 62-20, Chlamydomonas humicola, Dunaliella salina, Dunaliella parva, Asteromonas gracilis, Polytoma 62-27, Volvox carteri, Chlamydomonas reinhardtii, Chlorella vulgaris.]

Figure 11. Cladograms based on partial Rrn18 sequences. The cladograms were obtained with the neighbor-joining (left) or maximum parsimony (right) methods. Numbers in boxes at the nodes are numbers of bootstrap replicates (out of 100) containing the clade. Clades with nodes marked by dark triangles on the nodal box were supported by the maximum likelihood tree. Clades marked 1 through 6 represent the six major clades of Chlamydomonadaceae, as per phylogenetic analyses by Jupe et al. 1988, Buchheim et al. 1990, Buchheim and Chapman 1991, Chapman and Buchheim 1991, Buchheim and Chapman 1992, and Larson et al. 1992.

[Figure 11. Cladogram images not reproduced. Panels: Neighbor-Joining (left) and Maximum Parsimony (right).]

Phylogenetic trees

Neighbor-joining, maximum parsimony, and maximum likelihood algorithms were used to reconstruct the phylogeny of these species. Each of these methods has distinct advantages and disadvantages and is prone to different sources of error (Hillis and Moritz 1990, Miyamoto and Cracraft 1991, Stewart 1993, Hillis et al. 1994), so that the strongest conclusions can be drawn when all methods produce the same tree topology. For the complete-sequence data set, all three methods produced the same cladogram (Fig. 10). For these trees we used the default weighting of 2 transitions:1 transversion. We determined that the actual ratio in this data set was 1.726:1; when we used this ratio, the same parsimony and maximum likelihood trees were obtained (not shown). Twelve of the Polytoma isolates form a clade which we refer to as the P. uvella clade; it includes stocks 62-20 and 195.80 and the ten isolates in the 964 and 195.80 clades. The thirteenth Polytoma isolate, 62-27, was consistently separated from the P. uvella clade by two green lineages (C. humicola; and Asteromonas, Dunaliella parva, and D. salina).
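To make the transition/transversion ratio and the Kimura 2-parameter correction concrete, here is a small sketch of how both can be computed for one pair of aligned sequences. It uses the textbook closed-form Kimura distance estimated from the observed proportions of transitions (P) and transversions (Q). That is an assumption for illustration and is not necessarily the calculation performed inside DNADist, which was run with a fixed ratio of 2.0. The function names and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_counts(seq_a, seq_b):
    """Return (transitions, transversions, compared_sites) for two aligned sequences.

    Columns containing a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    ts = tv = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv, compared


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance: -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    ts, tv, n = transition_transversion_counts(seq_a, seq_b)
    if n == 0:
        raise ValueError("no comparable sites")
    p = ts / n
    q = tv / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Hypothetical toy sequences: 4 transitions and 2 transversions in 100 sites.
    ref = "A" * 100
    other = "GGGG" + "CC" + "A" * 94
    ts, tv, n = transition_transversion_counts(ref, other)
    print("ts:tv =", ts / tv, " K2P distance =", k2p_distance(ref, other))
```

The corrected distance is always at least as large as the raw proportion of differing sites, because the correction adds back an estimate of multiple substitutions at the same site.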

Trees were also constructed using the larger data set of partial-sequences (Fig. 11). Many bootstrap values were low, but the neighbor-joining, maximum parsimony, and maximum likelihood trees agreed in several important features.

First, these trees are similar to those of Buchheim and co-workers (Buchheim et al. 1990, Buchheim and Chapman 1991, 1992, Chapman and Buchheim 1991), in that they divide the Chlamydomonadaceae into six major clades: 1) Carteria crucifera and Carteria eugametos; 2) Carteria sp. 762 and Carteria radiosa; 3) Chlamydomonas peterfii and C. mexicana; 4) Chlamydomonas zebra, C. reinhardtii, and Volvox aureus; 5) Chlamydomonas geitleri, C. pitschmanii, and close sister species C. moewusii UTEX 9 and C. eugametos UTEX 97; and 6) close sister species C. humicola SAG 11-9 and C. dysosmos UTEX 2399 with Dunaliella, Asteromonas, Chlorogonium, Stephanosphaera, and Haematococcus species. Second, the P. uvella clade consistently falls within the Haematococcus clade (clade 6 in Figure 11). Third, Polytoma 62-27 is a sister-group of the geitleri clade (clade 5). Fourth, Polytoma 62-27 is always separated from the P. uvella clade by one or more photosynthetic lineages.

Polytoma is polyphyletic

All trees supported a minimum of two origins of the genus Polytoma. All trees have at least one green lineage between Polytoma 62-27 and the P. uvella clade. For the maximum parsimony trees, this includes all of the most parsimonious trees produced by bootstrap resampling, plus the 38 trees with lengths no more than 10 changes longer than that of the most parsimonious tree, i.e. from 265 to 275. In the complete sequence tree, a monophyletic origin of all Polytoma species from a single heterotrophic mutant would require a minimum of two reversions of the loss of photosynthesis (the partial sequence trees would require three or four reversions). We can assess the probability of two reversions from the branch lengths on the maximum parsimony tree of complete sequences. Between the original loss of photosynthesis and the second of these hypothetical reversions (on the C. humicola branch), the Rrn18 gene accumulated a minimum of 18 base substitutions (data not shown), and 6 insertions or deletions (estimated by multiplying the number of base substitutions by the ratio of insertions or deletions to substitutions between strain 62-27 and C. humicola). During this time each of the photosynthetic genes, which would not be subject to selection after photosynthesis was first lost, must have incurred many additional substitutions, insertions, and deletions. Reactivating photosynthesis would then require the reversion of multiple base substitutions, insertions, and deletions in one lineage, which is extremely unlikely.

The P. uvella clade may represent a single species or several closely related species

The topology of the phylogenetic trees demonstrates that Polytoma species arose at least twice. However, it does not eliminate the possibility that the P. uvella clade is polyphyletic, i.e. that there are more than two origins overall. That this is unlikely is indicated by the low genetic variability within this clade. Within the P. uvella clade, 62-3 and 62-18 are identical. The pairwise sequence divergence within the remaining P. uvella clade is no more than 0.17% base substitutions (0 to 3 subs/1775 bp). This can be compared to the divergence of the Rrn18 genes of Chlamydomonas moewusii Gerloff (UTEX 97) and C. eugametos Moewus (UTEX 9), which are considered to be conspecific on the basis of their ability to interbreed in the laboratory (Gowans 1963) and of their similar morphology (Ettl and Schlosser 1992). Partial sequences of the small subunit rRNA of C. moewusii and C. eugametos, provided by Mark Buchheim (pers. commun.), differ by 1 base substitution out of 737 base pairs (approximately 50% of the gene); this corresponds to a sequence divergence of 0.13% base substitutions. All the substitution differences within the P. uvella clade occur at only four positions, none of which are phylogenetically informative. Thus, it is not surprising that the clade was not convincingly resolved by any of the three computer algorithms.

There is more variation within the P. uvella clade with respect to indels than base substitutions. Members of the clade differ by 0 to 12 indels, compared to 2 in the partial sequences from C. moewusii and C. eugametos. All are insertions or deletions of a single base pair. Although there are many more indels than substitutions within this clade, all the indels occur at just 25 sites, of which only eight are phylogenetically informative. Parsimony analysis of the P. uvella clade using indels placed P. uvella 964 with P. obtusum DH1 in a separate clade from the other strains, which form a polytomy (not shown). This agrees with sequence data from the leucoplast rrn16 gene (D. Vernon, unpubl.). While we were unable to do interbreeding tests with these Polytoma species, they can be distinguished from each other morphologically (C. W. Birky Jr., unpubl.). Also, while variation within the P. uvella clade for the nuclear Rrn18 gene is similar to the divergence of the conspecific C. moewusii and C. eugametos, sequence divergences in the leucoplast rrn16 gene are more than 50-fold higher in the comparison of two P. uvella species than in a C. moewusii/C. eugametos comparison (D. Vernon, unpubl.).
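The distinction between variable sites and phylogenetically informative sites can be sketched in code. Under the usual parsimony definition, a site is informative only if at least two character states each occur in at least two sequences. The sketch below applies that definition and optionally treats a gap as a character state, which is how indel characters could be scored. Whether the indels in this study were scored this way is an assumption, and the function name and toy alignment are hypothetical.

```python
from collections import Counter


def informative_sites(sequences, count_gaps_as_state=True, gap="-"):
    """Return 0-based indices of parsimony-informative columns.

    A column is informative when at least two states are each
    present in at least two sequences.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")
    sites = []
    for i in range(length):
        column = [s[i].upper() for s in sequences]
        if not count_gaps_as_state:
            column = [c for c in column if c != gap]
        counts = Counter(column)
        shared_states = [state for state, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            sites.append(i)
    return sites


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = ["ACGT-A", "ACGT-A", "ATGTAA", "ATGAAC"]
    print(informative_sites(toy))                             # gaps scored as a state
    print(informative_sites(toy, count_gaps_as_state=False))  # gaps ignored
```

A difference confined to a single sequence is variable but uninformative under this definition, which is why a clade can show many differences yet remain unresolved.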

Estimating ages of Polytoma clades

The loss of photosynthesis that gave rise to the P. uvella clade must have occurred after the common ancestor of the P. uvella clade and C. humicola, but before the last common ancestor of all members of the P. uvella clade. The first point can be estimated from the minimum divergence between a member of the P. uvella clade and C. humicola, which is approximately 1% (18 base substitutions/1775). The second point can be estimated from the maximum divergence between two members of the P. uvella clade, which is 0.17%.

Wilson et al. (1987) presented evidence that the small-subunit rRNA gene evolves at an approximately constant rate of 10^-10 base pair substitutions per site per year in organisms ranging from to vertebrates. Because substitutions accumulate along both diverging lineages, the time since divergence is the pairwise divergence divided by twice this rate. This would place the loss of photosynthesis in the lineage leading to the P. uvella clade somewhere between 50 million and 8.5 million years ago. For 62-27, we can only estimate the maximum time since the loss of photosynthesis, because this nonphotosynthetic lineage has only one member. The average divergence between 62-27 and members of the C. humicola-P. uvella clade is approximately 3.1%, which gives us a maximum divergence time of roughly 155 million years between 62-27 and its nearest green relatives. Photosynthesis could have been lost at any time since then.
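The dating arithmetic can be reproduced with a short sketch. It assumes a strict molecular clock and divides the pairwise divergence by twice the per-lineage rate, since both lineages accumulate substitutions after they split. The factor of two is inferred from the reported dates rather than stated in the text. The function names are hypothetical.

```python
def pairwise_divergence(substitutions, sites):
    """Uncorrected proportion of sites that differ between two sequences."""
    return substitutions / sites


def divergence_time_years(divergence, rate_per_site_per_year=1e-10):
    """Time since two lineages split under a strict clock.

    Both lineages accumulate changes, so the pairwise divergence
    grows at twice the per-lineage rate.
    """
    return divergence / (2 * rate_per_site_per_year)


if __name__ == "__main__":
    # Divergences quoted in the text, converted to millions of years.
    for label, d in [
        ("P. uvella clade vs C. humicola (about 1%)", 0.01),
        ("within the P. uvella clade (0.17%)", 0.0017),
        ("62-27 vs C. humicola-P. uvella clade (3.1%)", 0.031),
    ]:
        print(label, round(divergence_time_years(d) / 1e6, 1), "million years")

    # Using the raw counts (18 substitutions in 1,775 sites) instead of the rounded 1%.
    d_raw = pairwise_divergence(18, 1775)
    print("from raw counts:", round(divergence_time_years(d_raw) / 1e6, 1), "million years")
```

Starting from the raw count of 18 substitutions in 1,775 sites rather than the rounded 1% gives a slightly older upper bound, which shows how sensitive these estimates are to rounding and to the assumed rate.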

CONCLUSIONS

Our data show only two origins for the thirteen Polytoma stocks we examined. One of these gave rise to the P. uvella clade, which may represent a single species. It may have arisen recently, as indicated by the limited genetic divergence within the clade. Nevertheless, it spread widely across both continents from which samples have been taken.

The example of P. uvella shows that a heterotrophic clade can be very successful for at least several million years. This conclusion assumes that the collection sites in Table 4 are correct, but some may be erroneous due to contamination or other mistakes in the long maintenance of the strains in culture. The second Polytoma clade is represented by a single species (62-27). It might have arisen more recently and had less time to spread, or it might be the remnant of an ancient heterotrophic lineage that is dying out. Of course we cannot rule out the possibility that it represents a clade that is successful in nature but is rarely collected by Pringsheim's methods or does not thrive in stock collections.

These observations are compatible with the hypothesis that heterotrophic mutants rarely give rise to heterotrophic species, presumably because the mutants are at a selective disadvantage relative to their photosynthetic cohorts. The data are also compatible with the hypothesis that heterotrophic species are at a disadvantage relative to photosynthetic species. This might be manifested as a higher extinction rate, lower speciation rate, or both. This disadvantage could be expected on the basis that the heterotrophs are more limited in the range of environments they can occupy than are their phototrophic congeners, which are also facultative heterotrophs. They may also be at a disadvantage in environments rich in organic nutrients where both can grow. Normal photosynthetic strains of

Chlamydomonas reinhardtii and C. humicola both show higher growth rates with mixotrophic growth, where they have both sunlight and acetate as carbon and energy sources, than with strictly heterotrophic growth (on acetate in the dark) (Boynton et al. 1972, Laliberte and de la Noue 1993). Similarly, photosynthetic C. reinhardtii grow faster under mixotrophic conditions than do nonphotosynthetic mutants which are obligately heterotrophic. Only in the dark do some mutants grow at nearly the same rate as the wild type (Boynton et al. 1972). A higher extinction rate and/or lower speciation rate of nonphotosynthetic lineages is an attractive explanation for the failure of nonphotosynthetic species of Chlamydomonadaceae to replace the photosynthetic forms.

If heterotrophic mutants frequently gave rise to heterotrophic species, we would expect to see many independent origins of Polytoma, not just two. And if there were no species-level selection favoring photosynthetic over heterotrophic species, we would expect to see more heterotrophic lineages branching from deep within the inclusive Chlamydomonas clade, some of which might show substantial divergence within the lineage. However, conclusive evidence for or against selection at the individual and species level would require the phylogenetic analysis of a much larger sample of sequences, preferably from a random sample of recent isolates of photosynthetic and nonphotosynthetic species, preferably coupled with breeding analysis to identify biological species and competition experiments under as nearly natural conditions as possible.

The earliest possible origin of the P. uvella heterotrophic clade, about 50 mya, falls close to the time of the KT (Cretaceous - Tertiary) boundary. The time of the KT

boundary coincides with the time of mass extinctions,

probably brought about by an extraterrestrial impact (Alvarez

et al. 1980). The large quantities of dust placed in the

atmosphere by such an event might significantly reduce the

available sunlight. It is interesting to speculate that a

non-photosynthetic mutant might be least disadvantaged

relative to its green relatives, and most likely to become

established as an independent lineage, under these

conditions. Genes from more species of P. uvella would have

to be sequenced in order to test this hypothesis by

accurately dating the origin of this nonphotosynthetic

lineage.

In addition to illuminating the evolutionary history of nonphotosynthetic algae, our data provide the necessary historical background for studies of the effects of the loss of photosynthesis on the molecular evolution of plastid genes. These studies show that the sequence evolution of

several plastid genes is accelerated in P. uvella 964 but not

in Polytoma sp. 62-27 (Dawne Vernon, unpub.). Accelerated evolutionary rates also have been reported for plastid rRNA and ribosomal protein genes in the nonphotosynthetic angiosperm Epifagus (Wolfe et al. 1992). The lack of detectable acceleration in P. 62-27 suggests that photosynthesis may have been lost more recently in this

lineage than in the P. uvella lineage.

ADDITIONAL INFORMATION FROM CHAPTER III

The phylogenetic analysis reported in Chapter III provided information necessary for selection of appropriate algal species to include in the plastid gene analysis reported in Chapter IV and in the plastid gene rate analysis reported in Chapter V. The aim of the plastid analysis in Chapter V was to compare rates of base pair substitutions among Polytoma species and their photosynthetic relatives. This analysis uses relative rate tests that require use of species with a particular phylogenetic relationship. For each Polytoma species, a closely related green species needs to be located, plus an outgroup species that is more distantly related to the two species whose substitution rates are to be compared than they are to each other. Therefore, the phylogenetic trees were examined for green species closely related to each Polytoma species. Of the species for which there exists a complete Rrn18 gene sequence, C. humicola is one of the two green species closest to the 12-species P. uvella clade (Figure 10). Therefore, C. humicola was chosen for relative rate calculations involving Polytoma species in the P. uvella clade.
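The logic of a relative rate test with this three-species arrangement can be sketched as follows. Given pairwise distances among a Polytoma species (A), its close green relative (B), and an outgroup (C), the amount of change on each ingroup lineage since their common ancestor follows from simple three-point arithmetic. This is a generic illustration under the assumption that distances are additive. It is not necessarily the exact procedure used in Chapter V, and the function name and example distances are hypothetical.

```python
def relative_rate(k_ab, k_ac, k_bc):
    """Partition distances for ingroup taxa A and B using outgroup C.

    k_ab, k_ac, k_bc are pairwise distances (substitutions per site).
    Returns (branch_a, branch_b, difference), where branch_a and branch_b
    are the amounts of change on the lineages leading to A and B since
    their common ancestor, and difference = k_ac - k_bc = branch_a - branch_b.
    """
    branch_a = (k_ab + k_ac - k_bc) / 2
    branch_b = (k_ab + k_bc - k_ac) / 2
    return branch_a, branch_b, k_ac - k_bc


if __name__ == "__main__":
    # Hypothetical distances: A = nonphotosynthetic species, B = green relative,
    # C = outgroup. These numbers are invented for illustration only.
    a, b, diff = relative_rate(k_ab=0.10, k_ac=0.30, k_bc=0.24)
    print("change on A lineage:", round(a, 3))
    print("change on B lineage:", round(b, 3))
    print("excess on A:", round(diff, 3))
```

The outgroup must branch off before A and B diverge from each other; otherwise the difference no longer isolates change on the two ingroup lineages, which is why the species had to be chosen from the trees first.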

C. humicola has a very closely related green sister species, C. dysosmos, which is not shown on these phylogenetic trees; the Rrn18 genes of these two green chlamydomonads differ by only one indel, and these species have recently been combined with others under the specific name C. applanata (Ettl and Schlosser 1992, Gordon et al. 1995). C. dysosmos was also included in the plastid gene sequencing that formed the databases for the studies reported in Chapters IV and V. However, the chloroplast rrn16 and tufA genes of C. humicola and C. dysosmos are so similar, with only one indel difference between their rrn16 genes and only two substitution differences between their tufA genes, that C. dysosmos was not included in the relative rate calculations.

To choose green species for the substitution rate analyses involving P. 62-27, the sole representative of the second Polytoma clade in the Rrn18 trees, algal sequences already available from other laboratories were considered. Sequences were available for chloroplast rrn16 genes from C. eugametos and its very close sister species, C. moewusii (Durocher et al. 1989). Sequences for both rrn16 and tufA were available from C. reinhardtii (Dron et al. 1982, Baldauf and Palmer 1990). The phylogenetic tree using partial Rrn18 sequences (Figure 11) shows the relationships of C. eugametos and C. reinhardtii to P. 62-27. On this tree, P. 62-27 falls within the same clade (clade 5 in Figure 11) that contains C. eugametos and C. moewusii, so C. moewusii was used as the close green species in the relative rate calculations for P. 62-27.

Two species were selected as representatives from the 12-member P. uvella clade because of the availability of data from many previous studies. Leucoplasts, leucoplast DNA and ribosomes in P. obtusum (and to a lesser extent in P.u.19 and P. mirum 62-3) have been studied over the past 30 years using light microscopy, electron microscopy, sucrose gradients, biochemical testing, solution hybridization and CsCl gradients (Siu, Chiang and Swift 1975a, 1975b, 1975c, 1976; Links, Verloop and Havinga 1963; Kieras and Chiang 1971; Scherbel, Behn and Arnold 1974; Lang 1963). Although P.u.19 was the P. uvella isolate used in some of these earlier studies, we had already used P.u.964 in the early stages of our studies, for restriction analyses, hybridizations, cloning, and some sequencing, because it had the hardiest growth under laboratory conditions. Therefore, we continued with P.u.964 as the second representative from the P. uvella clade in the leucoplast studies reported in Chapters IV and V.

Literature Cited

Alvarez, L. W., W. Alvarez, F. Asaro and H. V. Michel, 1980 Extraterrestrial cause for the Cretaceous-Tertiary extinction. Science 208: 1095-1108.

Baldauf, S. L. and J. D. Palmer, 1990 Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature 344: 262-265.

Bastia, D., K. S. Chiang, H. Swift and P. Siersma, 1971 Heterogeneity, complexity, and repetition of the chloroplast DNA of Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. USA 68: 1157-1161.

Birky, C. W., Jr., 1983 Relaxed cellular controls and organelle heredity. 222: 468-475.

Birky, C. W., 1988 Evolution and variation in plant chloroplast and mitochondrial genomes, in Plant Evolutionary Biology, edited by L. D. Gottlieb and K. J. Subodh. Chapman and Hall, New York.

Birky, C. W., Jr., 1994 Relaxed and stringent genomes: Why cytoplasmic genes don't obey Mendel's laws. J. Hered. 85: 355-365.

Bogorad, L., 1991 Replication and transcription of plastid DNA, in The Molecular Biology of Plastids, edited by L. Bogorad and I. K. Vasil. Academic Press Inc., New York.

Boynton, J. E., N. W. Gillham and J. F. Chabot, 1972 Chloroplast deficient mutants in the green alga Chlamydomonas reinhardi and the question of chloroplast ribosome function. J. Cell Sci. 10: 267-305.

Buchheim, M. A. and R. L. Chapman, 1991 Phylogeny of the colonial green flagellates: a study of 18S and 26S rRNA sequence data. BioSystems 25: 85-100.

Buchheim, M. A. and R. L. Chapman, 1992 Phylogeny of Carteria () inferred from molecular and organismal data. J. Phycol. 28: 363-374.


Buchheim, M. A., M. A. McAuley, E. A. Zimmer, E. C. Theriot and R. L. Chapman, 1994 Multiple origins of colonial green flagellates from unicells: Evidence from molecular and organismal characters. Molecular Phylogenetics and Evolution 3: 332-343.

Buchheim, M. A., M. Turmel, E. A. Zimmer and R. L. Chapman, 1990 Phylogeny of Chlamydomonas (Chlorophyta) based on cladistic analysis of nuclear 18S rRNA sequence data. J. Phycol. 26: 689-699.

Cedergren, R., M. W. Gray, Y. Abel and D. Sankoff, 1988 The evolutionary relationships among known life forms. J. Mol. Evol. 28: 98-112.

Chiang, K.-S. and N. Sueoka, 1967 Replication of chloroplast DNA in Chlamydomonas reinhardtii during vegetative cell cycle: its mode and regulation. Biochemistry 57:

Chapman, R. L. and M. A. Buchheim, 1991 Ribosomal RNA gene sequences: Analysis and significance in the phylogeny and taxonomy of green algae. Crit. Rev. Plant Sci. 10: 343-368.

Colwell, A., 1994 Genome evolution in a non-photosynthetic plant, Conopholis americana. Washington University.

Delwiche, C. F., M. Kuhsel and J. D. Palmer, 1995 Phylogenetic Analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Molecular Phylogenetics and Evolution 4: 110-128.

dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Dron, M., M. Rahire and J. D. Rochaix, 1982 Sequence of the chloroplast 16S rRNA gene and its surrounding regions of Chlamydomonas reinhardtii. Nucleic Acids Research 10: 7609-7620.

Durocher, V., A. Gauthier, G. Bellemare and C. Lemieux, 1989 Curr. Genet. 15: 277-282.

Eckenrode, V. K., J. Arnold and R. B. Meagher, 1985 Comparison of the nucleotide sequence of soybean 18S rRNA with the sequences of other small-subunit rRNAs. J. Mol. Evol. 21: 259-269.

Ettl, H., 1976 Die Gattung Chlamydomonas Ehrenberg. Beih. Nova Hedwigia 49: 1-1122.

Ettl, H. and U. G. Schlosser, 1992 Towards a revision of the systematics of the genus Chlamydomonas (Chlorophyta). 1. Chlamydomonas applanata Pringsheim. Bot. Acta 105: 323-330.

Felsenstein, J., 1978 Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27: 401-410.

Felsenstein, J., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17: 368-376.

Felsenstein, J., 1989 Phylogeny Inference Package (version 3.2). Cladistics 5: 164-166.

Felsenstein, J., 1993 PHYLIP (Phylogeny Inference Package) version 3.5p. Department of Genetics, University of Washington, Distributed by the author.

Fernholm, B., K. Bremer and H. Jornvall, 1989 The Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis., in Proceedings from Nobel Symposium 70, edited by Excerpta Medica, Amsterdam.

Fitch, W. M., 1974 Toward defining the course of evolution: minimum change for a specified tree topology. Syst. Zool. 20: 406-416.

Gilbert, D. G., 1992 SeqApp, a biological sequence editor and analysis program for Macintosh computers. Published electronically on the Internet, available via gopher or anonymous ftp to ftp.bio.indiana.edu.

Gillham, N. W., 1994 Organelle Genes and Genomes. Oxford University Press, Oxford.

Grant, D. M., N. W. Gillham and J. E. Boynton, 1980 Inheritance of chloroplast DNA in Chlamydomonas reinhardtii. Proc. Nat. Acad. Sci. USA 77: 6067.

Hedberg, M. F., Y. S. Huang and M. H. Hommersand, 1981 Size of the chloroplast genome in Codium fragile. Science 213: 445-447.

Huss, V. A. R., K. H. Wein and E. Kessler, 1988 Deoxyribonucleic acid reassociation in the taxonomy of the genus Chlorella. Archives of Microbiology 150: 509-511.

Gordon, J., R. Rumpf, S. L. Shank, D. Vernon and C. W. Birky, Jr., 1995 Sequences of the rrn18 genes of Chlamydomonas humicola and C. dysosmos are identical, in agreement with their combination in the species C. applanata (Chlorophyta). J. Phycol. 31: 312-313.

Gowans, C. S., 1963 The conspecificity of Chlamydomonas eugametos and Chlamydomonas moewusii: An experimental approach. Phycologia 3: 37-44.

Gray, M. W., D. Sankoff and R. J. Cedergren, 1984 On the evolutionary descent of organisms and organelles: a global phylogeny based on a highly conserved structural core in small subunit ribosomal RNA. Nucleic Acids Research 12: 5837-5852.

Gunderson, J. H., H. Elwood, A. Ingold, K. Kindle and M. Sogin, 1987 Phylogenetic relationships between chlorophytes, chrysophytes and oomycetes. Proc. Nat. Acad. Sci. USA 84: 5823-5827.

Gutell, R. R., B. Weiser, C. R. Woese and H. F. Noller, 1985 Comparative anatomy of 16-S-like ribosomal RNA. Prog. Nuc. Acids Res. Mol. Biol. 32: 155-216.

Harris, E. H., 1989 The Chlamydomonas Sourcebook. Academic Press, Inc., New York.

Higgins, D. G., A. J. Bleasby and R. Fuchs, 1992 Clustal V: improved software for multiple sequence alignment. CABIOS 8: 189-191.

Higgins, D. G. and P. M. Sharp, 1988 CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73: 237-244.

Higgins, D. G. and P. M. Sharp, 1989 Fast and sensitive multiple sequence alignment on a microcomputer. CABIOS 5: 151-153.

Hillis, D. M., J. P. Huelsenbeck and C. W. Cunningham, 1994 Application and Accuracy of Molecular Phylogenies. Science 264: 671-677.

Hillis, D. M. and M. T. Dixon, 1991 Ribosomal DNA: Molecular Evolution and phylogenetic inference. Quarterly Review of Biology 66: 411-453.

Hillis, D. M. and C. Moritz, Ed. (1990). Molecular Systematics. Sunderland, Mass., Sinauer Associates, Inc.

Huang, X., 1992 A contig assembly program based on sensitive detection of fragment overlaps. Genomics 14: 18-25.

Huss, V. A. and M. L. Sogin, 1989 Primary structure of the Chlorella vulgaris small subunit ribosomal RNA coding region. Nuc. Acids Res. 17: 1255.

Jupe, E. R., R. L. Chapman and E. A. Zimmer, 1988 Nuclear RNA genes and algal phylogeny--the Chlamydomonas example. BioSystems 21: 223-230.

Laliberte, G. and J. de la Noue, 1993 Auto-, hetero-, and mixotrophic growth of Chlamydomonas humicola (Chlorophyceae) on acetate. J. Phycol. 29: 612-620.

Lang, N. J., 1963 Electron-microscopic demonstration of plastids in Polytoma. J. Protozool. 10: 333-339.

Larson, A., M. M. Kirk and D. L. Kirk, 1992 Molecular phylogeny of the volvocine flagellates. Mol. Biol. Evol. 9: 85-105.

Li, W. H. and D. Graur, 1991 Fundamentals of Molecular Evolution. Sinauer Associates, Inc., Sunderland, Massachusetts.

Links, J., A. Verloop and E. Havinga, 1960 The carotenoids of Polytoma uvella. Arc. Microbiol. 36: 306-324.

Maidak, B. L., N. Larsen, M. J. McCaughey, R. Overbeek, G. J. Olsen, K. Fogel, J. Blandy and C. R. Woese, 1994 The Ribosomal Database Project. Nucleic Acids Research 22: 3485-3487.

Mankin, A. S., I. G. Skryabin and P. M. Rubtsov, 1986 Identification of the additional nucleotides in the primary structure of yeast 18S rRNA. 44: 143-145.

Miyamoto, M. M. and J. Cracraft, Ed. (1991). Phylogenetic Analysis of DNA Sequences. New York, Oxford University Press.

Nairn, C. J. and R. J. Ferl, 1988 The complete nucleotide sequence of the small-subunit ribosomal RNA coding region for the cycad Zamia pumila: Phylogenetic implications. J. Mol. Evol. 27: 133-141.

Nei, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.

Olsen, G. J., H. Matsuda, R. Hagstrom and R. Overbeek, 1994 fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10: 41-48.

Olsen, G. J. and C. R. Woese, 1993 Ribosomal RNA: a key to phylogeny. FASEB J. 7: 113-123.

Pace, N. R., G. J. Olsen and C. R. Woese, 1986 Ribosomal RNA Phylogeny and the primary lines of Evolutionary Descent. Cell 45: 325-326.

Penny, D., M. D. Hendy and M. A. Steel, 1992 Progress with methods for constructing evolutionary trees. TREE 7: 73-79.

Pringsheim, E. G., 1963 Farblose Algen. Gustav Fischer Verlag, Stuttgart.

Rausch, H., N. Larsen and R. Schmitt, 1989 Phylogenetic relationships of the green alga Volvox carteri deduced from small-subunit ribosomal RNA comparisons. J. Mol. Evol. 29: 255-265.

Saitou, N. and M. Nei, 1987 The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.

Scherbel, G., W. Behn and C. G. Arnold, 1974 Untersuchungen zur genetischen Funktion des farblosen Plastiden von Polytoma mirum. Arch. Microbiol. 96: 205-222.

Schlösser, U. G., 1984 Species-specific sporangium autolysins (cell-wall-dissolving enzymes) in the genus Chlamydomonas, pp. 409-418 in Systematics of the Green Algae, edited by D. E. G. Irvine and D. John. Cambridge University Press, Cambridge.

Siemeister, G. and W. Hachtel, 1990 Structure and expression of a gene encoding the large subunit of ribulose-1,5-bisphosphate carboxylase (rbcL) in the colourless euglenoid flagellate Astasia longa. Plant Mol. Biol. 14: 825-833.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol. 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. I. Ultrastructural analysis of organelles. J. Cell Biol. 69: 362-370.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. J. Cell Biol. 69: 371-382.

Sober, E., 1988 Reconstructing the Past: Parsimony, Evolution and Inference. MIT Press, Cambridge, Mass.

Sogin, M. L., 1991 The Phylogenetic Significance of sequence diversity and length variations in eukaryotic small subunit ribosomal RNA coding regions.

Sogin, M. L., H. J. Elwood and J. H. Gunderson, 1986 Evolutionary diversity of eukaryotic small-subunit rRNA genes. Proc. Natl. Acad. Sci. USA 83: 1383-1387.

Stewart, C.-B., 1993 The powers and pitfalls of parsimony. Nature 361: 603-607.

Swofford, D. L., 1993 PAUP: Phylogenetic Analysis Using Parsimony, Version 3.5p. Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.

Takaiwa, F., K. Oono and M. Sugiura, 1984 The complete nucleotide sequence of a rice 17S rRNA gene. Nucl. Acids Res. 12: 5441-5448.

Turmel, M., R. R. Gutell, J.-P. Mercier, C. Otis and C. Lemieux, 1993 Analysis of the chloroplast large subunit ribosomal RNA gene from 17 Chlamydomonas taxa. J. Mol. Biol. 232: 446-467.

Vernon-Kipp, D., S. A. Kuhl and C. W. J. Birky, 1989 Molecular evolution of Polytoma, a non-green chlorophyte, pp. 284-286 in Physiology, Biochemistry, and Genetics of Nongreen Plastids, edited by C. T. Boyer, J. C. Shannon and R. C. Hardison. American Society of Plant Physiologists, Rockville, Maryland.

Wilcox, L. W., L. A. Lewis, P. A. Fuerst and G. L. Floyd, 1992 Group I introns within the nuclear-encoded small-subunit rRNA gene of three green algae. Mol. Biol. Evol. 9: 1103-1118.

Wilson, A. C., H. Ochman and E. M. Prager, 1987 Molecular time scale for evolution. TIG 3: 241-247.

Wolfe, K. H., C. W. Morden, S. C. Ems and J. D. Palmer, 1992 Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304-317.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1992 Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc. Nat. Acad. Sci. USA 89: 10648-10652.

CHAPTER IV

GENERAL BACKGROUND

Evidence indicating Functionality of Leucoplast Genomes in Nonphotosynthetic Species

Two separate but related issues are involved in the question of functionality in these unusual non-photosynthetic organisms - whether the leucoplast is functional as an organelle (participating in some biochemical cell function), and whether the leucoplast genome is functional genetically, transcribing and translating leucoplast genes. There is significant evidence for functionality of both entities. As described in Chapter I of this thesis, chloroplasts contain genes with three classes of function: photosynthetic, expression, and essential nonphotosynthetic (ENP) functions.

Of the more than 100 chloroplast genes, less than half code for photosynthetic proteins, and all chloroplast genomes sequenced contain several ORFs that may be candidate genes coding for as-yet-undetermined ENP-function proteins.

Chloroplasts, as organelles, have so many known functions other than photosynthesis that the need for the continued presence of the organelle in these nonphotosynthetic species should be in little doubt. Parts of numerous biochemical pathways involve the chloroplast: fatty acid, amino acid, and porphyrin biosynthesis; nitrite and sulfate reduction; and biosynthesis, storage and degradation of starch (Kirk and Tilney-Bassett 1978, Howe and Smith 1991, Weeden 1981, Browse and Somerville 1991, Fiedler and Schultz 1985). These pathways probably guarantee retention of the leucoplast organelle. In addition, proteins needed for any of these pathways might be plastid-encoded, thus ensuring the retention of proper genetic functioning of the leucoplast genome.

A few potential candidates for ENP leucoplast genes have been located by sequence homology searches in several photosynthetic algae, in a liverwort, and in the nonphotosynthetic angiosperm Epifagus. The Marchantia chloroplast genome may contain two genes involved in sulfate transport (Laudenbach and Grossman 1991). ORFs in the chloroplast genomes of Cylindrotheca sp. N1 and Cyanophora paradoxa show homology to genes from fatty acid, carotenoid and NAD biosynthetic pathways (Hwang and Tabita 1991; Michalowski, Loeffelhardt and Bohnert 1991; Michalowski, Flachmann, Loeffelhardt and Bohnert 1991). Of the four ORFs found in the Epifagus leucoplast genome, two have homology to known genes, so their function can be surmised. One Epifagus ORF is homologous to an E. coli gene (accD) believed to function in the first step of fatty acid synthesis. The second ORF is similar to the E. coli clpP gene, and may be involved in cleavage of leader peptides from proteins imported into the leucoplast. The function of the remaining two unidentified ORFs in the Epifagus leucoplast is unknown, but they show no homology to the four unidentified ORFs found in Astasia's leucoplast genome (Wolfe et al. 1992).

Continued need for genetic function in leucoplasts can be inferred from observations in all nonphotosynthetic species studied, including the Polytoma species which are the subject of this thesis. Microscopic observations and molecular analyses reveal selective retention and conservation of leucoplast ribosomes, leucoplast DNA, and (at a finer scale) intact and functional-looking leucoplast expression genes, even in leucoplast genomes only one-third to one-half the size of chloroplast genomes in close green relatives (Scherbel, Behn and Arnold 1974; Kieras and Chiang 1971; Siu, Chiang and Swift 1975b, 1975c, 1976; Vernon-Kipp et al. 1989; Siemeister and Hachtel 1989, 1990a, 1990b; Siemeister, Buchholz and Hachtel 1990; Wimpee et al. 1991, 1992; dePamphilis & Palmer 1990; Wolfe et al. 1991; Colwell 1995; D. Vernon and C. W. Birky, Jr., unpubl.).

In addition, northern and western analyses in Epifagus and Astasia reveal leucoplast RNA and protein products. Northern analysis in Astasia shows mRNAs for tufA and rbcL, and the protein product of rbcL (the large subunit of ribulose-1,5-bisphosphate carboxylase) is detectable on westerns (Siemeister and Hachtel 1990a, 1990b; Siemeister, Buchholz and Hachtel 1990). Northerns in Epifagus show several plastid rRNAs plus at least a half dozen mRNAs (dePamphilis and Palmer 1990).

Several genetic and/or antibiotic experiments provide additional support for genetic functionality. In normal photosynthetic Chlamydomonas species maintained under facultatively heterotrophic conditions, antibiotics were used to block plastid protein synthesis. Nuclear DNA synthesis, measured by density shift analysis, was found to be blocked as well, and was restored in mutants shown to carry antibiotic-resistance mutations in chloroplast rRNA or ribosomal protein genes (Blamire et al. 1974). Two Polytoma species (P. uvella UTX 964 and P. mirum SAG 62-3) show sensitivity to streptomycin, erythromycin and spectinomycin, which inhibit chloroplast protein synthesis in Chlamydomonas (S. Kuhl, D. Vernon & C. W. Birky, Jr., unpubl.).

In nonphotosynthetic Euglena mutants, leucoplast DNA has gross rearrangements, amplifications and/or deletions, but most mutants retain at least one copy of leucoplast rRNA genes (Heizmann et al. 1982, Hussein et al. 1982). Leucoplast genetic function was not addressed, although leucoplast rRNA transcription was seen in some mutants (Hussein et al. 1982).

Despite significant retention of most expression genes, some anomalous results from sequencing in Epifagus and Conopholis need explanation. Several leucoplast tRNA genes, leucoplast ribosomal protein genes, and all four leucoplast RNA polymerase subunit genes are missing from the leucoplast genome in Epifagus (Wolfe et al. 1992); and several leucoplast tRNA genes are apparently missing from the leucoplast genome in Conopholis (Wimpee et al. 1992b). Import (either passive or active) of these molecules from the cytoplasm would probably be necessary for continued genome functionality. On the other hand, several investigators use these data to suggest that the leucoplast genomes in those particular non-green species may no longer be genetically functional (Feierabend 1992, Wimpee et al. 1992b).

Future Genetic Experiments

The most direct evidence of leucoplast functionality listed above comes from the genetic experiments in temporarily non-photosynthetic Chlamydomonas. Even more compelling would be similar genetic results from permanently non-photosynthetic species. No genetic experiments addressing the issue of leucoplast functionality have been reported in either Astasia or Epifagus.

Unfortunately, we were unable to address leucoplast functionality in Polytoma species in the most direct way. We were hoping to apply the methods of Chlamydomonas genetics to these non-green algae, because genetic crosses in chlamydomonads can provide powerful evidence of the cellular location of a particular gene. With appropriate nuclear and plastid, and even mitochondrial, mutants as markers, one can ascertain the genome in which a gene resides by the unique patterns of inheritance in the progeny of a genetic cross. Nuclear genes will show Mendelian patterns in the progeny; and organellar genes will show non-Mendelian uniparental patterns, with plastid inheritance predominantly from one mating type and mitochondrial inheritance predominantly from the other mating type (Birky 1995). We tried several conventional ways to induce mating in several Polytoma species, but were unsuccessful in detecting mating events.

We are attempting, however, to demonstrate leucoplast functionality in Polytoma with another powerful method. Fortuitously, the two leucoplast genes (rrn16 and tufA) for which we already have sequences in several Polytoma species are implicated in the response to certain antibiotics, as elucidated by experiments done in Chlamydomonas and many other organisms. Wild-type Chlamydomonas cultures, grown heterotrophically, are sensitive to antibiotics (streptomycin and spectinomycin, among others) that inhibit protein synthesis on organelle ribosomes (Conde et al. 1975, Gillham 1994, Harris 1989). We have already demonstrated that Polytoma cell growth is also sensitive to these antibiotics (S. Kuhl, D. Vernon and C. W. Birky, Jr., unpubl.). These antibiotics could be acting on plastid ribosomes, or on mitochondrial ribosomes, or could have some other, unknown, effect. If it could be shown that these antibiotics were acting on plastid ribosomes, then that would mean the plastid must be making some essential non-photosynthetic (ENP) protein. Experiments in Chlamydomonas have produced exactly that result. In Chlamydomonas, mutants resistant to high concentrations of these two antibiotics are nearly all chloroplast mutants (Harris et al. 1989). Many show mutations in the chloroplast rrn16 gene. These mutational changes are not only very specific to the rrn16 gene, but also occur at only two or three characteristic locations in the DNA sequence of this gene (Noller et al. 1990, Cundliffe 1990), so sequencing reactions in rrn16 genes from mutants could be easily targeted to those specific areas of the gene. Mutant rrn16 genes with substitutions at these same characteristic locations have also been demonstrated in tobacco and E. coli (Yeh et al. 1994, Sigmund 1984). Corn is naturally resistant to spectinomycin, and its rrn16 gene shows the characteristic base changes (Schwarz and Kossel 1980).
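The targeted comparison described above, in which a mutant rrn16 sequence is inspected only at a few characteristic sites, can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in this work: the function name, the toy sequences, and the site positions are all hypothetical, and the two sequences are assumed to be already aligned to the same length.

```python
def substitutions_at_sites(wild_type, mutant, sites):
    """Return {position: (wild_type_base, mutant_base)} for each listed
    1-based alignment position at which the two sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be aligned to the same length")
    changes = {}
    for pos in sites:
        wt_base = wild_type[pos - 1].upper()
        mut_base = mutant[pos - 1].upper()
        if wt_base != mut_base:
            changes[pos] = (wt_base, mut_base)
    return changes


# Hypothetical toy data: real use would supply aligned rrn16 sequences
# and the characteristic resistance positions reported in the literature.
wild_type = "ACGTACGTAC"
mutant = "ACGTATGTAC"
candidate_sites = [3, 6, 9]  # assumed positions, for illustration only

print(substitutions_at_sites(wild_type, mutant, candidate_sites))
```

A resistant mutant that differs from wild type at one of the candidate sites and nowhere else would be the pattern of interest; a mutant with no change at any candidate site would point away from the plastid ribosome as the antibiotic target.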

Investigations involving such antibiotic-resistant mutants in several Polytoma species are already in progress. We have generated spectinomycin-resistant and streptomycin-resistant Polytoma mutants; they are ready to grow for lysis and sequencing. If the Polytoma sequences have mutations in the same characteristic sites in the rrn16 gene, we will have good evidence that the antibiotics are acting on leucoplast ribosomes, and that, therefore, the leucoplast ribosomes in Polytoma are functional and making some ENP proteins. Similarly, most mutants resistant to the antibiotic kirromycin show mutations at one of only several characteristic locations in the DNA sequence of the tufA gene (Anborgh and Parmeggiani 1991), so sequencing reactions could be targeted to those areas. The range of naturally-occurring tufA mutants includes eubacteria (Mesters et al. 1994, Landini et al. 1993, Vijgenboom et al. 1994), archaebacteria (Kessel and Klink 1981), and yeast mitochondria (Piechulla and Kuntzel 1983), as well as one Chlamydomonas mutant generated after mutagenesis with fluorodeoxyuridine (Harris 1989, and pers. comm.).

After a very long delay in obtaining kirromycin, we are now ready to select for kirromycin-resistant Polytoma mutants as well. Correlation of a physiological change to a change in the DNA sequence would let us go beyond what could be asserted after either northern or western analysis, because neither of those two analyses proves that the RNA or protein detected is functional.

Acknowledgments

Amplification and sequencing primers were kindly provided by P. Fuerst at the Ohio State University for rrn16 and by J. Palmer at Indiana University for tufA. C. Woese and G. Olsen's Ribosomal Data Base (Maidak et al. 1994) was invaluable, both for aligning my rrn16 sequences and for modeling rRNA secondary structures. A large array of algal, cyanobacterial and eubacterial tufA sequences, kindly provided by J. Palmer and C. Delwiche at Indiana University, was invaluable for aligning my tufA sequences. Also, C. Delwiche was the first to notice anomalies in the putative P.u.964 tufA sequence that I subsequently concluded was from a bacterial contaminant. We thank him for sharp eyes. Thanks to C. Delwiche for his willingness to review my tufA alignments, and to R. Gutell at the University of Colorado for his willingness to review my rRNA secondary structures. Thanks to D. Herrin at the University of Texas for saving the P. obtusum species in his culture collection, when it apparently got lost from all other collections. Finally, thanks to C. Breitenberger at the Ohio State University for her great help in analyzing the hypervariable region in EF-Tu.

CHAPTER IV

Evidence that Plastid rrn16 and tufA Genes are Functional in Three Non-Green Chlorophyte Algae in the Genus Polytoma.

INTRODUCTION

A number of unusual land plant and algal species have lost the ability to do photosynthesis and now fill niches as free-living heterotrophic algae or as parasites on other plants. They are recognized by their lack of chlorophyll. The plastids of these non-green species are called leucoplasts; a few have been extensively studied, representing one angiosperm lineage, one chlorophyte algal lineage, and the euglenoid algal lineage. The leucoplast genome from a nonphotosynthetic angiosperm, the beech root parasite Epifagus virginiana (Orobanchaceae), has been completely sequenced (Wolfe et al. 1992). Genome mapping and sequencing of leucoplast genomes from the oak parasite Conopholis americana (Orobanchaceae) and Astasia longa (Euglenophyta) are in progress (Siemeister and Hachtel 1989). Leucoplasts in these and other nonphotosynthetic species lack internal thylakoid membranes and, in some species, the leucoplast genomes are greatly reduced in size compared to photosynthetic relatives (dePamphilis and Palmer 1990, Wimpee et al. 1991, Siemeister and Hachtel 1989).

There is strong evidence that the leucoplast of these non-photosynthetic species is functional and contains a functional genome, even though photosynthesis has been lost. Chloroplasts contain genes with three types of functions: for photosynthesis, for expression of leucoplast genes (transcription and translation), and for essential functions other than photosynthesis. Potential candidates for essential nonphotosynthetic (ENP) leucoplast genes were found, by sequence homology, in the chloroplasts of two photosynthetic algae (Cylindrotheca sp. N1 and Cyanophora paradoxa), a liverwort (Marchantia), and in the leucoplast of the nonphotosynthetic angiosperm Epifagus virginiana (Hwang and Tabita 1991; Michalowski, Loeffelhardt and Bohnert 1991; Michalowski, Flachmann, Loeffelhardt and Bohnert 1991; Laudenbach and Grossman 1991; Wolfe et al. 1992). The potential functions of these candidate ENP genes come from the list of biochemical pathways known to take place partly in plastids: fatty acid, amino acid, and porphyrin synthesis; nitrite and sulfate reduction; and the biosynthesis, storage and degradation of starch (Kirk and Tilney-Bassett 1978, Howe and Smith 1991, Weeden 1981, Browse and Somerville 1991, Fiedler and Schultz 1985). An ORF in the chloroplast genome of Cylindrotheca may be involved in fatty acid synthesis (Hwang and Tabita 1991). ORFs in Cyanophora may be involved in carotenoid and NAD biosynthesis pathways (Michalowski, Loeffelhardt and Bohnert 1991; Michalowski, Flachmann, Loeffelhardt and Bohnert 1991). The Marchantia chloroplast genome may contain two genes involved in sulfate transport (Laudenbach and Grossman 1991). One of the four candidate ENP genes in Epifagus' leucoplast may be involved in cleavage of leader peptides from proteins imported into the leucoplast. Another ORF may contribute to an amino acid biosynthetic pathway (Wolfe et al. 1992).

Observational and genetic experiments provide additional indirect and direct evidence of leucoplast functionality. Leucoplasts with stored starch, ribosomes, and DNA can be demonstrated by microscopy or can be otherwise separated and detected using sucrose gradients or differential centrifugation, in all non-green species studied including the Polytoma species in the study reported here (Siu, Chiang and Swift 1975b, 1975c, 1976; Kieras and Chiang 1971; Scherbel, Behn and Arnold 1974; Vernon-Kipp et al. 1989; Morden et al. 1991; Wimpee et al. 1992a; Siemeister et al. 1990; D. Vernon and C. W. Birky, Jr., unpubl.). Selectively conserved rRNA and expression genes are found in Epifagus, Conopholis, and Astasia, as well as in Polytoma. Several leucoplast mRNAs and one leucoplast-encoded protein (rbcL) are seen in Astasia, and six mRNAs are seen in Epifagus (dePamphilis and Palmer 1990; Siemeister and Hachtel 1990a, 1990b).

In heterotrophically-maintained Chlamydomonas cultures, antibiotics that block chloroplast protein synthesis also block nuclear DNA synthesis, which is restored in mutants with mutations in chloroplast rRNA or ribosomal protein genes (Blamire et al. 1974). Several Polytoma species are also sensitive to antibiotics that block chloroplast protein synthesis in Chlamydomonas (S. Kuhl, D. Vernon, C. W. Birky, Jr., unpubl.).

Some observations in Euglena, Epifagus and Conopholis are problematic. Plastid DNA could not be detected by Southern hybridization in heat-bleached Euglena mutants (Conkling 1993), although plastid DNA was successfully separated and located in mutants using CsCl gradients (Heizmann et al. 1982, Hussein et al. 1982). The leucoplast in Epifagus lacks several leucoplast tRNA genes, several ribosomal protein genes and all four RNA polymerase subunit genes (Wolfe et al. 1992); and Conopholis lacks several leucoplast tRNA genes (Wimpee et al. 1992), even though all other expression genes are intact and appear functional. Observations on stress-induced nonphotosynthetic mutants, whose longevity has not been verified beyond a few decades, may not apply to non-photosynthetic species; perhaps only mutants that retain some plastid functions are viable long enough to be detected in nature as separate non-photosynthetic species. In the angiosperms, some type of passive or active import may exist to supply the products of the missing expression genes.

On the whole there is strong evidence that most, if not all, leucoplast genomes have at least one ENP gene, and consequently that most or all nonphotosynthetic species of plants and chlorophyte algae retain these leucoplast ENP genes together with those required for protein synthesis.

We are investigating the evolutionary consequences of the loss of photosynthesis in the chlorophyte algal genera Polytoma and Polytomella, nonphotosynthetic single-celled biflagellated algae belonging to the large clade of chlamydomonad algae (see Chapter III; P. Mackowski, unpubl.). Some investigations in Polytoma species are complete and are reported here. Polytoma algae presumably arose from nonphotosynthetic mutants of green algae similar to Chlamydomonas. They were able to survive because their green ancestors, like many species of Chlamydomonas, were facultative heterotrophs, so the mutants simply became obligate heterotrophs. The single large cup-shaped plastid in Polytoma does not have internal thylakoid membranes, but still contains ribosomes, DNA, rRNA and stored starch granules (Siu, Chiang and Swift 1975a; Gaffal 1978; Arnold and Blank 1980; Lang 1963; D. Vernon and C. W. Birky, Jr., unpubl.). Samples of thirteen Polytoma isolates, collected from globally distant locations, were obtained as described in Chapter III. These algae had all been classified by morphological and biochemical criteria as chlamydomonads (Pringsheim 1963). This classification was confirmed with a molecular phylogenetic analysis of sequences of the nuclear Rrn18 gene coding for the small-subunit rRNA, as reported in Chapter III. The tree topology shows a biphyletic origin for these 13 species, with 12 of the species grouping together in what we call the Polytoma uvella clade, while the thirteenth (P. sp. 62-27) is the sole representative of a second clade, separated from the P. uvella clade in all phylogenetic trees by at least one green lineage.

Leucoplast genes were investigated from three representative Polytoma species: P. uvella UTX 964 and P. obtusum DH1 from the P. uvella clade, and P. sp. SAG 62-27 (hereafter called P.u.964, P. obtusum and P.62-27). Two leucoplast genes were amplified and sequenced; both are plastid expression genes involved in translation of leucoplast mRNAs (rrn16, coding for the plastid small-subunit rRNA, and tufA, which codes for the plastid elongation factor Tu). For comparison, genes from two photosynthetic species (Chlamydomonas humicola SAG 11-9 and Chlamydomonas dysosmos UTX 2399) were also amplified and sequenced. The sequences of the leucoplast genes show no changes relative to those in green species that would be incompatible with function. These observations are consistent with the hypothesis that protein synthesis occurs in the leucoplast of Polytoma and is essential for the organism.

MATERIALS AND METHODS

Organisms

The algal strains were obtained from culture collections, subcloned, and grown in Polytomella medium, as described in Chapter III.

DNA preparation

The rrn16 gene of P.u.964 was located on a 6.2-kb HindIII insert in a leucoplast genomic library cloned in pBluescript. These clones were prepared from the leucoplast DNA fraction of a CsCl+bisbenzimide equilibrium gradient, after whole-cell DNA isolation with a modified version of a lysis method developed to yield high-molecular-weight chloroplast DNA from Chlamydomonas (Grant, Gillham and Boynton 1980), as previously described in Chapter II. DNA for use as sequencing template was obtained from the clone containing the gene, using alkaline lysis miniprep protocols (Sambrook et al. 1989), then purified using the GeneClean kit (Bio 101). All other genes in this study were amplified from partially purified whole-cell DNA isolated from CTAB lysates of 1 L algal cultures, as described in Chapter III.

Polymerase Chain Reaction (PCR) amplifications

Primers located near the ends of the rrn16 and tufA genes were used to obtain DNA templates for sequencing. The 5' and 3' primers for rrn16 were A-17 = 5'-GTTTGATCCTGGCTCAC-3' and 5005-15 = 3'-CATGTGTGGCGGGCA-5'. The 5' and 3' (degenerate) primers for all but one of the tufA genes were 1F = 5'-GGDCAYGTTGAYCAYGG-3' and 5R = 3'-TGACANCCRCGRCCRCA-5'. Minimal-degeneracy primers were designed by J. Palmer and by us, after inspection of known tufA sequences from various plants and algae. Primer 5R did not amplify tufA from P. obtusum, so the 3' ends of the tufA genes from the other chlamydomonad species were inspected for conserved areas, and an alternative 3' primer (1130R = 3'-CCRATACGGDCCACTRGC-5') was designed and used, located 100 bases 5' of the original 3' primer 5R. This amplified a tufA fragment from P. obtusum that was 100 bases shorter than the other chlamydomonad sequences. The rrn16 amplification products sequenced were 1300 bases long; the tufA amplification products were 1200 bases long, except for P. obtusum's sequence, which was 1100 bases long. Optimal amplification conditions were determined empirically for each gene; multiple separate amplifications were performed, pooled, and purified using GeneClean.

Sequencing

Both strands of each gene were sequenced, using a slightly modified version of the dsDNA Cycle Sequencing kit (Life Technologies), as previously described {CITE our J Phycol paper in press}. Most internal primers for sequencing were obtained from Paul Fuerst for the rrn16 gene and from Jeff Palmer for the tufA gene; additional internal primers in conserved regions were designed to fill gaps in sequence coverage.
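The degenerate tufA amplification primers quoted in the PCR section are written with IUPAC ambiguity codes (D, Y, R, N). As a minimal sketch of what "degeneracy" means for such a primer, the following Python counts and enumerates the distinct oligonucleotides a degenerate sequence represents. The helper names are illustrative assumptions; only the primer sequences are taken from the text, and the code table is the standard IUPAC nucleotide nomenclature.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """List every non-degenerate sequence the primer represents."""
    choices = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


# 5' tufA primer as quoted in the text (written 5' to 3').
forward = "GGDCAYGTTGAYCAYGG"
print(degeneracy(forward))
print(expand(forward)[:3])
```

Keeping this count small is one reading of the "minimal degeneracy" design goal mentioned above: each ambiguity code multiplies the number of oligonucleotide species in the primer mixture.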

Sequence Alignment-rrn16

rrn16 sequences were obtained from the RDP database (Maidak et al. 1994) for three green chlamydomonads, two species of Chlorella, and eight land plants. The five new sequences for the study reported herein (P.u.964, P. obtusum, P. 62-27, C. humicola and C. dysosmos) were added to the RDP sequences in an alignment array in SeqApp (Gilbert 1992), then hand-aligned to match the RDP alignment. This alignment was refined, by hand, for the most variable portions of the gene with the help of conserved patterns in the secondary structures of the 16S rRNA in three green algae (Chlamydomonas reinhardtii, Chlamydomonas moewusii, and Chlorella vulgaris) and three land plants (Zea mays, Marchantia polymorpha and Nicotiana tabacum). All secondary structures were retrieved from RDP datafiles.

1287 bps of the rrn16 gene were aligned (85% of the mature rRNA), excluding part or all of only three stem-loop structures where the variable number of bases between species made alignment ambiguous, and leaving out the first 29 5' positions and the last 146 3' positions for lack of data in some or all species.

Sequence Alignment-tufA

To assist alignment of tufA sequences, C. Delwiche and J. Palmer at Indiana University provided their alignment array, with 18 eubacterial, 8 cyanobacterial, 26 algal, and 4 land plant sequences (array described in Delwiche et al. 1995). The Polytoma and Chlamydomonas sequences were aligned to various subsets of this array using DNA sequences, but influenced by the resulting amino acid alignment.

1053 bps of the tufA gene were aligned (85% of the coding region), leaving out the first 72 5' positions and the last 96 3' positions for lack of data in some or all species.

Analysis of 16S Secondary Structures

Initial rRNA secondary structure models were formed for all three Polytoma sequences and the two green Chlamydomonas sequences, by visual comparison to the secondary structure model of Chlamydomonas reinhardtii (Gutell et al. 1985). These models are currently being reviewed by Robin Gutell at the University of Colorado, to see how well our analysis fits his extensive collection of rRNA structure models based on comparative analysis (Gutell et al. 1994, Woese and Pace 1993).

[Figure 12. Nucleotide alignment of a hypervariable region in tufA gene, showing Chlamydomonas-specific insertions. Triplet repeats and direct repeats are marked.]

[Figure 13. Amino acid alignment of a hypervariable region in tufA gene, showing chlamydomonad-specific insertions. Eubacteria, algae and land plants, and Chlamys and Polytomas are grouped; first and second insertion events are marked.]

[Figure: Protein structure of EF-Tu with the chlamydomonad-specific insertion shown.]

RESULTS

Primary and Secondary Structure Alignments of Sequences of Polytoma rrn16 Genes Are Consistent with Functionality

The three Polytoma rrn16 sequences could be unambiguously aligned at all positions except in three short regions. The sequences were easily folded into secondary structures very similar to those in green species, by visual comparison to the secondary structure model of C. reinhardtii (Gutell et al. 1985). The only alignment ambiguities are located in secondary structure stems 6, 11 and 29 (numbering as per Neefs et al. 1993).

Even though base substitution differences were large between the Polytoma sequences and their closest green relatives, there were very few single-base or multi-base indel (insertion or deletion) differences among sequences, so alignment was not unduly confounded by the high base substitution differences. The rrn16 sequences of P. 62-27 and C. humicola have base differences at 9.5% of the aligned sites, but there are only 4 single-base indels (plus an 11-base insertion in the region corresponding to stem 37-C in C. humicola). The two P. uvella sequences have bases different from C. humicola at 14-15% of aligned sites, but they differ from C. humicola by only 18 and 16 single-base indels, respectively, plus either four or three short multi-base indels (2 to 6 bases long). The net effect of all indels on the overall length of the genes is that the P. 62-27 rrn16 sequence is only 16 bases longer than in C. moewusii, its closest green relative; and the two P. uvella rrn16 sequences are only 22 and 13 bases longer, respectively, than in C. humicola, their closest green relative. Again, therefore, alignment was not unduly confounded by the high number of base substitutions.
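The quantities reported above (percent of aligned sites that differ, and counts of single-base and multi-base indels) can be tallied from a pairwise alignment as in the following Python sketch. It is an illustration under stated assumptions, not the procedure used in this study: the function name and toy sequences are hypothetical, gaps are assumed to be written as "-", a run of consecutive gapped columns is counted as one indel, and percent difference is computed over ungapped columns only.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Summarize substitutions and indels between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0          # columns where both sequences have a base
    substitutions = 0     # compared columns where the bases differ
    indel_lengths = []    # length of each run of gapped columns
    run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue      # column gapped in both sequences: ignore
        if a == gap or b == gap:
            run += 1      # extend the current indel
            continue
        if run:
            indel_lengths.append(run)
            run = 0
        compared += 1
        if a != b:
            substitutions += 1
    if run:
        indel_lengths.append(run)
    percent = 100.0 * substitutions / compared if compared else 0.0
    return {
        "compared_sites": compared,
        "substitutions": substitutions,
        "percent_different": percent,
        "single_base_indels": sum(1 for n in indel_lengths if n == 1),
        "multi_base_indels": sum(1 for n in indel_lengths if n > 1),
    }


# Hypothetical toy alignment, for illustration only.
summary = compare_aligned("ACGTAC--GTTA", "ACCTACAAGT-A")
print(summary)
```

Separating the two tallies mirrors the argument in the text: substitutions change the percent difference but leave column positions intact, whereas indels shift positions and are what make an alignment ambiguous.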

Alignment analysis was aided by the presence of 38 of 48 stems and 98 of 121 loops that were invariant in size in all three Polytoma sequences, as well as in all the available green chlamydomonad sequences. (The remaining two stems and three loops at the 5' and 3' ends of the gene could not be analyzed for lack of data in some or all sequences.)

Neither large indels nor truncations were seen in these Polytoma rrn16 sequences. All indels and base substitutions seen were compatible with the typical secondary structure folding patterns expected in a functional rRNA molecule (Gutell et al. 1985, 1994).

Phylogenetic Information from Polytoma rrn16 Sequences, and Additional Evidence of Functionality

Several single-base or multi-base insertion events were observed in stem regions in either C. humicola plus the two species from the P. uvella clade, or just in the two P. uvella species, that involved compensating pairs of insertions (insertions on both sides of the stem). Three examples of these also provide confirmation from leucoplast genes of the phylogenetic relationships inferred from our analysis using the nuclear Rrn18 gene, reported in Chapter III. An 11-base insertion in stem 37-C seems to have occurred only in the phylogenetic lineage leading to C. humicola and the P. uvella clade, since the sequences we obtained from two representative P. uvella species also have insertions in stem 37-C which are not present in sequences from any of the other green or nongreen species in our array. The lineage leading to C. humicola and the P. uvella clade also shows compensating multi-base insertion events on both sides of stem 6, resulting in a longer, but well-bonded, stem 6 in C. humicola than in C. reinhardtii. The two P. uvella sequences show even longer compensating multi-base insertions on both sides of stem 6 than does C. humicola, resulting in even longer stems 6 than in C. humicola, but still well-bonded. In stem 25, compensating single-base deletions (relative to C. reinhardtii) on both sides of the stem are seen only in the two P. uvella sequences, so these matching deletion events probably occurred after the split of the P. uvella clade from C. humicola.

As also seen with indel differences, of the large number of base substitutions in the Polytoma sequences, many are members of a pair of compensating substitutions on both sides of a stem, where the second substitution results in a site where hydrogen bonds can form across the stem, as was the case before the first substitution occurred. A portion of these are compensating substitutions seen in P.u.964 but not in P. obtusum (or vice versa), and therefore probably occurred not only after the P. uvella clade split from C. humicola, but also after these two non-green Polytoma species diverged from each other, i.e. after photosynthesis was lost.

The presence of these compensating mutations is strong evidence for the continuation of selective processes, implying continued functionality after these Polytoma species became non-green. However, these data could also be seen in sequences that were functional for an extended period of time in the past and have recently become non-functional. We cannot determine which is the case here.
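The idea of a compensating substitution can be made concrete with a small sketch: for each base pair in a stem, compare a derived sequence with a reference and ask whether both partners changed while pairing was preserved. The function names, the pairing rules (Watson-Crick plus G-U wobble), and the four-base toy stem are illustrative assumptions, not the comparison method used in this study.

```python
# Base pairs assumed able to hydrogen-bond across a stem:
# Watson-Crick pairs plus the G-U wobble pair.
PAIRABLE = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _rna(strand: str) -> str:
    """Normalize to upper-case RNA letters so DNA input (T) also works."""
    return strand.upper().replace("T", "U")


def can_pair(x: str, y: str) -> bool:
    return (_rna(x), _rna(y)) in PAIRABLE


def classify_stem(ref_5p: str, ref_3p: str, new_5p: str, new_3p: str) -> list:
    """Label each base pair of a stem relative to a reference stem.

    Both strands are written 5'->3', so position i of the 5' strand
    pairs with position (n - 1 - i) of the 3' strand.
    """
    ref_5p, ref_3p, new_5p, new_3p = map(_rna, (ref_5p, ref_3p, new_5p, new_3p))
    if len({len(ref_5p), len(ref_3p), len(new_5p), len(new_3p)}) != 1:
        raise ValueError("all four strands must have the same length")

    n = len(ref_5p)
    labels = []
    for i in range(n):
        j = n - 1 - i
        changed_5p = ref_5p[i] != new_5p[i]
        changed_3p = ref_3p[j] != new_3p[j]
        paired = can_pair(new_5p[i], new_3p[j])
        if not (changed_5p or changed_3p):
            labels.append("unchanged")
        elif not paired:
            labels.append("disrupted")
        elif changed_5p and changed_3p:
            labels.append("compensating")
        else:
            labels.append("single, still paired")
    return labels


# Toy stem: G-C at the second pair becomes A-U in the derived sequence
labels = classify_stem("GGCA", "UGCC", "GACA", "UGUC")
```

A pair labelled "compensating" here corresponds to the situation described above, where a second substitution restores the ability to form hydrogen bonds across the stem.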

Amino Acid Sequences of Polytoma Leucoplast tufA Genes are Consistent with Functionality

The two Polytoma tufA base sequences (and the sequences from the two green chlamydomonads, C. humicola and C. dysosmos), and the inferred amino acid sequences of the corresponding proteins, could also be unambiguously aligned, except in one short hypervariable region (see Figures 12 and 13). There were no indel differences between the Polytoma sequences and their closest green relatives, except in this hypervariable region. In addition, although the tufA genes of P. 62-27 and C. humicola show base substitution differences of 15%, most of those differences are at third codon positions. P. 62-27 and C. humicola only differ at 4% of first and second position sites, so alignment was not confounded by the number of base substitutions at third position sites. While the P. obtusum tufA sequence showed much more nucleotide divergence from C. humicola (26% base substitutions for all codon positions and 10% for first+second positions), these genes could also be easily aligned at the DNA level. Translation of nucleotides to amino acids results in 6% amino acid divergence between P. 62-27 and C. humicola and 15% amino acid divergence between P. obtusum and C. humicola.
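The split between divergence at all codon positions and divergence at first+second positions can be sketched as follows. The function name and toy sequences are hypothetical, and the sketch assumes an in-frame alignment whose first column is a first codon position; it is not the program used to obtain the percentages reported above.

```python
def divergence_by_codon_position(seq_a: str, seq_b: str, gap: str = "-") -> dict:
    """Percent differences for all sites, first+second positions, and third positions.

    Assumes the two coding sequences are aligned in frame and that
    column 0 is a first codon position. Gapped columns are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    sites = [0, 0, 0]   # compared sites per codon position
    diffs = [0, 0, 0]   # differing sites per codon position
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a == gap or b == gap:
            continue
        pos = i % 3
        sites[pos] += 1
        if a != b:
            diffs[pos] += 1

    def pct(d: int, s: int) -> float:
        return 100.0 * d / s if s else 0.0

    return {
        "all": pct(sum(diffs), sum(sites)),
        "first+second": pct(diffs[0] + diffs[1], sites[0] + sites[1]),
        "third": pct(diffs[2], sites[2]),
    }


# Toy coding sequences differing only at two third positions
result = divergence_by_codon_position("ATGGCTAAA", "ATGGCCAAG")
```

In the toy example all differences fall at third positions, the same qualitative pattern described for P. 62-27 and C. humicola.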

No indels (except in the hypervariable region) and no premature stop codons were seen in the Polytoma sequences in the 85% of the genes sequenced. Excluding the hypervariable region, all amino acid substitutions in the P. 62-27 sequence, relative to green chlamydomonads, were substitutions also seen in at least one of the functional algal, cyanobacterial or eubacterial genes in the alignment array sent by Palmer & Delwiche. Similarly, all but six amino acid substitutions in the P. obtusum sequence, relative to green chlamydomonads, were also seen in other functional genes in the alignment array, and those six amino acid substitutions were all conservative substitutions. These data, plus the analysis below of the likely minimal effects of the hypervariable region on the structure of the protein, are all consistent with continued functionality of the Polytoma tufA genes.
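A check for in-frame stop codons of the kind mentioned above can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the three standard stop codons (TAA, TAG, TGA) and a known reading frame; it is an illustration, not the procedure used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed standard stop codons


def in_frame_stops(cds: str, frame: int = 0) -> list:
    """Return 0-based codon indices of in-frame stop codons.

    Alignment gaps ('-') are removed first. For a partial gene fragment,
    any index returned would indicate a premature stop.
    """
    cds = cds.upper().replace("-", "")
    codons = [cds[i:i + 3] for i in range(frame, len(cds) - 2, 3)]
    return [idx for idx, codon in enumerate(codons) if codon in STOP_CODONS]


# Toy fragments, not real tufA sequence
clean = in_frame_stops("ATGAAAGGG")
broken = in_frame_stops("ATGAAATAAGGG")
```

An empty list for a sequenced fragment corresponds to the observation that no premature stop codons were seen.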

A Hypervariable Region in Chlamydomonad tufA Sequences

The hypervariable region starts at amino acid 354 in the amino acid sequence of Thermus thermophilus and ends at amino acid 361, shown in Figures 12 and 13. The tufA sequences shown in those Figures, except for the last three chlamydomonad sequences, are part of the array provided by C. Delwiche and J. Palmer (described in Delwiche et al. 1995). Thermus thermophilus is used as the "type" sequence in Figures 12, 13 and 14 because its protein structure is known from X-ray crystallography (Berchtold et al. 1993).

Between amino acids 354 and 361 in T.t., all known cyanobacterial, algal, and land plant sequences have at least five or six more amino acids than do the many known eubacterial sequences. The four chlamydomonad sequences shown at the bottom of the array in Figures 12 and 13 have an extra 14 to 16 amino acids, relative to eubacterial sequences. An ancient insertion, perhaps due either to strand slippage or a , in the lineage leading to the ancestors of cyanobacteria could account for the five or six extra amino acids in all cyanobacterial/algal/plant lineages. However, observation of the cyanobacterial sequences in this region of the gene did not reveal direct repeats that might be indicative of the ends of such a insertional event. Perhaps more recent substitution events have obscured evidence of more ancient events. Observation of the chlamydomonad sequences, however, does reveal evidence that might indicate two chlamydomonad-specific mutational events. The lineage leading to the chlamydomonad algae (including the two Polytoma sequences) appears to have experienced an additional insertion event or events, somewhere in the region between amino acids 354 and 360 in T.t., which expanded the extra amino acids (relative to eubacteria) from 5-6 amino acids to 14-16 amino acids.

Observation of the four chlamydomonad sequences offers some clues to the type and location of insertional events that may have occurred, which would be expected to be more recent events than the cyanobacterial insertion for which no evidence was found in the DNA sequences. In Figure 12, the direct repeat ACATTCAAA is found near the beginning and the end of this hypervariable region in C. reinhardtii. With slight variations, this direct repeat is also seen in P. 62-27 and C. humicola; and remnants of the repeat are still visible in P. obtusum. The sequence immediately 3' of the first ACATTCAAA in C. reinhardtii also shows some homology to the sequence immediately 3' of the second ACATTCAAA. This may be evidence of a transposable element insertion. In addition, there is indication of perhaps a strand slippage event unique to the lineage leading to C. humicola. In the middle of this hypervariable region, C. humicola has an in-frame 4X repeat of the trinucleotide GCA (or an out-of-frame 5X repeat of CAG or AGC). This trinucleotide repeat is not found in the other three chlamydomonads, it can be easily aligned as an insertion relative to C. reinhardtii and P. 62-27, and it makes the C. humicola sequence the longest of any species in this hypervariable region.
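How the same stretch of DNA can read as a 4X repeat of one trinucleotide and a 5X repeat of another depends on where counting starts. The sketch below counts the longest back-to-back run of a motif; the function name and the toy sequence are assumptions for illustration, and the toy sequence is not the C. humicola sequence.

```python
import re


def longest_tandem_run(seq: str, motif: str) -> tuple:
    """Return (start_index, copies) for the longest run of back-to-back
    copies of `motif` in `seq`, or (-1, 0) if the motif is absent."""
    best = (-1, 0)
    pattern = f"(?:{re.escape(motif.upper())})+"
    for match in re.finditer(pattern, seq.upper()):
        copies = len(match.group(0)) // len(motif)
        if copies > best[1]:
            best = (match.start(), copies)
    return best


# Toy sequence constructed to show the frame dependence
toy = "TCAGCAGCAGCAGCAGTT"
gca = longest_tandem_run(toy, "GCA")
cag = longest_tandem_run(toy, "CAG")
agc = longest_tandem_run(toy, "AGC")
```

In this toy sequence the run of GCA starts at a codon boundary when the sequence is read from its first base, whereas the longer run of CAG starts one base earlier and is out of frame.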

Evidence of either of the chlamydomonad-specific insertional events is not as obvious in the P. obtusum sequence, and the alignment of the hypervariable region of the P. obtusum sequence to the rest of the chlamydomonads is questionable. The higher substitution rate in the P. obtusum sequence, relative to the other three chlamydomonad sequences, has probably obscured the signature sequences that are more obvious in the other three chlamydomonad sequences. (This rate difference is detailed and quantified in Chapter V.) However, portions of the ACATTCAAA direct repeats are still detectable in P. obtusum, and there is no evidence of the GCA repeats seen in C. humicola.

The alignment of the four chlamydomonad sequences relative to each other, even with a somewhat questionable alignment for P. obtusum, is almost certainly a better re-creation of phylogenetically alignable positions within the chlamydomonads in this hypervariable region than is the alignment shown in Figures 12 and 13 for the relationship of the chlamydomonads relative to the cyanobacteria and other algae. In addition to the alignment depicted in Figures 12 and 13, two other possible alternative alignments were found. In a second possible alignment, all but one of the five or six extra amino acids in the cyanobacteria/algae/land plants align very well at the 3' end of the hypervariable region, not near the 5' end as shown. In a third possible alignment, these five or six extra amino acids align in the middle of the hypervariable region, and span the indel gap created by the GCA repeats in C. humicola.

However, the alignment ambiguities in this hypervariable region do not affect an investigation of the probable effect of these extra amino acids on the potential functionality of the EF-Tu proteins in the two Polytoma species. The location of this hypervariable region is shown in Figure 14 on the EF-Tu protein structure of Thermus thermophilus, using HyperChem software. The EF-Tu protein is comprised of three domains. Domain 1 binds GTP and the Mg2+ cofactor (shown), and the charged tRNA (not shown) is now known to bind in a groove between Domains 1 and 2 (Berchtold et al. 1993, Nissen et al. 1995). The hypervariable region in cyanobacteria, algae and land plants lies along the outside surface of Domain 3. The location where the extra 5-16 amino acids would be in those species (somewhere between amino acids 354 and 360 in T.t.) is labelled in Figure 14.

DISCUSSION

Probable Effect of Hypervariable Region on Functionality of the Two Polytoma EF-Tu Proteins

In the hypervariable region, the differences between the P. 62-27 tufA sequence and the C. reinhardtii tufA sequence (which codes for a known functional EF-Tu protein, Baldauf and Palmer 1990) are fairly minor. The lengths of the two sequences in this hypervariable region are identical, and there are only two amino acid differences between the two sequences, both of which are conservative substitutions. Those few differences appear to be consistent with a still-functional EF-Tu protein in P. 62-27. The C. humicola sequence has only two more amino acids in this hypervariable region than does the (functional) C. reinhardtii sequence, plus five amino acid differences between the two sequences (four of which are conserved substitutions). These few differences also appear to be consistent with a still-functional protein (which would be expected in this photosynthetic chlamydomonad). The P. obtusum sequence in this hypervariable region has only one more amino acid than the C. reinhardtii sequence, but there are 18 amino acid differences between the two sequences, only 9 of which are conserved substitutions. Can the amino acid composition of the EF-Tu protein in P. obtusum change that much in the hypervariable region and still be functional? We submit that the answer may be yes, for the following reasons. The location of the hypervariable region is along the outside surface of Domain 3, which is well away from any known binding sites except part of the tRNA (Nissen et al. 1995).

Extra amino acids located along the extreme outside surface of the protein would probably not adversely affect initial folding of the protein or the conformational changes known to occur during catalysis (Berchtold et al. 1993). The amino acid composition for the hypervariable region in P. obtusum is even more hydrophilic than are the sequences in the other three chlamydomonads (including the functional C. reinhardtii), suggesting that none of the hypervariable regions in our three new chlamydomonad sequences would disrupt protein folding because they would probably not tend toward a position in the interior of the protein. A two-dimensional analysis of the drawings included in the recently reported tRNA location by Nissen et al. appears to show that the hypervariable loop of EF-Tu and the tRNA lie close to each other, but a larger number of amino acids would probably bulge in a different direction than near the tRNA. Therefore, it does not appear that extra amino acids in the hypervariable region would adversely affect tRNA binding, but three-dimensional analysis of this question using HyperChem awaits the report of the atom coordinates of the tRNA.
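One common way to compare how hydrophilic two peptide segments are is to average a per-residue hydropathy value over each segment. The sketch below uses the Kyte-Doolittle scale, which is an assumption: the text does not state which hydrophilicity measure was applied. The function name and the toy peptides are likewise illustrative only.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
# Use of this particular scale is an assumption for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average hydropathy of a peptide; gaps and unknown letters are ignored."""
    residues = [r for r in peptide.upper() if r in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("peptide contains no recognized amino acids")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


# Toy peptides, not the actual hypervariable-region sequences
hydrophilic = mean_hydropathy("KDEK")
hydrophobic = mean_hydropathy("AILV")
```

A lower average for one segment than for another would be consistent with that segment being more hydrophilic and therefore more likely to remain on the protein surface.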

Perhaps most importantly, observation of the sequences on Figures 12 and 13 shows that proteins known to be functional can tolerate a lot of length variability and a lot of amino acid variability in this region. Cyanobacterial, algal and plant proteins tolerate 5-6 more amino acids between position 354 and 360 than the five labelled on Figure 14 for T.t., and those proteins are still functional. The C. reinhardtii protein tolerates 14 more amino acids between 354 and 360 than shown in Figure 14, and it is still functional. Given that comparison, it does not seem to be such a stretch to consider that the two Polytoma EF-Tu proteins (and C. humicola's protein) might still be functional with one or two more amino acids in this region than in C. reinhardtii plus some non-conservative substitutions.

Phylogenetic Information from the Hypervariable Region

The phylogenetic relationships inferred from amino acid sequences in this hypervariable region contradict the phylogeny reconstruction done in Chapter III. In phylogenetic trees based on complete sequences of the nuclear Rrn18 gene (or gene trees from either of the two plastid genes rrn16 and tufA), P. 62-27 is more closely related to C. humicola than to C. reinhardtii. However, only 2 of the 23 amino acids in the hypervariable region between T.t. positions 354 and 360 are not identical in P. 62-27 and C. reinhardtii, while P. 62-27 and C. humicola differ from each other by 5 amino acid substitutions plus 4 amino acid indels. This could be a case of convergent evolution, or it could be a statistical artifact due to the small number of substitutions involved. It could also be further evidence that one or two additional strand slippage or insertion events may have occurred exclusively in the lineage leading to C. humicola.

In any case, signatures for a possible chlamydomonad-specific insertion event are visible in both Polytoma sequences, as well as in both green Chlamydomonas sequences. These chlamydomonad-specific signatures and the similar lengths of the hypervariable regions in all four chlamydomonad sequences, together with the tufA sequence divergence data described in Chapters IV and V, should firmly establish these two Polytoma tufA sequences as plastid chlamydomonad genes. This makes unlikely the possibility that they represent contaminating genes that were accidentally amplified.

Strand Slippage Event in C. humicola tufA is Common

The trinucleotide repeat seen in the hypervariable region of tufA in C. humicola (an in-frame 4X GCA repeat or out-of-frame 5X repeats of CAG or AGC) is a sequence commonly seen repeated as stretches of five or more repeats in many functional genes (Morell 1993). Sometimes the number of triplet repeats becomes very large. The resulting genes are sometimes still functional, as is the case for the gene for glucocorticoid receptor in rat (Gearing et al. 1993), or the TFIID gene in humans (Hoffmann et al. 1990, Hancock 1993). However, very large numbers of CAG repeats are seen in genes responsible for Huntington's disease and Kennedy's disease (MacDonald et al. 1993), while CGG repeats are implicated in fragile X syndrome and CTG repeats are implicated in myotonic dystrophy (Suthers et al. 1992).

rrn16 and tufA Sequences in Polytoma are Consistent with Functionality

Analysis of leucoplast rrn16 and tufA in three Polytoma species shows sequences that are all consistent with continued functionality of both plastid genes. There are no major insertions or deletions in the sequences, no truncations of the proteins caused by STOP codons, and no apparent disruption of the normal form and bonding patterns in either the rRNAs or the EF-Tu protein.

In addition, nucleotide substitution rates in the tufA sequences at first and second codon positions are much lower than the rates at third positions, implying that first and second positions are (or were, until recently) more constrained by selection than are third positions. This signature of selection implies either that the genes are still functional, or that the genes were functional for some extended time in the past and have only recently become non-functional. Unfortunately, we cannot determine which is the case here.

However, it is significant that the leucoplast sequences from P. obtusum and P.u.964 do not show any mutations that would necessarily render the gene products non-functional. The rate comparison study described in Chapter V shows much higher substitution rates in these leucoplast genes in P. obtusum and P.u.964 than in green species. Accelerated evolution in these two members of the P. uvella clade suggests that the P. uvella clade comes from a very old nonphotosynthetic lineage with enough time to acquire nonsense mutations and frameshifts that would inactivate EF-Tu and destroy the secondary structure of the 16S rRNA, and yet we see nothing in these P. uvella clade sequences to indicate non-functionality.

On the other hand, P. 62-27 leucoplast genes do not show increased substitution rates. A likely explanation is that a P. 62-27 ancestor may have lost photosynthesis much more recently than it was lost in the lineage leading to the P. uvella clade. We cannot verify this hypothesis on the phylogenetic tree because we did not locate green species as close to P. 62-27 as the green species we found relative to the P. uvella clade. This hypothesis would imply that the likelihood of a P. 62-27 leucoplast gene acquiring an indel or substitution that inactivates it is very small. Therefore, the apparent functional appearance of rrn16 or tufA from P. 62-27 may say nothing about whether the gene is truly still functional.

The observations in the study reported here coincide with those seen in leucoplast rRNA genes and protein genes that have plastid expression functions in Epifagus, Conopholis and Astasia. While the substitution rates in most expression genes in those nonphotosynthetic species are higher than in photosynthetic relatives, the expression genes are, for the most part, present and intact, and appear functional (dePamphilis and Palmer 1990, Wimpee et al. 1992a, Siemeister and Hachtel 1990b).

Further Investigations of Leucoplast Functionality

Serendipitously, mutations in the two leucoplast genes sequenced for this study (rrn16 and tufA) may be able to be correlated with known phenotypes. In Chlamydomonas, mutants resistant to high concentrations of the antibiotics spectinomycin and kirromycin are nearly all chloroplast mutants (Harris 1989, Harris et al. 1989). Many show mutations in the chloroplast rrn16 gene (in spectinomycin experiments) and in the tufA gene (in kirromycin experiments). Spectinomycin-resistant mutants have characteristic mutations in rrn16 genes from Chlamydomonas to tobacco to E. coli (Harris et al. 1989, Noller et al. 1993, Cundliffe et al. 1993, Yeh et al. 1994, Sigmund 1984); and kirromycin-resistant mutants with tufA mutations have been found in Chlamydomonas, archaebacteria and eubacteria (Harris 1989, Kessel and Klink 1981, Mesters et al. 1994, Landini et al. 1993, Vijgenboom et al. 1994). We should therefore be able to provide evidence of leucoplast functionality in Polytoma almost as powerful as from a genetic cross if we can show a correlation between resistance to an antibiotic and a change in the DNA sequence of the corresponding leucoplast genes. These antibiotic experiments are in progress.

Literature Cited

Anborgh, P. H. and A. Parmeggiani, 1991 New antibiotic that acts specifically on the GTP-bound form of elongation factor Tu. The EMBO Journal 10: 779-784.

Arnold, C. G. and R. Blank, 1980 Three-Dimensional structure of mitochondria and plastids in Chlamydomonas reinhardtii and Polytoma papillatum. Walter de Gruyter & Co., New York.

Birky, C. W., 1995 Uniparental inheritance of mitochondrial and chloroplast genes. Proc. Natl. Acad. Sci. USA 92: 11331-11338.

Blamire, J., V. R. Fletchner and R. Sager, 1974 Regulation of nuclear DNA replication by the chloroplast in Chlamydomonas. Proc. Nat. Acad. Sci. USA 71: 2867.

Bold, H. C. and M. J. Wynne, 1985 Introduction to the Algae. Prentice-Hall, Inc., Englewood Cliffs, N.J.

Browse, J. and C. Somerville, 1991 Glycerolipid synthesis: biochemistry and regulation. Annu. Rev. Plant. Physiol. Plant Mol. Biol. 42: 467-506.

Colwell, A., 1994 Genome evolution in a non-photosynthetic plant, Conopholis americana. Washington University.

Conde, M. F., J. E. Boynton, N. W. Gillham, E. H. Harris, C. L. Tingle and W. L. Wang, 1975 Chloroplast genes in Chlamydomonas affecting organelle ribosomes. Molec. Gen. Genet. 140: 183.

Cundliffe, E., 19? Recognition Sites for Antibiotics within rRNA, in The Ribosome: Structure, Function, and Evolution, edited by W. E. Hill, P. B. Moore, A. Dahlberg et al. The American Society for Microbiology, Washington, DC.

dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Ettl, H. and U. G. Schlosser, 1992 Towards a revision of the systematics of the genus Chlamydomonas (Chlorophyta). 1. Chlamydomonas applanata Pringsheim. Bot. Acta 105: 323-330.

Feieraband, J., 1992 Conservation and structural divergence of organellar DNA and gene expression in non-photosynthetic plastids during ontogenetic differentiation and phylogenetic adaption. Bot. Acta 105: 227-231.

Fiedler, E. and G. Schultz, 1985 Localization, purification, and characterization of shikimate oxidoreductase-dehydroquinate hydrolase from stroma of spinach chloroplasts. Plant Physiol. 79: 212-218.

Gaffal, K. P., 1978 Configural changes in the plastidome of Polytoma papillatum after completion of cytokinesis and during fusion of the gametes. Protoplasma 94: 175-191.

Gillham, N. W., 1994 Organelle Genes and Genomes. Oxford University Press, Oxford.

Harris, E. H., 1989 The Chlamydomonas Sourcebook. Academic Press, Inc., New York.

Harris, E. H., B. D. Burkhart, N. W. Gillham and J. E. Boynton, 1989 Antibiotic resistance mutations in the chloroplast 16S and 23S rRNA genes of Chlamydomonas reinhardtii: Correlation of genetic and physical maps of the chloroplast genome. Genetics 123: 282-292.

Heizmann, P., Y. Hussein, P. Nicolas and V. Nigon, 1982 Modifications of chloroplast DNA during streptomycin induced mutagenesis in Euglena gracilis. Curr. Genet. 5: 9.

Holwerda, B. C., S. Jana and W. L. Crosby, 1986 Chloroplast and mitochondrial DNA variation in Hordeum vulgare and Hordeum spontaneum. Genetics 114: 1271-1291.

Howe, C. J. and A. G. Smith, 1991 Plants without chlorophyll. Nature 349: 109.

Hussein, Y., P. Heizmann, P. Nicolas and V. Nigon, 1982 Quantitative estimations of chloroplast DNA in bleached mutants of Euglena gracilis. Current Genetics 6: 111.

Hwang, S. R. and F. R. Tabita, 1991 Acyl carrier protein-derived sequence encoded by the Chloroplast genome in the marine diatom Cylindrotheca sp. Strain N1. J Biol Chem 266: 13492-13494.

Kessel, M. and F. Klink, 1981 Eur. J. Biochem. 114: 481-486.

Kieras, F. J. and K.-S. Chiang, 1971 Characterization of DNA components from some colorless algae. Exp. Cell. Res. 64: 89-96.

Kimura, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

Kirk, J. T. O. and R. A. E. Tilney-Bassett, 1978 The Plastids. Elsevier, North Holland, Amsterdam and New York.

Landini, P., M. Bandera, A. Soffientini and B. P. Goldstein, 1993 J. Gen. Microbiol. 139: 769-774.

Lang, N. J., 1963 Electron-microscopic demonstration of plastids in Polytoma. J. Protozool. 10: 333-339.

Laudenbach, D. E. and A. R. Grossman, 1991 Characterization and mutagenesis of Sulfur-regulated Genes in a Cyanobacterium: Evidence for Function in Sulfate Transport. J. Bacteriol 173: 2739-2750.

Maidak, B. L., N. Larsen, M. J. McCaughey, R. Overbeek, G. J. Olsen, K. Fogel, J. Blandy and C. R. Woese, 1994 The Ribosomal Database Project. Nucleic Acids Research 22: 3485-3487.

Mesters, J. R., L. A. H. Zeef, R. Hilgenfeld, J. M. de Graaf, B. Kraal and L. Bosch, 1994 The structural and functional basis for the kirromycin resistance of mutant EF-Tu species in Escherichia coli. The EMBO Journal 13: 4877-4885.

Michalowski, C. B., R. Flachmann, W. Loeffelhardt and H. J. Bohnert, 1991 Gene nadA, encoding Quinolinate Synthetase, is located on the Cyanelle DNA from Cyanophora paradoxa. Plant Physiol 95: 329-330.

Michalowski, C. B., W. Loeffelhardt and H. J. Bohnert, 1991 An ORF323 with homology to crtE, specifying Prephytoene Pyrophosphate Dehydrogenase, is encoded by Cyanelle DNA in the eukaryotic alga Cyanophora paradoxa. J Biol Chem 266: 11866-11870.

Morden, C. W., K. H. Wolfe, C. W. dePamphilis and J. D. Palmer, 1991 Plastid translation and transcription genes in a non-photosynthetic plant: intact, missing and pseudo genes. The EMBO Journal 10: 3281-3288.

Neefs, J. M., Y. Van de Peer, P. DeRijk, S. Chapelle and R. De Wachter, 1993 Compilation of small ribosomal subunit RNA structures. Nucleic Acids Research 21: 3025-3049.

Noller, H. F., D. Moazed, S. Stern, T. Powers, P. N. Allen, J. M. Robertson, B. Weiser and K. Triman, 19? Structure of rRNA and its functional interactions in Translation, in The Ribosome: Structure, Function, and Evolution, edited by W. E. Hill, P. B. Moore, A. Dahlberg et al. The American Society for Microbiology, Washington, DC.

Ohyama, K., H. Fukuzawa, T. Kohchi, H. Shirai, T. Sano, S. Sano, K. Umesono, Y. Shiki, M. Takeuchi, Z. Chang, S.-I. Aota, H. J. Inokuchi and H. Ozeki, 1986 Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322: 572-574.

Piechulla, B. and H. Kuntzel, 1983 Eur. J. Biochem. 132: 235-240.

Reardon, E. M. and C. Price, 1994 Nomenclature of Sequenced Plant Genes. Plant Molecular Biology Report 12: S1-81.

Scherbel, G., W. Behn and C. G. Arnold, 1974 Untersuchungen zur genetischen Funktion des farblosen Plastiden von Polytoma mirum. Arch. Microbiol. 96: 205-222.

Schwarz, Z. S. and H. Kossel, 1980 The primary structure of 16S rDNA from Zea mays chloroplast is homologous to E. coli 16S rRNA. Nature 283: 739-742.

Shinozaki, K., M. Ohme, M. Tanak, T. Wakasugi, N. Hayashida, T. Matsubayashi, N. Zaita, J. Chungwongse, J. Obokata, K. Yamaguchi-Shinozaki, C. Ohto, K. Torozawa, B. Y. Meng, A. Sugita, H. Deno, T. Kamogashira, K. Yamada, J. Kusuda, F. Takaiwa, A. Kato, N. Tohdoh, H. Shimada and M. Sugiura, 1986 The complete nucleotide sequence of tobacco chloroplast genome: its gene organization and expression. EMBO 5: 2043-2050.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432.

Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Siemeister, G. and W. Hachtel, 1990 Organization and nucleotide sequence of ribosomal RNA genes on a circular 73 kbp DNA from the colourless flagellate Astasia longa. Curr. Genet. 17: 433-438.

Siemeister, G. and W. Hachtel, 1990 Structure and expression of a gene encoding the large subunit of ribulose-1,5-bisphosphate carboxylase (rbcL) in the colourless euglenoid flagellate Astasia longa. Plant Mol. Biol. 14: 825-833.

Sigmund, C. D., M. Ettayebi and E. A. Morgan, 1984 Antibiotic resistance mutations in 16S and 23S ribosomal RNA genes of Escherichia coli. Nucleic Acids Research 12: 4653-4663.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432.

Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Siemeister, G. and W. Hachtel, 1990 Organization and nucleotide sequence of ribosomal RNA genes on a circular 73 kbp DNA from the colourless flagellate Astasia longa. Curr. Genet. 17: 433-438.

Siemeister, G. and W. Hachtel, 1990 Structure and expression of a gene encoding the large subunit of ribulose-1,5- bisphosphate carboxylase (rbcl>) in the colourless euglenoid flagellate Astasia longa. Plant Mol. biol. 14: 825-833. 149 Sigmund, C. D., M. Ettayebi and E. A. Morgan, 1984 Antibiotic resistance mutations in 16S and 23S ribosomal RNA genes of Escherichia coli. Nucleic Acids Research 12: 4653- 4663.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 3 69-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432.

Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Sigmund, C. D., M. Ettayebi and E. A. Morgan, 1984 Antibiotic resistance mutations in 16S and 23S ribosomal RNA genes of Escherichia coli. Nucleic Acids Research 12: 4653- 4663.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382. 150

Vijgenboom, E., L. P. Woudt, P. W. H. Heinstra, K. Rietveld, J. van Haarlem, G. P. van Wezel, S. Shochat and L. Bosch, 1994 Three tuf-like genes in the kirromycin producer Streptomyces ramocissimus. Microbiology 140: 983-998.

Weeden, N. F., 1981 Genetic and biochemical implications of the endosymbiotic origin of the chloroplast. J. Mol. Evolution 17: 133-139.

Wimpee, C. F., R. Morgan and R. Wrobel, 1992 An aberrant plastid ribosomal RNA gene cluster in the root parasite Conopholis americana. Plant Mol. Biol. 18: 275-285.

Wimpee, C. F., R. Morgan and R. L. Wrobel, 1992 Loss of transfer RNA genes from the plastid 16S-23S ribosomal RNA gene spacer in a parasitic plant. Curr. Genet. 21: 417-422.

Wimpee, C. F., R. L. Wrobel and D. K. Garvin, 1991 A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol. Biol. 17: 161-166.

Wolfe, A. D. and C. W. dePamphilis, 1995 Alternate paths of evolution for the photosynthetic gene rbcL in four nonphotosynthetic species of Orobanche. Plant Molecular Biology (submitted).

Wolfe, K. H., C. W. Morden, S. C. Ems and J. D. Palmer, 1992 Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304-317.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1991 Ins and outs of plastid genome evolution. Curr. Opinion Genet. Develop. 1: 523-529.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1992 Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc. Nat. Acad. Sci. USA 89: 10648-10652.

Yeh, K. C., K. Y. To, S. W. Sun, M. C. Wu, T. Y. Lin and C. C. Chen, 1994 Point mutations in the chloroplast 16S rRNA gene confer streptomycin resistance in Nicotiana plumbaginifolia. Current Genetics 26: 132-135.

CHAPTER V

GENERAL BACKGROUND

Origins of Relative Rate Tests

Relative rate tests were designed as part of analyses involving the molecular clock hypothesis, which holds that the rate of evolution is constant across lineages (Zuckerkandl and Pauling 1962). The amino acids and lineages chosen in early tests of this hypothesis provided significant evidence in its support (reviewed in Wilson et al. 1977). However, some genes and lineages fail to show constancy (e.g., King and Jukes 1969, Wu and Li 1985). New examples continue to appear in support of either side of the sometimes lively debate concerning the appropriate range(s) for a clock and the explanations for exceptions found (e.g., Chao and Carr 1993). One of the genes used in the studies reported in this thesis, rrn16, is a gene for which global tests show constancy over a broad range of lineages (Ochman and Wilson 1987).

One of several problems that beset early investigations of the molecular clock hypothesis was the uncertainty in estimates of the times when species diverged. Sarich & Wilson proposed a relative test that makes it unnecessary to know absolute times of divergence (Sarich and Wilson 1973).

Wu & Li's mathematical approach to the relative rate test is a marvelously insightful combination of phylogenetic, geometric and algebraic logic (Wu and Li 1985). The rationale is that when the analysis is limited to lineages that converge on the same ancestral node, the time elapsed from that common ancestor to each of the extant species being compared must be the same, because both lineages started from the same ancestor. In its simplest form, illustrated in Fig. 15, any difference between the evolutionary divergences K01 and K02 (from the outgroup to taxon 1 and from the outgroup to taxon 2) must be due to a difference in the evolutionary rates along the branches connecting taxa 1 and 2 to their common ancestor (A). Relative rate tests were originally used with amino acid data to investigate variations in evolutionary rates.

Mathematical Rationale for Relative Rate Tests

Wu & Li extended the relative rate test to DNA, rather than amino acid, analyses and gave it a formal mathematical justification (Wu and Li 1985). To perform a relative rate test, one must first have some phylogenetic knowledge about the two species whose rates are to be compared, in order to choose an outgroup species that is more distantly related to both of them than they are to each other. This phylogenetic knowledge allows a 3-taxon tree to be established, as shown in Figure 15. The geometry of a 3-taxon tree then lets one separate the observed divergences (after correction for multiple hits) between all three pairs of species into x, y and z components. The geometric logic is that the divergence (x) from the outgroup (O) to the nearest common ancestor (A) of the two species being compared (taxa 1 and 2) is the same number, whether that divergence is part of the total divergence observed between outgroup O and taxon 1 or part of the total divergence observed between outgroup O and taxon 2. With that logic, equations can be derived that express x, y and z in terms of the observed divergences K01, K02 and K12. At that point, three algebraic equations with three unknowns can be solved with minor algebraic manipulation. The actual rate comparison, then, involves comparing y and z, which are the portions of the observed divergences that have been apportioned mathematically from the nearest common ancestor (A) to each of the two species being compared (where y = KA1 and z = KA2). This comparison can be expressed either as the ratio y/z (KA1/KA2) or as the difference (y - z = KA1 - KA2). Tables 8 and 9 show both ways to express these differences.

Table 7. Single-base substitution rate changes in non-green plastid genes

Non-green    Gene                            Rate increase                     Relative rates obtained by
species                                      in non-greens                     Tree                    Wu & Li algebra

Epifagus     rrn16                           40X                               parsimony & distance
             tufA                            (in nucleus)
             rbcL                            (grossly truncated pseudogene)
             rrn23                           5X                                parsimony & distance
             orf1, clp, mat                  2X, 3X, 4X                        parsimony & distance
             accD, orf2                      1X, 1X                            parsimony & distance
             average of rRNAs, ribosomal     3X to 8X                          parsimony & distance
               proteins & pooled tRNAs

Conopholis   rrn16                           40X                               parsimony
             tufA                            (in nucleus)

Astasia      rbcL                            3X (note: gene looks functional)  distance
             rrn23                           1.7X                              distance

Polytoma     rrn16                           3.3X to 4.3X for P.u.964          parsimony & distance    RR
             rrn16                           2.9X to 4.0X for P. obtusum       parsimony & distance    RR
             rrn16                           0.5X to 0.8X for P. 62-27         parsimony & distance    RR
             tufA                            3.3X to 4.7X for P. obtusum       parsimony & distance    RR
             tufA                            0.9X to 1.4X for P. 62-27         parsimony & distance    RR

To evaluate the significance of any rate differences, the null hypothesis is that the rate from the non-green species (taxon 1) to the closest common ancestor (A) is the same as the rate from the green species (taxon 2) to the common ancestor (A), or H0: y - z = 0, or y/z = 1. The difference can be evaluated for statistical significance by calculating the variance and standard error, and noting any overlaps in standard errors at the 5% or 1% significance level. If the number of substitutions between species exceeds 20, statistical significance can be evaluated relative to a normal distribution. Relative rate tests are most robust mathematically when the length of the DNA sequence being compared exceeds 250 nucleotides (which is the case for both the rrn16 and tufA sequences evaluated in my study). Also, the outgroup needs to be phylogenetically located as close as possible to the two species being compared, since the further away the outgroup, the more the calculations minimize any rate differences that exist (Li, Luo and Wu 1985). In the evaluations reported here, enough green chlamydomonad sequences are available that fairly close green species could be chosen as outgroups for the rate tests.
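The three-taxon algebra described in this section can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used to produce the tables in this chapter: the function name `three_taxon_branches` and the input divergences are hypothetical, and the inputs are assumed to have been corrected for multiple hits already.

```python
from dataclasses import dataclass


@dataclass
class BranchLengths:
    x: float  # outgroup (O) to common ancestor (A)
    y: float  # A to taxon 1, i.e. KA1
    z: float  # A to taxon 2, i.e. KA2


def three_taxon_branches(k01: float, k02: float, k12: float) -> BranchLengths:
    """Solve K01 = x + y, K02 = x + z, K12 = y + z for x, y and z."""
    x = (k01 + k02 - k12) / 2.0
    y = (k01 - k02 + k12) / 2.0
    z = (k02 - k01 + k12) / 2.0
    return BranchLengths(x, y, z)


# Made-up corrected divergences, chosen only to illustrate the arithmetic.
b = three_taxon_branches(k01=0.20, k02=0.12, k12=0.14)
print(f"x = {b.x:.3f}, y = {b.y:.3f}, z = {b.z:.3f}")
print(f"difference y - z = {b.y - b.z:.3f}, ratio y / z = {b.y / b.z:.2f}")
```

Note that the difference y - z reduces algebraically to K01 - K02, so it does not depend on K12, whereas the ratio y/z does.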

Accelerated evolutionary rates seen in functional plastid genes in other non-green species

Extensive relative rate analyses have been performed in leucoplasts of two non-photosynthetic lineages, the angiosperm Epifagus virginiana and the euglenoid alga Astasia longa. In both non-photosynthetic species, substitution rate increases were found in almost all leucoplast genes still present in the two leucoplast genomes (Wolfe et al. 1992, Morden et al. 1991, Siemeister et al. 1990). The genes studied and the range of rate increases found for leucoplast genes in these two organisms, plus rate increases seen in Conopholis americana, a less-well-studied non-photosynthetic angiosperm, along with a summary of the rate increases I found in Polytoma, are shown in Table 7.

Acknowledgments

Bob Rumpf provided much help with tree-making software, and Stacy Seibert provided the same for the MEGA software. Their time is much appreciated. Amplification and sequencing primers were kindly provided by P. Fuerst at the Ohio State University for rrn16 and by J. Palmer at Indiana University for tufA. C. Woese's and G. Olsen's Ribosomal Database Project was invaluable, both for aligning my rrn16 sequences and for modeling rRNA secondary structures. A large array of algal, cyanobacterial and eubacterial tufA sequences, kindly provided by C. Delwiche and J. Palmer at Indiana University, was invaluable for aligning my tufA sequences. Also, C. Delwiche was the first to notice anomalies in the putative P.u.964 tufA sequence that I subsequently concluded was from a bacterial contaminant. We thank him for his willingness to analyze my tufA alignments and for his sharp eyes. Finally, thanks to E. Harris at Duke University, C. dePamphilis at Vanderbilt University, and A. Wolfe, formerly at Vanderbilt University but currently at the Ohio State University, for numerous helpful discussions.

CHAPTER V

Analysis of Evolutionary Rates in Two Plastid Genes in the

Nonphotosynthetic Alga Polytoma

INTRODUCTION

Non-photosynthetic land plants and chlorophyte algae are

interesting natural experiments on the evolutionary

consequence of the loss of a significant cell function.

After losing the ability to do photosynthesis, non-

photosynthetic species use various alternative carbon

sources, the plants becoming parasitic on other plants while

the algae take up complex organic molecules from their

environment. Recognized by their lack of chlorophyll, these

non-green organisms provide unique plastid ("leucoplast")

genomes that exhibit several effects from the loss of

photosynthesis.

In photosynthetic species, in addition to genes for

photosynthetic functions, chloroplasts contain genes with two

other classes of function: expression genes (involved in

transcription and translation of chloroplast genes), and genes

for essential nonphotosynthetic (ENP) functions. Three plant

chloroplast genomes have been completely sequenced (Ohyama et al. 1986, Shinozaki et al. 1986, Holwerda et al. 1986), and several algal chloroplast genomes are partially sequenced.

Of the approximately 120 genes located in these chloroplast genomes, fewer than half code for proteins involved in photosynthesis. The rest are expression genes and ORFs, some with unknown functions. The expression genes include four rRNA genes, four RNA polymerase subunit genes, 30 tRNA genes and 20 of the 60 ribosomal protein genes.

A few potential candidates for ENP-function leucoplast genes have been located, by homology searches (Laudenbach and

Grossman 1991, Hwang and Tabita 1991, Wolfe et al. 1992c).

The proteins encoded may provide functions in some of the following biosynthetic pathways, portions of which are known to take place in chloroplasts: fatty acid, amino acid and porphyrin synthesis; nitrite and sulfate reduction; and the biosynthesis, storage and degradation of starch (Kirk and

Tilney-Bassett 1978, Howe and Smith 1991, Michalowski et al.

1991a, 1991b).

Only a few investigations of these unusual non-photosynthetic organisms have been attempted, limited to several parasitic angiosperms (Scrophulariaceae and Orobanchaceae); several Polytoma species, which are chlorophyte algae; and the euglenoid alga Astasia (Siu et al. 1975b, 1976; dePamphilis and Palmer 1990; Siemeister et al. 1989; Colwell 1994; Wimpee et al. 1991, 1992a; Wolfe et al. 1991, 1992a; Wolfe and dePamphilis 1996). The leucoplasts of these non-photosynthetic species differ from the chloroplasts of their close green relatives in numerous features.

Morphologically, they are characterized by reduction or elimination of the thylakoid membranes. They all retain leucoplast ribosomes and leucoplast DNA. However, their leucoplast genomes are often reduced in size and complexity compared to chloroplast genomes in photosynthetic relatives.

These leucoplast genomes allow investigation of the effects of selection on rates of evolution. When the ability to do photosynthesis is lost, the photosynthetic genes lose their function and consequently are no longer subject to selection; they become pseudogenes and can be lost entirely.

In contrast, the leucoplast genes needed for transcription and translation of the leucoplast genome probably remain functional and subject to at least some degree of selection.

The leucoplast genomes of the beech root parasite Epifagus virginiana (Orobanchaceae), the oak parasite Conopholis americana (Orobanchaceae), and the euglenoid Astasia longa display these characteristics. Most expression genes are still present in the leucoplast genomes and appear intact, and some leucoplast RNA and protein products have been demonstrated (dePamphilis and Palmer 1990; Wimpee et al. 1992a; Siemeister and Hachtel 1990a, 1990b). Conversely, many leucoplast genes that coded for photosynthetic proteins appear nonfunctional in key domains, are grossly truncated, or are absent from these leucoplast genomes. Despite selective retention of most expression genes, however, the leucoplast genome in Epifagus is missing more than a dozen expression genes (tRNA genes, ribosomal protein genes, and all four RNA polymerase subunit genes), and the leucoplast genome in Conopholis is apparently missing several leucoplast tRNA genes (Wolfe et al. 1992b, Wimpee et al. 1992b).

The tempo of evolution has also changed in the

leucoplast genomes of these non-photosynthetic species.

Most, but not all, of the presumably still-functional genes

analyzed in leucoplasts show an increased rate of nucleotide

substitution compared to rates in their green relatives.

Most of the rate increases found in functional leucoplast genes in Astasia and Epifagus are in the range of 1.5X to 8X, while rrn16 genes in Epifagus and Conopholis show 40X rate increases.

A significant number of non-photosynthetic chlorophyte

algal species are known whose leucoplast genomes have not yet

been subject to such detailed investigation at the molecular

level: the genus Prototheca, closely related to the

photosynthetic Chlorella; and the genera Polytomella and

Polytoma. Phylogenetic analyses of Rrn18 sequences show that

the latter two genera are polyphyletic but fall within the

clade that includes all Chlamydomonas species as well as a

number of other photosynthetic genera (reported in Chapter

III; P. Mackowski, unpubl.). We report here results from molecular investigations in several Polytoma leucoplast genomes.

Many species of Chlamydomonas are facultative auxotrophs, capable of utilizing acetate as their sole carbon and energy source. Nonphotosynthetic mutants are readily isolated in C. reinhardtii (Harris 1989). It is assumed that Polytoma species arose as nonphotosynthetic mutants of facultative auxotrophs similar to the extant Chlamydomonas. The single large cup-shaped leucoplast in Polytoma does not have thylakoid membranes, but still contains ribosomes, DNA, rRNA and stored starch granules (Lang 1963; Siu et al. 1975b, 1976; Scherbel, Behn and Arnold 1974; Vernon-Kipp et al. 1989). Polytoma is sensitive to inhibitors of chloroplast protein synthesis, which is additional evidence that the leucoplast is synthesizing at least one protein that is essential for auxotrophic growth and reproduction (S. Kuhl, D. Vernon, C. W. Birky, Jr., unpubl.).

Isolates analyzed came from culture collections of numerous Polytoma species collected from Europe and the United States. The nuclear Rrn18 gene, which codes for the small-subunit rRNA of cytoplasmic ribosomes, was cloned or amplified and sequenced from 13 Polytoma species and the sequences were subjected to phylogenetic analysis, reported in Chapter III. The tree topology showed two independent origins for these 13 species, with 12 of the species grouping together in the P. uvella clade, while the thirteenth Polytoma species (P. 62-27) is the sole representative of a second clade.

To investigate the evolution of leucoplast genes in Polytoma, two representatives were chosen from the P. uvella clade (P. uvella 964 and P. obtusum DH1), plus P. 62-27 from the second clade. Two leucoplast genes were amplified and sequenced: rrn16, coding for the plastid small subunit rRNA, and tufA, which codes for the plastid elongation factor Tu. For comparison, these genes from two closely related photosynthetic relatives (Chlamydomonas humicola SAG 11-9 and C. dysosmos UTX 2399) were also amplified and sequenced. The two species from the P. uvella clade show increased substitution rates in both rrn16 and tufA, compared to green relatives, while P. 62-27 does not. The increase appears to be due to a combination of increased mutation rate and relaxed selection.

MATERIALS AND METHODS

Organisms

P. uvella 964 (UTX 964) and P. 62-27 (SAG 62-27) were obtained from The University of Texas Culture Collection of Algae and from Sammlung von Algenkulturen Göttingen, respectively. P. obtusum (designated strain DH1 by us) was obtained from David Herrin at the University of Texas, Austin; it originally came from L. Provasoli's collection at Yale. All cultures were subcloned once or twice, and grown in Polytomella medium, as described in Chapter III.

Figure 15. 3-taxon tree for relative rate test (Wu & Li 1985).

A: Tree topology for relative rate test to determine base substitution rate differences along non-green lineages (to Taxon 1) versus green lineages (to Taxon 2).

B: rrn16 sequences used in relative rate tests.

C: tufA sequences used in relative rate tests. C. reinhardtii used as Outgroup (O) for both sets of tests.

D: Algebraic equations to solve for base substitutions along various lineages:

\[
K_{01} = x + y, \qquad K_{02} = x + z, \qquad K_{12} = y + z
\]
\[
y = K_{A1} = \frac{K_{01} + K_{12} - K_{02}}{2}
\]
\[
z = K_{A2} = K_{12} - K_{A1}
\]
\[
\text{rate difference} = \frac{y}{z} = \frac{K_{A1}}{K_{A2}} \quad \text{or} \quad y - z = K_{A1} - K_{A2}
\]

Figure 16. Codon usage in tufA genes from two green species plus P. 62-27. Legend: RSCU in C. reinhardtii, RSCU in C. humicola, RSCU in P. 62-27.

Figure 17. Codon usage in tufA genes from two green and two Polytoma species. Legend: RSCU in C. reinhardtii, RSCU in C. humicola, RSCU in P. 62-27, RSCU in P. obtusum.

DNA preparation

The rrn16 gene of P.u.964 was located as a 6.2-kb HindIII insert in a leucoplast genomic library cloned in pBluescript. These clones were prepared from the leucoplast DNA fraction of a CsCl + bisbenzimide equilibrium gradient, after whole-cell DNA isolation with a modified version of a lysis method developed to yield high-molecular-weight chloroplast DNA from Chlamydomonas (Grant, Gillham and Boynton 1980), as previously described in Chapter II. DNA for use as sequencing template was obtained from the clone containing the gene using alkaline lysis miniprep plasmid preparation protocols (Sambrook et al. 1989), then purified using the GeneClean kit (Bio 101). All other genes in this study were amplified from partially purified whole-cell DNA isolated from CTAB lysates of 1-L algal cultures, as described in Chapter III.

Polymerase Chain Reaction (PCR) amplifications

Primers located near the ends of the rrn16 and tufA genes were used to obtain DNA templates for sequencing. The 5' and 3' primers for rrn16 were A-17 = 5'-GTTTGATCCTGGCTCAC-3' and 5005-15 = 3'-CATGTGTGGCGGGCA-5'. The 5' and 3' (degenerate) primers for all but one of the tufA genes were 1F = 5'-GGDCAYGTTGAYCAYGG-3' and 5R = 3'-TGACANCCRCGRCCRCA-5'. Primer 5R did not amplify tufA from P. obtusum, so the 3' ends of the tufA genes from the other chlamydomonad species were inspected for conserved areas, and an alternative 3' primer (1130R = 3'-CCRATACGGDCCACTRGC-5') was designed and used, located 100 bases further 5' of the original 3' primer 5R. This amplified a tufA fragment from P. obtusum that was 100 bases shorter than the other chlamydomonad sequences. The rrn16 amplification products sequenced were 1300 bases long; the tufA amplification products were 1200 bases long, except for P. obtusum's sequence, which was 1100 bases long. Optimal amplification conditions were determined empirically for each gene; multiple separate amplifications were performed, pooled, and purified using GeneClean.
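The degenerate positions in the tufA primers use the standard IUPAC nucleotide codes (for example R = A or G, Y = C or T, D = A, G or T, and N = any base). As a rough illustration, and assuming the primer sequences are exactly as printed in this section, the following Python sketch counts how many distinct oligonucleotides a degenerate primer represents and rewrites a primer given 3' to 5' in the conventional 5' to 3' orientation. The helper and variable names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def flip_orientation(primer_3_to_5: str) -> str:
    """Rewrite a primer printed 3'->5' in 5'->3' order (reversal only)."""
    return primer_3_to_5[::-1]


tufa_forward = "GGDCAYGTTGAYCAYGG"       # printed 5'->3' in the text
tufa_reverse_3to5 = "TGACANCCRCGRCCRCA"  # printed 3'->5' in the text

print(degeneracy(tufa_forward), degeneracy(tufa_reverse_3to5))
print(flip_orientation(tufa_reverse_3to5))
```

Reversal alone is enough here because the reverse primer is already written as the strand that anneals; no complementing is needed to read it 5' to 3'.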

Sequencing

Both strands of each gene were sequenced, using a slightly-modified version of the dsDNA Cycle Sequencing kit

(Life Technologies), as previously described in Chapter III.

Most internal primers for sequencing were obtained from Paul

Fuerst for the rrnl6 gene and from Jeff Palmer for the tufA gene; additional internal primers in conserved regions were designed to fill gaps in sequence coverage.

Sequence Alignment-rrn16

rrn16 sequences were obtained from the RDP database for three green chlamydomonads, two species of Chlorella, and eight land plants. The five sequences that comprise the study reported herein (P.u.964, P. obtusum, P. 62-27, C. humicola and C. dysosmos) were added to the RDP sequences in an alignment array in SeqApp (Gilbert 1992), then hand-aligned to match the RDP alignment. This alignment was refined, by hand, for the most variable portions of the gene with the help of conserved patterns in the secondary structures of the 16S rRNA in three green algae (Chlamydomonas reinhardtii, Chlamydomonas moewusii, and Chlorella vulgaris) and three land plants (Zea mays, Marchantia polymorpha and Nicotiana tabacum). All secondary structures were retrieved from RDP datafiles.

1287 bp of the rrn16 gene were aligned (85% of the mature rRNA), excluding part or all of only three stem-loop structures where the variable number of bases between species made alignment ambiguous, and leaving out the first 29 5' positions and the last 146 3' positions for lack of data in some or all species.

Sequence Alignment-tufA.

To assist alignment of tufA sequences, C. Delwiche and J. Palmer at Indiana University provided their alignment array, with 18 eubacterial, 8 cyanobacterial, 26 algal, and 4 land plant sequences (array described in Delwiche et al. 1995). The Polytoma and Chlamydomonas sequences were aligned to various subsets of this array using DNA sequences, but influenced by the resulting amino acid alignment.

1053 bp of the tufA gene were aligned (85% of the coding region), leaving out the first 72 5' positions and the last 96 3' positions for lack of data in some or all species.

Relative Rate Calculations.

To quantify any potential differences between rates of nucleotide (or amino acid) substitution in the three Polytoma species studied, compared to green species, relative rate tests were performed (Sarich and Wilson 1973). The method of Wu and Li (1985) was used, employing the geometry of a three-taxon phylogenetic or gene tree to separate the observed sequence divergences between all three pairs of species in such a tree into x, y, or z components, as shown in Figure 15. The corrected sequence divergence (K01) between an outgroup (O) and a non-green species (taxon 1) can be separated geometrically and algebraically into x and y components. Similarly, the other two observed divergences in this three-species comparison can be expressed algebraically in terms of x, y or z components. Algebraic manipulation allows one to solve for y = KA1 and z = KA2, which are the substitution rates from a common ancestor (A) to a non-green species (taxon 1) and to a green species (taxon 2). Then, any rate differences between green lineages and non-green lineages can be expressed either as the difference (y - z) or as the ratio (y/z) of these substitution rates. To evaluate the significance of any rate differences, the null hypothesis is that the rate from a non-green species (taxon 1) to the closest common ancestor (A) shared by the non-green and its closest green relative (taxon 2) is the same as the rate from the green species (2) to the common ancestor (A), or H0: y - z = 0, or y/z = 1. Both the difference and the ratio are shown in Tables 8 and 9, but the difference was evaluated for statistical significance.

Chlamydomonas reinhardtii was the outgroup for each comparison. Chlamydomonas humicola was the closest known green relative for which sequence data are available, except in the case of the rrn16 gene of P. 62-27, for which Chlamydomonas moewusii was the closest known green relative.

Table 8. Differences in (or ratio of) the number of nucleotide substitutions per site between non-green Polytoma lineages (Taxon 1) and green Chlamydomonas lineages (Taxon 2), for the plastid rrn16 gene.

Method to apportion substitutions per lineage

                             For all alignable positions                      For stems only       For loops only
Species compared             Wu & Li algebra      Parsimony   Neighbor-       Wu & Li algebra      Wu & Li algebra
(Taxon 1, non-green, to                           tree        joining tree
Taxon 2, green)              KA1-KA2   KA1/KA2    KA1/KA2     KA1/KA2         KA1-KA2   KA1/KA2    KA1-KA2   KA1/KA2

P.u.964 to C. humicola        0.093     4.2X       4.3X        3.3X            0.112     5.5X       0.071     3.2X
P. obtusum to C. humicola     0.074     3.2X       4.0X        2.9X            0.102     4.4X       0.039     2.1X
P. 62-27 to C. eugametos     -0.015     0.8X       0.5X        0.7X           -0.014     0.8X      -0.016     0.7X

Table 9. Differences in (or ratio of) the number of nucleotide substitutions per site between non-green Polytoma lineages (Taxon 1) and green Chlamydomonas lineages (Taxon 2), for the plastid tufA gene.

Method to apportion substitutions per lineage

Species compared             Wu & Li algebra      Parsimony   Neighbor-       Wu & Li algebra
(Taxon 1, non-green, to                           tree        joining tree
Taxon 2, green)              KA1-KA2   KA1/KA2    KA1/KA2     KA1/KA2         KA1-KA2   KA1/KA2

                             For all alignable nucleotide positions in the gene   For all alignable amino acids
P. obtusum to C. humicola     0.138     3.6X       4.7X        3.3X            0.086     3.3X
P. 62-27 to C. humicola       0.025     1.4X       0.9X        1.1X           -0.003     0.9X

                             For all 1st + 2nd codon positions only               For all non-synonymous sites
P. obtusum to C. humicola     0.052     2.9X       2.5X        3.2X            0.048     2.9X
P. 62-27 to C. humicola       0.003     1.1X       1.3X        1.3X            0.000     1.0X

                             For all 3rd codon positions only                     For all synonymous sites
P. obtusum to C. humicola     0.450     5.8X       4.6X        5.1X            0.978     11.5X
P. 62-27 to C. humicola       0.090     1.5X       1.0X        2.2X            0.178     1.6X

For both Tables 8 and 9, Wu & Li's relative rate test was used (Wu & Li 1985). Substitutions in Table 9 were categorized as synonymous and non-synonymous using the weighted amino acid pathway method of Nei & Gojobori (1986). The outgroup reference species for all comparisons was C. reinhardtii.

Table 10. Base composition (as % G + C) in chlamydomonad rrn16 and tufA plastid genes

Species           Total plastid   rrn16    tufA, by codon position
                  genome                   all 3    1       2       1+2     3

Greens
Chlorella         --              53.2     --       --      --      --      (35.6)a
C. reinhardtii    36              53.3     --       --      --      --      --
C. moewusii       --              50.0     --       --      --      --      --
Gonium            --              --       --       --      --      --      (34.5)a
Pandorina         --              --       --       --      --      --      (34.2)a
C. humicola       --              51.1     38.6     57.1    37.8    47.5    21.0

Non-greens
P. 62-27          --              48.3     34.1     56.3    36.6    46.5    9.4
P. obtusum        17              46.0     44.1     56.4    37.6    47.0    39.3
P.u.964           17              44.1     --       --      --      --      --

a = only partial tufA sequences are available for these three green species

Table 11. Codon usage in chlamydomonad tufA plastid genes

                  Number of codons not used in tufA      Number of codons used more
                                                         frequently than expected
Species           Total   End in G/C   End in A/T        RSCU > 1.5   RSCU > 3

Greens
C. reinhardtii    23      20           3                 16           5
C. humicola       19      14           5                 18           4
C. dysosmos       21                                     19           4

Non-greens
P. 62-27          20      19           1                 17           4
P. obtusum        4       2            2                 16           1

Table 12. Increase in substitution rates (relative to green species) at synonymous versus non-synonymous sites in Epifagus and P. obtusum

                                                        Increase in substitution rates
Species                     Gene(s)                     Syn sites /       3rd codon position sites /        Codon usage
                                                        non-syn sites     1st & 2nd codon position sites    bias

Epifagus                    Average of many             2.1X / 4.5X       (N.A.) / (N.A.)                   Yes
(compared to tobacco)       ribosomal protein genes

P. obtusum                  tufA                        11.5X / 2.9X      4.6X to 5.8X / 2.5X to 3.2X       Minimal
(compared to C. humicola)

Pairwise sequence differences for use in the Wu & Li test were obtained by expressing observed substitution differences as pairwise distances and correcting for multiple hits. The correction method used was the Jukes & Cantor one-parameter correction, calculated using MEGA (Jukes and Cantor 1969, Kumar et al. 1993). Two other correction methods were compared to the Jukes & Cantor correction: the Kimura two-parameter method, which allows different rates of transition versus transversion, and the Tamura method, which uses information about G+C content as well as separate transition and transversion rates, again using MEGA (Kimura 1983, Tamura 1992). All three correction methods added about the same number of unobserved substitutions for both sets of gene sequences (data not shown), so the simplest model was used because it has the smallest variance.
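For readers unfamiliar with these corrections, the Jukes & Cantor and Kimura two-parameter distances have simple closed forms. The following Python sketch is illustrative only and is not the MEGA implementation; the function names and the two short example sequences are hypothetical. Here p is the proportion of compared sites that differ, P the proportion showing transitional differences, and Q the proportion showing transversional differences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1: str, seq2: str) -> tuple[int, int, int]:
    """Return (sites compared, transitions, transversions) for aligned sequences.

    Positions with a gap or an ambiguous base in either sequence are skipped.
    """
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return sites, transitions, transversions


def jukes_cantor(p: float) -> float:
    """One-parameter correction: d = -(3/4) ln(1 - 4p/3)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P: float, Q: float) -> float:
    """Two-parameter correction: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)


# Two made-up aligned sequences, for illustration only.
n, ts, tv = count_differences("ACGTACGTAC", "ACGTGCGTAT")
p = (ts + tv) / n
print(f"observed p = {p:.3f}")
print(f"Jukes-Cantor d = {jukes_cantor(p):.3f}")
print(f"Kimura 2-parameter d = {kimura_2p(ts / n, tv / n):.3f}")
```

Both corrected distances exceed the observed proportion of differences, and the excess is the estimated number of unobserved (multiple) substitutions per site.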

A second method was also used to separate observed sequence differences into rates along different green or nongreen lineages, employing phylogenetic software. Gene trees, where the observed substitutions are apportioned to the various branches of the tree using phylogenetic algorithms, provided inferred substitutions on each green or nongreen branch. The apportioned substitutions from the nearest node common to a Polytoma species and to its nearest green relative (C. humicola or C. moewusii) were then compared, as the y/z ratio in the Wu & Li test; and the same null hypothesis (H0: y - z = 0) was tested. Gene trees were produced using (i) parsimony analysis (exhaustive search) via PAUP v. 3.5 (Swofford 1993), and (ii) a neighbor-joining distance analysis via Phylip v. 3.56 (Felsenstein 1989, 1993), as reported in Chapter III.

Codon usage and the amount of codon usage bias in our tufA sequences were also investigated, using the RSCU (relative synonymous codon usage) analysis in MEGA. RSCU is the ratio of the observed frequency of a particular codon to the expected frequency of that codon, calculated on the assumption that all synonymous codons for an amino acid are used equally frequently; an RSCU value significantly different from 1 is evidence of biased codon usage.

RESULTS

Relative Rate Tests for rrn16 Genes

Table 8 shows the results of the relative rate tests for the rrn16 gene in all three Polytoma species. In addition to analyzing all unambiguously alignable nucleotide positions, sites in stems and loops were analyzed separately. For this purpose, proposed secondary structures of the 16S rRNAs in these Polytoma and Chlamydomonas species were determined, using visual comparison to the secondary structure model of C. reinhardtii (Gutell et al. 1985). For this stem versus loop comparison, a loop was defined as any series of unpaired nucleotides in a "loop" at the end of a stem, or in a "bump" along one side of a stem, or at the intersection of two stems. All relative rate tests show significantly increased substitution rates in the branch leading to P. obtusum versus the branch leading to the closest known green species.

Relative Rate Tests for tufA Genes

Table 9 shows the comparison of relative rate tests for the tufA gene in two of the three Polytoma species studied. (Note that this analysis is currently missing a tufA sequence from P.u.964. It was only recently discovered that a tufA gene from a bacterial contaminant was amplified instead of the plastid tufA. A second search for the gene in P.u.964 is planned, as soon as time permits.) In addition to calculating relative rates of nucleotide substitutions for all aligned sites in these tufA sequences, several subsets of the data were also analyzed: first+second codon positions, third codon positions, synonymous substitutions, and nonsynonymous substitutions. Substitutions were classified as synonymous versus nonsynonymous using the weighted amino acid pathway method of Nei and Gojobori, implemented in MEGA (Nei & Gojobori 1986). Rates of amino acid substitutions were also compared. All relative rate tests showed a significantly higher substitution rate in the branch leading to P. obtusum than in the branch leading to the closest known green species (C. humicola), whereas the rate in the branch leading toward P. 62-27 is not significantly different from that in the branch leading to the green species.

Base Composition and Codon Usage

G+C content of the rrn16 genes is slightly lower in the three Polytoma species than in the three green chlamydomonads for which sequences of this gene are available (Table 10).

G+C content in the two Polytoma tufA sequences differs in several ways from that of their green relatives. At the first and second codon positions, the G+C content is nearly the same in P. 62-27 and P. obtusum as in the two green chlamydomonad species for which complete tufA sequences are available. But in the third codon position the G+C content is much lower in P. 62-27 than in sequences from the two green species, while P. obtusum has a much higher G+C content than sequences from the green chlamydomonad species.

Codon usage and amount of codon bias in tufA also differ between the two Polytoma species. The tufA gene shows much more codon bias in P. 62-27 than in P. obtusum. In fact, the tufA gene shows approximately the same codon bias in P. 62-27 as in the three green chlamydomonad species (Table 11 and Figure 16). The three sequences from green species all show a considerable bias against codons ending in G or C, with 19 to 23 codons not used at all, most of which end in G or C. Of the 20 codons not used in the P. 62-27 tufA gene, 19 end in G or C. TufA shows much less bias in P. obtusum, where only four codons are not used, two of which end in G or C (Figure 17). A simple global screen of codons using an RSCU analysis (to normalize a comparison of the frequencies) shows no significant difference between these non-green and green species with respect to the frequency of codons used above the RSCU = 1.5 level. However, a screen for codons used at or above RSCU values of 3 shows fewer highly-used codons in P. obtusum than in the other green and non-green species (Table 11).

DISCUSSION

Is P. 62-27 a Recently Non-Green Lineage?

The tufA gene of P. 62-27 shows codon bias similar to that of several close green relatives, and neither the rrn16 gene nor the tufA gene shows an increased substitution rate in the lineage leading to P. 62-27 relative to green lineages. These observations are both consistent with the hypothesis that P. 62-27 may belong to a lineage that lost photosynthesis more recently than did the P. uvella clade that contains P. uvella 964 and P. obtusum. The P. 62-27 lineage may simply not have had enough time to demonstrate a significant rate increase, nor to lose the codon bias it shares with green relatives.

The expression gene sequences we sampled from the P. uvella clade, on the other hand, all show significant rate increases compared to green relatives, and tufA from P. obtusum (the only protein-coding gene we obtained from the P. uvella clade) shows minimal codon bias. These data are both consistent with the hypothesis that the P. uvella clade may result from a more ancient loss of photosynthesis than in the P. 62-27 lineage; therefore there has been enough time in the P. uvella clade for relaxed selection to be evident in the form of rate increases in still-functional genes, and enough time for relaxed selection to be evident in the form of a large reduction in the amount of codon bias. Curiously, a comparison to leucoplast data from Epifagus does not show a correlation for these two types of relaxed selection. Most leucoplast genes in Epifagus show rate increases compared to green relatives, but the protein-coding genes show no apparent relaxation of codon bias, which is still similar to that seen in tobacco relatives (Wolfe et al. 1992c). Perhaps different selective pressures have resulted in retained codon bias in Epifagus but not in P. obtusum.

Estimates of time since loss of photosynthesis cannot help support the hypothesis of a recently non-photosynthetic P. 62-27 lineage, nor can they elucidate why Epifagus shows only one type of relaxed selection, whereas P. obtusum shows two types. Unfortunately, phylogenetic data based on sequences have not identified any green species as close to P. 62-27 as the green species found relative to the P. uvella clade, and P. 62-27 is the sole member found from its clade. Therefore, our estimate of time for loss of photosynthesis in the P. 62-27 lineage (calculated in Chapter III) is based on a part of the tree containing few species and deep branches; and, with a single-member clade, we can only estimate the maximum divergence time since loss of photosynthesis, not the minimum. For the date of the change to nonphotosynthetic status in P. 62-27 ancestors, Ochman & Wilson's universal substitution rate for rRNA genes can be used with the divergences seen in Polytoma nuclear Rrn18 genes, reported in Chapter III (Ochman and Wilson 1987). This calculation gives a maximum divergence of roughly 155 million years between P. 62-27 and its nearest green relatives. Photosynthesis could have been lost at any time since then.
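The arithmetic behind a clock-based date of this kind is simple, and a small sketch may make the assumption explicit. The function below is hypothetical and is not the Chapter III calculation: it divides a corrected pairwise distance by an assumed constant rate, and the numbers passed to it are placeholders rather than the Rrn18 divergences or the Ochman & Wilson rate.

```python
def max_divergence_time_my(pairwise_distance, rate_per_my, per_lineage=False):
    """Upper bound on divergence time (million years) under a constant-rate clock.

    pairwise_distance : corrected substitutions per site between two species
    rate_per_my       : assumed substitution rate per site per million years
    per_lineage       : False if the rate is quoted as pairwise divergence per
                        million years, True if it is the rate along a single
                        lineage (the distance then accumulates on two branches)
    """
    if rate_per_my <= 0:
        raise ValueError("rate must be positive")
    branches = 2.0 if per_lineage else 1.0
    return pairwise_distance / (branches * rate_per_my)


# Placeholder inputs: 2% divergence at 0.02% pairwise divergence per million years.
print(max_divergence_time_my(0.02, 0.0002))
```

Because the result is the time since the two lineages split, it bounds the loss of photosynthesis only from above, which is the limitation described in the text for a single-member clade.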

For the P. uvella clade, the change to nonphotosynthetic status in an ancestor occurred between 50 MYA and 8.5 MYA (calculated in Chapter III). Fossil evidence in the angiosperm lineage near Epifagus points to the change to nonphotosynthetic status in an Epifagus ancestor at no more than 50 MYA, and perhaps as recently as 5 MYA (Wolfe et al. 1992c). Assume that the actual time of occurrence of the status change in the P. uvella clade lineage was at the older end of the possible range and that the actual time in the Epifagus lineage was at the more recent end of the possible range. Then, relative to the P. uvella clade, the lineage leading to Epifagus might not have been nonphotosynthetic long enough to lose the codon bias that is still evident in Epifagus leucoplast genes. However, this tells us nothing about why rate increases in leucoplast genes are seen in Epifagus, but not seen in P. 62-27.

Do Differences in tufA Base Composition Correlate With Codon Bias?

In the green species C. reinhardtii and C. humicola, the G+C content of the tufA gene differs markedly among the three codon positions, being highest in the first position and lowest in the third position. In all four species shown in Table 10, green or nongreen, the base composition in the second position of tufA is similar to the overall base composition of the C. reinhardtii chloroplast genome. The base composition in P. 62-27 is the same as that of the green species in the first and second codon positions but the decrease in the third position is much greater. The unknown selective or mutational pressure that lowered G+C content in the third position of all these species appears to have reduced it by roughly the same proportion in P. 62-27 as in the other species, relative to the total G+C content of the leucoplast genomes. Since total leucoplast G+C content in Polytoma species is lower than in green chlamydomonads (17% G+C versus 36% G+C), G+C contents at third positions in tufA reflect a similar pattern of decrease, to 9% in P. obtusum and to 16 to 21% in the two green chlamydomonads. The cause of this tendency to low G+C content in the third position is unknown. Perhaps it is merely a consequence of the bias against codons ending in G or C. Or perhaps the low G+C content in these genomes is due to some other factor and the codon bias is a secondary consequence. The answer might become clearer with study of base composition and codon bias in additional protein-coding genes in chloroplasts and leucoplasts.
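The position-specific G+C values discussed here are straightforward to compute from an in-frame coding sequence. The sketch below is an illustration only; the function name is hypothetical, the sequence is assumed to start at the first codon position, and the example sequence is a made-up fragment rather than any of the tufA sequences.

```python
def gc_by_codon_position(cds):
    """Return the G+C fraction at codon positions 1, 2 and 3 of an in-frame sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    fractions = []
    for position in range(3):
        bases = seq[position:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Made-up nine-base fragment: codons ATG, GCC, AAA.
print(gc_by_codon_position("ATGGCCAAA"))
```

Comparing the third-position value with the first- and second-position values for each species gives the kind of contrast summarized in Table 10.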

The third position base composition in P. obtusum is less problematical. The tufA gene of P. obtusum is the only gene in this study that shows little codon usage bias, and is also the only gene with high G+C content at third position sites. A correlation between lack of bias against codons ending in G or C and a higher G+C content at those third position sites is evident. P. obtusum uses 15 to 19 more codons in the tufA gene than do the other three chlamydomonads, and there is no bias against G or C at third codon positions in the only four codons not used in P. obtusum. Again, whether the higher G+C content at third positions in this gene is the cause or a consequence of unbiased codon usage remains to be determined.
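The two codon-usage summaries used in this chapter, RSCU values and the number of unused codons, can be sketched as follows. This is a minimal illustration and not the MEGA implementation: it assumes the standard codon assignments (which plastid genes are generally taken to follow), skips stop codons and any codon containing an ambiguous base, and uses hypothetical function names and a made-up three-codon example.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order at each position.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_sense_codons(cds):
    """Count in-frame sense codons (stop and ambiguous codons are skipped)."""
    seq = cds.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")


def rscu(cds):
    """RSCU per sense codon: observed count divided by the count expected if all
    synonymous codons were used equally. None when the amino acid never occurs."""
    counts = count_sense_codons(cds)
    values = {}
    for amino_acid in set(AMINO_ACIDS) - {"*"}:
        family = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        total = sum(counts[c] for c in family)
        for codon in family:
            values[codon] = counts[codon] * len(family) / total if total else None
    return values


def unused_codons(cds):
    """Sense codons that never appear in the sequence."""
    counts = count_sense_codons(cds)
    return sorted(c for c, aa in CODON_TABLE.items() if aa != "*" and counts[c] == 0)


values = rscu("AAAAAAAAG")  # made-up example: Lys codons AAA, AAA, AAG
print(values["AAA"], values["AAG"], len(unused_codons("AAAAAAAAG")))
```

A screen such as the one described for Table 11 would then simply count the codons whose RSCU value is at or above a chosen threshold, and the unused-codon list can be inspected for the fraction ending in G or C.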

Increased Divergence Rates in Plastid Genes

All relative rate tests show accelerated divergence for presumably functional leucoplast genes in two of the three Polytoma species we investigated. We would expect to see increased rates in pseudogenes, since genes no longer needed by the alga (the photosynthetic genes, for example) would no longer be subject to selection. But when functional genes also show accelerated rates, this implies that selection is relaxed on these genes as well, or that the mutation rate is increased in the entire leucoplast genome, or both.

Accelerated evolutionary rates were also detected by relative rate tests of many presumably functional ribosomal protein genes and rRNA genes in the nongreen plant Epifagus (Wolfe et al. 1992c). Wolfe et al. suggest that selection on components of the leucoplast protein synthetic machinery is relaxed because there is a lower rate of protein synthesis. Instead of producing the many plastid-encoded photosynthetic proteins, including large amounts of ribulose bisphosphate carboxylase, the most abundant protein in the world, leucoplasts from non-green species may be transcribing and translating only one or a very few essential nonphotosynthetic genes, perhaps at very low rates. Their assumption is that the reduced requirement for protein synthesis results in the cell being able to tolerate more mutations that reduce the efficiency of protein synthesis. Perhaps a small amount of correctly translated elongation factor (and similarly "relaxed" rRNAs, ribosomal proteins, etc.) is all that is needed for adequate translation of the other essential leucoplast genes.

We suggest, in addition, that reduction of codon bias, which we observed when comparing tufA in P. obtusum to green relatives, is another example of relaxed selection. Codon bias is frequently a bias against non-abundant tRNAs, and is a type of selection most frequently seen in genes for proteins that have to be synthesized quickly or in large amounts. Since it is unlikely that the few ENP genes produced from leucoplasts fall into either category, a reduction in codon bias would make sense. The anomalous condition, then, would appear to be the bias still seen in genes in Epifagus, not the lack of bias seen in P. obtusum.

Lack of Codon Bias in tufA in P. obtusum May Allow a More Accurate Estimate of Neutral Mutation Rate than Possible in Epifagus

Substitution rates increased more for putatively neutral substitutions than for non-neutral substitutions in tufA in P. obtusum. The magnitude of the difference varies somewhat, depending on how neutral and non-neutral substitutions are separated. Table 12 compares third position versus first+second position rate increases using tree-based substitutions or substitutions from Wu & Li's algebra, as well as ratios of synonymous to non-synonymous substitutions. All four analyses show higher rate increases for synonymous or third-position substitutions than for nonsynonymous or first+second position substitutions. In Epifagus, however, the results are the opposite: the increase in the rate of synonymous substitutions is smaller than the increase in the rate of non-synonymous substitutions (Wolfe et al. 1992a). We suggest that this difference between Epifagus and P. obtusum is due to selection for codon usage in Epifagus which affects synonymous substitutions. In P. obtusum, however, lack of any significant codon bias suggests that there is little or no selection on synonymous sites.

So, we can see the same two phenomena in Polytoma obtusum as seen in Epifagus. Higher substitution rates at non-synonymous sites, relative to green species, probably reflect relaxed selection on at least the plastid genes involved in translation, and perhaps relaxed selection on the remaining functional plastid genes as well. Higher substitution rates at synonymous sites, relative to green species, suggest an increase in plastid mutation rate in P. obtusum relative to the chloroplasts of its green relatives. We may be better able to detect and measure this rate increase in Polytoma than in Epifagus because it is partly masked in the latter species by selection affecting synonymous sites. A reasonable explanation for the higher mutation rate in leucoplasts than in chloroplasts is the same as that proposed for relaxed selection. Perhaps the non-green organisms can tolerate plastid DNA replication and repair functions that are less efficient and more error prone, or absent entirely, as well as translational machinery that is less efficient. The result of relaxed selection on plastid replication and repair genes, which probably reside in the nucleus, would be an increase in mutation rate at all sites, non-synonymous as well as synonymous, which is what we observe in the tufA gene in P. obtusum.

It will be interesting to see if the same phenomena occur in other plastid genes in P. obtusum and in other Polytoma species. It will also be interesting to see how the estimate of a Polytoma plastid neutral mutation rate provided by this tufA sequence compares with estimates from the presumed pseudogene rbcL, large remnants of which we have located in three Polytoma species.

Literature Cited

Chao, L. and D. E. Carr, 1993 The molecular clock and the relationship between population size and generation time. Evolution 47: 688-690.

Colwell, A., 1994 Genome evolution in a non-photosynthetic plant, Conopholis americana. Washington University.

Delwiche, C. F., M. Kuhsel and J. D. Palmer, 1995 Phylogenetic analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Molecular Phylogenetics and Evolution 4: 110-128.

dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Felsenstein, J., 1989 Phylogeny Inference Package (version 3.2). Cladistics 5: 164-166.

Felsenstein, J., 1993 PHYLIP (Phylogeny Inference Package) version 3.5p. Department of Genetics, University of Washington, Distributed by the author.

Gilbert, D. G., 1992 SeqApp, a biological sequence editor and analysis program for Macintosh computers. Published electronically on the Internet, available via gopher or anonymous ftp to ftp.bio.indiana.edu.

Grant, D. M., N. W. Gillham and J. E. Boynton, 1980 Inheritance of chloroplast DNA in Chlamydomonas reinhardtii. Proc. Nat. Acad. Sci. USA 77: 6067.

Gutell, R. R., N. Larsen and C. R. Woese, 1994 Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiological Reviews 58: 10-26.

Howe, C. J. and A. G. Smith, 1991 Plants without chlorophyll. Nature 349: 109.

Hwang, S. R. and F. R. Tabita, 1991 Acyl carrier protein-derived sequence encoded by the chloroplast genome in the marine diatom Cylindrotheca sp. strain N1. J Biol Chem 266: 13492-13494.

Jukes, T. H. and C. R. Cantor, 1969 Evolution of protein molecules, in Mammalian Protein Metabolism, edited by H. N. Munro. Academic Press, New York.

Kimura, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

King, J. L. and T. H. Jukes, 1969 Non-Darwinian evolution. Science 164: 788-798.

Kirk, J. T. O. and R. A. E. Tilney-Bassett, 1978 The Plastids. Elsevier, North Holland, Amsterdam and New York.

Kumar, S., K. Tamura and M. Nei, 1993 MEGA: Molecular Evolutionary Genetics Analysis, version 1.0. University Park, PA, Institute of Molecular Evolutionary Genetics.

Laudenbach, D. E. and A. R. Grossman, 1991 Characterization and mutagenesis of Sulfur-regulated Genes in a Cyanobacterium: Evidence for Function in Sulfate Transport. J. Bacteriol 173: 2739-2750.

Li, W.-H., C.-C. Luo and C.-I. Wu, 1985 Evolution of DNA sequences, in Molecular Evolutionary Genetics, edited by R. J. MacIntyre. Plenum Press, New York.

Maidak, B. L., N. Larsen, M. J. McCaughey, R. Overbeek, G. J. Olsen, K. Fogel, J. Blandy and C. R. Woese, 1994 The Ribosomal Database Project. Nucleic Acids Research 22: 3485-3487.

Michalowski, C. B., R. Flachmann, W. Loeffelhardt and H. J. Bohnert, 1991 Gene nadA, encoding Quinolinate Synthetase, is located on the Cyanelle DNA from Cyanophora paradoxa. Plant Physiol 95: 329-330.

Michalowski, C. B., W. Loeffelhardt and H. J. Bohnert, 1991 An ORF323 with homology to crtE, specifying Prephytoene Pyrophosphate Dehydrogenase, is encoded by Cyanelle DNA in the eukaryotic alga Cyanophora paradoxa. J Biol Chem 266: 11866-11870.

Morden, C. W., K. H. Wolfe, C. W. dePamphilis and J. D. Palmer, 1991 Plastid translation and transcription genes in a non-photosynthetic plant: intact, missing and pseudo genes. The EMBO Journal 10: 3281-3288.

Nei, M. and T. Gojobori, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3: 418-426.

Nissen, P., M. Kjeldgaard, S. Thirup, G. Polekhina, L. Reshetnikova, B. F. C. Clark and J. Nyborg, 1995 Crystal Structure of the Ternary Complex of Phe-tRNA, EF-Tu, and a GTP Analog. Science 270: 1464-1472.

Ochman, H. and A. C. Wilson, 1987 Evolution in bacteria: Evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26: 74-86.

Ohyama, K., H. Fukuzawa, T. Kohchi, H. Shirai, T. Sano, S. Sano, K. Umesono, Y. Shiki, M. Takeuchi, Z. Chang, S.-I. Aota, H. J. Inokuchi and H. Ozeki, 1986 Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322: 572-574.

Sarich, V. M. and A. C. Wilson, 1973 Generation time and genomic evolution in primates. Science 179: 1144-1147.

Sambrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Lab Press, Cold Spring Harbor, NY.

Shinozaki, K., M. Ohme, M. Tanaka, T. Wakasugi, N. Hayashida, T. Matsubayashi, N. Zaita, J. Chunwongse, J. Obokata, K. Yamaguchi-Shinozaki, C. Ohto, K. Torazawa, B. Y. Meng, A. Sugita, H. Deno, T. Kamogashira, K. Yamada, J. Kusuda, F. Takaiwa, A. Kato, N. Tohdoh, H. Shimada and M. Sugiura, 1986 The complete nucleotide sequence of tobacco chloroplast genome: its gene organization and expression. EMBO J. 5: 2043-2050.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432.

Siemeister, G. and W. Hachtel, 1990 Organization and nucleotide sequence of ribosomal RNA genes on a circular 73 kbp DNA from the colourless flagellate Astasia longa. Curr. Genet. 17: 433-438.

Siu, C.-H., K.-S. Chiang and H. Swift, 1976 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382.

Swofford, D. L., 1993 PAUP: Phylogenetic Analysis Using Parsimony, Version 3.5p. Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.

Tamura, K., 1992 Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Molecular Biology and Evolution 9: 678-687.

Vernon-Kipp, D., S. A. Kuhl and C. W. J. Birky, 1989 Molecular evolution of Polytoma, a non-green chlorophyte, pp. 284-286 in Physiology, Biochemistry, and Genetics of Nongreen Plastids, edited by C. T. Boyer, J. C. Shannon and R. C. Hardison. American Society of Plant Physiologists, Rockville, Maryland.

Wilson, A. C., S. S. Carlson and T. J. White, 1977 Biochemical evolution. Annu. Rev. Biochem. 46: 573-639.

Wimpee, C. F., R. Morgan and R. Wrobel, 1992 An aberrant plastid ribosomal RNA gene cluster in the root parasite Conopholis americana. Plant Mol. Biol. 18: 275-285.

Wimpee, C. F., R. Morgan and R. L. Wrobel, 1992 Loss of transfer RNA genes from the plastid 16S-23S ribosomal RNA gene spacer in a parasitic plant. Curr. Genet. 21: 417-422.

Wimpee, C. F., R. L. Wrobel and D. K. Garvin, 1991 A divergent plastid genome in Conopholis americana, an achlorophyllous parasitic plant. Plant Mol. Biol. 17: 161-166.

Woese, C. R., Pace, N. R. Probing RNA structure, function, and history by comparative analysis, in The RNA World, edited by The Cold Spring Harbor Laboratory Press, Cold Spring Harbor.

Wolfe, A. D. and C. W. dePamphilis, 1995 Alternate paths of evolution for the photosynthetic gene rbcL in four nonphotosynthetic species of Orobanche. Plant Molecular Biology (submitted).

Wolfe, K. H., C. W. Morden, S. C. Ems and J. D. Palmer, 1992 Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304-317.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1991 Ins and outs of plastid genome evolution. Curr. Opinion Genet. Develop. 1: 523-529.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1992 Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc. Nat. Acad. Sci. USA 89: 10648-10652.

Wu, C.-I. and W.-H. Li, 1985 Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA 82: 1741-1745.

Zuckerkandl, E. and L. Pauling, 1962 Molecular disease, evolution and genic heterogeneity, in Horizons in Biochemistry, edited by M. Kasha and B. Pullman. Academic Press, New York.

SUMMARY

My project consisted of a study of genes in the leucoplast genomes of several species of Polytoma, which are non-photosynthetic chlorophyte algae, in order to determine the evolutionary consequences of the loss of photosynthesis.

Evolutionary changes seen in several other previously-photosynthetic lineages include selective retention and conservation of leucoplast expression genes, selective loss of many photosynthetic genes, and an increase in the rate of nucleotide substitutions in the leucoplast genome. The major impetus for my study was to determine whether Polytoma leucoplast genomes have changed in ways similar to these other previously-photosynthetic lineages.

I formed two hypotheses. One hypothesis was that the leucoplast genomes in Polytoma species are still functional and contain functional expression genes. The second hypothesis was that these leucoplast expression genes, even though still functional, would show accelerated evolution, relative to homologous genes in photosynthetic relatives.

This study in Polytoma leucoplast genomes began with attempts to make the organism and its DNA amenable to molecular study. Isolation and separation of DNA from two of the three organelles (nuclear DNA separated from leucoplast DNA) was successful. The quality of separation was tested with hybridizations to known nuclear and chloroplast gene probes. Separation techniques involved use of a CsCl/bisbenzimide gradient, taking advantage of the varied A+T content in each of the three organelle genomes, and substantial modifications of a gentle lysis method used for Chlamydomonas. Isolation of mitochondrial DNA was not successful.

Hybridization and PCR amplification searches for a suitable gene (nuclear Rrn18) for a phylogenetic study were successful. A phylogenetic analysis of Rrn18 sequences in 13 Polytoma species provided the following answers: 1) the placement of these 13 Polytoma species as chlamydomonads, based on morphology and biochemistry, is confirmed, and 2) these 13 Polytoma species result from at least two independent origins, with 12 species grouping together in the P. uvella clade, while the thirteenth Polytoma species (P. 62-27) is separated from the P. uvella clade by at least one green species.

Three Polytoma species were chosen for a study of leucoplast genes: two representatives from the P. uvella clade (P.u.964 and P. obtusum), plus P. 62-27 as the sole representative of the second Polytoma clade. Three leucoplast expression genes and one leucoplast photosynthetic gene were located by amplification (rrn16, rrn23, tufA and rbcL). The identities of two of these four leucoplast genes (rrn16 and tufA) were confirmed by sequencing, and rrn16 and tufA are the two leucoplast expression genes whose functionality and evolutionary rates were analyzed as the remainder of my project.

To assess potential for function, I analyzed DNA sequences for rrn16 and tufA genes from several Polytoma species and compared them to genes that produce known functional products in photosynthetic relatives. I also made models of potential secondary structure folding for the rRNA sequences, and I analyzed the inferred amino acid sequences and protein structures for the tufA sequences. All analyses in these two Polytoma leucoplast genes show features that are consistent with functionality. Even a hypervariable region in EF-Tu (the protein product of tufA), where some chlamydomonad-specific insertions have apparently occurred, shows only moderate differences between the Polytoma EF-Tu proteins and the known functional EF-Tu from C. reinhardtii.

In the substitution rate comparisons, the two representatives from the P. uvella clade show increased substitution rates in both leucoplast expression genes, compared to photosynthetic relatives; but P. 62-27 sequences do not show increased rates for either leucoplast gene. P. 62-27 may belong to a lineage that lost photosynthesis more recently than did the P. uvella clade. The P. 62-27 lineage may simply not have had enough time to demonstrate the rate increase seen in the representatives from the P. uvella clade, which may result from a more ancient loss of photosynthesis. Unfortunately, we cannot verify this hypothesis with our current sampling of chlamydomonad species, because we did not locate green species as close to P. 62-27 as the green species we found relative to the P. uvella clade.

After analyzing substitution rates for all nucleotide positions in the tufA gene, I also did two additional rate analyses for the two types of substitutions in protein-coding genes that are subject to different types or degrees of selection, in order to separate the evolutionary effects of selection from those of mutation. I compared rates of non-synonymous substitutions, where a nucleotide substitution changes the amino acid, to see whether selection had a different effect in green versus non-green species. I also compared rates of synonymous substitutions, where nucleotide substitutions do not change the amino acid, to get an estimate of the mutation rate, and to see whether mutation rates are higher in Polytoma leucoplasts than in chloroplasts of green relatives. Since codon usage bias is a type of selection that affects synonymous substitutions, and might affect the magnitude of my estimate of the mutation rate, I also looked for differences in codon bias between tufA genes in Polytoma compared to green relatives.

When I did those rate comparisons in tufA from Polytoma obtusum, I found increased rates for both types of substitutions, synonymous and non-synonymous. Those data imply that the effects of both selection and mutation have changed. A higher rate of non-synonymous substitutions implies that the effect of selection in this expression gene in P. obtusum is less than in green relatives, i.e., that selection is relaxed. A higher rate of synonymous substitutions implies that P. obtusum has a higher mutation rate in the leucoplast than in the chloroplast of its close green relative. Also, codon usage bias in tufA in P. obtusum is minimal compared to close green relatives (and compared to tufA in P. 62-27). Therefore, selection caused by codon bias should probably not be significantly masking the true mutation rate I estimated in P. obtusum. However, a better estimate of mutation rate would come from a pseudogene, where no selection at all could mask the mutation rate. Since we have located the rbcL gene (presumed to be a photosynthetic pseudogene) in several Polytoma species, the laboratory's next step will be to compare a mutation rate estimated from that pseudogene to the estimate I obtained from synonymous substitutions in tufA.

As a result of my study, Polytoma can be added to the list of non-photosynthetic species that show significant evolutionary effects from the loss of an important cell function. It will be interesting to see revealed, in future years, the causes of the relaxed selection and increased mutation seen in these unusual non-photosynthetic lineages.

Literature Cited

Alvarez, L. W., W. Alvarez, F. Asaro and H. V. Michel, 1980 Extraterrestrial cause for the Cretaceous-Tertiary extinction. Science 208: 1095-1108.

Anborgh, P. H. and A. Parmeggiani, 1991 New antibiotic that acts specifically on the GTP-bound form of elongation factor Tu. The EMBO Journal 10: 779-784.

Arnold, C. G. and R. Blank, 1980 Three-Dimensional structure of mitochondria and plastids in Chlamydomonas reinhardtii and Polytoma papillatum. Walter de Gruyter & Co., New York.

Baldauf, S. L. and J. D. Palmer, 1990 Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature 344: 262- 265.

Bastia, D., K. S. Chiang, H. Swift and P. Siersma, 1971 Heterogeneity, complexity, and repetition of the chloroplast DNA of Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. USA 68: 1157-1161.

Berchtold, H., L. Reshetnikova, C. 0. A. Reiser, N. K. Schirmer, M. Sprinzl and R. Hilgenfeld, 1993 Crystal structure of active elongation factor Tu reveals major domain rearrangements. Nature 365: 126-132.

Birky, C. W . , Jr., 1983 Relaxed cellular controls and organelle heredity. 222: 468-475.

Birky, C. W. , 1988 Evolution and variation in plant chloroplast and mitochondrial genomes, in Plant Evolutionary Biology, edited by L. D. Gottlieb and K. J. Subodh. Chapman and Hall, New York.

Birky, C. W., Jr., 1994 Relaxed and stringent genomes: Why cytoplasmic genes don't obey Mendel’s laws. J. Hered. 85: 355-365.

204 205

Birky, C. W., 1995 Uniparental inheritance of mitochondrial and chloroplast genes. Proc. Natl. Acad. Sci. USA 92: 11331- 11338.

Blamire, J., V. R. Fletchner and R. Sager, 1974 Regulation of nuclear DNA replication by the chloroplast in Chlamydomonas. Proc. Nat. Acad. Sci. USA 71: 2867.

Bogorad, L., 1991 Replication and transcription of plastid DNA, in The Molecular Biology of Plastids, edited by L. Bogorad and I. K. Vasil. Academic Press Inc., New York.

Bold, H. C. and M. J. Wynne, 1985 Introduction to the Algae. Prentice-Hall, Inc., Englewood Cliffs, N.J.

Boynton, J. E., N. W. Gillham and J. F. Chabot, 1972 Chloroplast ribosome deficient mutants in the green alga Chlamydomonas reinhardi and the question of chloroplast ribosome function. J. Cell Sci. 10: 267-305.

Browse, J. and C. Somerville, 1991 Glycerolipid synthesis: biochemistry and regulation. Annu. Rev. Plant. Physiol. Plant Mol. Biol. 42: 467-506.

Buchheim, M. A. and R. L. Chapman, 1991 Phylogeny of the colonial green flagellates: a study of 18S and 26S rRNA sequence data. BioSystems 25: 85-100.

Buchheim, M. A. and R. L. Chapman, 1992 Phylogeny of Carteria (Chlorophyceae) inferred from molecular and organismal data. J. Phycol. 28: 363-374.

Buchheim, M. A., M. A. McAuley, E. A. Zimmer, E. C. Theriot and R. L. Chapman, 1994 Multiple origins of colonial green flagellates from unicells: Evidence from molecular and organismal characters. Molecular Phylogenetics and Evolution 3: 332-343.

Buchheim, M. A., M. Turmel, E. A. Zimmer and R. L. Chapman, 1990 Phylogeny of Chlamydomonas (Chlorophyta) based on cladistic analysis of nuclear 18S rRNA sequence data. J. Phycol. 26: 689-699.

Cedergren, R,, M. W. Gray, Y. Abel and D. Sankoff, 1988 The evolutionary relationships among known life forms. J. Mol. Evol. 28: 98-112.

Chao, L. and D. E. Carr, 1993 The molecular clock and the relationship between population size and generation time. Evolution 47: 688-690. 206

Chapman, R. L. and M. A. Bucheim, 1991 Ribosomal RNA gene sequences: Analysis and significance in the phylogeny and taxonomy of green algae. Crit. Rev. Plant Sci. 10: 343-368.

Chiang, K.-S. and N. Sueoka, 1967 Replication of chloroplast DNA in Chalmydomonas reinhardtii during vegetative cell cycle: its mode and regulation. Biochemistry 57:

Colwell, A., 1994 Genome evolution in a non-photosynthetic plant, Conopholis americana. Washington University.

Conde, M. F., J. E. Boynton, N. W. Gillham, E. H. Harris, C. L. Tingle and W. L. Wang, 1975 Chloroplast genes in Chlamydomonas affecting organelle ribosomes. Molec. Gen. Genet. 140: 183.

Cundliffe, E., 19? Recognition Sites for Antibiotics within rRNA, in The Ribosome: Structure, Function, and Evolution, edited by W. E. Hill, P. B. Moore, A. Dahlberget al. The American Society for Microbiology, Washington, DC.

Delwiche, C. F., M. Kuhsel and J. D. Palmer, 1995 Phylogenetic Analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Molecular Phylogenetics and Evolution 4: 110-128. dePamphilis, C. W. and J. D. Palmer, 1990 Loss of photosnthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337-339.

Dixon, M. T. and D. M. Hillis, 1993 Ribosomal RNA secondary structure: Compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10: 256-267.

Dron, M. , M. Rahire and J. D. Rochaix, 1982 Sequence of the chloroplast 16S rRNA gene and its surrounding regions of Chlamydomonas reinhardtii. Nucleic Acids Research 10: 7609- 7620.

Durocher, V., A. Gauthier, G. Bellemare and C. Lemieux, 1989 Curr. Genet. 15: 277-282.

Eckenrode, V. K., J. Arnold and R. B. Meagher, 1985 Comparison of the nucleotide sequence of soybean 18S rRNA with the sequences of other small-subunit rRNAs. J. Mol. Evol. 21: 259-269.

Ettl, H., 1976 Die Gattung Chlamydomonas Ehrenberg. Beih. Nova Hedwigia 49: 1-1122. 207 Ettl, H. and U. G. Schlosser, 1992 Towards a revision of the systematics of the genus Chlamydomonas (Chlorophyta). 1. Chlamydomonas applanata Pringsheim. Bot. Acta 105: 323-33 0.

Feieraband, J., 1992 Conservation and structural divergence of organellar DNA and gene expression in non-photosynthetic plastids during ontogenetic differentiation and phylogenetic adaption. Bot. Acta 105: 227-231.

Felsenstein, J., 1978 Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27: 401-410.

Felsenstein, J., 1981 Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17: 368-376.

Felsenstein, J., 1988 Phylogenies from Molecular Sequences: inference and reliability. Annu. Rev. Genet. 22: 521-565.

Felsenstein, J., 1989 Phylogeny Inference Package (version 3.2). Cladistics 5: 164-166.

Felsenstein, J., 1993 PHYLIP (Phylogeny Inference Package) version 3.5p. Department of Genetics, University of Washington, Distributed by the author.

Fernholm, B., K. Bremer and H. Jornvall, 1989 The Hierarchy of Life. Molecules and Morphology in Phylogenetic Analysis., in Proceedings from Nobel Symposium 70, edited by Excerpta Medica, Amsterdam.

Fiedler, E. and G. Schultz, 1985 Localization, purification, and characterization of shikimate oxidoreductase- dehydroquinate hydrolase from stroma of spinach chloroplasts. Plant Physiol. 79: 212-218.

Fitch, W. M., 1974 Toward defining the course of evolution: minimum change for a specified tree topology. Syst. Zool. 20: 406-416.

Gaffal, K. P., 1978 Configural changes in the plastidome of Polytoma papillatum after completion of cytokinesis and during fusion of the gametes. Protoplasma 94: 175-191.

Gearing, K. L., J. A. Gustafsson and S. Okret, 1993 Heterogeneity in the polyglutamine tract of the glucocorticoid receptor from different rat strains. Nucleic Acids Research 21: 2014. 208

Gilbert, D. G., 1992 loopDloop, a Macintosh program for visualizing RNA secondary structure. Published electronically on the Internet, available via gopher or anonymous ftp to ftp.bio.indiana.edu.,

Gilbert, D. G., 1992 SeqApp, a biological sequence editor and analysis program for Macintosh computers. Published electronically on the Internet, available via gopher or anonymous ftp to ftp.bio.indiana.edu.,

Gillham, N. W., 1994 Organelle Genes and Genomes. Oxford University Press, Oxford.

Goodman, M., 1976 Protein sequences in phylogeny, in Molecular Evolution, edited by F. J. Ayala. Sinauer, Sunderland, Mass.

Gordon, J., R. Rumpf, S. L. Shank, D. Vernon and C. W. Birky, Jr., 1995 Sequences of the rrnl8 genes of Chlamydomonas humicola and C. dysosmos are identical, in agreement with their combination in the species C. applanata (Chlorophyta)* J. Phycol. 31: 312-313.

Gowans, C. S., 1963 The conspecificity of Chlamydomonas eugametos and Chlairydomonas moweusii: An experimental approach. Phycologia 3: 37-44.

Grant, D. M., N. W. Gillham and J. E. Boynton, 1980 Inheritance of chloroplast DNA in hlamydomonas reinhardtii. Proc. Nat. Acad. Sci. USA 77: 6067.

Gray, M. W., D. Sankoff and R. J. Cedergren, 1984 On the evolutionary descent of organisms and organelles: a global phylogeny based on a highly conserved structural core in small subunit ribosomal RNA. Nucleic Acids Research 12: 5837- 5852.

Group, T. H. s. D. C. R., 1993 A Novel Gene containing a trinucleotide repeat that is expanded and unstable on Huntington's Disease chromosomes. Cell 72: 971-983.

Gunderson, J. H., H. Elwood, A. Ingold, K. Kindle and M. Sogin, 1987 Phylogenetic relationships between chlorophytes, chrysophytes and oomycetes. Proc. Nat. Acad. Sci. USA 84: 5823-5827. 209 Gutell, R. R., N. Larsen and C. R. Woese, 1994 Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiological Reviews 58: 10-26.

Gutell, R. R., B. Weiser, C. R. Woese and H. F. Noller, 1985 Comparative anatomy of 16-S-like ribosomal RNA. Prog. Nuc. Acids Res. Mol. Biol. 32: 155-216.

Hancock, J. M., 1993 Evolution of sequence repetition and gene duplications in the TAT-binding protein TBP (TFIXD). Nucleic Acids Research 21: 2823-2830.

Harris, E. H., 1989 The Chlamydomonas Sourcebook. Academic Press, Inc., New York.

Harris, E. H., B. D. Burkhart, N. W. Gillham and J. E. Boynton, 1989 Antibiotic resistance mutations in the chloroplast 16S and 23S rRNA genes of Chlamydomonas reinhardtii: Correlation of genetic and physical maps of the chloroplast genome. Genetics 123: 282-292.

Hedberg, M. F., Y. S. Huang and M. H. Hommersand, 1981 Size of the chloroplast genome in Codium fragile. Science 213: 445-447.

Heizmann, P., Y. Hussein, P. Nicolas and V. Nigon, 1982 Modifications of chloroplast DNA during streptomycin induced mutageneis in Euglena gracilis. Curr. Genet. 5: 9.

Higgins, D. G., A. J. Bleasby and R. Fuchs, 1992 Clustal V: improved software for multiple sequence alignment. CABIOS 8: 189-191.

Higgins, D. G. and P. M. Sharp, 1988 CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73: 237-244.

Higgins, D. G. and P. M. Sharp, 1989 Fast and sensitive multiple sequence alignment on a microcomputer. CABIOS 5: 151-153.

Hill, W. E., P. B. Moore, A. Dahlberg, D. Schlessinger, R. A. Garrett and J. R. Warner, Ed. (19?). The Ribosome: Structure. Function, and Evolution. Washington, DC, American Society for Microbiology.

Hillis, D. M., John P. Huelsenbeck, Clifford W. Cunningham, 1994 Application and Accuracy of Molecular Phylogenies. Science 264: 671-677. 210 Hillis, D. M. and M. T. Dixon, 1991 Ribosomal DNA: Molecular Evolution and phylogenetic inference. Quarterly Review of Biology 66: 411-453.

Hillis, D. M. and C. Moritz, Ed. (1990) . Molecular Svstematics. Sunderland, Mass., Sinauer Associates, Inc.

Hirai, A., A. Kanno, M. Iwashashi, Y. Nishizama, J. Hiratsuka and M. Sugiura, 1988 The organization of chloroplast and mitochondrial genomes in rice. Genome 30: S320.

Hoffman, A., E. Sinn, T. Yamamoto, J. Wang, A. Roy, M. Horikoshi and R. G. Roeder, 1990 Highly conserved core domain and unique N terminus with presumptive regulatory motifs in a human TAT factor (TFIID). Nature 346: 387-390.

Holwerda, B. C., S. Jana and W. L. Crosby, 1986 Chloroplast and mitochondrial DNA variation in Hordeum vulgare and Hordeum spontaneum. Genetics 114: 1271-1291.

Howe, C. J. and A. G. Smith, 1991 Plants without chlorophyll. Nature 349: 109.

Huang, X., 1992 A contig assembly program based on sensitive detection of fragment overlaps. Genomics 14: 18-25.

Huss, V. A. and M. L. Sogin, 1989 Primary structure of the Chlorella vulgaris small subunit ribosomal RNA coding region. Nuc. Acids Res. 17: 1255.

Huss, V. A. R., K. H. Wein and E. Kessler, 1988 Deoxyribonucleic acid reassociation in the taxonomy of the genus Chlorella. Archives of Microbiology 150: 509-511

Hussein, Y., P. Heizmann, P. Nicolas and V. Nigon, 1982 Quantitative estimations of chloroplast DNA in bleached mutants of Euglena gracilis. Current Genetics 6: 111.

Hwang, S. R. and F. R. Tabita, 1991 Acyl carrier protein- derived sequence encoded by the Chloroplast genome in the marine diatom Cylindrotehca sp. Strain Nl. J Biol Chem 266: 13492-13494.

Jaeger, J. A., D. H. Turner and M. Zuker, 1989 Improved predictions of secondary structures for RNA. Proc. Nat. Acad. Sci. USA 86: 7706-7710.

Jaeger, J. A., D. H. Turner and M. Zuker, 1989 Predicting optimal and suboptimal secondary structure for RNA. Methods in Enzymology 183: 281-306. 211

Jukes, T. H. and C. R. Cantor, 1969 Evolution of protein molecules., in Mammalian Protein Metabolism, edited by H. N. Munro. Academic Press, New York.

Jupe, E. R., R. L. Chapman and E. A. Zimmer, 1988 Nuclear RNA genes and algal phylogeny--the Chlamydomonas example. BioSystems 21: 223-230.

Kerfin, W. and E. Kessler, 1978 Physiological and biochemical contributions to the taxonomy of the genus Prototheca. II. Starch Hydrolysis and base composition of DNA. Archives of Microbiology 116: 105-107.

Kessel, M. and F. Klink, 1981 Eur. J. Biochem. 114: 481-486.

Kieras, F. J. and K.-S. Chiang, 1971 Characterization of DNA components from some colorless algae. Exp. Cell. Res. 64: 89- 96.

Kimura, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

King, J. L. and T. h. Jukes, 1969 Non-Darwinian evolution. Science 164: 788-798.

Kirk, J. T. 0. and T.-B. R. A. E., 1978 The Plastics. Elsevier, North Holland, Amsterdam and New York.

Kuijt, J., 1969 The biology of parasitic flowering plants. University of California Press, Berkeley and Los Angeles.

Kumar, S., K. Tamura and M. Nei, 1993 MEGA: Molecular Evolutionary Genetics Analysis, version 1.0. University Park, PA, Institute of Molecular Evolutionary Genetics.

Laliberte, G. and J. de la Noue, 1993 Auto-, hetero-, and mixotrophic growth of Chlamydomonas humicola (Chlorophyceae) on acetate. J. Phycol. 29: 612-620.

Landini, P., M. Bandera, A. Soffientini and B. P. Goldstein, 1993 J. Gen. Microbiol. 139: 769-774.

Lang, N. J., 1963 Electron-microscopic demonstration of plastids in Polytoma. J. Protozool. 10: 333-339.

Larson, A., M. M. Kirk and D. L. Kirk, 1992 Molecular phylogeny of the volvocine flagellates. Mol. Biol. Evol. 9: 85-105. 212 Laudenbach, D. E. and A. R. Grossman, 1991 Characterization and mutagenesis of Sulfur-regulated Genes in a Cyanobacterium: Evidence for Function in Sulfate Transport. J. Bacteriol 173: 2739-2750.

Lemieux, C., M. Turmel, V. L. Seligy and R. W. Lee, 1985 The large subunit of rubisco is encoded in the IR sequence of the Chlamydomonas eugametos chloroplast genome. Current Genetics 9: 139-145.

Li, W.-H., C.-C. Luo and C.-I. Wu, 1985 Evolution of DNA sequences, in Molecular Evolutionary Genetics, edited by R. J. MacIntyre. Plenum Press, New York.

Li, W. H. and D. Grauer, 1991 Fundamentals of Molecular Evolution. Sinauer Associates, Inc., Sunderland, Massachusetts.

Links, J., A. Verloop and E. Havinga, 1960 The carotenoids of Polytoma uvella. Arc. Microbiol. 36: 306-324.

MacIntyre, R. J., Ed. (1985). Molecular Evolutionary Genetics. New York, Plenum Press.

Maidak, B. L., N. Larsen, M. J. McCaughey, R. Overbeek, G. J. Olsen, K. Fogel, J. Blandy and C. R. Woese, 1994 The Ribosomal Database Project. Nucleic Acids Research 22: 3485- 3487.

Mankin, A. S., I. G. Skryabin and P. M. Rubtsov, 1986 Identification of the additional nucleotides in the primary structure of yeast 18S rRNA. 44: 143-145.

Mesters, J. R., L. A. H. Zeef, R. Hilgenfeld, J. M. de Graaf, B. Kraal and L. Bosch, 1994 The structural and functional basis for the kirromycin resistance of mutant EF-Tu species in Escherichia coli. The EMBO Journal 13: 4877-4885.

Michalowski, C. B., R. Flachmann, W. Loeffelhardt and H. J. Bohnert, 1991 Gene nadA, encoding Quinolinate Synthetase, is located on the Cyanelle DNA from Cyanophora paradoxa. Plant Physiol 95: 329-330.

Michalowski, C. B., W. Loeffelhardt and H. J. Bohnert, 1991 An ORF323 with homology to crtE, specifying Prephytoene Pyrophosphate Dehydrogenase, is encoded by Cyanelle DNA in the eukaryotic alga Cyanophora paradoxa. J Biol Chem 266: 11866-11870. 213 Miyamoto, M. M. and J. Cracraft, Ed. (1991). Phylogenetic Analysis of DNA Sequences. New York, Oxford University Press.

Morden, C. W., K. H. Wolfe, C. W. dePamphilis and J. D. Palmer, 1991 Plastid translation and transcription genes in a non-photosynthetic plant: intact, missing and pseudo genes. The EMBO Journal 10: 3281-3288.

Morell, V., 1993 Huntington's Gene Finally Found. Science 260: 28-30.

Muller, W. and F. Gautier, 1975 Interactions of heteroaromatic compounds with nucleic acids. A*T-specific non-intercalating DNA ligands. Eur. J. Biochem 54: 385-394.

Nairn, C. J. and R. J. Perl, 1988 The complete nucleotide sequence of the small-subunit ribosomal RNA coding region for the cycad Zamia pumila: Phylogenetic implications. J. Mol. Evol. 27: 133-141.

Neefs, J. M., Y. Van de Peer, P. DeRijk, S. Chapelle and R. De Wachter, 1993 Compilation of small ribosomal subunit RNA structures. Nucleic Acids Research 21: 3025-3049.

Nei, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.

Nei, M. and T. Gojobor, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3: 418-426.

Nissen, P., M. Kjeldgaard, S. Thirup, G. Polekhina, L. Reshetnikova, B. F. C. Clark and J. Nyborg, 1995 Crystal Structure of the Ternary Complex of Phe-tRNA, EF-Tu, and a GTP Analog. Science 270: 1464-1472.

Noller, H. F., D. Moazed, S. Stern, T. Powers, P. N. Allen, J. M. Robertson, B. Weiser and K. Triman, 19? Structure of rRNA and its functional interactions in Translation, in The Ribosome: Structure, Function, and Evolution, edited by W. E. Hill, P. B. Moore, A. Dahlberget al. The American Society for Microbiology, Washington, DC.

Ochman, H. and A. C. Wilson, 1987 Evolution in bacteria: Evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26: 74-86. 214

Ohyama, K., H. Fukuzawa, T. Kohchi, H. Shirai, T. Sano, S. Sano, K. Umesono, Y. Shiki, M. Takeuchi, Z. Chang, S.-I. Aota, H. J. Inokuchi and H. Ozeki, 1986 Chloroplast gene organization deduced from complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322: 572-574.

Olsen, G., N. Larsen and C. Woese, 1991 The Ribosomal DNA Database Project. Nucleic Acids Research 19: 2017-2021.

Olsen, G. J., 1988 Phylogenetic Analysis using Ribosomal RNA. Methods in Enzymology 164: 793-812.

Olsen, G. J., H. Matsuda, R. Hagstrom and R. Overbeek, 1994 fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10: 41-48.

Olsen, G. J. and C. R. Woese, 1993 Ribosomal RNA: a key to phylogeny. FASEB J. 7: 113-123.

Pace, N. R., G. J. Olsen and C. R. Woese, 1986 Ribosomal RNA Phylogeny and the primary lines of Evolutionary Descent. Cell 45: 325-326.

Palmer, J. D., 1991 Plastid Chromosomes: Structure and Evolution, in The Molecular Biology of Plastids, edited by L. Bogorad and I. K. Vasil. Academic Press, Inc., New York.

Patterson, D. J. and J. Larsen, Ed. (1991) . The Biolocrv of Free-Living Heterotrophic Flagellates. Oxford, Clarendon Press.

Penny, D., M. D. Hendy and M. A. Steel, 1992 Progress with methods for constructing evolutionary trees. TREE 7: 73-79.

Piechulla, B. and H. Kuntzel, 1983 Eur. J. Biochem. 132: 235-240.

Pore, R. S., 1985 Prototheca taxonomy. Mycopathologia 90: 129-139.

Pringsheim, E. G., 1963 Farblose Algen. Gustav Fischer Verlag, Stuttgart.

Rausch, H., N. Larsen and R. Schmitt, 1989 Phylogenetic relationships of the green alga Volvox carteri deduced from small-subunit ribosomal RNA comparisons. J. Mol. Evol. 29: 255-265. 215 Reardon, E. M. and C. Price, 1994 Nomenclature of Sequenced Plant Genes. Plant Molecular Biology Report 12: Sl-81.

Rochaix, J. D., 1978 Restriction endonuclease map of the cpDNA of Chlamydomonas reinhardtii. J. Mol. Biol 126:

Ryan, R., D. Grant, K. S. Chiang and H. Swift, 1978 Isolation and characterization of mitochondrial DNA from Chlamydomonas reinhardtii. Proc. Natl. Acad. Sci. USA 75: 3268-3272.

Saitou, N. and M. Nei, 1987 The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406-425.

Sarich, V. M. and A. C. Wilson, 1973 Generation time and genomic evolution in primates. Science 179: 1144-1147.

Scherbel, G., W. Behn and C. G. Arnold, 1974 Untersuchungen zur genetischen Funktion des farblosen Plastiden von Polytoma mi rum. Arch. Microbiol. 96: 205-222.

Schlosser, U. G., 1984 Species-specific sporangium autolysins (cell-wall-dissolving enzymes) in the genus Chlamydomonas., 409-418 in Systematics of the Green Algae, edited by D. E. G. Irvine and D. John. Cambridge University Press, Cambridge.

Schwarz, Z. S. and H. Kossel, 1980 The primary structure of 16S rDNA from Zea mays chloroplast is homologous to E. coli 16S rRNA. Nature 283: 739-742.

Sembrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Lab Press, Cold Spring Harbor, NY.

Shinozaki, K., M. Ohme, M. Tanak, T. Wakasugi, N. Hayashida, T. Matsubayashi, N. Zaita, J. CHungwongse, J. Obokata, K. Yamaguchi-Shinozaki, C. Ohto, K. Torozawa, B. Y. Meng, A. Sugita, H. Deno, T. Kamogashira, K. Yamada, J. Kusuda, F. Takaiwa, A. Kato, N. Tohdoh, H. Shimada and M. Sugiura, 1986 The complete nucleotide sequence of tobacco chloroplast genome: its gene organization and expression. EMBO 5: 2043- 2050.

Siemeister, G., C. Buchholz and W. Hachtel, 1990 Genes for the plastid elongation factor Tu and ribosomal protein S7 and six tRNA genes on the 73 kb DNA from Astasia longa that resembles the chloroplast DNA of Euglena. Mol. Gen. Genet. 220: 425-432. 216 Siemeister, G. and W. Hachtel, 1989 A circular 73 kb DNA from the colourless flagellate Astasia longa that resembles the chloroplast DNA of Euglena: restriction and gene map. Curr. Genet. 15: 435-441.

Siemeister, G. and W. Hachtel, 1990 Organization and nucleotide sequence of ribosomal RNA genes on a circular 73 kbp DNA from the colourless flagellate Astasia longa. Curr. Genet. 17: 433-438.

Siemeister, G. and W. Hachtel, 1990 Structure and expression of a gene encoding the large subunit of ribulose-1,5- bisphosphate carboxylase (rbcL) in the colourless euglenoid flagellate Astasia longa. Plant Mol. biol. 14: 825-833.

Sigmund, C. D., M. Ettayebi and E. A. Morgan, 1984 Antibiotic resistance mutations in 16S and 23S ribosomal RNA genes of Escherichia coli. Nucleic Acids Research 12: 4653- 4663.

Siu, C.-H., K.-S. Chiang and H. Swift, 197 6 Characterization of Cytoplasmic and Nuclear Genomes in the colorless alga Polytoma. III. Ribosomal RNA cistrons of the nucleus and leucoplast. J. Cell Biol 69: 383-392.

Siu, C.-H., K. S. Chiang and H. Swift, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. V. Molecular structure and heterogeneity of leucoplast DNA. J. Mol. Biol. 98: 369-391.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. I. Ultrastructureal analysis of organelles. J. Cell. Biol. 69: 362-370.

Siu, C.-H., H. Swift and K. S. Chiang, 1975 Characterization of cytoplasmic and nuclear genomes in the colorless alga Polytoma. II. General characterization of organelle nucleic acids. 69: 371-382.

Sober, E., 1988 Reconstructing the Past: Parsimony, Evolution and Inference. MIT Press, Cambridge, Mass.

Sogin, M. L., 1991 The Phylogenetic Significance of sequence diversity and length variations in eukaryotic small subunit ribosomal RNA coding regions.

Sogin, M. L., H. J. Elwood and J. H. Gunderson, 1986 Evolutionary diversity of eukaryotic small-subunit rRNA genes. Proc. Natl. Acad. Sci. USA 83: 1383-1387. 217

Stewart, C.-B., 1993 The powers and pitfalls of parsimony. Nature 361: 603-607.

Suthers, G. K., S. M. Huson and K. E. Davies, 1992 Instability versus predictability: the molecular diagnosis of myotonic dystrophy. J. Med. Genet. 29: 761-765.

Swofford, D. L., 1993 PAUP: Phylogenetic Analysis Using Parsimony, Version 3.5p. Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.,

Takaiwa, P., K. Oona and M. Sugiura, 1984 The complete nucleotide sequence of a rice 17S rRNA gene. Nuc. Acids Res. 12: 5441-5448.

Tamura, K., 1992 Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C content biases. Molecular Biology and Evolution 9: 678-687.

Turmel, M . , G. Bellemare and C. lemieux, 1987 Physical mapping of differences between the chloroplast DNAs of the interfertile algae Chlamydomonas eugametos and Chlamydomonas moewusii. Current Genetics 11: 543-552.

Turmel, M . , R. R. Gutell, J.-P. Mercier, C. Otis and C. Lemieux, 1993 Analysis of the chloroplast large subunit ribosomal RNA gene from 17 Chalmydomonas Taxa. J. Mol. Biol. 232: 446-467.

Vernon-Kipp, D., S. A. Kuhl and C. W. J. Birky, 1989 Molecular evolution of Polytoma, a non-green chlorophyte., 284-286 in Physiology, Biochemistry, and Genetics of Nongreen Plastids, edited by C. T. Boyer, J. C. Shannon and R. C. Hardison. American Society of Plant Physiologists, Rockville, Maryland.

Vijgenboom, E., L. P. Woudt, P. W. H. Heinstra, K. Rietveld, J. van Haarlem, G. P. van Wezel, S. Shochat and L. Bosch, 1994 Three fcuf-like genes in the kirromycin producer Streptomyces ramocissimus. Microbiology 140: 983-998.

Weeden, N. F., 1981 Genetic and biochemical implications of the endosymbiotic origin of the chloroplast. J. Mol. Evolution 17: 133-139.

Wilcox, L. W . , L. A. Lewis, P. A. Fuerst and G. L. Floyd, 1992 Group I introns within the nuclear-encoded small- subunit rRNA gene of three green algae. Mol. Biol. Evol. 9: 1103-1118. 218 Wilson, A. C., S. S. Carlson and T. J. White, 1977 Biochemical evolution. Annu. Rev. Biochem. 46: 573-63 9.

Wilson, A. C., H. Ochman and E. M. Prager, 1987 Molecular time scale for evolution. TIG 3: 241-247.

Wimpee, C. F., R. Morgan and R. Wrobel, 1992 An aberrant plastic ribosomal RNA gene cluster in the root parasite Conopholis americana. Plant Mol. Biol. 18: 275-285.

Wimpee, C. F., R. Morgan and R. L. Wrobel, 1992 Loss of transfer RNA genes from the plastid 16S-23S ribosomal RNA gene spacer in a parasitic plant. Curr. Genet. 21: 417-422.

Wimpee, C. F., R. L. Wrobel and D. K. Garvin, 1991 A divergent plastic genome in Conopholis americana, an achlorophyllous parastic plant. Plant Mol. Biol. 17: 161-166.

Woese, C. R., Pace, N. R. Probing RNA structure, function, and history by comparative analysis, in The RNA World, edited by The Cold Spring Harbor Laboratory Press, Cold Spring Harbor.

Wolfe, A. D. and C. W. dePamphilis, 1995 Alternate paths of evolution for the photosynthetic gene rhc in four nonphotosynthetic species of Orohanche. Plant Molecular Biology (submitted)

Wolfe, K. H., D.S. Katz-Downie, C. W. Morden and J. D. Palmer, 1992 Evolution of the plastid ribosomal RNA operon in a nongreen parasitic plant: Accelerated sequence evolution, altered promoter structure, and tRNA pseudogenes. Plant Mol. Biol. 18: 1037-1048.

Wolfe, K. H., C. W. Morden, S. C. Ems and J. D. Palmer, 1992 Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304-317.

Wolfe, K. H ., C. W. Morden and J. D. Palmer, 1991 Ins and outs of plastid genome evolution. Curr. Opinion Genet. Develop. 1: 523-529.

Wolfe, K. H., C. W. Morden and J. D. Palmer, 1992Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc. Nat. Acad. Sci. USA 89: 10648-10652. 219 Wu, C.-I. and W.-H. Li, 1985 Evidence for higher rates of nucleotide substitution in rodents than in man. Proc. Natl. Acad. Sci. USA 82: 1741-1745.

Wu, M., J. K. Lou, C. H. Chang, Z. Q. Nie and X. M. Wang, 1986 Initiation of chloroplast DNA replication, in Extrachromosomal Elements in lower eukaryotes, edited by R. B. Wickner, A. Hinnebusch, A. M. Lambowitz, I. C. Gunsalus and A. Hollaender. Plenum Press, New York.

Yamada, T., 1983 Characterization of IR sequences and rRNA genes of chloroplast DNA from Chlorella ellipsoidea. Current Genetics 7:

Yeh, K. C., K. Y. To, S. W. Sun, M. C. Wu, T. Y. Lin and C. C. Chen, 1994 Point mutations in the chloroplast 16S rRNA gene confer streptomycin resistance in Nicotiana plumbaginifolia. Current Genetics 26: 132-135.

Zuckerkandl, E. and L. Pauling, 1962 Molecular disease, evolution and genic heterogeneity, in Horizons in Biochemistry, edited by M. Kasha and B. Pullman. Academic Press, New York.

Zuker, M., 1989 On finding all suboptimal foldings of an RNA molecule. Science 244: 48-52.