
University Microfilms International A Bell & Howell Information Company 300 N. Zeeb Road, Ann Arbor, Michigan 48106

8625279

Ralph, David Allen

EVOLUTION OF CYTOPLASMIC GENOMES

The Ohio State University Ph.D. 1986


Copyright 1986 by Ralph, David Allen All Rights Reserved


EVOLUTION OF CYTOPLASMIC GENOMES

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of the Ohio State University

By

David Allen Ralph, B.S., M.S.

* * * * *

The Ohio State University

1986

Dissertation Committee:

P. S. Perlman

P. A. Fuerst

S. Falkenthal

Approved by

Advisor

Molecular, Cellular and Developmental Biology Program

Copyright by David Allen Ralph 1986

This dissertation is dedicated to Peggy McDaniel without whose help I could not have completed this work.

ACKNOWLEDGEMENTS

I would like to express my deepest thanks to John Briggs, Karl Joplin, Jason Tash, Makis Skoulakis, Elio Vanin, Kirk Mecklenburg, Chip Pretzman, Yasoko Rikihisha, and Phil Perlman for assisting and supporting me during my years in graduate school.

VITA

August 2, 1954 ...... Born - Dallas, Texas

1976 ...... B.S., Ohio State University

1979 ...... M.S., Ohio State University

1981-1985 ...... Intern, Ohio Department of Health

PUBLICATIONS

Streett, D.A., David Ralph, and Fred Hink. 1980. Replication of Nosema algerae in three insect cell lines. J. Protozool. 27:113-117.

Hink, W.F., D.A. Ralph, and K.H. Joplin. 1985. Metabolism and characterization of insect cell lines, in Comprehensive Insect Physiology, Biochemistry and Pharmacology, eds. G.A. Kerkut and L.I. Gilbert. Pergamon Press. New York.

Pretzman Jr., Charles I., David Ralph, Lynn Mishler, and Joyce Bodine. 1985. Rapid separation of IgM from whole serum using spun column chromatography. J. Immunological Methods. 63:301-307.

Pretzman Jr., Charles I., Yasoko Rikihisha, and David Ralph. 1986. Serological diagnosis of Potomac horse fever. Clinical Microbiology (submitted).

PUBLICATIONS IN PREPARATION

Ralph, David and Phil Perlman. Structure of the Cox I gene in ten species of

Ralph, David and Phil Perlman. A newly observed intron in the Cox I gene of Saccharomyces capensis has an unusual structure.

Ralph, David, Charles I. Pretzman Jr., Karl Poetter, Scott Gordon, Jon Clark, Paul Fuerst, and Phil Perlman. Molecular differentiation of Rickettsia.

FIELDS OF STUDY

Major Field: Molecular Biology (Dr. Phil Perlman-advisor)

Minor Field: Molecular biology and in vitro cultivation of parasitic organisms. (Dr. Paul Fuerst, Dr. Fred Hink, and Dr. Yasoko Rikihisha)

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
VITA
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER I: INTRODUCTION
I.A. Preface
I.B. Introduction to organelle introns
I.B.1. Why choose Saccharomyces sp. as a model system to study mitochondrial functions?
I.B.2. Types of mutants which affect mitochondrial function
I.B.3. The organization of yeast and mammalian mitochondrial genomes
I.B.4. Yeast mitochondrial genes contain introns
I.B.5. Other fungal mitochondrial genomes contain introns
I.B.6. Translation products of fungal mitochondrial introns
I.B.7. Other translation products of fungal mitochondrial introns: Maturases
I.B.8. Cis-acting sequences required for introns related to cob I4
I.B.9. Cis-acting mutants in other introns in the S. cerevisiae mitochondrial genome
I.B.10. Computer modelling RNA secondary structures of yeast mitochondrial introns
I.B.11. Amino acid homologies between intronic ORF of yeast
I.B.12. Autocatalytic and catalytic activities of group I introns
I.B.13. Oxi I5g, a group II intron, is also autocatalytic
I.B.14. Nuclear pre-mRNA splicing: Similarities with group II mitochondrial introns
I.B.15. All fungal mitochondrial introns belong to either class I or class II
I.B.16. Introns in organelle genes of plants
I.B.16.a. The cytochrome oxidase subunit II gene in the mitochondrial genomes of flowering plants
I.B.16.b. Introns in chloroplast tRNA genes
I.B.16.c. Introns in chloroplast genes other than tRNA genes
I.B.17. A class I intron in the thymidylate synthase gene of the bacteriophage T4
I.B.18. Possible class I intron in an archaebacterial 23S rRNA gene
I.B.19. Short regions of high G+C content in the mitochondrial genome of S. cerevisiae
I.B.20. Conclusion
I.C. Taxonomy and physiology of Rickettsia
I.C.1. Taxonomy of the family Rickettsiaceae
I.C.2. Typhus group rickettsia
I.C.3. The Rocky Mountain Spotted Fever group
I.C.4. Scrub typhus
I.C.5. Antigens shared between different groups within the genus Rickettsia
I.C.6. Rickettsial physiology
I.C.7. The genus Rochalimaea
I.C.8. Rickettsia are a good model for early mitochondrial evolution

CHAPTER II: METHODS
II.A. Strains used
II.B. Media and buffers used
II.C. Purification of mitochondrial DNA
II.D. Tissue culture, growth of rickettsia and purification of rickettsial DNA
II.E. Separation of restriction endonuclease derived fragments of DNA on agarose gels and band isolation
II.E. DNA blot hybridization
II.F. Labeling of DNA probes
II.G. Construction of recombinant clones
II.H. Transformation of bacteria
II.I. Bal-31 deletions
II.J. Screening colorless M13 plaques for inserts
II.K. Isolation of single stranded M13 DNA
II.L. DNA sequencing of M13 derived ssDNA
II.M. The Gimenez hemolymph stain
II.N. The microimmunofluorescence (micro IF) test

CHAPTER III: RESULTS
III.A. The structure of the oxi3 gene in 11 species of yeast
III.A.1. Southern blots probed with intron specific sequences
III.A.2. Cloning aI3b and the novel HindIII fragment from S. capensis into pEMBL-18
III.A.3. Recombinant clones used to sequence novel regions of the S. capensis oxi3 gene
III.A.4. The sequence of oxi3 I3b
III.A.5. Oxi3 I3b has a long ORF which is continuous with the 5' exon
III.A.6. Intron aI3b contains sequences which are homologous to the conserved cis-acting sequences of class I introns
III.A.7. Possible unusual sequence interactions in aI3b
III.A.8. The distribution of nucleotides in aI3b
III.B. Comparison between cob I3 from S. cerevisiae and the cob intron in A. nidulans
III.C. Serological and genomic differentiation of Rickettsia
III.C.1. Antigenic variation in the genus Rickettsia
III.C.2. Restriction site polymorphisms differentiate Rickettsia

CHAPTER IV: DISCUSSION
IV.A. Introns in organelles
IV.A.1. The distribution of introns in the oxi3 gene of yeast species in the genus Saccharomyces supports the hypothesis that all yeast mitochondrial introns are optional
IV.A.2. Oxi3 I3b is a divergent class I intron
IV.A.3. Evidence for lateral translocation of organelle introns between species
IV.A.4. Deletions and insertions are important mechanisms in the evolution of organelle introns
IV.A.5. Similarities between organelle introns and infectious agents
IV.A.6. Future directions
IV.B. The beginning of rickettsial molecular

TABLES
FIGURES
BIBLIOGRAPHY

LIST OF TABLES

Table

1. P and Q sequences of class I introns
2. R and S sequences
3. Yeast strains used in this study
4. Rickettsia sp. used in this study
5. Zero mixes
6. Termination mixes
7. Distribution of oxi3 introns: Summary
8. Codon usage in aI3b's ORF
9. Rickettsial serology
10. Estimated percent mismatched pair between rickettsial genomes

LIST OF FIGURES

Figure

1. Structure of the yeast mitochondrial genome
2. Interactions between conserved sequence elements in class I introns
3. Interactions between conserved sequence elements in aI3b
4. Computer generated secondary structure of aI4
5. Computer generated secondary structure of aI1
6. EcoR1 digests of mtDNA from 11 species of yeast
7. Location of probes
8. Yeast mtDNAs probed with aI1 specific sequences
9. Yeast mtDNAs probed with aI2 specific sequences
10. Yeast mtDNAs probed with aI3 specific sequences
11. Yeast mtDNAs probed with aI4 specific sequences
12. Yeast mtDNAs probed with aI5g specific sequences
13. BamH1 plus EcoR1 digests of mtDNAs
14. BamH1 plus EcoR1 digests of mtDNAs hybridized to pKM-2
15. Taq 1 digests of mtDNAs probed with pKM-2
16. Mapping oxi3 inserts in selected species of yeast
17. HindIII plus EcoR1 digests of mtDNA
18. HindIII plus EcoR1 digests of mtDNA probed with pKM-2
19. Yeast mtDNAs probed with aI3b specific sequences
20. Yeast mtDNAs probed with aI4b specific sequences
21. AvaI plus PstI digests of yeast mtDNAs
22. BamH1 plus PstI digests of yeast mtDNAs (#1)
23. BamH1 plus PstI digests of yeast mtDNAs (#2)
24. BamH1 plus PstI digests probed with EcoR1 band 7 specific sequences (#1)
25. BamH1 plus PstI digests probed with EcoR1 band 7 specific sequences (#2)
26. Bcl21 and Hind16 compared to S. capensis mtDNA
27. Progressive Bal 31 digestion of Bcl21
28. Strategy for sequencing aI3b
29. Sequence of aI3b
30. Amino acid sequence for ORF found in aI3b
31. Distribution of stop codons in aI3b
32. Amino acid homology between aI3b and cob I4
33. Interaction between R and S in aI3b
34. Interaction between P and Q in aI3b
35. Interaction between E and E' in aI3b
36. Proposed secondary structure of aI3b
37. Proposed secondary structure of the 3' end of aI3b
38. Homology between G+C clusters in aI3b
39. Comparison of cob I4 and oxi3 I4
40. Sequence comparison of cob I3 of S. cerevisiae and the cob intron of A. nidulans
41. Comparison of amino acid sequences from the ORF of cob I3 and the A. nidulans cob intron
42. Control lambda digests for rickettsial blots
43. HindIII digests of rickettsial DNAs (#1)
44. Rickettsial DNAs probed with "EcoR3"
45. Rickettsial DNAs probed with "Pst1"
46. Rickettsial DNAs probed with "Pst2"
47. HindIII digests of rickettsial DNAs (#2)
48. Rickettsial DNAs probed with 16S RNA specific sequences
49. Line drawing of "EcoR3" data
50. Line drawing of "Pst1" data
51. Line drawing of "Pst2" data
52. Line drawing of "16S" data
53. Summary of rickettsial data

Chapter I: Introduction

I.A. Preface

All eucaryotic cells are fundamentally more complex than eubacterial and archaebacterial cells. One manifestation of this increased complexity is that all eucaryotic cells are internally compartmentalized.

Eucaryotic cells are composed of a series of membrane bound compartments (organelles) in which various functions and metabolic pathways are sequestered. These organelles are surrounded by a highly organized, metabolically active fluid-gel called the cytosol or cytoplasm which is, in turn, bounded by the cell's outer membrane.

Another manifestation of eucaryotic complexity is that all eucaryotes possess multicomponent genomes. In the simplest case, the genome of a eucaryote is divided into a number of very large linear DNA molecules which are packaged and organized by association with a wide variety of proteins and RNAs into structures called chromosomes. These chromosomes are localized in an organelle called the nucleus, in which RNA is transcribed, processed and/or modified, and exported to the cytoplasm in association with its own series of accessory proteins. Some RNAs do not leave the nucleus. The possible functions of these nuclear RNAs will be discussed later in this introduction.

In addition to the division of the eucaryotic genome into chromosomes, most eucaryotic cells contain one or more additional types of organelles which contain their own metabolically active and independently replicated DNAs. Typically, these organelles are divided into two groups, mitochondria and chloroplasts. The genomes of these organelles encode their own ribosomal RNAs, tRNAs, and a limited set of proteins and other functions. In fact, these organelles probably evolved from once free living procaryotes which subsequently became endosymbiotic with the ancestral eucaryotic cell (Sagan, 1967, Yang et al., 1985, Kuntzel et al., 1981, Gray, 1983 and Gray et al., 1984).

As time progressed, it is hypothesized that the protochloroplasts and protomitochondria became obligately symbiotic. In this process, the protoorganelle genomes became reduced. First, genes which encoded "housekeeping" functions which were now being provided by the eucaryotic host were lost. Then DNA encoding essential organellar functions (proteins) was moved from the organelle genome into the nuclear genome. Once in the nucleus, these genes were transcribed in the nucleus and translated on cytoplasmic ribosomes, and their products were imported back into the organelle (for example Ibrahim and Butow, 1980, Neupert and Schatz, 1981, Highfield and Ellis, 1978, Grasser et al., 1982, Daum et al., 1982, and Grasser et al., 1982). It is probable that this feature of essential import is what separates endosymbiotic bacteria from organelles.

The movement of DNA between various cellular compartments is probably an ongoing process. There is evidence for fragments of genes or entire genes recently moving from the mitochondria to the nucleus in the filamentous fungi (Schmidt et al., 1983), in yeast (Farrelly and Butow, 1983, and Fox, 1983) and in humans (Nomiyama et al., 1984). In corn there are sequences in the nucleus which are homologous to some mitochondrial (Lerner et al., 1983) and in the mitochondrial genome there are sequences which appear to have been recently stolen from the chloroplast genome (Stern and Lonsdale, 1982 and Stern and Palmer, 1984).

These extremely ancient symbiotic relationships, with their numerous accompanying genome rearrangements, have led to the evolution of the modern eucaryotic cell. This cell type is composed of multiple genomes with divergent evolutionary histories. These relationships are extremely mutualistic. The eucaryotic host nucleus receives an extreme selective advantage by acquiring the respiratory functions of the mitochondria and/or the photosynthetic functions of the chloroplasts, while mitochondria and chloroplasts are mere fragments of their free living ancestors and have no existence outside the eucaryotic cell. The success of this system requires intimate cooperation and coordinated expression of the various genomes.

While reading this dissertation, the reader may not recognize the common thread which unifies this work. The unifying theme was to examine how endosymbionts evolve to become organelles and how the evolution of organelle genomes is ongoing. To answer the first part, I have examined the divergent structures of the mitochondrial gene for cytochrome oxidase subunit I in ten different species of yeast in the genus Saccharomyces, and to answer the second part, I have developed a model system involving endosymbiotic bacteria in the genus Rickettsia.

I.B. Introduction to organelle introns

I.B.1. Why choose Saccharomyces sp. as a model system to study mitochondrial functions?

Yeast within the genus Saccharomyces are somewhat unusual in that in the presence of glucose they repress numerous metabolic pathways, including respiration, and rely on the fermentation of glucose to ethanol and carbon dioxide to meet their energy needs (Perlman and Mahler, 1974). This phenomenon, termed "glucose repression", results in a rapidly dividing yeast population which produces a great deal of ethanol. This little piece of biological knowledge has escaped almost no one's attention since mankind developed the technology to make pots.

What is more important is that, if one feeds yeast glucose, then the nuclear genes which produce products required for mitochondrial respiration, and all the genes on the mitochondrial genome, are not required for growth. Therefore, tight unconditional mutants in the mitochondrial genome and in the nuclear genes encoding mitochondrial proteins can be maintained in yeast. These would be lethal in most eucaryotes which are not facultative anaerobes. The appreciation of these facts dates to Slonimski and Ephrussi (1949) and the cytoplasmic inheritance of some of these mutants to Mounolou et al. (1966 and 1968).

Another important advantage in yeast mitochondrial genetics is that yeast are sexual organisms. Haploid yeast cells are usually phenotypically one of two mating types, "a" or "alpha". In the presence of the opposite mating type (or the mating pheromone of the opposite mating type) a haploid cell undergoes a relatively simple differentiation to become a gamete. Gametes of the opposite mating types fuse to become diploid zygotes. Zygotes are capable of vegetative replication by mitosis until environmental cues trigger the onset of meiosis. Haploid cells are also capable of vegetative mitotic replication, but this rarely occurs because wild type haploid cells switch their mating types once per cell cycle. Thus after one cell cycle haploid cells usually undergo gametogenesis and fuse to their sister (mother) cell.

The process of mating type switching is inherently interesting and has been the focus of much research (Tatchell et al., 1981, Klar et al., 1982, and Nasmyth, 1982). Basically, whether a haploid cell is "a" or "alpha" mating type is controlled by the "Mat" locus on chromosome 3. When the cell is "a", the Mat locus contains "a" specific genes. When the cell is "alpha", the Mat locus contains "alpha" specific genes. The Mat locus is different when it is "mat-a" or "mat-alpha". In addition, there are single transcriptionally silent copies of "mat-a" and "mat-alpha" about a hundred kilobases away from the transcriptionally active Mat locus on opposite ends of the same chromosome. Once per cell cycle, the sequence at the transcriptionally active Mat locus is switched to the opposite mating type by means of an extremely efficient unidirectional gene conversion involving the silent mating type locus of the opposite type. What is extremely interesting is that this unidirectional gene conversion is initiated by a double strand cut at a specific site in the transcriptionally active Mat locus. The endonuclease responsible for this double strand cut is the product of the "HO" locus (Nasmyth, 1985).

There are two reasons why this is important to mitochondrial genetics. First, there is an analogous unidirectional gene conversion event which occurs in the mitochondria and will be discussed later. Second, there are mutants at the "HO" locus of S. cerevisiae which prevent mating type switching. Thus, it is possible to maintain haploid strains of this yeast species and control the parentage of the progeny of any mating. It is the existence of mutant "ho" loci which makes all yeast genetics possible.
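The cassette mechanism described above can be illustrated with a toy model. The following Python sketch is only an illustration under simplifying assumptions: the class and function names are hypothetical, switching is treated as occurring exactly once per cell cycle, and a mutant "ho" allele is assumed to abolish switching completely.

```python
from dataclasses import dataclass


@dataclass
class Chromosome3:
    """Toy cassette model: one expressed Mat locus plus two silent copies."""
    mat: str                     # allele at the transcriptionally active locus
    silent_a: str = "a"          # silent copy of the "a" information
    silent_alpha: str = "alpha"  # silent copy of the "alpha" information


def switch_mating_type(chrom, ho_functional=True):
    """Return the chromosome as it would look after one cell cycle.

    With a functional HO endonuclease the active Mat locus is cut and
    repaired by copying the silent cassette of the opposite type. The
    conversion is unidirectional: the silent donors are never altered.
    """
    if not ho_functional:
        return chrom  # mutant "ho": no double strand cut, so no switch
    donor = chrom.silent_alpha if chrom.mat == "a" else chrom.silent_a
    return Chromosome3(donor, chrom.silent_a, chrom.silent_alpha)


def mating_types_over_cycles(start, cycles, ho_functional=True):
    """List the expressed mating type at the start and after each cycle."""
    chrom = Chromosome3(mat=start)
    history = [chrom.mat]
    for _ in range(cycles):
        chrom = switch_mating_type(chrom, ho_functional)
        history.append(chrom.mat)
    return history


if __name__ == "__main__":
    print(mating_types_over_cycles("a", 3))
    print(mating_types_over_cycles("a", 3, ho_functional=False))
```

In this sketch a wild type lineage alternates between the two mating types, whereas an "ho" mutant lineage keeps a stable mating type, which is the property that allows haploid strains to be maintained and crossed in a controlled way.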

There is one other important consideration relevant to choosing to study mitochondrial genetics in yeast. It is that, unlike most gamete fusion events, both parents in a yeast mating contribute mitochondrial genomes to the zygote. In other words, the zygote can be functionally heterozygous or heteroplasmic and contain two different mitochondrial genomes. After a time, these genomes will separate in a process of vegetative segregation, but before they do, recombination between the two genomes is common (Birky, 1983). Complementation groups can be established and standard cis-trans tests can be performed.

I.B.2. Types of mutants which affect mitochondrial function

Mutants which affect respiration are easily recognized on agar plates with a medium containing glucose. Mutants in respiration related genes form small colonies by fermenting the glucose to ethanol. Wild type colonies are much larger because, after the glucose is exhausted, they can switch on respiratory functions and continue to grow by utilizing the ATP generated by the energetically more efficient respiratory chain. There are, of course, other genetic reasons why mutant cells might produce small colonies, but the majority of the small colony mutants are respiration related.

Nonrespiration related mutants can be easily detected by counter screening for growth on a nonfermentable carbon source such as glycerol or lactate. Respiration related mutants will not grow on nonfermentable carbon sources.
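The two-step screen described above amounts to a simple decision rule. The following Python sketch is illustrative only; the function name and the phenotype labels are assumptions introduced here, and real screens would of course involve replica plating and scoring of many colonies.

```python
def classify_colony(small_on_glucose, grows_on_nonfermentable):
    """Classify a colony from two plate observations.

    small_on_glucose: colony is small on glucose medium.
    grows_on_nonfermentable: colony grows on a nonfermentable carbon
    source such as glycerol or lactate.
    """
    if not grows_on_nonfermentable:
        # Cannot respire: growth is possible only by fermentation.
        return "respiration deficient"
    if small_on_glucose:
        # Small colony, but respiration is intact.
        return "small colony, not respiration related"
    return "wild type"


if __name__ == "__main__":
    observations = [(True, False), (True, True), (False, True)]
    for small, grows in observations:
        print(small, grows, "->", classify_colony(small, grows))
```

The counter screen on the nonfermentable carbon source is checked first because it is the decisive observation: colony size on glucose alone cannot distinguish respiration related mutants from other slow growing mutants.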

There are seven classes of mutants which affect mitochondrial functions. The first class which will be discussed includes the "pet" mutants. Pet mutants are respiration deficient and have primary lesions in products encoded by the nucleus. As it turns out, the vast majority of mitochondrial proteins are encoded by the nucleus. Since this class of mutants was discovered by Sherman and Slonimski (1964) there have been more than four thousand different pet mutants isolated (Michaelis et al., 1982). These mutants fall into hundreds of complementation groups and affect various functions from charging mitochondrial tRNAs to Krebs cycle proteins to maturation and stability of mitochondrial mRNAs (Pape and Tzagoloff, 1984, Deickerman et al., 1984, and Muller et al., 1984).

The second and third classes of mutants affecting mitochondrial functions are referred to as mit- and syn- mutants. These mutants are missense or frameshift mutations, or relatively small deletions or insertions of various sizes, in the structural genes or cis-acting regions of the mitochondrial genome (Tzagoloff et al., 1975). In the simplest case mit- mutants affect only one mitochondrially encoded product. There is an interesting exception, which will be discussed in detail later, in which a product encoded by the same region that also encodes the cytochrome b apoprotein (cob) is required for the expression of both cob and cytochrome oxidase subunit I (oxi3) (Church et al., 1979, Lazowska et al., 1980, Alexander et al., 1980, and Dhawale et al., 1981).

On the other hand, syn- mutants affect mitochondrially encoded genes which are required for mitochondrial protein synthesis. Obviously, mutants which affect protein synthesis can have catastrophic effects on the expression of all mitochondrially encoded proteins. These mutants are usually in rRNA genes, tRNA genes or processing sites required for maturation of these RNAs. There is one protein gene in this class. It is the var1 gene (Perlman et al., 1977, and Zassenhaus and Perlman, 1982). The gene encodes a protein which is part of the mitochondrial small ribosomal subunit. It is the only ribosomal protein encoded by the mitochondria.

It is interesting that most of these syn- mutants are leaky. It appears that some level of mitochondrial protein synthesis is required for the stability of the mitochondrial genome (Julou and Bolotin-Fukuhara, 1982). This phenomenon may be related to the observation that at least one nuclear mutant which affects RNA processing also results in the instability of the mitochondrial genome (Labouesse et al., 1985). One possible explanation of this (not given by the authors) is that protein synthesis is blocked by a failure of the L-rRNA to mature. The instability of the mitochondrial genome may be secondary.

The fourth and fifth classes of mutants affecting mitochondrial functions are also related. These are the rho- and rho(zero) mutants. Rho- mutants (also called "petite" mutants) have sustained huge deletions of the mitochondrial genome (Perlman, 1978). Typically, seventy five percent or more of the mitochondrial genome has been deleted. The small piece of remaining mitochondrial DNA is unable to direct the synthesis of any mitochondrially encoded proteins since it is impossible to delete over half of the mitochondrial genome and still have a full set of tRNAs, rRNAs, and var1. Petite mutants, however, do transcribe RNA and perform those RNA processing events which are either autocatalytic or catalysed by nuclear encoded proteins. Petite mutants are also capable of recombination with mit- and syn- mutants to restore wild type sequence and, therefore, function. Rho(zero) mutants are an extreme example of a petite in which the entire mitochondrial genome is deleted.

The sixth class of mutants related to mitochondrial functions do not affect respiration. These mutants are resistant to antibiotics which selectively affect mitochondrially encoded products (reviewed by Dujon, 1981). These antibiotics are chloramphenicol and erythromycin, whose activity is tempered by mutations in the mitochondrial large rRNA gene, and oligomycin, which affects the ATPase and is tempered by mutations in the mitochondrial genes encoding subunits 6 and 9 of the ATPase. In addition, paromomycin affects the small ribosomal RNA, and antimycin A, funiculosin and diuron all affect cytochrome b. There are mutants in the genes encoding these products which also make the yeast cell resistant to the effects of these drugs.

The seventh class of mutants are the second site suppressors of pet-, mit-, or syn- mutants. There are nuclear encoded suppressors of mitochondrial mutants (Anziano, 1984, Labouesse et al., 1985). There are mitochondrial suppressors of nuclear pet- mutants (Muller et al., 1984) and there are mitochondrially encoded suppressors of mitochondrial mit- mutants (for example Haldi, 1985, and Hill et al., 1985). This class of mutants is very useful in that they give strong evidence that two gene products or domains interact physically.

I.B.3. The organization of yeast and mammalian mitochondrial genomes.

The two most intensely studied mitochondrial genomes are the yeast mitochondrial genome and the mammalian mitochondrial genome. In addition to the extensive genetic analysis of the yeast mitochondrial genome, large sections of this genome have been sequenced (reviewed by de Zamaroczy and Bernardi, 1985). Unfortunately, the sequencing of the yeast mitochondrial genome is not complete. There are still roughly 20 kilobase pairs (kb) of this circular 85 kb genome which are unknown. The main reason for this is that researchers only sequence regions which contain known mutants or structural genes.

I believe this is a big mistake.

On the other hand, mammalian mitochondrial systems do not lend themselves to genetic analysis because recombination experiments are very difficult or impossible and respiration deficient mutants are lethal to intact mammals. Antibiotic resistance markers have been used to study some mitochondrial genes in tissue culture (Wallace et al., 1982, Kearsey and Graig, 1982, and Doersen and Staubridge, 1982). Fortunately, the entire mitochondrial genomes of humans (Anderson et al., 1981), bovines (Anderson et al., 1982), and mice (Bibb et al., 1981) have been sequenced.

While single base substitutions abound between these three mammalian mitochondrial genomes, the overall structures and information contents of these DNAs are almost identical. The products of the open reading frames (ORFs) have been determined either by comparison of the deduced amino acid sequences of the ORFs with the amino acid sequences of known proteins (Anderson et al., 1981) or by immunological techniques involving antibodies raised against synthetic polypeptides predicted from portions of the ORFs cross reacting with known protein complexes (Chomyn et al., 1985). The mammalian mitochondrial genome encodes a large (16S) and a small (12S) rRNA, 22 tRNAs, six subunits of the NADH reductase complex, three subunits of the cytochrome oxidase complex, two subunits of the mitochondrial ATPase complex, and the cytochrome b apoprotein. In addition it contains a roughly eight hundred base pair region called the D-loop which contains the origin of replication for the heavy (H) strand of the mitochondrial DNA. The D-loop is also interesting in that it is often maintained as a partially replicated triple stranded region in which one strand is not base paired with any DNA (reviewed by Clayton, 1982). The light (L) strand origin of replication is a separate roughly thirty base pair domain located in a tRNA cluster. This origin would be capable of forming a stem loop structure with an eleven or twelve base pair stem if it were in a single stranded configuration.
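Whether a single stranded region can fold back on itself into a stem loop can be checked by testing its two ends for complementarity. The following Python sketch is a minimal illustration, not an RNA or DNA folding program: it considers only perfect Watson-Crick pairs between the 5' and 3' ends, the function names are hypothetical, and the thirty base sequence in the example is invented for demonstration and is not the actual L strand origin.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def terminal_stem_length(dna, min_loop=3):
    """Count consecutive base pairs formed between the 5' and 3' ends.

    Base i from the 5' end is paired with base i from the 3' end. The
    count stops at the first mismatch, or when fewer than min_loop
    unpaired bases would remain in the loop.
    """
    max_stem = (len(dna) - min_loop) // 2
    stem = 0
    while stem < max_stem and dna[stem] == COMPLEMENT[dna[-1 - stem]]:
        stem += 1
    return stem


if __name__ == "__main__":
    arm = "CTTCTCCCGCC"   # invented 11 base arm
    loop = "TTTTTTTT"     # invented 8 base loop
    candidate = arm + loop + reverse_complement(arm)
    print(len(candidate), terminal_stem_length(candidate))
```

A real analysis would also allow G-T pairs, bulges, and internal stems, all of which this sketch deliberately ignores.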

The mammalian mitochondrial genome is a surprisingly compact, 16 kb, circular molecule. Structural genes are encoded on large polycistronic RNA precursors which are punctuated by tRNA genes. Trimming the RNA by simply cutting out the tRNAs is sufficient to generate the mRNA precursors. There are few or no encoded 5' or 3' untranslated regions. Many of the mRNA precursors end with a single U or UA dinucleotide instead of a standard UAA termination codon. The termination codons are generated when the pre-mRNAs are polyadenylated (Battey and Clayton, 1978). There are no introns encoded in the primary transcripts of the mammalian mitochondrial genome.

Although the yeast mitochondrial genome has not been completely sequenced, a great deal is known about its basic structure and organization. First of all, yeast mitochondrial genes are not punctuated by tRNA genes. Also, while yeast mitochondrial genes are transcribed in a polycistronic manner, there are large 5' and 3' untranslated regions (Thalenfeld et al., 1983, 19 8, Van Ommen et al., 1979, Grivell et al., 1982, Fox and Boerner, 1980, and Cobon et al., 1982). Sometimes these regions are larger than the ORFs of the structural genes themselves (Hensgens et al., 1980). Furthermore, mRNAs of yeast mitochondria are not polyadenylated.

The basic structure and organization of the yeast mitochondrial genome for strain ID41-6/161 is presented in Figure 1. As far as is known, this genome encodes a large (21S) and a small (15S) rRNA, 25 tRNAs, and an extremely interesting, metabolically important 9S RNA from the "tsl" (tRNA synthesis) locus (Miller and Martin, 1983). In addition, it encodes the structural genes for subunits 1, 2, and 3 of the cytochrome oxidase complex (Cox1, Cox2, and Cox3). For historic reasons, the genes encoding these proteins are termed oxi3, oxi1, and oxi2, respectively. The genome also encodes three subunits of the mitochondrial ATPase complex (subunits 6, 8, and 9), the genes of which are referred to as oli2, aap1, and oli1, respectively. The two remaining major translational products of the yeast mitochondrial genome are the cytochrome b apoprotein (cob) and the var1 gene product, which is part of the small ribosomal subunit. Finally, there are various numbers of URFs in this genome. The number of URFs is strain dependent. None of these URFs has significant homology with the mammalian mitochondrial genome. The metabolic functions of these URFs are covered in several other sections of this dissertation.

Given the extreme divergence between the structure and organization of the mammalian and yeast mitochondrial genomes, it is surprising to find that they encode a common set of products. These are CoxI, CoxII, CoxIII, ATPase subunits 6 and 8, the cytochrome b apoprotein, and, of course, the rRNAs and tRNAs. In fact, all mitochondrial genomes which have been studied in detail have been found to encode at least the same three subunits of cytochrome oxidase, ATPase subunit 6, and the cytochrome b apoprotein, in addition to rRNAs and tRNAs.

I.B.4. Yeast mitochondrial genes contain introns

Sequences found in the mature translatable mRNAs or functional rRNAs and tRNAs are called exons. Sequences which are found between exons and which must be spliced out of precursor RNAs to make mature and functional RNAs are called introns. Figure 1 presents a composite map of the mitochondrial genomes of three widely studied strains of Saccharomyces sp. yeast (Sanders et al., 1977, van Ommen et al., 1979, and Hensgens et al., 1983). These strains are ID41-6/161, D273-10B, and S. carlsbergensis. The most striking observation about this comparison is the presence of various numbers of introns interrupting the genes for the L-rRNA, the cytochrome b apoprotein (cob), and CoxI (oxi3). For simplicity, strain ID41-6/161 (hereafter referred to as 161) will be used as the reference strain when developing a nomenclature for yeast mitochondrial introns. One intron which occurs in some laboratory strains (e.g., D273-10B) does not occur in 161. This intron is in the L-rRNA gene and will be called omega (Bolotin et al., 1971, Dujon, 1980). The other introns will be denoted by the gene in which they are found and numbered in a 5' to 3' manner. Therefore, the first intron in cob is termed cob I1 (or just bI1) and the last is termed cob I5 (bI5). In the oxi3 gene of strain D273-10B the introns are called oxi3 I1, I2, I3, I4, and I5gamma (or simply oxi3 I5g) (Bonitz et al., 1980a and 1980b). In other laboratory strains (e.g., 161), there are two additional introns in the oxi3 gene which interrupt the fifth exon of strain D273-10B (Hensgens et al., 1983). These introns are named oxi3 I5alpha (I5a) and oxi3 I5beta (I5b). Henceforth the oxi3 introns will be called aI1 through aI5g, respectively.
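The naming convention just described can be summarized as a small lookup. The following is only an illustrative sketch: the function name and data layout are hypothetical and are not part of any published nomenclature, and the intron lists simply restate what is given in the text for strains 161 and D273-10B.

```python
# Sketch of the intron naming convention described in the text.
# The one-letter prefixes ("b" for cob, "a" for oxi3) come from the text;
# the function name and data layout are hypothetical.
GENE_PREFIX = {"cob": "b", "oxi3": "a"}

# Introns of oxi3 listed 5' to 3', by strain, as stated in the text.
OXI3_LABELS = {
    "D273-10B": ["1", "2", "3", "4", "5g"],
    "161": ["1", "2", "3", "4", "5a", "5b", "5g"],
}

# Introns of cob in the reference strain 161, listed 5' to 3'.
COB_LABELS_161 = ["1", "2", "3", "4", "5"]


def short_name(gene, label):
    """Return the short intron name, e.g. ("cob", "1") gives "bI1"."""
    return f"{GENE_PREFIX[gene]}I{label}"


cob_names = [short_name("cob", label) for label in COB_LABELS_161]
oxi3_names = [short_name("oxi3", label) for label in OXI3_LABELS["161"]]
print(cob_names)
print(oxi3_names)
```

Keeping the strain-specific lists separate from the naming rule reflects the point made above: the names are fixed by the reference strain, while the set of introns actually present varies from strain to strain.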

It should be noted that the 161 and D273 mitochondrial genomes are functional equivalents. Both genomes express a wild type phenotype in a variety of nuclear backgrounds including the nuclear background in which the other was first found (Dhawale et al., 1981).

It is possible that the presence of a particular intron may convey a small phenotypic difference when compared to an otherwise isogenic strain. Nonetheless, there are no phenotypic differences which can be clearly ascribed to the presence or absence of introns when comparing these two strains. Therefore, many of the introns found in the 161 genome are optional for wild type function.

The question as to whether any of the introns in the cob gene are required for wild type function has been formally addressed (Labouesse and Slonimski, 1983, Gargouri et al., 1983). This group made a series of mitochondrial genome constructions which lack one or more of the five introns found in some wild type cob genes. All the exon sequences in these laboratory constructions were wild type for respiratory functions. Some of these intron deleted strains were isolated as revertants of mitochondrial mit- mutants which were defective in the splicing of a particular intron. These mutants reverted by cleanly deleting the defective intron. They found that only intron 4 was required for wild type respiration. Furthermore, intron 4 deletions made wild type amounts of cytochrome b apoprotein but failed to make CoxI. This is the cob-box effect which will be discussed in detail in a later section. In fact, a construction which contained no introns at all in its cob gene was wild type for the production of cytochrome b. It was further shown that a dominant nuclear suppressor, nam-2, can suppress the need for bI4 in CoxI production. The intronless cob construction in a nam-2 nuclear background had respiratory function. Therefore, all the introns in cob are functionally optional given the correct nuclear background.

A similar phenomenon has been observed by Hill et al. (1985). This group studied a variety of nuclear pet- mutants of the CBP-2 gene (Dieckmann et al., 1984, McGraw and Tzagoloff, 1983). This nuclear gene appears to have no function other than its requirement for the excision of cob I5 from the cob pre-mRNA. One of the revertants of a nuclear mutant of this gene was found to carry a clean excision (deletion) of cob I5 from the cob gene. This cob I5 deletion had completely wild type respiratory functions as judged by cytochrome b production and growth rates on a nonfermentable carbon source. That strain was also respiratory competent in a nuclear background in which the CBP-2 gene had been genetically disrupted.

It is puzzling that S. cerevisiae maintains a large number of introns in its mitochondrial genome and a collection of nuclear genes whose only apparent function is to splice these introns. Mutations in either the mitochondrial introns or in the nuclear genes required for their splicing can result in the collapse of wild type respiratory functions. It is as if S. cerevisiae maintains a complex and mutation prone system for splicing mitochondrial introns which do not convey any apparent selective advantage to their host genomes.

Furthermore, since it has been observed that mutations in mitochondrial intron splicing can revert by cleanly excising the affected introns, a mechanism exists by which one would expect that eventually all yeast mitochondrial introns would be lost. Such an intronless yeast mitochondrial genome has never been observed in nature. In fact, as will be discussed in the next section, introns occur in all fungal mitochondrial genomes which have been examined.

There are two obvious explanations of the maintenance of introns in fungal mitochondrial genomes. The first is that the introns may confer some selective advantage to their host genomes which has simply escaped detection.

For example, the selective advantage afforded by the intron could be exerted under conditions not normally encountered in the laboratory. Alternatively, the selective advantage could be slight and simply not detected over the time frame of laboratory experiments.

The second explanation is that these introns may be capable of duplicating themselves and spreading to new locations within their host genomes or even to the mitochondrial genomes of other species. A balance between duplication and deletion could result in intron maintenance. It is also possible that a combination of these two explanations is true and results in the maintenance of fungal mitochondrial introns. The discussion of these two possibilities will be a recurring theme in this dissertation.

I.B.5. Other fungal mitochondrial genomes contain introns

In all species of fungi so far examined, introns occur in some of the structural genes of their respective mitochondrial genomes. In many cases, these introns are also optional, in that the occurrence of many of these introns is strain dependent. An excellent example occurs in the first of the other fungal mitochondrial genomes to be discussed, that of the ascomycete Neurospora crassa. The mitochondrial genome of N. crassa encodes rRNAs, tRNAs, the cytochrome b apoprotein (cob), CoxI, CoxII, CoxIII, several subunits of the NADH reductase complex (Ise et al., 1985), and subunits 6 and 8 of the ATPase. This information content is similar to that of the mammalian mitochondrial genome. In addition, as mentioned earlier, the N. crassa mitochondrial genome contains a sequence which is homologous to the gene encoding the ATPase subunit 9. However, this is probably a pseudogene (Schmidt et al., 1983). There is no var1 homolog, but there is one ribosomal protein subunit (S5) which is mitochondrially encoded (Lambowitz et al., 1976, and Burke and RajBhandary, 1982).

Four of the structural genes of the N. crassa mitochondrial genome are known to contain introns. There is one intron in the L-rRNA gene (Burke and RajBhandary, 1982), two introns in cob (Citterich et al., 1983, Burke et al., 1984), and one intron in the URF1 homolog (Burger et al., 1985). There are up to four introns in the gene for CoxI (oxi3) (Collins et al., 1983), all of which are optional. In fact, there are naturally occurring strains of N. crassa in which there are no introns in the oxi3 gene at all.

The second mitochondrial genome of a fungus within the Ascomycetes which has been examined is that of Aspergillus nidulans. This genome has been almost entirely sequenced (Scazzocchio et al., 1983). This 33 kb genome is relatively densely packed with open reading frames (ORFs), a 23S and a 16S rRNA, and tRNAs. It contains all the structural genes found in the mitochondrial genome of S. cerevisiae with two exceptions. The first is that it also lacks a homolog to the var1 gene. The second exception is that the sequence homologous to the ATPase subunit 9 (the proteolipid) gene also appears to be a pseudogene (Turner et al., 1979). At the least, the nuclear homolog is required for wild type function of the ATPase complex. Since both A. nidulans and N. crassa are obligate aerobes, genetic analysis of this problem has been difficult.

In addition to the genes it shares with the S. cerevisiae mitochondrial genome, the A. nidulans mitochondrial genome contains ORFs which share 26% to 39% amino acid homology with the URFs numbered 1, 3, 4, and 5 of the human mitochondrial genome. N. crassa mitochondria also encode a URF which has 80% amino acid homology to the A. nidulans ORF which is homologous to the human URF1. Recently, these URFs have been assigned a function in the human mitochondrial system (Chomyn et al., 1985). They encode subunits of the mitochondrial NADH reductase. Finally, the A. nidulans mitochondrial genome encodes six URFs which are unique to this fungus.

Three of the genes in the A. nidulans mitochondrial genome contain introns. They are the same genes which contain introns in S. cerevisiae. The L-rRNA gene contains one intron which happens to be homologous to the L-rRNA intron in N. crassa (Jacquier and Dujon, 1983). In addition, the oxiA (homolog of oxi3) and cob genes contain three introns and one intron, respectively (Waring et al., 1984).

The next fungal mitochondrial genome to be discussed is that of the fission yeast, Schizosaccharomyces pombe. It is another good example of strain dependent variation in the introns found within mitochondrial genes of the same species. Some strains of this yeast contain one intron in the cob gene and two introns in oxi3 (CoxI) (Lang et al., 1983). On the other hand, a different strain has been reported to lack the intron in cob (Trinkl et al., 1985). S. pombe is also interesting in that it possesses a very densely packed mitochondrial genome of only about 19 kb. This is not much larger than the mammalian mitochondrial genomes. Furthermore, most of the genes are separated by tRNAs (Lang et al., 1983).

A surprising observation relevant to this discussion was made by Lang (1984) while he was sequencing the 3' end of the CoxI gene of S. pombe. It was found that the second intron in the CoxI gene of S. pombe is highly homologous to the third intron in this gene in A. nidulans. There is a large ORF in both of these introns. The amino acid homology is 70% between these two reading frames. By contrast, the exons share only 58% amino acid homology. Furthermore, these introns are inserted in exactly the same place in the CoxI gene.

Besides the similarities between the CoxI intron 3 of A. nidulans and the CoxI intron 2 of S. pombe, there is another A. nidulans intron which has striking homology to another fungal mitochondrial intron. These two introns are the cob intron of A. nidulans and the cob I3 of the long strains of S. cerevisiae (Waring et al., 1981, and Waring et al., 1982). The exon sequences for the cytochrome b genes of these two species share only 61% amino acid homology. Surprisingly, the introns cross hybridize to each other on Southern blots. The sequence of the A. nidulans cob intron has been determined (Waring et al., 1982). The sequence of the S. cerevisiae cob I3 has not been published; however, it is known that the A. nidulans and S. cerevisiae introns interrupt exactly the same place in the exon sequences. Fortunately, M.A. Haldi has sequenced the S. cerevisiae intron (1985) and a comparison of the two introns will be included as part of this dissertation. These two introns are of different lengths and therefore cannot be colinear. Nonetheless, they do contain regions of more than one hundred nucleotides each which are clearly more homologous than the surrounding exons.

A. nidulans, S. pombe, and S. cerevisiae are not closely related to each other. The question of how these fungi can share introns in their mitochondrial genomes which are more homologous than their surrounding exons may have a very interesting answer, which could be related to how mitochondrial introns can be duplicated and moved into new sequences.

Another interesting observation can be made about the location of the various introns in the L-rRNA gene. As it happens, the introns in the mitochondrial L-rRNA genes of S. cerevisiae and N. crassa and the second intron (991 nucleotides) of the nuclear encoded cytoplasmic L-rRNA gene of the cellular slime mold Physarum sp. are in exactly the same place (Burke and RajBhandary, 1982). (As will be discussed in the next paragraph, the species within the genus Kluyveromyces also possess an intron in their mitochondrial L-rRNA gene in the same location which is highly homologous to the omega intron in S. cerevisiae.) The sequence around this location, near the 3' end of this gene, is highly conserved between species. I have examined the sequence of the mitochondrial L-rRNA gene of A. nidulans as reported by Netzker et al. (1982) and have found that its intron is also in the same location. The sequence for both the A. nidulans and N. crassa splice junctions is 5'TACGCTAGGGAT...intron...AACAGGCTATTT3'. Surprisingly, there are only short stretches of homology between the N. crassa-A. nidulans intron and the S. cerevisiae (Kluyveromyces) or Physarum sp. introns. The yeast and slime mold introns are also not homologous. The short stretches of homology which these introns do share will be discussed later. Basically, except for the matches between the A. nidulans and N. crassa L-rRNA introns and between the S. cerevisiae and Kluyveromyces L-rRNA introns, these introns are different and are not simple variations on each other.

Introns which occur in L-rRNA genes are interesting for a number of reasons which will become obvious by the end of this introduction. For these reasons Jacquier and Dujon (1983) surveyed sixty five species or strains of yeast for the presence of sequences homologous to omega by probing Southern blots with omega specific and exon specific recombinant clones. They found that most species and strains of yeast do not contain sequences homologous to omega. However, omega specific sequences were detected in some yeasts. Omega is an optional intron in S. cerevisiae, S. carlsbergensis, and S. bisporus. That is, it is present in some but not all strains of these species. On the other hand, omega specific sequences were found in all sixteen species tested within the genus Kluyveromyces.

These authors then cloned and sequenced the omega-like intron and portions of the surrounding exons from K. thermotolerans. They found that, except for two small (less than 40 nucleotides) insertions, the S. cerevisiae and K. thermotolerans introns are colinear over their entire length. Furthermore, these sequences are 79.7% homologous at the nucleic acid level. There also happens to be an ORF in both of these introns. The two ORFs are completely colinear and 70.9% homologous at the amino acid level. There is some evidence that the ORFs are translated into proteins on which there is selection pressure, in that there are regions in which the ORFs use completely different codons to encode exactly the same amino acids.
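The argument in the preceding paragraph rests on distinguishing codon differences which leave the amino acid unchanged from those which replace it. A minimal sketch of that bookkeeping follows. The two short sequences are invented for illustration and are not taken from the omega ORFs, and the codon table is a deliberately small, hand-written subset (serine, leucine, lysine, and glycine codons), not a complete genetic code.

```python
# Classify aligned codon pairs from two colinear ORFs as identical,
# synonymous (different codon, same amino acid), or replacement.
# CODON_TABLE is a small illustrative subset, not a full genetic code.
CODON_TABLE = {
    "TCT": "S", "TCA": "S", "AGT": "S", "AGC": "S",
    "TTA": "L", "TTG": "L",
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGA": "G",
}


def split_codons(seq):
    """Split an in-frame nucleotide sequence into a list of codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def classify_codon_pairs(seq_a, seq_b, table=CODON_TABLE):
    """Count identical, synonymous, and replacement codon pairs."""
    codons_a, codons_b = split_codons(seq_a), split_codons(seq_b)
    if len(codons_a) != len(codons_b):
        raise ValueError("sequences must be colinear (same length)")
    counts = {"identical": 0, "synonymous": 0, "replacement": 0}
    for a, b in zip(codons_a, codons_b):
        if a == b:
            counts["identical"] += 1
        elif table[a] == table[b]:
            counts["synonymous"] += 1
        else:
            counts["replacement"] += 1
    return counts


# Hypothetical sequences: both encode Ser-Leu-Lys-Gly.
orf_a = "TCTTTAAAAGGT"
orf_b = "AGCTTGAAAGGA"
counts = classify_codon_pairs(orf_a, orf_b)
print(counts)
```

A region in which synonymous pairs greatly outnumber replacement pairs is the pattern expected when the protein sequence, rather than the nucleotide sequence, is under selection.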

Another fungal mitochondrial genome which has been examined is that of Podospora anserina. It is also the last of the Ascomycetes to be discussed in this introduction. The mitochondrial genomes of several strains of P. anserina have been restriction mapped (Kuck et al., 1985b). The restriction maps and the sizes of these genomes are polymorphic. The sizes of the various genomes vary from 84 to 101 kb. The mitochondrial genome of the most widely studied P. anserina strain (race s) is 94 kb. Many of the restriction site polymorphisms and much of the length variation can be explained in terms of optional inserts which vary in length between 0.9 and 2.75 kb. At least three (and perhaps all) of these optional inserts are optional introns.

Several of the genes encoded on the P. anserina mitochondrial genome have been localized and studied in more detail. These are the CoxI, cytochrome b, ATPase subunit 8, URF1 (NADH reductase), L-rRNA, and Sm-rRNA genes, several tRNAs, and several free standing ORFs with unassigned functions (Osiewacz and Esser, 1984, Wright and Cummings, 1984, Cummings et al., 1985, and Kuck et al., 1985). The URF1 gene, which is continuous in N. crassa, A. nidulans, and mammalian mitochondria, is interrupted by three introns. The L-rRNA gene contains either one or two introns. The 5' most intron is optional. The 3' intron is approximately the right length and is in approximately the right place (as determined by restriction mapping only) to be related to the L-rRNA intron of N. crassa and A. nidulans. Further studies will be necessary to confirm this possibility. The cytochrome b apoprotein gene contains at least two introns (probably more), one of which is optional. The CoxI gene is huge (over 25 kb) and contains at least ten introns (Kuck et al., 1985a). One of these introns (not the first) is optional in wild type strains. The other genes which have been examined in the P. anserina mitochondrial genome are continuous.

The reason so many investigators have been interested in the P. anserina mitochondrial genome is that this species of fungus undergoes a process of senescence and death after prolonged passage in vegetative growth (Smith and Rubenstein, 1973). This senescence is characterized by the loss of the normal mitochondrial genome and its replacement by up to five circular mitochondrial plasmids (Stahl et al., 1978, and Cummings et al., 1979). These mitochondrial plasmids have been shown to be derived from the wild type mitochondrial genome. Both nuclear and mitochondrial mutants which block the senescence phenomenon and allow continuous passage have been isolated (Vierny et al., 1982). The mitochondrial plasmids do not form in these mutants. Significantly, all of the mitochondrial mutants which do not undergo senescence lack the sequence from which one of the mitochondrial plasmids is derived (Kuck et al., 1985b). This plasmid is called the "alpha" plasmid. The alpha plasmid has recently been sequenced (Cummings et al., 1985). It has been shown that the alpha plasmid is derived by the clean excision of oxi3 (CoxI) intron I. The entire intron, but no exon sequence, is present in this plasmid.

Besides the fact that the oxi3 intron I seems to be critical for the senescence phenomenon in P. anserina, there are two reasons why this intron is interesting. First, during senescence, this intron has been observed to translocate and integrate into the P. anserina nuclear genome (Wright and Cummings, 1983). Second, the P. anserina oxi3 intron I contains short regions of homology with the first and second introns of the S. cerevisiae oxi3 gene and the cob intron found in S. pombe (Michel and Lang, 1985). This homology will be discussed in detail in the Discussion section of this dissertation.

Another fungal mitochondrial genome which has been examined is that of the yeast Torulopsis glabrata (Clark-Walker et al., 1983a). This genome is rather small (19.0 kb) but the CoxI gene contains two introns. Clark-Walker et al. (1983b) have also examined the mitochondrial genome of Saccharomyces exiguus. They used restriction mapping and Southern hybridization with mitochondrial specific probes from S. cerevisiae to show that the 23.7 kb S. exiguus mitochondrial genome has an informational content similar to that of the S. cerevisiae mitochondrial genome. Furthermore, the cob and CoxI genes contain one and two introns, respectively. One of the CoxI introns is of additional interest in that it shares homology with oxi3 I4 from S. cerevisiae.

The last fungal mitochondrial genome which will be discussed is that of the water mold, Achlya ambisexualis. An interesting feature of this mitochondrial genome is that within it lie two large inverted repeats which divide this circular mitochondrial genome into a large single copy region, two inverted repeat sequences, and a small single copy region. The size of the inverted repeat sequence is strain dependent and varies between 9.94 and 11.24 kb. The repeats encode the large and small rRNAs (Shumard et al., 1985). The inverted repeats allow intramolecular crossover events to occur. These events are such that the mitochondrial genome of this species occurs as a population of two kinds of DNA molecules which are identical except that the orientation of the two single copy regions relative to each other is reversed (Hudspeth et al., 1983). Intramolecular recombination events of this kind are not rare in nature. They have been shown to occur in the nuclear plasmid of S. cerevisiae (Broach and Hicks, 1980) and in many chloroplast genomes (Palmer, 1983).

Regions which hybridize to S. cerevisiae probes for the large and small rRNAs, CoxI, II, and III, ATPase subunits 6 and 9 (subunit 8 was not examined), and cob were identified. Interestingly, a series of optional, strain dependent insertions ranging in size from 310 to 860 base pairs (bp) was found in the regions encoding the L-rRNA, Cox2, and Cox3 (Shumard et al., 1985). These authors suggest the possibility that some of these inserts may be introns.

I.B.6. Translation products of fungal mitochondrial introns

There are two kinds of products derived from fungal mitochondrial intron sequences. The first is the RNA itself which is the primary product of transcription of any gene with introns. Intron derived RNAs from fungal mitochondria are extremely interesting and will be discussed in several of the following sections. The other possible product of intron derived sequences is protein. Many, but not all, fungal mitochondrial introns contain ORFs. The genetic and biochemical evidence that at least some of these ORFs are translated into biologically active proteins will be discussed in this section.

The first fungal mitochondrial intron encoded product to be discussed is the product of the intron in the N. crassa mitochondrial L-rRNA gene. This protein is called S5 and is a protein subunit of the mitochondrial small ribosomal subunit. This protein was shown to be translated by the mitochondrial ribosomes (Lambowitz et al., 1976). Later, this protein was isolated and its amino acid composition (LaPolla and Lambowitz, 1981) and its isoelectric point (Lambowitz et al., 1979) were determined. Circumstantial evidence that S5 is encoded by the L-rRNA intron was provided when it was shown that conditional nuclear mutants which block splicing of the mitochondrial L-rRNA precursor particularly fail to synthesize S5 (Collins et al., 1979). Furthermore, it was known that the spliced intron is maintained in the mitochondria as a stable transcript (Green et al., 1981). Finally, this intron was sequenced by Burke and RajBhandary (1982). The intron is 2295 bp long and contains a free standing ORF of 1278 nucleotides. This ORF begins with an AUG initiation codon and, if translated, would encode a protein with approximately the same size, amino acid composition, and basicity as S5. It is interesting to note that this intron contains no long stretches of homology with either the var1 gene of S. cerevisiae, which also encodes a protein subunit of the small ribosomal subunit, or the omega intron of S. cerevisiae (and Kluyveromyces sp.), which happens to be in the same place as this L-rRNA gene intron. The importance of the small stretches of homology which occur between many organelle introns, including the L-rRNA introns, will be discussed in the following section.

The next biologically active protein to be discussed is encoded by the omega intron in the L-rRNA gene of some strains of yeast. Omega has no selectable phenotype in vegetatively growing cells; however, when an omega+ strain is mated to an omega- strain, omega undergoes what is the formal equivalent of a duplicative transposition event so that virtually all progeny of such a cross acquire this intron. The intron is inserted in exactly the same place in both donor and recipient genomes (Dujon, 1980). Furthermore, there is flanking co-conversion associated with omega insertion. That is, the L-rRNA gene exon sequences are also transferred from the omega+ genome to the omega- genome. The amount of flanking co-conversion is not fixed and varies from one insertion event to another. The probability that a flanking exon marker will be co-converted is inversely related to its distance from the insertion site (Jacquier and Dujon, 1985). Co-conversion rarely exceeds one thousand nucleotides on either side of the intron insertion site. It is this co-conversion and its associated polarity which originally allowed the polarity phenomenon to be detected and characterized (Bolotin et al., 1971).

The kinetics of this nonreciprocal exchange of genetic information have been studied by Zinn and Butow (1985). They found that the movement of omega into the omega- genome is very rapid and is completed within five to six hours after mating. Surprisingly, this exchange is initiated by a double strand break in the recipient genome at the site of insertion.

There are several known alleles of the omega domain. The first two of these are the wild type omega+ (competent donor) and wild type omega- (competent recipient) genotypes. Additionally, there are mutants which block the normal genetic exchange. The first of these are the omega-n (omega neutral) alleles (Dujon, 1980, and Dujon and Jacquier, 1983). All of the omega-n mutations are single base pair substitutions within three nucleotides upstream or downstream of the site of intron insertion. This "target sequence" is 5'TAGGGAT...omega...AACAGGGT3'.
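As an aside, the omega target sequence can be compared with the N. crassa and A. nidulans L-rRNA splice junction sequence quoted earlier in this section. The sketch below measures how far the two pairs of flanking sequences agree when read outward from the insertion site. It uses the sequences exactly as they are printed in this text, so any transcription error in either printed sequence would carry over into the result; the helper function names are hypothetical.

```python
# Compare exon sequences flanking two L-rRNA intron insertion sites,
# reading outward from the site of insertion on each side.
# Sequences are copied as printed in the text (an assumption of accuracy).

def common_prefix(a, b):
    """Return the longest shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def common_suffix(a, b):
    """Return the longest shared suffix of two strings."""
    return common_prefix(a[::-1], b[::-1])[::-1]


# N. crassa / A. nidulans splice junction: upstream and downstream exon.
JUNCTION_UP, JUNCTION_DOWN = "TACGCTAGGGAT", "AACAGGCTATTT"
# S. cerevisiae omega target sequence: upstream and downstream exon.
OMEGA_UP, OMEGA_DOWN = "TAGGGAT", "AACAGGGT"

shared_up = common_suffix(JUNCTION_UP, OMEGA_UP)
shared_down = common_prefix(JUNCTION_DOWN, OMEGA_DOWN)
print(shared_up, shared_down)
```

With the sequences as printed, all seven upstream nucleotides of the omega target match the junction sequence, and the first six downstream nucleotides match before the two diverge.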

The second class of mutant alleles are the omega-d (omega deficient) alleles (Jacquier and Dujon, 1985, and Macreadie et al., 1985). These mutants contain the omega intron and are wild type for respiratory functions (i.e., they splice normally) but fail to increase the rate of transmission of the omega intron or flanking markers in omega- x omega-d crosses over what would be expected by normal recombination. Except for one allele, all of the omega-d mutations have been found to be either frameshifts or single base substitutions in the omega ORF.

Macreadie et al. (1985) characterized thirteen omega-d mutations. They found that there were phenotypic differences in the degree of diminished omega transmission between the various mutants. Significantly, in those mutants with the greatest reductions in omega transmission, the double strand break in the insertion sequence of the recipient (omega-) genome was not detected. The double strand break was detected in mutants with less diminished activity. Also important was the observation that the degree of diminished omega transmission was affected by the nuclear background. This indicates that there are nuclear genes whose products participate in omega transmission. Jacquier and Dujon (1985) have also shown that efficient omega transfer requires mitochondrial translation. In the face of this evidence, Macreadie et al. (1985) have given the omega ORF the status of a gene and have named this gene fit1 (for factor for intron transmission). Very late in the writing of this dissertation, Colleaux et al. (1986) showed that the product of the fit1 gene is, in fact, the endonuclease which makes the double strand cut described above.

There are several other examples of recombinational events which are mediated or initiated by endonucleases. These include the "Chi" recombination system from E. coli (Ponticelli et al., 1985), recombination in vertebrate B-cells to generate immunoglobulin diversity (Hope et al., 1986), FLP mediated recombination of the yeast 2 micron nuclear plasmid, and the HO mediated mating type switching of yeast (Kostriken et al., 1983, Weiffenbach et al., 1983, and Klar et al., 1984).

I.B.7. Other translation products of fungal mitochondrial introns: Maturases

In many strains of S. cerevisiae (e.g., 161 and 777-3A), the cob gene contains six exons and five introns spread over a continuous nine kb region of the mitochondrial genome. To date hundreds of mit- mutants have been localized to this gene. Early work (Kotylak and Slonimski, 1976, and Slonimski et al., 1978) showed that these mutations could be clustered into ten groups or "Boxes". Genetic analysis with petite genomes followed by restriction mapping of the various petite genomes permitted these boxes to be placed in an order which was colinear with the cob gene. This order is 5, 4, 3, 8, 10, 1, 9, 7, 2, 6 (Jacq et al., 1980, and Church et al., 1979). Analysis of zygote complementation, mRNA and protein phenotypes, and DNA sequencing of these mutants has allowed this group and others (Alexander et al., 1980, Jacq et al., 1982, Mahler et al., 1982, DeLaSalle et al., 1982, Dieckmann et al., 1982, Anziano et al., 1982, Lazowska et al., 1980, Rodel et al., 1983, Bechmann et al., 1981, Schweyen et al., 1982, and for review see Perlman and Mahler, 1983, and Grivell et al., 1983) to make the following startling observations:

1. All mutations in box 3, box 10, and box 7 also disrupt the production of CoxI (oxi3) by preventing proper pre-mRNA splicing.

2. All mutants in box 9, box 2, and box 6 prevent the production of a functional cob product but do not affect oxidase. Some mutants in the other clusters also block production of both cytochrome b and oxidase.

3. Many of these mutants, especially those in boxes 3, 10, 9, 7, and 2, accumulate allele specific proteins which are related to the mature cytochrome b apoprotein but, in many cases, are larger.

4. Many of these mutants accumulate RNAs which are larger than the mature cob mRNA and can be explained by a failure to splice specific introns. This effect is polar, so that failure to splice an upstream (5') intron also results in the failure to splice the downstream (3') introns except intron 5.

5. Introns 1 and 5 of cob splice in rho- strains but introns 2, 3, and 4 of cob do not.

Sequence analysis of wild type genomes has revealed that cob I2, cob I3, and cob I4 have long ORFs which are continuous and in-frame with their preceding exon but do not extend to the downstream exon. Cob I1 and cob I5 do not have ORFs. Basically, boxes 5, 4, 8, 1, and 6 are largely marked by exon mutations while boxes 3, 10, and 7 contain mutants in the ORFs of introns cob I2, I3, and I4, respectively. Box 9 and box 2 mutants also map to cob I4 but can be shown to act in cis. Therefore, box 9 and box 2 mutants cannot involve defects in an intron encoded protein.

In the face of this evidence, Lazowska et al. (1980) have proposed that the ORFs of some yeast mitochondrial introns encode proteins which are required for the proper splicing of the intron from which they are encoded. These proteins are termed "maturases". Translation of these proteins is initiated at the AUG initiation codon of cob exon 1. There is then read-through from the exons into the intronic ORFs. It was further hypothesized that other mutants, most notably those in box 9 and box 2, represent mutations in structural elements in the pre-mRNA itself which are required for splicing.

This hypothesis clearly explains the observed phenomenology relevant to the splicing of the cob pre-mRNA. First, the polarity effect can be explained. If an upstream intron is not spliced, then there can be no read-through from the initiation codon of cob into the downstream introns because the ORFs of these introns are not continuous with the downstream exons. Therefore, the downstream introns cannot be spliced either, even though their sequence is wild type. Second, allele-specific proteins which are larger than the cytochrome b apoprotein can be explained if one imagines a series of upstream exons fused to the product of an intronic ORF. Allele-specific proteins which are the correct size to be such a fusion have been detected for cob I2 (box 3) and cob I4 (DeLaSalle et al., 1982, and Anziano et al., 1982). Significantly, mutants whose sequence predicts a premature termination in the ORF of an intron produce allele-specific proteins which are smaller. Finally, the protein product of the cob I4 ORF has been directly detected in mutants using antibodies directed against a portion of this ORF expressed in a lacZ fusion (Jacq et al., 1984). The maturases themselves have never been directly observed in wild type cells. This can be attributed to the possibility that maturases exist in only trace amounts in wild type cells, since the predicted action of the maturases is to destroy their own mRNA. It is also possible that investigators have not yet used sufficiently sensitive techniques to detect them.
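The polarity argument can be made concrete with a small toy model. The sketch below is illustrative only and rests on stated assumptions: the intron labels, the set of maturase-dependent introns (cob I2, I3, and I4, following the ORF data above), and the function name `spliced_introns` are all hypothetical conveniences, and the rule "a maturase is made only if every upstream intron has been removed" is a simplification of the read-through hypothesis, not a description of the actual machinery.

```python
# Toy model of the polarity effect in cob pre-mRNA splicing (illustrative assumptions only).
INTRONS = ["I1", "I2", "I3", "I4", "I5"]
# Assumption: introns with long in-frame ORFs need their own maturase.
MATURASE_DEPENDENT = {"I2", "I3", "I4"}


def spliced_introns(defective, mito_translation=True):
    """Return the introns predicted to splice.

    defective: set of introns carrying a splicing-defective mutation.
    mito_translation: False mimics a rho- petite (no mitochondrial protein synthesis).
    """
    spliced = []
    upstream_clear = True  # True while every upstream intron has been removed
    for intron in INTRONS:
        if intron in defective:
            ok = False
        elif intron in MATURASE_DEPENDENT:
            # Read-through from exon 1 reaches this ORF only if upstream introns are gone.
            ok = upstream_clear and mito_translation
        else:
            ok = True  # no maturase required in this model
        if ok:
            spliced.append(intron)
        else:
            upstream_clear = False
    return spliced


print(spliced_introns({"I2"}))                       # a box 3-like block in I2
print(spliced_introns(set(), mito_translation=False))  # rho- petite
```

Under these assumptions a block in I2 leaves only I1 and I5 spliced, and a rho- background gives the same pattern, which is the behaviour summarized in observations 4 and 5 above.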

The mechanism by which splicing defective mutants in cob can block splicing in oxi3 can also be explained. Bonitz et al. (1980) have sequenced the short version of oxi3 found in S. cerevisiae. This strain lacks aI5a and aI5b. They found that all the introns in that strain except the last one (aI5g) contain long ORFs which are in frame and continuous with their preceding exon. Interestingly, the fourth intron of oxi3 is 70% homologous with the fourth intron in cob. There are also regions within these introns where the homology is much higher.

Groups working both in France (Dujardin et al., 1983, and Labouesse et al., 1985) and at Ohio State University (Anziano, 1984, and Anziano et al., 1986) have studied this phenomenon in detail. Both groups found that mutants which block the production of a functional cob I4 encoded maturase also block the splicing of oxi3 I4. Both groups also have found a dominant nuclear suppressor which suppresses mutants in the cob I4 maturase. Additionally, both groups have compelling genetic evidence that the mode of action of this nuclear suppressor is to allow the oxi3 I4 ORF to encode a maturase which can then participate in the correct splicing of both cob I4 and oxi3 I4. Finally, there are mutants in oxi3 I4 which block the activity of the nuclear suppressor. Therefore, in wild type strains of S. cerevisiae, the cob I4 maturase is required for the splicing of cob I4 and oxi3 I4 while the oxi3 I4 ORF is a pseudogene for this function. But, in some nuclear backgrounds, oxi3 I4 can encode a functional maturase, and if the cob I4 maturase is defective, then the cob I4 ORF becomes a functional pseudogene.

Compared to the cob gene, the splicing of oxi3 transcripts has not been as intensely studied. Long versions of oxi3 contain eight exons and seven introns (Bonitz et al., 1980, and Hensgens et al., 1983a). The first six introns contain long ORFs. The question arises as to whether these introns encode maturases also. This question is complicated by the fact that in wild type cells, oxi3 exon probes hybridize on Northern blots of mitochondrial RNA to thirty one oxi3 related transcripts (Hensgens et al., 1983b). All of these except the mature 2.1 kb mRNA are larger splicing intermediates. Strains with fewer introns have simpler patterns which are still complex. Another interesting feature is that aI1, aI2, and aI5g (and cob I1 as well) are excised from primary transcripts and maintained as stable, apparently circular, single stranded RNA molecules (Arnberg et al., 1980, and Halbreich et al., 1980). These are the only introns in the S. cerevisiae mitochondrial genome which are excised as stable circles.

The group headed by Leslie Grivell (Grivell et al., 1982, Hensgens et al., 1983b, and Hensgens et al., 1984) has used a series of splicing defective mutants to look for evidence of maturases in oxi3. Unfortunately, there were genetic problems with all their mutants. Most turned out to be either deletions of more than 5.0 kb, double mutants which affected more than one intron, or leaky mutants. These problems make interpretation of their observations difficult. They also looked at splicing and transcripts in rho- petite mutants.

There are, however, several conclusions which can be drawn from that work. First, aI5g splices in petites and, therefore, cannot require a mitochondrially encoded maturase. Second, aI1, aI2, aI4, and possibly aI5a do not splice in petites. This demonstrates that these introns require a mitochondrially encoded product for successful splicing but does not localize the source of the product to the intron itself. Another interesting finding is that petite mutants do not accumulate large oxi3 transcripts which are simply exon sequences plus unspliced introns. Instead, there is at least one specific cut in the unspliced pre-mRNA which results in the separation of the unspliced aI1 and aI2 sequences from the sequences attached to aI5a. Unfortunately, the exact site of this cut was never precisely pinned down. It probably occurs either at the exon4/aI4 boundary or the exon5/aI5a boundary. Grivell's group speculated that this is a result of an aborted splicing reaction in which there is a cut but no splice. This group also has some evidence that aI3 and aI4 do splice in mit- genomes which block the splicing of aI1 and aI2.

Another group, headed by Piotr Slonimski (Carignani et al., 1983), has published unambiguous data showing that the aI1 ORF encodes a maturase. They mapped ten mutants in aI1. Several of these mutants were cis-acting and mapped to either the 5' or 3' ends of the intron. Three of the mutants were trans-recessive and were mapped to the middle of the ORF. Mutants in this intron produce allele specific proteins which are derived from the proteolytic cleavage of a fusion protein made by read-through from the first exon into the intronic ORF. Further, trans-mutants which are more 5' make shorter allele specific proteins than do the more 3' mutants. This would be expected if the mutants contain premature stop codons.

Carignani et al. (1983) also did a series of Northern blots in which they probed RNA from mutant and wild type cells with either an aI1 specific or an exon4-aI4 specific probe. They found that in mutant cells the most abundant signal for both probes was to a 6.0 kb RNA. That is large enough to be an RNA which contains exons plus aI1 plus one other intron but not aI2. I have examined the data from both groups and would like to propose the following hypothesis. In tight mitochondrially encoded aI1 mutants, aI1 and aI2 fail to splice. Oxi3 I3 cannot be processed fully. Oxi3 I4 also splices because its requirement for a maturase is fulfilled by cob I4. Intron aI5a does not splice but does cut at its 5' exon/intron boundary. Intron aI5b does not splice. Intron aI5g does splice and does not require a maturase. This hypothesis predicts that aI2, aI5a, and aI5b (and possibly aI3), in addition to aI1, encode products required for efficient splicing of their respective introns. Of course this is only a hypothesis, but it is the only one which is consistent with the published sizes of RNAs from mutant strains. As an alternative, it may be that the reported lengths of these RNAs are in error, at which point this hypothesis is no better than any other.

Relevant to this discussion is the work of Simon and Faye (1984), who examined a nuclear encoded pet- mutant which blocks splicing of oxi3 transcripts but does not affect either the L-rRNA or cob. They found that strains with this nuclear mutant fail to splice aI1 and aI2. This mutant does splice aI3 normally but is very inefficient at splicing aI4 and aI5g. (aI5a and aI5b were not present in this strain.) This indicates that aI4 and cob I4 may have slightly different requirements for nuclear encoded products involved in splicing. This also indicates that nuclear encoded proteins are involved in splicing aI5g. As a point of interest, there are also pet- mutants which specifically block the splicing of cob but not the pre-L-rRNA or oxi3 (Dieckmann et al., 1982). Mutants which specifically block the splicing of the pre-L-rRNA have never been observed, but mutants of this nature are likely to be lethal to the mitochondrial genome (Citterich et al., 1983).

I.B.8. Cis-acting sequences required for splicing of introns related to cob intron 4

As described previously (Jacq et al., 1980), there are two clusters of mutations, box 9 and box 2, in the fourth intron of cob which act in cis to block the splicing of that intron in S. cerevisiae. Most of these mutations do not block the splicing of aI4 and, therefore, do not involve significant alterations in the cob I4 maturase. Furthermore, one mutation in box 1 (cob exon 4) also acts in cis to prevent splicing of cob I4. This box 1 mutation is a G-A transition in the second to last nucleotide of cob exon 4 (De La Salle et al., 1982).

Several groups have studied the mit- splicing defective mutations which map to box 9 and box 2 and have examined second site suppressors which also map to cob I4 (Anziano et al., 1982, Weiss-Brummer et al., 1982, Weiss-Brummer et al., 1983, and Holl et al., 1985). All box 9 mutations are missense or one or two base pair deletions/frameshifts in an eighteen nucleotide sequence composed of two nine nucleotide literal palindromes. Box 9 is located between bases 330 and 347 of the intron. The sequence of box 9 in wild type strains is 5'TCAGAGACT-ACACGCACA3'. All box 2 mutations map to a twelve base pair sequence which occurs 33 to 21 nucleotides from the 3' splice site. The wild type sequence for this domain is 5'AAGATATAGTCC3'.

Surprisingly, there is a five base sequence in box 9, 5'GACTA3', which can base pair with a five base sequence in box 2, 3'CTGAT5'. As unlikely as this long range interaction may seem at first, the importance of this short stem has been confirmed by the identification of many mutants and by genetic analysis of second site suppressors of splicing defective mutants which occur in these sequences. It has been shown that splicing defective mutants in both box 9 and box 2 which disrupt this stem can be suppressed by mutations in the complementary sequence which restore this stem.
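The proposed box 9/box 2 stem can be checked directly from the two wild type sequences quoted above. The following is a minimal sketch, assuming strict Watson-Crick pairing only (no G-U wobble) and using the DNA alphabet as printed in the text; the function names are illustrative.

```python
# Wild type sequences as quoted in the text (DNA alphabet, written 5' to 3').
BOX9 = "TCAGAGACTACACGCACA"
BOX2 = "AAGATATAGTCC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the antiparallel Watson-Crick partner of seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pairing_partners(a, b, length):
    """List (segment of a, partner in b) pairs of the given length that can form a perfect stem."""
    hits = []
    for i in range(len(a) - length + 1):
        segment = a[i:i + length]
        partner = reverse_complement(segment)
        if partner in b:
            hits.append((segment, partner))
    return hits


print(pairing_partners(BOX9, BOX2, 5))  # → [('GACTA', 'TAGTC')]
```

The single hit is the five base stem described above (5'TAGTC3' is the same strand as 3'CTGAT5'). The same helper illustrates the suppressor logic: a hypothetical C to T change giving GATTA in box 9 no longer pairs with TAGTC, while a compensating change to TAATC in box 2 would restore the stem.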

There is another interesting cluster of mutants in the 3' half of box 9 (box 9R). These involve the sequence 5'GCACA3'. It turns out that there is a complementary five base sequence (called box 9R'), 3'CGTGT5', located 46 bases from the 5' end of this intron. This stem is also predicted to play an important role in the splicing of this intron. The validity of this stem has also been confirmed by finding suppressor mutants in the sequence near the 5' splice site which complement the primary mutants in box 9 and restore base pairing. Sequence analysis of the other S. cerevisiae mitochondrial introns (Bonitz et al., 1980, Nobrega and Tzagoloff, 1980, Dujon, 1980, Hensgens et al., 1982, Holl et al., 1985) reveals that all of the mitochondrial introns except cob I1, cob I2, oxi3 I1, oxi3 I2, and oxi3 I5g contain sequences which are homologous to the 5' part of box 9 (5'TCAGAGACTA3') and box 2 (see Figure 2). (For convenience, sequences homologous to the 5' half of box 9 and box 2 are referred to as box 9L or box 2 by some authors. For most of this dissertation, the nomenclature of Davies et al. (1982) will be used. These authors call box 9L and box 2 R and S, respectively.) Cis-acting mutations in the box 9L of oxi3 I4 (Netter et al., 1982) and in cob I5 (Bonjardim and Nobrega, 1984) confirm the importance of these sequences.

Davies et al. (1982) and Waring and Davies (1984) have examined the sequences of a number of introns which contain box 9L and box 2 homologs. In addition to the two base pair interactions described in the preceding paragraph, they find evidence for two additional conserved base pair interactions. The first proposed interaction is between two sequences which occur between box 9R' and box 9L. They named these two sequences "P" and "Q". They also named box 9R' and box 9R E and E' respectively. P is located two or three nucleotides from E. P has a consensus sequence of 5'ATGCTGGAAA3'. Q is located a varying distance from P and is between P and box 9L (R). The consensus sequence for Q is 5'AATCAGCAGG3'. Significantly, the sequence in P, 5'TGCTG3', could base pair with the sequence in Q, 3'ACGAC5'. No in vivo cis-acting mutants which block splicing have been found in either P or Q.

The other interaction predicted by Waring and Davies involves a sequence which they call the "internal guide sequence" (IG). This sequence always contains a "G" nucleotide which is flanked on the 5' side with several bases which can pair with the first few bases of the 3' exon and on the 3' side with bases which can pair with the last few bases of the 5' exon. It is hypothesised that these short stems actually form and serve to bring the exons into alignment for splicing. The IG sequence always occurs 5' to the sequence which base pairs with the 3' portion of box 9 (box 9 right) and usually occurs near the 5' splice site. Cis-acting mutants have been observed in vivo which would disrupt these proposed interactions (De La Salle et al., 1982, and Haldi, 1985). The importance of this interaction has also been confirmed by in vitro mutagenesis (Waring et al., 1986).

Waring and Davies (1982 and 1984) also show that while the sequence E' (box 9R) is not conserved between introns, there is a sequence in this domain which can base pair with a sequence just 5' to the P homolog (E). There are therefore seven conserved sequence elements in this group of introns. The order of these conserved sequence elements is 5' exon, IG, E, P, Q, R, E', S, 3' exon. They propose that P base pairs with Q, E base pairs with E', and R base pairs with S. (A diagrammatic representation of these proposed interactions is shown in Figure 2.) In so doing the exons are brought relatively close together. The IG then brings the exons into very close proximity for splicing to occur. Typically, for this group of introns, the last base of the 5' exon is a "T" and the last base of the intron is a "G" (Nomiyama et al., 1981).

I.B.9. Cis-acting mutants in other introns in the S. cerevisiae mitochondrial genome

The data reviewed in the last section indicate that the majority of yeast mitochondrial introns share structural features and short conserved sequences and therefore probably a common mechanism for splicing.

There are, however, five introns which do not have these structural features or conserved sequences. They are cob I1, cob I2 (box 3), oxi3 I1, oxi3 I2, and oxi3 I5g. It is significant that cob I1, oxi3 I1, oxi3 I2, and oxi3 I5g excise as stable circular RNAs (Halbreich et al., 1980, Grivell et al., 1980, and Simon and Faye, 1984). This common product of splicing indicates that these introns may have a common mechanism of splicing. Cob I2 is probably a special case and will be discussed later.

Relative to the intense genetic analysis of cob I4, the introns which excise stable circular RNAs have not been studied in as much depth with in vivo mutants. Nonetheless, three cis-acting mutants in cob I1 have been characterized and sequenced (Schmelzer et al., 1982, and Rodel et al., 1983). This intron can be folded into a stem loop structure which is completely different from that of cob I4 (see following section). These mutants either disrupt the stability of base pair stems occurring near the splice boundaries or decrease the size of a small loop near the middle of the intron. Carignani et al. (1983) have genetically mapped seven cis-acting mutants in oxi3 I1 (aI1). These mutants were not sequenced but were clustered near the 5' or 3' splice boundaries. These findings are in contrast to what was observed for cob I4, in which many of the cis-acting mutants were clustered in box 9, which is not near a splice boundary.

I.B.10. Computer modelling RNA secondary structures of yeast mitochondrial introns

Michel et al. (1982) and Michel and Dujon (1983) were the first to use computer modelling to predict the secondary structure of yeast mitochondrial introns. They used two computer programs, one called EPURE (which identified dispersed and/or marginal homologies) and the other called HELCAT (which calculated the lowest free energy of proposed structures), along with analytical strategies developed by Noller et al. (1981), to predict the structures which they report.

Their surprising finding was that all yeast mitochondrial introns except cob I2 and I3 (which they did not analyse) fall into one of two groups (or classes) based on secondary structures. These groups are called group I (or type I or class I) and group II (or type II or class II). It turns out that the previously published sequence of the 3' end for cob I2 (box 3) is incorrect. The correct sequence for this intron has been determined by M. A. Haldi (1985 dissertation).

A significant feature of these proposed secondary structures is that in those introns with large ORFs, the majority of the ORF is located in a large unstructured loop which is separated from the core structure by a moderately large (i.e., at least twelve bp) stem structure. The ORF loops, however, do not necessarily occur at the ends of the same stem structures in different introns. Furthermore, the presence of an ORF is not correlated with either class I or class II. Both groups have member introns which have or lack ORFs.

The proposed secondary structure for oxi3 I4 is shown in Figure 4 (Michel and Dujon, 1983). Oxi3 I4 is a typical group I intron. Cob I4 is also a class I intron. Five stem structures found in all class I introns are shown in this figure labeled "a" through "e". Stem "a" corresponds to the proposed base pairing of the internal guide and the 5' exon. Stem "b" is the proposed "E" to "E'" base pairing. Stem "c" is the proposed "P" and "Q" base pairing. Stem "d", a previously unrecognized conserved stem structure, has since been confirmed by a leaky cis-acting mutant (Haldi, 1985 dissertation). Stem "e" does not correspond to any known mutants, but its large stem structure may make it resistant to disruption by single base changes. Significantly, the 5' portion of box 9 ("R") and box 2 ("S") (here called f and f' respectively) are not base paired with other structures and therefore are free to interact given tertiary folding of the secondary structure. Given the strength of the genetic analysis and the similarities in the computer generated secondary structures, it seems likely that these proposed secondary and tertiary interactions do occur. The L-rRNA intron (omega), cob I3, cob I4, cob I5, oxi3 I3 (aI3), oxi3 I4 (aI4), oxi3 I5a (aI5a), and oxi3 I5b (aI5b) are the group I introns which occur in laboratory strains of S. cerevisiae (reviewed by Waring and Davies, 1984).

The proposed secondary structure of aI1, a group II intron, is shown in Figure 5 (Michel et al., 1982). Even though this figure is drawn to bias one towards the following conclusion, it is clear that this secondary structure does bring the exons into relatively close proximity. Genetic analysis of in vivo mutants of this group of introns is sparse, but Rodel et al. (1983) have sequenced cis-acting mutants in cob I1 which would disrupt the 5'-most or 3'-most stem of this structure. Cob I1, oxi3 I1 (aI1), oxi3 I2 (aI2), and oxi3 I5g (aI5g) are the group II introns found in laboratory strains of S. cerevisiae.

I.B.11. Amino acid homologies between intronic ORF proteins of yeast

When Hensgens et al. (1983a) sequenced introns aI5a and aI5b, they discovered that these introns contain sequences homologous to box 9L (R) and box 2 (S) from cob I4. They also observed that these two introns contain large ORFs. The ORF of aI5b was interesting in that it is free standing, just as is the ORF of omega (i.e., the ORF is not continuous with either exon 6 or exon 7). The ORF of aI5b is the only free standing ORF in an intron in a protein encoding gene in the mitochondrial genome of S. cerevisiae.

Hensgens et al. (1983) also compared the amino acid sequences of the various ORFs of the group I introns. They found that there is a region of about 115 amino acids, located roughly in the middle of these ORFs, in which the amino acid sequences were varyingly homologous. Of course, cob I4 and aI4 were more closely homologous to each other than any other pair of introns. They also found that the cob I2 (box 3) ORF and a free standing URF located 3' to the gene for Cox II (oxi1) shared this homologous region. There also happens to be an S homolog (but no R) located in the free standing oxi1 region URF.

This region of homology is bounded on both sides by a nine amino acid sequence whose consensus sequence is (apolar)2, gly, (apolar)2, (asp/glu), (gly/ala), asp, gly. This sequence is usually represented using the single letter amino acid code and is called LAGLI-DADG. This name is derived as a composite of the two nine amino acid sequences found in cob I4. Between the LAGLI-DADG sequences, the ORFs show varying degrees of homology, but based on this homology they can be divided into two groups. The first group contains the free standing oxi1 region URF, cob I4, aI4, aI3, and aI5b. The second group, whose members share 21 to 29% amino acid homology among themselves, is composed of the omega intron, cob I2, and aI5a. A possible third group of class I intron ORFs has been observed by Burger and Werner (1985); however, no yeast introns have ORFs which belong to this group.
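The nine residue consensus quoted above translates naturally into a search pattern. The sketch below is illustrative: the text does not say which residues count as "apolar", so the set used here (A, V, L, I, M, F, W, P) is an assumption, and the test peptide is invented for the example rather than taken from any intron ORF.

```python
import re

# Assumption: residues treated as "apolar"; the source text does not define this set.
APOLAR = "AVLIMFWP"

# (apolar)2, gly, (apolar)2, (asp/glu), (gly/ala), asp, gly in single letter code.
MOTIF = re.compile(rf"[{APOLAR}]{{2}}G[{APOLAR}]{{2}}[DE][GA]DG")


def find_motifs(protein):
    """Return (start index, matched residues) for each consensus match in a protein string."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein)]


# Hypothetical peptide containing the name-giving sequence.
print(find_motifs("MKLAGLIDADGSF"))  # → [(2, 'LAGLIDADG')]
```

A search of this kind finds only exact fits to the consensus; the varying degrees of homology described above would call for a scoring scheme rather than a strict pattern.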

Hensgens et al. (1983a) argue that this is sufficient amino acid homology to indicate an evolutionary relatedness among these ORFs. This view is consistent with the data regarding the conservation of secondary structures within the intron group. There is, however, no evidence correlating this conserved domain with the biological activities, such as the omega fit 1 function or maturases, ascribed to these ORFs. In fact, most of the cob I4 trans-acting maturase mutants disrupt or alter the reading frame of this ORF in a region which is well 3' of the LAGLI-DADG domain.

Hensgens et al. also examined the amino acid sequence of the ORFs of the group II introns, oxi3 I1 and oxi3 I2. These introns are 50% homologous to each other at the DNA level; therefore, it was not surprising to find that they also share amino acid homology. There were, however, no stretches of homology between the group I intronic ORFs and the group II intronic ORFs. Therefore, it appears that the protein maturases of the group I introns are completely different from the protein maturases of the group II introns. The ORFs of these two group II introns do not share any amino acid homology with other yeast mitochondrial ORFs. There is, however, amino acid homology between these ORFs and two other ORFs found in other fungal mitochondria (Michel and Dujon, 1984) and some viral proteins (Michel and Lang, 1985). These homologies will be discussed in later sections.

I.B.12. Autocatalytic and catalytic activities of group I introns

Tetrahymena is a genus of ciliated protozoa. In T. pigmentosa and T. thermophila the nuclear encoded L-rRNA gene of the cytoplasmic ribosomes contains a 407 bp intron. This intron has been sequenced in both species (Wild and Sommer, 1980, Kan and Gall, 1982). These two introns are almost identical. Also of interest to this discussion is the fact that the nuclear gene encoding the 26S cytoplasmic rRNA of the acellular slime mold Physarum polycephalum also contains two introns, one of which is in the same position as and is homologous to the Tetrahymena intron (Nomiyama et al., 1981). What is particularly striking about the sequences of these nuclear encoded rRNA gene introns is that all four introns contain the conserved sequence elements and secondary structures which are typical of group I introns (Cech et al., 1983, Waring et al., 1983, Waring and Davies, 1984, and Michel and Dujon, 1983).

Early work in the laboratory of Tom Cech (Zaug and Cech, 1980, and Cech et al., 1981) showed that the 28S rRNA gene could be transcribed in vitro in isolated nuclei of T. thermophila. It was also shown that the intron in the rRNA gene was spliced out of these transcripts by isolated nuclei. Significantly, it was shown that the intron is excised as a linear molecule which has a guanosine nucleotide covalently attached via a normal 3' to 5' phosphodiester bond to its 5' end (Zaug and Cech, 1982). This guanosine nucleotide is not transcribed but is added during the excision process.

A dramatic breakthrough in this investigation occurred when Cech's laboratory showed that the intron encoded RNA itself could autocatalyse its own excision and the ligation of the exons (Kruger et al., 1982). This was done using a recombinant plasmid DNA as a template for a bacterial RNA polymerase. Manipulating the salt concentrations of the buffers allowed these workers to uncouple polymerization and in vitro splicing. In vitro splicing was shown to occur in the complete absence of proteins. The physical parameters which optimize the autocatalytic activity of this group I intron have been determined (Kruger et al., 1982, and Zaug et al., 1985). These conditions are 100 mM (NH4)2SO4, 10 mM MgCl2, 30 mM Tris-Cl (pH 7.5), and 100 microM GTP at 42 degrees C.

In addition to the autoexcision discussed above, the Tetrahymena intron autocatalyses the cyclization of itself coupled to the loss of the fifteen 5' most nucleotides including the non-encoded 5' guanosine nucleotide (Inoue et al., 1985, and Sullivan and Cech, 1985). This process is reversible and can result in the addition of new nucleotides to the 5' end of the intron encoded RNA. A summary of the autocatalytic activities of this RNA is as follows. First, the intron catalyses two transesterification reactions in which the hydrolysis of the phosphodiester bonds at the intron/exon boundaries is coupled to the ligation of the two exons and the formation of a 5' to 3' bond between the 3' hydroxyl group of the guanosine cofactor and the 5' end of the intron. This reaction, like all others catalysed by this intron, requires magnesium. Note that this series of reactions does not result in a net change in the number of phosphodiester bonds. Therefore, the internal energy of the system is conserved. This can help explain the fact that no outside source of energy, such as ATP or GTP, is required for autoexcision.
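The bond-counting argument can be verified with simple arithmetic. The sketch below assumes linear RNAs, in which a molecule of n nucleotides has n - 1 phosphodiester bonds, and treats the guanosine cofactor as a free nucleotide with no such bonds; the exon lengths in the example call are arbitrary illustrative values, and the function names are hypothetical.

```python
def bonds(length):
    """Phosphodiester bonds in a linear RNA of `length` nucleotides."""
    return max(length - 1, 0)


def self_splicing_bond_balance(exon5, intron, exon3):
    """Return (bonds before, bonds after) for autoexcision of a group I intron.

    Before: one precursor (5' exon + intron + 3' exon) plus a free guanosine cofactor.
    After:  ligated exons plus a linear intron carrying the added guanosine at its 5' end.
    """
    before = bonds(exon5 + intron + exon3) + bonds(1)
    after = bonds(exon5 + exon3) + bonds(intron + 1)
    return before, after


# Arbitrary exon lengths; 407 is the intron length quoted in the text.
print(self_splicing_bond_balance(100, 407, 100))  # → (606, 606)
```

Two bonds are broken at the splice boundaries and two are formed (exon to exon, guanosine to intron), so the totals match for any choice of lengths.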

A second coupled cleavage-ligation reaction occurs in which there is hydrolysis of the phosphodiester bond between the uridine nucleotide at position fourteen in the intron and the adenine nucleotide at position fifteen. (There is also a minor reaction product that involves this same transesterification reaction only with the uridine at position nineteen instead of the adenosine at position fifteen.) This hydrolysis is coupled to the ligation of the 3' guanine nucleotide at the end of the intron to the adenine nucleotide at position fifteen (or the uracil at position twenty). This results in the liberation of a linear 15-mer (the fourteen 5' most nucleotides of the intron plus the non-encoded guanosine) and a covalently closed circular single stranded intron derived RNA in which all bonds between nucleotides are standard 3' to 5' phosphodiester bonds. Note again that no net change in the number of bonds has occurred. The structure of this circular molecule has been confirmed by extensive biochemical analysis (Inoue and Cech, 1985).

Once in the circular form, the intron can catalyse one of two possible reactions. First, it can simply reopen to form a linear molecule which is colinear with the intron encoded sequence minus the fourteen (or nineteen) 5' most nucleotides. This is fundamentally a nuclease activity since there is a net hydrolysis of one phosphodiester bond. The second reaction is an intramolecular recombination in which the energy liberated in the hydrolysis reaction involved in linearization is conserved by forming a new phosphodiester bond between the 5' adenosine nucleotide of the intron and the 3' hydroxyl group of some short oligonucleotide. The oligonucleotides which are most reactive in this process are CU, UCU, UUU, and NUUU (Sullivan and Cech, 1985). Of this series UCU is most reactive. It is interesting that the last three nucleotides of the 5' exon are also UCU (Wild and Sommer, 1980) and that UCU is predicted to base pair with the IG sequence (Waring and Davies, 1984) of this intron. This intramolecular recombination is reversible. The reverse reaction regenerates the circular intron-derived molecule and the oligonucleotide.

Perhaps the most exciting finding concerning the Tetrahymena intron is that the linear intron minus the nineteen 5'-most nucleotides is a true enzyme (Zaug and Cech, 1986). This linear intron minus nineteen nucleotides (L-19) is the final product of repeated rounds of cyclization and reopening. The first reaction which this enzyme catalyses is a transesterification in which one nucleotide is removed from one oligonucleotide and transferred to another oligonucleotide. The example used by Zaug and Cech was the oligonucleotide pC5. A typical reaction was pC5 + pC5 goes to pC6 + pC4. The products could then reenter this process and yield small amounts of products as big as pC30 or as small as pC3. No products smaller than pC3 were observed at high concentrations of substrate (10 micromolar pC5). This series of reactions is fundamentally a polymerization reaction. The L-19 intron has lost the 5' portion of its IG. The resulting 3' portion of the IG left at the 5' end of L-19 is 5'UGGAGGG3'. It was hypothesized by Zaug and Cech (1986) that oligocytosine is a good substrate because it can bind to the remaining portion of the IG.
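The bookkeeping of this nucleotide transfer can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the pool of oligocytidylates is represented as a count of molecules per chain length, and the function names (transfer_one, total_nucleotides) are hypothetical and are not taken from Zaug and Cech. The sketch shows only that one transfer step converts pC5 + pC5 into pC6 + pC4 while conserving the total number of nucleotides; it says nothing about rates or binding.

```python
from collections import Counter


def total_nucleotides(pool):
    """Total nucleotides in a pool given as {chain length: number of molecules}."""
    return sum(length * count for length, count in pool.items())


def transfer_one(pool, donor_len, acceptor_len):
    """Move one nucleotide from a donor oligo to an acceptor oligo.

    Toy model of the transesterification: the donor shrinks by one
    residue and the acceptor grows by one, so no nucleotides are lost.
    """
    if donor_len < 2:
        raise ValueError("donor must keep at least one residue")
    needed = Counter([donor_len, acceptor_len])
    if any(pool[length] < n for length, n in needed.items()):
        raise ValueError("pool lacks the requested donor/acceptor")
    result = Counter(pool)
    result.subtract(needed)                            # consume both substrates
    result.update([donor_len - 1, acceptor_len + 1])   # release both products
    return +result                                     # drop zero-count entries


start = Counter({5: 2})          # two molecules of pC5
after = transfer_one(start, 5, 5)
print(dict(after))               # one pC6 and one pC4
```

Repeated calls on the products would generate the longer and shorter species described above; the lower limit of pC3 reported in the text is not enforced in this sketch.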

At low concentrations of pC5 (8.5 nanomolar), a different reaction was observed. Basically, the intermediate in the transesterification reaction was trapped by a lack of acceptor oligonucleotides. This problem was resolved by the hydrolysis of pC away from the donor oligonucleotide without transfer to another oligonucleotide. This reaction is fundamentally a ribonuclease activity. It is clear that the transesterification (polymerization) and hydrolytic (RNase) activities of the L-19 enzyme are basically the same as the cyclization and linearization autocatalysis reactions discussed above. The big difference is that the reactions catalysed by L-19 result in the regeneration of L-19. Since L-19 is neither consumed nor modified at the end of the reaction, L-19 is a true enzyme. Zaug and Cech also showed that this enzymatic activity is specific for ribonucleotides. Deoxyribonucleotides were not reactive.

In addition to the work which has been done in the laboratory of Tom Cech, Waring et al. (1985) have also studied the Tetrahymena intron. This group inserted the intron, along with seven base pairs of the 5' exon and 37 base pairs of the 3' exon, into the SmaI site of the M13 derived vector M13mp8. This SmaI site is located in the 5' region of a gene encoding beta-galactosidase. The expression of this enzyme can be detected in colonies of bacteria on agar plates using color indicators. If the intron splices in E. coli cells, allowing the enzyme to be expressed, then the colonies will be a different color than if the intron does not splice. M13 derived vectors have the added advantage that the single stranded DNA from the phage is an excellent substrate for DNA sequencing and in vitro site specific mutagenesis (Zoller and Smith, 1983).

Waring et al. (1985) found that the Tetrahymena intron splices in E. coli. In addition, they constructed and examined a series of in vitro mutagenesis products. Two deletion mutants, one which deleted the last three bases of the intron and the 3' exon and the other which deleted an internal HincII fragment, were both splicing defective. The 3' deletion has lost the exon sequence which binds to the IG sequence and the conserved guanosine nucleotide at the 3' end of the intron. The internal deletion has lost "Q" and therefore cannot form the conserved P-Q stem found in all group I introns.

Three point mutations which block splicing were also characterized. One of these was in box 2 ("S"). The same mutation has been found to be splicing defective in in vivo derived mutants in cob I4 of S. cerevisiae (De La Salle et al., 1982, and Jacq et al., 1982). The other two mutants affected the P-Q interaction. These are the only point mutations known in P and Q. Both affected a proposed G-C base pair. One was a G to C transversion in Q and was a tight mutant. The other was a C to U transition in Q and was leaky. The leakiness of this mutant was attributed to the possibility of a weak G-U base pairing.

Cech's group (Burke et al., 1986) has recently used an approach similar to that described in the last paragraph to investigate complementing mutations in box 9 (R) and box 2 (S) of the Tetrahymena intron in vitro. They have used this technique to demonstrate unambiguously the interaction between these two conserved sequences.

In addition to the Tetrahymena intron, several other class I introns have also been shown to possess varying degrees of autocatalytic activity. This activity has been detected by examining the ability of in vivo transcribed precursor RNAs to incorporate a nonencoded guanosine nucleotide, by direct electron microscopy of RNAs, or by processing in vitro transcribed precursor RNAs in protein free buffers which promote self splicing. Class I introns which have been shown to be autocatalytic are omega, aI3, aI5a (Tabak et al., 1984 and Arnberg et al., 1986), and the L-rRNA intron and the first cob intron of N. crassa (Garriga and Lambowitz, 1983 and 1984). The N. crassa introns are interesting in that they do not splice in certain mutant nuclear backgrounds under nonpermissive conditions. This indicates that nuclear encoded products play a role in splicing even those introns which are autocatalytic in vitro (Bertrand et al., 1982).

I.B.13. Oxi3 I5g, a class II intron, is also autocatalytic

Until recently, most studies on fungal mitochondrial introns have focused on the group I introns; however, in view of the findings discussed in this section, this is likely to change. Two groups, one working in the United States (Peebles et al., 1986) and the other working in Europe (van der Veen et al., 1986), have cloned the 3'-most intron of the oxi3 gene of S. cerevisiae (aI5g), plus portions of its flanking exons, into the SP6 vector.

Oxi3 I5g is an 887 nucleotide long group II intron (Michel et al., 1982) which does not contain an ORF and which splices in petites (Hensgens et al., 1983b). Like all other group II introns in S. cerevisiae mitochondria, this intron is excised as what appears to be a relatively stable, covalently closed circular molecule (Arnberg et al., 1980, Halbreich et al., 1980, and Hensgens et al., 1982).

Both groups have made the striking finding that transcripts made in vitro are capable of autocatalysing the proper excision of this intron and ligating together the exons. As in vivo (Hensgens et al., 1983), the primary product of the excision event is not linear but appears to be a covalently closed circular molecule. Of the two groups, Peebles et al. have more closely defined the optimum physical conditions required for this autocatalytic activity. They found a sharp optimum at 10 mM magnesium acetate, 2 mM spermidine, 40 mM Tris-acetate (pH 7.6), and 45 degrees C. This is a relatively low ionic strength buffer when compared to the salt concentrations which optimize the autocatalytic reactions of the group I introns. Also in contrast to what was observed for the group I introns, monovalent cations such as sodium and ammonium were not beneficial in promoting the autocatalytic activity of this class II intron. At low concentrations monovalent cations had no effect, but at the concentrations which were optimum for the excision of the Tetrahymena intron, monovalent cations were inhibitory for the excision of aI5g in vitro. It is also important to note that there is no requirement for a guanosine nucleotide in the excision of this intron.

Another extremely interesting finding concerning this autocatalytic activity is the nature of the bond which closes the circle of the excised intron. It had previously been determined that this bond in class II intron derived circles was not the standard 3' to 5' phosphodiester bond found in the Tetrahymena derived circles (Hensgens et al., 1983). Both Peebles et al. and van der Veen et al. have determined that this bond is a 2' to 5' branch involving the 5'-most nucleotide (a guanosine) attached to an adenosine nucleotide very close to the 3' intron/exon boundary. Therefore, the excised intron is really not a circle, as it first appeared, but a lariat with a very short tail. This was established by determining the chemical nature of the branch point trinucleotide following nuclease P1 digestion (Wallace and Edmonds, 1983), by mapping the expected strong stop to reverse transcriptase at a 2' to 5' branch, and by showing that this bond is sensitive to a HeLa-cell derived enzyme which specifically hydrolyses 2' to 5' bonds (Ruskin and Green, 1985).

From these studies, it is quite clear that class II

introns are spliced by a mechanism which is distinct from

the mechanism utilized by class I introns. The proposed

secondary structures, environmental parameters required

for splicing, and the primary products of excision are

different for these two groups of mitochondrial introns.

These two groups of introns are not simple variations on

a common theme; they are distinct.

I.B.14. Nuclear pre-mRNA splicing: Similarities with group II mitochondrial introns

Introns in nuclear protein encoding genes are not conserved in length, sequence, or secondary structure; however, at the 5' end of these introns there is almost always a GT dinucleotide and at the 3' end there is almost always an AG dinucleotide. This is called "Chambon's rule" (reviewed by Breathnach and Chambon, 1981). In addition to these invariant dinucleotides, there are less well conserved larger sequences that occur at the exon/intron boundaries. The 5' intron/exon boundary tends to be (C or A)AG/GT(Pu)AGT. Mutations in these first few bases of the intron dramatically affect the splicing of the beta-globin genes in some beta-thalassemia patients (Treisman et al., 1983). The 3' intron/exon boundary tends to be (Py)11N(Py)AG/G (Mount, 1982). In addition, mammalian introns have a sequence near their 3' end (i.e., about 30-40 nucleotides away from the 3' splice site) which is PyXPyTPuAPy (Ruskin et al., 1984). This sequence is similar to the sequence TACTAAC, which is a highly conserved element near the 3' end of yeast nuclear pre-mRNA introns (Langford and Gallwitz, 1983, and Pikielny et al., 1983). Point mutations in the "TACTAAC box" have been shown to disrupt splicing. Mutations affecting the first or second adenosines of this sequence are particularly detrimental (Langford et al., 1984).
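The boundary and branch-box signals described above lend themselves to a simple sequence check. The following Python sketch is illustrative only: the function names are hypothetical, and the toy intron is an invented sequence, not one taken from any of the cited papers. It tests whether a DNA-sense intron sequence begins with GT and ends with AG, and reports where any exact TACTAAC box occurs.

```python
def obeys_chambon_rule(intron):
    """True if the intron (DNA sense strand) starts with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def find_tactaac(intron):
    """Return the 0-based start position of every exact TACTAAC box."""
    seq = intron.upper()
    positions = []
    start = seq.find("TACTAAC")
    while start != -1:
        positions.append(start)
        start = seq.find("TACTAAC", start + 1)
    return positions


# Invented toy intron: 5' GT..., one TACTAAC box, pyrimidine tract, ...AG 3'
toy_intron = "GTATGT" + "ACCTTTG" + "TACTAAC" + "ATTTTTTTTAG"

print(obeys_chambon_rule(toy_intron))   # GT ... AG boundaries present
print(find_tactaac(toy_intron))         # position of the branch box
```

An exact string match is the simplest possible choice; the degenerate mammalian element PyXPyTPuAPy would need a pattern with ambiguity classes instead.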

Most of the rest of these introns is dispensable for splicing functions. Introns with large deletions which preserve the elements described above splice normally (Gruss and Khoury, 1980, and Wierling et al., 1983). As it happens, the 5' nucleotide of group II introns is a guanosine and the 3' end is the dinucleotide APy.

The remarkable observation about nuclear intron excision is that the intron is excised as a lariat in which the 5' guanosine is attached to an adenosine nucleotide near the 3' splice boundary by a 2' to 5' bond (Padgett et al., 1984, reviewed by Keller, 1983). Furthermore, the adenosine at the branch point is in the "TACTAAC box" in yeast and in the similar sequence in mammalian introns (Ruskin et al., 1984). Mutations in the "TACTAAC box" of yeast block the formation of the lariat branch (Newman et al., 1985). Several groups have developed systems by which they can process pre-mRNA and splice introns in vitro in cell extracts (Krainer et al., 1984, Padgett et al., 1983, and Hernandez and Keller, 1983). All these systems are characterized by a requirement for ATP and magnesium and a long lag time (usually more than 30 minutes, but varying with the substrate and/or extract used) between the start of incubation and the observation of spliced exons. The requirement for ATP and the lag time observed before splicing can be explained by the observation of large splicing complexes called "spliceosomes" (Brody and Abelson, 1985). Spliceosomes are large (i.e., at least 60S) and take time and energy to construct (Frendewey and Keller, 1985). These splicing complexes also contain U-snRNPs (small nuclear ribonucleoprotein particles), particularly U-1, U-2, and U-5 snRNPs. SnRNPs are particles which contain both protein and RNA and are probably involved in nuclear RNA processing (Kramer et al., 1984).

The observation that both nuclear pre-mRNA introns and organelle class II introns produce a 2'-5' lariat between the 5' guanosine of the intron and an adenosine near the 3' end of the intron has led to the hypothesis that the two types of introns are spliced via a common mechanism. In the nucleus, the components of the spliceosome provide in trans much of what is provided in cis by the internal structure of class II fungal mitochondrial introns. This has led to the hypothesis that the two types of introns are evolutionarily related. Since nuclear pre-mRNA splicing requires snRNPs which contain RNA, it is possible that splicing in nuclear pre-mRNAs is also a fundamentally RNA catalysed event (reviewed by Lewin, 1986, and Cech, 1986).

I.B.15. All fungal mitochondrial introns belong to either class I or class II

While there is evidence of introns in all fungal mitochondrial genomes which have been examined, introns have been completely sequenced from only six species. These are S. cerevisiae, Kluyveromyces sp., S. pombe, N. crassa, A. nidulans, and P. anserina. A total of 32 introns from these species have been sequenced. Surprisingly, all except two clearly fold into secondary structures characteristic of either class I or class II introns. Most of these introns have been reviewed by Michel and Dujon (1983) and Waring and Davies (1984). The rest have been sequenced since those reviews were published and have been assigned to their respective classes by the authors who have published their sequences.

The two S. cerevisiae introns not discussed in the reviews mentioned above are cob I2 and cob I3. Cob I3 has been sequenced by P. Q. Anziano (1984) and is clearly a class I intron. Cob I2 is puzzling in that it does not appear to have any of the conserved cis-acting sequences characteristic of class I introns (i.e., P, Q, R, or S) but does encode a maturase which is characteristic of the class I intron ORFs (Hensgens et al., 1983a). Cob I2 may be a divergent form of the class I intron type.

Only one intron has been sequenced from Kluyveromyces sp. It is homologous to omega and is also a class I intron.

Six introns from N. crassa have been sequenced. All except one are clearly class I introns. These are the L-rRNA intron (Burke and RajBhandary, 1982), the two cob introns (Helmer-Citterich et al., 1983, and Burke et al., 1984), the intron in URF1 (Burger and Werner, 1985), and the 3' intron of the ATPase subunit 6 gene (Morelli and Macino, 1984). The one remaining possible intron is near the 5' end of the ATPase subunit 6 gene; as discussed previously, this sequence may not be an intron at all. The sequences of the four optional introns in oxi3 have not been published.

Five introns occur in the mitochondrial genome of A. nidulans. All are class I introns. They are the L-rRNA intron (Netzker et al., 1982), the cob intron, which shares homology with cob I3 of S. cerevisiae (Waring et al., 1982), and the three oxi3 introns (Waring et al., 1984). So far, no class II introns have been found in either N. crassa or A. nidulans.

Four introns have been sequenced from P. anserina. The three introns in the URF1 gene are all class I introns (Michel and Cummings, 1985). Burger and Werner (1985) have noticed that two of these introns in P. anserina have ORFs which are homologous to the ORF of the intron in N. crassa which they sequenced. Hensgens et al. (1983) were able to divide the class I ORFs into two groups based on amino acid homologies. Burger and Werner suggest that the N. crassa intron ORF and the two ORFs from the homologous introns in P. anserina represent the first three members of a third group of ORFs found in class I introns. The fourth intron which has been sequenced from P. anserina is the first intron in the CoxI (homologous to oxi3) gene. This intron, which is involved in the senescence phenomenon, is a class II intron. There may be sixteen or more other introns in the P. anserina mitochondrial genome which are yet to be investigated.

Finally, of the three introns which are found in the S. pombe mitochondrial genome, the two oxi3 introns are class I introns, while the optional intron in cob is a class II intron (Lang et al., 1983, and Trinkl et al., 1985).

For a summary of the conserved cis-acting sequences found in all sequenced class I introns, see Tables 1 and 2.

I.B.16. Introns in Organelle genes of plants

I.B.16.a. The cytochrome oxidase subunit II gene in the mitochondrial genomes of flowering plants

Flowering plants can be divided into two groups, monocots and dicots, based on basic differences in organ symmetry, root and leaf structure, and the number of cotyledons (embryonic leaves) which occur in their seeds. Monocots have one cotyledon while dicots have two. Cereal grains are good examples of monocots, while beans, spinach, and tobacco are good examples of dicots. Mitochondrial genomes of flowering plants are considerably larger than the mitochondrial genomes of other eucaryotes (Leaver and Gray, 1982) and encode more polypeptides (Leaver and Forbe, 1980). Unlike other mitochondrial genomes, flowering plant mitochondria also encode a 5S rRNA (Bonen and Gray, 1980).

One of the protein encoding genes of plant mitochondria is of interest in the discussion of organelle introns. This is the cytochrome oxidase subunit II gene. Fox and Leaver (1981) sequenced this gene from Zea mays. It contains a 794 bp intron which Michel and Dujon (1983) have classified as a group II intron. Both its secondary structure and its 3' end are very homologous to those of the class II introns in fungal mitochondria. This intron does not contain a significant ORF.

These findings prompted Kao et al. (1984) to examine the structure of this gene in four other plant species. These species were the monocots rice and wheat, and the dicots Oenothera berteriana (Hiesel and Brennicke, 1983) and Pisum sativum (pea). They found that the two monocot species have an intron which is highly homologous to the corn intron. These homologous introns are also in exactly the same place in the CoxII gene as is the corn intron. Between rice and corn, the two exons were 99.5% and 100% homologous at the nucleotide and amino acid levels, respectively. This indicates that the rate of single base substitutions in structural genes of plant mitochondria may be much lower than that which has been observed for mammalian mitochondria.

The corn intron was much shorter (794 nucleotides) than the introns found in wheat (exact sequence not shown) and rice (1265 nucleotides). However, this size difference can largely be attributed to a single 461 bp insertion found in the wheat and rice introns which is lacking in the corn intron. If this insertion is not considered, then the rice and corn introns are 98.6% homologous at the nucleotide level (no significant ORFs present). The 461 bp insertion is of interest in that eight base pair direct repeats were generated at both ends of the insertion at the site where it was inserted. The generation of direct repeats is usually observed in insertions of transposable elements as a result of how the staggered cut, which is involved in the DNA insertion, is repaired. It could be that the 461 bp insertion was a byproduct of normal repair of a double strand break in the mitochondrial genome. The insertion is at the end of a large stem as predicted by the proposed secondary structure; therefore, the insertion does not disrupt the core secondary structure of this class II intron.
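The size comparison and the flanking direct repeats can be made concrete with a short Python sketch. The three lengths are those quoted above; everything else is an assumption for illustration: the function name flanking_direct_repeat is hypothetical, and the toy sequence is invented rather than taken from Kao et al. The sketch checks how much of the rice/corn size difference the 461 bp insertion accounts for, and shows one way to test whether the bases immediately 5' of an insertion are repeated immediately 3' of it.

```python
CORN_INTRON_LEN = 794     # nucleotides, from the text
RICE_INTRON_LEN = 1265    # nucleotides, from the text
INSERTION_LEN = 461       # bp, from the text

rice_without_insertion = RICE_INTRON_LEN - INSERTION_LEN
residual_difference = rice_without_insertion - CORN_INTRON_LEN
print(rice_without_insertion, residual_difference)


def flanking_direct_repeat(seq, ins_start, ins_end, k=8):
    """Return the k-base direct repeat flanking seq[ins_start:ins_end], or None.

    The repeat is present when the k bases just 5' of the insertion are
    identical to the k bases just 3' of it (a target-site duplication).
    """
    if ins_start < k or ins_end + k > len(seq):
        return None
    left = seq[ins_start - k:ins_start]
    right = seq[ins_end:ins_end + k]
    return left if left == right else None


# Invented toy example: an 18 bp insert bounded by an 8 bp direct repeat.
target = "ACGTTGCA"
insert = "GGGCCC" * 3
toy = "TTT" + target + insert + target + "AAA"
print(flanking_direct_repeat(toy, 11, 29))
```

With the insertion removed, the two introns differ in length by only ten nucleotides, which is consistent with the statement that the insertion explains most of the size difference.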

Kao et al. also conclude that, since all three of the monocots examined have this intron while neither of the examined dicots has it, it may be a general rule that this intron is present in monocots but not in dicots.

I.B.16.b. Introns in Chloroplast tRNA genes

Many nuclear encoded tRNA genes contain introns (Ogden et al., 1980, and Abelson et al., 1979). These introns are very short (13-60 bp) and are spliced by a mechanism unrelated to the splicing mechanisms used by other nuclear and organellar introns (Peebles et al., 1983, and Greer et al., 1983). By contrast, chloroplast tRNA gene introns are hundreds or thousands of base pairs long and can be folded into secondary structures which are characteristic of either class I or class II introns.

Not all chloroplast tRNA genes contain introns.

The first tRNA gene which will be discussed is the chloroplast tRNA-UUA-Leu. This gene has been sequenced in corn (Steinmetz et al., 1982), a monocot, and in the broad bean, Vicia faba (Bonnard et al., 1984), a dicot. Both of these genes are split by a class I intron which is in the same place in their anticodon loops. The corn intron is 458 bp long, and the bean intron is 451 bp long. The mature tRNAs are 88 nucleotides long and differ only by three single base substitutions. This is not too surprising since tRNA genes evolve fairly slowly. In fact, these tRNA exons are 60% homologous to their counterparts in E. coli. What was surprising was the homology between the two introns. If one divides these introns into thirds, it can be shown that the 5' third and the 3' third of the corn intron are 85% homologous to the 5' third and 3' third of the broad bean intron, respectively. The conserved sequence (structural) elements E, P, Q, R (box 9), and E' are all in the first third of these introns, while S (box 2) is in the last third. However, the homology extends well beyond these conserved domains. The middle third is also more homologous than it would first appear. The sequence divergence observed in this region can largely be explained by a series of small insertions and/or deletions. The largest of these are a 27 bp sequence found only in the corn intron and a 13 bp sequence found only in the bean intron. These two sequences account for most of the sequence divergence observed in the middle region of these introns.
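Percent homology figures of the kind quoted above are, in effect, percent identities over aligned positions. The Python sketch below shows one way such a figure could be computed; it assumes that the two sequences have already been aligned to equal length with "-" marking insertions or deletions, and that gapped columns are excluded from the count. Both assumptions, and the function name percent_identity, are illustrative and are not taken from the cited authors. The only figures from the text are the 88 nucleotide tRNA length and the three substitutions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical bases over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue              # insertion/deletion column: not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: one substitution in ten ungapped columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))

# Worked figure from the text: 88-nucleotide tRNAs with three substitutions.
trna_identity = round(100.0 * (88 - 3) / 88, 1)
print(trna_identity)
```

Excluding gapped columns is what allows the middle third of the two introns to appear more homologous once the 27 bp and 13 bp insertions/deletions are set aside; scoring gaps as mismatches would give a lower figure.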

Another interesting feature of these introns concerns the proposed "internal guide sequence" (IG). The portion of the IG, which is proposed to bind to the 3' exon, appears to be involved in a different stem structure.

However, the secondary structure inherent in a tRNA molecule may circumvent the need for a strong IG-exon interaction. The other conserved cis-acting sequences found in the tRNA introns are shown in Tables 1 and 2.

The next four tRNA introns which will be discussed occur in the spacer region between the 16S and 23S rRNA genes in chloroplast genomes. This region has been sequenced in Euglena gracilis (Graf et al., 1980, and Orozco et al., 1980), corn (Koch et al., 1981), and tobacco (Takaiwa and Sugiura, 1981). In all three genomes, this region encodes the tRNA-AUU-Ile and the tRNA-GCA-Ala. The E. gracilis genes are continuous, while in corn and tobacco each of these genes is interrupted by a single class II intron (Michel and Dujon, 1983). In corn the tRNA-Ile and tRNA-Ala introns are 949 and 806 bp long, respectively, while in tobacco these introns are 707 and 710 bp long, respectively. What is remarkable is that all four of these introns are highly homologous to each other. The biggest differences between these four introns are two large insertion/deletions. There is a 229 bp sequence which occurs in the corn tRNA-Ile gene, and a 103 bp sequence which occurs in the corn tRNA-Ala gene, neither of which occurs in the respective gene in tobacco.

There are three other tRNA introns which have been sequenced from chloroplasts. All are from tobacco and all are class II introns. The first of these is the 571 bp intron found in the tRNA-UAC-Val (Deno et al., 1982). This intron is interesting in that the same tRNA gene from spinach does not contain an intron (Sprouse et al., 1981). Another tRNA intron was found in the tRNA-UCC-Gly gene (Deno and Sugiura, 1984). This 691 bp intron is in the D-loop of the tRNA instead of the anticodon loop. All other tRNA gene introns (including those of nuclear tRNA genes) are in the anticodon loop. The last tRNA intron is a 2526 bp intron in the anticodon loop of the tRNA-UUU-Lys gene (Sugita et al., 1985). This intron also has a 509 codon ORF, which begins with an AUG initiation codon and ends with UGA. These authors did not seem to appreciate the similarities between this intron and the other class II introns and did not compare its ORF with the ORFs of other class II introns.
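An ORF such as the one just described, beginning with AUG and ending with an in-frame stop codon, can be located by scanning the three reading frames of the strand. The Python sketch below is a minimal illustration only. It assumes the standard stop codons UAA, UAG, and UGA (written here in DNA form) and searches only the given strand; the function name longest_orf and the toy sequences are hypothetical. Organelle genomes that use variant genetic codes would need a different stop codon set.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # assumed standard stop codons


def longest_orf(seq):
    """Return (start, codons) for the longest ATG-initiated ORF, or None.

    start is the 0-based position of the ATG; codons counts the ATG and
    all sense codons up to, but not including, the stop codon.
    """
    seq = seq.upper().replace("U", "T")
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i                 # open a new reading frame
            elif codon in STOP_CODONS:
                codons = (i - start) // 3
                if best is None or codons > best[1]:
                    best = (start, codons)
                start = None                  # close it at the stop codon
    return best


# Invented toy sequence: ATG + four GCT codons + TGA, offset by two bases.
toy = "CC" + "ATG" + "GCT" * 4 + "TGA" + "GG"
print(longest_orf(toy))
```

ORFs that run off the end of the sequence without a stop codon are ignored in this sketch, so an ORF continuing into a downstream exon would not be reported.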

I.B.16.c. Introns in chloroplast genes other than tRNA genes

Two protein encoding genes from the tobacco chloroplast genome have been found to be interrupted by introns. These are the ribosomal protein genes L-2 and S12 (Zurawski et al., 1984, and unpublished data; see Sugita et al., 1985). Both are interrupted by what appears to be a single class II intron. It is interesting that the L-2 gene, which has also been sequenced from spinach, is continuous in spinach. There is also an 823 bp intron in the wheat chloroplast gene which encodes subunit I of the ATP synthase complex (Bird et al., 1985). By comparison with the E. gracilis introns discussed in the next paragraph, the wheat intron is probably related to the class II mitochondrial introns; however, these authors did not study the secondary structure of this intron.

Of all the organellar genomes yet examined, it appears from the data currently available that the Euglena gracilis chloroplast genome contains the most introns in protein encoding genes. Heteroduplexing experiments show that at least fifteen of the protein encoding genes of this genome contain introns (Koller and Delius, 1984). For example, the gene for the large subunit of ribulose-1,5-bisphosphate carboxylase contains nine introns (Koller et al., 1984). None of these introns has been fully sequenced. All of these introns are about 0.5 kb long; however, their boundary sequences indicate that they may be related to fungal mitochondrial class II introns.

A more closely studied example of introns found in the E. gracilis chloroplast genome occurs in the region of the psbA gene, which encodes a highly conserved 32 kD thylakoid membrane protein (Keller and Stutz, 1984, and Keller and Michel, 1985). Four introns occur in the 32 kD protein gene, and two additional introns occur in an unidentified gene which is just 5' to the 32 kD protein gene. These introns range in size from 433 to 616 bp. All six introns contain sequences homologous to those of class II introns near their 5' and 3' boundaries; however, only one of these introns (the fourth intron in the psbA gene) could be folded into a secondary structure similar to that of fungal mitochondrial class II introns. The other five introns are at least very divergent and possibly degenerate forms of class II introns. Keller and Michel conclude that "assuming the loss of information apparently suffered by these introns is real, they may best be regarded as well advanced intermediates in a process of information transfer which, starting from a self-splicing intron, would end with one that merely specifies the cuts and ligations to be performed by an externally encoded splicing machinery."

Given the similarities between mitochondrial class II introns and nuclear pre-mRNA introns, further studies of the E. gracilis introns may help explain the origin and evolution of nuclear introns. Questions relevant to RNA mediated catalysis and RNA splicing of class II mitochondrial introns and nuclear pre-mRNAs have been discussed in a previous section. Other questions about the proteins involved in this process also need to be addressed. For example, if the yeast mitochondrial class II introns, the Euglena chloroplast introns, and mammalian nuclear introns are all examples of processes which lie on the same continuum, then are the nuclear encoded proteins involved in yeast mitochondrial intron splicing related to the proteins involved in splicing in Euglena chloroplasts or, by analogy, in mammalian nuclear transcripts? For that matter, is there a relationship between the proteins involved in splicing yeast nuclear introns and yeast mitochondrial class II introns?

The last five organelle introns which will be discussed are all class I introns which occur in the chloroplast genome of Chlamydomonas reinhardii. Four of these introns occur in the 32 kD (psbA) gene of C. reinhardii (Erickson et al., 1984). These introns are larger (1.1 to 1.8 kb) than the four introns found in the same gene in E. gracilis. Only the boundary sequences of these introns have been determined, but it is clear that they are completely unrelated to the introns of the E. gracilis 32 kD protein gene. In fact, the best first estimate is that all four are class I introns. Not only is the last base of each 5' exon a thymidine, but the last base of each intron is a guanosine. Of the four introns, a box 2 homolog has been observed in three, and of these three, two also contain a box 9 homolog. Two of the introns also have ORFs (more than 43 codons) which are in frame with the 5' exon.

The other intron which has been observed in the C. reinhardii chloroplast genome is in the 23S rRNA gene (Rochaix et al., 1985). It has been fully sequenced and is clearly a class I intron. It is not in the same place in the L-rRNA gene as the other L-rRNA introns. This intron is 888 bp long and contains a 163 codon ORF. An interesting feature of this ORF is the presence of a "LAGLI-DADG" dodecamer homolog near its 5' end. This suggests that this ORF may encode a maturase-like protein. If this is true, then this is the shortest known maturase-like ORF. It is also unusual in that only one of the dodecamer sequences was found. The conserved cis-acting sequence homologs observed in this intron are shown in figure 5.

I.B.17. A class I intron in the thymidylate synthase gene of the bacteriophage T4

T4 is a large DNA coliphage of the gram negative bacterium E. coli. Its genome encodes a large number of genes. About half of these genes, such as tRNAs and thymidylate synthase (td), have fully functional homologs in the host's genome. There are no known introns in the E. coli genome; therefore, it was very surprising when it was discovered that the thymidylate synthase gene of T4 contains a 1.0 kb intron (Belfort et al., 1985, Chu et al., 1984, and reviewed by Schmidt, 1985).

It has been shown that this td gene is the functional td gene in the T4 genome and that both exons encode amino acid sequences found in the mature protein. Also, the pre-mRNA is colinear with the DNA sequence, and the spliced mRNA and the excised intron have been detected. The last base of the 5' exon is a thymidine, and the last base of the intron is a guanosine. Very late in this investigation, the sequence of this intron was published (Chu et al., 1986). The intron is a 1017 bp class I intron with a 245 codon free standing ORF. The authors of this paper did not look for LAGLI-DADG homologs in this ORF. This intron is capable of autocatalysing its own excision and cyclization. This reaction required magnesium and a guanosine cofactor.

I.B.18. Possible class I intron in an archeobacterial 23S rRNA gene

Kjems and Garrett (1985) have sequenced the 23S rRNA gene from the archeobacterium Desulfurococcus mobilis. There is only one copy of this gene in this organism. The extremely exciting finding was that this gene contains a 622 nucleotide long intron which has homologs of the consensus sequences for both R (box 9 right) and S (box 2). The R homolog is TCAAGAGACTA, which is an 8 out of 10 match with the consensus, and the S homolog is ACUAGAAAUAGU, which matches the consensus at 10 out of 12 sites. The secondary structure of this intron was not investigated by these authors. This intron also has an ORF which begins near the 5' end of the intron and extends into the 3' exon. Homology between this ORF and the ORFs of other class I introns was also not investigated.
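Match scores such as "10 out of 12" are simple position-by-position counts against a consensus. A minimal Python sketch of such a count follows. The function name count_matches is hypothetical, and the example strings are invented placeholders: the actual R and S consensus sequences are not reproduced in this section, so no attempt is made to recompute the scores quoted above. The sketch treats U and T as equivalent so that RNA and DNA notations can be compared.

```python
def count_matches(candidate, consensus):
    """Count positions at which candidate and consensus carry the same base.

    U and T are treated as the same base; both sequences must already be
    aligned and of equal length.
    """
    a = candidate.upper().replace("U", "T")
    b = consensus.upper().replace("U", "T")
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(1 for x, y in zip(a, b) if x == y)


# Invented placeholder sequences (not the real box 2 or box 9 consensus).
toy_candidate = "ACGUACGU"
toy_consensus = "ACGTACGA"
score = count_matches(toy_candidate, toy_consensus)
print(f"{score} out of {len(toy_consensus)}")
```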

I.B.18. Short regions of high G+C content in the mitochondrial genome of S. cerevisiae

The overall distribution of nucleotides in the yeast mitochondrial genome is not random. As a whole, the genome is very A+T rich, being roughly 82% A+T (reviewed by de Zamaroczy and Bernardi, 1985). The A+T content of exons and structural genes varies but averages about 75% A+T. Intergenic regions are very A+T rich; however, scattered throughout the genome are numerous small (usually 20-60 nucleotides) G+C rich regions. These regions are sometimes called "GC site clusters" or "G+C clusters" because the restriction endonucleases HaeIII and HpaII tend to have recognition sites within these regions. These G+C clusters can be almost pure G+C and are often optional; that is, some G+C clusters are present in some strains but are absent from others.
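As a rough illustration of the two quantities discussed above, the following sketch computes the overall A+T fraction of a sequence and scans it for short G+C rich windows. The function names, the 20 nucleotide window and the 0.8 cutoff are assumptions chosen for the example; they are not values taken from the studies cited here, and the input is a toy sequence rather than real mitochondrial DNA.

```python
def at_fraction(seq):
    """Return the fraction of A and T (or U) residues in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "ATU") / len(seq)


def gc_rich_windows(seq, window=20, min_gc=0.8):
    """Return 0-based start positions of windows whose G+C fraction is >= min_gc."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if gc >= min_gc:
            hits.append(start)
    return hits


# Toy sequence: an A+T rich background with one 20 nucleotide G+C block.
toy = "AT" * 20 + "GC" * 10 + "AT" * 20
print("A+T fraction:", at_fraction(toy))
print("G+C rich window starts:", gc_rich_windows(toy))
```

Overlapping hits would normally be merged into a single cluster before counting; that step is omitted to keep the sketch short.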

There are also a large number of small palindromic G+C rich sequences in the mitochondrial genome of N. crassa (Yin et al., 1981) which comprise 5-10% of the genome. Each of them contains two closely spaced PstI sites. These might be cis-acting RNA processing sites or might be functionally optional, as are many (if not all) of the G+C clusters found in yeast.

The reason these G+C clusters are interesting is that at least one of them, the "a" insert in the var1 gene, participates in an asymmetrical gene conversion event (Strausberg et al., 1978, Strausberg et al., 1981, and Hudspeth et al., 1984). In a cross between a strain with the "a" insert and a strain without it, a majority of the progeny have the "a" insert even though outside markers are not preferentially inherited.

I.B.19. Conclusion

In this Introduction, a broad review of the structure and function of organelle genomes has been presented.

The focus of this discussion has been on the introns found in yeast and other fungal mitochondrial genomes.

Almost without exception these introns can be divided into two distinct classes based on secondary structures.

Some introns of both classes have large ORFs. At least some of these intronic ORFs of both classes of introns encode proteins which are required for the splicing of the introns which encode them. These proteins are called maturases. One maturase, encoded by cob I4 of S. cerevisiae, is required to splice oxi3 I4 in addition to its native intron. Intronic ORFs may also encode other proteins with biological activities. These include the fit1 protein of the omega intron and the ribosomal protein, S5, which is encoded by the mitochondrial L-rRNA intron of N. crassa and A. nidulans.

These introns are also of great interest in that they encode RNA molecules which are themselves autocatalytic and/or catalytic. In addition, introns whose secondary structures resemble those of fungal mitochondrial introns are widespread in nature. Introns of the class I type have been observed in mitochondria, chloroplasts, the T4 phage, archaebacteria and certain nuclear rRNA genes of lower eucaryotes. Class II introns also occur in a wide variety of mitochondrial and chloroplast genomes and may be related to nuclear introns of higher eucaryotes.

Many, and perhaps all, introns in yeast mitochondria are optional for wild type respiration. There is a high degree of strain dependent variation attributable to the presence or absence of certain introns in some strains.

This has led to the hypothesis that other optional introns may occur in wild type yeast mitochondrial genomes which by chance do not occur in laboratory strains.

Because of all these findings, I have sought to identify novel introns in yeast mitochondrial genomes which do not occur in laboratory strains of yeast. The oxi3 gene was likely to be a productive place to look for novel introns because it contains the greatest variety of introns in laboratory strains. Some of those introns were already known to be optional. I have examined the oxi3 gene from ten closely related wild type yeast species and strains in the genus Saccharomyces. I have found two additional optional introns and have completely sequenced one of them. I have compared this newly sequenced intron with previously sequenced introns and have expanded on the analysis of selected other introns.

I.C. Taxonomy and physiology of Rickettsia

The members of the genus Rickettsia are interesting for three reasons: 1. Some members of this genus are serious pathogens of humans and domesticated animals. 2.

The ecological niche occupied by these organisms is quite unlike that occupied by any other group of organisms. 3.

Rickettsia are relatively simple organisms with reduced genome sizes compared with other bacteria. As a result of these characteristics a number of very interesting and/or useful research goals can be met by studying this group. Some of these goals will be discussed in the following sections. One problem in working with

Rickettsia is that, until recently, the technology to handle these organisms has lagged behind the interest in investigating this group. Some of the recently improved technology will be discussed in the Introduction. Other advances made as part of the work covered in this dissertation will be discussed in the Results section.

I.C.1. Taxonomy of the family Rickettsiaceae

The Rickettsiaceae are a group of gram negative bacteria which in nature exist only as symbionts with eucaryotic cells. All of the Rickettsiaceae have cell walls, and none of them have flagella. Bergey's Manual (1983, section written by E. Weiss and J. Moulder) describes the family as follows:

Family Rickettsiaceae (within Order Rickettsiales)

    A. Tribe Rickettsieae
        1. Genus Rickettsia (12 species described)
        2. Genus Rochalimaea (2 species)
        3. Genus Coxiella (one species, C. burnetii)

    B. Tribe Ehrlichieae
        1. Genus Ehrlichia
        2. Genus Cowdria
        3. Genus Neorickettsia

    C. Tribe Wolbachieae
        1. Genus Wolbachia
        2. Genus Rickettsiella

There is no particularly good reason for believing that this group is monophyletic. The exact relationship of the tribe Ehrlichieae and the tribe Wolbachieae to each other or to the tribe Rickettsieae is unclear. In fact, the relationships of the various genera within the Ehrlichieae and Wolbachieae are not clear. On the other hand, there is one genus in this group that clearly does not belong: Coxiella, whose single species, C. burnetii, is the causative agent of Q fever. First of all, it is gram positive under certain staining conditions. Second, it undergoes a form of differentiation which is strikingly similar to endospore formation. These are just two of the many fundamental differences which distinguish C. burnetii from all the other members of the family Rickettsiaceae. For these reasons this organism will not be further considered in this dissertation.

Once C. burnetii is removed from this family, the members of this family are at least superficially similar to each other. For example, in addition to the characteristics described above, all members of this family have at least one naturally occurring invertebrate host. Fleas, lice, and ticks (particularly in the family Ixodidae) are common invertebrate hosts of Rickettsiaceae which also infect vertebrates.

The remainder of this section will deal with relationships between the members of the genus Rickettsia and the genus Rochalimaea. The genus Rickettsia consists of obligate intracellular symbionts of arthropods and vertebrates. None of these organisms is capable of growth outside of a host cell. In fact, as will be discussed in the section on physiology, they are extremely labile outside of their host cells. Even under the most favorable conditions yet devised, they can survive for only about two hours in an extracellular environment. The Rickettsia are also extremely unusual in that they reproduce free in the cytoplasm of their host cells unbounded by any membrane of host origin.

They are very small (0.4 by 1.2 microns) and extremely difficult to visualize using most standard light microscopy techniques. In fact, early taxonomies grouped the Rickettsia with viruses. This finally changed when

Gimenez (1964) developed a staining procedure which differentially stains the Rickettsia and the host cell cytoplasm. The details of the Gimenez staining procedure are given in the Methods section. Some members of the genus Rickettsia cause disease in humans. The details of these pathologies need not be discussed here, but the pathogenic Rickettsia cause similar pathologies in humans.

I.C.2. Typhus group rickettsia

The genus Rickettsia is usually divided into three groups. The first group is the typhus group. There are currently three members of this group. They are R. prowazekii, the causative agent of epidemic typhus, R. typhi (also known as R. mooseri), the causative agent of endemic typhus, and R. canada (McKiel et al., 1967), which has been isolated from ticks in Canada and California (Lane et al., 1981) but has not been observed to cause significant pathology in laboratory mammals. However, four human subjects, who had previously been clinically diagnosed as having Rocky Mountain Spotted Fever (RMSF) but who had failed to develop antibodies which cross react with the known RMSF agents, were shown to possess antibodies which recognize R. canada (Bozeman et al., 1970). Sera from those patients did not react with the other typhus group agents either. This is insufficient evidence to diagnose these individuals' diseases as being caused by R. canada, but it demonstrates a possibility that some strains of R. canada are infectious and possibly pathogenic in humans.

Organisms within the typhus group were originally classified together because antisera raised against one member of the group will cross react with the other two members of the group (McKiel et al., 1967, and Myers and Wisseman, 1981). One good example of the shared antigenicity of the typhus group rickettsia is the work by Black et al. (1983), who made 137 different monoclonal antibody producing hybridoma cultures using R. prowazekii as a source of antigens. They found that 104 of the monoclonal antibodies reacted only with R. prowazekii. Eleven reacted with R. prowazekii and R. typhi, and eleven more reacted with all three typhus group members.
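As a quick check of the arithmetic, the three reactivity categories reported above can be tallied against the 137 hybridoma cultures. The sketch uses only the counts quoted in this paragraph; the variable names are illustrative.

```python
# Counts quoted in the text for the monoclonal antibodies of Black et al. (1983).
total_hybridomas = 137
reported = {
    "R. prowazekii only": 104,
    "R. prowazekii and R. typhi": 11,
    "all three typhus group members": 11,
}

accounted = sum(reported.values())
unaccounted = total_hybridomas - accounted

for label, n in reported.items():
    print(f"{label}: {n} ({100 * n / total_hybridomas:.1f}%)")
print(f"not assigned to a category in this summary: {unaccounted}")
```

The three categories sum to 126, so 11 of the 137 cultures are not assigned to a category in the summary given here.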

Serology by itself, however, is a poor taxonomic character because it samples only a very small portion of a genome's information. For this reason, several investigators have used molecular biology techniques to more closely define the relationships between the members of this group (Tyeryar et al., 1973, Myers and Wisseman, 1980, and Myers and Wisseman, 1981). Those investigators determined DNA base composition, genome length and percent homology between members of the typhus group and other rickettsial species. Genome length and percent homology between genomes were determined by various techniques, all of which rely on renaturation kinetics for their predictions. There is a great deal of variation between these experiments; however, all of the genomes were found to be very A+T rich. All the members of the typhus group were found to be 71+/-0.5% A+T. R. rickettsii, a member of the RMSF group, was 67.4% A+T, and Rochalimaea quintana was 61.2% A+T. (E. coli is 48.3% A+T.) The similarities between R. canada and the other two members of the typhus group break down, however, when one examines genome size and percent homology with the other genomes. R. typhi and R. prowazekii both have genomes which are roughly 110+/-9 megadaltons (mD) and (depending on which experiment one examines) are 70-100% homologous to each other. These species are clearly closely related. R. rickettsii has a genome of roughly 130+/-10 mD and is about 40+/-20% homologous to the three members of the typhus group. R. canada has a genome of 143+/-4 mD and is 42+/-10% homologous to R. typhi, R. prowazekii, and R. rickettsii. In other words, R. canada is no more closely related to R. typhi and R. prowazekii than it is to R. rickettsii, and its genome size more closely resembles that of R. rickettsii. This is a classic example of how serology has led investigators to mistakenly classify rickettsial species. (For comparison, the E. coli genome is 241+/-13 mD (Gillis et al., 1970).)
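The genome size argument above can be made concrete by treating each quoted value as a range and asking which ranges overlap. This is a minimal sketch that assumes the "+/-" figures describe symmetric ranges around the stated value; the helper names are hypothetical, and the numbers are simply those quoted in the text, in the text's own mD units.

```python
# Genome sizes as quoted in the text: (value, +/- spread), in the text's mD units.
genome_sizes = {
    "R. typhi / R. prowazekii": (110, 9),
    "R. rickettsii": (130, 10),
    "R. canada": (143, 4),
}


def to_interval(value, spread):
    """Convert a value and a symmetric spread into a (low, high) range."""
    return (value - spread, value + spread)


def overlaps(a, b):
    """Return True if two closed (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: to_interval(*v) for name, v in genome_sizes.items()}
canada = intervals["R. canada"]
for name in ("R. typhi / R. prowazekii", "R. rickettsii"):
    print(name, intervals[name], "overlaps R. canada:", overlaps(canada, intervals[name]))
```

With these figures the R. canada range (139 to 147) meets the R. rickettsii range (120 to 140) but not the range for R. typhi and R. prowazekii (101 to 119), which is the sense in which its genome size more closely resembles that of R. rickettsii.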

R. typhi and R. prowazekii are clearly closely related to each other. Exactly how closely related they are is yet to be precisely determined. Eisemann and Osterman (1976) attempted to examine this relationship by looking at protein phenotypes on SDS-PAGE gels. Difficulties in obtaining adequate amounts of rickettsia which were free of host cell contamination made it necessary for these investigators to radiolabel rickettsial proteins in vivo in irradiated L-929 cells in the presence of cycloheximide (1.0 to 2.0 mg/ml). Cycloheximide selectively blocks translation on cytoplasmic ribosomes and allows one to label only rickettsial (and mitochondrial) proteins. These authors found that R. typhi and R. prowazekii are clearly distinguishable on the basis of protein mobilities on SDS-PAGE.

A similar technique has also been used to examine different isolates of R. prowazekii (Oaks et al., 1981). These investigators examined five strains of R. prowazekii. Three of the strains were human pathogens isolated in the Old World. One was an avirulent derivative of one of these pathogenic strains, and the fifth strain was an isolate from the United States with presumed human pathogenicity. On SDS-PAGE, the four pathogenic strains appeared identical. The avirulent strain differed from the others by one polypeptide. On two-dimensional gel electrophoresis, which tests about 175 polypeptides, more differences became apparent. Several strain dependent differences were noted which distinguish between the virulent Old World strains and the US isolate. Two polypeptides were identified in all pathogenic strains which were absent from the avirulent derivative. Furthermore, two polypeptides were identified as specific for the avirulent strain. No relationship was determined between the polymorphic polypeptides.

Another important characteristic of members of the genus Rickettsia is their invertebrate host (vector) range. Traditionally, R. prowazekii is thought to be spread only by human body lice (Pediculus humanus humanus) and to occur naturally only in humans and this louse. R. typhi is usually thought to be spread from rats to humans by one of the fleas which are ectoparasitic on those rodents. While these vectors are clearly important in the maintenance and spreading of these Rickettsia to humans, this view vastly underestimates the invertebrate host ranges of the bacteria (reviewed by Schmidt and Roberts, 1981, and Bozeman et al., 1975). R. typhi can be spread from rat to rat by the fleas Xenopsylla cheopis, Nosopsyllus fasciatus, and Leptopsylla segnis, the louse Polyplax spinulosa, and the tropical rat mite, Ornithonyssus bacoti. To put this into perspective for individuals not familiar with arthropod evolution, lice and fleas are about as closely related to each other as mammals and fish (based on predicted divergence time), while fleas and lice are about as far removed from mites as mammals are from starfish. Taken as a whole, R. typhi has a fairly large host range in that it can infect and reproduce in fleas, lice, mites, rodents, and humans. There is no reason to believe that it cannot thrive in other hosts as well.

A similar situation exists for R. prowazekii. Until recently, this organism had been thought to exist only as a pathogen of humans and the human body louse. This changed dramatically when Bozeman et al. (1975) showed that humans who had mysteriously contracted a typhus-like disease in the southeastern United States were living in close proximity to (often in the same building as) flying squirrels which were infected with R. prowazekii. This organism has been repeatedly isolated and, as was discussed in a previous paragraph, has a protein phenotype which is almost indistinguishable from that of Old World strains of R. prowazekii (Oaks et al., 1981). In addition, the DNA of this organism has been compared with that of Old World isolates using DNA-DNA hybridization (Myers and Wisseman, 1980) and restriction endonuclease digestions (Regnery and Spruill, 1983, and Regnery et al., 1983). The DNA of the US isolate was found to be nearly identical to that of the Old World isolates and to have few restriction pattern differences. With this in mind, workers in the US began to look for vectors which might be spreading the disease from the flying squirrel population to humans. They isolated R. prowazekii three times from squirrel lice and twice from squirrel fleas (Bozeman et al., 1981, and Sonenshine et al., 1978). These findings make the R. typhi and R. prowazekii invertebrate host ranges look strikingly similar. R. canada, which might not be a true member of the typhus group, has only been isolated from ticks.

I.C.3. The Rocky Mountain Spotted Fever group

The second group in the genus Rickettsia is the Rocky Mountain Spotted Fever (RMSF) group. Like the typhus group, members of this group were originally grouped together because of shared antigenicity (Bell and Stoenner, 1960, and Philip et al., 1978). The definitive member of this group is Rickettsia rickettsii, the causative agent of RMSF. The other members of this group are as follows:

    R. akari          Huebner et al., 1946
    R. australis      Andrew et al., 1946
    R. conorii        Philip et al., 1966
    R. montana        Bell et al., 1963
    R. parkeri        Bell and Pickens, 1953
    R. rhipicephali   Burgdorfer et al., 1975
    R. siberica       Bell and Stoenner, 1960

In the past, many investigators attempted to differentiate between strains or species of rickettsia by a variety of immunological techniques. Researchers have attempted complement fixation tests (Pickens et al., 1965, and Plotz et al., 1944), cross infection and vaccination tests (Lackman et al., 1965, and Plotz et al., 1946), toxin neutralization assays in mice (Bell and Stoenner, 1960, and Robertson and Wisseman, 1973), and cross reactivity with antigens of the bacterium Proteus (strain OX19) (Hechemy et al., 1979). However, by the late 1970's, it was obvious that these techniques were inadequate. They are too insensitive, too laborious and time consuming, and/or too inaccurate to be used routinely. As is stated in the introduction of the paper by Philip et al. (1978), "All of these methods have shortcomings, and none is generally applicable to large-scale strain differentiation. Accumulation of isolates that are antigenically uncharacterized and increasing evidence of heterogeneity of strains have accentuated need for a practicable serologic method to classify Spotted Fever Group rickettsiae." To fulfill this need, Philip et al. adapted the microimmunofluorescence technique of Wang (1971) for the identification of rickettsial strains and species. The details of this test are given in the Methods section. Basically, this technique relies on a battery of fluorescently labeled "typing" sera. Such sera are obtained from mice which had been immunized with one of the type strains of each of the species of rickettsia. Serial dilutions of the typing sera are added to dots on microscope slides on which intact rickettsia had been fixed with acetone. After washing, the rickettsial cells are viewed with a UV microscope. The highest dilution of serum which fluorescently labels the rickettsia is recorded. Antigenic relatedness was determined by comparing the affinity of a serum for its homologous antigen with its affinity for an experimental antigen.
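The readout described above, the highest serum dilution that still labels the cells, and the comparison of a homologous titer with a titer against an experimental antigen can be sketched as follows. This is an illustration under stated assumptions only: the function names are hypothetical, the readings are invented, and expressing the comparison as a log2 fold difference is a common convention for serial dilutions rather than the specific calculation used by Philip et al. (1978).

```python
import math


def endpoint_titer(readings):
    """Return the highest reciprocal dilution scored positive, or None.

    readings maps a reciprocal dilution (64 means 1:64) to True if specific
    fluorescence was seen at that dilution.
    """
    positives = [dilution for dilution, seen in readings.items() if seen]
    return max(positives) if positives else None


def log2_fold_difference(homologous_titer, experimental_titer):
    """Number of two-fold dilution steps separating the two endpoint titers."""
    return math.log2(homologous_titer / experimental_titer)


# Invented readings for one typing serum against two antigens.
homologous = {64: True, 128: True, 256: True, 512: True, 1024: True, 2048: False}
experimental = {64: True, 128: True, 256: True, 512: False, 1024: False, 2048: False}

h = endpoint_titer(homologous)
e = endpoint_titer(experimental)
print(f"homologous 1:{h}, experimental 1:{e}, "
      f"difference {log2_fold_difference(h, e):.0f} two-fold steps")
```

A small difference would suggest closely related antigens and a large one more distant antigens, but the interpretation depends on how reliably the endpoint can be read.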

Philip et al. tested 72 strains of rickettsia against their own homologous typing serum and against the typing sera of all other isolates. They found that each of the tested rickettsia fell into one of eighteen groups based on cross reactivity of shared antigens. Three of these groups represented the typhus group which was discussed above. The other fifteen groups were all subgroups within the RMSF group. Eight of these groups represented the eight recognized species shown above. (R. bellii was not tested in this survey.) R. rickettsii could be divided into two closely related groups, one of which was only found in Montana and Idaho (Hlp-like strains) and the other of which was found over a wide range which extends from Montana to N. Carolina to Brazil (R-like strains). Three of the groups contained only one member and as such were viewed with some doubt by Philip et al. and were not considered further. This is unfortunate in that one of these singular serotypes has since been repeatedly isolated from Amblyomma americanum ticks and is now considered to be an undescribed new species (Burgdorfer et al., 1981). The remaining three groups are interesting in that each of these groups contains members which were isolated in different years and in different locations. One of these three groups was more closely related to the two R. rickettsii groups and may represent a new strain within this species; however, the other two groups are quite distinct and probably represent previously undescribed species.

Even though the microimmunofluorescence (micro-IF) test is vastly superior to previously used techniques, it still has some shortcomings of its own. The first is that it is somewhat subjective. Determination of which dilution of a particular serum is the last to react with a particular antigen is not always definitive. These tests require a high degree of technical training and practice to distinguish between faint but specific antibody binding and nonspecific background. Different investigators can have different interpretations which can only be resolved by interpreting tests together with other workers until all interpretations are the same. On an international level, this is impractical. Another problem is that various typing sera can have widely divergent titers to their homologous antigen. In the paper by Philip et al., homologous titers ranged from 1:64 to 1:32,768. Only sera with very high homologous titers cross react with all members of the RMSF group (or show the slight cross reactivity which is shared between the RMSF group and the typhus group, which will be discussed in a following paragraph). Another problem with the antisera is that sera from different sources may have different cross-reacting properties. When I started working with rickettsia, many workers including myself used a rabbit derived serum, whose homologous antigen was R. rickettsii, to rapidly screen Rickettsia for the presence of RMSF group specific antigens. This serum reacted equally well with R. akari and R. rickettsii. As will be shown in this dissertation, these two organisms are not the most closely related pair in the RMSF group. This is another example of how cross reactivity between immune sera can lead to inferior taxonomic determinations. Quality of antigens can also be a problem with this test. High levels of contaminating host cell material can lead to high levels of background.

Philip et al. (1978) found evidence for at least two additional species (or serotypes) of rickettsia in the RMSF group in addition to the eight species which had been described at the time of their writing. The question now arises as to exactly how many species of rickettsia there are in the RMSF group. In addition to the two additional members of this group which were discovered by Philip et al., there are probably at least five others. The first of these is called the "East Side Agent" because it has only been found on the east side of the Bitterroot valley in western Montana. This isolate has never been cultivated in vitro in cell culture or fertilized hens' eggs but does occur in up to 80% of the Dermacentor andersoni ticks on the east side of the Bitterroot valley. In guinea pigs, infection with this organism results in a low fever and some antibody response but no severe pathologies (Burgdorfer et al., 1981). Microimmunofluorescence shows that this organism shares some antigens with R. rickettsii. The second undescribed RMSF group Rickettsia was found in 1.8 to 11.7% of the Ixodes ricinus ticks collected in Switzerland (Burgdorfer et al., 1979) and therefore is called the "Swiss agent". This agent infected and killed hens' eggs and Vero (green monkey) tissue culture cells (see Methods section) and infected the tunica vaginalis of the vole, Microtus pennsylvanicus. In the micro-IF test, immune sera raised against the "Swiss agent" cross react with seven members of the RMSF group. This cross reaction was strongest against R. rickettsii antigens.

The third other possible member of the RMSF group is R. bellii (Philip et al., 1983). This species is defined as a group of rickettsial strains which are serologically related to the type strain of R. bellii, strain 369-C. Philip et al. (1983) report the isolation of 263 such independently isolated strains. All of these strains were isolated from ticks, but the host range was surprisingly large in that this organism was isolated from six species of ixodid ticks and two species of argasid ticks. In the micro-IF test, sera raised against individual species of the RMSF group do not cross react with R. bellii strains; however, this is an artifact derived from the use of mouse sera in this test. Other mammals such as rabbits (see above), humans, guinea pigs and goats produce antibodies in response to infection with R. rickettsii which cross react strongly with R. bellii strains. Examination of these data shows that R. bellii strains are very heterogeneous in the degree to which they cross react with anti-RMSF group antisera. Surprisingly, human sera from patients recovering from R. typhi and R. prowazekii infections also very strongly cross react with R. bellii antigens. These findings led these investigators to examine the G+C content of the R. bellii genome and to examine the SDS-PAGE profiles of various type strains and R. bellii isolates. The G+C content of the R. bellii genome is 30.1+/-0.1%, which more closely resembles that of the typhus group than it does that of the RMSF group. The SDS-PAGE profiles showed that the protein profiles of R. bellii strains did not closely resemble the profiles of members of either the typhus group or the RMSF group. Surprisingly, even though the R. bellii protein profiles were very similar to each other, there was also variation between the profiles of various R. bellii isolates. Philip et al. do not discuss the significance of this heterogeneity, but it may be that R. bellii is really an assemblage of closely related species which is no more closely related to the RMSF group than it is to the typhus group.

In addition to the rickettsial isolates discussed above, Robertson and Wisseman (1973) have found evidence of two more species of the RMSF group in isolates of rickettsia collected in southern Asia. These isolates have not been as intensely studied as the other isolates of the RMSF group collected in North America, and their conclusions are based on serological observations.

Given the observed variability in the rickettsial isolates which have been examined, the question arises as to how many species of rickettsia there are. It seems that there are many more than have as yet been described. There are currently eight species in the RMSF group plus R. bellii. In addition, there are at least seven (maybe as many as ten) other rickettsia strains which have already been observed and which may be new species as well. Some of the previously described species (particularly R. bellii) may have to be subdivided. Finally, many species of blood feeding ticks (especially those which do not feed on humans) have not been surveyed for rickettsial symbionts. There may be many more rickettsial species waiting to be discovered. The eight described species may only represent a small fraction of the total number of species.

Another important characteristic of RMSF group rickettsia is their respective invertebrate hosts. All of these species and unclassified strains are symbiotic in arthropod species within the order Acari (ticks and mites). In fact, all species except two occur in ticks of the family Ixodidae, and within this family there is an obvious tendency for Rickettsia to be found in members of the genus Dermacentor. (Other genera are also represented, and the over representation of Dermacentor may be due to investigators concentrating their search efforts in this genus. Dermacentor ticks are also easy to collect; in fact, these ticks will seek you out.) The two species of Rickettsia which have invertebrate hosts outside the family Ixodidae are R. bellii and R. akari. As mentioned above, R. bellii is found in a wide variety of tick species. Most of these tick species occur within the family Ixodidae (and the genus Dermacentor), but at least two species within the family Argasidae are also hosts. The other rickettsial species which has an invertebrate host outside the family Ixodidae is R. akari. "Akari" is the Greek word for mite. R. akari is vectored by a mite parasite of mice, Allodermanyssus sanguineus.

An interesting phenomenon about the relationship between these rickettsia and their arthropod hosts is that, by and large, the rickettsial infection is not that detrimental to the ticks. Female ticks can pass the organism to their offspring transovarially with high efficiency. Ticks which are infected with a Rickettsia when they hatch can easily survive to adulthood and pass the infection to their offspring transovarially. Burgdorfer and Brinton (1975) present evidence that continuous transovarial transmission of R. rickettsii will begin to have a detrimental effect on the tick population after five generations. (The generation time for Dermacentor andersoni, the host species they examined, is two years.) I am not convinced. Even if R. rickettsii does adversely affect its tick hosts after a decade, there is no reason to believe that any other species of rickettsia does also.

Another interesting phenomenon associated with Rickettsia of the RMSF group (and the other two groups as well) is that some of them cause serious pathologies in humans. Clinical manifestations of all diseases caused by infection of humans with pathogenic members of the genus Rickettsia are about the same. These pathogens infect and cause hyperplasia in the endothelial cells of the vascular system (Silverman and Bond, 1984). These infections tend to cause focal lesions in capillaries which basically leak RBCs. These foci have the appearance of a rash when viewed on the skin. This rash is a general characteristic of rickettsial derived pathologies in humans. The destruction of one's capillaries, however, has more serious consequences than just a rash. Patients with rickettsial diseases also have severe headaches, high fevers, stupor, and intravascular thrombosis, which can lead to a variety of serious pathologies including the usually fatal condition of intravascular coagulation. Since no new diseases caused by members of the genus Rickettsia have been described since 1950, the details of these pathologies can be found in any good reference text (for example Davis et al., 1980).

While all pathogenic members of the genus Rickettsia have pathologic syndromes which are variations on a common theme, there is great species and strain variation in the severity of these pathologic syndromes. R. prowazekii (the causative agent of epidemic typhus) and R. rickettsii (the causative agent of RMSF) cause by far the most catastrophic pathologies in humans. Mortality in humans who are not promptly treated with antibiotics such as tetracycline or chloramphenicol is high. The other members of the typhus group and the RMSF group usually cause less serious pathologies in humans. R. akari, R. australis, R. conorii, and R. siberica cause rickettsialpox, Queensland tick typhus, boutonneuse fever, and Siberian tick typhus, respectively.

The reason that this list of human pathogens is

presented in a dissertation, which is not concerned with

mechanisms of virulence, is that there are a number of

RMSF group Rickettsia which are conspicuously absent

from this list. These are IK rhipicephali, R . montana, R . parkari, the "East Side Agent", the 117

"Swiss Agent", and JK bellii. None of these organisms has ever been linked to human disease even

though humans are often bitten by ticks which are infected with these organisms. Some of these Rickettsia do cause mild pathologies in laboratory animals and all except the "East Side Agent" can infect and destroy fertilized hens' eggs and tissue culture cells (Anacker et al., 1980). IK montana, for example, has never been shown to cause any pathologies in mammals (or ticks) even though it can chronically infect rodents of the genera Microtus and Peromi3cus (Bozeman et al., 1967).

The undescribed species found in A. americanum ticks

(prototype strain WB-8-2) (Burgdorfer et al., 1981) is another good example. This strain infects and kills fertilized hens' eggs and tissue culture cells but is avirulent in all laboratory animals tested. Therefore, there is a continuum of virulence of RMSF group

Rickettsia for mammals such as humans. The ability to cause disease in humans (or other mammals) is not a general characteristic of rickettsial species. Virulence is, however, the key characteristic which brings the existence of rickettsia to our attention.

The last characteristic of the RMSF group rickettsia which will be discussed is the SDS-PAGE protein profiles of these organisms. The protein phenotypes of only a

small number of the hundreds of independently isolated

strains of rickettsia within the RMSF group have been

examined; however, these examinations do include

representative isolates of all eight of the recognized

species of the RMSF group plus R. bellii (Eisemann and

Osterman, 1976, Pedersen and Walters, 1978, Anacker et

al., 1980, Philip et al., 1983, Anacker et al., 1984,

Anacker et al., 1984, and Anacker et al., 1986). Several

of the unclassified strains have also been examined (see

Philip et al., 1978) but will not be discussed here.

Taken as a whole, these reports show that there is a

striking similarity between the protein profiles of R.

rickettsii, R. sibirica, R. conorii, R. parkeri,

R. rhipicephali and R. montana. Most of the bands

seen on these gels are shared between several species

within this group; however, only a small number of bands

are clearly shared between all members. R. akari shares

at least five bands with the other members of this group,

but its overall pattern is more divergent than that shown

by the other six species. As discussed previously

(Philip et al., 1983), the protein profiles of selected

strains of R. bellii are not similar to the protein

profiles of either the Typhus group or the RMSF group.

It is also interesting that, except for R. bellii,

for those species from which more than one strain has

been examined, there is little or no observable intraspecies variation. The best example of intraspecies variation within the RMSF group rickettsia concerns the two recognized strain variants within the species R. rickettsii (Philip et al., 1978, Anacker et al., 1984, and Anacker et al., 1986). These two groups of strains are called the "R-like" strains, which all appear to be virulent pathogens in humans and laboratory animals, and the "HLP-like" strains, which appear to be somewhat less virulent. On carefully run SDS-PAGE gels one difference in electrophoretic patterns can be seen. More significantly, when monoclonal antibodies were raised against these two strains, it was found that a small number of these monoclonal antibodies recognized strain-dependent epitopes on two proteins (120 kDa and 155 kDa).

Most monoclonal antibodies reacted equally well with these proteins from both strains. While these proteins have identical electrophoretic mobilities and share many antigenic determinants, it is clear that they are not identical in these two strains. It is not known if the observed differences in these two proteins can account for the apparent strain differences in virulence.

Recently (Anacker et al., 1984, Hanson, 1983, and G.

Dasch, personal communication), a technical problem related to electrophoretic profiles of rickettsial proteins has come to light. This problem results from the finding that the migration of some rickettsial proteins on SDS-PAGE varies dramatically with the temperature at which the proteins are denatured in SDS.

Heating of the gel while it is being run can cause some proteins to smear. On the other hand, strictly controlling temperature during denaturation and electrophoresis can lead to the observation of strain-dependent variation in protein profiles. These investigators believe that this phenomenon may be a useful taxonomic tool in differentiating between closely related strains.

I.C.4. Scrub Typhus

Scrub typhus is the third group within the genus

Rickettsia. This disease is found in Southeast Asia,

Japan, and Northern Australia and has been known for centuries. The pathology of this disease is similar to that of the RMSF group and the typhus group pathogens.

Its invertebrate hosts are trombiculid mites. The larvae of these mites are called chiggers. The life cycle of these mites is somewhat unusual in that only the larvae are parasitic (blood feeding). Nymphs and adults are predatory on insect eggs and soft bodied insects (see

Schmidt and Roberts, 1981 for details of life cycle).

Larvae take only one blood meal. This means that all

infected chiggers, which spread disease to mammals, were

infected transovarially. All of the rickettsial

organisms which cause scrub typhus are currently grouped

together in the species, Rickettsia tsutsugamushi. In

endemic areas, "tsutsugamushi" is the popular name of the

scrub typhus disease, and roughly translates into English

as "mite disease". R. tsutsugamushi infects fertilized

hens' eggs and has morphological and staining

characteristics similar to the other members of the

genus Rickettsia. There is one difference between the

other rickettsia and the scrub typhus agent. This

difference is in the relative thickness of certain fine

structure features in the cell wall of R. tsutsugamushi relative to the other rickettsia (Silverman and Wisseman,

1978). This difference may be relevant to certain

technical problems in isolating this rickettsia which will be discussed later in this section. There are very few physiological data relevant to scrub typhus agents, but there is an extensive older literature on the epidemiology and immunology of this disease (for example,

Bell et al., 1946, Rights et al., 1948, Bennett et al.,

1949, and Irons, 1946). To summarize the data, there is a great deal of heterogeneity in this group relative to

virulence and antigenicity. Recovery from infection caused by one of these rickettsia confers lifelong immunity to reinfection by that same serotype but does not give lifelong protection against the other strains. Humans can therefore suffer from scrub typhus more than once when the infections are caused by different strains. Three common strains (Karp, Gilliam, and Kato strains) and several less common strains have been recognized. These strains differ in virulence as well as antigenicity.

An interesting phenomenon has been observed relative to the virulence of the Gilliam strain in different strains of inbred mice. In some strains of mice, the

Gilliam strain is very pathogenic while other strains of mice are naturally resistant (Groves and Osterman, 1978, and Groves et al., 1980). This natural resistance was found to be due to a single, autosomal, dominant allele of a locus on Chromosome 5 which is closely linked to the retinal degeneration locus. The function of the product of this locus is unknown. A similar phenomenon may also be occurring in humans infected with RMSF. The mortality rate for black males infected with RMSF (18.3%) is about three to four times the fatality rate for other racial groups or black females. The fatality rate among black males may even be higher in that there is a form of RMSF which is rapidly fulminating and is frequently

misdiagnosed because patients die before the rash

appears. Such cases are probably under reported. There

is now some evidence that increased mortality among

black males may be associated with a deficiency for the

enzyme glucose-6-phosphate dehydrogenase (G6PD) (Walker

et al., 1981). This is a recessive sex-linked trait.

Another good example of how host physiology can affect

the outcome of a rickettsial infection has recently been

reported by Todd et al. (1982). These workers developed

a cell line from the vole, Microtus pennsylvanicus.

This vole is known to become chronically infected with many

rickettsial species. Todd et al. found that the cell

line derived from this vole could also become persistently infected with rickettsia. The infected cell lines continued to divide. These same isolates of rickettsia produce devastating cytopathologies in other cell lines. These three examples underscore the importance of host physiology in determining the pathologic consequences of symbiotic relationships.

Until recently, almost nothing was known about the biochemistry or genetic variability of R. tsutsugamushi.

The reason for this can be explained by the technical problems one encounters when one attempts to isolate this organism. Every paper which I have read on this subject begins with a paragraph which reads very much like the following one taken from Tamura et al. (1985),

"The study of these phenotypic variations at the

molecular level is complicated by the difficulty in

obtaining R. tsutsugamushi in pure form since the

rickettsia organism is very fragile, sticks to host

components, and therefore not amenable to purification."

In the last few years, however, there have been technical

improvements in in vitro cultivation of R.

tsutsugamushi in tissue culture cells (Wiesse et al.,

1973, and Hase 1983), radiolabeling rickettsial proteins

in the presence of antibiotics which inhibit host cell

translation (Hanson and Wisseman, 1981, and Hanson,

1983), and in the use of Percoll gradients to isolate

intact rickettsia (Tamura et al., 1985). These

investigators found that there are many similarities among the electrophoretic profiles of proteins from R.

tsutsugamushi strains but that all of the strains are distinct.

On western blots, these authors find that these strains share some but not all antigenic determinants. This situation is similar to what has been observed for the various members of the RMSF group. An interesting observation was made by T. Hase (1983) from examining electron micrographs of infected tissue culture cells: R.

tsutsugamushi are released from infected cells in small vesicles which are bounded by host membrane. There is

some photographic evidence that these vesicles are the infectious units of R. tsutsugamushi. If this is true, then this rickettsia never leaves the host cell cytoplasm. It is always endosymbiotic.

I.C.5. Antigens shared between different groups within the genus Rickettsia

As has been described in the previous sections, the members of the genus Rickettsia have been divided into three groups based on cross reactivity of antisera raised against particular rickettsial species. While it is true that these cross reactions are strongest between members of the same group, it is also true that most mammals which are infected with one rickettsia species produce small amounts of antibodies which cross react with all rickettsial species. It has been known for some time that humans who are recovering from typhus infections also develop antibodies to R. rickettsii (Bengtson,

1945). It is also known that guinea pigs infected with

R. rickettsii are resistant to infection with R. typhi

(Parker, 1943). The details of this cross reaction have been studied by Ormsbee et al. (1978). This group studied the immune sera of 55 patients recovering from R. prowazekii infections and 15 patients who were recovering from RMSF. They found that 36 of the patients recovering

from typhus produced antibodies (IgG only) which cross

reacted with several species in the RMSF group. About

half of patients who were recovering from RMSF produced

antibodies (IgG and/or IgM) which cross reacted with at

least one of the three species in the Typhus group.

These cross reacting antibodies could be detected at

dilutions of the patients' sera which averaged from 1:64

to 1:256, and these titers are therefore significantly higher than

background. While only a fraction of all patients

developed cross reacting antibodies, these results

demonstrate that there are antigenic determinants which are shared between all the members of both the RMSF group and the Typhus group.

The fact that all rickettsia share some antigens can also be demonstrated by the Weil-Felix test. Weil and Felix (1916) observed that some patients recovering from any rickettsial disease

(including disease caused by some strains of R. tsutsugamushi) produce antibodies which cross react with some strains of Proteus

(strains Proteus OX19, OXK, and OX2). For many years,

this cross reaction was the basis for clinical diagnosis.

It is no longer used because rickettsial antigens are easier to produce and the Weil-Felix test gives an unacceptably high rate of false negatives (Hechemy et

al., 1979). It has also been observed that not all species of the genus Rickettsia induce production of antibodies which cross react with all three of the strains of Proteus listed above; however, even though some patients do not develop these cross reacting antibodies, these data indicate that there are antigenic determinants which are shared by the members of the genus

Rickettsia.

I.C.6. Rickettsial physiology

Members of the genus Rickettsia (and some possibly related species in the tribe Wolbachieae) are unique in that they are the only organisms known to thrive free in the cytoplasm of eucaryotic cells. All other intracellular symbionts (with the possible exception of the amastigote stage of Trypanosoma cruzi) are surrounded by membranes of host cell origin. This unusual niche has led many investigators over the years to investigate the physiology of these organisms. Until the mid 1970's, however, cultivation and isolation technologies were inadequate, and progress was slow. One very notable exception to this was the pioneering effort by Weiss et al. (1967). Those workers found that R. rickettsii can take up and utilize glutamate, glutamine,

and pyruvate. The process requires oxygen and generates

carbon dioxide from reduced carbon found in these

substrates. Glucose is not taken up or utilized by

rickettsia. These results should be interpreted with some

caution because rickettsia cannot survive outside of

their host cells; the cells analysed were dying during

these experiments. For historic perspective, it is

interesting that those investigators handled thousands of

R . rickettsii infected fertilized hens' eggs to get

enough rickettsia to do these experiments. They isolated

only 0.05-0.1 mg of rickettsia per egg.

The results of Weiss et al. (1967) suggest that

rickettsia have a fully functional tricarboxylic acid cycle

and oxidative phosphorylation capabilities but lack glycolysis. This was confirmed when Coolbaugh et al.

(1976) examined extracts of R. typhi for specific

enzymatic activities. They found that this rickettsia

contains high levels of malate dehydrogenase activity and moderate levels of activity for glutamate-oxaloacetate

transaminase, glutamate, succinate, and isocitrate

dehydrogenases, and citrate synthase. No activity was

found for glucose-6-phosphate dehydrogenase,

fructoaldolase, phosphoglucose isomerase, or pyruvate kinase. These findings confirm that rickettsia lack

glycolysis but do have a tricarboxylic acid cycle and

suggest that glutamate is a good substrate for

rickettsial metabolism. Weiss et al. (1967) found that

glutamate was oxidized more vigorously than glutamine or

pyruvate.

The findings of Coolbaugh et al. (1978) have been

expanded upon by Phibbs and Winkler (1981, and 1982).

Their work and the work discussed in the next paragraph were facilitated by the development of Renografin density gradient centrifugation as a method of purification of rickettsia (Smith and Winkler, 1979). This method greatly improves the separation of rickettsia from host

cell material but does not dramatically improve the yield. Phibbs and Winkler partially purified the enzymes citrate synthase and malate dehydrogenase, and the enzyme

complexes for pyruvate dehydrogenase and 2-oxoglutarate

dehydrogenase from R. prowazekii. Inhibitors of

these enzymes were also investigated. The physical location of some of these enzymes is known (Smith and

Winkler, 1979). Smith and Winkler were able to separate

R. prowazekii into various cell fractions. They found that the succinate dehydrogenase was isolated with the cell membrane but that glutamate-oxaloacetate

transaminase and malate dehydrogenase were soluble in the interior of the rickettsia.

The citrate synthase, which they isolated, had several unusual properties. First, its molecular weight was only 62 kDa. This is only about one quarter of the size of this enzyme in other gram negative bacteria but is about the same size as this enzyme isolated from gram positive bacteria and mitochondria. Also, the enzyme isolated from R. prowazekii is sensitive to strong competitive inhibition by ATP but is not inhibited by

NADH or alpha-ketoglutarate. This is the opposite of the situation common to other gram negative bacteria but is the same as the situation for gram positive bacteria and mitochondria. The R. prowazekii gene which encodes this citrate synthase has been cloned and expressed in E. coli (Hood et al., 1984). This was done directly by transforming a Sau3A partial library ligated into pBR322 into a citrate synthase deficient strain of E. coli.

The enzyme produced in E. coli was smaller than the enzyme isolated from rickettsia (possibly indicating that

E. coli has some difficulty in translating the rickettsial gene) but was still inhibited by ATP.

Other investigators (Zahorchak and Winkler, 1981) have examined rickettsia for the presence of an ATPase which couples the oxidation of tricarboxylic acid cycle intermediates to the production of ATP. Such an ATPase activity has probably been found. It requires an intact rickettsial membrane and is sensitive to inhibition by dicyclohexylcarbodiimide (DCCD). These properties are

shared with other ATP synthases.

Another extremely interesting feature of rickettsial

physiology is that rickettsia are able to import ATP, ADP

and NAD directly from their extracellular environment

(i.e., their host's cytoplasm) (Winkler, 1976, Williams

and Weiss, 1978, Atkinson and Winkler, 1981). For ATP

and ADP this is a carrier-mediated system in which, for

every molecule of ATP or ADP which comes in, one molecule

of ATP or ADP must go out. Also, metabolic inhibitors

such as 2,4-dinitrophenol, cyanide, and carbonyl cyanide m-chlorophenylhydrazone did not inhibit this process. In

fact, conditions which were known to inhibit rickettsial metabolism were shown to increase the rate of ATP/ADP

transport. AMP is not transported by this system.

ATP/ADP facilitated transport systems are not common in

nature but are known to occur in mitochondria. The mitochondrial ATP/ADP transport system is sensitive to

inhibition by atractyloside and bongkrekic acid which, however, do not inhibit the rickettsial transport system.

Recently (Krause et al., 1985), the gene encoding the R.

prowazekii ADP/ATP translocator has been cloned into a cosmid library transformed into E. coli in a manner analogous to what was done with the citrate synthase gene. NAD can also be directly transported into rickettsia.

The details of this process are not known. It is interesting, however, that, when ATP concentrations are low, this NAD is quickly metabolized to AMP. When ATP levels are high, NAD accumulates. This study and the ones cited above all serve to illustrate the central role that ATP concentrations have on regulating rickettsial metabolism.

I.C.7. The genus Rochalimaea

The genus Rochalimaea currently consists of two species, R. quintana (the causative agent of trench fever) and R. vinsonii (which does not appear to cause any disease in mammals). Both species are ecologically associated with voles. They have been shown to be related by DNA hybridization (Myers et al., 1979) and physiology (reviewed by Weiss and Dasch, 1982). Neither species can utilize glucose (Weiss and Dasch, 1978), but both can utilize glutamate, pyruvate and glutamine under appropriate conditions (Huang, 1967). These results are obviously similar to those obtained with members of the genus Rickettsia. R. quintana has also been shown to be related by DNA homology to members of the genus Rickettsia (Myers and Wisseman, 1980) even though the R. quintana genome is more G+C rich than the rickettsial genomes tested (39% vs. 29%). The genus

Rochalimaea and the genus Rickettsia also have similar morphological and staining characteristics. The two important differences between the genera are that

Rochalimaea species are not intracellular symbionts, and they can also be grown axenically in complex media

(Vinson and Fuller, 1961).

An extremely important finding relative to the evolutionary origin of rickettsia has recently been made by Weisburg et al. (1985). They sequenced the 16S ribosomal RNA from R. quintana. Their surprising finding is that, based on sequence comparisons with this gene, R. quintana is very closely related to the alpha subdivision of the gram negative eubacterial purple bacteria. Typical members of this subdivision are

Agrobacterium tumefaciens and Rhizobium leguminosarum.

(E. coli is a member of the gamma subdivision of the purple bacteria.) As is well known, A. tumefaciens and

R. leguminosarum are important facultative symbionts of plants. There is, thus, a continuum of symbiotic relationships extant in the alpha subdivision of the purple bacteria. This continuum ranges from free living or facultative symbionts to obligate extracellular symbionts to obligate intracellular symbionts.

I.C.8. Rickettsia are a good model for early mitochondrial evolution

The superficial similarities between mitochondria and bacteria have been noticed for almost a century (Altmann,

1890, and Sagan, 1967). These similarities have led to

the previously controversial hypothesis that mitochondria evolved from endosymbiotic bacteria. This hypothesis states that endosymbiotic aerobic bacteria entered into a mutualistic relationship with early eucaryotes in which the advantages of aerobic metabolism were afforded to the otherwise anaerobic eucaryote in exchange for the protection, mobility and constant intracellular environment offered to the endosymbiont. Over time, genetic information required for an extracellular existence was lost and genetic information required for the replication and metabolic functions of the endosymbiont was progressively moved from the genome of the protomitochondrion into the nucleus. Once in the nucleus, this information was transcribed in the nucleus and translated on cytoplasmic ribosomes after which these proteins were transported back into the mitochondria.

The end result of this process is the modern mitochondrion, which maintains only a very small portion of its original genetic information. Recent studies on the evolution of rRNAs have confirmed this hypothesis

(Kuntzel and Kochel, 1981, and Gray et al., 1984). It

has further been demonstrated that mitochondria evolved

from free living gram negative eubacteria of the alpha

subdivision of the purple bacteria (Yang et al., 1985).

Rickettsia have several features which are remarkably

similar to the hypothesized protomitochondria. First,

they are obligate intracellular symbionts which, like

mitochondria, exist free in the cytoplasm of their host

unbounded by any host derived membrane. Second, they are

related to bacteria in the alpha subdivision of the

purple bacteria, but compared to their free living

relatives in the purple bacteria, the members of the

genus Rickettsia have suffered a large genome reduction

(i.e., 50-60%). Third, they have lost the ability to

metabolize substrates which most free living bacteria can

metabolize (i.e. hexoses). Fourth, rickettsia have

acquired a carrier mediated transport system which allows

them to exchange ATP and ADP with their environment.

Similar transport systems are only known from

mitochondria. Fifth, metabolically important enzymes

(i.e., citrate synthase) have become regulated in a manner much more similar to that of mitochondria than that of

free living eubacteria. All of these features have led

me to the hypothesis that rickettsia are well advanced along the evolutionary pathway leading from free living

bacteria to mitochondria. My hypothesis is not that

rickettsia are the ancestors of mitochondria. Quite the

contrary, mitochondria probably became endosymbiotic

billions of years ago. Rickettsia probably evolved much

more recently. My hypothesis is that rickettsia are

mimicking the evolutionary pathway followed by

mitochondria. The similarity between mitochondria and

rickettsia has been noticed by at least one other

investigator, H. Winkler (1976); however, he hypothesized

that rickettsia might be the ancestors of mitochondria.

When beginning this work on the molecular biology of

rickettsia, it became apparent that relatively little was

known about the molecular biology of this group.

Furthermore, the taxonomy of the RMSF group was confused

due to the ambiguities of the serotyping methods used to

define species. Actual genetic distances between members

of the RMSF group were unknown. I have taken the task of

developing the data base needed to distinguish species of

rickettsia unambiguously and to determine their degree of

relatedness as a first step in studying the molecular

evolution of this group. To do this, I have used a

series of random recombinant clones to examine

restriction site polymorphisms. In collaboration with workers at the Ohio Department of Health, I have also adapted a column chromatography technology for the separation of rickettsia from host cell material which achieves high levels of purity and greatly improves yields.

Chapter II. Methods

II.A. Strains used

Table 3 lists all the respiration competent yeast strains and species used in this study. All yeast strains except laboratory strains of S. cerevisiae were wild type diploid strains. Strain ID41-6/161 is a haploid laboratory strain of the genotype "a" mating type, ade-, lys-. The mitochondrial genome of this strain is wild type for respiratory functions and is var1

(40.0), omega-, cap1-R, oli1-R, and par-R. The map of this mitochondrial genome is presented in Figure 1 and will be referred to as 161 in this text. As this work was in progress, other workers in the laboratory (K.

Joplin, A. Morawiec, and J. Wenzlau) succeeded in moving the mitochondrial genomes of several of these yeast species into the nuclear background of ID41-6/161 by the technique of spheroplast fusion. Because mitochondria proved to be much easier to isolate from an ID41-6/161 nuclear background than from some of these strains (i.e., S. aceti), these fusion products were used to isolate mitochondrial DNA in some experiments. All strains of yeast were maintained on RG plates (see below) to ensure that all experiments were performed with wild type mitochondrial DNA.

All species of rickettsia used in this work are presented in Table 4. All rickettsia examined in this investigation were the kind gift of Dr. W. Burgdorfer at the NIH Rocky Mountain Laboratory or Dr. G. Dasch at the

Naval Research laboratory.

II.B. Media and buffers used

YEPD and RG media were prepared as previously described (Birky, 1975, and Birky et al., 1978).

Semi-solid media contained 1.5% agar.

M-199 and L-15 were purchased from GIBCO and were hydrated as recommended by the manufacturer. Both media were supplemented with 5% heat inactivated fetal calf serum, 1% 200 mM glutamine, and 1% 1 M HEPES. Dulbecco's phosphate buffered saline (with and without divalent cations) was purchased from GIBCO.
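The supplement percentages given above can be converted into working volumes with a short calculation. The following Python sketch is illustrative only and is not part of the original methods. It assumes that each percentage is volume per final volume, and the names supplement_volumes and final_millimolar are hypothetical.

```python
# Supplements as percent (v/v) of the final medium volume.
# Assumption: the percentages in the text are v/v of the final volume.
SUPPLEMENT_PERCENT = {
    "fetal calf serum": 5.0,
    "200 mM glutamine": 1.0,
    "1 M HEPES": 1.0,
}


def supplement_volumes(final_volume_ml):
    """Return ml of each supplement and of base medium for a given final volume."""
    volumes = {
        name: final_volume_ml * percent / 100.0
        for name, percent in SUPPLEMENT_PERCENT.items()
    }
    volumes["base medium"] = final_volume_ml - sum(volumes.values())
    return volumes


def final_millimolar(stock_millimolar, percent_v_v):
    """Final concentration (mM) after diluting a stock to percent_v_v of the volume."""
    return stock_millimolar * percent_v_v / 100.0


if __name__ == "__main__":
    for name, ml in supplement_volumes(500.0).items():
        print(f"{name}: {ml:.1f} ml")
    print("glutamine (mM):", final_millimolar(200.0, 1.0))
    print("HEPES (mM):", final_millimolar(1000.0, 1.0))
```

Under these assumptions the final glutamine and HEPES concentrations work out to 2 mM and 10 mM, respectively.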

STV was as follows:

KCl 4 mM

NaCl 136 mM

dextrose 5 mM

EDTA 0.6 mM

NaHCO3 5 mM

Phenol Red 30 mg per liter

Trypsin (GIBCO 1:250) 0.5 grams per liter
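For bench use, the millimolar concentrations in the STV recipe can be converted to grams per liter. The sketch below is a hedged illustration and not part of the original protocol. The molar masses are standard values for the anhydrous salts and anhydrous dextrose, but the form of EDTA is not stated in the text, so the disodium dihydrate salt is assumed here. The function name stv_grams is hypothetical.

```python
# Molar masses in g/mol. Assumption: EDTA is weighed as the disodium
# dihydrate salt; the text does not say which form was used.
MOLAR_MASS_G_PER_MOL = {
    "KCl": 74.55,
    "NaCl": 58.44,
    "dextrose": 180.16,
    "EDTA (disodium dihydrate, assumed)": 372.24,
    "NaHCO3": 84.01,
}

# Concentrations taken from the STV recipe in the text (mM).
STV_MILLIMOLAR = {
    "KCl": 4.0,
    "NaCl": 136.0,
    "dextrose": 5.0,
    "EDTA (disodium dihydrate, assumed)": 0.6,
    "NaHCO3": 5.0,
}

# Components already specified by mass in the text (g per liter).
STV_GRAMS_PER_LITER = {
    "phenol red": 0.030,
    "trypsin (1:250)": 0.5,
}


def stv_grams(volume_liters=1.0):
    """Return grams of each STV component needed for the given volume."""
    grams = {}
    for name, millimolar in STV_MILLIMOLAR.items():
        grams[name] = millimolar / 1000.0 * MOLAR_MASS_G_PER_MOL[name] * volume_liters
    for name, grams_per_liter in STV_GRAMS_PER_LITER.items():
        grams[name] = grams_per_liter * volume_liters
    return grams


if __name__ == "__main__":
    for name, g in stv_grams(1.0).items():
        print(f"{name}: {g:.3f} g")
```

If a different EDTA form was used, only the corresponding molar mass entry needs to change.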

Lysis buffer was made as follows:

NaCl 5 mM

EDTA 10 mM

Tris-Cl (pH=8.0) 10 mM

Brain Heart infusion broth (BHI) was purchased from

DIFCO and was hydrated and sterilized as recommended. LB medium was made as described by Maniatis et al. (1981).

LB plates were made with 1.5% agar. LB amp and LB tet plates were made by adding 50 mg or 15 mg per liter of ampicillin or tetracycline respectively to the media just prior to pouring plates.

All media were sterilized either by filter sterilization or by autoclaving.

II.C. Purification of mitochondrial DNA

The growth of yeast cultures, isolation of mitochondria and purification of mitochondrial DNA by

isopycnic centrifugation in CsCl/bis-benzimide gradients

were performed as described by Hudspeth et al. (1980).

Small scale (i.e. 1.5 ml) preparations of total cell DNA

(minilysates) were performed by the method of Jacquier

and Dujon (1985).

II.D. Tissue culture, growth of rickettsia and

purification of rickettsial DNA

All rickettsia were grown in Vero (monkey) cells.

This cell line has been in culture for over eighteen

years. The stocks used in these experiments had been in

continuous culture for over three years. Vero cells were

maintained in M-199 supplemented as described above in

disposable polystyrene tissue culture flasks manufactured

by Corning. Vero cells were maintained at 37 degrees C.

Subculture of Vero cells was accomplished by rinsing monolayers of cells with Dulbecco's phosphate buffered

saline (PBS) without divalent cations (purchased from

GIBCO). After rinsing, the cell sheets were overlaid with 0.5 ml/25 square centimeters of STV and incubated at

37 degrees C until the cells had rounded up as observed

under an inverted phase contrast microscope (manufactured

by Nikon). The trypsin solution was then decanted and

the cells were displaced from the polystyrene by shaking the flask. Dislodged cells were then mixed with fresh

M-199 medium and transferred into new flasks as required.

Subculture dilutions were 1:4 to 1:12 depending on how many flasks were required and how many days between subcultures were desired. Cells were subcultured every three to five days as required. All media and buffers used in Vero cell cultivation were filter sterilized with nitrocellulose membranes with an average pore size of

0.22 microns.

Growth of rickettsia in tissue culture has been previously described (Anacker et al., 1974, Cory et al.,

1974, Burgdorfer et al., 1975, and Wike et al., 1972).

In this study, M-199 was poured off confluent monolayers of Vero cells. Inocula (usually in volumes of less than

2.0 ml) of rickettsia were then added to the Vero cells and allowed to incubate at 32-34 degrees C for 30-60 minutes. After this time, L-15, which had been supplemented as described above, was added to these cultures so that there was 1.0 ml of L-15 for every 5.0 square centimeters of monolayer surface area. All rickettsia examined in this study were cultured in Vero cells incubated at 33+/-1 degree C in L-15 medium.

Preliminary work, which is not described here, indicated that these conditions were superior to other cell lines or primary cultures, higher temperatures and other media.
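The per-area volumes quoted above (0.5 ml of STV per 25 square centimeters for subculture, and 1.0 ml of L-15 per 5.0 square centimeters after inoculation) scale directly with flask size. The following sketch is only an illustration of that arithmetic. The function names are hypothetical, and a split of 1:N is assumed to mean that the cells from one flask seed N flasks of the same size.

```python
# Per-area volumes taken from the text.
STV_ML_PER_CM2 = 0.5 / 25.0   # subculture overlay
L15_ML_PER_CM2 = 1.0 / 5.0    # medium added after inoculation


def stv_volume_ml(flask_area_cm2):
    """STV overlay volume (ml) for a monolayer of the given area."""
    return flask_area_cm2 * STV_ML_PER_CM2


def l15_volume_ml(flask_area_cm2):
    """L-15 volume (ml) for an infected monolayer of the given area."""
    return flask_area_cm2 * L15_ML_PER_CM2


def flasks_after_split(source_flasks, split_ratio):
    """Number of same-size flasks seeded at a 1:split_ratio subculture.

    Assumption: 1:N means one confluent flask seeds N new flasks.
    """
    return source_flasks * split_ratio


if __name__ == "__main__":
    for area in (25.0, 75.0, 150.0):
        print(area, "cm2:", stv_volume_ml(area), "ml STV,",
              l15_volume_ml(area), "ml L-15")
```

For a 150 square centimeter flask this gives about 3 ml of STV and 30 ml of L-15.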

Vero cells became maximally infected in three to seven days depending on the number of plaque forming units of rickettsia in the inoculum and on which species of rickettsia was being grown. Most rickettsial cultures were grown in 150 square centimeter tissue culture flasks.

To begin the harvest of the rickettsia, the medium from infected cultures was poured into sterile 250 ml centrifuge bottles. To the culture flasks was added 3.0-5.0 ml of STV. Flasks were incubated with trypsin at 37 degrees

C until most cells became detached. Detached cells were pooled with the medium from the infected culture and centrifuged at about 8000 x g for 15 minutes. The supernatant was poured off and discarded. The pellet was resuspended in 1/20 its original volume in either K36 buffer (Weiss et al., 1967) or Dulbecco's PBS without divalent cations. (All later experiments were done with

PBS instead of K36.) To this suspension, 5.0 mg/ml of trypsin (GIBCO 1:250) was added. This trypsin plus cell suspension was then swirled continuously at room temperature or 28 degrees C until all the stringy and gelatinous coagulants were dispersed (about 35 min.). The action of the trypsin was then stopped by adding an equal volume of BHI.

The rickettsial suspension was then centrifuged at 350 x g for five minutes (low speed spin) to remove host cell debris. The pellet was set aside. This step was repeated until no pellet formed after centrifugation. The supernatant was then stored on ice until the following steps were completed. Microscopic observation of Gimenez stained samples showed that the pellets from the low speed spins contained vast numbers of host cell nuclei and rickettsia (see Results). These rickettsia were associated with pieces of cytoplasm. Chip Pretzman at the Ohio Department of Health has devised a simple technique which allows the separation of these rickettsia from the host nuclei. It was found that resuspending the low speed pellets in distilled water for 5 minutes followed by addition of 1/10 volume of 10x Hanks salts resulted in liberation of the rickettsia while lysing the nuclei. When centrifuged at 1200 x g, the nuclei pelleted and the rickettsia remained in the supernatant. The supernatants from the various low speed spins were pooled and spun at 9,500 x g for 30 minutes to pellet rickettsia.

The rickettsial pellet was then resuspended in 5-10 ml of L-15 medium plus 10% v/v DMSO. This suspension was then frozen at minus 70 degrees C overnight (or for longer periods as needed). The frozen rickettsial suspensions were thawed slowly by setting out the suspension at room temperature (usually requiring about 30 min.). The rickettsia were then pelleted at 20,000 x g for 20 min. The pellet was then resuspended in PBS or K36 (2.0 ml) and loaded on a Sephacryl S-1000 column (Pharmacia) with a height of about 35 cm and a bed volume of about 70-80 ml. This column had been equilibrated with the same buffer in which the rickettsia had been resuspended. The column was eluted with either K36 or PBS which had been degassed. A low pressure liquid chromatograph apparatus manufactured by Pharmacia was used to control elution and to monitor and record absorbance at 280 nm. Purified rickettsia were concentrated by centrifuging at 20,000 x g for 20 min.

Pelleted rickettsia were then resuspended in lysis buffer (0.2-0.3 ml). To this was added, in order, 1/10 volume of a 10 mg/ml solution of lysozyme, 1/10 volume of a 50 mg/ml stock of Proteinase K, and 1/10 volume of 20% SDS. All three of these solutions were in water. The cloudy rickettsia suspension cleared within seconds after the SDS was added. The lysed rickettsia were then incubated for at least 20 min. (sometimes hours) at 55 degrees C in a water bath. The lysed rickettsia were then extracted with phenol, chloroform:isoamylalcohol (24:1), and then ether. Each extraction was followed by centrifugation in a microfuge for not less than 10 min. Chloroform:isoamylalcohol extractions were repeated until no interface was observed. All samples were extracted twice with ether. After ether extraction the nucleic acids were precipitated by adding 1/10 volume of 3.0 M Na acetate and at least three volumes of ethanol. All ethanol precipitations of rickettsia DNA were incubated at minus 70 degrees C for at least 8 hours. Precipitated DNA was isolated by microfuging for 15 min. Nucleic acid pellets were washed twice with ice cold 70% ethanol and dried under vacuum for at least 25 min.
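The lysis additions described above (1/10 volume each of lysozyme, Proteinase K, and SDS stocks) imply approximate final concentrations, which the following sketch estimates. It assumes that each "1/10 volume" refers to the starting volume of the rickettsial suspension; the text does not say this explicitly, and the function name is chosen here for illustration only.

```python
def final_concentrations(stocks, fraction=0.1):
    """Estimate final concentrations after sequential stock additions.

    stocks   -- dict mapping reagent name to stock concentration
    fraction -- volume of each stock added, as a fraction of the
                starting sample volume (assumption: 1/10 of the
                starting volume for every reagent)
    """
    # Total volume relative to a starting volume of 1.0
    total_volume = 1.0 + fraction * len(stocks)
    return {name: conc * fraction / total_volume
            for name, conc in stocks.items()}


stocks = {
    "lysozyme (mg/ml)": 10.0,
    "Proteinase K (mg/ml)": 50.0,
    "SDS (%)": 20.0,
}
for name, value in final_concentrations(stocks).items():
    print(f"{name}: {value:.2f}")
```

Under this assumption the lysate contains roughly 0.77 mg/ml lysozyme, 3.85 mg/ml Proteinase K, and 1.5% SDS.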

II.D. Separation of restriction endonuclease derived fragments of DNA on agarose gels and band isolations

All restriction endonucleases used in this study were purchased from commercial vendors and used in buffers recommended by them. Agarose used in these experiments was ultrapure grade, purchased from either BRL or IBI.

Agarose gels were made with either TEA or TBE (Maniatis et al., 1982) (final concentration of 1x). Restricted DNA was loaded onto agarose gels which were electrophoresed at voltages of 100 volts or less. Agarose gels were stained with ethidium bromide (2.0 mg ethidium bromide (EB) per liter of water) for 7-10 min. followed by destaining for 10-20 min. in distilled water. DNA was visualized by exposing the gel to UV light.

DNA was band isolated by a protocol modified from Thuring et al. (1975) by cutting the desired band out of the agarose gel and mincing it finely with a sharp and clean blade. The minced DNA band was then frozen at -70 degrees C and thawed three times. The final slurry was then placed in a 5.0 ml syringe with a glass wool plug at the bottom. The plunger of the syringe was then used to compress the slurry until no more fluid could be extracted. Then 50 microliters of lysis buffer was added to the compressed slurry and the plunger was applied again. The fluid from both compressions was pooled and passed through a 0.45 micron pore size filter. The filtered fluid was then extracted once each with chloroform:isoamylalcohol (24:1) and ether and precipitated with ethanol as described in II.C.

II.E. DNA blot hybridizations

DNA blots (Southern, 1975) of agarose gels to nitrocellulose membranes were performed by the bi-directional blotting method of Smith and Summers (1980). All blots were prehybridized in 1x Denhardt's solution with 50% formamide and 100 micrograms of heat denatured sonicated calf thymus DNA. Hybridization was performed at 42 degrees C or at room temperature depending on the expected degree of homology between probes and targets and the relative A+T content of the probes and targets. Washes were in 2x SSC followed by either 0.2x SSC or 0.5x SSC. (1x SSC is 0.3 M NaCl and 0.03 M sodium citrate.)

II.F. Labelling of DNA probes

Recombinant plasmids (see below) used as probes were labeled by first nicking the DNA with dilute solutions of DNase I and then nick translating at 12-16 degrees C with pol I in the presence of deoxyNTPs and alpha P-32 labelled deoxyATP (Rigby et al., 1977). All probes were denatured with heat at 95 degrees C for 10 min. followed by quick chilling on ice prior to adding the probe to the prehybridized filter.

M13 derived probes (see below) were labeled by modifying the protocol for dideoxy sequencing described below (Sanger et al., 1981). Primers were annealed to the ssDNA of the M13 probe by heating a mixture of primer and M13 DNA to 95 degrees C in the polymerase reaction buffer supplied by BRL. This mixture was allowed to cool slowly to room temperature. After cooling, deoxyNTPs were added to a final concentration of 10 microM. 1/10 volume of 100 mM dithiothreitol was also added. Next, 10 microCi of alpha P-32 labelled deoxyATP was added. Finally, 1.0 unit of the pol I large (Klenow) fragment was added and the mixture was incubated for 30 min. at room temperature. The reaction was stopped by adding an equal volume of a 100 mM EDTA solution (pH adjusted to 7.5).

All probes used were run over a "spin" column (Maniatis et al., 1982) to remove unincorporated nucleotides. "Spin" columns were equilibrated with a buffer of 50 mM Tris-Cl (pH 7.5) and 20 mM EDTA.

II.G. Construction of recombinant clones

Three types of vectors were used in this study: pBR322 (see Maniatis et al., 1982), pEMBL derived vectors (Dente et al., 1983), and M13 derived vectors (Messing et al., 1981). All pBR322 and pEMBL vectors were purified on CsCl/EB gradients prior to use. All M13 vectors were purchased. All vectors were cut with the appropriate restriction enzyme(s) and examined on agarose gels to determine the completeness of the reactions. All vectors were then phenol, chloroform:isoamylalcohol, and ether extracted and ethanol precipitated. Dried DNA pellets of cut vector DNA were then brought up in 1.1x ligation buffer salts. Dried pellets of insert DNA were also redissolved in 1.1x ligation buffer salts. Aliquots of insert and vector DNA were then run on agarose gels with various amounts of other DNA of known concentrations.

Comparison of the relative intensity of EB staining allowed for an estimation of the concentrations of both vector and insert DNA. Quantities of DNA as small as 5.0 ng per band could be detected.

Ligations were set up by mixing vector and insert DNAs with additional 1.1x ligation salts so that the following characteristics were achieved: when the vector was cut with a single restriction enzyme, the DNAs were mixed so that there was a three fold molar excess of insert ends relative to vector ends; when the vector was cut with two different restriction enzymes, the DNAs were mixed such that there was an equal molar ratio of vector and insert ends. In all experiments, the concentration of vector was maintained at 10-15 ng per microliter. The mixture of DNAs was then heated to 60 degrees C for 15 minutes and allowed to slowly cool in a heating block to room temperature (usually 45-60 min.). To this mixture, 1/10 volume of 10x ligation additives (ATP, DTT, and BSA) was added. Then 0.2 to 0.5 microliters of T4 ligase (Biolabs, usually 300000 units/ml) was added to the ligation mixture (0.1 microliter/10 microliters of ligation mixture). Ligations were allowed to proceed overnight at 4 degrees C. An aliquot of all ligations was run on an agarose gel to determine that ligation actually occurred.
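The molar ratios of ends described above can be converted into masses of DNA with a short calculation. The sketch below rests on two simplifying assumptions: every linear fragment contributes two ends, so the ratio of ends equals the ratio of molecules, and the mass of double stranded DNA is proportional to the number of molecules times their length. The function name and the 1000 bp insert are hypothetical; 4361 bp is the length of pBR322.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ends=3.0):
    """Mass of insert (ng) giving the requested ratio of insert ends
    to vector ends.

    Assumes both DNAs are linear (two ends per molecule), so the ratio
    of ends equals the ratio of molecules, and that mass scales with
    molecules x length.
    """
    if min(vector_ng, vector_bp, insert_bp, insert_to_vector_ends) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * insert_to_vector_ends * insert_bp / vector_bp


# 10 microliters of ligation mixture at 10 ng vector per microliter
vector_ng = 10 * 10
# Vector cut with one enzyme: three fold excess of insert ends
single_cut = insert_mass_ng(vector_ng, 4361, 1000, 3.0)
# Vector cut with two enzymes: equal molar ratio of ends
double_cut = insert_mass_ng(vector_ng, 4361, 1000, 1.0)
print(round(single_cut, 1), round(double_cut, 1))
```

In this hypothetical case about 69 ng of a 1000 bp insert would be mixed with 100 ng of singly cut pBR322, and about 23 ng with doubly cut vector.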

II.H. Transformation of bacteria

Bacterial strain HB101 was used to propagate pBR322 derived recombinant clones. HB101 cells were made competent by the RbCl method of Hanahan (1983) and were frozen at minus 70 degrees C until needed. Bacterial strains 71/18 and JM101 were made competent by the calcium chloride method (Maniatis et al., 1982). Strain 71/18 and JM101 cells were used to propagate all recombinant clones utilizing other vectors.

Transformations were accomplished by adding ligated DNA to 0.2 ml of competent cells, mixing gently and incubating on ice for 40 min. In each transformation experiment, several different amounts of the ligated DNA were added to different aliquots (0.2 ml each) of competent cells. These amounts varied so that 10-50 ng of ligated vector DNA was added to each tube of competent cells. In a control transformation, 10-100 ng of uncut vector DNA was added to competent cells. After incubating on ice for 40 min., competent cells with DNA were heat shocked at 42 degrees C for two min.

For pBR322 and pEMBL vectors, the heat shock was followed by addition of 0.8 ml of LB medium followed by an additional incubation for one hour at 37 degrees C.

After the one hour incubation at 37 degrees C, pBR322 derived clones were plated out on LB amp plates. The pEMBL derived clones were plated out on MacConkey's agar-amp plates. Both types of plates were then incubated overnight at 37 degrees C. The next day, pBR322 derived ampicillin resistant colonies were counter-screened for resistance to tetracycline. (All pBR322 derived recombinant clones constructed in this study had inserts in the tet-R gene of the pBR322 genome.) For the pEMBL derived recombinants, counter screening was not required because all inserts would disrupt the beta-galactosidase gene of this vector. As such, ampicillin resistant transformants with inserts are white while transformants without inserts are red. Care must be taken to dilute the transformation culture sufficiently so that there are only about 100 transformants per plate because overcrowding can result in false positives.

Once transformants were identified which probably contained inserts, these colonies were grown up in 5-10 ml of LB plus 50 mg/liter ampicillin overnight and their plasmid DNAs were isolated by the rapid lysis by boiling method (Maniatis et al., 1982). Once isolated, these DNAs were cut with appropriate restriction enzymes and run on agarose gels to characterize the inserts. Occasionally, large scale preps of these plasmids were done by the alkaline lysis method described by Maniatis et al. (1982).

Transformation of bacteria with ligated M13 DNA is the same as with other vectors up to and including the heat shock step. After the heat shock, 10 microliters of a 100 mM isopropyl-beta-D-thiogalactopyranoside (IPTG) solution (23.8 mg IPTG/ml of water), 50 microliters of 2% 5-bromo-4-chloro-3-indolyl-beta-D-galactoside (X-gal) (25 mg X-gal/1.25 ml dimethylformamide), 0.2 ml of host bacteria in early logarithmic growth, and 2.0 ml of melted soft agar (0.3% agarose in LB equilibrated at 45 degrees C) are added to the 0.2 ml of heat shocked competent cells. This mixture is gently mixed and quickly poured onto an LB plate. After about 18 hours, transformants which possibly have inserts appear as turbid and colorless plaques. Transformants without inserts appear as blue plaques. This technique works for M13 (and pEMBL) because the inserts are placed in the 5' region of the beta-galactosidase gene of M13 (and pEMBL). IPTG is an inducer of this enzyme, and X-gal is a synthetic substrate which can be cleaved by this enzyme to form a blue pigment. Recombinant inserts interrupt the coding sequence for this gene and prevent expression.
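The stock concentrations quoted above for IPTG and X-gal can be checked with a short unit conversion. This sketch is illustrative only; the function names are chosen here, and the molecular weight of IPTG (238.3 g/mol) is supplied from general knowledge rather than from the text.

```python
IPTG_MW = 238.3  # g/mol; not given in the text, supplied here


def molarity_mM(mg_per_ml, mw_g_per_mol):
    """Convert a concentration in mg/ml to millimolar."""
    return mg_per_ml / mw_g_per_mol * 1000.0


def percent_w_v(mass_mg, volume_ml):
    """Weight/volume percentage (grams per 100 ml)."""
    return mass_mg / 1000.0 / volume_ml * 100.0


# 23.8 mg of IPTG per ml of water
print(round(molarity_mM(23.8, IPTG_MW), 1))
# 25 mg of X-gal in 1.25 ml of dimethylformamide
print(round(percent_w_v(25.0, 1.25), 2))
```

Both recipes agree with the nominal values in the protocol: about 100 mM IPTG and 2% (w/v) X-gal.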

II.I. Bal-31 deletions

Bal-31 was purchased from BioLabs and used according to their recommendations. The pEMBL derived clone, from which the Bal-31 deletions were made, was linearized at the EcoRI site in the polylinker. The degree of Bal-31 digestion was controlled by removing aliquots of DNA from the reaction mixture at five minute intervals and stopping the reaction by adding EGTA. After heat inactivation, phenol extraction and ethanol precipitation, the ends of the digested DNA were "healed" with the large fragment of Pol I and dNTPs. The Pol I was heat inactivated (65 degrees C for 20 min.) and the DNA was cut with HindIII. This mixture was then shotgun cloned into M13-mp18 and transformed into 71/18 cells as previously described.

II.J. Screening colorless M13 plaques for inserts

Colorless plaques from M13 transformations were picked with the aid of a micropipetter and ejected into 2.0 ml of fresh LB medium. These cultures were allowed to grow for 6-8 hours (no longer) at 37 degrees C. After this time, 1.5 ml of these cultures was microfuged for 10 min. to pellet cells. All except 10 microliters of the individual cultures was then placed in the refrigerator and used later as described in the next section. To the remaining 10 microliters, 1 microliter of 20% SDS in water was added to lyse the phage. The lysed phage were then electrophoresed on an agarose gel and the migration of the phage DNA was compared to the migration of other M13 phage DNAs which lacked inserts or had inserts of known size.

II.K. Isolation of single stranded M13 DNA

M13 is also interesting in that, while the "RF" replicative form of this phage inside the infected host cell is double stranded, the virus which is released from the infected cell is single stranded. This single stranded phage is isolated directly from an infected culture by first microfuging 1.5 ml of LB which had been inoculated about 6-8 hours earlier by picking a colorless and turbid plaque with a micropipetter and ejecting the plaque into the LB. After microfuging for 5 min., 1.2 ml of the supernatant was transferred to a different microfuge tube and the rest of the culture including the cell pellet was stored in the refrigerator as a virus stock. To the 0.8 ml of culture supernatant, 0.3 ml of a solution of 20% PEG (polyethyleneglycol) in 2.5 M NaCl was added. This solution was mixed by inverting the tube several times and allowed to stand at room temperature for 15 min. After incubating at room temperature, the phage were collected by microfuging for 10 min. The phage precipitate forms a small but visible pellet. As much of the supernatant as possible was poured off the virus pellet. A tissue was used to remove the last drop. The virus pellet was resuspended in 110 microliters of TES and extracted with phenol, chloroform:isoamylalcohol (24:1), and ether. 1/10 volume of 3.0 M sodium acetate was added to the extracted phage, and the phage DNA was precipitated by addition of 0.3 ml of 95% ethanol. The precipitated single stranded phage DNA (ssDNA) was incubated overnight at minus 70 degrees C, pelleted by microfuging for 15 min., washed once with ice cold 70% ethanol, and dried under vacuum for at least 20 min. The dried pellet was resuspended in 50 microliters of water.

II.L. DNA sequencing of M13 derived ssDNA

Almost all the reagents used in the dideoxy chain termination sequencing protocol used in this study were purchased from BRL as a kit. The universal primers provided by BRL were annealed to the ssDNA M13 recombinants by mixing 5.0 microliters of ssDNA with 2.5 microliters of primer ssDNA, 1.0 microliter of 10x polymerase reaction buffer (100 mM Tris-Cl (pH 8.0), 100 mM MgCl2, 300 mM NaCl), and 4.0 microliters of water. This mixture was heated to 95 degrees C for 5.0 min. and allowed to slowly cool to room temperature (usually about 1 hour). To this cooled mixture, one microliter of 100 mM dithiothreitol in water, 1.0 unit of pol I large fragment (in one microliter of polymerase dilution buffer), and one microliter of alpha-P32-dATP (about 10 microcuries) were added. After these ingredients were mixed, 3.0 microliters of this mixture were combined with 2.0 microliters of each of the four termination mixes, which were made as described in Tables 5 and 6. These termination mixtures are different from those described by the BRL instruction manual and have been modified so as to accommodate the high A+T content of yeast mitochondrial sequences. These reactions were allowed to proceed for 15 min. Then 1.0 microliter of a 0.5 mM dATP solution in water was added to each reaction. The reactions were then allowed to proceed for an additional 15 min. The reactions were stopped by adding various amounts of sequencing loading dye (50% formamide) and heating to 95 degrees C for 5 min. 10 microliters of loading dye were added to the A and T reactions while 20 microliters were added to the C and G reactions. 2.5 microliters of the final denatured DNA plus loading dye mixtures were loaded onto lanes of an 8.0% acrylamide-8.0 M urea gel. Most gels used in this study were run on an IBI sequencing apparatus at 1200 volts. All gels were dried and exposed to X-ray film without an intensifying screen.

If it was found that the termination reactions did not run out as far as was desired, then a small amount (usually one microliter) of additional dNTP stock solution was added to 9.0 microliters of the appropriate termination mix. This additional dNTP stock solution was in the form of a 0.05 mM stock in water.
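The effect of topping up a termination mix as just described can be estimated with a simple dilution calculation. The sketch below does not use the compositions in Tables 5 and 6; it reports only how much the existing mix is diluted and how much dNTP the added stock contributes. The function name is chosen here for illustration.

```python
def top_up(mix_ul, stock_ul, stock_mM):
    """Return (dilution factor of the original mix,
    concentration in mM contributed by the added stock)."""
    total_ul = mix_ul + stock_ul
    dilution = mix_ul / total_ul
    added_mM = stock_mM * stock_ul / total_ul
    return dilution, added_mM


# One microliter of 0.05 mM dNTP stock added to 9.0 microliters of mix
dilution, added_mM = top_up(9.0, 1.0, 0.05)
print(round(dilution, 2), round(added_mM * 1000, 1), "microM")
```

Adding one microliter of the 0.05 mM stock to 9.0 microliters dilutes the components of the mix to 90% of their original concentration and raises the corresponding dNTP by about 5 microM.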

II.M. The Gimenez hemolymph stain

The Gimenez hemolymph stain (Gimenez, 1965) specifically stains rickettsia red with basic carbol-fuchsin, yielding a relatively low percentage of false positives. The stock solution of basic carbol-fuchsin was made by dissolving 10 g of basic carbol-fuchsin in 90 ml of 95% ethanol. To this solution, 900 ml of a solution of 890 ml of water plus 10 ml of liquefied phenol was added. On the day that the stain was to be used, 4.0 ml of the final basic carbol-fuchsin solution was mixed with 10 ml of a 50 mM phosphate buffer (pH 7.4) and filtered through filter paper to remove sediment. The phosphate buffered basic carbol-fuchsin was the working stain solution. Cells to be examined for rickettsia were first air dried on a glass microscope slide and then fixed for 5 minutes with acetone. After fixation, once the acetone had evaporated, the cells were stained for 5 minutes with basic carbol-fuchsin (red) and then counter stained with a 1% malachite green solution in water for 40 seconds. Excess stain was washed off with gently running tap water. The slides were then air dried and examined at 1000x under a compound microscope.

Rickettsia stain red while host cells and most other bacteria stain blue-green.

II.N. The microimmunofluorescence (micro IF) test

This test was performed essentially as described by Philip et al. (1978). A series of 5.0 microliter drops of rickettsial antigens were placed individually in the wells of microscope slides which had been manufactured by Cel-line Associates with a plastic coating which prevents accidental mixing of the drops. These drops were allowed to dry and then were fixed in acetone. Care must be taken to be sure that there is no DMSO mixed with these antigens.

After fixation, the antigens were overlaid with serial dilutions of fluorescein labelled antibodies in PBS and incubated at 37 degrees C in a 100% relative humidity atmosphere for 30 min. After incubation the slides were washed in PBS and examined with a UV microscope manufactured by Zeiss. Antibody titers were indicated as the greatest dilution of the antibody which fluorescently labelled the various antigens.

Chapter III. Results

III.A. The structure of the oxi3 gene in 11 species of yeast

In this section of the Results, the structure of the oxi3 gene of eleven species of yeast will be discussed. It will be shown that the various oxi3 genes differ primarily in the occurrence of optional introns. Restriction endonuclease recognition sites are largely conserved between these species in this gene. This has led to the discovery of previously unobserved introns by determining the distance between conserved restriction sites.

III.A.1. Southern blots probed with intron specific probes

Over the course of this investigation, mtDNAs from these species of yeast were digested with a wide variety of restriction enzymes. When these digestions were analysed, it was consistently observed that no two species produced identical restriction fragment patterns. The enzyme EcoR1 provides a good example of this observation. Figure 6 shows an EcoR1 restriction pattern for the mitochondrial DNAs (mtDNAs) from the eleven species of yeast examined in this investigation. It is quite clear that all of these patterns are different. The two strains which most closely resemble each other are S. aceti and S. capensis. The only clearly observable difference between these two species based on this restriction pattern is that S. aceti lacks one band which S. capensis has. This band closely resembles S. cerevisiae EcoR1 band 8. Another worker (Karl Joplin) has shown that the S. cerevisiae EcoR1 band 8 and the similarly migrating S. capensis band are homologous.

Surprisingly, S. aceti not only does not have this band, it also lacks most sequences which are homologous to S. cerevisiae EcoR1 band 8. The rest of the species also have a few bands which comigrate with or resemble bands derived from the EcoR1 digest of the control strain, 161. This is particularly true of S. cerevisiae EcoR1 bands 7 and 8; however, the observation still stands that these various patterns are quite distinct. This led to the working hypothesis early in this work that these various mtDNAs differ significantly.

The first experiments done in this work were Southern blots of EcoR1 or BamHI digests of these mtDNA samples probed with intron specific sequences representing the five introns of the oxi3 gene found in S. cerevisiae strain D273. The probes for the first four introns (oxi3 I1 through oxi3 I4) were made by cloning intron specific Mbo1 fragments derived from petite genomes into the BamHI site of pBR322. This work was done with Kirk Mecklenburg. The oxi3 I5g probe was constructed by H.-L. Cheng. The genomic locations of these various probes are shown in Figure 7, which is a composite restriction map of S. cerevisiae strains D273 and 161 derived from mapping and sequence data (Bonitz et al., 1980, Hensgens et al., 1983, and Anziano 1984 dissertation). The results of these five Southern blot experiments are shown in Figures 8 through 12 and are summarized in Table 7. Each of these figures shows a different intron specific clone hybridized against the various mtDNAs. These results show that the distribution of introns is unique for each species. Furthermore, the only intron which is present in all strains examined is oxi3 I3. All other introns are apparently optional. (S. aceti was not compared in this original survey because of difficulty in isolating intact mtDNA from it. S. aceti may be identical to S. capensis. Data presented below will show that S. aceti and S. capensis are colinear between oxi3 I3 and the last exon.)

The relative sizes of the EcoR1 and BamHI fragments which hybridized to the various probes are informative. S. cerevisiae strain 161 has five EcoR1 sites in the oxi3 gene. Three of these are in oxi3 I2 while the other two are in the last exon. For most of the signals observed in these experiments, the bands which hybridized to one of the various probes are approximately the same size as the homologous band derived from strain 161. There are four very notable exceptions. These are S. uvarum, S. capensis, S. ellipsoideus, and S. diastaticus. S. uvarum is listed in the ATCC catalog as being a strain of S. carlsbergensis. S. carlsbergensis is known to have only introns aI2, aI3, and aI5g (Sanders et al., 1977, and Grivell et al., 1983) (Figure 1). As shown in Figure 12, the EcoR1 band from S. uvarum which hybridizes to the aI5g specific probe is much smaller than the fragments from the other species which hybridize to this probe. By examining the EB stained agarose gel of the various EcoR1 digests and comparing the band which corresponds to the signal from S. uvarum with the EcoR1 fragments of strain 161 (the sizes of which are all known), it is possible to conclude that the EcoR1 fragment from S. uvarum which hybridizes to the aI5g probe is the right size to be a fragment which extends from the 3' most EcoR1 site in intron aI2 to one of the EcoR1 sites in the last exon in a strain which only has introns aI2, aI3, and aI5g. S. uvarum has these three introns and apparently no others, confirming the arrangement determined by earlier workers.
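Sizing an unknown band against fragments of known size, as was done above with the strain 161 EcoR1 fragments, is commonly approximated by interpolating the logarithm of fragment size against migration distance. The following is a minimal sketch of that general approach, not the procedure actually used in this study; the marker distances and sizes in the example are hypothetical.

```python
import math


def estimate_size_kb(distance_mm, markers):
    """Estimate fragment size by log-linear interpolation.

    markers -- list of (migration_mm, size_kb) pairs for bands of
               known size run on the same gel.
    """
    pts = sorted(markers)
    if len(pts) < 2:
        raise ValueError("need at least two marker bands")
    if len({d for d, _ in pts}) != len(pts):
        raise ValueError("marker distances must be distinct")
    if not pts[0][0] <= distance_mm <= pts[-1][0]:
        raise ValueError("band lies outside the marker range")
    for (d0, s0), (d1, s1) in zip(pts, pts[1:]):
        if d0 <= distance_mm <= d1:
            # Linear interpolation on log10(size)
            t = (distance_mm - d0) / (d1 - d0)
            log_size = math.log10(s0) + t * (math.log10(s1) - math.log10(s0))
            return 10 ** log_size


# Hypothetical markers: 10 kb at 10 mm and 1 kb at 30 mm
print(round(estimate_size_kb(20.0, [(10.0, 10.0), (30.0, 1.0)]), 2))
```

The interpolation is only reliable between neighboring markers, which is why the sketch refuses to extrapolate beyond the marker range.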

The other three species listed above have EcoR1 bands which hybridize to various intron specific probes and which are much larger than those observed in strain 161. These strains also lack aI2 and therefore do not have the EcoR1 sites found in that intron. The next EcoR1 site in strain 161 which is 5' to the ones found in intron aI2 occurs many kb 5' in the vicinity of the 14S-rRNA gene. There are obviously no EcoR1 sites in these strains which are immediately 5' to the oxi3 region. That these strains have an EcoR1 site in the vicinity of the 3' end of oxi3 can be seen in Figures 13 and 14. Figure 13 is an agarose gel stained with EB on which were run the various mtDNAs after having been cut with both EcoR1 and BamHI. Figure 14 is a picture of an autoradiogram of a Southern blot of the gel shown in Figure 13 probed with a recombinant probe (pKM2, see Figure 7) which contains aI5g and some of the flanking exon sequences. All of these species have sequences which hybridize to this probe, including S. ellipsoideus, which does not have aI5g. None of these species contains an EcoR1-BamHI fragment which hybridizes to this probe and which is more than about 1 kb larger than the homologous band from strain 161. Most have smaller fragments. While these data do not prove that the observed EcoR1 sites are in the last exon, they indicate that all of the EcoR1 sites must be at least close to the last exon. Again, S. uvarum is an informative example. As with the EcoR1 derived band, the EcoR1/BamHI band which hybridizes to this 3' oxi3 probe is the right size to be a fragment generated by cutting the BamHI site in oxi3 I3 and the EcoR1 site in the last exon.

From examining Figures 11 and 12 it is also clear that the EcoR1 band which hybridizes to the oxi3 I3 specific probe is the same band which hybridizes to the oxi3 I5g specific probe. These two autoradiograms are identical, being made from the same bidirectional Southern blot. This puts further limits on where the respective EcoR1 sites might be located.

The conservation of restriction sites within homologous introns shared by these various species can be further demonstrated by the data shown in Figure 15. This is an autoradiogram of a Southern blot of Taq1 digests of the various mtDNAs. All of the species show a conspicuous band (marked by the arrow) which corresponds exactly to the homologous band from strain 161. The two Taq1 sites which define this band are at the 5' end of aI5g and in the last exon. Not only are these sites conserved, but since the Taq1 fragments are the same size, the homologous introns are probably colinear.

Taken as a whole, these data are consistent with the hypothesis that the introns and exons shared by these species are highly homologous to each other. All of these species have an intron which is homologous to aI3. Furthermore, all of these species have a BamHI site near the middle of the oxi3 gene. There are not more than five BamHI sites in any of these mitochondrial genomes (Figures 23 and 24). This makes the average BamHI site about 16 kb away from any other. The simplest interpretation is that the observed BamHI site is in the oxi3 I3 homolog as it is in the sequenced S. cerevisiae gene. This is the only BamHI site in the oxi3 region. If this is true, then the EcoR1 sites (or at least one of them) are probably in the last exons of all species examined. In fact, Rosemary Jarrell has cloned a fragment of the oxi3 gene from S. ellipsoideus using this EcoR1 site in this species. For those species which have oxi3 I2, there are also EcoR1 sites in this intron which are analogous to the EcoR1 sites found in this intron in strain 161.
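The average spacing of BamHI sites quoted above follows from dividing the genome length by the number of fragments. In the sketch below the genome is treated as circular, so that n sites give n fragments, and a genome size of 80 kb is assumed; that figure is not stated in this section and is chosen only because it is consistent with the 16 kb spacing given in the text.

```python
def mean_site_spacing_kb(genome_kb, n_sites, circular=True):
    """Average distance between restriction sites.

    A circular genome cut at n sites gives n fragments; a linear
    genome gives n + 1 fragments.
    """
    if genome_kb <= 0 or n_sites < 1:
        raise ValueError("need a positive genome size and at least one site")
    fragments = n_sites if circular else n_sites + 1
    return genome_kb / fragments


# Assumed 80 kb circular genome with five BamHI sites
print(mean_site_spacing_kb(80.0, 5))  # 16.0
```

With fewer than five sites the average spacing would be larger still, which only strengthens the argument that a BamHI site near the middle of oxi3 is unlikely to occur there by chance.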

If one makes the assumption that the observed BamHI site in these oxi3 genes is in the oxi3 I3 homologs and that there is an EcoR1 site in the last exon which is analogous to the EcoR1 site found in strain 161, then the distance between the BamHI site and the EcoR1 site can be diagnostic for the number of introns found in these species. The first two introns in the oxi3 gene of 161 are very large, being about 2.6 kb each. Most of the known introns in laboratory yeast strains are much smaller, being more on the order of 0.8-1.5 kb in length.

With this in mind the data from Figure 14 can be reexamined. This figure shows EcoR1/BamHI double digests of the various species' mtDNAs probed with pKM2. As previously discussed, S. uvarum (lane 2) has a fragment which is the right size to be generated from an oxi3 gene which only has introns oxi3 I3 and I5g (aI2 is 5' to the region shown here). S. coreanus (lane 7) produces a band which comigrates with the homologous strain 161 band (lane 11). This is strong evidence that the two species have oxi3 genes which are colinear in this region. S. coreanus therefore probably has oxi3 I5a and I5b in addition to aI3, aI4, and aI5g. None of the other isolates tested has an oxi3 gene colinear with that of strain 161. This is not too surprising since only S. coreanus and S. diastaticus have introns which are homologous to intron oxi3 I4. What is surprising is that all of the species except S. uvarum and S. coreanus have homologous EcoR1/BamHI fragments which are too large to be colinear with strain 161 except for the deletions of particular introns. In other words, all of these species must have inserts in their oxi3 genes relative to strain 161 in addition to possibly lacking introns found in strain 161.

The clearest example of this is found in S. aceti (lane 10) and S. capensis (lane 4). These two species produce bands which comigrate. In addition, it is known (Figure 11) that S. capensis lacks an oxi3 I4 homolog. Despite this, both species produce an EcoR1/BamHI band which is larger than the related band from strain 161. Furthermore, since S. capensis (and probably S. aceti also) lacks oxi3 I4, the size of the insert(s) in the S. capensis oxi3 gene relative to that of strain 161 must be at least 2.0 kb.
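The size argument above can be expressed as simple bookkeeping: if a fragment is larger than its strain 161 counterpart even though it lacks an intron that strain 161 has, the extra sequence must make up both the size difference and the missing intron. The sketch below illustrates this; the function name and the band and intron sizes in the example are hypothetical and are not measurements from Figure 14.

```python
def min_insert_kb(observed_kb, reference_kb, missing_intron_kb):
    """Lower bound on inserted sequence relative to the reference.

    observed_kb       -- size of the fragment in the species examined
    reference_kb      -- size of the homologous strain 161 fragment
    missing_intron_kb -- lengths of reference introns absent from the
                         species examined
    """
    return observed_kb - reference_kb + sum(missing_intron_kb)


# Hypothetical numbers: band 1.0 kb larger than the reference band,
# while lacking a 1.0 kb intron present in the reference
print(min_insert_kb(6.0, 5.0, [1.0]))  # 2.0
```

The result is a lower bound because any additional deletions in the species examined would require correspondingly more inserted sequence to account for the observed band.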

To further map the large insert found in S. aceti, S. capensis and S. diastaticus, a series of double restriction digests was performed in which one of the restriction enzymes was BamHI. These restriction digests were electrophoresed and blotted to nitrocellulose. The resulting Southern blot was probed with a recombinant clone which contains BglII band 5 from strain 161 cloned into the BamHI site of pBR322 (Figure 16). This BglII fragment extends from aI2 to a BglII site found in exon 5 of strain 161 (see Figure 7). This clone obviously contains all of intron aI3 and exon 4.

The first data to be discussed are lanes 5 through 8. These fragments resulted from a double digest with BamHI and BclI. In strain 161, the relevant BclI sites are in exons 4 and 5. Note that all three species examined and strain 161 produce a small band which hybridizes to the probe, and that these bands comigrate. In strain 161, this band is generated by cutting the BamHI site in aI3 and the BclI site in exon 4. The fact that these bands comigrate suggests further that the BamHI site observed in the other three species is analogous to the BamHI site found in aI3 and that the BclI site found in exon 4 is also conserved between species. These four species are also colinear between these BamHI and BclI sites. The next larger bands in these lanes result from the hybridization of the probe to homologous sequences which lie between the BclI sites. In strain 161 and in S. diastaticus, these sites are in the exons which flank aI4. The fact that these bands comigrate indicates that there are no large insertions or deletions which differentiate these two homologous introns. The surprising finding was that in S. aceti and S. capensis these two BclI sites are much farther apart (2.2 kb instead of 1.3 kb) than in strain 161 or S. diastaticus. Since all of the signals seen in these lanes can be accounted for, this downstream BclI site must be very close to or in the sequences which are homologous to exon 5. The simplest interpretation is that this BclI site is homologous to the BclI site found in exon 5 of strain 161. The next bigger bands seen in this autoradiogram result from cross hybridization between the aI4 sequences in the probe and cob I4. These introns are about 70% homologous. The largest bands in these lanes resulted from the probe hybridizing to sequences which are 5' to the BamHI site.

Double digests with BamHI and HindIII of these same four species are shown in Figure 16, lanes 9-12. Strain 161 and S. diastaticus each have a major signal, and these two signals comigrate. This band in strain 161 results from cutting the BamHI site in aI3 and the HindIII site found in aI4. Both S. aceti and S. capensis yield bands of a slightly larger size which comigrate with each other. Since S. capensis does not have aI4, this BamH1/HindIII fragment must be generated by a novel HindIII site found in these species but not in strains 161, D273, or S. diastaticus. This HindIII site is about 1.4 kb from the BamHI site and is clearly upstream of the exon 5 BclI site discussed in the last paragraph. In addition to the 1.4 kb bands seen in S. aceti and S. capensis, these two species also have weaker signals which correspond to fragments of about 1.8 kb. This fragment is generated by a second HindIII site found in these species but not in either S. diastaticus or strain 161. The distance between the BamHI site in aI3 and the second (3') novel HindIII site found in S. capensis and S. aceti is about 3.2 kb. The next larger signal seen in these lanes results from cross hybridization with cob I4. The largest bands which hybridized with this probe also result from hybridization with sequences which are 5' to the BamHI site.
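The map positions quoted in this paragraph can be checked with a short calculation. The following Python sketch is only an illustration: it assumes that the 1.4 kb and 1.8 kb fragments are adjacent, so that their sizes add, and the labels and the function name are hypothetical.

```python
# Approximate fragment sizes (kb) quoted in the text for S. aceti and
# S. capensis. Treating the two fragments as adjacent is an assumption.
FRAGMENTS_KB = [
    ("BamHI (aI3) to 5' novel HindIII", 1.4),
    ("5' novel HindIII to 3' novel HindIII", 1.8),
]


def cumulative_positions(fragments):
    """Return (label, distance in kb from the first site) for each fragment end."""
    position = 0.0
    ends = []
    for label, size_kb in fragments:
        position += size_kb
        ends.append((label, round(position, 1)))
    return ends


site_positions = cumulative_positions(FRAGMENTS_KB)
for label, kb in site_positions:
    print(f"{label}: ends {kb} kb from the BamHI site")
```

Under this reading, the second novel HindIII site falls about 3.2 kb from the BamHI site, which agrees with the distance stated above.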

An extremely interesting signal can be seen in the S. diastaticus BamH1/HindIII digest shown in lane 9. This signal corresponds to a fragment of about 1.0 kb. As will be shown by the data discussed in the next several paragraphs, this fragment results from cutting the HindIII site in the oxi3 I4 homolog and a HindIII site which is homologous to the 3' novel HindIII site found in S. aceti and S. capensis.

Lanes 1 through 4 of Figure 16 result from BamHI/BglII double digests of these four species. Except for strain 161, none of these species has aI2 and, therefore, none has the 5' BglII site used to make the BglII band 5 probe. Surprisingly, these three species also lack the downstream BglII site found in exon 5 of strain 161. This is one of the very few examples encountered in this study of restriction sites not being conserved between homologous regions of various species.

Next, I determined the distance between the HindIII sites discussed above and the EcoR1 site found in the last exon. To accomplish this, the various mtDNAs were cut with HindIII and EcoR1, electrophoresed (Figure 17), and blotted to nitrocellulose. This Southern blot was then probed with pKM2 (Figure 18). Surprisingly, all of the species have HindIII sites in their oxi3 genes. This includes S. uvarum, which has been shown to have only aI2, aI3, and aI5g. None of these introns has a HindIII site in laboratory strains of S. cerevisiae. The size of the S. uvarum HindIII/EcoR1 fragment which hybridizes to this probe is about 5.0 kb. This would place this novel HindIII site very near the first exon or possibly in the 5' portion of the intron in S. uvarum which is homologous to aI2. This is possibly a second example of the very rare restriction site polymorphisms which exist in these species in shared homologous regions. As for the rest of the species, S. coreanus has a band which comigrates with the homologous band from strain 161 (Figure 18, lane 7). This is again consistent with this species having the same introns in the 3' portion of its oxi3 gene as strain 161 and with the conservation of the HindIII site in the intron in S. coreanus which is homologous to aI4. The S. cerevisiae industrial strain has a homologous band which is larger than the band observed in strain 161. This is consistent with this strain also having a novel HindIII site somewhere near the exon which is homologous to exon 4 in strain 161. As will be shown in the next paragraph, this HindIII site is homologous to the 5' novel HindIII site seen in S. aceti and S. capensis. The rest of the species have HindIII sites which are closer to the EcoR1 site in the last exon than is the HindIII site in aI4 of strain 161. As will also be shown in the next paragraph, all of these HindIII sites can be explained as being homologous to one of the novel HindIII sites observed in S. aceti and S. capensis.

Taken as a whole, the data discussed in the previous paragraphs are consistent with there being two introns in S. aceti and S. capensis which are not found in previously described strains of the genus Saccharomyces. The first of these introns is located between the BclI site found in exon 4 of strain 161 and the BclI site found in exon 5 of strain 161. The second intron is located just 3' of the BclI site in exon 5 of strain 161. Neither S. aceti nor S. capensis has an intron which is homologous to aI4 of strain 161. Furthermore, all of the yeast species investigated in this study except S. coreanus, S. uvarum, and strain 161 have one or both of these previously undescribed introns, as indicated by the presence of anomalous HindIII sites. The verification of this description will be described in the following sections. The BclI to BclI fragment from S. capensis, which includes the 5' novel intron (henceforth called aI3b), and the HindIII to HindIII fragment, also from S. capensis, which spans from the middle of aI3b into the 5' portion of the 3' novel intron (henceforth called aI4b), have been cloned into pEMBL-18 and other vectors. (Henceforth, aI3 and aI4 from strain 161 will be referred to as aI3a and aI4a, respectively.)

Subclones in M13 of the pEMBL clones described in the last paragraph have been used as probes against Southern blots of the mtDNAs of all eleven species of yeast cut with HindIII and BamHI. These M13 derived clones are intron specific probes for these novel introns. The aI3b specific probe (C2) extends from a DraI site near the 5' end of aI3b to the HindIII site in this intron. This HindIII site is located in the middle of this intron. This clone has been completely sequenced and will be discussed in the next section. The aI4b specific probe (Hind16-Hpa13) extends from a HpaII site, which is very close to the 5' boundary of aI4b, to the 3' novel HindIII site, which is 3' to the BclI site found in strain 161 exon 5. A portion of this clone, starting from the HindIII site and reading towards the 5' exon, has also been sequenced. The results of this experiment are shown in Figures 19 and 20. The background bands observed on this autoradiogram are due to minor cross hybridization with other mitochondrial sequences. This cross hybridization resulted from having to use relatively low stringency in performing this experiment because of the extremely high A+T content of these probes. Nonetheless, all of these species except S. coreanus, S. diastaticus, S. uvarum, and strain 161 have an intron which is homologous to intron aI3b. Furthermore, S. aceti, S. capensis, and S. diastaticus have intron aI4b. These results are summarized in Table 7.

The aI4b specific signal observed for S. diastaticus is interesting in that this HindIII band is made by cutting the HindIII site found in the intron which is homologous to aI4a of strain 161 and the HindIII site found in aI4b. It is also interesting that while aI4a and aI4b can exist in the same gene, no strains were observed in which aI3b and aI4a coexist.

There are only two other introns known from yeast oxi3 genes. These are aI5a and aI5b. For completeness, it would have been good to have developed intron specific probes for these introns also; this, however, was not done. The sizes and sequences of these introns are nevertheless known (Hensgens et al., 1983). The distance between the BamHI site in aI3a and the EcoR1 site in the last exon is also known. Furthermore, the distribution of introns homologous to aI3b, aI4a, aI4b, and aI5g among these various species is also known. By adding up the sizes of all the known introns and exons, one can determine how much space is still unaccounted for. As previously discussed, the restriction site data for S. coreanus are consistent with this species being colinear with strain 161 from exon 2 to the end of the gene. S. uvarum is also accounted for and contains only aI2, aI3a, and aI5g. The S. cerevisiae industrial strain is also colinear with strain 161 except that aI3b is substituted for aI4a. For S. ellipsoideus, all of the distance between the BamHI site and the EcoR1 site is accounted for by having only one intron, aI3b, between these sites.

For all the remaining species, there must be one other intron between the BamHI site and the EcoR1 site besides the ones for which there are specific probes. While mitochondrial DNA from S. aceti was not probed with some of the intron specific probes, it is clear from restriction site data that S. aceti and S. capensis are colinear between the BamHI site and the EcoR1 site in the last exon.

One way to predict whether the unaccounted for space in these species' oxi3 genes is occupied by a previously described intron is to examine the various mtDNAs for the presence of rare restriction sites which occur in aI5a or aI5b. If such a restriction site were found in the correct place, it would be strong evidence for the identity of the intron. There is an AvaI site in aI5a in strain 161 which can serve this purpose. In fact, there are only two AvaI sites in the mtDNA of strain 161. The other site is not close to the oxi3 region. As will be discussed in a following paragraph, there are 0-2 PstI sites in these various mtDNAs. Where a PstI site is present, one of them is located about 4.0 kb 3' to the end of the oxi3 gene within EcoR1 band 8 of strain 161. A double digest with PstI and AvaI will therefore produce a characteristic band of about 6.1 kb if the unidentified intron contains an AvaI site. The results of this experiment are shown in Figure 21. The smaller bands seen on this gel resulted from contamination of these mtDNA preps with nuclear DNA. No 6.1 kb bands were observed; however, a 7.5 kb band was observed for strain 161, the industrial S. cerevisiae, and S. coreanus. This is the band that would be expected from those species which have the AvaI site conserved in aI5a in addition to having the 3' PstI site and introns aI5b and aI5g. Since the AvaI site is conserved among species which are known to have aI5a homologs and the other species do not have this site, it is concluded that these other species probably do not have an intron which is homologous to aI5a. The as yet unaccounted for space in the remaining species most probably results from their having intron aI5b. More definitive experiments will be necessary in the future to characterize more accurately the unidentified intron which these other species possess. The tentative distribution of introns aI5a and aI5b is shown in Table 7. It is possible that another, as yet undescribed, intron exists in the position designated as aI5b in some strains.

The last piece of restriction site data discussed in this section concerns the PstI site which occurs in the URF which is 4.0 kb 3' to the oxi3 gene and near the oli2 (ATPase subunit 6) gene. This PstI site is near the middle of EcoR1 band 8 of strain 161 mtDNA and is the only PstI site in that genome. Figures 22 and 23 show a series of BamHI single digests and BamH1/PstI double digests run on agarose gels and stained with EB. The PstI site is located 10.5 kb 3' to the BamHI site in aI3a. It is clear that all of these species except S. aceti and S. diastaticus have at least one PstI site. Another worker (Karl Joplin) has shown that these two species lack the URF as well as the PstI site. This is another example of how these mitochondrial genomes differ by large insertions and deletions.

For the rest of the species, the PstI site is located 3' to the oxi3 gene. This was demonstrated by probing Southern blots of the gels shown in Figures 22 and 23 with a recombinant clone containing EcoR1 band 7 from strain 161. (This clone was made by Karl Joplin.) This EcoR1 fragment is between the BamHI site in aI3a and the PstI site. The results of these experiments are shown in Figures 24 and 25. In several of the lanes of the BamHI only digests, the signals appear faint and/or smeared. This is because the BamHI fragment which hybridizes to this probe is extremely large and is near or above the average size of the uncut DNAs. (MtDNA is somewhat sheared during purification. This results in a relatively low yield of the largest restriction fragments.) Using the sizes of the BamHI fragments from strain 161 as size markers, the sizes of all of the BamH1/PstI fragments from the other species are consistent with the distribution of introns shown in Table 7. Another interesting observation from Figure 22 is that S. capensis seems to have two PstI sites. The location of the other PstI site is not known. S. aceti and S. diastaticus have no PstI sites.

III.A.2. Cloning aI3b and the novel HindIII fragment from S. capensis into pEMBL-18

As described in the previous section, aI3b is bounded by BclI sites found in the flanking exons (see Figure 16). It has also been determined that both aI3b and aI4b have one HindIII site (see also Figure 16). It was therefore decided to clone the BclI fragment, which contains the entire aI3b sequence, and the HindIII fragment, which contains the 3' portion of aI3b, exon 5, and the 5' portion of aI4b, into pEMBL-18. To clone these fragments, advantage was taken of the fact that in equimolar solutions smaller restriction fragments ligate into vector DNAs more efficiently than larger fragments. As shown in Figure 26, when BclI or HindIII digests of S. capensis mtDNA are performed, the two fragments described above are the smallest fragments generated by these two digests, respectively. It was therefore possible to "shotgun clone" BclI and HindIII digests of S. capensis mtDNA into BamHI or HindIII cut pEMBL-18 DNA, respectively. Despite the fact that other mitochondrial DNA fragments were present in these ligation mixes and that the S. capensis mtDNA was contaminated with some amount of S. capensis nuclear DNA, the majority of all transformants which contained inserts had the desired inserts. For the BclI/BamH1 ligation-transformation, plasmid DNAs from 30 white colonies were analysed. Eighteen of these had inserts, of which 13 were the desired BclI fragment. For the HindIII ligation-transformation, sixteen white colonies were analysed. Seven of them had inserts, of which five were the desired HindIII fragment. Figure 26 shows an agarose gel which compares the sizes of the inserts from representative clones with BclI and HindIII digests of S. capensis mtDNA. These samples were also cut with ClaI. There is a single ClaI site in both of these cloned fragments, located just 3' of the common HindIII site.

One other group of clones was made by directly cloning DNA from the S. capensis mitochondrial genome into pEMBL-18. These were made by recutting the isolated smallest BclI-BclI fragment with HpaII. The doubly cut S. capensis DNA was then cloned into pEMBL-18 which had been cut with BamH1 and AccI. BamH1 and AccI produce, in this case, staggered ends which pair with the staggered ends produced by BclI and HpaII, respectively. Sixteen white colonies were picked off of MacConkey agar plates with ampicillin. Of these 16 transformants, 9 had inserts, of which 8 had the desired HpaII-BclI fragments. These recombinants were cut with HindIII, and four of them were found to have the HindIII site which is in the middle of the BclI fragment. This HpaII site is about 0.2 kb 3' to the HindIII site. Sequencing data later showed that the other HpaII site, which forms the HpaII end of the other four transformants, is located about 0.2 kb farther 3' to the HpaII site which is close to the HindIII site.
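The colony counts reported for the three shotgun clonings can be summarized as the fraction of insert-containing transformants which carried the desired fragment. The Python sketch below does only this arithmetic; the dictionary keys and the function name are hypothetical labels, not names used in the experiments.

```python
from fractions import Fraction

# (colonies analysed, colonies with inserts, colonies with the desired insert),
# as reported in the text for each ligation-transformation.
CLONING_RESULTS = {
    "BclI digest into BamH1-cut pEMBL-18": (30, 18, 13),
    "HindIII digest into HindIII-cut pEMBL-18": (16, 7, 5),
    "HpaII-BclI fragment into BamH1/AccI-cut pEMBL-18": (16, 9, 8),
}


def desired_fraction(analysed, with_insert, desired):
    """Fraction of insert-containing clones that carry the desired fragment.

    The number of colonies analysed is accepted only so that the full
    triple can be passed in; it does not enter this particular ratio.
    """
    return Fraction(desired, with_insert)


summary = {name: desired_fraction(*counts) for name, counts in CLONING_RESULTS.items()}
for name, frac in summary.items():
    print(f"{name}: {frac.numerator}/{frac.denominator} = {float(frac):.0%}")
```

In all three clonings the ratio is well above one half, which is what is meant above by the majority of insert-containing transformants having the desired insert.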

The pEMBL derived clones discussed above were the sources of all DNAs which were subcloned into the M13 derived vectors M13-mp18 and M13-mp19 for DNA sequencing. The clones which were used for this purpose were named Bcl-21 and Hind-16, which contained the entire BclI and HindIII fragments, respectively, and HB-7 and HB-10, which contain the HpaII to BclI fragments which contain or lack the 5' HindIII site, respectively.

III.A.3. Recombinant clones used to sequence novel regions of the S. capensis oxi3 gene

All sequencing data shown in this dissertation were obtained by the dideoxy chain termination technique using M13-mp18 (same polylinker as pEMBL-18) or M13-mp19 (same polylinker as pEMBL-18 and M13-mp18 in the reverse orientation). Purified vector DNAs were purchased from New England Biolabs. In some cases, DNA was transferred from the pEMBL derived clones discussed above to an M13 derived vector by simply mixing isolated insert DNA from one of the pEMBL clones with M13-mp18 or M13-mp19 cut with the same restriction enzymes used to cut out the insert DNA. For example, the Bcl-21 insert was isolated by cutting the pEMBL clone with EcoR1 and PstI. These mixtures were then ligated together and transformed into E. coli strain 71/18. In other cases, the insert DNA from a pEMBL clone was first isolated and then recut with a different restriction enzyme before it was cloned into M13-mp18 or M13-mp19. This greatly increased the number of ends from these regions which could be sequenced.

As will be shown later, the Bcl-21 insert (equal to aI3b plus some flanking exon) is extremely A+T rich over most of its length. This dramatically reduces the number of available restriction sites which could be used for subcloning. Either there were no sites for a particular enzyme or there were too many to be useful. Therefore, to complete the sequence of aI3b, it was decided to use Bal31 deletions to cross those regions which could not be read using subclones of known restriction sites. Bal31 is an exonuclease. To generate the deletions, CsCl/EB gradient purified Bcl-21 DNA was linearized at the EcoR1 site at the 5' end of the polylinker of pEMBL-18. Fifty-four micrograms of this DNA was then digested with Bal31. At five minute intervals, aliquots of DNA were removed and the reaction was stopped by adding EGTA. The digested DNA was then extracted with phenol, chloroform:isoamyl alcohol, and ether to ensure that all Bal31 was removed from the digested DNA. Since Bal31 does not digest both strands of the DNA at the same rate, it was necessary to repair the ends with the Klenow fragment of PolI and dNTPs. This makes all the ends blunt. Figure 27 is a photograph of an EB stained agarose gel showing the progressive nature of these deletions relative to the length of time for which the Bal31 digestion was allowed to continue. After the DNAs were repaired with Klenow and the enzyme was removed, these DNAs were cut with PstI. The Bal31 deleted, PstI cut DNAs were then shotgun cloned into M13-mp19 DNA which had been cut with SmaI and PstI. SmaI makes blunt ends which can be ligated to the repaired Bal31 generated ends. Thousands of colorless plaques were generated when these DNAs were used to transform 71/18 cells. Only about fifty of these transformants were actually used to generate single stranded DNA (ssDNA) for sequencing. These clones are named by the length of time for which their insert DNAs were digested with Bal31 and an arbitrary number which represents the order in which the plaque was picked from the LB plate. For example, clone 5-10 has an insert which was digested for five minutes with Bal31 and was the tenth recombinant picked at random from this transformation.
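The naming convention for the Bal31 deletion clones can be made explicit with a small parser. This is only a sketch of the convention as described above; the function name is hypothetical, and apart from 5-10 the clone names in the usage example are invented for illustration.

```python
def parse_bal31_clone(name):
    """Split a clone name such as '5-10' into (minutes of Bal31 digestion, pick order)."""
    minutes_text, pick_text = name.split("-")
    return int(minutes_text), int(pick_text)


# Sorting by the parsed tuple orders clones by digestion time, then pick order.
# Only '5-10' appears in the text; the other names are made-up examples.
example_clones = ["10-3", "5-10", "5-2"]
ordered = sorted(example_clones, key=parse_bal31_clone)
print(ordered)
```

Because longer digestion removes more DNA, ordering the clones by digestion time gives a rough ordering of the expected deletion endpoints.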

III.A.4. The sequence of oxi3-I3b

The strategy used to determine the sequence of aI3b is shown in Figure 28. An important feature of this strategy is that parts of this sequence were determined from M13 clones which can trace their origin back to four different clonings of S. capensis mtDNA into pEMBL-18. In all cases, the sequences are colinear and invariant between these clones over those regions which can be read. Since the Bcl-21 clone insert comigrates with the appropriate BclI fragment from S. capensis mtDNA, and since the parts of the sequence derived from the various subclones of Bcl-21 are colinear with the same sequences determined from other cloning experiments, it is unlikely that the Bcl-21 insert has suffered a rearrangement or other mutation in the process of being cloned into pEMBL-18. Single base changes or frameshifts, however, cannot be ruled out from these data.

The sequence of aI3b and parts of the flanking exons are shown in Figure 29. The entire sequence for the insert in Bcl-21 is 2228 bp long including 150 and 172 bp of the 5' and 3' flanking exons, respectively. Also indicated in Figure 29 is the location of aI4a in strain 161. It is shown in this figure that aI3b is inserted into the exon sequence 12 nucleotides 5' of the location of aI4a. Oxi3 I3b is, therefore, inserted between aI3a and aI4a in the exon sequence. Also noted in this figure is the observation that in S. capensis there are two single base substitutions in the 3' exon sequence.

Oxi3 I3b is 1908 nucleotides in length. This is very long for an intron from the yeast mitochondrial genome. Oxi3 I1 and I2 are the only yeast mitochondrial introns which are longer than aI3b.

III.A.5. Oxi3 I3b has a long ORF which is continuous with the 5' exon

There are three interesting features about the sequence of aI3b. The first is that it contains a 1083 nucleotide (361 amino acids) long ORF which is continuous with the 5' exon. The amino acid sequence of this ORF is shown in Figure 30. This sequence was deduced from the DNA sequence assuming that TGA codes for tryptophan instead of stop (Bonitz et al., 1980), that CTN codes for threonine instead of leucine, and that ATA codes for methionine instead of isoleucine (Hudspeth et al., 1982). Figure 31 shows the distribution of stop codons that occur in all three possible reading frames. There are many stop codons in the other two reading frames over the entire length of this intron. In addition, in the same reading frame as the ORF, there are many stop codons which occur in the 3' portion of this intron after the end of the ORF. All of these data are consistent with the 361 amino acid ORF which is continuous with the 5' exon being the only possible significant protein encoding region within this intron.
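The three codon reassignments used to deduce the ORF, and the kind of stop codon scan summarized in Figure 31, can be illustrated with a short Python sketch. It is not the program used for the analysis: the function names are hypothetical, the table applies only the three reassignments named above to the standard code, and the 12 nucleotide test sequence is invented.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Yeast mitochondrial assumptions stated in the text.
YEAST_MITO_CODE = dict(STANDARD_CODE)
YEAST_MITO_CODE["TGA"] = "W"          # tryptophan instead of stop
YEAST_MITO_CODE["ATA"] = "M"          # methionine instead of isoleucine
for third in BASES:
    YEAST_MITO_CODE["CT" + third] = "T"  # CTN: threonine instead of leucine


def translate(dna, code=YEAST_MITO_CODE):
    """Translate complete codons of dna, starting at position 0."""
    dna = dna.upper()
    return "".join(code[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def stop_positions(dna, code=YEAST_MITO_CODE):
    """Map each frame offset (0, 1, 2) to the 0-based positions of its stop codons."""
    dna = dna.upper()
    return {
        frame: [i for i in range(frame, len(dna) - 2, 3) if code[dna[i:i + 3]] == "*"]
        for frame in range(3)
    }


toy = "ATATGACTTTAA"  # invented sequence: ATA TGA CTT TAA
print(translate(toy))                 # read with the yeast mitochondrial assumptions
print(translate(toy, STANDARD_CODE))  # read with the standard code
print(stop_positions(toy))
```

A frame whose list of stop positions is empty over a long stretch corresponds to an open reading frame; a 1083 nucleotide open frame encodes 1083/3 = 361 amino acids.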

The amino acid composition (as derived from the DNA sequence) and the codon usage for aI3b are shown in Table 8. Included in this table are data taken from Hudspeth et al. (1982), who made a similar analysis for the structural genes of the mitochondrial genome of S. cerevisiae and most of the ORFs found in introns. One gene, var1, is considered separately from the rest of the structural genes because it more closely resembles the ORFs than the genes in these respects. It is clear from this table that the amino acid composition and codon usage of aI3b closely resemble those of other ORFs found in introns. First, if it were to be translated, the predicted protein product of aI3b would be a basic protein. There are predicted to be 53 basic amino acids coded for in the aI3b ORF and only 29 acidic amino acids. This is similar to other ORFs found in introns. Second, except for arginine, threonine, and valine, there is a strong bias, when a choice is possible, for codons which end with uridine. For example, UUU and CCU are the preferred codons for phenylalanine and proline, respectively. This is a characteristic shared by all intron ORFs. Structural genes show no such codon usage preference. Third, AUA is the preferred codon for methionine in intron ORFs and aI3b, while AUG is preferred in gene sequences. Similarly, AAA is the preferred codon for lysine in aI3b and the other intron ORFs, while AAG occurs more often in the gene sequences. It is, therefore, apparent that the ORF of aI3b is similar in amino acid composition and codon usage to the other intron ORFs but is not similar to the structural genes.
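The kind of tally behind a codon usage table can be sketched in a few lines of Python. This is an illustration only, not the analysis of Hudspeth et al.: the function names are hypothetical, the test sequences are invented, and counting histidine among the basic residues is an assumption, since the text does not say which residues were counted as basic.

```python
from collections import Counter


def codon_counts(orf_dna):
    """Count the complete codons of an ORF read from its first nucleotide."""
    orf_dna = orf_dna.upper()
    return Counter(orf_dna[i:i + 3] for i in range(0, len(orf_dna) - 2, 3))


def preferred_codon(counts, synonymous_codons):
    """Return the most frequently used codon among a set of synonymous codons."""
    return max(synonymous_codons, key=lambda codon: counts[codon])


def basic_acidic_counts(protein, basic="KRH", acidic="DE"):
    """Count basic and acidic residues (including His as basic is an assumption)."""
    n_basic = sum(protein.count(residue) for residue in basic)
    n_acidic = sum(protein.count(residue) for residue in acidic)
    return n_basic, n_acidic


toy_orf = "TTTTTTTTCAAAAAAAAG"  # invented: TTT TTT TTC AAA AAA AAG
counts = codon_counts(toy_orf)
print(preferred_codon(counts, ["TTT", "TTC"]))  # phenylalanine codons
print(preferred_codon(counts, ["AAA", "AAG"]))  # lysine codons
print(basic_acidic_counts("KKRDE"))             # invented peptide
```

Applied to the aI3b ORF, a tally of this kind would give the preferred codon for each amino acid and the numbers of basic and acidic residues reported in Table 8.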

An interesting feature of the amino acid sequence of this ORF is that a sequence which matches the LAGLI-DADG (apolar, apolar, glycine, apolar, apolar, Asp/Glu, Gly/Ala, Asp, Gly) sequence found in all class I introns which contain ORFs (Hensgens et al., 1983) occurs between amino acids 129 and 139 of the ORF. This sequence is Phe, Ile, Gly, Phe, Phe, Glu, Ala, Val, and Gly. Written in the form of the LAGLI-DADG consensus sequence, this is apolar, apolar, glycine, apolar, apolar, glutamic acid, alanine, valine, glycine. This matches the consensus sequence perfectly except for the valine residue which occurs in the place of the aspartic acid residue in the consensus sequence. Hensgens et al. also state that the first LAGLI-DADG sequence occurs about one third of the distance from the 5' to the 3' ends of these class I intron ORFs. This is about where this sequence occurs in the aI3b ORF.

An unusual feature of this reading frame is that it may not have the second LAGLI-DADG sequence, which in other class I intron ORFs occurs about 115 amino acids downstream of the first sequence. There is one possible poor match which occurs between amino acids 148 and 158. This sequence is Ile, Ser, Arg, Phe, Thr, Asp, Gly, Glu, and Gly. Written in the form of the LAGLI-DADG consensus sequence, this is apolar, polar, arginine, apolar, polar, aspartic acid, glycine, glutamic acid, and glycine. Only five of the nine amino acids in the consensus sequence match this second possible LAGLI-DADG sequence. Furthermore, while switching glutamic acid for aspartic acid at position eight is a conservative change (it substitutes one acidic amino acid for another), the other three changes are not conservative and dramatically change the polarity and/or charge of this domain. It is therefore possible that this second LAGLI-DADG sequence is degenerate. Alternatively, the similarity of this sequence to LAGLI-DADG may be entirely fortuitous. There is one example of a class I intron ORF having only the first LAGLI-DADG sequence. This is from the 23S rRNA gene of Chlamydomonas reinhardii chloroplasts (Rochaix et al., 1985).
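The two comparisons with the LAGLI-DADG consensus can be reproduced by scoring each nine residue sequence position by position. The Python sketch below is an illustration under one assumption that is not taken from the source: the set of residues treated as apolar (A, V, L, I, M, F, W, P) is a common textbook grouping. The names used are hypothetical.

```python
# Assumption: a common grouping of apolar residues, not defined in the source.
APOLAR = set("AVLIMFWP")

# Consensus as given in the text: apolar, apolar, Gly, apolar, apolar,
# Asp/Glu, Gly/Ala, Asp, Gly.
LAGLI_DADG = [APOLAR, APOLAR, {"G"}, APOLAR, APOLAR, {"D", "E"}, {"G", "A"}, {"D"}, {"G"}]


def consensus_score(peptide, consensus=LAGLI_DADG):
    """Number of positions at which the peptide satisfies the consensus."""
    if len(peptide) != len(consensus):
        raise ValueError("peptide and consensus must have the same length")
    return sum(residue in allowed for residue, allowed in zip(peptide, consensus))


first_motif = "FIGFFEAVG"   # Phe Ile Gly Phe Phe Glu Ala Val Gly
second_motif = "ISRFTDGEG"  # Ile Ser Arg Phe Thr Asp Gly Glu Gly

first_score = consensus_score(first_motif)
second_score = consensus_score(second_motif)
print(first_score, "of", len(LAGLI_DADG))
print(second_score, "of", len(LAGLI_DADG))
```

With this grouping the first sequence satisfies eight of the nine positions and the second satisfies five, in agreement with the counts given above.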

Class I intron ORFs can be divided into three groups based on shared amino acid homologies (Hensgens et al., 1983, and Burger and Werner, 1985). Within each group the ORFs share roughly 20-30% homology. With the aid of the DNAstar program developed at the University of Wisconsin and a minicomputer, the amino acid sequence of aI3b was compared with the amino acid sequence of the ORF of a representative member of each group. These introns were cob I2 and cob I4 from S. cerevisiae, and the N. crassa URF1 intron. No significant homologies were found for cob I2 or the N. crassa intron relative to aI3b. Significant homologies were found in the comparison between cob I4 and aI3b. The homology is maximized by assuming three insertion/deletions of roughly 40 amino acids each. Ample precedents for this assumption (based on comparisons of related introns) are described in the Introduction. These two ORFs are obviously not colinear; however, if the three regions of nonhomology are discounted, then these two ORFs share roughly 25% homology over the remaining roughly 240 amino acids. A summary of this comparison is shown in Figure 32.

III.A.6. Intron I3b contains sequences which are homologous to the conserved cis-acting sequences of class I introns

As described in the last section, the ORF of aI3b is fairly typical of the ORFs found in class I introns. In this section, the sequence of aI3b will be shown to possess the secondary and tertiary interactions and conserved sequence elements which characterize all class I introns. As shown in Figure 29, the last base of the 5' exon is a T and the last base of the intron is a G. This precisely matches the boundary nucleotides of the vast majority of class I introns (Nomiyama et al., 1981). Class II introns have a conserved base at the 5' site and a pyrimidine at the 3' intron end. Class I introns lack the conserved sequence, GTGCG, which class II introns have at the 5' intron boundary. It is therefore apparent that the boundary sequences of aI3b are typical of class I introns and not typical of class II introns.

All class I introns also have similar proposed secondary structures (Figure 4) (reviewed by Waring and Davies, 1984, and see Introduction). There are seven sequence elements found in all class I introns which form short base paired stems with each other or with the flanking exons. These sequences are called the internal guide (IG), E, P, Q, R, E', and S. They are always found in this order. It is proposed that E base pairs with E', P base pairs with Q, R base pairs with S, and the IG base pairs with the flanking exons (Figure 2). In addition to being conserved in secondary interaction, P, Q, R, and S are also conserved in sequence (Tables 1 and 2). Structural and/or sequence homologs for all seven of these sequence elements have been identified in aI3b. Furthermore, other, less well conserved stem structures, which are found in most class I introns, have also been identified in aI3b.

Because of the vast genetic evidence pointing to their importance and because they are more highly

conserved in sequence than P and Q, it was decided to

search the sequence of aI3b for homologs of R (box 9L)

and S (box 2) first. The consensus sequences for R and S are TCAGAGACTA and AAGATATAGTCC, respectively (Table 2).

It is established that the sequence GACTA in R base pairs with TAGTC in S. There are no matches to the consensus sequence for R in aI3b which are better than a 6 out of

ten match. There are, however, over 20 sequences in which at least a 5 out of ten match to the R consensus.

There is one fairly good match for S in aI3b. Its sequence is AAUGUACAGUCG. This is an 8 out of 12 match to the consensus sequence for S. This sequence begins 1783 nucleotides from the 5' boundary of aI3b and ends 121 nucleotides from the beginning of the 3' exon. Typically, S elements are situated in this way near the 3' end of the intron. This proposed S homolog predicts that the sequence GACTG must be part of any R homolog with which it interacts. The sequence GACTG occurs only once in aI3b. It is only 49 nucleotides 5' to the beginning of this S homolog and so is much closer to the 3' end of the intron than is the case for most class I introns. Nonetheless, because of the positions of the other cis-acting elements (see below), it has been concluded that this sequence is the R homolog for aI3b. The full sequence of the proposed R homolog is CTACAGACTG, which is only a 6 out of 10 match to the consensus sequence for R. It also happens that there is one other class I intron which has the same R and S homologs. This intron is the first intron in the cob gene of N. crassa (Burke et al., 1984). Figure 33 shows the predicted base pairing between these sequences.
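The "n out of m" scores quoted above can be reproduced by counting identical positions between a candidate and the consensus. The following is a minimal sketch, not the procedure actually used for the search; the function names are illustrative, and U is treated as equivalent to T so that RNA and DNA spellings can be compared.

```python
def count_matches(candidate: str, consensus: str) -> int:
    """Count positions at which candidate and consensus carry the same base."""
    cand = candidate.upper().replace("U", "T")
    cons = consensus.upper().replace("U", "T")
    if len(cand) != len(cons):
        raise ValueError("candidate and consensus must have the same length")
    return sum(a == b for a, b in zip(cand, cons))


def scan(sequence: str, consensus: str, min_matches: int):
    """Yield (offset, window, score) for every window scoring >= min_matches."""
    seq = sequence.upper().replace("U", "T")
    k = len(consensus)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        score = count_matches(window, consensus)
        if score >= min_matches:
            yield i, window, score


# Consensus sequences as given in the text (Table 2).
R_CONSENSUS = "TCAGAGACTA"
S_CONSENSUS = "AAGATATAGTCC"

r_score = count_matches("CTACAGACTG", R_CONSENSUS)    # proposed R homolog
s_score = count_matches("AAUGUACAGUCG", S_CONSENSUS)  # proposed S homolog
print(f"R: {r_score}/10, S: {s_score}/12")
```

Passing the full intron sequence to `scan` with a threshold of 5 would list all candidate windows of the kind described above; the intron sequence itself is not reproduced here.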

The other two conserved sequences found in all class I introns are P and Q. The consensus sequences for P and Q are ATGCTGGAAA and AATCAGCAGG, respectively. These sequences are not as highly conserved as R and S in most class I introns (Table 1); nonetheless, there is a 7 out of 10 match for P beginning 778 nucleotides from the 5' boundary of aI3b. This sequence is ATACAGGTAAA. Part of this sequence (TACAGG) is predicted to base pair with a homolog of Q. There is a weak homolog of Q which is located very near the middle of the intron. Its sequence is TAGAAGCTTG, which is only a 5 out of 10 match with the consensus sequence for Q. The Waring and Davies model for the P and Q interaction predicts that for this Q homolog to be the functional Q sequence of aI3b, the sequence TCAGCA should base pair with the six nucleotides shown above for P. These two sequences simply do not match; however, the first nucleotide which is 3' to this Q homolog is a T. If this base is considered, then it is possible to form a six base stem between the sequence ACAGGT in P and the sequence GCTTGT in Q.
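The proposed six base stem can be checked position by position. The sketch below pairs the two strands in antiparallel orientation and labels each pair; it assumes that G-U wobble pairs are acceptable in the RNA stem, which the text does not state explicitly, and the function name is illustrative.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # assumption: wobble pairs count as paired


def to_rna(seq: str) -> str:
    return seq.upper().replace("T", "U")


def classify_stem(strand_5to3: str, partner_5to3: str) -> list:
    """Pair strand[i] with partner[-1 - i] (antiparallel) and label each pair."""
    a = to_rna(strand_5to3)
    b = to_rna(partner_5to3)[::-1]
    if len(a) != len(b):
        raise ValueError("both strands must have the same length")
    labels = []
    for x, y in zip(a, b):
        if (x, y) in WATSON_CRICK:
            labels.append("WC")
        elif (x, y) in WOBBLE:
            labels.append("GU")
        else:
            labels.append("mismatch")
    return labels


# Segments of the proposed P and Q homologs, written 5' to 3' as in the text.
print(classify_stem("ACAGGT", "GCTTGT"))
```

Under this assumption all six positions pair, four as Watson-Crick pairs and two as G-U pairs, so the stem is complete only if wobble pairing is allowed.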

In addition to the short possible stem structure discussed in the last paragraph, there are other sequences in and near these two P and Q homologs which can also base pair with each other. The extent of these possible interactions is shown in Figure 34. All class I introns have an additional stem interaction between a sequence which is just 3' of P and a sequence which is just 5' of Q. In the Waring and Davies model (1984), this stem structure is called P5. The possible stem in aI3b which is the equivalent of P5 in other class I introns is also shown in Figure 34. The observation of this possible P5-like stem structure and the location of the E equivalent (see below) lend support to the view that the P and Q homologs described here are the bona fide equivalents of P and Q for aI3b.

Unlike P, Q, R, and S, E and E' are not conserved in sequence; however, the locations of P, R and S put limitations on the possible locations of E and E'. E is always located just 5' of P, between IG and P. E' is located between R and S and is usually close to R. There are sequences in aI3b which meet these criteria. The likely candidate for E is a 13 nucleotide sequence which ends 8 nucleotides before the beginning of P. This is actually further from P than are most E equivalents in other class I introns; however, the length of the possible stem formed between the proposed E and E' sequences is longer than usual. As will be described later, there are other unusual features which distinguish the E and E' interaction of aI3b from the other introns in this class. The sequence of the proposed E equivalent for aI3b is CTGAGGTGAAGG. The proposed E' sequence begins immediately after the end of R. It is 12 nucleotides long and has the sequence CTTTATCGGTGG. The possible stem structure formed by these sequences is shown in Figure 35. The G nucleotide at position 8 in E must bulge out to form this stem. In addition, the A at position 4 in the E sequence does not base pair with the G at position 9 in the E' sequence. In most E to E' interactions, the stems which are formed are perfect 5 or 6 base pair stems. Mismatched pairs are not common. One notable exception to this is the intron from the Tetrahymena rRNA gene, which also contains one mismatched pair. In aI3b the mismatched pair may have another significance. As will be described later, this mismatched pair may allow the last 3 bases of the proposed E' sequence to interact with another sequence.

The last of the seven cis-acting sequence elements found in all class I introns to be sought in aI3b is the internal guide (IG). The IG is defined by the following characteristics (Davies et al., 1982): 1. The IG always has a G nucleotide which is proposed to base pair with the last nucleotide of the 5' exon (a uridine). 2. The 3' half of the IG can base pair with the last few bases of the 5' exon and the 5' half of the IG can base pair with the first few bases of the 3' exon. 3. The IG must be between the 5' splice boundary and E. There are two sequences, both AAAGAAT, which meet these criteria in aI3b. These two sequences begin 327 and 369 nucleotides from the 5' splice boundary, respectively. Most IG sequences found in class I introns are much closer to the 5' splice boundary than this. There are, however, other exceptions to this general trend. The most dramatic exception is found in aI3a, where the IG is 1005 nucleotides from the 5' splice boundary.

The problem of which of the two possible IG sequences is the authentic IG for aI3b can be resolved by examining the sequence which is 5' of E. In most class I introns, there is an optional stem structure which can form between the sequences which are 3' of the IG and 5' of E. This optional stem is called P2 by Davies et al. (1982). No such stem can be formed with the sequence which is 3' of the possible IG sequence located 369 nucleotides from the 5' splice boundary; however, a perfect seven base pair stem can be formed with a sequence which is nine nucleotides 3' of the possible IG located at position 327 in aI3b. In Figure 29, this possible IG sequence is labeled as the IG sequence for this intron. As will be described in the next section, the designation of this possible IG sequence as the functional IG for aI3b also allows other unusual interactions to take place between the sequence 3' of E' and the sequence 3' of the IG.

With the location of the IG sequence, all seven of the cis-acting sequences which characterize all class

I introns have been identified in aI3b. Schematic representations of the interactions between these seven sequences and the flanking exons are shown in Figures 3 and 36. For comparison with the generalized structure of other class I introns, see Figures 2 and 4.

In addition to the interactions described above, the model proposed by Waring et al. (1982) for the secondary structure of class I introns predicts that there will be three other stem structures in aI3b. The importance of these stems is uncertain since no mutants have ever been identified which interfere with their formation and there is no apparent conservation in their sequences. The first of these was named P6 by Waring et al. (1982). This stem involves the base pairing of sequences which are just 5' of R with sequences which are just 3' of Q. A stem of this type can be formed in aI3b (Figure 36) between the sequence AUUAU, which is 5 nucleotides 3' of Q, and the sequence GUGAU, which is 2 nucleotides 5' of R.

The last two stems which are found in class I introns are called P8 and P9 by Waring et al. (1982). The stem called P8 occurs as a "hairpin" in aI3b but can involve loop structures in other class I introns. For example, in cob I4 in S. cerevisiae, the loop at the end of P8 is the "maturase loop" for that intron (Waring and Davies, 1984). Loops at the ends of these stems are, however, optional for class I introns. In many class I introns these stems also occur as "hairpins". A good example of this is seen in oxi3 I5a. The conserved stem structure P8 is located between E' and S. In aI3b, there are only 37 nucleotides between the proposed E' and S homologs. Of these relatively few bases, 20 constitute a perfect 20 nucleotide long palindrome made up of alternating A and U nucleotides. As shown in Figure 36, this sequence can be folded into a stem structure which is probably the equivalent of P8 in other class I introns.

The one remaining stem structure which has not yet been described in aI3b is P9. This stem is located between S and the 3' end of the intron. In aI3b, there are two possible candidates for this structure. The first of these starts 9 nucleotides after the 3' end of S. It starts with the sequence AUAUAUAUAUUU. This is followed by a series of 20 nucleotides which do not base pair here, and then by the sequence AAAUAUAUAUAU, which can form a perfect twelve base pair stem with the first sequence. The 20 nucleotide loop which is formed by this proposed stem is extremely interesting in that, as will be described in the next section, it may be involved in a long distance stem interaction which is unique to aI3b (see Figure 37). The second possible P9 stem is a 35 nucleotide sequence which can fold into an imperfect "hairpin" structure. This possible stem structure is also shown in Figure 37. This sequence ends just 8 nucleotides before the beginning of the 3' exon. It is not unusual for class I introns to have two stem structures between S and the 3' exon; however, the model proposed by Davies et al. (1982; reviewed by Waring and Davies, 1984) indicates that the second stem structure is optional. By convention, the first stem will be considered the P9 homolog.
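Whether two arms can form a perfect stem is equivalent to asking whether one arm is the reverse complement of the other. The sketch below applies this test to the two arms given for the first P9 candidate, and to an alternating A/U run of the kind proposed for P8 in the preceding paragraph. The exact phase of that run (AU... or UA...) is not given in the text and is assumed here; either phase is self-complementary. Function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def forms_perfect_stem(arm_5prime: str, arm_3prime: str) -> bool:
    """True when the two arms pair at every position with no wobble pairs."""
    return reverse_complement(arm_5prime) == arm_3prime.upper().replace("T", "U")


# Arms of the first P9 candidate, as given in the text.
P9_ARM_5 = "AUAUAUAUAUUU"
P9_ARM_3 = "AAAUAUAUAUAU"
print(forms_perfect_stem(P9_ARM_5, P9_ARM_3))

# Assumed spelling of the 20 nucleotide alternating A/U palindrome (P8).
P8_RUN = "AU" * 10
print(reverse_complement(P8_RUN) == P8_RUN)
```

Both checks should come out true, since a run of alternating A and U of even length is its own reverse complement and can therefore fold back on itself as a hairpin.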

At this point in the description of the results, it is fair to conclude that aI3b is a class I intron. The strength of this conclusion rests on the observation of all of the conserved stem structures which are typical of class I introns. Not only can the P-Q and R-S stems be formed in aI3b, but the E-E' and the IG-exon interactions are also present. In addition, the stem structures which were named P2, P6, P8, and P9 by Davies et al. have also been identified. Furthermore, all of these proposed stems are in the correct relative locations to be the functional equivalents of these structures in aI3b.

III.A.7. A possible unusual sequence interaction in aI3b

When the search for the equivalents of E and E' in aI3b was begun, the only criteria used were that E and E' had to base pair, that E had to be between the IG and P, and that E' had to be between R and S. There are only 49 nucleotides between R and S. Furthermore, if the bases of the proposed P8 stem are excluded, there are only 29 nucleotides unaccounted for. Surprisingly, there are three possible E' sequences there which meet the criteria listed above. The sequences described in the last section as the E and E' equivalents for aI3b were chosen because of their close proximity to P and R, respectively, and because of the strength of the proposed interaction.

As shown in Figure 36, the other two sequences are designated as A and B. The sequence called A includes the last three nucleotides of the proposed E' sequence. As mentioned earlier, these last three nucleotides of the proposed E' sequence are separated from the rest of the proposed E' sequence by a mismatched pair. These sequences can form perfect seven and five base pair stems with sequences designated as A' and B', respectively, as also shown in Figure 36. The interesting features of these proposed interactions are: 1. They involve interactions between sequences between E' and S and sequences close to the IG. 2. In order to form, they must displace the sequence near E which is proposed to be involved in the P2 stem. This second point is interesting because it implies a dynamic shifting of the secondary structure of this intron. This proposed shifting of stem structures, however, does not require a major revision of the overall secondary structure of the intron since the proposed E-E' and P2 interactions bring all these sequences into close proximity. From Figure 36, it is obvious how the A-A' interaction could be formed. The B-B' interaction is slightly more difficult to visualize because it requires a tertiary folding of the intron to align these sequences.

The question now arises as to how common interactions like the ones described in the preceding paragraph are in other class I introns. Several other class I introns were examined for similar possible interactions between sequences which are 3' of the IG and E' sequences. These introns were cob I4, oxi3 I3a, oxi3 I4, the Tetrahymena rRNA intron, the first rRNA intron in P. polycephalum, and the intron in the N. crassa mtDNA URF1 gene. For the first five of these introns, there is no space between E' and the beginning of the conserved stem, P8. Furthermore, there are no stretches of pairing between this region and the region just 3' to the IG sequences which come close to being as good as the pairings which form the P2 and P8 stems, respectively. It is therefore unlikely that these regions interact in these introns.

The last case examined was the N. crassa intron (Burger and Werner, 1985). This intron is interesting in that it does not have the P2 stem. In fact, E is only six nucleotides away from the IG. These six nucleotides plus the sequence for E are AUUAAAGAUCAC. The sequence for E' plus the next seven nucleotides is GUGAUCAUUUAA. It is interesting that, except for the adenine at position 7 in the middle of the sequence which includes E', these two sequences base pair perfectly. Burger and Werner have included the sequence which is 3' of E' in a P8-like stem; however, this stem is not as strong as P8 stems seen in other class I introns. It is therefore possible that the sequences near E' interact with the sequence next to E simply by extending the E-E' interaction. The situation in the N. crassa intron may be functionally related to the situation in aI3b despite the obvious differences in the spacing of many of the structural elements in these two introns.
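The statement that the two N. crassa sequences pair perfectly apart from one adenine can be tested by deleting each base of the second sequence in turn and asking whether what remains is an exact reverse complement of the first. This is a minimal sketch with illustrative function names; it considers Watson-Crick pairs only and anchors the pairing at the 3' end of the first sequence, both of which are assumptions of the sketch.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def single_bulge_sites(strand: str, partner: str) -> list:
    """Return 1-based positions in `partner` whose removal lets the rest of
    `partner` pair perfectly with the 3' end of `strand`."""
    strand = strand.upper().replace("T", "U")
    partner = partner.upper().replace("T", "U")
    sites = []
    for i in range(len(partner)):
        reduced = partner[:i] + partner[i + 1:]
        if strand.endswith(reverse_complement(reduced)):
            sites.append(i + 1)
    return sites


# Sequences as given in the text, both written 5' to 3'.
NEAR_E = "AUUAAAGAUCAC"        # six nucleotides 5' of E plus E
NEAR_E_PRIME = "GUGAUCAUUUAA"  # E' plus the next seven nucleotides
print(single_bulge_sites(NEAR_E, NEAR_E_PRIME))
```

In this sketch the only position whose removal gives a perfect duplex is position 7, the adenine named above. Because eleven bases then remain on the second strand, the 5'-terminal A of the first sequence is left without a partner.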

There is one other possible sequence interaction which could be formed by aI3b. It involves the sequence found in the 20 nucleotide loop found at the end of the

P9 stem proposed in the last section and a sequence which begins 214 bases 3' of Q. These sequences are all A+T.

The sequence in the P9 related loop has a run of 15 thymidines in a row. The sequence which is closer to Q has a run of 10 adenines in a row. These two sequences can obviously base pair. The total possible helical structure is 18 nucleotides long of which 13 nucleotides can form a perfect stem (Figure 37). No stem structure like this has ever been observed in a class I intron before. These complementary sequences may be entirely fortuitous and not actually base pair in the RNA secondary structure. On the other hand, Q and R are farther apart in this intron than in most class I introns, and if this stem did form it might bring the various conserved elements closer together.

III.C.4. The distribution of nucleotides in aI3b

As noted previously, aI3b is relatively A+T rich. In fact, there are only 334 G+C residues in the entire intron. This makes the intron as a whole roughly 82.5% A+T. There are, however, three short G+C clusters in aI3b. The first of these begins 117 nucleotides from the 5' exon/intron boundary and so is part of the ORF. The other two are located within 400 bases of the end of the ORF and are about 250 nucleotides apart. The sequences of these G+C clusters are shown in Figure 29. These three G+C clusters contain 18, 29 and 25 G+Cs, respectively. If one does not include these sequences, aI3b is roughly 85.5% A+T. Of the remaining G+Cs, most are located in the ORF. Except for the G+C clusters, the region between the end of the ORF and the beginning of the region which contains the conserved cis-acting elements R, E', and S is extremely A+T rich. For example, there is a 250 base pair stretch in this region which has only 19 G+Cs (7.6%).
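The percentages quoted above follow from simple counts. The sketch below shows the arithmetic for the 250 base pair stretch and, as a rough consistency check, the intron length implied by 334 G+C residues if the 82.5% figure is taken at face value. The helper names are illustrative and the implied length is only an estimate derived from the rounded percentage.

```python
def percent_at(gc_count: int, length: int) -> float:
    """Percentage of A+T given the number of G+C residues in `length` bases."""
    if length <= 0 or not 0 <= gc_count <= length:
        raise ValueError("need 0 <= gc_count <= length and length > 0")
    return 100.0 * (length - gc_count) / length


def percent_at_from_sequence(seq: str) -> float:
    """Percentage of A+T (or A+U) counted directly from a sequence."""
    seq = seq.upper().replace("U", "T")
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# The A+T-rich stretch: 19 G+C residues in 250 bases.
stretch_gc_percent = 100.0 - percent_at(19, 250)
print(f"G+C in the 250 bp stretch: {stretch_gc_percent:.1f}%")

# Intron length implied by 334 G+C residues at roughly 82.5% A+T (estimate).
implied_length = 334 / (1 - 0.825)
print(f"implied intron length: about {implied_length:.0f} nucleotides")
```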

An interesting feature of the two 3' most G+C clusters described in the last paragraph is that they are also similar in sequence. These two G+C clusters are 38 and 35 nucleotides in length, respectively. As shown in Figure 38, they can be aligned at 26 nucleotides. The homology is higher at the ends of these sequences than it is in the middle. There are only two thymidines in each of these sequences and only 7 and 8 adenines, respectively. Given the distribution of nucleotides in these two sequences, they can be aligned at about twice as many nucleotides as would be expected by chance. In the var1 gene, there are strains which are known to contain two, one (Hudspeth et al., 1984) or no G+C clusters (J. Wenzlau and D. Ralph, unpublished data). The two possible G+C clusters are interesting in that they have exactly the same sequence and both of them can be involved in a biased gene conversion event. Another interesting feature of the var1 situation is that the A+T rich sequences which flank these G+C clusters are also identical. Unfortunately, as shown in Figure 29, the sequences which flank the G+C clusters in aI3b are not similar; therefore, the importance of the homology between these two sequences is unclear.
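The comparison with chance made above for the two aI3b G+C clusters can be illustrated with a simple composition-based estimate: the probability that two randomly drawn bases are identical is the sum, over the four bases, of the product of their frequencies in the two sequences. The sketch below is not the calculation used in this work. It takes the A and T counts from the text, and it assumes that G and C are equally frequent within each cluster (hence the half-integer counts) and that the alignment is ungapped over 35 positions. Neither assumption comes from the source.

```python
def identity_probability(comp_a: dict, comp_b: dict) -> float:
    """Chance that one base drawn from each composition is the same base."""
    len_a = sum(comp_a.values())
    len_b = sum(comp_b.values())
    return sum((comp_a[base] / len_a) * (comp_b[base] / len_b) for base in "ACGT")


# A and T counts from the text; the G/C split is ASSUMED to be equal.
CLUSTER_2 = {"A": 7, "T": 2, "G": 14.5, "C": 14.5}   # 38 nt, 29 G+C
CLUSTER_3 = {"A": 8, "T": 2, "G": 12.5, "C": 12.5}   # 35 nt, 25 G+C

ALIGNED_POSITIONS = 35   # assumption: ungapped alignment over the shorter cluster
OBSERVED_MATCHES = 26    # from the text (Figure 38)

p_match = identity_probability(CLUSTER_2, CLUSTER_3)
expected = p_match * ALIGNED_POSITIONS
print(f"expected by chance: {expected:.1f} matches")
print(f"observed / expected: {OBSERVED_MATCHES / expected:.2f}")
```

With these assumptions the expectation is close to 11 matches, so 26 observed matches is a little more than twice the chance value. A strongly skewed G:C ratio within the clusters would raise the chance expectation, so the ratio depends on the assumed split.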

III.D. Comparison between cob I3 from S. cerevisiae and the cob intron in A. nidulans

The intron in the cob gene of A. nidulans has been sequenced by Waring et al. (1982). Haldi (1985) has completed the sequence of the cob I3 intron from strain 161 of S. cerevisiae. Southern hybridization data (Lazowska et al., 1981) had indicated that these two introns share homology. Figure 40 shows how the sequences of these two introns can be aligned. Both of these introns contain long ORFs. Figure 41 shows an alignment of the amino acid sequences for these two ORFs. There are two regions of high homology (66%-74% amino acid homology) separated by a region of lower homology. In addition, the first 71 amino acids in the A. nidulans intron bear no significant homology with the first 96 amino acids of cob I3. The homology begins at the first LAGLI-DADG sequence and continues to the end of both introns. In other words, these ORFs are nearly colinear over three quarters of their length.

These introns are sufficiently homologous to indicate that they are probably evolutionarily related. There are two interesting features about how these introns differ from each other. The first is that, as mentioned above, the amino acid homology is weakest at the 5' ends of these introns. Surprisingly, all of the cis-acting conserved elements of these class I introns except S (box 2) are located in this 5' region. Therefore, the 5' regions of these introns are structurally homologous if not related directly by sequence homology. The second interesting feature about their divergence is the closed reading frame. Between the S homologs and the intron/exon boundary these two introns are fairly homologous. What is surprising is the distance between the end of the ORFs and S (box 2). In the A. nidulans intron, this distance is ten nucleotides. In the S. cerevisiae cob I3 this distance is over five hundred bases. Therefore, while these two introns are colinear over most of their length, they differ mainly by one very large deletion/insertion.

III.C. Serological and genomic differentiation of Rickettsia

The purpose of this project was to develop the means by which the members of the genus Rickettsia could be unambiguously distinguished from each other. Towards this end, serological tests were applied to this group of organisms using a variety of antibodies and sera, some of which have been used by previous investigators. This provided the basis for comparing the results reported here with those of other investigations. The new technique developed here to distinguish the Rickettsia was to look for restriction fragment length polymorphisms (RFLP). This permitted the identification of unique restriction patterns and permits one to examine the divergence of these organisms at the DNA level. This technique was successful in achieving the goal of this project.

III.C.1. Antigenic variation in the genus Rickettsia

Antigenic variation within the genus Rickettsia was determined by the micro-IF test of Philips et al. (1978), modified as described in the Methods chapter of this dissertation. Four different sources of antibodies were used to examine the shared antigens within this group. The first two of these were murine monoclonal antibodies raised against R. rickettsii by Dr. Jim Lange (Lange and Walker, 1984) at the CDC. The third source of antibodies was a polyclonal murine antiserum raised against R. rickettsii. This antiserum was the kind gift of Dr. Willy Burgdorfer at the NIH Rocky Mountain Lab. This antiserum has been used by him and others to identify R. rickettsii isolates using the micro-IF test. The fourth source of antibodies was a polyclonal rabbit serum which the CDC has distributed to interested individuals and institutions for the purpose of identifying rickettsia as belonging to the RMSF group. Due to the high specificity and relative scarcity of the three murine derived sera, the monoclonal antibodies were not tested at dilutions of less than 1:100 and the polyclonal serum was not tested below a dilution of 1:16. These studies were conducted at the Ohio Department of Health in collaboration with Chip Pretzman.

The results of these studies are shown in Table 9. The monoclonal antibodies showed the highest degree of specificity; they cross reacted with the fewest species. The monoclonal antibody designated E-11G2 reacted strongly with R. montana, R. parkeri, R. rhipicephali, R. rickettsii, and R. siberica at dilutions of 1:10000 or greater. The other monoclonal antibody (E11-F12) reacted equally well with these same five species and also at a dilution of 1:100 with R. conori. When these monoclonal antibodies were developed, it was hoped that they might react only with one species of rickettsia, thereby allowing the rapid identification of isolates as R. rickettsii. This goal was not attained. These monoclonal antibodies do, however, react with only a subset of all members of the RMSF group and do not cross react with R. typhi, R. canada, or R. bellii.

The data from the polyclonal murine serum showed that R. akari shares some antigenic determinants with the other members of the RMSF group; however, this antiserum did not cross react with R. australis, which other studies had shown to be antigenically related to the other members of the RMSF group. This may be due to a random host effect in which this particular mouse happened not to develop antibodies to some particular antigen shared by R. australis and R. rickettsii.

Of the sera tested, the polyclonal rabbit serum probably gives the best indication of the shared antigens in the genus Rickettsia. This serum cross reacts with all the members of the RMSF group with various degrees of specificity. In addition, it cross reacts well with R. bellii and, to a lesser extent, with R. canada. This antiserum did not react with either E. risticii or R. typhi. This lack of reaction with R. typhi was somewhat unexpected since humans frequently develop antibodies which cross react between the Typhus group and the RMSF group (see Introduction).

Philips et al. (1978) observed that mice rarely develop antibodies to all the members of the RMSF group after being exposed to one of the members. For that matter, most strains of mice are not even susceptible to infection with RMSF. With this in mind it has always been surprising to me that mice and murine cells have been used by many investigators as model systems for RMSF infections and cytopathology. The murine antibody response to exposure to RMSF group organisms is different from that observed in other mammals. Philips et al. used the peculiar characteristics of the mouse immune system as the basis for differentiating between individual species of the genus Rickettsia. To fully duplicate the work of Philips et al., it would have been necessary to make "typing sera" to all rickettsial species examined and then titer each serum against all rickettsial species. This is not practical; however, the findings reported here agree with those of other investigators that mouse-derived antisera do not cross react with all members of the RMSF group. These data also show the variability in the antibody response of different mammals to the same rickettsial species. As will be discussed in the next section, the mouse sera also do not accurately define the relationships among the various rickettsial organisms.

III.C.2. Restriction site polymorphisms differentiate Rickettsiae

Rickettsia were purified and their DNA was extracted free of contaminating host cell material by the gel filtration method described in the Methods chapter. Another worker (Karl Poetter) has shown that there is no significant amount of contaminating Vero cell DNA in the final samples of rickettsial DNAs used in this study. This was determined by probing dots of rickettsial DNA on nitrocellulose with P-32 labelled Vero cell DNA. Karl Poetter has also shown that contaminating Vero DNA does not affect the restriction pattern or the autoradiograms of the Southern blots of rickettsial DNA.

Rickettsial DNAs were digested with HindIII as described in the Methods. To ensure that the DNAs had been cut to completion, about one tenth of each digest was used to cut 0.5 micrograms of lambda phage DNA. If the lambda phage DNA cut to completion, it was assumed that the rickettsial DNA did also. In addition, all restriction digests were done with an excess of HindIII. An example of one of the lambda digest controls is shown in Figure 42.

HindIII digested rickettsial DNAs were electrophoresed on agarose gels and blotted to nitrocellulose as previously described. An example of an EB stained gel with rickettsial DNAs is shown in Figure 43. It is clear from this photograph that the patterns of HindIII digested DNA fragments seen on this gel are distinct for each species of rickettsia. In the experiments described below, some of the species of rickettsia give identical results for a specific probe. This is not due to confusing the tubes of rickettsial DNAs or to loading the same DNA in two or more wells of the same gel. This can be known with certainty because the stained gels show that different DNAs were loaded in each well. Also, since all Southern blots were done by the bidirectional method, two different probes were usually used to examine each restriction digest. In all cases in which this was done, the data were consistent with the correct DNAs being used in each lane.

It is clear from Figure 43 (and Figure 47; see below) that with a larger sample size and/or the use of different restriction enzymes it would be possible to differentiate the previously described rickettsia species based entirely on the restriction patterns seen on the agarose gels; however, such data are completely qualitative. If a rickettsial strain yields a novel pattern, it is impossible to determine if the strain is a novel species or a variant of a previously described species. In other words, while the patterns seen on stained gels of rickettsial DNAs are sufficient to demonstrate that all of the species of rickettsia are distinct, such data yield no usable measure of the degree of divergence between any two species. This is because the complexity of the patterns and the occurrence of hypermolar bands make further analysis impractical. To get around this problem, Southern blots of HindIII digested rickettsial DNAs were probed with pBR322 derived recombinant clones. This allows the detection of changes in the rickettsial genomes by examining restriction site polymorphisms in a particular region of DNA which, in at least one of the species (R. rickettsii), is continuous. These data are simpler, more manageable and ultimately more informative.

Data from four probes are shown in this study. Three of these were large inserts obtained by cloning limit digests of either EcoRI or PstI cut R. rickettsii DNA into plasmid DNA. The first of these (EcoR-3) is a 6.4 kb EcoRI fragment cloned into the EcoRI site of pAT-153. This vector is a derivative of pBR322 from which a small sequence near the origin of replication associated with copy number control has been deleted. There are six HindIII sites in this insert. Some of these generate HindIII fragments (two or three depending on how far the gel was run) which are less than 500 bp long and were therefore not used in this study because they were run off the bottom of the gels. An autoradiogram of a Southern blot probed with EcoR-3 is shown in Figure 44.

The other two R. rickettsii derived probes were made by cloning PstI fragments into pBR322. These clones were made by Jon Clark. These two clones, called Pst-1 and Pst-2, have inserts which are 6.2 and 5.2 kb in length, respectively. Nothing is known about the coding content of these probes. Together the three probes represent 17.8 kb or about 1% of the genome of R. rickettsii. Autoradiograms of Southern blots using these two probes are shown in Figures 45 and 46. The additional EB stained agarose gel which was used to generate Figure 45 is shown in Figure 47. This DNA was also probed with EcoR-3 and the patterns were indistinguishable from those shown in Figure 45. When viewing these autoradiograms it is important that the reader realize that some of the observed bands are not continuous in the R. rickettsii genome because of the smallest HindIII fragments, which were run off these gels. However, in the analysis of these data, it was assumed that if two fragments of DNA comigrated and hybridized to the same probe, then the two pieces were homologous fragments and had exactly the same sequences at their ends (the restriction sites). The technique used to analyze these data (see below) assumes in most cases that if two fragments do not comigrate, then at least one of the two restriction sites is variant between the DNAs being compared, depending upon the total number of fragments observed for a particular digest. Since the smallest bands have not been scored, there will be a slight underestimation of the actual genetic variability described below. This technique also assumes that those regions on the various DNAs which hybridize to the various probes have not been seriously rearranged in the various rickettsial genomes relative to each other.
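The comigration assumption can be made concrete with a small scoring sketch. The analysis actually used is described later and is not reproduced here. The code below illustrates one widely used index of this general kind, the fraction of shared fragments F = 2n_xy / (n_x + n_y), where n_x and n_y are the numbers of scored fragments in two lanes and n_xy is the number that comigrate. The fragment sizes, the 2% size tolerance, and the function names are all hypothetical.

```python
def count_comigrating(lane_x, lane_y, rel_tol=0.02):
    """Count fragments of lane_x that have a size-matched partner in lane_y.

    Each fragment in lane_y may be used only once. Two fragments are treated
    as comigrating when their sizes differ by no more than rel_tol (assumed).
    """
    remaining = sorted(lane_y)
    shared = 0
    for size in sorted(lane_x):
        for j, other in enumerate(remaining):
            if abs(size - other) <= rel_tol * max(size, other):
                shared += 1
                del remaining[j]
                break
    return shared


def shared_fraction(lane_x, lane_y, rel_tol=0.02):
    """Fraction of shared fragments, F = 2 * n_xy / (n_x + n_y)."""
    total = len(lane_x) + len(lane_y)
    if total == 0:
        raise ValueError("at least one lane must contain a scored fragment")
    return 2 * count_comigrating(lane_x, lane_y, rel_tol) / total


# Hypothetical fragment sizes (kb) hybridizing to one probe in two species.
species_1 = [4.1, 2.3, 1.2, 0.8]
species_2 = [4.1, 2.9, 1.2]
print(f"F = {shared_fraction(species_1, species_2):.3f}")
```

In this toy example two of the seven scored fragments pair up between the lanes, so F = 4/7. Leaving the smallest fragments unscored, as noted above, changes n_x and n_y and therefore the value of any index of this type.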

The last probe used in this study was a 16S rRNA specific probe made by Weisburg et al. (1985) from the genome of Rochalimaea quintana. An autoradiogram of HindIII digests of selected rickettsial DNAs is shown in Figure 48. This autoradiogram was made by Jon Clark.

There are some interesting features of the autoradiograms discussed in the preceding paragraphs. Late in this investigation, two species of ehrlichia, E. risticii and E. sennetsu (an equine and a human pathogen, respectively), became available for analysis. DNA was extracted from these species in the same manner as for the other species of rickettsia. HindIII digests of these DNAs were run on several of the gels described above and were probed with the three R. rickettsii derived probes. Hybridization was done at room temperature and the filters were washed at low stringency as described in the Methods section of this dissertation. It is known that when yeast mitochondrial sequences are probed under these conditions, sequences which are as little as 70% homologous (i.e., cob I4 and aI4alpha) will cross hybridize (see Figure 16).

Surprisingly, neither of the ehrlichia DNAs hybridized to any of the R. rickettsii derived probes. This was not true of E. coli. As seen in Figure 46, there is a sequence in the E. coli genome which hybridizes weakly to the Pst-2 probe. From examining Figure 48, it is clear that DNA from both ehrlichia strains and from E. coli does cross hybridize to the R. quintana-derived 16S rRNA probe. Therefore, there is nothing inherently wrong with the samples of ehrlichia DNA which might explain why they fail to hybridize to the R. rickettsii derived probes.

Another interesting feature of the autoradiograms is the appearance of many faint bands which show up when low stringency conditions are used. Some of these may be related to border fragments which contain only very short regions of homology with the various probes. Others may be very rare partial digestion products. The argument that the digests all went to completion based on the digestion of the lambda DNA is largely a statistical one and therefore has some degree of uncertainty. There is, however, another possibly interesting explanation: these bands may represent small or divergent sequences which are scattered over the genome. The best example of this is seen in the R. bellii lane probed with the Pst-1 probe as seen in Figure 45. At least nine faint bands are observed in this lane. The possibility that they are due to partial digests is remote since all of these bands are smaller than the largest strong signal. If these faint bands were due to partial digestion of the R. bellii DNA, then one would expect that faint signals would be seen which were larger than the largest strong signal. Regardless of their origin, these faint bands were not considered in the analysis discussed below.

The last interesting feature of these autoradiograms which will be discussed here is that all members of the genus Rickettsia hybridized to the R. rickettsii derived probes. This includes R. canada, R. typhi and R. bellii. Under the stringency conditions used in these experiments, sequences which are 70% homologous will hybridize weakly to the probes. Empirically one would not expect sequences which were less than 60% homologous to hybridize at all. It was observed that the three species mentioned above did hybridize strongly to the three R. rickettsii derived probes. One would therefore expect that these sequences are probably at least on the order of 70% homologous to the R. rickettsii derived probes.

In addition to the experiments described above, another investigator (Karl Poetter) has probed HindIII digests of R. akari and R. rhipicephali with the Pst-2 probe. His data were included in the analysis described in the following paragraphs.

Composite diagrams showing the relative mobilities of the HindIII fragments which hybridized to each of the four probes are shown in Figures 49 through 52. When this project was started, it was thought that the various species of rickettsia would be sufficiently closely related to allow cross hybridization of the probes but divergent enough so that the restriction patterns would be completely distinct. It was expected that very few bands would comigrate between any two species. This was not the finding. By examining the patterns shown in Figures 49 through 52, it is clear that all of the species in the RMSF group and some of the other species produce DNA fragments upon HindIII digestion which comigrate with fragments derived from other species.

This indicates that for the portion of their genomes which was examined, the members of the RMSF group were much more closely related than had previously been suspected would be true for their entire genomes.

To estimate the degree of divergence between the various species of rickettsia examined in this investigation, the technique developed by Engels (1981) was used. This analysis was largely performed by Dr. Paul Fuerst. In this technique the proportion of mismatched base pairs between any two homologous sequences is estimated by the equation $\hat{p} = k/(2jm - kj)$, in which "k" is the number of polymorphic sites, "j" is the number of bases in the recognition site of the restriction enzyme used (6 for HindIII), and "m" is the total number of sites. The parameters "k" and "m" can be estimated directly from tables in Engels (1983) for each digest by using the total number of fragments and the number of fragments with identical mobility observed between two species. Table 10 gives the percent mismatched base pairs predicted for all pairs of species examined in this investigation. The variance for these predicted values for percent mismatched base pairs is estimated by the equation $V = \hat{p}^2/k$. Within the RMSF group, this variance predicts a standard error which is on the order of 1% mismatch.
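The two formulas quoted above can be evaluated with a few lines of Python. This is only a sketch of the arithmetic as it is stated in the text; the function names and the example values of k and m are hypothetical and are not taken from Table 10.

```python
import math


def engels_divergence(k, m, j=6):
    """Estimated proportion of mismatched base pairs, p = k / (2*j*m - k*j).

    k: number of polymorphic sites; m: total number of sites;
    j: bases in the recognition site (6 for HindIII, as in the text).
    """
    denominator = 2 * j * m - k * j
    if denominator <= 0:
        raise ValueError("2*j*m - k*j must be positive")
    return k / denominator


def engels_standard_error(p, k):
    """Standard error from the variance V = p**2 / k quoted in the text."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.sqrt(p * p / k)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    k, m = 8, 20
    p = engels_divergence(k, m)
    se = engels_standard_error(p, k)
    print(f"estimated mismatch: {100 * p:.2f}% +/- {100 * se:.2f}%")
```

With these made-up counts the estimate is a little over 4% mismatch with a standard error somewhat above 1%, which is the order of magnitude described for the RMSF group.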

The data shown in Table 10 were used to derive the dendrogram shown in Figure 53 using the unweighted pair-group mean (UPGM) method. This figure shows the estimated percent sequence divergence between the various members of the genus Rickettsia. There are six species, R. conori, R. parkeri, R. siberica, R. australis, R. montana, and R. rhipicephali, which are all less than 6% divergent from each other. This will be referred to as the RMSF core group. The figure shows the members of this group as being further subdivided into pairs. These subgroupings, however, are probably not statistically meaningful, since only four probes have been used. As a group these species are probably about equally divergent from each other.
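The unweighted pair-group clustering mentioned above can be sketched as follows. This is a generic illustration of the method, not the program used to draw Figure 53; the function name, the three labels, and the distances are hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster by the unweighted pair-group method with arithmetic means.

    names: list of labels; dist: dict mapping (a, b) label pairs to distances.
    Returns a list of (cluster_a, cluster_b, distance) merges in join order,
    where distance is the average distance between the two joined clusters.
    """
    sizes = {name: 1 for name in names}
    d = {frozenset(pair): float(value) for pair, value in dist.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda ab: d[frozenset(ab)])
        height = d[frozenset((a, b))]
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted mean of the old ones.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
        merges.append((a, b, height))
    return merges


if __name__ == "__main__":
    # Hypothetical percent-divergence values for three made-up taxa.
    example = {("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 8.0}
    for left, right, height in upgma(["A", "B", "C"], example):
        print(left, right, height)
```

In this toy case A and B join first at 2.0, and C then joins the A-B cluster at the mean of 6.0 and 8.0. When the pairwise values differ by less than their standard errors, the order of the joins is not meaningful, which is the caution raised in the text about the pairs within the core group.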

One surprising finding is that R. rickettsii falls outside of this core group, although only barely outside. It is either the most closely related nonmember species or the most divergent member within the group. Pathogenicity is not a characteristic which separates R. rickettsii from the other members of the RMSF core group since R. conori, R. australis, and possibly the R. siberica isolates examined in this study are also pathogenic to humans. Of the RMSF group species, R. akari was found to be the most distantly related to the RMSF core group.

Another interesting finding was that R. canada, R. bellii, and R. typhi are all about equally divergent from each other and from the RMSF core group. (The variance of the predicted percent divergence in this range makes these species statistically about equally divergent.) The small number of shared fragments between each of these species and all other species makes the exact estimate of genetic divergence very unreliable. Nonetheless, it is correct to conclude from the data that these three species are not closely related to each other or to the RMSF group species.

One can compare the results of the DNA data with those of the serology data (Table 9). Basically, there are no real contradictions between these data sets. That is, taken as a whole the serological data do not place organisms within the RMSF group which the genetic data would not also place within this group. The one exception to this is that the CDC's polyclonal rabbit serum reacts equally well or better with R. bellii as it does with four members of the RMSF group. The other sera, however, do not show this degree of cross reactivity with R. bellii. The problem with the serological data is that they lack resolution. It is very difficult to extrapolate the degree of relatedness of the members of the genus Rickettsia based on serological data. For example, the RMSF group is defined as those organisms which are serologically related to R. rickettsii. Nowhere in the literature is there any indication that the other members of the RMSF group are more closely related to each other than they are to R. rickettsii. Even though the serological technology has greatly improved in recent years, it still lacks the resolution to detect the divergence of R. rickettsii from the rest of the group. Another apparent anomaly of the serological data is that serology places R. canada in or near the Typhus group. There is no evidence from the DNA data to support this placement.

In conclusion to this section of this dissertation, it has been shown that all the members of the genus Rickettsia can be unambiguously distinguished from each other on the basis of restriction fragment length polymorphisms (RFLP) observed on Southern blots probed with specific R. rickettsii derived sequences. For any one particular probe, some of the species have identical restriction patterns; however, by examining more than one probe all the species can be distinguished. These restriction site data have been analysed by the method of Engels (1983) and have shown that the eight species within the RMSF group are closely related to each other, with R. rickettsii and R. akari being the most divergent members of the group. It has also been shown that R. canada, R. typhi, and R. bellii are not closely related to each other. On the other hand, all the members of the genus Rickettsia tested must be roughly at least 70% homologous to R. rickettsii in the regions which are homologous to the various probes in order for the signals to be observed on the autoradiograms. Finally, Ehrlichia DNA did not cross hybridize to the R. rickettsii derived probes and therefore cannot have sequences which are 70% homologous to these probes within their genomes. This is somewhat surprising in that one of the probes does cross hybridize weakly to E. coli DNA.

PLEASE NOTE: This page not included with original material. Filmed as received.

To address this question with the evidence presented in this dissertation, one must first make the assumption that the various mitochondrial genomes examined in this work are functional equivalents. This is a reasonable assumption for two reasons. First, other workers (Karl Joplin, Anna Morawiec, and J. Wenzlau) have used the protoplast fusion technique to move the mitochondrial genomes of S. aceti, S. capensis, S. diastaticus, S. ellipsoideus and S. norbensis into an S. cerevisiae nuclear background. The S. cerevisiae strains used as recipients in these protoplast fusion experiments were first deleted for their entire mitochondrial genomes (rho-0). The successful fusion products were able to grow on nonfermentable carbon sources. While this experiment does not demonstrate that these yeast strains with an S. cerevisiae nucleus and the mitochondrial genome of a different species of yeast grow as well as S. cerevisiae strains with their own mitochondrial genomes, it clearly demonstrates that all processes required for wild type respiration do proceed at some rate in these strains. This includes the splicing of all of the oxi3 introns in these strains. An examination of Table 7 shows that not only do these strains contain introns not found in laboratory strains of S. cerevisiae, but some of them also lack introns which are homologous to the first, second, fourth, and last introns found in all laboratory strains of S. cerevisiae. The second reason for assuming that these various mitochondrial genomes are functionally equivalent is that the wild type diploid industrial strain of S. cerevisiae examined in this work also contains aI3b and does not contain a homolog of aI4a. It is reasonable to assume that the various mitochondrial genomes of different strains of S. cerevisiae can function nearly equally well in a variety of S. cerevisiae nuclear backgrounds. (To assume the converse is to assume that there are barriers to the viability of progeny of certain outcrosses.)

Since mitochondrial genomes of a number of different species of yeast function well in S. cerevisiae nuclear backgrounds, and since mitochondrial genomes of different S. cerevisiae strains themselves are very heterogeneous, there is no reason to reject the hypothesis that all mitochondrial genomes examined in this study are functionally equivalent. If one makes this assumption, then by examining Table 7 one can see that all of the introns found in the oxi3 gene are optional except aI3a. Therefore, there are eight optional introns in the oxi3 genes of yeast species.

There is no reason for believing that aI3a is required either. Some of the optional introns are more common than others. For example, aI4a, aI4b, and possibly aI5a are not widely distributed. On the other hand, aI5g is found in all strains examined except S. ellipsoideus. The case may be that a wild type strain which lacks aI3a has simply escaped detection. Very late in this study, it was learned that Dr. P.Q. Anziano (personal communication) has sequenced the relevant portion of the oxi3 gene of a strain which has deleted aI3a. This strain was a revertant of a mitochondrial mutation. It grows on nonfermentable carbon sources. Thus, it appears that aI3a is optional also.

IV.A.2. Oxi3-I3b is a divergent class I intron

In the Results section, I attempted to establish that aI3b is a class I intron. The evidence for this is that aI3b contains homologs of the seven conserved cis-acting sequences which characterize all class I introns. These conserved cis-acting homologs are found in the same order as in other class I introns. Of these conserved sequence elements, E (box 9R') and P are found close together as are R and E' (box 9L and box 9R, respectively). This arrangement is also typical of other class I introns. In addition, the last base of the 5' exon is a "T" and the last base of the intron is a "G" as is typical of other class I introns. Finally, aI3b contains a long ORF which is conserved in length, structure (LAGLI-DADG), and sequence with ORFs found in other class I introns. This ORF shares amino acid homology with the ORF of S. cerevisiae cob I4, which along with the Tetrahymena intron is probably the most widely studied class I intron.

While aI3b is clearly a member of the class I family of introns, it has several features which distinguish it from the other members of its class. The first distinguishing characteristic is its length. At a little over 1.9 kb, aI3b is the third longest class I intron yet to be sequenced. The longest class I intron (2.5 kb) is the second intron of the P. anserina URF1 gene. As will be discussed in a later section, the URF1 intron appears to have suffered at least one insertion event which added about a kilobase to its sequence. The second longest class I intron which has been sequenced is the 2295 nucleotide intron in the L-rRNA genes found in the mtDNAs of N. crassa and A. nidulans (Burke and RajBhandary, 1982). This intron is also unusual in that it encodes the 426 amino acid long protein, S5, which is a protein subunit of the mitochondrial small ribosomal subunit. The rest of the introns in this class are all smaller and average about 1.3 kb in length. As a consequence of its length, some of the conserved cis-acting sequences must be spaced further apart from each other than is observed in other class I introns. The best example of this in aI3b is the distance between Q and R (box 9L). In most class I introns, this distance is less than 100 nucleotides. In aI3b it is almost 800 nucleotides. In the model of class I introns proposed by Waring and Davies (1984), when there are large distances between conserved sequence elements, these long sequences contain the majority of the ORFs of these introns. These long spaces between the conserved sequence elements are called "maturase loops". These "maturase loops" are sequestered from the core structure of the intron by long stem structures. This is clearly not the case for aI3b (Figure 36). First of all, the vast majority of the ORF is 5' of Q. Second, while a short stem can be formed between the sequences just 3' of Q and just 5' of R (box 9L), this stem is not particularly impressive. Third, other conserved sequence elements (IG, E, and P) are scattered throughout the intron ORF, thus making a "maturase loop" impossible.

The second distinguishing feature of aI3b is the possibility that this intron incorporates novel secondary and tertiary interactions into its core structure which are not used by other class I introns. These additional interactions may be related to possible problems in folding the core structure of this intron which are also related to its length. The first of these possible additional interactions involves sequences located between E' (box 9R) and S (box 2) (labelled A and B in Figure 36). These sequences may form short stems with sequences near the internal guide (IG) sequence. If they do interact, they most probably have to compete for this interaction with another sequence located 5' of E (stem P2). If this interaction does take place, then it probably has to be dynamic as the intron structure shifts from one stem structure to the other. This possible interaction has apparently not been noticed for any other class I introns in that it is not mentioned in the literature. An examination of five other class I introns reveals that strong local interactions probably prevent similar interactions from occurring in most class I introns. One exception to this general rule occurs in the N. crassa URF1 intron in which the six nucleotides between the IG and E can base pair with six nucleotides which are 3' of E' (Burger and Werner, 1985).

The second of these possible additional secondary interactions occurs between a sequence which is 3' of S (box 2) and a sequence which is about 250 nucleotides 3' of Q (Figure 37). The core of this possible interaction is a perfect 13 base stem. These two sequences are unusual in that one has a series of 10 consecutive "A" residues while the other has a series of 15 consecutive "T" residues. No other strings of identical bases of comparable length occur in this intron. Therefore, other comparatively long sequences are precluded from forming long stems with these sequences.

Both of the possible additional interactions discussed in the last two paragraphs are only hypothesized at this point. If these interactions do occur, then mutants which block or hinder splicing could be identified in these regions. Genetic analysis probably represents the surest way to test whether these possible interactions actually occur. If they do occur, then they may help to fold this extremely large class I intron into a core structure similar to that of the other class I introns.
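Before any genetic test, a candidate stem such as the 13 base stem described above can be checked for base-pairing potential directly from the sequence. The following is a minimal sketch that finds the longest run of consecutive Watson-Crick pairs two DNA strands could form in antiparallel orientation. The function names and the test strings are hypothetical; the actual aI3b sequences are not reproduced here, and G-T wobble pairs are deliberately ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_perfect_stem(strand_a, strand_b):
    """Length of the longest uninterrupted Watson-Crick stem between two strands.

    Implemented as the longest common substring of strand_a and the reverse
    complement of strand_b, using a rolling dynamic-programming row.
    """
    target = reverse_complement(strand_b)
    best = 0
    previous = [0] * (len(target) + 1)
    for base in strand_a.upper():
        current = [0] * (len(target) + 1)
        for j, other in enumerate(target, start=1):
            if base == other:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # A run of 10 A residues against a run of 15 T residues (made-up context).
    print(longest_perfect_stem("A" * 10, "T" * 15))
```

A run of 10 "A" residues can pair with at most 10 of the 15 "T" residues, so a 13 base stem has to include pairs contributed by the flanking bases.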

The third distinguishing feature of aI3b is the sequences of the conserved cis-acting elements themselves. As shown in Tables 2 and 3 and as described by others (Davies et al., 1982, and Waring and Davies, 1984), four of the conserved cis-acting elements have conserved sequences as well. These elements are P, Q, R, and S. The conserved sequences for P, Q, and R are each ten nucleotides long. The conserved sequence for S is 12 nucleotides long. In aI3b, only 7, 5, and 5 of the nucleotides in P, Q, and R (respectively) match the consensus sequences. For S (box 2), the match is 8 out of 12. P and Q are not as highly conserved among class I introns as are R and S. For example, in aI5a the matches for P and Q are only 6 and 4 nucleotides out of 10, respectively. Oxi3 I5b is worse with only 3 and 5 matches out of ten for P and Q, respectively. The key characteristic of P and Q is that they can form short stem structures. In this respect, aI3b is typical of other class I introns. What is unusual is the degree of divergence found in aI3b relative to the consensus sequences for R (box 9L) and S (box 2). Of the 22 nucleotides in these two conserved sequences only 13 match the consensus sequences. Taking the following five introns as an example, aI5a, aI5b, cob I5, the A. nidulans cob intron, and the Tetrahymena intron match 17, 20, 17, 22, and 20 of the nucleotides in the consensus sequences, respectively. No other class I intron except the second intron in the Podospora URF1 gene and the first intron in the N. crassa cob gene is as poor a match for the combined consensus sequences for R and S as is aI3b. The Podospora intron only matches 12 out of 22. As will be discussed later, that intron is divergent for several reasons. The Neurospora intron is striking because the R and S homologs of this intron are identical to those of aI3b (Burke et al., 1984). Furthermore, these two conserved elements are located near the 3' end of the intron as they are in aI3b. Beyond this, however, the similarities break down. The Neurospora cob intron is only 1.3 kb long and lacks an ORF. It is extremely interesting, however, that the Neurospora intron has been shown to be self splicing (Garriga and Lambowitz, 1984).

The possibility that aI3b might also be self splicing is very exciting and should be investigated.
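The match counts quoted in this section (for example, 13 of 22 positions for R and S combined) are simple position-by-position comparisons against a consensus. A sketch of that bookkeeping is given below. The function names are hypothetical, and the sequences in the example are placeholders, not the published consensus sequences for any class I element.

```python
def consensus_matches(element, consensus):
    """Count positions at which an element agrees with its consensus sequence."""
    if len(element) != len(consensus):
        raise ValueError("element and consensus must be the same length")
    return sum(1 for a, b in zip(element.upper(), consensus.upper()) if a == b)


def combined_match(pairs):
    """Sum matches over several (element, consensus) pairs.

    Returns (matched_positions, total_positions), e.g. for R and S together.
    """
    matched = sum(consensus_matches(element, consensus) for element, consensus in pairs)
    total = sum(len(consensus) for _, consensus in pairs)
    return matched, total


if __name__ == "__main__":
    # Placeholder sequences only; they are not real box 9L or box 2 sequences.
    matched, total = combined_match([("ACGTACGTAC", "ACGTTCGTAA"),
                                     ("GGCCTTAAGGCC", "GGCCTTAAGGCC")])
    print(f"{matched} of {total} positions match the consensus")
```

Counting this way treats every consensus position as equally informative, which is the same simplification implied by the tallies in the text.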

IV.A.3. Evidence for lateral translocation of organelle introns between species

Unless one is willing to make the assumption that all the introns within the two recognized classes of organelle introns are the result of convergent evolution, one must assume that the various members of these two classes have arisen by means of various duplication events. As will be discussed in the next section, these events by themselves represent an important group of the insertion events which have shaped the extant forms of the organelle genomes in question. The URF near oli2 (Hensgens et al., 1983) may represent the remnants of a failed attempt to duplicate a class I intron. Conversely, the striking similarity between cob I4 and aI4a and between aI1 and aI2 in strains of S. cerevisiae may represent more recent successful duplication events.

Deletion events which cleanly excise entire introns have also been observed (Lambouesse and Slonimski, 1983, Gargouri et al., 1983, Hill et al., 1985, and P. Anziano, personal communication). These deletions were obtained in mitochondrial revertants of mutants which block splicing of one particular intron. The defect was remedied by cleanly deleting the defective intron.

One could imagine that the number of introns in any mitochondrial genome is the result of a balance between relatively rare deletion and insertion events which involve entire introns. There are two reasons why this simple model which involves only intramolecular recombinations cannot explain the current distribution of introns in organelle genomes. The first reason is that, assuming that there is no selection for or against their presence, the number of introns within the organelle genome of any species should be subject to random drift over long periods of time. In a random drift model, one would expect that the number of introns within the organelle genome of any species would eventually drift to none. Once no introns occur, there is no way to reintroduce them by intramolecular recombination. As discussed in the Introduction, there are introns in all fungal mitochondrial genomes so far examined. While the number of mitochondrial introns varies among species, there are no known species in which the number is zero. There are also introns in all of the plant organelle

genomes which have been closely examined. (The one possible exception is the mitochondrial genomes of dicot plants; however, only one gene has been examined in two species of this group (Kao et al., 1984). This sample size is too small to arrive at any conclusion.) There are no known introns in the mitochondrial genomes of metazoans or protista; however, these may represent special cases resulting from the highly unusual organization and structure of these particular genomes. It may also be that the nuclear genomes of metazoans and protista simply do not encode the products required for organelle intron excision which other nuclear genomes do. Regardless, in the absence of selection, one would expect that the number of organelle introns would eventually become fixed at zero if new introns could only be introduced by intramolecular duplication. Given the extremely long time since the divergence of eucaryotes, one would have expected that many species would have lost their organelle introns. This has not occurred. While the sample size is still relatively small, the data are currently consistent with the observation that if one member of a taxonomic group has introns in its organellar genome, all the members of that taxonomic group will have organellar introns.
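The drift argument can be illustrated with a toy simulation in which zero introns is an absorbing state, because duplication requires an existing intron. This is only a sketch under assumptions that are not in the dissertation: one gain or loss event at most per step, fixed probabilities, and no selection. All names and parameter values are hypothetical.

```python
import random


def simulate_intron_count(start, p_gain, p_loss, steps, rng):
    """Random walk on intron number with an absorbing state at zero.

    At each step one intron is lost with probability p_loss or gained by
    intramolecular duplication with probability p_gain. Once the count
    reaches zero nothing is left to duplicate, so the walk stops.
    """
    if p_gain < 0 or p_loss < 0 or p_gain + p_loss > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    count = start
    for _ in range(steps):
        if count == 0:
            break
        draw = rng.random()
        if draw < p_loss:
            count -= 1
        elif draw < p_loss + p_gain:
            count += 1
    return count


def fraction_without_introns(trials, start, p_gain, p_loss, steps, seed=0):
    """Fraction of simulated lineages that end with no introns at all."""
    rng = random.Random(seed)
    lost = sum(
        1 for _ in range(trials)
        if simulate_intron_count(start, p_gain, p_loss, steps, rng) == 0
    )
    return lost / trials


if __name__ == "__main__":
    for steps in (100, 1000, 10000):
        print(steps, fraction_without_introns(500, 5, 0.01, 0.01, steps))
```

With equal gain and loss rates, the fraction of lineages fixed at zero can only grow as the number of steps increases, since no lineage can leave that state. An outside source of introns is what would remove the absorbing state in such a model.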

The second reason that intramolecular recombinations cannot explain the currently observed distribution of organelle introns is that there appear to be many examples of introns having close homologs in widely divergent species. Three examples will be given to illustrate this point. The first is the relationship between the second intron in the Schizosaccharomyces pombe cox1 gene and the third intron in the cox1 gene of Aspergillus nidulans (Lang, 1984). These two class I introns are inserted into the exon sequences in exactly the same place. Both of these introns also have long ORFs which are continuous with the preceding exon and contain the LAGLI-DADG sequences, which are typical of class I intron ORFs. Furthermore, these intronic ORFs are more homologous to each other (70% amino acid homology from the first LAGLI-DADG to the end of the ORFs) than are the surrounding exons. In fact, there is a 102 amino acid long region of these ORFs in which the homology is 88%; by contrast, the exons are only 58% homologous between these two species. For comparison, the cox1 gene of humans shares 53% amino acid homology with the S. pombe cox1 product. The two introns in question here are inserted into the 3' end of a domain in the cox1 gene which is very highly conserved between species. Not surprisingly, the 60 nucleotide long sequence which is just 5' of these introns is 82% homologous between these species. What was unexpected was that the 60 nucleotide exon sequence on the 3' side shares 70% nucleotide homology but only 45% amino acid homology. This led Lang (1984) to postulate that when the horizontal transfer of this intron occurred between these distinct taxonomic groups, a portion of the flanking exons was also transferred.

Lang also postulated that the direction of the transfer was from a species related to A. nidulans into a species related to S. pombe. This specific hypothesis was proposed because S. pombe rarely uses the codon UGA for tryptophan. This codon is frequently used by other fungi including A. nidulans. UGA is used twice in the cox1 gene intron ORF of S. pombe. This is not a particularly strong argument since the cob intron and the URFa of S. pombe also contain UGA codons in their ORFs. On the other hand, these are the only UGA codons in the S. pombe mitochondrial genome. To accept Lang's hypothesis one must at least suspect that these other two ORFs were also imported into the S. pombe mitochondrial genome. An alternative possibility is that UGA may simply be a preferred codon for intron ORFs in this species. Other codons are preferentially used in intron ORFs in other species of yeast (Hudspeth et al., 1982).

The second example of a possible horizontal transfer of an intron between species concerns the similarities among the chloroplast tRNA introns of flowering plants. One pair of introns is in the tRNA-UUA-Leu genes of corn (Steinmetz et al., 1982) and the broad bean (Bonnar et al., 1984). These two class I introns are 458 and 451 nucleotides long, respectively. They encode no significant ORFs. Over most of their length, they are roughly 85% homologous. The region of lesser homology is in the middle of the introns and largely results from a small number of short insertions and deletions relative to the other sequence. Being tRNAs, the exons are very (96%) homologous to each other. Information is lacking on the overall divergence of corn and bean chloroplast genomes, but given that the monocot and dicot lines diverged about 100 million years ago, the high degree of homology between these introns suggests that these introns may have diverged after the divergence of the flowering plants which contain them.

The next four tRNA introns which will be discussed are in the genes for tRNA-Ile and tRNA-Ala which are found in the spacer regions between the rRNA genes of both corn and tobacco chloroplast genomes (Koch et al., 1981, and Takaiwa and Sugiura, 1982). Each of these four genes contains a class II intron, all of which are over 50% homologous to each other. The homology is higher when one compares the introns only with the other intron found in the same species. The differences between these various introns can largely be explained by a series of deletions and insertions. All of these authors agree that these introns are evolutionarily related. The question is how they have come to be in their current locations. At least one duplication event is required to get these introns in two different tRNA genes. This could have been (and probably was) an intramolecular rearrangement. The point of interest here, however, is whether the introns diverged from a common ancestor before or after the divergence of flowering plants. There is no way to distinguish between these two possibilities at this time.

The third example of a possible horizontal translocation of an intron between species can be seen by comparing cob I3 of S. cerevisiae (Haldl, 1985) and the cob intron of A. nidulans (Haring et al., 1982). As discussed in the Results section (Figure 41), these two introns have a long ORF which is continuous with the preceding exon. Except for their extreme 5' end, these two ORFs are homologous and colinear over their entire length. This homology is best in the region around the first LAGLI-DADG sequence and the very end of the ORFs. In these regions, the ORFs are more homologous to each other than are their respective exons (70 and 67%, respectively). The exon amino acid homology for the cytochrome b apoprotein between these two species is 61% (Waring et al., 1981). In addition, with the exception of E' (box 9 right), all of the conserved cis-acting sequences are conserved in sequence and position. These two introns are also inserted into exactly the same place in their respective exon sequences. No other intron in the A. nidulans mitochondrial genome shows any extensive homology with any other intron from S. cerevisiae (Waring et al., 1984). While this example is not as convincing as the example involving the S. pombe intron, it seems likely that these two introns shared a common ancestor more recently than the divergence of the two species in which they are found. These examples of pairs of homologous organelle introns from divergent species lend support to the hypothesis that, on rare occasions, some of these introns are capable of moving laterally across species boundaries and establishing themselves in a new species. If this is true, then it can explain why there are no known examples from the relevant taxonomic groups in which there are no introns in their organelle genomes. Introns can be acquired not only by duplication of introns which are already found in an organelle genome (intramolecular recombination), but also by acquiring sequences from outside its species (intermolecular recombination). Since all of these introns (with the possible exception of cob I2 of S. cerevisiae) are structurally related to introns which are autocatalytic, relatively little may be required of the host genome as the new intron first arrives, or the new intron may be able to take advantage of nuclear encoded products which help splice other introns. The fact that class I introns can be found in mitochondrial, chloroplast, eucaryotic nuclear, eubacterial phage, and archaebacterial genomes indicates that this intron form is able to adapt to the widest possible variety of genomic backgrounds. Class II introns may be similarly adaptable in various eucaryotic mitochondrial and chloroplast genomes. Similarities between class II introns and nuclear introns may also indicate an evolutionary relationship. Certainly if a genome is constantly being "bombarded" with incoming intron sequences, the possibility that a genome will ever completely lack such sequences appears to be remote even if they are being deleted at some slow rate.

The mechanism by which these lateral transfers of introns might occur is unknown; however, if lateral transfers do not occur, then it is difficult to explain the distribution of introns in extant species. One would have to invoke selection in order to maintain introns in all relevant species in the presence of occasional deletions. One would also have to invoke uneven selection pressures to maintain some introns in a more highly conserved manner than others. There is no evidence that these introns are either advantageous or disadvantageous to their host genomes; however, as discussed in the Introduction, the question of selective advantages has not been extensively investigated. To a first approximation, however, these introns appear to be neutral towards selection.

IV.A.4. Deletions and insertions are important mechanisms in the evolution of organelle introns

Both large and small deletions and insertions are important in the evolution of organelle genomes as a whole. The duplication of introns discussed in the last section is but one example of this phenomenon. The organelle genomes most prone to insertions are probably the mitochondrial genomes of flowering plants. In various species, huge sections of the respective chloroplast genomes have been inserted into the mitochondrial genomes (Stern and Palmer, 1984). Of course, introns found in the chloroplast sequences have also moved into the mitochondria. The eventual fate of these introns is unknown. The opposite of a huge insertion is a huge deletion. This, of course, is exactly what the petite mutants in yeast mitochondria are. The insertions and deletions which will be discussed here are smaller ones, in which sequences of several hundred nucleotides or less are inserted into or deleted from the middle of intron encoding sequences.

An excellent example of this kind of insertion event can be seen by examining the sequences of cob I4 (Nobrega and Tzagoloff, 1980) and oxi3 I4a (Bonitz et al., 1980) from S. cerevisiae. This comparison has been made by others (Anziano et al., 1982). It is clear from both sequence and genetic data that these two introns are closely related. It is reasonable to assume that these two introns have recently diverged by at least one duplication event from a common ancestor. This divergence may be as simple as one of these introns duplicating to form the other. The 5' ends of these introns both contain all the cis-acting sequences except S (box 9) but otherwise are not very homologous.

Both introns contain a long ORF which is continuous with the 5' exon. Beginning at the first LAGLI-DADG sequence and continuing to the end of the ORF, these two introns are highly homologous and colinear. The DNA homology in this region approaches 70%. This homology ends abruptly at the end of the ORF but picks up again just before the 3' splice boundary and includes S (box 2). The cob intron is about 0.4 kb longer than the oxi3 intron. Most of this difference can be attributed to a 213 nucleotide insertion in the cob intron relative to the oxi3 intron, which is located exactly at the end of the ORF and continues to S (box 2). This proposed insertion is roughly 86% A+T. Of course, this could instead be a deletion in the oxi3 intron relative to the cob intron. It is impossible to distinguish between these two possibilities.
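The A+T figure quoted for the proposed insertion is a simple base-composition fraction (86% of 213 nucleotides is about 183 A or T residues). A minimal sketch of that calculation follows; the function name at_content and the toy sequence are illustrative assumptions and are not taken from the cob I4 or oxi3 I4a sequences.

```python
def at_content(seq: str) -> float:
    """Return the fraction of A, T (or U) residues among unambiguous bases."""
    seq = seq.upper()
    # Ignore gaps, whitespace and ambiguity codes such as N.
    counted = [base for base in seq if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous nucleotides")
    at = sum(1 for base in counted if base in "ATU")
    return at / len(counted)


# Hypothetical 10-nucleotide toy sequence (assumption, for illustration only).
toy_insert = "ATTATAGCAT"
print(f"A+T content: {at_content(toy_insert):.0%}")
```

Applied to the actual 213 nucleotide region, the same function would be expected to return a value near 0.86 if the figure quoted above is correct.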

The second example of deletions and insertions playing an important role in the evolution of organelle introns can be seen by examining cob I3 from S. cerevisiae and the cob intron from A. nidulans (Figure 41). This comparison has been discussed in the last section (IV.A.3.) and also in the Results section of this dissertation. Basically, the situation here is very much like the comparison between cob I4 and oxi3 I4a from S. cerevisiae which was discussed in the last paragraph. These two introns are clearly closely related. The largest difference between them is a large insertion in the S. cerevisiae intron relative to the A. nidulans intron. This insertion is at the very end of the ORF and extends for several hundred nucleotides 3' until the region near S (box 2) is reached. Before this insertion the introns are nearly colinear, even though the first three hundred nucleotides are not very homologous between these two introns. This insertion is itself interesting in that it appears to be two separate insertions, one located within the other. The first of these is several hundred nucleotides long. Like the insertion discussed in the last paragraph, it is very A+T rich. The second possible insertion, which is located entirely within the first, is a G+C cluster. As was discussed in the Introduction (Hudspeth et al., 1984), some of these G+C clusters have been observed to participate in unidirectional gene conversion events, such that they are preferentially inherited in crosses in which one parent has the G+C cluster in question and the other does not. Therefore, it appears that possibly two insertion or deletion events have occurred since the two introns diverged from their most recent common ancestor. One of these involves the A+T rich sequence at the end of the ORF and the other involves the G+C cluster. There is no way to determine the order of these events. It may even be that the G+C cluster inserted itself into the A+T rich region prior to the insertion of the entire insert into the S. cerevisiae cob I3 intron.

The third example of how insertions and deletions have shaped the current form of organelle introns can be seen by examining the structure of the second intron in the Podospora anserina mitochondrial URF1 gene (Michel and Cummings, 1985, and Cummings et al., 1985). As mentioned in the last section, this is the longest known class I intron at 2641 nucleotides in length. All of the cis-acting conserved elements except S (box 2) are located in the first 300 nucleotides. The next 2248 nucleotides contain ORFs which are homologous to other class I intron ORFs. The problem is that there are multiple stop codons spread over their length. Therefore, it was concluded by Michel and Cummings that these ORFs are, in fact, [illegible]. The first and last parts (a total of 290 codons) of this intron can be aligned with the ORF of the Neurospora crassa intron found in the ATPase subunit 6 gene. The overall homology between these two ORFs was not given by the authors, but there are sections where the amino acid homology is 50% and which contain runs of 7 consecutive amino acids in which the two sequences are identical.

In order to make the alignment discussed in the last paragraph, one has to assume four insertions. Two of these are palindromic G+C clusters which also cause frameshifts which destroy the integrity of the ORF. One is small and will not be further discussed. The fourth is huge, at 1.2 kb in length. This insert divides the ORF which is homologous to the Neurospora intron into almost equal portions. There is also an ORF in this insertion. It is related by amino acid homology to aI5a of S. cerevisiae. The amino acid homology is not as good as with the Neurospora intron but still appears to be significant. There are also two frameshifts in this ORF but no insertions. The ORFs of class I introns can be divided into three groups based on amino acid homology (Hensgens et al., 1983, and Burger and Werner, 1985). The Neurospora and S. cerevisiae introns discussed here represent members of different groups. It may be significant that the ORF of aI5a is a member of the same group as the omega ORF. The ability of the ORF of omega to produce a product which can catalyse its own insertion into a new sequence is well documented (see Introduction).

One can imagine that at one time this intron contained a single and translatable ORF, which is the one which is homologous to the Neurospora intron. For some reason, this product is not required for intron excision and so could acquire mutations. What is surprising is the large number of insertions which have taken place in this region. The most interesting insertion may be the second ORF, which is homologous to the ORF of aI5a. If this kind of insertion is common, it may provide a mechanism by which new coding sequences (and therefore, possibly new functions as well) could be brought into the intron.

The next examples will be reviewed as a group. These are the tRNA genes of plant chloroplasts (Steinmetz et al., 1982, Bonnard et al., 1984, Koch et al., 1981, and Takaiwa and Sugiura, 1981) and the monocot mitochondrial intron found in the cox II gene (Fox and Leaver, 1981, and Kao et al., 1984). These introns have already been discussed in depth in the Introduction and in the last section of this Discussion. The important point which is relevant here is that all of the introns in corn which have been studied by these investigators have homologous introns in other species of plants. The degree of homology varies dramatically, from as low as about 50% for the class II chloroplast tRNA introns (Koch et al., 1981, and Takaiwa and Sugiura, 1981) to as high as 98.6% for the class II mitochondrial cox II introns (Kao et al., 1984). The surprising finding is that the major differences between the homologous introns are large insertions. In the rice cox II intron, there is a 461 nucleotide long insertion relative to the homologous intron in corn. In the class I introns of the bean and corn tRNA-UUA-Leu genes the insertions are smaller, 13 and 27 nucleotides respectively. Finally, the tobacco tRNA-Ile and tRNA-Ala gene introns (class II) differ from their corn homologs in that there are 229 and 103 nucleotide insertions in the corn introns relative to the tobacco introns. The homologies between these pairs of introns are sufficiently high to indicate that they evolved from common ancestors, regardless of whether this divergence occurred before or after the divergence of their host species. None of these introns contains a significant ORF. It may be that they can tolerate insertions without loss of function better than introns with ORFs.

Taken as a whole, the examples described above indicate that insertions and deletions played an important role in shaping the current forms of organelle introns. Because insertions and deletions alter large sections of sequence at once, they can play a more decisive role in the evolution of introns than the gradual changes inherent in accumulated point mutations.

With the preceding discussion in mind, the size of aI3b becomes more understandable. First of all, there are three G+C clusters in this intron. It is reasonable to assume that these sequences were brought into aI3b through insertion events. It would be interesting to determine if all of the aI3b homologs found in this study also contain these same G+C clusters. This would be a relatively easy experiment, since the only Hpa II sites in this intron occur in the second and third G+C clusters. By determining the distances between the Hind III site and the 3' exon Bcl I site relative to the Hpa II sites, one could quickly screen for the presence of these inserts. In addition to the G+C clusters, the large A+T rich region which extends for nearly 600 nucleotides between Q and R (box 9L) may also have been acquired as a large insertion. If these various sequences were removed, then aI3b would be only about 1.2 kb and would still have all of the cis-acting elements and its ORF.

IV.A.5. Similarities between organelle introns and infectious agents

Some organelle introns of both classes have the capability of encoding proteins. The possible functions of these proteins were discussed at length in the Introduction. Basically, the intron ORFs which have been shown by genetic evidence to encode functional products all encode products which are related in some way to nucleic acid metabolism. Some are required for the splicing of the intron which encodes them (maturases) and one is involved in unidirectional gene conversion (omega). It has been shown that all of the ORFs found within class I introns share some structural and amino acid sequence homology (Hensgens et al., 1983, and Burger and Werner, 1985). The sequence homology between the ORFs of the class I introns is not great (sometimes as low as 8%); however, the various class I intron ORFs can be subdivided into three subgroups. The various members of the subgroups are a more respectable 20 to 30% homologous at the amino acid level. The region of highest homology is the so-called LAGLI-DADG region, which is flanked by two conserved amino acid sequences and is about 115 amino acids in length.

The occurrence of this conserved region has led to the hypothesis that all of these ORFs are evolutionarily related. It has also led to the examination of other protein encoding regions to determine if the intronic ORFs are related to any proteins outside this group (Zimmer, 1983). This search was successful. Tobacco Mosaic Virus (TMV) is a 6.4 kb long single stranded RNA virus which encodes three proteins. One of these is a 30 kDa protein which genetic evidence has shown is required for spreading of the virus between host cells but is not required for virus replication. If one allows for four small insertions in the mitochondrial intronic ORF, then the LAGLI-DADG region of cob I4 from S. cerevisiae shares 24% amino acid homology with the central portion of the 30 kDa protein of TMV. It is also reported that this homology continues at a reduced percentage in the region 5' of the LAGLI-DADG region in some strains of TMV. While this degree of homology is not particularly impressive, it is in fact better than the homology between the cob I4 ORF and the ORF of aI3a from the oxi3 gene. This evidence is not strong enough to conclude that these virus and intron proteins diverged from a common ancestor, but it is sufficient to make this relationship possible.

The second group of intronic ORFs is found in four of the class II introns. These are aI1 and aI2 from the S. cerevisiae oxi3 gene, the Podospora intron which is associated with senescence in that organism, and the S. pombe cob intron. These ORFs can all encode proteins which are homologous to each other. Both yeast proteins act as "maturases" (Carignani et al., 1983, and Mecklenburg, in prep.). What is surprising is that all of these ORFs share amino acid homology with RNA dependent reverse transcriptases of viral origin (Michel and Lang, 1984). The blocks of homology seen were shared by four retroviruses, cauliflower mosaic virus, hepatitis virus, and "17.6", a transposable element. The overall amino acid homology was not high, but the regions of shared homology and their relative spacing were the same as those which linked the various reverse transcriptases to each other. Structurally, class II introns are not related to any of these virus types, and reverse transcriptases are not required for splicing of class II introns. However, the presence of such an activity may help explain how both class I and class II introns are able to duplicate themselves.

The last similarity between organelle introns and infectious agents is the similarity between class I introns and plant viroids. Viroids are the simplest infectious agents known. They are single stranded RNA molecules of less than 500 nucleotides (usually less than 400 nucleotides) (reviewed by Sanger, 1984). Two of these agents will be discussed here. These are potato spindle tuber viroid (Diener, 1986) and the satellite viroid of peanut stunt virus (Collmer et al., 1985). The interesting finding is that these infectious agents contain sequences which are homologous to P, Q, R, and S. There are also sequences which could base pair to form short stems which may be functionally homologous to E and E'. The R and E' homologs are found in the "central conserved region", which is an important characteristic of all viroids. The problem is that, if one considers the monomer of the viroid sequences, then the secondary structure of the viroids does not favor direct interaction of these sequences. This problem can be overcome if one considers the fact that in vivo the viroids are synthesized as multimers (Branch et al., 1981) which are then processed down to monomer size. Diener (1986) has used computer modeling to predict the secondary structure of viroid dimers. Surprisingly, the sequences at the ends of adjacent viroid units can interact with each other. These interactions are predicted to be energetically favored. If these interactions do occur, then the entire secondary structure of the viroid is dramatically altered. The new structure is extremely similar to the one that is typical of class I introns. After processing, the viroid structure would change and the class I intron structure would be lost. This could be a mechanism which ensures that the processing proceeds in only one direction.

The question now arises as to whether class I introns evolved from viroids or whether viroids evolved from excised introns. This question cannot be answered with the current data. The fact that organelle introns (and some nuclear introns) have catalytic activity has led several authors (Cech, 1985, Sharp, 1985, Tabak and Grivell, 1986, Gilbert, 1986, and Lewin, 1986) to speculate that the catalytic RNAs seen today are the remnants of an extremely ancient biochemistry which dates to the beginning of life itself. The catalytic activities of the Tetrahymena nuclear intron, such as the polymerase activity (Zaug and Cech, 1986), are extremely suggestive of this possibility. There are other RNA catalytic activities as well. The best example of this is the RNA moiety of the RNAse P of E. coli (Guerrier-Takada et al., 1983). It has also been argued that protein synthesis is a fundamentally RNA catalysed process (Cech, 1985). The point of this discussion is that if, in fact, viroids and class I introns share a common ancestor (as opposed to having arisen by convergent evolution), then this ancestor may be so ancient that it would not be classified as either an intron or a viroid.

IV.A.6. Future directions

In this study the structure of the oxi3 gene in 11 species of yeast in the genus Saccharomyces has been investigated. It has been found that the number and kinds of introns found in this gene are extremely variable among species (see Table 7). This has led to the conclusion that most if not all introns found in this gene are optional for respiratory functions. It has also led to the discovery of two previously unobserved introns. One of these is an unusual example of a class I intron.

There are, however, several areas of investigation which could contribute to this analysis. First, the other previously unobserved intron should be fully sequenced and its splice points clearly defined. Second, intron-specific probes should be used to more accurately define the distribution of aI5a and aI5b. Finally, it is unknown whether some of these species possess any additional new introns in the extreme 5' ends of their oxi3 genes. Answering this would require exon-specific probes, which could be used to probe Southern blots of mitochondrial DNAs which had been digested with restriction enzymes which would be expected to cut the 5' exons into several pieces. Those data could be used to determine if these exon sequences are continuous or separated by large distances. Once this was known, it would be feasible to use S1 nuclease digestions of heteroduplexes of mtDNAs from various strains of yeast to more accurately map the location of any newly observed introns.

IV.B. The beginning of rickettsial molecular genetics

The goal of this project was to determine if the various species of the genus Rickettsia could be distinguished unambiguously by restriction fragment length polymorphisms (RFLP). The reason that this goal was chosen is that the accepted serological differentiation (Philips et al., 1978) is not unambiguous. In meeting this goal, this project was successful.

The surprising finding of this investigation was that the members of the genus Rickettsia are much more closely related to each other, based on nucleotide homology, than was previously suspected. As previously discussed in the Introduction, other investigators have attempted to determine the degree of DNA homology between various members of the genus Rickettsia (Tyeryar et al., 1973, Myers and Wissemann, 1980, and Myers and Wissemann, 1981). Those workers indicated that Rickettsia rickettsii, a member of the RMSF group, is only >40 +/- 20% homologous to the Typhus group rickettsia and R. canada. They also reported that the genomes of the Typhus group rickettsia are only two thirds the size of the R. rickettsii and R. canada genomes.

In this investigation, three recombinant probes derived from the genome of R. rickettsii were used to probe Southern blots of Hind III digested R. typhi and R. canada DNA. The A+T content of rickettsial DNA is similar to the A+T content of yeast mitochondrial DNA. Using a stringency which just barely allows cross hybridization between yeast mitochondrial sequences which are 70% homologous, the R. rickettsii derived probes strongly cross hybridized to both R. typhi and R. canada DNAs. It is difficult to reconcile these observations with the previously published results. Clearly, the R. rickettsii derived probes are more than 70% homologous to the R. typhi and R. canada sequences.

It is also interesting that, if the Typhus group rickettsia genomes are significantly smaller than the R. rickettsii genome, one would have expected that at least one of the probes would have failed to hybridize to the R. typhi DNA. (If we assume that the genome size of R. typhi is two thirds that of R. rickettsii, then the probability of three random probes all hybridizing is only 0.296.) All three R. rickettsii derived probes hybridized to the R. typhi DNA. Three explanations of these data come to mind. The first is that, by chance, the R. rickettsii derived probes are from regions of the rickettsial genome which are conserved between species. The second possibility is that the previous investigations overestimated the true divergence of this group due to the high A+T content of these genomes. These investigators did attempt to take this factor into account, but perhaps the adjustments used were insufficient. Finally, the genome size of R. typhi may have been underestimated. Regardless of the reason for the observed inconsistencies, it is clear that there are regions in the R. rickettsii, R. canada, and R. typhi genomes which are more homologous than previously estimated.
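The probability quoted above follows from treating each probe as an independent random sample of the R. rickettsii genome, each with a two thirds chance of falling in sequence that is also present in R. typhi. A minimal sketch of that arithmetic is given below; the independence of the probes and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def prob_all_probes_hybridize(shared_fraction: Fraction, n_probes: int) -> Fraction:
    """Probability that every one of n independent random probes lands in
    sequence shared by both genomes (assumes independent, random probes)."""
    return shared_fraction ** n_probes


# Two thirds of the R. rickettsii genome assumed present in R. typhi; three probes.
p = prob_all_probes_hybridize(Fraction(2, 3), 3)
print(p, round(float(p), 3))  # 8/27, i.e. about 0.296
```

Under the same assumptions, the complementary probability that at least one probe fails to hybridize is 19/27, or about 0.70.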

Another surprising finding of this investigation was that the members of the RMSF group are extremely homologous to each other. Using conserved restriction sites as an indicator of homology, it was shown that most of the members of the RMSF group are less than 9% divergent from each other at the nucleotide level. Many of these species were less than 3% divergent. The one member of this group which was more divergent was R. akari. This species also has a different invertebrate host range than the rest of the group. These studies indicate that R. bellii is not closely related to the RMSF group and that R. canada and R. typhi are also not closely related.
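One common way to convert shared restriction sites into a nucleotide divergence estimate is the site-based estimator of Nei and Li (1979). The text above does not state which estimator was used, so the sketch below is an assumption offered only to illustrate the kind of calculation involved; the function names and the example counts are hypothetical.

```python
import math


def shared_site_fraction(n_shared: int, n_x: int, n_y: int) -> float:
    """Fraction of restriction sites shared by genomes X and Y: S = 2*n_xy / (n_x + n_y)."""
    return 2 * n_shared / (n_x + n_y)


def divergence_from_sites(s: float, site_length: int = 6) -> float:
    """Estimated substitutions per nucleotide, d = -ln(S) / r, where r is the
    length of the enzyme recognition sequence (6 for Hind III)."""
    if not 0 < s <= 1:
        raise ValueError("shared fraction must be in (0, 1]")
    return -math.log(s) / site_length


# Hypothetical counts: 9 shared sites out of 10 mapped in each genome.
s = shared_site_fraction(9, 10, 10)
print(f"S = {s:.2f}, d = {divergence_from_sites(s):.4f}")
```

Because the estimate depends on the logarithm of the shared fraction, small numbers of mapped sites give coarse divergence values, which is one reason additional digests and probes would tighten the estimates.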

This work has now been largely completed. It could be expanded upon by adding additional strains to the survey and by including R. prowazekii. It is already known, however, that R. prowazekii is closely related to R. typhi. The survey could also be expanded by adding additional restriction digests and additional probes. While this would increase the data base and decrease the uncertainty in the predicted divergence values, I am not sure that any of this would change the basic findings of this work. The technology has been developed to use molecular techniques to investigate the Rickettsia.

Currently, other workers (Karl Poetter and Jon Clark) are investigating the intraspecific variation in R. bellii and the structure and sequence of the rickettsial 16S rRNA. The analysis of the rickettsial 16S rRNA will doubtlessly clarify the relationship between the Rickettsia and other eubacterial groups. An interesting question concerns the intraspecific variation of various isolates of R. rickettsii. R. rickettsii was unknown in the eastern half of the United States until it was observed in North Carolina in the 1930's. Since then it has become a problem of great public health concern over the entire eastern United States. The question which most workers with R. rickettsii would like answered is where the eastern RMSF agents came from. More specifically, are the various isolates of R. rickettsii from the eastern half of this country clonally related? The reason this is interesting is that the answer to this question would allow one to distinguish between the possibility that the outbreak of RMSF originated with the introduction of a new pathogen into the eastern United States about 1930 and the possibility that RMSF had always been there, undetected in the animal populations, and an environmental change led to the outbreak.

There are many other interesting questions which can be approached by studying Rickettsia. These concern basic rickettsial metabolism and the extent of the host-symbiont interactions. The similarities between Rickettsia and the predicted endosymbiont which later evolved to become the modern mitochondrion are described in the Introduction. It has been shown (Anaker et al., 1982) that members of the RMSF group can persistently infect vole derived tissue culture cells without killing them. Investigation of the metabolism of these persistently infected cell lines relative to that of uninfected cells and of cell lines (such as Vero cells) which are destroyed by Rickettsia can yield valuable information. If I were designing these experiments, I would want to investigate the aerobic metabolism of these cells. Specifically, I would investigate the rate of mitochondrial and rickettsial oxygen consumption in infected and uninfected cells. This can be easily done using drugs which selectively inhibit either mitochondrial or rickettsial respiration. An interesting question is whether Rickettsia can substitute for or replace mitochondrial functions in inhibited cells. Similarly, can Rickettsia survive entirely by scavenging ATP from their hosts if their oxidative functions are inhibited?

Another experiment that I would suggest is to clone the NAD translocator from a rickettsial species (Atkinson and Winkler, 1981). E. coli will recognize and express some (perhaps all) rickettsial promoters (Wood et al., 1983, Krause et al., 1985, and Greg McDonald, personal communication). Energy metabolism plays an important regulatory role in rickettsial organisms. The citrate synthase and ATP/ADP translocator genes have already been cloned. The NAD translocator may be another important link in this system of regulation (see Introduction). It also appears unusual to me that no protein encoding gene from rickettsia has ever been sequenced. Such sequences would yield valuable information about the molecular evolution of rickettsia in general. They might also reveal unique or interesting features of rickettsial genes as compared to those of other organisms.

TABLES

Table 1. P and Q sequences of class I introns

Lists of P and Q homologs found in the sequenced class I introns are presented. Also included are the consensus sequences for P and Q as derived from the sequenced homologs. Numbers over the individual letters representing the consensus sequences indicate how many of the sequenced homologs actually have a particular nucleotide shown in the consensus sequences.


TABLE 1 Conserved P and Q Sequences from Class I Introns

Xntron t Rafaranoa SoC0B3(M) AOGCDGAAAO OAGCAGOAAO Anzlano at al.. In prap. D SoCOB5(M) OOGCAAGAAA AACDOGCAGC Hobraga at al., 1980

3oC0B4(H) AUGCUGGAAA AAOCAGCAGG

3oOX4(M) ADGC000AAA AA0CA0CA0G Bonita at al., 1980

SoOZ3(H) AOOCOGOAAA AAOCCOCAGG

3oOZ5(H) C0C0A0GAAA AACCUCQAAC Banagana at al., 1983

3oOX6(M) AACOOGGDOO AAOCCCGUGA

SoRNA(M) UOGCDOAGAU AAOAAGCAGC DuJon, 1980

KtrRNA(M) DOQCAAOAAU AADOOGCAGC Hlobal at al., 1982

AnOX1(H) ADGCOOGAAA AAOAOGCAOG Haring at al., 1984

AnOX2(M) AOACDOQAAA AAOCAODAOG

AnOX3(M) AOQCOGOOAC OAOCAGCAGG

AnCOB(H) AOOCOGOAAA AAOCAOCAGO Haring at al., 1982

AnRNA(M) OOOCAAAAAO OAODOGAAGC Matskat at al., 1982

MoRHA(M) OOOCAAAAAO AAOOOGAAGC Burka and SaJBbandary, 1982 267

Tabla 1 (oontlnuad) 00 MoOLX2(M) AOGCDGOGAA AAUCAOCAOO Moralli and Maolno, 1984

NoORFI(M) 0D0C00GAAA AA0CA0CAA0 Bur ', 1985

NoCOBI(M) ACOOOOAAAA OAAOACCGOO Halner-Cittarlob at al., 1983

NoC0B2(H) DOACAAGGAC OAOOOGDAGC

PaOI-1(M) OOGACGOGGA AAOCCOCAGC Mlobal and Cuaalnga, 1985

Pa01-2(M) COACOGGGAA AADCOCGOAO

Pa01-3(N) OOOCAAOAAA AACOOGAAOC

SpoOX2(M) AUGCOOOAAC OAOCAOCAGO Lang, 1984

VftRNA(C) OOCAGAOAAA AAOCCOOAGC Bonnard at al., 1984

ZatRBA(C) OOCAGAOAAA AAOCCOOAGC

TtrRNA(N) OOGCGGGAAA AACCAGCAOC Kan and Gal, 1982

PprRNA(H) AOGCOOAOAC OADCAGCAAO Roalyana at al., 1981

SoaOZ3b(H) AOACAGGTAA AAGCOOGOOC aaa Plgura 29

15 26 17 22 13 16 22 19 26 18 Conaanaua P A 0 0 C 0 0 0 A A A 0 A,0 A A 0 0 12 8,6 6 6 7 6

20 28 21 18 14 22 17 24 21 13 Conaanaua Q A A 0 C A 0 C A G G 0 6 5 2 S A C 8 6 8 4 5 5 12 268

Table 2. R and S sequences of class I Introns

Lists of R and S (box 9L and box 2) homologs found in the sequenced class I introns are presented. Also included are the consensus sequences for R and S as derived from the sequenced homologs. Numbers over the individual letters representing the consensus sequences indicate how many of the sequenced homologs actually have a particular nucleotide shown in the consensus sequences.

TABLE 2 Conserved R and S Sequences from Class I Introns

Intron II Rafaranoa SoCOB3(H) OCAGAGACOO AAGAOAOAODCC Anzlano at al., In prap

SoC0B5(M) OCAACGACOA AGGACADAGOCD Nobrega at al., 1980

ScCOBA(M) DCAGAOACOA AAGADADAGUCC

SoOXA(M) UCAGAOACDA AAGAOAOAODCC Bonita at al., 1980

SoOX3(M) OCAOAOACOA AAOAOAOAGOCC

3oOX5(M) QOAGAGAODA AAGODAOAAOCC Banal I.. 1983

SoOX6(M) GOAGAGACOA AAOAOAOAGOCC

SoRMA(M) OCAACGACOA AOGAOAOAOOCO Dujon, 1980

KtrRNA(H) OCAACGACOA AOGACAOAODCD Hiohal at al., 1982

AnOXI(M) OCAGAGACOA AAOACAOAGOCC Baring at al., 198A

AnOX2(M) OCAGAGACOA AAAAOAOAGOCC

AnOX3(M) OCAGAOACDO AAGAOAAAOOCC

AnCOB(H) DCAGAOACOA AOOAOADAGOCC Baring at al., 1982

AnRNA(H) OCAACGACOA AAGAAAOAGOOO Batakat at al., 1982

RcRHA(H) OCAACGACOA AAQAAAOAOOCO Burka and ReJBhandary, 1982 270

Table 2 (eontinued)

HoOLX2(H) OCAGAGACOA AAGADAUAGDCC Moralli and Maolno, 1984

NoORM(M) OCAOAGACOA AAGGDAUAGUCC Burger and Werner, 1985

NoCOBI(H) COACAGACQO AAOGOACAQOCG Belaer>Cltteriob et al., 1983

NoCOB2(M) OCAACGACOA AAGAOAOAGOCO

PaOI-KM) OCAOCOACOA AAGAOAOAGOCA Hlohel and Cunnings, 1985

Pa01-2(M) OCAACOAGCA AAGGOGDGCDCO

Pa01-3(H) OCAACGACOA

SpoOZ2(H) CDAOAOACOO AAAAOAAAGOCC Lang, 1984

VftRHA(C) GCAGAOACTC AAGAOAGAOOCC Bonnard et al., 1984

ZatRNA(C) OCAGAGACDC AAOAGAGAODCC •

CrRMA(C) OGAOAOACOD AAGACAUAGOCC Roohaix et al., 1965

TtrRNA(N) DCACAGACDA AAGADAGAGOCC Kan and Gal, 1982

PprRMA(N) DCAACGACDG AAGGOOCAGUCC NoBlpaBa et al., 1981

A DaRNA(A) d c a o a o a c t a ACOAGAAAOAG Kjeaa and Garrett, 1985

SoaOX3b(M) COACAGACUG AAOGOACAGDCG aee Figure 29

21 2* 30 18 20 30 30 28 29 21 Conaenaua R 0 C A 0 A 0 A C 0 A 0 0 A C 0 6 6 9 10 5

29 21 24 24 20 28 21 28 26 28 27 19 Conaeneua 8 A A G A V A 0 A 0 0 C C

S * 7° 271

Table 3. Yeast strains used in this study

The following yeast strains were used in this study.

Table 3: Yeast species examined

Yeast Species                 Strain

Saccharomyces cerevisiae      IDM1-6/161
S. cerevisiae                 NRRL Y-12,632
S. capensis                   NRRL YB-H237
S. aceti                      NRRL Y-12,617
S. coreanus                   NRRL Y-12,637
S. diastaticus                NRRL Y-2H16
S. hienipiensis               NRRL Y-6677
S. norbensis                  NRRL Y-12,656
S. oleaceus                   NRRL Y-12,657
S. uvarum                     Nova Scientific
S. ellipsoideus               Nova Scientific

Table 4. Rickettsial strains used in this study

The rickettsial isolates used in this study are shown. All rickettsia used in this study were the kind gifts of either Dr. W. Burgdorfer or Dr. G. Dasch.

Table 4: Rickettsial species examined

Species Name                  Strain designation

Rickettsia akari              RML str. 29
R. australis                  NQTT-1,H2292
R. bellii                     RML 12-T
R. canada                     MK17
R. conorii                    ATCC VRS97
R. montana                    RML M5/6
R. parkeri                    RML SF110
R. rhipicephali               RML #42
R. rickettsii                 RML Sawtooth
R. sibirica                   Bozeman 232
R. typhi                      RTW-46
Ehrlichia risticii            Maryland strain
Ehrlichia sennetsu            Japanese strain

Table 5* Zero mixes

The zero mix series used in dideoxy sequencing as moidified from the description given in the BRL sequencing manual were modified and used as shown.

Table 5 Zero Mixes

Microliters Added                 A zero   C zero   G zero   T zero

0.04 mM dATP                         4        1        1        1
0.5 mM dCTP                         20        1       20       20
0.5 mM dGTP                         20       20        1       20
0.5 mM dTTP                         20       20       20        1
10x Polymerase reaction buffer      20       20       20       20

Table 6. Termination mixes

DideoxyNTPs were added as shown to the zero mixes shown in Table 5 to make the termination mixes used in dideoxy sequencing performed in this investigation.

Table 6 Termination Mixes

Termination mix "A" = 20 microl. of 0.5 mM ddATP + 20 microl. of A zero

Termination mix "C" = 20 microl. of 0.5 mM ddCTP + 20 microl. of C zero

Termination mix "G" = 20 microl. of 0.5 mM ddGTP + 20 microl. of G zero

Termination mix "T" = 20 microl. of 0.67 mM ddTTP + 20 microl. of T zero
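The recipes in Table 6 are simple two-component dilutions, so the ddNTP concentration in each finished termination mix follows from the volumes alone. The sketch below is illustrative only: the function and dictionary names are hypothetical, and it assumes that the zero mix contributes no ddNTP, which is consistent with Table 5. The values are concentrations in the termination mix, not in the final sequencing reaction.

```python
def final_concentration_mM(stock_mM, stock_ul, diluent_ul):
    """Concentration of a stock after mixing stock_ul with diluent_ul."""
    return stock_mM * stock_ul / (stock_ul + diluent_ul)


# (ddNTP stock in mM, ddNTP volume in microliters, zero mix volume in microliters)
# Values are taken from Table 6; the dictionary itself is a hypothetical helper.
TERMINATION_MIXES = {
    "A": (0.5, 20, 20),
    "C": (0.5, 20, 20),
    "G": (0.5, 20, 20),
    "T": (0.67, 20, 20),
}


def ddntp_concentrations():
    """Return the ddNTP concentration (mM) in each termination mix."""
    return {
        base: final_concentration_mM(*recipe)
        for base, recipe in TERMINATION_MIXES.items()
    }


if __name__ == "__main__":
    for base, conc in ddntp_concentrations().items():
        print(f"dd{base}TP in termination mix {base}: {conc:.3f} mM")
```

Because each mix combines equal volumes, every ddNTP ends up at half of its stock concentration.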

Table 7. Distribution of oxi3 introns

The distribution of introns in the oxi3 gene, as deduced from the data shown in Figures 8, 9, 10, 11, 12, 19, 20, and 21, is summarized in this table.

Table 7

Summary of oxi3 Introns in 11 Species and Strains of Saccharomyces Yeast

Name of Intron

                 aI1   aI2   aI3     aI3    aI4     aI4    aI5     aI5    aI5
Yeast Species                alpha   beta   alpha   beta   alpha   beta   gamma

161              ^ +
hienipiensis     + +
ellipsoideus
coreanus
norbensis        -
oleaceus         ♦ + + +
capensis         +
aceti            ? ? + + - + . + +
diastaticus      + - + - + + ~ + +
uvarum           - + + - - .... 4.
cerevisiae       + 444--444

Table 8. Codon usage in aI3b's ORF

The amino acid composition and the codon usage as deduced from the nucleotide sequence of aI3b are shown. Also shown are the codon usage data for the structural genes and intronic ORFs of S. cerevisiae mtDNA and the codon usage of the var1 gene (Hudspeth et al. 1982).

Table 8 Codon Usage in aI3b's ORF

Amino acid   Codon   Genes    ORF   var1   aI3b

Ala          GCA        53     30      0      0
             GCU        66     58      0      5
             GCC         5     10      1      0
             GCG         2      7      2      0

Arg          AGA        37    121      3      3
             AGG         0      7      0      2
             CGA         0      0      0      0
             CGU         0     14      1      1
             CGC         0      1      0      0
             CGG         0      1      1      1

Asn          AAU        62    356   121*     45
             AAC         9     26      2      0

Asp          GAU        41    137     10     17
             GAC         9     26      2      1

Cys          UGU        12     53      1      4
             UGC         1      1      0      1

Gln          CAA        29     69      4      3
             CAG         4      7      0      0

Glu          GAA        37    105      2      8
             GAG         3     16      0      0

Gly          GGA        28     52      0      0
             GGU        88    120      8     11
             GGC         0      6      1      1
             GGG         6     14      1      0

His          CAU        47     65      4      3
             CAC         3      3      1      0

Ile          AUU       149    270     40     19
             AUC        26     22      2      4

Leu          UUA       224    338     41     48
             UUG         2     13      0      1

Lys          AAA        31    352     35     45
             AAG        65     49      2      0

Met          AUA         7    133     31     20
             AUG        65     49      2      0

Phe          UUU        71    136      5     19
             UUC        60     19      0      3

Pro          CCA        35     26      0      2
             CCU        39     83      4      6
             CCC         2      5      2      1
             CCG         0      8      1      0

Ser          UCA        71     71      3      4
             UCU        34     68      4     13
             UCC         1     10      0      1
             UCG         0      3      0      1
             AGU        15     67      9      5
             AGC         0      3      0      2

Thr          ACA        51     76      1      5
             ACU        34     62      7      6
             ACC         1     12      0      0
             ACG         0     17      0      0
             CUA        14     24      2      6
             CUU         2     21      3      4
             CUC         0      1      0      0
             CUG         2      0      0      0

Trp          UGA        36     47      3      5
             UGG         0      6      0      0

Tyr          UAU        73    209     28     19
             UAC        13     17      0      3

Val          GUA        76     89      1      6
             GUU        42     66      4      6
             GUC         4      4      0      0
             GUG         6     10      0      1
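A codon usage table of this kind can be tallied directly from a nucleotide sequence. The following is a minimal sketch, not the procedure used in this investigation: the function names are hypothetical, and the yeast mitochondrial code is built from the standard genetic code with the three reassignments that Table 8 itself reflects (UGA read as Trp, AUA as Met, and the CUN family as Thr).

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (third base varies fastest).
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def yeast_mito_table():
    """Codon -> amino acid map with the yeast mitochondrial reassignments."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    table = dict(zip(codons, STANDARD_AAS))
    table["TGA"] = "W"           # UGA codes for Trp rather than stop
    table["ATA"] = "M"           # AUA codes for Met rather than Ile
    for third in BASES:          # CUN codes for Thr rather than Leu
        table["CT" + third] = "T"
    return table


def split_codons(dna):
    """Split a sequence into codons, ignoring whitespace and any partial codon."""
    dna = "".join(dna.split()).upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def codon_usage(dna):
    """Count how many times each codon occurs in frame 0."""
    return Counter(split_codons(dna))


def translate(dna):
    """Translate frame 0 using the yeast mitochondrial code."""
    table = yeast_mito_table()
    return "".join(table[codon] for codon in split_codons(dna))


if __name__ == "__main__":
    # First six codons of the sequence shown in Figure 29.
    fragment = "TGA TCA ATT TTC ATT ACA"
    print(translate(fragment))
    print(codon_usage(fragment).most_common())
```

Counts for a whole ORF are obtained by passing its full sequence to `codon_usage`; grouping the counts by amino acid with `yeast_mito_table()` gives the layout used in Table 8.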

Table 9. Rickettsial serology

Various sources of antibodies were titered against rickettsia-infected Vero (green monkey kidney) cells using the microIF test, modified as described in the Methods section. Titers indicate the highest dilution of a particular antibody (or serum) which was visibly detected using the microIF test.

Table 9

COMPARATIVE IFA SEROLOGY

IFA TITRES - RECIPROCAL DILUTION

SPECIES           E-11G2(a)   E11-F12(a)   RML(b)   CDC(c)

R. AKARI          <100        <100         16       8
R. AUSTRALIS      <100        <100         <16      16
R. BELLII         <100        <100         <16      16
R. CANADA         <100        <100         <16      1
R. CONORII        ≤100        100          64       16
R. MONTANA        20,000      20,000       64       32
R. PARKERI        80,000      80,000       32       64
R. RHIPICEPHALI   40,000      40,000       128      16
R. RICKETTSII     320,000     160,000      1024     64
R. SIBERICA       80,000      40,000       256      32
R. TYPHI          <100        <100         <16      0
E. RISTICII       <100        <100         <16      0

a. MURINE MONOCLONAL ANTIBODIES TO R. RICKETTSII
b. ROCKY MOUNTAIN LAB MURINE POLYCLONAL ANTIBODY TO R. RICKETTSII
c. CDC RABBIT POLYCLONAL ANTIBODY TO R. RICKETTSII

Table 10. Estimated percent mismatched pairs between rickettsial genomes

Using the method of Engles (1981) and the data presented in Figures 49-52, the estimated percent mismatched pairs between rickettsial genomes were calculated. Also shown are the estimated sample errors for the percent mismatched pairs shown on this table. This data was used to generate Figure 53.

Table 10

Estimated Percent Mismatched Pairs Between Rickettsial Genomes

akar aust bell cana cono mont park rhip rick sibe typh

4.5 5.1 4.5 4.3 5.2 4.9 4.2 6.7 5.6 5.3

1.8 1.4 1.7 1.4

4.0 2.4 4.2 2.4 2.8

1.9 1.9 3.6 3.2 3-2 4.3 2.6 4.7 2.7 1.4 2.4 4.4 2.6 2.4 2.4 4.5 8.0 13.6

1.5 1.5 3.1 3.7 3.1 2.4 4.3 2.8 7.1 3.7 3.1 2.4 3.0 2.4 2.4 5.1 2.8 4.3

3.7 3.1 3.8 2.4

16.7 16.7 7.1 4.0 4.9 5.1 11.5 3.9 20.1 23.3 - 4.8 4.8 4.8 20.1 4.8 20.1 16.7 2.4 4.8 23.3 16.7 8.3 6.3 18.1 16.7 9.1 2.9 22.5 16.7 8.0 2.9

6.7 18.3 22.9 22.9 18.3 - 14.4 13.0 13.0 14.4 11.7 11.7 11.9 13.6 10.9 23.8 16.7 24.1 24.1 22.4 16.7 16.7 22.8 22.9 16.7 - 7.8 7.8 - 4.9

Figure 1. Structure of the yeast mitochondrial genome

The relative positions of the known genes on the yeast mitochondrial genome are shown. Transfer RNAs are shown as dots except for the cluster of tRNAs which occurs between the L-rRNA and oxi1 genes. Three genes in the mitochondrial genome of yeast are known to contain introns. These are the oxi3, cob, and L-rRNA genes. The distribution of introns in these genes is strain dependent. Shown here are the relative locations of the thirteen introns which are known to occur in mtDNAs of laboratory strains of yeast; not all strains, however, have all of the introns shown.


Figure 2. Interaction between conserved sequence elements in class I introns

All class I introns possess seven sequence elements which can participate in long distance sequence interactions either with themselves or with the boundary sequences of the flanking exons (reviewed by Waring and Davies, 1984). These elements are named the internal guide (IG), E, P, Q, R, E', and S. They always occur in the same order. E is close to P and R is usually close to E'. The IG is usually close to the 5' exon. The IG can form short helical stems by base pairing with the last few nucleotides of the 5' exon and the first few nucleotides of the 3' exon. E can base pair with E', P can base pair with Q, and R can base pair with S. In addition, an optional stem, P2, is shown which is formed by base pairing between a sequence which is just 3' of the IG and a sequence which is just 5' of E.
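Whether two candidate elements (for example P and Q, or R and S) could form a helical stem can be checked by testing antiparallel complementarity. The sketch below is only an illustration of that check: the function name is hypothetical, the example sequences are invented rather than taken from any intron, and treating G-U as an allowed pair is an assumption about how permissive the test should be.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def can_pair(upstream, downstream, allow_wobble=True):
    """Return True if two RNA segments, both written 5' to 3', can form
    an uninterrupted antiparallel helix of their full length."""
    a = upstream.upper().replace("T", "U")
    # Reverse the downstream segment so position i of one strand faces
    # position i of the other in an antiparallel helix.
    b = downstream.upper().replace("T", "U")[::-1]
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if (x, y) not in PAIRS:
            return False
        if not allow_wobble and {x, y} == {"G", "U"}:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical five-nucleotide elements, for illustration only.
    print(can_pair("GACUA", "UAGUC"))
    print(can_pair("GGAC", "GUUC", allow_wobble=False))
```

A fuller treatment would allow bulges and mismatches and would score helices by stability; this version only reports whether a perfect stem is possible.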


Figure 3. Interactions between conserved sequence elements in aI3b

Intron aI3b possesses the homologs of the seven conserved sequence elements which are found in all class I introns. In aI3b, the distance between the 5' exon and the IG and the distance between Q and R are greater than has been observed in most other class I introns. In addition, two other possible sequence interactions are shown. The first of these (1) is between two short sequences which occur between E' and S and which may compete with a sequence just 5' of E for the formation of stem P2 by base pairing with x. The second possible unusual base pairing interaction found in aI3b (2) is between a sequence which is 3' of Q and a sequence which is 3' of S.


Figure 4. Computer generated secondary structure of aI4

Shown is a computer generated secondary structure of aI4 (taken from Michel and Dujon, 1982). Intron aI4 is a class I intron. The base pairing of the 5' exon with the IG is shown as "a". The E and E' interaction is shown as "b". The base pairing between P and Q is shown as "c", and the interaction between R and S is shown as the narrow arrows above f and f'. The other two labeled stems in this figure, d and e, are less highly conserved between class I introns and are called P6 and P8 in the nomenclature of Waring and Davies (1984). Intron/exon boundaries are shown as bold arrows.

Figure 5. Computer generated secondary structure of aI1

Shown is a computer generated secondary structure of aI1 (taken from Michel and Dujon, 1982). Intron aI1 is a class II intron and is related to intron aI2. Lower case letters are used to indicate those nucleotides which vary in aI2 relative to aI1. Arrows are used to indicate the exon/intron boundaries.

Figure 6. EcoR1 digests of 11 species of yeast mtDNA

EcoR1 digests of mtDNA from all eleven species examined in this investigation are shown after being size fractionated by agarose gel electrophoresis. None of the species produce identical patterns of DNA fragments, indicating that there are differences which distinguish all of these mtDNAs from each other.

Lanes: S. cerevisiae, S. uvarum, S. diastaticus, S. capensis, S. oleaceus, S. norbensis, S. coreanus, S. ellipsoideus, S. hienipiensis, S. aceti, 161, pBR322 (uncut).

Figure 7. Genomic location of probes used in this study

Parts of the oxi3 gene and flanking regions were cloned and used during this investigation as probes to identify homologous sequences on Southern blots of mtDNAs. Bars indicate those regions which were cloned. The recombinant clone pKM2 was constructed using mtDNA from a strain which lacked introns aI5a and aI5b; therefore, those exons numbered 5, 6, and 7 in this figure were continuous in this strain. Dotted lines indicate sequences from aI5a and aI5b which are missing from pKM2. The clones 11 through 14 are intron specific MboI fragments cloned into the BamH1 site of pBR322. The BglII band five is the BglII band five from strain 161 cloned into the BamH1 site of pBR322. Similarly, the clone EcoR1 band seven is the EcoR1 band seven from strain 161 cloned into the EcoR1 site of pBR322. The aI5g specific probe, which is referred to as 17 in this figure, extends from the TaqI site at the 5' splice boundary of aI5g to the HpaII site near the 3' end of the intron. The eight exons in the oxi3 gene of strain 161 are labeled E1 through E8. The locations of the two ATPase subunit genes, aap1 and oli2, are also shown. Recognition sites for restriction endonucleases used for mapping and as indicators of intron sequences are shown as A=AvaI, C=ClaI, B=BamH1, Bg=BglII, E=EcoR1 (the second "E" in aI2 denotes two EcoR1 sites which are closely spaced), H=HindIII, P=PstI, and Pv=PvuII.


Figure 8. Yeast mtDNAs probed with aI1 specific sequences

BamH1 digests of yeast mtDNAs were probed with an aI1 specific MboI fragment cloned into pBR322.

Figure 9. Yeast mtDNAs probed with aI2 specific sequences

BamH1 digests of yeast mtDNAs were probed with an aI2 specific MboI fragment cloned into pBR322.

Figure 10. Yeast mtDNAs probed with aI3 specific sequences

EcoR1 digests of yeast mtDNAs were probed with an aI3 (aI3a) specific MboI fragment cloned into pBR322.

Figure 11. Yeast mtDNAs probed with aI4 specific sequences

EcoR1 digests of yeast mtDNAs were probed with an aI4 (aI4a) specific MboI fragment cloned into pBR322.

Figure 12. Yeast mtDNAs probed with aI5g specific sequences

EcoR1 digests of yeast mtDNAs were probed with an aI5g specific TaqI to HpaII fragment.

Figure 13. BamH1 plus EcoR1 digests of mtDNAs

Yeast mtDNAs were digested with both EcoR1 and BamH1 and size fractionated by agarose gel electrophoresis.

Lanes: S. cerevisiae, S. uvarum, S. diastaticus, S. capensis, S. oleaceus, S. norbensis, S. coreanus, S. ellipsoideus, S. hienipiensis, S. aceti, 161 (EcoR1 + BamH1), 161 (EcoR1 only), pBR322.

Figure 14. BamH1 plus EcoR1 digests of mtDNAs probed with pKM2

The agarose gel shown in Figure 13 was blotted to nitrocellulose and hybridized to P-32 labelled pKM2 DNA. The clone, pKM2, contains sequences which are homologous to aI5g, exons 6 and 7, and parts of exons 5 and 8 from strain 161 as shown in Figure 7. The fragment to which this clone has hybridized extends from the BamH1 site in aI3 (aI3a) to the EcoR1 site in exon 8. S. uvarum and S. coreanus have BamH1/EcoR1 fragments which are consistent with their having no additional introns beyond those found in strain 161. The hybridized S. uvarum fragment is the correct size to consist only of those sequences found in aI5g, the 3' end of aI3 (aI3a), and the relevant exons. The S. coreanus fragment comigrates with the fragment from strain 161 and probably contains all the introns found in strain 161 which are 3' of aI3 (aI3a). The industrial strain of S. cerevisiae, S. capensis, and S. aceti all lack aI4 (aI4a) and produce fragments which are homologous to pKM2 but are actually larger than the homologous fragment from strain 161. This indicates the possibility of additional introns in the oxi3 genes of these species. All of the other species except S. ellipsoideus and S. diastaticus lack aI4 (aI4a) but produce EcoR1/BamH1 fragments which are homologous to pKM2 and which nearly comigrate with this fragment in strain 161. This also indicates the possibility of an additional intron in these mtDNAs which is not found in the 161 mitochondrial genome. (Data shown in Figures 19 and 20 show that S. ellipsoideus and S. diastaticus also contain an intron not found in strain 161.)

Lanes: S. cerevisiae, S. uvarum, S. diastaticus, S. capensis, S. oleaceus, S. norbensis, S. coreanus, S. ellipsoideus, S. hienipiensis, S. aceti, 161 (BamH1 and EcoR1), 161 (EcoR1 only), pBR322 (uncut).

Figure 15. TaqI digests of mtDNAs probed with pKM2

TaqI digests of mtDNAs were hybridized to P-32 labelled pKM2 DNA. The resulting autoradiogram is shown. The fragments indicated by the arrow comigrate with the fragment derived from strain 161 mtDNA by cutting the TaqI sites at the 5' splice boundary of aI5g and in exon 8. All species except S. ellipsoideus produce a fragment which comigrates with the fragment from 161. This indicates that these TaqI sites are conserved between species and that there are no significant insertions or deletions in the aI5g homologs of any of these species. S. ellipsoideus does not produce this fragment upon TaqI digestion because it does not contain aI5g (see Figure 12).

Figure 16. Mapping oxi3 inserts in selected species of yeast

Mitochondrial DNAs from S. diastaticus, S. capensis, S. aceti, and strain 161 were digested with BamH1 and either BglII, BclI, or HindIII. These digests were separated by agarose gel electrophoresis, blotted to nitrocellulose, and probed with P-32 labelled BglII band five DNA (see Figure 7). Hybridization was performed at room temperature and washes were with 0.5x SSC. The resulting autoradiogram is shown.

Lanes 1 through 4 result from BamH1/BglII double digests. The dark signal in the 161 lane (lane 4) is actually a doublet derived from cutting the BglII band five (see Figure 7) almost directly in the middle with BamH1. The 5' BglII site of this fragment is in aI2, and the 3' BglII site is in exon 5 of strain 161. The BamH1 site is near the 3' end of aI3 (aI3a). The lighter signal in the 161 lane is the result of cross hybridization between aI4 sequences in BglII band five and cob I4. These two introns are 70% homologous. The other three species examined lack aI2 (see Figure 9) and, therefore, cannot have the BglII site located in that intron. Those species also lack the 3' BglII site located in exon 5 of strain 161 since they produce no fragment which comigrates with the 161 doublet. The less intense signals seen in lanes 1 through 3 also result from cross hybridization with homologs of cob I4.

Lanes 5 through 8 result from BamH1/BclI double digests. The two relevant BclI sites in strain 161 are located in exons 4 and 5. In all four lanes (5-8), a small fragment is seen to hybridize to the BglII band five probe. This fragment is generated by cutting the BamH1 site near the 3' end of aI3 (aI3a) and the BclI site in exon 4 of strain 161. These fragments from all four mtDNAs comigrate, indicating that there are no insertions or deletions in this region which distinguish one of these mtDNAs from the others. The next larger signal from strain 161 is generated by cutting the BclI sites in exons 4 and 5. These sites flank aI4 (aI4a). S. diastaticus has aI4 (see Figure 11) and produces a fragment which comigrates with the BclI/BclI fragment from strain 161. This indicates that the aI4 (aI4a) homologs in S. diastaticus and strain 161 are probably colinear. The next larger signal in lane 8 (strain 161) is also the weakest signal in this lane. This signal is generated by cross hybridization between aI4 sequences in the probe and cob I4. The other three species also have DNA fragments which migrate similarly. The largest signal in this lane is due to hybridization of the probe to a large DNA fragment which extends from the BamH1 site near the 3' end of aI3 (aI3a) in a 5' direction beyond the oxi3 region. The other three species also have large DNA fragments which cover this region. It is interesting that S. capensis and S. aceti produce fragments which migrate slightly faster in this region. In S. capensis, this is to be expected since this species lacks both aI1 and aI2 (see Figures 8 and 9). S. aceti probably has a similar distribution of introns in this region. Perhaps the most interesting signals in the BclI/BamH1 double digest series are observed near the middle of lanes 6 and 7 (S. capensis and S. aceti). These signals are derived by cutting the BclI sites which are in exon sequences homologous to exons 4 and 5 in strain 161. These signals indicate that there is an approximately 1.9 kb insert in these species relative to strain 161. This insert will be referred to as intron aI3b (for oxi3 intron 3 beta). The name of aI3 will be changed to aI3a (for oxi3 intron 3 alpha).

In lanes 9 through 12, the results of HindIII/BamH1 double digests are shown hybridized to BglII band five. The relevant HindIII site in strain 161 is within aI4 near the 3' end (see Figure 7). The fastest migrating signal derived from strain 161 mtDNA (lane 12) results from cutting the BamH1 site near the 3' end of aI3a and the HindIII site near the end of aI4 (aI4a). S. diastaticus contains aI4 and produces a fragment which comigrates with the fragment from strain 161. This indicates that the aI4 homologs in these two species are probably colinear and is consistent with the data shown in lanes 5 and 8. The other two signals seen in lane 12 (strain 161) result from hybridization with cob I4 (middle signal) and the 5' end of oxi3 (slowest migrating signal). Since S. capensis does not have aI4 (see Figure 11), it cannot have the HindIII site found in this intron. Both S. capensis and S. aceti produce two fragments upon HindIII and BamH1 digestion which are not found in strain 161 mtDNA. There are, therefore, two HindIII sites in these species not found in strain 161. The sizes of the two novel fragments resulting from these BamH1/HindIII double digests in S. capensis and S. aceti are roughly 1.4 and 1.8 kb. This means that one of the two HindIII sites not found in strain 161 must be in aI3b while the other HindIII site must be 3' of the BclI site in exon 5 of strain 161. This second HindIII site is in a second novel intron which will be called aI4b (for oxi3 intron 4 beta). The name of aI4 will be changed to aI4a (for oxi3 intron 4 alpha). The fastest migrating signal in the S. diastaticus BamH1/HindIII double digest is interesting in that it results from cutting the HindIII sites in aI4a and aI4b (see Figure 20).
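The mapping argument in this legend rests on two pieces of arithmetic: the fragment sizes expected when a region is cut at every site of two enzymes, and the size of an insert inferred from the difference between a fragment and its counterpart in strain 161. The sketch below illustrates both under stated assumptions. The function names are hypothetical, the region is treated as linear, and all site coordinates and fragment sizes in the example are invented for illustration; the legend itself reports only the approximately 1.9 kb difference.

```python
def double_digest(length_kb, sites_a, sites_b):
    """Fragment sizes (kb), in map order, produced by cutting a linear
    region of length_kb at every site of both enzymes."""
    cuts = sorted(set(sites_a) | set(sites_b))
    edges = [0.0] + cuts + [length_kb]
    return [round(right - left, 3) for left, right in zip(edges, edges[1:])]


def insert_size(fragment_kb, reference_kb):
    """Size of an insertion inferred from a fragment that is larger than
    the homologous fragment in a reference strain."""
    return round(fragment_kb - reference_kb, 3)


if __name__ == "__main__":
    # Hypothetical 10 kb region with two sites for one enzyme and one
    # site for the other.
    print(double_digest(10.0, [2.0, 7.5], [4.0]))
    # Hypothetical fragment sizes chosen to differ by 1.9 kb.
    print(insert_size(3.0, 1.1))
```

On a real gel, fragment sizes are estimates read against size markers, so an inferred insert size carries the combined error of both measurements.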


Figure 17. HindIII plus EcoR1 digests of mtDNAs

Mitochondrial DNAs from eleven species of yeast were digested with HindIII and EcoR1 and size fractionated by agarose gel electrophoresis.

Figure 18. HindIII plus EcoR1 digests of yeast mtDNAs probed with pKM2

The agarose gel shown in Figure 17 was blotted to nitrocellulose and probed with P-32 labelled pKM2 DNA (see Figure 7). All eleven species of yeast have at least one HindIII site in the oxi3 region. With the exception of S. uvarum, the signal which is observed in this figure has resulted from cutting the EcoR1 site in the last exon and the HindIII site in either aI3b, aI4a, or aI4b. For S. uvarum, the HindIII site occurs closer to the 5' end of its oxi3 gene.

Figure 19. Yeast mtDNAs probed with aI3b specific sequences

BamH1 plus HindIII double digests of yeast mtDNAs were probed with aI3b specific sequences. The clone used as a probe in this experiment (C2) extends from a DraI site near the 5' end of aI3b to the HindIII site found in this intron (see Figure 28). All species of yeast examined except S. uvarum, S. diastaticus, S. norbensis, and strain 161 contain aI3b. Other faint bands seen in this autoradiogram result from cross hybridization with other less homologous mtDNA sequences.

Figure 20. Yeast mtDNAs probed with aI4b specific sequences

BamH1 plus HindIII double digests of yeast mtDNAs were probed with aI4b specific sequences. The clone used in this experiment extends from a HpaII site near the 5' end of aI4b to the HindIII site in this intron. Only S. diastaticus, S. capensis, and S. aceti were found to contain this intron.

Figure 21. AvaI plus PstI digests of yeast mtDNAs

Yeast mtDNAs were digested with AvaI and PstI and size fractionated by agarose gel electrophoresis. The relevant AvaI site is in aI5a in strain 161. The relevant PstI site is 3' of oxi3 in an URF (see Figure 7). The arrow indicates the fragment generated in strain 161 by cutting those sites. A comigrating band is observed in the industrial strain of S. cerevisiae and in S. coreanus. This indicates that these two species are colinear with strain 161 between the AvaI and PstI sites and have aI5a and aI5b. The other species of yeast do not produce a comigrating fragment and do not produce a fragment which is the correct size to be generated by cutting these same AvaI and PstI sites in a mtDNA from a strain which contains aI5a but not aI5b. The faster migrating fragments observed in this figure probably result from contaminating nuclear DNA since they also occur in strain 161.

Lanes: S. cerevisiae, S. uvarum, S. capensis, S. oleaceus, S. norbensis, S. coreanus, S. ellipsoideus, S. hienipiensis, 161, uncut pBR322.

Figure 22. BamH1 plus PstI digests of yeast mtDNAs (#1)

Yeast mtDNAs were digested with either BamH1 only or BamH1 plus PstI. The relevant BamH1 and PstI sites are in aI3a and the URF which is 3' of oxi3, respectively. Odd numbered lanes are digested with BamH1 plus PstI. Even numbered lanes are digested with BamH1 only. All species shown except S. capensis and S. diastaticus have one PstI site located in the URF located 3' of oxi3. S. capensis has two PstI sites, one of which is in the URF. S. diastaticus has no PstI sites in its mtDNA.

Figure 23. BamH1 plus PstI digests of yeast mtDNAs (#2)

The description of this figure is the same as for Figure 22 except that different mtDNAs were digested. S. aceti has no PstI sites.

Figure 24. BamH1 plus PstI digests probed with EcoRI band 7 specific sequences (#1)

The gel shown in Figure 22 was blotted to nitrocellulose and hybridized to a clone which contained EcoR1 band 7 from strain 161 (see Figure 7).

Figure 25. BamH1 plus PstI digests probed with EcoRI band 7 specific sequences (#2)

The gel shown in Figure 23 was blotted to nitrocellulose and hybridized to a clone which contained EcoR1 band 7 from strain 161 (see Figure 7).

Figure 26. Bcl21 and Hind16 compared to S. capensis mtDNA

The BclI fragment which contains all of aI3b plus some flanking exon sequences and the HindIII fragment generated by cutting the HindIII sites in aI3b and aI4b were cloned out of S. capensis into pEMBL18 (see Figure 16). These two clones are called Bcl21 and Hind16, respectively. Bcl21 was digested with EcoR1 and PstI (the sites which flank the insert DNA in the polylinker of pEMBL18). Hind16 was digested with HindIII. S. capensis mtDNA was digested with BclI or HindIII. These digests were electrophoresed to show that the inserts of these clones comigrate with the S. capensis fragments from which these clones were generated.

Figure 27. Progressive Bal 31 digestion of Bcl21

Bcl21 was linearized at the EcoR1 site and digested with Bal 31. Numbers at the top of the figure indicate the number of minutes that the digestion was allowed to proceed.

Time points (minutes): 0, 5, 10, 15, 25, 35, 45, 55.

Figure 28. Strategy for sequencing aI3b

Arrows indicate the various clones used in dideoxy sequencing of aI3b. The arrows show the beginning, end, and direction in which these clones were read. "X" is used to indicate those clones which were generated by subcloning from specific restriction sites. Subclones without "X" were generated using the Bal 31 deletions shown in Figure 27. The clone C2 was used both for sequencing and as an aI3b specific probe as shown in Figure 19.

Figure 29. Sequence of aI3b

The complete nucleotide sequence of the S. capensis mtDNA between the BclI sites which occur in exons 4 and 5 in strain 161 is shown. This sequence includes 150 and 179 nucleotides (underlined) of the 5' and 3' exons, respectively. Amino acid determinations as deduced from the DNA sequence are also shown, and differences between the exon amino acid and/or nucleotide sequences as compared to S. cerevisiae are indicated by stars. There is a 1083 nucleotide (361 codon) ORF in aI3b which is continuous with the 5' exon. Amino acid sequences which are homologous to the LAGLI-DADG sequences found in other class I introns are shown. Lower case letters are used to denote nucleotides in the closed reading frame at the 3' end of aI3b. The conserved cis-acting sequences which characterize all class I introns (IG, E, P, Q, R, E', and S) are identified and underlined.

Sequence of Bcl21

     trp ser ile phe ile thr ala phe leu leu leu leu ser leu pro
   0 TGA TCA ATT TTC ATT ACA GCG TTC TTA TTA TTA TTA TCA TTA CCT  Bcl I

     val leu ser ala gly ile thr met leu leu leu asp arg asn phe
  45 GTA TTA TCT GCT GGT ATT ACA ATG TTA TTA TTA GAT AGA AAC TTC

     asn thr ser phe phe glu val ala ala gly gly asp pro ile leu
  90 AAT ACT TCA TTC TTT GAA GTA GCA GGA GGT GGT GAC CCA ATC TTA

     tyr glu his leu phe tyr lys gly tyr met met asn asn asn val
 135 TAC GAG CAT TTA TTT TAC AAA GGC TAC ATA ATA AAT AAT AAT GTT

     ile phe asn phe leu pro met met leu leu leu leu asn asn asp
 180 ATT TTC AAT TTC TTA CCA ATA ATA TTA TTA TTA TTA AAT AAT GAT

     ser tyr ile met met met met lys leu thr met leu leu ser asn
 225 TCT TAT ATC ATA ATA ATA ATA AAA TTA ACA ATA TTA TTA TCT AAT

     asn his thr leu leu leu ser ser ser met asn asn lys asp lys
 270 AAT CAT CTA TTA TTA TTA TCT TCA TCT ATA AAT AAT AAA GAT AAA

     thr leu met lys leu asp thr pro phe arg gly phe gly ser val
 315 CTA TTA ATA AAA TTA GAT ACT CCT TTT CGG GGT TTC GGT TCC GTG

     ser ser pro lys thr ser lys asp thr lys lys ser leu leu asp
 360 TCG AGC CCC AAA ACT TCT AAA GAT ACT AAA AAA TCA TTA TTA GAT

     leu thr asn asn asp ile asn lys tyr asn trp glu trp asp asn
 405 TTA ACA AAT AAT GAC ATT AAT AAA TAT AAT TGA GAA TGA GAT AAT

     asn asn phe asn phe asp lys phe tyr lys glu phe lys lys val
 450 AAT AAT TTT AAT TTT GAT AAA TTT TAT AAA GAA TTT AAA AAA GTT  IG  Dra I

     lys pro asn asn lys leu pro ser lys glu phe leu glu trp phe
 495 AAA CCT AAT AAT AAA TTA CCA TCT AAA GAA TTT TTA GAA TGA TTT

     LAGLI-DADG
     ile gly phe phe glu ala val gly cys leu thr ile pro lys asn
 540 ATT GGT TTT TTT GAA GCT GTT GGT TGT TTA CTT ATT CCT AAA AAT

     lys gln leu tyr ala ile ile thr ser asn ser lys asp leu asn
 585 AAA CAA TTA TAT GCT ATT ATT CTT TCT AAT AGT AAA GAT TTA AAT  Dra I

     thr leu asn tyr ile lys asp asn met thr phe gly asn val thr
 630 CTA TTA AAT TAT ATT AAA GAT AAT ATA ACA TTT GGT AAT GTA CTA  Rsa I

     tyr his ser lys lys leu asn thr tyr arg trp val val tyr asn
 675 TAT CAT TCA AAA AAA TTA AAT ACT TAT AGA TGA GTA GTA TAT AAT

     glu thr asp ile leu leu leu ile his leu phe asn gly asn thr
 720 GAA ACA GAT ATT TTA TTA TTA ATT CAT TTA TTT AAT GGT AAT CTA

     val leu pro val arg tyr val lys phe arg met phe ile ser asn
 765 GTA TTA CCT GTA AGA TAT GTA AAA TTT AGG ATA TTT ATC TCT AAT

     met asn met lys leu leu lys asn asn lys thr ile ile lys leu
 810 ATA AAT ATA AAA TTA TTA AAA AAT AAT AAA CTT ATT ATT AAA TTA

     ile asn lys cys lys met pro lys leu asn lys cys leu ile ser
 855 ATT AAT AAA TGT AAA ATA CCT AAA TTA AAT AAA TGC TTG ATT AGC

     LAGLI-DADG
     arg phe thr asp gly glu gly cys phe tyr thr gly lys ser lys
 900 AGG TTT ACT GAT GGT GAA GGT TGT TTT TAT ACA GGT AAA TCT AAA  E  P

     ser phe tyr tyr thr ser tyr ile ile thr gln lys tyr leu ala
 945 TCT TTT TAT TAC CTA AGT TAT ATT ATT ACT CAA AAA TAT TTA GCT

     asn lys ile val phe asp met leu leu leu leu leu gln asn met
 990 AAT AAA ATT GTT TTT GAT ATA TTA TTA TTA TTA TTA CAA AAT ATA

     ile asn ile lys ser gly gly val asn asn his ser lys asp asn
1035 ATT AAT ATT AAA TCT GGT GGT GTT AAT AAT CAT TCT AAA GAT AAT

     thr tyr val leu arg ile ser ser leu glu ala cys ser lys leu
1080 CTT TAT GTT TTA CGT ATT AGT AGT TTA GAA GCT TGT TCT AAA TTA  Q  Hind III

     lys leu tyr phe asp lys tyr pro leu arg ser tyr lys leu leu
1125 AAA TTA TAT TTT GAT AAA TAT CCT TTA AGA AGT TAT AAA TTA TTA

     ile tyr lys asp trp leu asn phe ile asp met ala met asn asp
1170 ATC TAT AAA GAT TGA TTA AAT TTT ATC GAT ATA GCT ATA AAT GAT  Cla I

     asn leu ser lys lys tyr end
1215 AAT TTA TCA AAA AAA TAT TAG aaaagttaa tttaataaaa atattagata

1265 ttagttccgg gcccgccgag cagccagaac cccggacgga gaaataaaaa  Hpa II  Hpa II

1315 gaaattatta taatttatta aaaaaaaaat aaataaattt attatgtaaa

1365 gaaataaaaa atataattaa atgtaaattt atatataaat atatattatt

1415 ttattattct ttatttatta aagaaataat aataatattt atataatatt

1465 ttttatatag ttttatatct tttattattt gaaacttttt atttattaat

1515 agttccgggg cogcaggacc cggaaccccg aaagagttca tctatttaat  Hpa II  Hpa II

1565 tatatatgga caataattaa ataatagtat ttatttatct taaagatata

1615 tataataaat ataaagaata tgttatattt tagtttttat atattataat

1665 tatatatata cttaatttat atatataacg taaaaattac aaaatgtaat

1715 ttttattgat aattatctta ttattattat attaaattat ataatataat

1765 aagtttatat taataataat tattataata taaagtatat aaaggcgaca

1815 ctgatagtac ggttaataat cttctttagg atcaagaccg tcggttaatt

1865 aagtgatcgc tacagactgc tttatcggtg gctttaataa tatatatata  Mbo I  R  E'

1915 tatatatatt ataaggttaa tgtacagtcg gaactctcaa tatatatatt  S  Rsa I

1965 tttttttttt tttaattaaa taaatatata tataaattgg ttattattat

2015 taaataataa ataaaatata ttgtaagtga aatatttata ttttatatga

       phe gly phe phe gly his pro glu val tyr ile leu ile ile
2065 g TTT GGA TTC TTT GGT CAT CCA GAA GTA TAT ATT TTA ATT ATT

     pro gly phe gly ile ile ser his val val ser thr tyr ser lys
2108 CCT GGA TTT GGT ATT ATT TCA CAT GTA GTA TCA ACA TAT TCT AAA

     lys pro val phe gly glu ile ser met val tyr ala met ala ser
2153 AAA CCT GTA TTT GGT GAA ATT TCA ATG GTA TAT GCT ATG GCT TCA

     ile gly leu leu gly phe leu val trp ser
2198 ATT GGA TTA TTA GGA TTC TTA GTA TGA TCA  Bcl I

Figure 30. Amino acid sequence of aI3b

The complete amino acid sequence as deduced from the nucleotide sequence of the ORF of aI3b is shown. The homologs of the LAGLI-DADG sequence found in other class I introns are identified and underlined.
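The deduction of an amino acid sequence from the ORF can be illustrated with a minimal sketch. It assumes the standard genetic code with only the three deviations that are visible in the codon pairings of Figure 29 (CTN read as thr, ATA as met, TGA as trp); the function names `yeast_mito_table` and `translate` are hypothetical and chosen for illustration only.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (T, C, A, G).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def yeast_mito_table():
    """Build a codon table with the deviations seen in Figure 29 (an assumption)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD))
    table["TGA"] = "W"           # TGA is paired with trp in Figure 29
    table["ATA"] = "M"           # ATA is paired with met in Figure 29
    for third in BASES:
        table["CT" + third] = "T"  # CTA and CTT are paired with thr in Figure 29
    return table


def translate(dna, table=None):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    table = table or yeast_mito_table()
    seq = "".join(dna.upper().split())  # drop spaces between codons
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = table[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Codons at position 945 of Figure 29.
    print(translate("TCT TTT TAT TAC CTA AGT TAT ATT ATT ACT CAA AAA TAT TTA GCT"))
```

Applied to the codons at position 945 of Figure 29, the sketch returns the residues that appear at positions 266 to 280 of Figure 30.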

        10         20         30         40         50
YKGYMMNNNV,IFNFLPMMLL,LLNNDSYIMM,MMKLTMLLSN,NHTLLLSSSM,

        60         70         80         90        100
NNKDKLLMKL,NTPFRGFGSV,SSPKTSKDTK,KSLLDLTNND,INKYNWEWDN,

       110        120        130        140        150
NNFNFDKFYK,EFKKVKPNNK,LPSKEFLEHF,IGFFEAVGCL,TIPKNKELYA,
                               LAGLI-DADG

       160        170        180        190        200
IITSNSKDLN,TLNYIKDNMT,FGNVTYHSKK,LNTYRWVVYN,ETDILLLIHL,

       210        220        230        240        250
FNGNTVLPVR,YVKFRMFISN,MNMKLLKNNK,TIIKLINKCK,MPKLNKCLIS,
                                                    LA

       260        270        280        290        300
RFTDGEGCFY,TGKSKSFYYT,SYIITQKYLA,NKIVFDMLLL,LLQNMINIKS,
GLI-DADG

       310        320        330        340        350
GGVNNHSKDN,TYVLRISSLE,ACSKLKLYFD,KYPLRSYKLL,IYKDWLNFIN,

       360
MAMNDNLSKK,Y.

FIGURE 30

Figure 31. Distribution of stop codons in aI3b

The distribution of stop codons as determined from the nucleotide sequence is shown for all three reading frames which are read in the same direction as the exon sequences. The top reading frame is continuous with the 5' exon while the other two lines represent +1 and +2 frame shifts of the first reading frame. There are many stop codons in all three reading frames except for the ORF which is shown in Figures 29 and 30. No other significant reading frames exist in aI3b.

Figure 31 (map of stop codons in the three reading frames; labeled features include the flanking exons, the ORF, the +1 and +2 frames, and Bcl I, Cla I, Rsa I and Dra I sites)
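A stop codon map of this kind can be sketched in a few lines. The sketch assumes that only TAA and TAG act as stop codons, since TGA is paired with trp in Figure 29; the function name `stop_positions` is hypothetical.

```python
# Assumption: TGA is not a stop codon here (it is read as trp in Figure 29).
STOP_CODONS = {"TAA", "TAG"}


def stop_positions(dna, frame):
    """Return 0-based start indices of stop codons in the given frame (0, 1 or 2)."""
    seq = "".join(dna.upper().split())
    return [
        i
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


if __name__ == "__main__":
    # End of the ORF (position 1215 of Figure 29) plus the next nine nucleotides.
    tail = "AAT TTA TCA AAA AAA TAT TAG aaaagttaa"
    for frame in range(3):
        print(frame, stop_positions(tail, frame))
```

Running the scan over the whole intron in each frame and plotting the returned indices would give one line of marks per reading frame, with the ORF appearing as the only long stretch free of marks.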

Figure 32. Amino acid homology between aI3b and cob I4

The amino acid sequences of aI3b and cob I4 can be aligned at 59 amino acid positions if three insertion/deletion events are allowed. These deletion/insertions are 54, 40, and 21 amino acids, respectively. If these deletion/insertions are not included and if the 28 amino acids which occur in aI3b before the beginning of cob I4 and the 18 amino acids which occur in cob I4 after the end of aI3b are also not included, then aI3b and cob I4 are 29.5% homologous at the amino acid level for the remaining 200 amino acid positions.
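The percentage quoted in the legend is the number of matching positions divided by the number of positions compared (59 of 200). A minimal sketch of that calculation for two gapped sequences follows; the function name `percent_identity` and the use of "." as the gap character are assumptions made for illustration.

```python
def percent_identity(seq_a, seq_b, gap="."):
    """Percent of identical residues over positions where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # insertion/deletion positions are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # One aligned block from Figure 32 (aI3b 227-236 against cob I4 213-222).
    print(percent_identity("KNNKTIIKLI", "KNKEGMIKLI"))
    # The overall value from the legend: 59 matches over 200 positions.
    print(100.0 * 59 / 200)
```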

Amino Acid Homology Between aI3beta and cob I4 ORFs

                  10         20         30         40         50
aI3beta...YKGYMMNNNV,IFNFLPMMLL,LLNNDSYIMM,MMKLTMLLSN,NHTLLLSSSM,
cob I4....                              QN,MALLLITYVI,NILCAVCWKS,
                                         2         12         22

                  60         70         80         90        100
aI3beta...NNKDKLLMKL,NTPFRGFGSV,SSPKTSKDTK,KSLLDLTNND,INKYNWEWDN,
cob I4....LFIKYQWKIY,NKTTYYFIIQ,NILNTKQLNN,F JLKFNWTKQ,YNKMNIVSDL,
                  32         42         52         62         72

                 110
aI3beta...NNFNFDKFYK......
cob I4....FNPNRVKYYY,KEDNQQVTNM,NSSNTHLTSN,KKNLLVDTSE,TTRTTKNKFN,
                  82         92        102        112        122

                                              LAGLI-DADG
                            116        126        136        146
aI3beta..................EFKKVK,PNNKLPSKEF,LEWFIGFFEA,VGCLLIPKNK,
cob I4....YLLNIFNMKK,MNQIITKRHY,SIYKDSNIRF,NQWLAGLIDG,DGYFCITKNK,
                 132        142        152        162        172
                                              LAGLI-DADG

                 156        166        176        186        196
aI3beta...ELYAIITSNS,KDLNTLNYIK,DNMTFGNVTY,HSKKLNTYRW,VVYNETDILL,
cob I4....YASCEITVKL,KDEKMLRQIQ,DKFGGSVKLR,SGVKAIRYR......
                 182        192        202       211

                 206        216        226        236        246
aI3beta...LIHLFNGNTV,LPVRYVKFRM,FISNMNMKLL,KNNKTIIKLI,NKCKMPKLNK,
cob I4...................................L,KNKEGMIKLI,NAVNGNIRNS,
                                       212        222        232

                                   LAGLI-DADG
                                       255        265        275
aI3beta..........................CLISRFTDG,EGCFYTGKSK,SFYYTSYIIT,
cob I4....KRLVQFNKVC,ILLNIDFKEP,IKLTKDNAWF,MGFFDADGTI,NYYYSGKLKI,
                 242        252        262        272        282
                                         LAGLI-DADG

                 285        295        305        315        325
aI3beta...QKYLANKIVF,DMLLLLLQNM,INIKSGGVNN,HSKDNTYVLR,ISSLEACSKL,
cob I4....RPQLTISVTN,KYLHDVEYYR,EVFGGNIYFD,KAKNGYFKWS,INNKELHNIF,
                 288        298        308        318        328

                 335        345        355    361
aI3beta...KLYFDKYPLR,SYKLLIYKDW,LNFINMAMND,NLSKKY.
cob I4....YTYNKSCPSK,SNKGKRLFLI,DKFYYLYDLL,AFKAPHNTAL,YKAWLKFNEK,
                 338        348        358        368        378

cob I4....WNNN.
           382

Figure 32

R (box9L)   5'...CTACAGACTG
                      |||||
S (box2)        3'...GCTGACATGTAA...5'

Figure 33. Interaction between R and S in aI3b

The homologs of R and S found in aI3b are shown aligned in order to form the five base pair stem which characterizes this interaction.

Figure 34. Interaction between P and Q in aI3b

The sequences for the homologs of P and Q found in aI3b are shown indicating the possible helical base pairing interactions which characterize these two sequences. Also shown is the less well conserved stem, P5 (reviewed by Waring and Davies, 1984).

Figure 34 (diagram; elements labeled IG, E, P, P5, Q, R, E' and S)

E    5' CTG TGGTGAAGG 3'
E'   3' GGT GCTA-TTT C 5' G

Figure 35. Interactions between E and E' in aI3b

Shown are the sequences of the E and E' homologs found in aI3b. These sequences are not conserved in sequence but are conserved both in position and in their ability to base pair with each other.

Figure 36. Proposed secondary structure of aI3b

As proposed by the model of Waring and Davies (1984) and Michel and Dujon (1982) (see Figure 4), all class I introns are conserved in secondary structure. Intron aI3b can also be folded into a structure which is homologous to that of other class I introns. The seven conserved cis-acting sequences which characterize all class I introns (IG, E, P, Q, R, E', and S) are labeled and underlined. Also shown are the less well conserved stem structures P2, P5, P6, and P8. The possible unusual interactions between sequences between IG and E (A' and B') and E' and S (A and B) are also indicated by underlining. Lower case letters indicate exon sequences.

Figure 36 (secondary-structure drawing; labeled features include IG, P2, P5, P6, P8, A and B, with gaps of 731, 151, 108, 395 and 218 nt marked)

Figure 37 (secondary-structure drawing of the 3' end of aI3b; labeled features include Q and P9)

Figure 37. Proposed secondary structure of the 3' end of aI3b

The sequence of the 3' end of aI3b (shown as a 108 nucleotide gap in Figure 36) can be folded into two base paired helices. The first of these is a perfect twelve base pair stem which is analogous to the P9 stem observed in other class I introns (reviewed by Waring and Davies, 1984). The second stem structure is an imperfect 35 nucleotide stem located 3' of P9. A loop structure exists at the end of P9. It is possibly significant that this loop can base pair with a sequence located between Q and R.

5' G+C Cluster

3' G+C Cluster

Figure 38. Homology between G+C clusters in aI3b

Two G+C clusters exist in aI3b in the closed reading frame within 400 nucleotides of the 3' end of the ORF. These two G+C clusters are 38 and 35 nucleotides long, respectively. It is possibly significant that these two G+C clusters are similar in sequence and can be aligned at 26 nucleotide positions.
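Clusters of this kind stand out against the A+T rich intron sequence when the G+C fraction is computed in a sliding window. The following is a minimal sketch of such a scan and is not the method used to identify the clusters in this study; the function names, the window width and the threshold are all hypothetical.

```python
def gc_fraction(dna):
    """Fraction of G and C nucleotides in a sequence (spaces are ignored)."""
    seq = "".join(dna.upper().split())
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(dna, width, threshold):
    """Return 0-based start indices of windows whose G+C fraction meets the threshold."""
    seq = "".join(dna.upper().split())
    return [
        start
        for start in range(0, len(seq) - width + 1)
        if gc_fraction(seq[start:start + width]) >= threshold
    ]


if __name__ == "__main__":
    # Nucleotides 1265-1314 of Figure 29, which include one of the G+C rich stretches.
    region = "ttagttccgg gcccgccgag cagccagaac cccggacgga gaaataaaaa"
    print(gc_fraction(region))
    print(gc_windows(region, width=20, threshold=0.7))
```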

Figure 39. Comparison of cob I4 and oxi3 I4

This figure is a summary of the data presented by Anziano et al. (1982) which shows that cob I4 and oxi3 I4 are homologous to each other and differ primarily by the presence of a series of insertions found in cob I4 which are absent in oxi3 I4. Three insertions occur before the first LAGLI-DADG sequence in the ORFs. The largest insertion (218 nucleotides) occurs immediately after the end of the ORF in cob I4. Exons, intron ORFs, and closed reading frames are shown as triple, double and single width lines, respectively. The first and second LAGLI-DADG sequences are shown as "A" and "B", respectively.

Cob intron 4 (bI4), exon 5

76% nucleotide homology

66% amino acid homology

Oxi3 intron 4 (aI4), exon 5

Scale bar: 0.25 kb

Figure 40. Comparison of cob I3 from S. cerevisiae and the cob intron from A. nidulans

The S. cerevisiae cob I3 intron and the cob intron from A. nidulans are homologous and differ primarily by a single 518 nucleotide insertion in the cob I3 sequence relative to the A. nidulans sequence. The homology is highest near the first LAGLI-DADG sequence and the ends of the respective ORFs. The homology is poorest at the 5' ends of these introns. Exons, intron ORFs, and closed reading frames are shown as triple, double and single width lines, respectively. The first and second LAGLI-DADG sequences are shown as "A" and "B", respectively. There is a G+C cluster in the insert found in the S. cerevisiae intron.

S. cerevisiae cob intron 3 (bI3)

Nucleotide homology

Amino acid homology

A. nidulans cob intron

Figure 40

Figure 41. Comparison of the amino acid sequences from the ORFs of cob I3 and the A. nidulans cob intron

The amino acid sequences of cob I3 from S. cerevisiae and the cob intron from A. nidulans can be aligned as shown. These two intron ORFs are nearly colinear. The homology between these introns increases dramatically at the first LAGLI-DADG sequence and continues to the ends of the respective ORFs. The homology is highest near the first LAGLI-DADG sequence.

Amino Acid Homology between cob I3 of S. cerevisiae and the A. nidulans cob Intron ORFs

                                         5         15         25
cob A................................XTDEP,QCGDVLLKIL,LNAGKSPILG,
cob I3....NMEDPXXSNM,MLNKSVLCXN,IFIWMMNXSX,IQLIIXNNMI,VNKNNMVKMF,
                  10         20         30         40         50

                                                              LA
                  35         45         55         65         75
cob A.....FAXDLFFIIV,LLIGVK1AMT,RGKSAGVRSL,HTSEASQRLH,AGDLTXAXLT,
cob I3....IMRRKLAVIN,MYMYMKLIIQ,RLXSYYMNNT,IIXDKNHKLI,TDNPIXAXIV,
                  60         70         80         90        100

          GLI-DADG
                  85         95        105        115        125
cob A.....GLFEGDGXFS,ITKKGKXLTX,ELGIELSIKD,VQLIXK1KII,LGXGXTSFRK,
cob I3....GLFEGDOVXT,XSKKGKXLLX,ELGIEHHIRD,XQLLXKXKNX,LGIGKVTIKK,
                 110        120        130        140        150

                 130        140        150        160        170
cob A.....IN.....EXE,MVALRXRDKN,HLKSFILPIF,EKXPHFSHKQ,XDXLRFRNAL,
cob I3....LKMKDGTXKE,MCKFNVRNKN,HLKNXXXPXF,NKXPHLTNKH,XDXLXFKDNL,
                 160        170        180        190        200

                                                LAGLI-DADG
                 180        190        200        210        220
cob A.....LSGXXSLEDL,PDXTRSDEPL,NSXESXXNTS,XFSAVLVGFI,EAEGCFSVXK,
cob I3....LKDXKXXNDL,SXXLRPIKPF,NTLELILNKN,XFSSWLIGFF,EAKSCFSNXK,
                 210        220        230        240        250

                 230        240        250        260        270
cob A.....LNKDDDXLIA,SFDXAQRDGD,XLISAXRKXL,SFTTKVXLDK,TNCSKLKVTS,
cob I3....PHNKKMKL.A,SFEVSMNNKM,EVHLAXKSXL,KXNNNXXMNE,FHNSKMTLKS,
                 259        269        279        289        299

                 280        290        300        310        320
cob A.....VRS7ENXXEF,LQNAP7KLLG,NKKLQXLLWL,KQLRKXSRXS,EKXEXPSNX.
cob I3....XNDIKNVVMF,XHNNPIKLLG,XKKLQXLLFL,KDLRXITKXH,NXFKXPSKX.
                 309        319        329        339        349

Figure 41

Figure 42. Control lambda digests for rickettsial blots

Rickettsial DNAs were digested with HindIII. To determine whether the rickettsial digestions went to completion, one tenth to one twentieth of each rickettsial digestion was removed and used to digest 0.5 micrograms of lambda phage DNA. If the lambda digestion went to completion, it was assumed that the rickettsial digestion also went to completion. In the series of rickettsial digestions shown here, the R. bellii, R. canada, and R. siberica digestions were ethanol precipitated and digested again at a lower concentration of DNA in order to examine the lambda digestions more accurately.

R. akari

R. bellii

R. canada

R. conorii

R. montana

R. parkeri

R. rhipicephali

R. rickettsii

R. siberica

Figure 43. HindIII digestions of rickettsial DNAs (#1)

HindIII digestions of rickettsial DNAs are shown. This gel was blotted bidirectionally and used to produce the autoradiograms shown in Figures 44 and 45.

Figure 44. Rickettsial DNAs probed with "EcoR3"

The gel shown in Figure 43 was blotted to nitrocellulose and hybridized to "EcoR3". "EcoR3" is an EcoRI fragment from R. rickettsii which has been cloned into pAT153.

Figure 45. Rickettsial DNAs probed with "Pst1"

The gel shown in Figure 43 was blotted to nitrocellulose and hybridized to "Pst1". "Pst1" is a PstI fragment from R. rickettsii which has been cloned into pBR322.

Figure 46. Rickettsial DNAs probed with "Pst2"

The gel shown in Figure 47 was blotted to nitrocellulose and hybridized to "Pst2". "Pst2" is a PstI fragment from R. rickettsii cloned into pBR322.

Figure 47. HindIII digests of rickettsial DNAs (#2)

HindIII digests of rickettsial DNAs are shown. This gel was blotted bidirectionally and used to produce the autoradiogram shown in Figure 46 and to confirm the results shown in Figure 44.

Figure 48. Rickettsial DNAs probed with 16S RNA specific sequences

This autoradiogram was produced by Jon Clark using DNAs provided by Dave Ralph. It shows the hybridization of 16S rRNA specific sequences from R. quintana to a variety of rickettsial organisms.

Figure 49. Line drawing of "EcoR3" data

A composite drawing indicating the hybridization pattern of rickettsial DNAs to the "EcoR3" probe is shown.

FRAGMENT PATTERNS FOR ECO-RI-3 PROBE FROM R. RICKETTSII

Species shown: R. AKARI, R. AUSTRALIS, R. BELLII, R. CONORI, R. PARKERI, R. RHIPICEPHALI, R. RICKETTSII, R. SIBERICA, R. CANADA, R. TYPHI, R. MONTANA

Figure 50. Line drawing of "Pst1" data

This line drawing is a composite drawing showing the relative hybridization patterns of the various rickettsial DNAs examined in this study to the "Pst1" probe.

FRAGMENT PATTERNS FOR PST1-1 PROBE FROM R. RICKETTSII

Species shown: R. AKARI, R. AUSTRALIS, R. BELLII, R. CONORI, R. MONTANA, R. PARKERI, R. RHIPICEPHALI, R. RICKETTSII, R. SIBERICA, R. CANADA, R. TYPHI

Figure 51. Line drawing of "Pst2" data

This line drawing is a composite drawing showing the relative hybridization patterns of the various rickettsial DNAs examined in this study to the "Pst2" probe.

Species shown: R. AKARI, R. AUSTRALIS, R. BELLII, R. CONORI, R. MONTANA, R. PARKERI, R. RHIPICEPHALI, R. RICKETTSII, R. SIBERICA, R. CANADA, R. TYPHI

Figure 52. Line drawing of "16S" data

This line drawing is a composite drawing showing the relative hybridization patterns of the various rickettsial DNAs examined in this study to the Rochalimaea 16S rRNA probe.

FRAGMENT PATTERNS FOR 16S rRNA PROBE FROM ROCHALIMAEA

Figure 53. Summary of the rickettsial data

This figure shows the estimated branching pattern of the members of the genus Rickettsia. The axis is an estimation of the percent divergence between the various species.

ESTIMATED BRANCHING PATTERNS FOR SFG RICKETTSIAS

Axis: DIFFERENCES PER SITE (0.00 to 0.25)

Species shown: R. RICKETTSII, R. CANADA, R. CONORI, R. BELLII, R. TYPHI, R. PARKERI, R. RHIPICEPHALI, R. SIBERICA, R. MONTANA, R. AUSTRALIS, R. AKARI

Figure 53

Bibliography:

Abelson, J. 1979. RNA processing and the intervening sequence problem. Ann. Rev. Biochem. 48:1035-1069.

Alexander, Nancy J., Philip S. Perlman, Deborah K. Hanson and Henry R. Mahler. 1980. Mosaic organization of a mitochondrial gene: Evidence from double mutants in the cytochrome b region of Saccharomyces cerevisiae. Cell 20:199-206.

Altmann, R. 1890. Die Elementarorganismen und ihre Beziehungen zu den Zellen. Veit and Co., Leipzig.

Anacker, Robert L., Robert H. List, Raymond E. Mann, and Danny L. Wiedbrauk. 1986. Antigenic heterogeneity in high and low-virulence strains of Rickettsia rickettsii revealed by monoclonal antibodies. Infection and Immunity 51:653-660.

Anacker, Robert L., Thomas F. McCaul, Willy Burgdorfer, and Robert K. Gerloff. 1980. Properties of selected Rickettsiae of the Spotted Fever Group. Infection and Immunity 27:468-474.

Anderson, S., A.T. Bankier, B.G. Barrell, M.H.L. deBruijn, A.R. Coulson, J. Drouin, I.C. Eperon, D.P. Nierlich, B.A. Roe, F. Sanger, P.H. Schreier, A.J.H. Smith, R. Staden, and I.G. Young. 1982. Comparison of the human and bovine mitochondrial genomes. In: Mitochondrial Genes, ed. P. Slonimski, P. Borst, and G. Attardi. Cold Spring Harbor Laboratory, N.Y. pp. 5-44.

Andrew, R., J.M. Bonnin, and S. Williams. 1946. Tick typhus in North Queensland. Med. J. Aust. 2:253.

Andrews, Brenda J., Gerald A. Proteau, Linda G. Beatty, and Paul D. Sadowski. 1985. The FLP recombinase of the 2 micron circle DNA of yeast: Interaction with its target sequences. Cell 40:795-803.

Anziano, Paul Q., Deborah K. Hanson, Henry R. Mahler, and Philip S. Perlman. 1982. Functional domains in introns: Trans-acting and cis-acting regions of intron 4 of the cob gene. Cell 30:925-932.

Anziano, Paul Q. 1984. Functional splicing domains within the fourth intron of the cytochrome b gene of yeast mitochondria: DNA sequence of splicing defective mutants. Dissertation, Ohio State University.

Arnberg, A.C., G. Van der Horst, and H.F. Tabak. 1986. Formation of lariats and circles in self-splicing of the precursor to the large ribosomal RNA of yeast mitochondria. Cell 44:235-242.

Arnberg, A.C., G.-J. B. Van Ommen, L.A. Grivell, E.F.J. Van Bruggen, and P. Borst. 1980. Some yeast mitochondrial RNAs are circular. Cell 19:313-319.

Atkinson, William H. and Herbert H. Winkler. 1981. A centrifugal filtration method for the study of transport of nicotinamide-adenine dinucleotide and pyruvate by Rickettsia prowazekii. In Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. New York. p. 411-420.

Battey, J. and D.A. Clayton. 1978. The transcription map of mouse mitochondrial DNA. Cell 14:143.

Belfort, Marlene, Joan Pedersen-Lane, Deborah West, Karen Ehrenman, Gladys Maley, Frederick Chu, and Frank Maley. 1985. Processing of the intron-containing thymidylate synthase (td) gene of phage T4 is at the RNA level. Cell 41:375-382.

Bell, E.J., B.L. Bennet, and L. Whitman. 1946. Antigenic differences between strains of scrub typhus as demonstrated by cross-neutralization tests. Proc. Soc. Exp. Biol. Med. 62:134-137.

Bell, John E., Glen M. Kohls, Herbert G. Stoenner, and David B. Lackman. 1962. Nonpathogenic rickettsias related to the Spotted Fever Group isolated from ticks, Dermacentor variabilis and Dermacentor andersoni from eastern Montana. J. Immunol. 90:468-474.

Bell, E.J., and E.G. Pickens. 1953. A toxic substance associated with rickettsias of the Spotted Fever Group. J. Immunology 13:142.

Bell, E.J., and H.G. Stoenner. 1960. Immunologic relationships among the Spotted Fever Group of rickettsias determined by toxin neutralization tests in mice with convalescent animal serums. J. Immunol. 84:171.

Bennet, B.L., J.E. Smadel, and R.L. Gauld. 1949. Studies on scrub typhus (tsutsugamushi disease). IV. Heterogeneity of strains of tsutsugamushi as demonstrated by cross-neutralization tests. J. Immunol. 62:453-461.

Bibb, Maureen J., Richard A. Van Etten, Catharine T. Wright, Mark W. Walberg, and David A. Clayton. 1981. Sequence and gene organization of mouse mitochondrial DNA. Cell 26:167-180.

Bird, C.R., B. Koller, A.D. Auffret, A.K. Huttly, C.J. Howe, T.A. Dyer, and J.C. Grey. 1985. The wheat chloroplast gene for CF(zero) subunit I of ATPase synthase contains a large intron. EMBO J. 4:1381-1388.

Black, C.M., T. Tzianabos, F. Roumillat, M.A. Redus, J.E. McDade and C.B. Reimer. 1983. Detection and characterization of mouse monoclonal antibodies to epidemic typhus Rickettsiae. J. Clin. Microbiol. 18:561-568.

Birky, C. W. 1983. The partitioning of cytoplasmic organelles at cell division. International Review of Cytology, supplement 15:49-89.

Bonen, L. and M.W. Gray. 1980. The genes for wheat mitochondrial ribosomal and transfer RNA: Evidence for an unusual arrangement. Nucl. Acids Res. 8:319.

Bolotin, M., D. Coen, J. Deutsch, B. Dujon, P. Netter, E. Petrochilo, and P.P. Slonimski. 1971. La recombinaison des mitochondries chez Saccharomyces cerevisiae. Bull. Inst. Pasteur 69:215-239.

Bonjardim, C.A., and F.G. Nobrega. 1984. Nucleotide substitutions in a yeast mitochondria cis-acting mutant located in the last intron of the apocytochrome b gene. FEBS Lett. 169:73-78.

Bonnard, Geraldine, Francois Michel, Jacques Henry Weil, and Andre Steinmetz. 1984. Nucleotide sequence of the split tRNA-UAA-Leu gene from Vicia faba chloroplasts: evidence for structural homologies of the chloroplast tRNA-Leu intron with the intron from the autosplicable Tetrahymena ribosomal RNA precursor. Mol. Gen. Genet. 194:330-336.

Bonitz, Susan G., Gloria Coruzzi, Barbara E. Thalenfeld, Alexander Tzagoloff, and Giuseppi Macino. 1980. Assembly of the mitochondrial membrane system: Physical map of the oxi3 locus of yeast mitochondrial DNA. J. Biol. Chem. 255:11922-11926.

Bonitz, Susan G., Gloria Coruzzi, Barbara E. Thalenfeld, Alexander Tzagoloff, and Giuseppi Macino. 1980. Assembly of the mitochondrial membrane system: Structure and nucleotide sequence of the gene coding for subunit 1 of yeast cytochrome oxidase. J. Biol. Chem. 255:11927-11941.

Bozeman, F.M., B. Elisberg, J.W. Humphries, R. Hunelk, and D. Palmer. 1970. Serologic evidence of Rickettsia canada infections in man. J. Infectious Dis. 181:365-371.

Bozeman, F.M., S.A. Masiello, M.S. Williams and B.L. Elisberg. 1975. Epidemic typhus rickettsiae isolated from flying squirrels. Nature 255:545-547.

Bozeman, F.M., M.S. Williams, N.I. Stocks, D.P. Chadwick, B.L. Elisberg, D.E. Sonenshine and D.M. Laver. 1978. Ecological studies on epidemic typhus infection in the eastern flying squirrel. in: Rickettsiae and Rickettsial Diseases, ed. Kazar, Ormsbee and Tarasevich. Publishing house of Slovak Academy of Sciences. Bratislava. pp. 493-504.

Branch, Andrea D., Hugh D. Robertson, and Elizabeth Dickson. 1981. Longer than unit length viroid minus strands are present in RNA from infected plants. Proc. Nat. Acad. Sci. USA. 78:6381-6385.

Broach, J.R., V.R. Guarascio and M. Jayaram. 1982. Recombination within the yeast 2 micron circle is site specific. Cell 29:227-234.

Brody, Edward and John Abelson. 1985. The "Spliceosome": Yeast pre-messenger RNA associates with a 40S complex in a splicing-dependent reaction. Science 228:963-967.

Burgdorfer, Willy, A. Aeschlimann, O. Peter, S.F. Hayes, and R.N. Philip. 1979. Ixodes ricinus: Vector of a hitherto undescribed Spotted Fever Group agent in Switzerland. Acta Trop. 36:357-367.

Burgdorfer, W.A. and L.P. Brinton. 1975. Ann. N.Y. Acad. Sci. 266:61-72.

Burgdorfer, Willy, Stanley F. Hayes, Leo A. Thomas, and Jay L. Lancaster Jr. 1981. A new spotted fever group rickettsia from the lone star tick, Amblyomma americanum. In Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. New York. p. 595-602.

Burgdorfer, Willy, D.J. Sexton, R.K. Gerloff, R.L. Anacker, R.N. Philip, and L.A. Thomas. 1975. Rhipicephalus sanguineus: Vector of a new Spotted Fever Group rickettsia in the United States. Infection and Immunity 12:205-210.

Burger, Gertraud and Sigurd Werner. 1985. The mitochondrial URF1 gene in Neurospora crassa has an intron that contains a novel type of URF. J. Mol. Biol. 186:231-242.

Burke, John M., Caroline Breitenberger, Joyce E. Heckman, Bernard Dujon and Uttam L. RajBhandary. 1984. Cytochrome b gene of Neurospora crassa mitochondria. J. Biol. Chem. 259:504-511.

Burke, J. M., K.D. Irvine, K.J. Kaneko, B.J. Kerker, A.B. Oettgen, W.M. Tierney, C.L. Williamson, A.J. Zaug and T.R. Cech. 1986. Role of conserved sequence elements 9L and 2 in self-splicing of the Tetrahymena ribosomal RNA precursor. Cell 45:167-178.

Burke, John M. and Uttam L. RajBhandary. 1982. Intron within the large rRNA gene of N. crassa mitochondria: A long open reading frame and a consensus sequence possibly important in splicing. Cell 31:509-520.

Carignani, Giovanna, Olga Groudinsky, Domenico Frezza, Emma Schiavon, Elisabeth Bergantino, and Piotr P. Slonimski. 1983. An mRNA maturase is encoded by the first intron of the mitochondrial gene for the subunit I of cytochrome oxidase in S. cerevisiae. Cell 35:733-742.

Cech, Thomas R. 1985. Self-splicing RNA: Implications for evolution. Internat. Review of Cytol. 93:3-22.

Cech, T.R. 1986. The generality of self-splicing RNA: Relationship to nuclear mRNA splicing. Cell 44:207-210.

Cech, Thomas R., Arthur J. Zaug, and Paula J. Grabowski. 1981. In vitro splicing of the ribosomal RNA precursor of Tetrahymena: Involvement of a guanosine nucleotide in the excision of the intervening sequence. Cell 27:487-496.

Chomyn, Anne, Paolo Mariottini, Michael W.J. Cleeter, C. Ian Ragan, Akemi Matsuno-Yagi, Youssef Hatefi, Russell F. Doolittle, and Giuseppe Attardi. 1985. Six unidentified reading frames of human mitochondrial DNA encode components of the respiratory-chain NADH dehydrogenase. Nature 314:592-597.

Church, George M., Piotr P. Slonimski, and Walter Gilbert. 1979. Pleiotropic mutations within two yeast mitochondrial cytochrome genes block mRNA processing. Cell 18:1209-1215.

Clark-Walker, G.D., and K.S. Sriprakash. 1983. Map location of transcripts from Torulopsis glabrata mitochondrial DNA. EMBO J. 2:1465-1472.

Clark-Walker, G.D., C.R. McArthur, and K.S. Sriprakash. 1983. Order and Orientation of genic sequences in circular mitochondrial DNA from Saccharomyces exiguus. J. Mol. Evol. 19:333-341.

Collins, Richard A. and Alan M. Lambowitz. 1983. Structural variations and optional introns in the mitochondrial DNAs of Neurospora strains isolated from nature. Plasmid 9:53-70.

Chu, Frederick K., Gladys F. Maley, Frank Maley, and Marlene Belfort. 1984. Intervening sequence in the thymidylate synthase gene of bacteriophage T4. Proc. Nat. Acad. Sci. USA. 81:3049-3053.

Citterich, Manuela Helmer, Giorgio Morelli, and Giuseppe Macino. 1983. Nucleotide sequence and intron structure of the apocytochrome b gene of Neurospora crassa mitochondria. EMBO J. 2:1235-1242.

Colleaux, L., L. d'Auriol, M. Betermier, G. Cottarel, A. Jacquier, F. Galibert, and B. Dujon. 1986. Universal code equivalent of a yeast mitochondrial intron reading frame is expressed into E. coli as a specific double strand endonuclease. Cell 44:521-533.

Collmer, Candace Whitmer, A. Hadidi, and J.M. Kaper. 1985. Nucleotide sequence of peanut stunt virus reveals structural homologies with viroids and certain nuclear and mitochondrial introns. Proc. Nat. Acad. Sci. USA. 82:3110-3114.

Coolbaugh, James C., Joseph J. Progar, and Emilio Weiss. 1976. Enzymatic activities of cell-free extracts of Rickettsia typhi . Infection and Immunity. 14:298-305.

Clayton, David A. 1982. Replication of animal mitochondrial DNA. Cell 28:693-705.

Cummings, D.J., L. Belcour and C. Grandchamp. 1979. Mitochondrial DNA from Podospora anserina: Properties of mutant DNA and multimeric circular DNA from senescent cultures. Mol. Gen. Genet. 171:239.

Cummings, Donald J., Ian A. MacNeil, Joanne Domenico, and Etsuko T. Matsuura. 1985. Excision-amplification of mitochondrial DNA during senescence in Podospora anserina: DNA sequence analysis of three unique plasmids. J. Mol. Biol. 185:659-680.

Daum, Gunther, Peter C. Bohni, Gottfried Schatz. 1982. Import of proteins into mitochondria: cytochrome b2 and cytochrome c peroxidase are located in the intermembrane space of yeast mitochondria. J. Biol. Chem. 257:13028-13033.

Davies, R. Wayne, Richard B. Waring, John A. Ray, Terence A. Brown, and Claudio Scazzocchio. 1982. Making ends meet: a model for RNA splicing in fungal mitochondria. Nature 300:719-724.

Davis, B.D., R. Dulbecco, H.N. Eisen and H.S. Ginsberg. 1980. Microbiology. Harper and Row, Publishers, Inc. N.Y.

Deno, Hiroshi, Akira Kato, Kazuo Shinozaki, and Masahiro Sugiura. 1982. Nucleotide sequences of tobacco chloroplast genes for elongator tRNA-met and tRNA-val(UAC): the tRNA-val(UAC) gene contains a long intron. Nucl. Acids Res. 10:7511-7520.

Deno, Hiroshi, and Masahiro Sugiura. 1984. Chloroplast tRNA-gly gene contains a long intron in the D stem: Nucleotide sequences of tobacco chloroplast genes for tRNA-gly(UCC) and tRNA-arg(UCU). Proc. Nat. Acad. Sci. USA. 81:405-408.

Dente, Luciana, Gianni Cesareni, and Riccardo Cortese. 1983. pEMBL: A new family of single stranded plasmids. Nucl. Acids Res. 11:1645-1655.

Dhawale, S., D.K. Hanson, N.J. Alexander, P.S. Perlman, and H. R. Mahler. 1981. Regulatory interactions between mitochondrial genes: Interactions between two mosaic genes. Proc. Nat. Acad. Sci. USA. 78:1778-1782.

Dieckmann, Carol L., and Alexander Tzagoloff. 1985. Assembly of the yeast mitochondria membrane system: CBP 6, A yeast nuclear gene necessary for synthesis of cytochrome b. J. Biol. Chem. 260:1513-1520.

Diener, T.O. 1986. Viroid processing: A model involving the central conserved region and hairpin I. Proc. Nat. Acad. Sci. USA. 83:58-62.

Doersen, C.-J. and E.J. Stanbridge. 1982. Cytoplasmic and nuclear inheritance of Erythromycin resistance in human cells. in: Mitochondrial Genes, ed. P. Slonimski, P. Borst, and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp. 129-132.

Dujon, Bernard. 1980. Sequence of the intron and flanking exons of the mitochondrial 21S rRNA gene of yeast strains having different alleles at the omega and rib-1 loci. Cell. 20:185-197.

Dujon, Bernard and Alain Jacquier. 1983. Organization of the mitochondrial 21S rRNA gene in Saccharomyces cerevisiae: Mutants of the peptidyl transferase centre and nature of the omega locus. In: Mitochondria 1983. eds. R.J. Schweyen, K. Wolf, and F. Kaudewitz. W. de Gruyter and Co. Berlin. pp. 389-403.

Engels, William R. 1981. Estimating genetic divergence and genetic variability with restriction endonucleases. Proc. Nat. Acad. Sci. USA. 78:6329-6333.

Erickson, Jeanne M., Michele Rahire, and Jean-David Rochaix. 1984. Chlamydomonas reinhardii gene for the 32000 mol wt protein of photosystem II contains four large introns and is located entirely within the chloroplast inverted repeat. EMBO J. 3:2753-2762.

Eisemann, Christine S. and Joseph V. Osterman. 1976. Proteins of Typhus and Spotted Fever Group rickettsiae. Infection and Immunity 14:155-162.

Farrelly, Frances and Ronald A. Butow. 1983. Rearranged mitochondrial genes in the yeast nuclear genome. Nature 301:296-301.

Fox, Thomas D. and Christopher J. Leaver. 1981. The Zea mays mitochondrial gene coding cytochrome oxidase subunit II has an intervening sequence and does not contain TGA codons. Cell 26:315-323.

Frendewey, David, and Walter Keller. 1985. Stepwise Assembly of a pre-mRNA splicing complex requires U-snRNPs and specific intron sequences. Cell 42:355-367.

Gargouri, A., J. Lazowska and P.P. Slonimski. 1983. DNA-splicing of introns in the gene: A general way of reverting intron mutations. in: Mitochondria 1983. ed. R.J. Schweyen, K. Wolf and F. Kaudewitz. Walter de Gruyter. N.Y. pp. 259-268.

Garriga, Gian, and Alan M. Lambowitz. 1984. RNA splicing in Neurospora mitochondria: Self-splicing of a mitochondrial intron in vitro. Cell 38:631-641.

Gasser, Susan M., Gunther Daum, and Gottfried Schatz. 1982. Import of proteins into mitochondria: Energy-dependent uptake of precursors by isolated mitochondria. J. Biol. Chem. 257:13034-13041.

Gasser, Susan M., Akira Ohashi, Gunther Daum, Peter C. Bohni, Jane Gibson, Graeme A. Reid, Takashi Yonetani, and Gottfried Schatz. 1982. Imported mitochondrial proteins cytochrome b2 and cytochrome c2 are processed in two steps. Proc. Nat. Acad. Sci. USA. 79:267-271.

Gilbert, Walter. 1986. Origin of life: The RNA world. Nature 319:618.

Gimenez, D.F. 1965. Staining rickettsiae in yolk-sac cultures. Stain Technology 39:135-140.

Grabowski, Paula J., Sharon R. Seiler, and Phillip A. Sharp. 1985. A multicomponent complex is involved in the splicing of messenger RNA precursors. Cell 42:345-353.

Graf, L., H. Kossel and E. Stutz. 1980. Sequencing of 16S-23S spacer in a ribosomal operon of Euglena gracilis chloroplast DNA reveals two tRNA genes. Nature 286:908-910.

Gray, M.W. 1983. The bacterial ancestry of plastids and mitochondria. BioScience 33:693-699.

Gray, M.W., L. Bonen, D. Falconet, T.Y. Huh, M.N. Schnare and D.F. Spencer. 1982. Mitochondrial ribosomal RNAs of Triticum aestivum (wheat): Sequence analysis and gene organization, in: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory, pp.483-488.

Gray, Michael W., David Sankoff, and Robert J. Cedergren. 1984. On the evolutionary descent of organisms and organelles: a global phylogeny based on a highly conserved structural core in small subunit ribosomal RNA. Nucl. Acids Res. 12:5837-5852.

Green, M.R., M.F. Grimm, R.R. Goewert, R.A. Collins, M.D. Cole, A.M. Lambowitz, J.E. Heckman, S. Yin and U.L. RajBhandary. 1981. Transcripts and processing patterns for the ribosomal RNA and transfer RNA region of Neurospora crassa mitochondrial DNA. J. Biol. Chem. 256:2027-2034.

Greer, Chris L., Craig L. Peebles, Peter Gegenheimer, and John Abelson. 1983. Mechanism of action of a yeast RNA ligase in tRNA splicing. Cell 32:537-546.

Grivell, L.A., L. Bonen, and P. Borst. 1983. Mosaic genes and RNA processing in mitochondria. In: Genes: Structure and Expression, ed. A.M. Kroon. John Wiley and Sons Ltd. pp. 279-306.

Groves, Michael G. and Joseph V. Osterman. 1978. Host defenses in experimental Scrub Typhus: Genetics of natural resistance to infection. Infection and Immunity 19:583-588.

Groves, Michael G., David L. Rosenstreich, Benjamin A. Taylor, and Joseph V. Osterman. 1980. Host defenses in experimental Scrub Typhus: Mapping the gene that controls natural resistance in mice. J. Immunol. 125:1395-1399.

Halbreich, A., P. Pajot, M. Foucher, C. Grandchamp, and P. Slonimski. 1980. A pathway of cytochrome b mRNA processing in yeast mitochondria: Specific splicing steps and an intron-derived circular RNA. Cell 19:321-329.

Haldi, M.A. 1985. RNA splicing in yeast mitochondria: Genetic and molecular studies of the folded structures of two introns of the cytochrome b gene. Dissertation. Ohio State University.

Hanahan, Douglas. 1983. Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol. 168:557-580.

Hanson, Barbara. 1984. Characterization of Rickettsia tsutsugamushi outer membrane proteins. In: Microbiology-1984. ed. L. Leive and D. Schlessinger. Am. Soc. Microbiol., Washington, D.C. pp.240-243.

Haung, K.-Y. 1967. J. Bacteriol. 93:653-859.

Hase, Tatsuo. 1983. Growth pattern of Rickettsia tsutsugamushi in irradiated L cells. J. Bacteriol. 154:879-892.

Hayes, S.F. and W. Burgdorfer. 1979. Ultrastructure of Rickettsia rhipicephali, a new member of the spotted fever group rickettsiae in tissues of the host vector Rhipicephalus sanguineus. J. Bacteriol. 137:605-613.

Hechemy, Karim E., Roy W. Stevens, Sandra Sasowski, Edith E. Michaelson, Elizabeth A. Casper, and Robert N. Philip. 1979. Discrepancies in Weil-Felix and microimmunofluorescence tests for Rocky Mountain Spotted Fever. J. Clinical Micro. 9:292-293.

Hensgens, Lambert A.M., Annika C. Arnberg, Egbert Roosendaal, Gerda van der Horst, Renske van der Veen, Gert-Jan B. van Ommen, and Leslie A. Grivell. 1983. Variation, transcription and circular RNAs of the mitochondrial gene for subunit I of cytochrome c oxidase. J. Mol. Biol. 164:35-58.

Hensgens, Lambert A.M., Linda Bonen, Muus de Haan, Gerda van der Horst, and Leslie A. Grivell. 1983. Two intron sequences in yeast mitochondrial Cox1 gene: Homology among URF-containing introns and strain-dependent variation in flanking exons. Cell 32:379-389.

Hensgens, L.A.M., G. van der Horst, H.L. Vos, and L. A. Grivell. 1984. RNA processing in yeast mitochondria: characterization of mit- mutants disturbed in the synthesis of subunit I of cytochrome c oxidase. Current Genetics 8:457-465.

Hernandez, Nouria, and Walter Keller. 1983. Splicing of in vitro synthesized messenger RNA precursors in HeLa cell extracts. Cell 35:89-99.

Highfield, Peter E., and R. John Ellis. 1978. Synthesis and transport of the small subunit of chloroplast ribulose bisphosphate carboxylase. Nature. 271:420-424.

Hill, John, Patricia McGraw, and Alexander Tzagoloff. 1985. A mutation in yeast mitochondrial DNA results in a precise excision of the terminal intron of the cytochrome b gene. J. Biol. Chem. 260:3235-3238.

Holl, Jurgen, Gerhard Rodel, and Rudolf J. Schweyen. 1985. Suppressor mutations identify box9 as a central nucleotide sequence in the highly ordered structure of intron RNA in yeast mitochondria. EMBO J. 4:2081-2085.

Holl, Jurgen, Cornelia Schmidt, and Rudolf J. Schweyen. 1985. Cob intron 3 in yeast mtDNA: Nucleotide sequence and mutations in a novel RNA domain. In: Achievements and Perspectives of Mitochondrial Research, eds. E. Quagliariello, Slater, Palmieri, Saccone, and Kroon. Elsevier Science Publishers. N.Y., N.Y. pp 227-236.

Hope, J. Thomas, Renato J. Aguilera, Mark E. Minie, and Hitoshi Sakano. 1986. Endonucleolytic activity that cleaves immunoglobulin recombination sequence. Science 231:1141-1145.

Hopper, Anita K. and Akemi H. Furukawa. 1982. Defects in modification of cytoplasmic and mitochondrial transfer RNAs are caused by single nuclear mutations. Cell 28:543-550.

van der Horst, Gerda, and Henk F. Tabak. 1985. Self-splicing of yeast mitochondrial ribosomal and messenger RNA precursors. Cell 40:759-766.

Huebner, Robert J., William L. Jellison, and Charles Pomerantz. 1946. Rickettsialpox- A newly recognized rickettsial disease. Public Health Reports 61:1677-1682.

Hudspeth, Michael E.S., Robert D. Vincent, Philip S. Perlman, Deborah S. Shumard, Laurelee O. Treisman, and Lawrence I. Grossman. 1984. Expandable var 1 gene of yeast mitochondrial DNA: In-frame insertions can explain the strain-specific protein size polymorphisms. Proc. Natl. Acad. Sci. USA. 81:3148-3152.

Hudspeth, M.E.S., D.S. Shumard, C.J.R. Bradford, and L.I. Grossman. 1983. Organization of Achlya mtDNA: A population with two orientations and a large inverted repeat containing the rRNA genes. Proc. Natl. Acad. Sci. USA. 80:142-146.

Ibrahim, Z. Ades and Ronald A. Butow. 1980. The transport of proteins into yeast mitochondria: Kinetics and pools. J. Biol. Chem. 255:9925-9935.

Inoue, Tan, and Thomas R. Cech. 1985. Secondary structure of the circular form of the Tetrahymena rRNA intervening sequence: A technique for RNA structure analysis using chemical probes and reverse transcriptase. Proc. Nat. Acad. Sci. USA. 82:648-652.

Irons, E.N. 1946. Clinical and laboratory variation of virulence in scrub typhus. Am. J. Trop. Med. 26:165-174.

Ise, Wolfgang, Horst Haiker, and Hanns Weiss. 1985. Mitochondrial translation of subunits of the rotenone-sensitive NADH:ubiquinone reductase in Neurospora crassa. EMBO J. 4:2075-2080.

Jacq, C., J. Banroques, A.M. Becam, P.P. Slonimski, N. Guiso, and A. Danchin. 1984. Antibodies against a fused 'lacZ-yeast mitochondrial intron' gene product allow identification of the mRNA maturase encoded by the fourth intron of the yeast cob-box gene. EMBO J. 3:1567-1572.

Jacq, C., P. Pajot, J. Lazowska, G. Dujardin, M. Claisse, O. Groudinsky, H. De La Salle, C. Grandchamp, M. Labouesse, A. Gargouri, B. Guiard, A. Spyridakis, M. Dreyfus and P.P. Slonimski. 1982. Role of introns in yeast cytochrome b gene: Cis and trans-acting signals, intron manipulation, expression, and intergenic communication. in: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.155-184.

Jacquier, Alain and Bernard Dujon. 1983. The intron of the mitochondrial 21S rRNA gene: Distribution in different yeast species and the sequence comparison between Kluyveromyces thermotolerans and Saccharomyces cerevisiae. Mol. Gen. Genet. 192:487-499.

Jacquier, Alain and Bernard Dujon. 1985. An intron-encoded protein is active in a gene conversion process that spreads an intron into a mitochondrial gene. Cell 41:383-394.

Julou, C., and M. Bolotin-Fukuhara. 1982. Genetics of mitochondrial ribosomes of yeast: mitochondrial lethality of a double mutant carrying two mutations of the 21S ribosomal RNA gene. Mol. Gen. Genet. 188:256-260.

Kan, Nancy C., and Joseph G. Gall. 1982. The intervening sequence of the ribosomal RNA gene is highly conserved between two Tetrahymena species. Nucl. Acids Res. 10:2809-2822.

Kao, Teh-hui, Eunpyo Moon, and Ray Wu. 1984. Cytochrome oxidase subunit II gene of rice has an insertion sequence within the intron. Nucl. Acids Res. 12:7305-7315.

Keller, Mario, and F. Michel. 1985. The introns of the Euglena gracilis chloroplast gene which codes for the 32-kDa protein of photosystem II. FEBS Lett. 179:69-73.

Keller, Mario, and Erhard Stutz. 1984. Structure of the Euglena gracilis chloroplast gene (psbA) coding for the 32-kDa protein of photosystem II. FEBS Lett. 175:173-177.

Keller, Walter. 1984. The RNA lariat: a new ring to the splicing of mRNA precursors. Cell 39:423-425.

Kjems, Jorgen, and Roger A. Garrett. 1985. An intron in the 23S ribosomal RNA gene of the archaebacterium Desulfurococcus mobilis. Nature 318:675-676.

Klar, Amar J.S., James B. Hicks, and Jeffrey N. Strathern. 1982. Directionality of yeast mating-type interconversion. Cell. 28:551-561.

Koch, W., K. Edwards and H. Kossel. 1981. Sequencing of the 16S-23S spacer in a ribosomal RNA operon of Zea mays chloroplast DNA reveals two split tRNA genes. Cell 25:203-213.

Koller, Barbara, Jeffrey C. Gingrich, Gary L. Stiegler, Michael A. Farley, Hajo Delius, and Richard B. Hallick. 1984. Nine introns with conserved boundary sequences in the Euglena gracilis chloroplast ribulose-1,5-bisphosphate carboxylase gene. Cell 36:545-553.

Koller, Barbara, Jill Clarke, and Hajo Delius. 1985. The structure of precursor mRNAs and of excised intron RNAs in chloroplasts of Euglena gracilis. EMBO J. 4:2445-2450.

Kostriken, Richard, Jeffrey N. Strathern, Amar J.S. Klar, James B. Hicks, and Fred Heffron. 1983. A site-specific endonuclease essential for mating-type switching in Saccharomyces cerevisiae. Cell 35:167-174.

Kotylak, Z. and P.P. Slonimski. 1976. Joint control for cytochrome a and b by a unique mitochondrial RNA region comprising four genetic loci, in: The Genetic Function of Mitochondrial DNA. ed. C. Saccone and A.M. Kroon. Elsevier/North-Holland Biomedical Press. Amsterdam, pp.143.

Krainer, Adrian R., Tom Maniatis, Barbara Ruskin, and Michael R. Green. 1984. Normal and mutant human beta-globin pre-mRNAs are faithfully and efficiently spliced in vitro. Cell 36:993-1005.

Krause, Duncan C., Herbert H. Winkler and David O. Wood. 1985. Cloning and expression of the Rickettsia prowazekii ADP/ATP translocator in Escherichia coli. Proc. Natl. Acad. Sci. USA. 82:3015-3019.

Kearsey, S.E., and I.W. Craig. 1982. Genetic basis of chloramphenicol resistance in mouse and human cell lines, in: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.117-120.

Kruger, K., P.J. Grabowski, A.J. Zaug, J. Sands, D.E. Gottschling and T. R. Cech. 1982. Self-splicing RNA: Autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell 31:147-157.

Kuck, Ulrich, Birgit Kappelhoff, and Karl Esser. 1985. Despite mtDNA polymorphism the mobile intron (plDNA) of the COI gene is present in ten different races of Podospora anserina. Curr. Genet. 10:59-67.

Kuck, Ulrich, Heinz D. Osiewacz, Udo Schmidt, Birgit Kappelhoff, Erika Schulte, Ulf Stahl, and Karl Esser. 1985. The onset of senescence is affected by DNA rearrangements of a discontinuous mitochondrial gene in Podospora anserina. Curr. Genet. 9:373-382.

Kuntzel, Hans and Heinrich G. Kochel. 1981. Evolution of rRNA and origin of mitochondria. Nature 293:751-755.

Labouesse, Michel, Genevieve Dujardin, and Piotr P. Slonimski. 1985. The yeast nuclear gene NAM2 is essential for mitochondrial DNA integrity and can cure a mitochondrial RNA-maturase deficiency. Cell. 41:133-143.

Labouesse, Michel, and Piotr P. Slonimski. 1983. Construction of novel cytochrome b genes in yeast mitochondria by subtraction or addition of introns. EMBO J. 2:269-276.

Lackman, D.B., E.J. Bell, H.G. Stoenner, E.G. Pickens. 1965. The Rocky Mountain Spotted Fever Group of rickettsias. Health Lab. Sci. 2:135.

Lambowitz, A.M. 1979. Preparation and analysis of mitochondrial ribosomes. Meth. Enzymology. 59:421-433.

Lambowitz, A.M., N.H. Chua and D.J.L. Luck. 1976. Mitochondrial ribosome assembly in Neurospora. Preparation of mitochondrial precursor particles, site of synthesis of mitochondrial ribosomal proteins, and studies on poky mutant. J. Mol. Biol. 107:223-253.

Lane, Robert S., Richard W. Emmons, Dale V. Dondero, and Bernard C. Nelson. 1981. Ecology of Tick-borne agents in California I. Spotted Fever Group rickettsiae. Am. J. Trop. Med. Hyg. 30:239-252.

Lang, B.F. 1984. The mitochondrial genome of the fission yeast Schizosaccharomyces pombe: Highly homologous introns are inserted at the same position of otherwise less conserved cox 1 genes in Schizosaccharomyces pombe and Aspergillus nidulans. EMBO J. 3:2129-2136.

Lang, B.F., F. Ahne, S. Distler, H. Trinkl, F. Kaudewitz, and K. Wolf. 1983. Sequence of the mitochondrial DNA, arrangement of genes and processing of their transcripts in Schizosaccharomyces pombe. In: Mitochondria 1983 ed. Schweyen, Wolf, and Kaudewitz. Walter de Gruyter. N.Y., N.Y. pp.313-330.

Lange, James V. and David H. Walker. 1984. Production and characterization of monoclonal antibodies to Rickettsia rickettsii. Infection and Immunity 46:289-294.

Langford, C.J. and D. Gallwitz. 1983. Evidence for an intron-contained sequence required for splicing of yeast RNA polymerase II transcripts. Cell 33:519-527.

Langford, Christopher J., Franz-Josef Klinz, Cornelia Donath, and Dieter Gallwitz. 1984. Point mutations identify the conserved, intron-contained TACTAAC box as an essential splicing signal sequence in yeast. Cell 36:645-653.

LaPolla, R.J. and A.M. Lambowitz. 1979. Binding of mitochondrial ribosomal proteins to a mitochondrial ribosomal precursor RNA containing a 2.3-kilobase intron. J. Biol. Chem. 254:11746-11750.

Lazowska, Jaga, Claude Jacq, and Piotr P. Slonimski. 1980. Sequence of introns and flanking exons in wild-type and box3 mutants of cytochrome b reveals an interlaced splicing protein coded by the intron. Cell 22:333-348.

Lazowska, Jaga, Claude Jacq and Piotr Slonimski. 1981. Splice points of the third intron in the yeast mitochondrial cytochrome b gene. Cell 27:12-14.

Leaver, C.J. and B.G. Forde. 1980. Mitochondrial genome expression in higher plants, in: Genome Organization and Expression in Plants, ed. C.J. Leaver. Plenum Press. N.Y. pp.407.

Leaver, C.J., B.G. Forde, L. Dixon and T. Fox. 1982. Mitochondrial genes and cytoplasmically inherited variation in higher plants, in: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.457-470.

Leaver, C.J. and M.W. Gray. 1982. Mitochondrial genome organization and expression in higher plants. Ann. Rev. Plant Physiol. 33:373-402.

Lewin, Roger. 1986. RNA catalysis gives fresh perspective on the origin of life. Science 231:545-546.

McKiel, John A., E. John Bell, and David B. Lackman. 1967. Rickettsia canada: A new member of the Typhus Group rickettsiae isolated from leporispalustris ticks in Canada. Can. J. Micro. 13:503-510.

Macreadie, Ian G., Rose M. Scott, Andrew R. Zinn, and Ronald A. Butow. 1985. Transposition of an intron in yeast mitochondria requires a protein encoded by that intron. Cell 41:395-402.

Margulis, L. 1970. Origin of Eukaryotic Cells. Yale University Press. New Haven, Conn.

Martin, N.C., K. Underbrink-Lyon, and D.L. Miller. 1982. Identification and characterization of a yeast mitochondrial locus necessary for tRNA biosynthesis. In: Mitochondrial Genes, ed. P. Slonimski, P. Borst, and G. Attardi. Cold Spring Harbor Laboratory, N.Y. pp.263-268.

Michaelis, G., G. Mannhaupt, E. Pratje, E. Fisher, J. Naggert, and E. Schweizer. 1982. Mitochondrial translation products in nuclear respiration-deficient pet mutants of Saccharomyces cerevisiae. in: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.311-322.

Michel, Francois, and Donald J. Cummings. 1985. Analysis of class I intron in a mitochondrial plasmid associated with senescence of Podospora anserina reveals extraordinary resemblance to the Tetrahymena ribosomal intron. Curr. Genet. 10:69-79.

Michel, Francois, Alain Jacquier, and Bernard Dujon. 1982. Comparison of fungal mitochondrial introns reveals extensive homologies in RNA secondary structure. Biochimie 64:867-881.

Michel, F. and B.F. Lang. 1985. Mitochondrial class II introns encode proteins related to the reverse transcriptases of retroviruses. Nature (in press).

Michel, Francois, and Bernard Dujon. 1983. Conservation of RNA secondary structures in two intron families including mitochondrial-, chloroplast-, and nuclear-encoded members. EMBO J. 2:33-38.

Miller, Dennis L. and Nancy C. Martin. 1983. Characterization of the yeast mitochondrial locus necessary for tRNA biosynthesis: DNA sequence analysis and identification of a new transcript. Cell 34:911-917.

Morelli, Giorgio, and Giuseppe Macino. 1984. Two intervening sequences in the ATPase subunit 6 gene of Neurospora crassa. J. Mol. Biol. 178:491-507.

Mounolou, J., H. Jakob and P. Slonimski. 1966. Mitochondrial DNA from yeast "petite" mutants: Specific changes in buoyant density corresponding to different cytoplasmic mutations. Biochem. Biophys. Res. Comm. 24:218.

Mounolou, J., H. Jakob and P. Slonimski. 1968. in: Biochemical aspects of the biogenesis of mitochondria. ed. E. Slater et al. Adriatica Editrice. pp.473.

Mount, S.M. 1982. A catalogue of splice junction sequences. Nucl. Acids Res. 10:459-472.

Muller, Peter P., Michelle K. Reif, Shen Zonghou, Christian Sengstag, Thomas L. Mason. 1984. A nuclear mutation that post-transcriptionally blocks accumulation of a yeast mitochondrial gene product can be suppressed by a mitochondrial gene rearrangement. Academic Press Inc. N.Y.,N.Y.

Myers, William F. and Charles L. Wisseman, Jr. 1980. Genetic relatedness among the Typhus group Rickettsiae. Internat. J. Systematic. Bac. 30:143-150.

Myers, William F. and Charles L. Wisseman, Jr. 1983. Taxonomic relationship of Rickettsia canada to the Typhus and Spotted Fever groups of the genus Rickettsia. In: Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. New York. p. 313-326.

Myers, W.F., C.L. Wisseman Jr., P. Fiset, E. V. Oaks and J.F. Smith. 1979. Taxonomic relationship of vole agent to Rochalimaea quintana. Infection and Immunity 26:976-983.

Nargang, Frank E., John B. Bell, Lori L. Stohl, and Alan M. Lambowitz. 1984. The DNA sequence and genetic organization of a Neurospora mitochondrial plasmid suggest a relationship to introns and mobile elements. Cell 38:441-453.

Nasmyth, Kim A. 1985. At least 1400 base pairs of 5'-flanking DNA is required for the correct expression of the HO gene in yeast. Cell 42:213-223.

Nasmyth, Kim A. 1982. The regulation of yeast mating-type chromatin structure by "Sir": An action at a distance affecting both transcription and transposition. Cell. 30:567-578.

Netter, Pierre, Claude Jacq, Giovanna Carignani, and Piotr P. Slonimski. 1982. Critical sequences within mitochondrial introns: Cis-dominant mutations of the "cytochrome-b-like" intron of the oxidase gene. Cell 28:733-738.

Newman, Andrew J., Ren-Jang Lin, Soo-Chen Cheng, and John Abelson. 1985. Molecular consequences of specific intron mutations on yeast mRNA splicing in vivo and in vitro. Cell 42:335-344.

Nobrega, Francisco G. and Alexander Tzagoloff. 1980. Assembly of the mitochondrial membrane system: DNA sequence and organization of the cytochrome b gene in Saccharomyces cerevisiae D273-10B. J. Biol. Chem. 255:9828-9837.

Noller, H.F., J. Kop, V. Wheaton, J. Brosius, R.R. Gutell, A. Kopylov, F. Dohme and W. Herr. 1981. Nucl. Acids Res. 9:6167-6189.

Nomiyama, Hisayuki, Yoshiyuki Sakaki, Yasuyuki Takagi. 1981. Nucleotide sequence of a ribosomal RNA gene intron from slime mold Physarum polycephalum. Proc. Nat. Acad. Sci. USA. 78:1376-1380.

Nomiyama, Hisayuki, Teruhisa Tsuzuki, Shojo Wakasugi, Makoto Fukuda, and Kazunori Shimada. 1984. Interruption of a human nuclear sequence homologous to mitochondrial DNA by a member of the KpnI 1.8 kb family. Nucl. Acids Res. 12:5225-5234.

Oaks, Edwin V., Charles L. Wisseman, Jr., and Jonathan F. Smith. 1981. Radiolabeled polypeptides of Rickettsia prowazekii grown in microcarrier cell culture. In: Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. p. 461-472.

Ogden, R.C., J.S. Beckman, J. Abelson, H.S. Kang, D. Soll and O. Schmidt. 1979. In vitro transcription and processing of a yeast tRNA gene containing an intervening sequence. Cell 17:399-406.

van Ommen, G.-J. B., G.S.P. Groot and L.A. Grivell. 1979. Transcription maps of mtDNAs of two strains of Saccharomyces: Transcription of strain-specific insertions; complex RNA maturation and splicing. Cell 18:511-523.

Ormsbee, R., M. Peacock, R. Philip, E. Casper, J. Plorde, T. Gabre-Kidan, and L. Wright. 1978. Antigenic relationships between the Typhus and Spotted Fever Group of rickettsiae. Am. J. Epidem. 108:53-59.

Orozco, E.M., K.E. Rushlow, J.R. Dodd and R.B. Hallick. 1980. Euglena gracilis chloroplast ribosomal RNA transcription units. J. Biol. Chem. 255:10997-11001.

Osiewacz, H.D. and K. Esser. 1984. The mitochondrial plasmid of Podospora anserina: A mobile intron of a mitochondrial gene. Current Genetics 8:299-305.

Padgett, Richard A., Maria M. Konarska, Paula J. Grabowski, Steven F. Hardy, and Phillip A. Sharp. 1984. Lariat RNAs as intermediates and products in the splicing of messenger RNA precursors. Science 225:898-903.

Padgett, Richard A., Stephen M. Mount, Joan A. Steitz, and Phillip A. Sharp. 1983. Splicing of messenger RNA precursors is inhibited by antisera to small nuclear ribonucleoprotein. Cell 35:101-107.

Palmer, Jeffrey D. 1983. Chloroplast DNA exists in two orientations. Nature. 301:92-94.

Pedersen Jr., Carl E., and Van D. Walters. 1978. Comparative electrophoresis of Spotted Fever Group rickettsial proteins. Life Sciences 22:583-588.

Peebles, Craig L., Peter Gegenheimer, and John Abelson. 1983. Precise excision of intervening sequences from precursor tRNAs by a membrane-associated yeast endonuclease. Cell 32:525-538.

Peebles, C.L., P.S. Perlman, K.L. Mecklenburg, M.L. Petrillo, J.H. Tabor, K.A. Jarrell, and H.-L. Cheng. 1986. A self-splicing RNA excises an intron lariat. Cell 44:213-223.

Perlman, Philip S. 1975. Genetic analysis of petite mutants of Saccharomyces cerevisiae: Transmissional types. Genetics 82:645-663.

Perlman, P.S., M.G. Douglas, R.L. Strausberg, and R.A. Butow. 1977. Localization of genes for variant forms of mitochondrial proteins on mitochondrial DNA of Saccharomyces cerevisiae. J. Mol. Biol. 115:675-694.

Perlman, Philip S. and Henry R. Mahler. 1974. Derepression of mitochondria and their enzymes in yeast: Regulatory aspects. Arch. Biochem. and Biophys. 162:248-271.

Perlman, Philip S., and Henry R. Mahler. 1983. Genetics and biogenesis of cytochrome b. Methods in Enzymology 97:374-395.

Phibbs, Paul V., Jr., and Herbert H. Winkler. 1981. Regulatory properties of partially purified enzymes of the tricarboxylic acid cycle of Rickettsia prowazekii. In: Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. New York. pp.421-430.

Phibbs, Paul V. and Herbert H. Winkler. 1982. Regulatory properties of citrate synthase from Rickettsia prowazekii. J. Bacteriol. 149:718-725.

Philip, R.N., E.A. Casper, R.L. Anacker, J. Cory, S.F. Hayes, W. Burgdorfer and C.E. Yunker. 1983. Rickettsia bellii sp. nov: a tick-borne rickettsia, widely distributed in the United States, that is distinct from the Spotted Fever and Typhus biogroups. Internat. J. System. Bacteriol. 33:94-106.

Philip, Robert N., Elizabeth A. Casper, Willy Burgdorfer, Robert K. Gerloff, Lyndahl E. Hughes, and E. John Bell. 1978. Serologic typing of Rickettsiae of the Spotted Fever group by microimmunofluorescence. J. Immunol. 121:1961-1968.

Pikielny, Claudio W., John Teem and Michael Rosbash. 1983. Evidence for the biochemical role of an internal sequence in yeast nuclear mRNA introns: Implications for U1 RNA and metazoan mRNA splicing. Cell 34:395-403.

Ponticelli, Alfred S., Dennis W. Schultz, Andrew F. Taylor, and Gerald R. Smith. 1985. Chi-dependent DNA strand cleavage by RecBC enzyme. Cell. 41:145-151.

Plotz, H., B. Bennett and K. Wertman. 1944. Cross-reacting typhus antibodies in Rocky Mountain Spotted Fever. Proc. Soc. Exp. Biol. Med. 57:336-339.

Plotz, H., J.E. Smadel, B.L. Bennett, R.L. Reagan, and M.S. Snyder. 1946. North Queensland tick typhus: Studies on the aetiological agent and its relationship to other rickettsial diseases. Med. J. Austr. 2:263.

Regnery, L. Russell and Catherine L. Spruill. 1984. Extent of genetic heterogeneity among human isolates of Rickettsia prowazekii as determined by restriction endonuclease analysis of rickettsial DNA. In: Microbiology-1984. ed. L. Leive and D. Schlessinger. Am. Soc. Microbiol., Washington, D.C. pp.297-300.

Regnery, Russell, Theodore Tzianabos, Joseph J. Esposito, and Joseph E. McDade. 1983. Strain differentiation of epidemic typhus rickettsiae (Rickettsia prowazekii) by DNA restriction endonuclease analysis. Current Microbiology 8:355-358.

Rights, F.L., J.E. Smadel, and E.B. Jackson. 1948. Studies on scrub typhus (tsutsugamushi disease). III. Heterogeneity of strains of tsutsugamushi as demonstrated by cross-vaccination studies. J. Exp. Med. 87:339-351.

Rochaix, J.D., M. Rahire, and F. Michel. 1985. The chloroplast ribosomal intron of Chlamydomonas reinhardii codes for a polypeptide related to mitochondrial maturases. Nucl. Acids Res. 13:975-984.

Robertson, R.G. and C.L. Wisseman, Jr. 1973. Tick-borne rickettsiae of the Spotted Fever Group in West Pakistan and Thailand: Evidence for two new species. Am. J. Epidemiol. 97:55.

Rodel, Gerhard, Jurgen Holl, Carlo Schmelzer, Cornelia Schmidt, Rudolf J. Schweyen, Brigitte Weiss-Brummer, and Fritz Kaudewitz. 1983. Cob intron 1 and 4: Studies on mutants and revertants uncover functional intron domains and test the validity of predicted RNA-secondary structures. In: Mitochondria 1983. Walter de Gruyter & Co., N.Y., N.Y. pp.191-201.

Ruskin, Barbara, Adrian R. Krainer, Tom Maniatis and Michael R. Green. 1984. Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro. Cell 38:317-331.

Sagan, L. 1967. On the origin of mitosing cells. J. Theoret. Biol. 14:225.

de la Salle, Henri, Claude Jacq, and Piotr P. Slonimski. 1982. Critical sequences within mitochondrial introns: Pleiotropic mRNA maturase and cis-dominant signals of the box intron controlling reductase and oxidase. Cell 28:721-732.

Sanders, J.P.M., C. Heyting, M. Ph. Verbeet, F.C.P.W. Meijlink, and P. Borst. 1977. The organization of genes in yeast mitochondrial DNA. III. Comparison of the physical maps of the mitochondrial DNAs from three wild-type Saccharomyces strains. Mol. Gen. Genet. 157:239-261.

Sanger, F., P. Schreier, A. Smith, R. Staden and I. Young. 1981. Nature 290:457-464.

Scazzocchio, C., T.A. Brown, R.B. Waring, J.A. Ray, and R. Wayne Davies. 1983. Organization of the Aspergillus nidulans mitochondrial genome. In: Mitochondria 1983. ed. Schweyen, Wolf, and Kaudewitz. Walter de Gruyter & Co. N.Y., N.Y. pp 303-312.

Schmidt, Bernd, Bernd Hennig, Richard Zimmermann, and Walter Neupert. 1983. Biosynthetic pathway of mitochondrial ATPase subunit 9 in Neurospora crassa. J. Cell Biol. 96:248-255.

Schmidt, Francis J. 1985. RNA splicing in prokaryotes: Bacteriophage T4 leads the way. Cell 41:339-340.

Schmidt, Gerald D. and Larry S. Roberts. 1981. Foundations of Parasitology. C.V. Mosby Company. St. Louis.

Sharp, Phillip A. 1985. On the origin of RNA splicing and introns. Cell 42:397-400.

Sherman, F. and P. Slonimski. 1964. Respiration-deficient mutants of yeast. II. Biochemistry. Biochim. Biophys. Acta 90:1.

Shumard, D.S., L. Grossman and M.E.S. Hudspeth. 1986. Achlya mitochondrial DNA: Gene localization and analysis of inverted repeats. Mol. Gen. Genet. (in press).

Slonimski, P. and B. Ephrussi. 1949. Action de l'acriflavine sur les levures. V. Le systeme des cytochromes des mutants "petite colonie". Ann. Inst. Pasteur (Paris) 77:47-63.

Slonimski, P.P., C. Parjot, M. Jacq, G. Foucher, A. Perrodin, A. Kochko and A. Lamouroux. 1978. Mosaic organization and expression of the mitochondrial DNA region controlling cytochrome c reductase and oxidase. I. Genetic, physical and complementation maps of the box region, in: Biochemistry and Genetics of Yeast, ed. M. Bacila. Academic Press. N.Y. pp.339.

Smith, J.R. and Rubenstein. 1973. The development of senescence in Podospora anserina. J. Gen. Microbiol. 76:297-304.

Sikorski, S., William H. Atkinson, Duncan C. Krause, and Herbert H. Winkler. 1984. Cloning Rickettsia prowazekii genes in Escherichia coli. In: Microbiology-1984. ed. L. Leive and D. Schlessinger. Am. Soc. Microbiol., Washington, D.C. pp.301-304.

Silverman, D.J. and S.B. Bond. 1984. Infection of human vascular endothelial cells by R. rickettsii. J. Infect. Dis. 149:201-206.

Silverman, D.J. and C.L. Wisseman, Jr. 1978. Comparative ultrastructural study on the cell envelopes of Rickettsia prowazekii, Rickettsia rickettsii, and Rickettsia tsutsugamushi. Infection and Immunity. 21:1020-1023.

Simon, Michel, and Gerard Faye. 1984. Steps in processing of the mitochondrial cytochrome oxidase subunit I pre-mRNA affected by a nuclear mutation in yeast. Proc. Nat. Acad. Sci. USA. 81:8-12.

Smith, Deborah K. and Herbert H. Winkler. 1979. Separation of inner and outer membranes of Rickettsia prowazekii and characterization of their polypeptide compositions. J. Bacteriol. 137:963-971.

Sonenshine, D.E., F.M. Bozeman, M.S. Williams, S.A. Masiello, D.P. Chadwick, N.I. Stocks, D.M. Lauer and B.L. Elisberg. 1978. Am. J. Trop. Med. Hyg. 27:339-349.

Sor, F. and H. Fukuhara. 1982. Nucleotide sequence of the small ribosomal RNA gene from the mitochondria of Saccharomyces cerevisiae. In: Mitochondrial Genes, ed. P. Slonimski, P. Borst, and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.255-262.

Sprouse, H.M., M. Kashdan, L. Otis and B. Dudock. 1981. Nucleotide sequence of a spinach chloroplast valine tRNA. Nucl. Acids Res. 9:2543-2547.

Stahl, U., P.A. Lemke, P. Tudzynski, U. Kuck and K. Esser. 1978. Evidence for plasmid-like DNA in a filamentous fungus, the ascomycetes Podospora anserina. Mol. Gen. Genet. 162:341.

Stern, David B. and David M. Lonsdale. 1982. Mitochondrial and chloroplast genomes of maize have a 12-kilobase DNA sequence in common. Nature 299:698-702.

Stern, David B., Jeffrey Palmer. 1984. Extensive and widespread homologies between mitochondrial DNA and chloroplast DNA in plants. Proc. Nat. Aoad. Sci. USA. 81:1946-1950.

Steinmetz, Andre, Earl J. Gubbins, and Lawrence Bogorad. 1982. The anticodon of the maize chloroplast gene for tRNA-UAA-Lue is by a large Intron. Nucl. Acids Res. 10:3027-3037. 1 Strausberg, R.L., R.D. Vincent, P.S. Perlman and R.A. Butow. 1978. Asymetric gene conversion at inserted segments on yeast mitochondrial DNA. Nature 276:577-583.

Sugita, Mamoru, Kazuo Shinozaki, and Masahiro Sugiura. 1985. Tobacco chloroplast tRNA-UUU-Lys gene contains a 2.5-kilobase pair intron: An open reading frame and a conserved boundary sequence in the intron. Proc. Nat. Acad. Sci. USA. 82:3557-3561.

Sullivan, Francis X., and Thomas R. Cech. 1985. Reversibility of cyclization of the Tetrahymena rRNA intervening sequence: Implication for the mechanism of splice site choice. Cell 42:639-648.

Szostak, Jack W., Terry L. Orr-Weaver, Rodney J. Rothstein, and Franklin W. Stahl. 1983. The double-strand-break repair model for recombination. Cell 33:25-35.

Tabak, H.F. and L.A. Grivell. 1986. RNA catalysis in the excision of yeast mitochondrial introns. Trends in Genetics. Feb 1986.

Tabak, Henk F., Gerda Van der Horst, Klaas A. Osinga, and Annika C. Arnberg. 1984. Splicing of large ribosomal precursor RNA and processing of intron RNA in yeast mitochondria. Cell 39:623-629.

Takaiwa, Fumio, and Masahiro Sugiura. 1982. Nucleotide sequence of the 16S-23S spacer region in an rRNA gene cluster from tobacco chloroplast DNA. Nucl. Acids Res. 10:2665-2674.

Tamura, Akira, Norio Ohashi, Hiroshi Urakami, Kumiko Takahashi, and Miho Oyanagi. 1985. Analysis of polypeptide composition and antigenic components of Rickettsia tsutsugamushi by polyacrylamide gel electrophoresis and immunoblotting. Infection and Immunity 48:671-675.

Tamura, A., H. Urakami and T. Tsuruhara. 1982. Purification of Rickettsia tsutsugamushi by Percoll density gradient centrifugation. Microbiol. Immunol. 26:321-328.

Tatchell, Kelly, Kim A. Nasmyth, and Benjamin D. Hall. 1981. In vitro mutation analysis of the mating-type locus in yeast. Cell 27:25-35.

Todd, W.J., W. Burgdorfer, and A.J. Mavros. 1982. Establishment of cell cultures persistently infected with Spotted Fever Group rickettsia. Can. J. Microbiol. 28:1412-1416.

Turner, Geoffrey, Gamal Imam, and Hans Kuntzel. 1979. Mitochondrial ATPase complex of Aspergillus nidulans and the dicyclohexylcarbodiimide-binding protein. Eur. J. Biochem. 97:565-571.

Treisman, R., S.H. Orkin and T. Maniatis. 1983. Specific transcription and RNA splicing defects in five cloned beta-thalassemia genes. Nature 302:591-596.

Trinkl, Helga, B. Franz Lang, and Klaus Wolf. 1985. The mitochondrial genome of the fission yeast Schizosaccharomyces pombe 7. Continuous gene for apocytochrome b in strain EF1(CBS 356) and sequence variation in the region of intron insertion in strain ade7-50h-. Mol. Gen. Genet. 198:360-363.

Tyeryar, Jr., F.J., E. Weiss, D.B. Millar, F.M. Bozeman and R.A. Ormsbee. 1973. DNA base composition of rickettsiae. Nature 180:415-417.

Tzagoloff, Alexander, Anna Akai, Richard B. Needleman, and George Zulch. 1975. Assembly of the mitochondrial membrane system: Cytoplasmic mutants of Saccharomyces cerevisiae with lesions in enzymes of the respiratory chain and in the mitochondrial ATPase. J. Biol. Chem. 250:8236-8275.

Van der Veen, R., A.C. Arnberg, G. Van der Horst, L. Bonen, H.F. Tabak, and L.A. Grivell. 1986. Excised group II introns in yeast mitochondria are lariats and can be formed by self-splicing in vitro. Cell 44:225-234.

Vierny, C., A.M. Keller, O. Begel and L. Belcour. 1982. A mitochondrial sequence is required for senescence in a fungus. Nature 297:157-159.

Vinson, J. and H.S. Fuller. 1961. Studies on trench fever. I. Propagation of rickettsia-like organisms from patient's blood. Pathol. Microbiol. 24(suppl):151-166.

Wallace, D.C., N.A. Oliver, H. Blanc and C.W. Adams. 1982. A system to study human mitochondrial genes: Application to chloramphenicol resistance. In: Mitochondrial Genes, ed. P. Slonimski, P. Borst and G. Attardi. Cold Spring Harbor Laboratory. N.Y. pp.105-116.

Wallace, John C. and Mary Edmonds. 1983. Polyadenylated nuclear RNA contains branches. Proc. Nat. Acad. Sci. USA 80:950-954.

Walker, D.H., H.N. Kirkman and P.H. Wittenberg. 1981. Genetic states possibly associated with enhanced severity of RMSF. In: Rickettsiae and Rickettsial Diseases, ed. W. Burgdorfer and R. Anacker. Academic Press. N.Y. pp.621-630.

Wang, S.-P. 1971. A micro-immunofluorescence method. Study of antibody response to TRIC organisms in mice. In: Trachoma and Related Disorders Caused by Chlamydial Agents, ed. R.L. Nichols. Excerpta Medica. Amsterdam. pp.273.

Waring, R.B., T.A. Brown, J.A. Ray, C. Scazzocchio, and R.W. Davies. 1984. Three variant introns of the same general class in the mitochondrial gene for cytochrome oxidase subunit 1 in Aspergillus nidulans. EMBO J. 3:2121-2128.

Waring, Richard B., R. Wayne Davies, Sidney Lee, Eutalia Grisi, Mary McPhail Berks, and Claudio Scazzocchio. 1981. The mosaic organization of the apocytochrome b gene of Aspergillus nidulans revealed by DNA sequencing. Cell 27:4-11.

Waring, Richard B., and Wayne Davies. 1984. Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing - a review. Gene 28:277-291.

Waring, Richard B., John A. Ray, Steven W. Edwards, Claudio Scazzocchio, and R. Wayne Davies. 1985. The Tetrahymena rRNA intron self-splices in E. coli: In vivo evidence for the importance of key base-paired regions of RNA for RNA enzyme function. Cell 40:371-380.

Waring, R.B., C. Scazzocchio, T.A. Brown, and R.W. Davies. 1983. Close relationship between certain nuclear and mitochondrial introns: Implication for the mechanism of RNA splicing. J. Mol. Biol. 167:595-605.

Waring, Richard B., Paul Towner, Stephen J. Minter, and R. Wayne Davies. 1986. Splice-site selection by a self-splicing RNA of Tetrahymena. Nature 321:133-139.

Weil, F. and A. Felix. 1916. Zur serologischen Diagnose des Fleckfiebers. Wien. Klin. Wochenschr. 29:33-35.

Weisburg, William G., Carl R. Woese, Michael E. Dobson and Emilio Weiss. 1985. A common origin of Rickettsiae and certain plant pathogens. Science 230:556-558.

Weiss-Brummer, Brigitte, Gerhard Rodel, Rudolf J. Schweyen, and Fritz Kaudewitz. 1982. Expression of the split gene cob in yeast: Evidence for a precursor of a "Maturase" protein translated from intron 4 and preceding exons. Cell 29:527-536.

Weiss, Emilio. 1981. Biochemistry and metabolism of rickettsiae: Current trends. In: Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. p. 387-400.

Weiss, E., J.C. Coolbaugh and J.C. Williams. 1975. Separation of viable Rickettsia typhi from yolk sac and L cell host components by Renografin density gradient centrifugation. Applied Microbiology 30:456-463.

Weiss, E., A.E. Green, R. Grays and L.M. Newman. 1973. Metabolism of Rickettsia tsutsugamushi and Rickettsia rickettsii in irradiated host cells. Infection and Immunity 8:4-7.

Weiss, Emilio. 1982. The biology of rickettsiae. Ann. Rev. Microbiol. 36:345-370.

Weiss, E., G. Dasch, D. Woodman and R. Williams. 1978. The vole agent is a strain of the trench fever agent, Rochalimaea quintana. Infection and Immunity 19:1013-1020.

Weiss, E. and G. Dasch. 1982. Int. J. Syst. Bacteriol. 32:1150.

Weiss, Emilio, Horace B. Rees Jr., and Jude R. Hayes. 1967. Metabolic activity of purified suspensions of Rickettsia rickettsii. Nature 211:1020-1022.

Weiss, E. and J.M. Moulder. 1984. In: Bergey's Manual of Systematic Bacteriology, ed. Krieg and Holt. Williams and Wilkins. Baltimore. pp.687.

Wieringa, B., F. Myers, J. Reiser and C. Weissmann. 1983. Unusual splice sites revealed by mutagenic inactivation of an authentic splice site of the rabbit beta-globin gene. Nature 301:38-43.

Wild, Martha A., and Reinhold Sommer. 1980. Sequence of a ribosomal RNA gene intron from Tetrahymena. Nature 283:693-694.

Williams, Jim C. and Emilio Weiss. 1978. Energy metabolism of Rickettsia typhi: Pools of adenine nucleotides and energy charge in the presence and absence of glutamine. J. Bacteriol. 134:884-892.

Winkler, Herbert H. 1976. Rickettsial permeability: An ADP-ATP transport system. J. Biol. Chem. 251:389-396.

Wisseman, Jr., C.L., E.A. Edlinger, A.D. Waddell, and M.R. Jones. 1976. Infection cycle of Rickettsia rickettsii in chicken embryo and L-929 cells in culture. Infection and Immunity 14:1052-1064.

Wood, David O., Robert S. Sikorski, William H. Atkinson, Duncan C. Krause, and Herbert H. Winkler. 1984. Cloning Rickettsia prowazekii genes in Escherichia coli K12. In: Microbiology-1984, ed. L. Leive and D. Schlessinger. Am. Soc. Microbiol., Washington, D.C. pp.301-304.

Wright, R.M. and D.J. Cummings. 1983. Current Genetics 7:457-464.

Yang, D., Y. Oyaizu, H. Oyaizu, G.J. Olsen, and C.R. Woese. 1985. Mitochondrial origins. Proc. Nat. Acad. Sci. USA. 82:4443-4447.

Yin, Samuel, Joyce Heckman, and Uttam L. RajBhandary. 1981. Highly conserved GC-rich palindromic DNA sequences flank tRNA genes in Neurospora crassa mitochondria. Cell 26:325-332.

Zahorchak, Robert J. and Herbert H. Winkler. 1981. Hydrolysis and synthesis of ATP by Rickettsia prowazekii. In: Rickettsiae and Rickettsial Diseases, ed. Willy Burgdorfer and Robert L. Anacker. Academic Press. New York. p. 401-410.

de Zamaroczy, Miklos, and Giorgio Bernardi. 1985. Sequence organization of the mitochondrial genome of yeast - a review. Gene 37:1-17.

Zassenhaus, H.P. and P.S. Perlman. 1982. Respiration deficient mutants in the A+T-rich region on the yeast mitochondrial DNA containing the var1 gene. Current Genetics. 6:179-188.

Zaug, Arthur J., and Thomas R. Cech. 1980. In vitro splicing of the ribosomal RNA precursor in nuclei of Tetrahymena. Cell 19:331-338.

Zaug, Arthur J., and Thomas R. Cech. 1986. The intervening sequence RNA of Tetrahymena is an enzyme. Science 231:470-475.

Zaug, Arthur J., Jeffrey R. Kent, and Thomas R. Cech. 1985. Reactions of the intervening sequence of the Tetrahymena ribosomal ribonucleic acid precursor: pH dependence of cyclization and site-specific hydrolysis. Biochemistry 24:6211-6218.

Zimmern, David. 1983. Homologous proteins encoded by yeast mitochondrial introns and by a group of RNA viruses from plants. J. Mol. Biol. 171:345-352.

Zinn, A.R. and R.A. Butow. 1985. Nonreciprocal exchange between alleles of the yeast mitochondrial 21S rRNA gene: Kinetics and the involvement of a double-strand break. Cell 40:887-895.

Zurawski, Gerard, Warwick Bottomley, and Paul R. Whitfeld. 1984. Junctions of the large single copy region and the inverted repeats in Spinacia oleracea and Nicotiana debneyi chloroplast DNA: sequence of the genes for tRNA-His and the ribosomal proteins S19 and L2. Nucl. Acids Res. 12:6547-6558.