
THE APICOMPLEXAN DNA:

AN EVOLUTIONARY AND MOLECULAR STUDY

Paul William Denny

Ph.D.

University College London,

University of London

1997

ProQuest Number: 10106922


Published by ProQuest LLC (2016). Copyright of the Dissertation is held by the Author.

All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC.

ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346

ABSTRACT

The discovery and characterisation in this laboratory of a 35 kilobase plastid DNA from the parasite, Plasmodium falciparum, has led to intense speculation concerning its origins and function. This thesis describes how sequence data were garnered from malaria's distant apicomplexan cousins, the coccidians and piroplasms, and used to make an evolutionary analysis of their corresponding plastid DNAs.

From the sequence data produced it appears highly likely that the plastid DNA was gained by an ancient progenitor of the Apicomplexa, and singularly reorganised upon the adoption of a parasitic lifestyle. Previously, the identification in this laboratory of a Plasmodium plastid encoded open reading frame with homologues in rhodoplast genomes led to the suggestion that the apicomplexan plastid has a red algal origin.

However, phylogenetic analysis of the tuf gene, which encodes the ubiquitous plastid and prokaryotic elongation factor Tu, provides support for the hypothesis that the organelle was derived through endosymbiosis of a green alga.

The highly conserved nature of the extrachromosomal DNA found across the range of parasites studied here suggests that the organelle within which the genome is housed performs a vital cellular role. However, the results presented provide no indication of its particular function. The sequence data generated also led to the speculation that certain plastid encoded products may provide targets for novel anti-apicomplexan chemotherapeutics.

Finally, it is inferred from the sequence of the plastid DNA of the coccidians, Toxoplasma gondii and Eimeria tenella, that opal stop codons are occasionally used as tryptophan codons. Such a system would be the first identified in a plastid organelle.

ACKNOWLEDGEMENTS

Grateful thanks to Dr. Iain Wilson for his thoughtful and supportive supervision through the practical and cerebral aspects of this thesis. Gratitude also to Dr. Don Williamson for his helpful insight into an array of matters.

Great appreciation must also be expressed towards Dr. Andrea Whyte for patient thesis advice; and Malcolm Strath, Dr. Peter Preiser, Dr. Kaveri Rangachari, Dr. Barbara Clough, Peter Moore, Daphne Moore, Dr. Anjana Roy, Kate Roberts, Irene Ling, Terry Scott-Finnigan, Anna Law and all the other 'parasitology people' who, over the years, have provided sound advice, ready smiles and even the occasional pint. Thanks also to Dr. Nick Goldman for evolutionary endeavours, and to the Photographics section of the National Institute for Medical Research for their expert assistance in the preparation of the figures.

All the work described was performed at the National Institute for Medical Research whilst in receipt of a Medical Research Council post-graduate studentship. A University College London travel grant is also gratefully acknowledged.

CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
CONTENTS
FIGURES AND TABLES
ABBREVIATIONS
AMINO ACID CODES
ACCESSION NUMBERS

CHAPTER 1: INTRODUCTION
1.1.0 The Apicomplexa
1.2.0 The Parasites Studied
1.2.1 The coccidians: Toxoplasma gondii and Eimeria tenella
1.2.2 The piroplasm: Theileria annulata
1.3.0 Apicomplexans have two Extrachromosomal DNAs
1.4.0 The Malarial DNA Circle: A Vestigial Plastid Genome
1.4.1 The RNA polymerase genes
1.4.2 The ribosomal protein genes
1.4.3 The tuf gene
1.4.4 The inverted repeat
1.5.0 Provenance of the Apicomplexan Plastid
1.5.1 A dinoflagellate origin?
1.5.2 A chromistan origin?
1.5.3 A 'green' origin?
1.5.4 The 'unitary' hypothesis
1.6.0 Further Indications of a 'Green' Origin for the Apicomplexan Plastid
1.7.0 The AT Content of the Apicomplexan Plastid DNA
1.8.0 Is the Parasite Plastid Functional?
1.9.0 Susceptibility of Apicomplexans to and Herbicides
1.10.0 Aims

CHAPTER 2: MATERIALS AND METHODS
2.1.0 Parasite Culture
2.2.0 Parasite Purification
2.3.0 DNA Extraction
2.3.1 Toxoplasma
2.3.2 Other parasites
2.4.0 Caesium Chloride/DAPI Gradients
2.5.0 Southern Blots
2.6.0 Slot Blots
2.7.0 Copy Number Determination
2.8.0 Synthesis of Oligonucleotides
2.9.0 Polymerase Chain Reaction
2.10.0 Cloning PCR Products
2.11.0 Sequencing
2.12.0 Sequence Analysis
2.13.0 RNA Extraction
2.14.0 Northern Blots
2.15.0 Reverse Transcription PCR
2.16.0 Production of Fusion Proteins
2.17.0 Polyclonal Antibody Production
2.18.0 Western Blotting

CHAPTER 3: ISOLATION OF THE PLASTID DNA FROM TOXOPLASMA
3.1.0 Introduction
3.2.0 Fractionation of Toxoplasma DNA
3.3.0 Identification of the Fractions
3.4.0 Copy Number
3.5.0 Discussion

CHAPTER 4: EVOLUTION OF THE APICOMPLEXAN PLASTID GENOME
4.1.0 Introduction
4.2.0 Cross Hybridisation
4.3.0 The Inverted Repeat and Downstream Region
4.4.0 The tuf Region
4.5.0 Phylogenetic Analysis of the tuf Gene
4.6.0 Discussion

CHAPTER 5: PROTEIN ENCODING GENES
5.1.0 Introduction
5.2.0 ORF470
5.3.0 The tuf Gene
5.4.0 ORF78
5.5.0 The Ribosomal Protein Genes
5.6.0 Protein Analysis
5.7.0 Discussion

CHAPTER 6: RIBOSOMAL RNA GENES
6.1.0 Introduction
6.2.0 Conservation and Transcription of the Ribosomal RNA Genes
6.3.0 Susceptibility
6.4.0 Discussion

CHAPTER 7: CODON USAGE AND TRANSFER RNAs
7.1.0 Introduction
7.2.0 Plastid Encoded Transfer RNAs
7.3.0 Frequencies and Codon-Usage
7.4.0 Known Anti-Codon Frequency
7.5.0 Discussion

CHAPTER 8: CONCLUSION

CHAPTER 9: REFERENCES

APPENDIX: PUBLICATIONS

FIGURES AND TABLES

Figure 1.1: Apicomplexan Merozoite
Figure 1.2: Toxoplasma Life-Cycle
Figure 1.3: Complete Map of the Malarial Plastid Genome
Figure 1.4: Cruciforms
Table 2.1: Oligonucleotide Primers for Amplification of the P30 Nuclear Gene
Table 2.2: Oligonucleotide Primers for Amplification of Isolated Toxoplasma Plastid DNA
Table 2.3: Oligonucleotide Primers for Amplification of Eimeria Plastid DNA
Table 2.4: Oligonucleotide Primers for Amplification of Theileria Plastid DNA
Table 2.5: Oligonucleotide Primers for RT-PCR
Table 2.6: Oligonucleotide Primers for Amplification of Toxoplasma Plastid Genes for Fusion Protein Production
Figure 3.1: Caesium Chloride/DAPI Gradients of Toxoplasma DNA
Figure 3.2: Identification of Gradient Fractions
Figure 3.3: Determination of the Plastid Copy Number
Figure 4.1: Preliminary Identification of Malarial Plastid Gene Homologues
Figure 4.2: Conservation of Gene Order in Apicomplexan Plastid Genomes
Figure 4.3: Phylogenetic Tree Based on Plastid Sequence Data
Figure 5.1: Analysis of ORF470 Homologues
Figure 5.2: Alignment of EF-Tu Proteins
Figure 5.3: Analysis of ORF78 Homologues
Figure 5.4: Analysis of Predicted S7 Proteins
Figure 5.5: Alignment of Predicted S12 Proteins
Figure 5.6: Analysis of Putative L11 Proteins
Figure 5.7: RT-PCR Analyses in Toxoplasma
Figure 5.8: ORF470 Protein Analysis
Figure 6.1: SSU rRNA Sequence Alignment
Figure 6.2: LSU rRNA Sequence Alignment
Figure 6.3: Northern Analysis of rRNA
Figure 6.4: Susceptibility of Apicomplexan GTPase Centres to Thiostrepton
Figure 7.1: Toxoplasma Transfer RNAs
Figure 7.2: Eimeria Transfer RNAs
Table 7.1: Toxoplasma Plastid Codon Usage
Table 7.2: Eimeria Plastid Codon Usage
Table 7.3: Toxoplasma Plastid Anti-Codon Frequency
Table 7.4: Eimeria Plastid Anti-Codon Frequency

ABBREVIATIONS

A260      absorbance at 260 nm
aa        amino acid
AIDS      acquired immunodeficiency syndrome
bp        base pairs
DAPI      4',6-diamidino-2-phenylindole
DNA       deoxyribonucleic acid
DNase     deoxyribonuclease
dNTP      deoxyribonucleotide triphosphate
DTT       dithiothreitol
EDTA      ethylenediaminetetraacetic acid
EF-Tu     elongation factor Tu
Et        Eimeria tenella
EtBr      ethidium bromide
FCS       foetal calf serum
g         gram
GDP       guanosine diphosphate
GST       glutathione S-transferase
GTP       guanosine triphosphate
HEPES     N-2-hydroxyethylpiperazine-N'-ethanesulfonic acid
HRPL      horseradish peroxidase luciferase
HPLC      high performance liquid chromatography
I         inosine
Ig        immunoglobulin
IPTG      isopropyl β-D-thiogalactopyranoside
IR        inverted repeat
kb        kilobase
kD        kilodalton
l         litre
L-agar    luria agar
L-broth   luria broth
LSC       large single copy region
LSU       large subunit
M         molar
m         milli
mA        milliamp
µ         micro
MDCK      Madin-Darby canine kidney
mol       mole
MOPS      3-(N-morpholino) propanesulfonic acid
MRC       Medical Research Council
mRNA      messenger RNA
n         nano
NIMR      National Institute for Medical Research
nt        nucleotide
nucl.     nuclear
OD        optical density
ORF       open reading frame
p         pico
PAGE      polyacrylamide gel electrophoresis
PBS       phosphate buffered saline
PCR       polymerase chain reaction
Pf        Plasmodium falciparum
PK        proteinase K
plas.     plastid
RF        release factor
RNA       ribonucleic acid
RNase     ribonuclease
rpm       revolutions per minute
rRNA      ribosomal RNA
RT-PCR    reverse transcription PCR
SDS       sodium dodecyl sulphate
sp.       species
SSC       small single copy region
SSU       small subunit
Ta        Theileria annulata
TAE       tris-acetate-EDTA
TBE       tris-borate-EDTA
TdT       terminal deoxytransferase
TE        tris-EDTA
Tg        Toxoplasma gondii
Tris      tris (hydroxymethyl) aminomethane
tRNA      transfer RNA
UV        ultraviolet
W         Watt
w/v       weight/volume

AMINO ACID CODES

AMINO ACID       THREE LETTER CODE    SINGLE LETTER CODE
Lysine           Lys                  K
Arginine         Arg                  R
Histidine        His                  H
Aspartic acid    Asp                  D
Glutamic acid    Glu                  E
Glycine          Gly                  G
Asparagine       Asn                  N
Glutamine        Gln                  Q
Cysteine         Cys                  C
Serine           Ser                  S
Threonine        Thr                  T
Tyrosine         Tyr                  Y
Alanine          Ala                  A
Valine           Val                  V
Leucine          Leu                  L
Isoleucine       Ile                  I
Proline          Pro                  P
Phenylalanine    Phe                  F
Methionine       Met                  M
Tryptophan       Trp                  W

ACCESSION NUMBERS

The sequence data generated during the work presented in this thesis have been submitted to the EMBL database under the following accession numbers:

• Toxoplasma gondii: Y11430 and Y11431;

• Eimeria tenella: Y12332 and Y12333;

• Theileria annulata: Y11429.

CHAPTER 1: INTRODUCTION

1.1.0 The Apicomplexa

Anthony van Leeuwenhoek was probably the first person to see an apicomplexan protozoan, when in 1674 he observed the oocysts of Eimeria stiedai in the gall bladder of a rabbit (Dobell, 1958). It was another 150 years before it was fully described (Lindemann, 1865).

The class Sporozoa in the phylum Protozoa was established by Leuckart (1879). The sporozoans became the parent group of the Apicomplexa, described by Levine (1970) as having an apical complex consisting of polar rings, a conoid, micronemes, rhoptries, and subpellicular tubules (see figure 1.1). Since then the Apicomplexa has been re-classified as a phylum within the Protozoa (Cavalier-Smith, 1993).

All members of the phylum Apicomplexa are parasitic, and many are of medical and economic importance, e.g.:

• Plasmodium - the malaria parasite;

• Toxoplasma and - opportunistic parasites observed in immunocompromised patients, such as those with AIDS;

• Eimeria - the causative agent of coccidiosis in poultry;

• Babesia and Theileria - parasites of cattle in tropical and subtropical regions.

Phylogenetic analyses of small subunit ribosomal RNA (SSU rRNA) genes support the monophyly of the apicomplexans (Barta et al., 1991), and suggest that they share a common ancestor with dinoflagellates and ciliates (Johnson et al., 1988; Gajadhar et al., 1991; Barta et al., 1991; Wolters et al., 1991; Gagnon et al., 1993). The recent report of structures resembling apicomplexan micropores in a parasitic dinoflagellate ( sp.) reinforces this hypothesis (Appleton and Vickerman, 1996). The apicomplexans, dinoflagellates, and ciliates are known collectively as the alveolates, a name which refers to the subplasmalemmal membranous sacs forming a tri-layered membrane which is visible by electron microscopy (Vivier and Desportes, 1990). The origin of the Apicomplexa (and other alveolates) may have been more than a billion years ago, perhaps before the emergence of the three multicellular kingdoms of plants, animals, and fungi (Escalante and Ayala, 1995), making this large and diverse group of parasitic protozoa (339 genera and 4516 species named in 1987 (Levine, 1988)) very ancient.

1.2.0 The Parasites Studied

This thesis is concerned with the extrachromosomal DNA of apicomplexan parasites. The primary objective of the work was to compare the malarial extrachromosomal DNA, which has been well studied (reviewed Feagin, 1994; Wilson and Williamson, 1997), with those of other important apicomplexans in order to gain an understanding of the molecules' function and evolutionary origin. The organisms selected for these comparisons were the coccidians Toxoplasma gondii and Eimeria tenella, and the piroplasm Theileria annulata.

1.2.1 The coccidians: Toxoplasma gondii and Eimeria tenella

The main thrust of this work concerns T. gondii (lifecycle, figure 1.2), an important human pathogen which can infect almost any warm-blooded vertebrate, and is easily grown in laboratory culture. It is estimated that up to 50% of the world's human population is infected (Dubey, 1977, Krahenbuhl and Remington, 1982), either by consumption of undercooked meat contaminated with parasite cysts, or by close association with cats in whose intestine the sexual phase of the parasite's life-cycle occurs (Dubey, 1977). In humans, T. gondii typically causes mild, flu-like symptoms before progressing to a chronic infection without serious clinical complications (Krahenbuhl and Remington, 1982). However, in individuals with lowered immunity the parasite can cause severe disease. Toxoplasmosis is also a significant cause of congenital defects in humans, and leads to major economic losses due to spontaneous abortion in domestic animals (Dubey, 1977, Desmonts and Couvreur, 1976).

Recently, much attention has been focused on T. gondii as an opportunistic pathogen of patients with AIDS. Chronic subclinical infections can reactivate in these immunocompromised individuals, and unfettered parasite growth may lead to severe encephalitis (Luft and Remington, 1988, Mills, 1986). Patients who are immunocompromised by virtue of underlying neoplastic disease may be similarly affected (Israelski and Remington, 1993).

Coccidiosis caused by E. tenella in poultry is of major international economic importance. Unlike Toxoplasma this species of Eimeria is homoxenous, i.e. it has a single host, namely the chicken. The parasite infects the caeca of chickens, leading to haemorrhage and a high level of mortality amongst immature birds. A review of recent research on avian Eimeria is given by Shirley (1992).

1.2.2 The piroplasm: Theileria annulata

It has been estimated that Th. annulata and Th. parva together affect over 200 million cattle. Th. annulata is spread throughout Asia and northern Africa, whereas Th. parva is confined to southern Africa. The parasites infect blood cells of domestic cattle leading to a malaria-like disease and consequent economic losses. Theileria is transmitted by tick vectors, in whose gut the sexual phases of the life cycle occur (reviewed Levine, 1987).

1.3.0 Apicomplexans have two Extrachromosomal DNAs

In eukaryotic organisms extrachromosomal DNAs play a crucial role in two vital organelles, the mitochondrion, found in most eukaryotes, and the plastid, normally found in plants and algae. Several mitochondrial and plastid genomes have been sequenced, and analyses of these data have suggested a prokaryotic origin for their organelles, presumably following endosymbiosis; the mitochondrion originating from the alpha subdivision of purple bacteria, and the plastid from cyanobacteria (Gray, 1988).

Organellar DNAs continue to be of general medical and scientific interest, as is demonstrated by recent studies on the effects of on mitochondrial metabolism in man (Wallace, 1994), and the 'great debate' amongst botanists on the monophyletic origin of the plastid (Martin et al., 1992). However, studies on extrachromosomal DNA in important 'sporozoan' parasites of man and his domestic animals (with which this thesis is concerned) are relatively recent, and have been undertaken by very few groups. This recent interest stems largely from the surprising finding that these organisms have two forms of extrachromosomal DNA, as do plants and algae (Wilson et al., 1993).

Circular extrachromosomal DNA molecules were first identified by electron microscopy in extracts of the avian and murine malarial parasites, P. lophurae (Kilejian, 1975) and P. berghei (Dore et al., 1983). Subsequently, AT-rich DNA circles of a similar size (approximately 35 kilobases (kb)) were isolated from the primate and human malarial parasites, P. knowlesi (Williamson et al., 1985) and P. falciparum (Gardner et al., 1988), utilising density gradient centrifugation. In all cases these circular DNAs were assumed to be mitochondrial, the only extrachromosomal genome then expected in a non-photosynthetic protozoan parasite. However, sequence analysis of the isolated P. falciparum circle (Gardner et al., 1991a and b), together with reports indicating the presence of mitochondrial genes on a tandemly repeated, 6 kb DNA (Vaidya et al., 1989, Aldritt et al., 1989), suggested that an alternative origin for the circle might have to be considered.
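The term 'AT-rich' used above refers simply to the proportion of adenine and thymine bases in a DNA sequence. As a minimal illustration only, the Python sketch below computes that proportion for a plain string of bases. The function name `at_content` and the short test sequences are hypothetical and are not data from the work described in this thesis.

```python
def at_content(sequence):
    """Return the fraction of A and T bases among the unambiguous bases.

    Characters other than A, C, G and T (for example N or gaps) are ignored.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return at_count / len(bases)


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(at_content("ATATGC"))    # four of six bases are A or T
    print(at_content("aattNNgc"))  # lower case accepted, N ignored
```

Ambiguous characters are skipped rather than counted, an assumption made here so that the denominator reflects only bases that were actually called.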

1.4.0 The Malarial DNA Circle: A Vestigial Plastid Genome

Comparative analyses of rRNA genes from the putative mitochondrial 6 kb element and the 35 kb circle of P. falciparum showed that the two extrachromosomal DNAs are not closely related (Feagin et al., 1992). The sequence and secondary structure of the large subunit rRNA (LSU rRNA) from the 35 kb circle of P. falciparum were used in an evolutionary study (Gardner et al., 1993). This indicated that although the malarial sequence was highly divergent in comparison with several plastid LSU rRNAs, it appeared more related to them than to mitochondrial LSU rRNAs. Other studies considered in the following sections add credence to the suggestion that the malarial circular genome is of a plastid origin.

1.4.1 The RNA polymerase genes

A cloned fragment from the P. falciparum circle was found to contain two contiguous open reading frames (ORFs) (rpoB and rpoC), encoding portions of the β and β' subunits of an RNA polymerase similar to prokaryotic and chloroplast proteins (Gardner et al., 1991a). Little homology was found to nuclear or nuclear-encoded mitochondrial RNA polymerases. Phylogenetic analyses of the rpoB gene (Gardner et al., 1994a) and part of the rpoC gene (Howe, 1992) were consistent with a plastid origin. The complete sequence of the rpoC gene showed that it is split into rpoC1 and rpoC2 as in some other plastid and cyanobacterial genomes, but rpoC1 lacks the intron typical of higher plants (Wilson et al., 1996) (see figure 1.3). These findings suggest that the 35 kb molecule is of a plastid origin, but is not homologous to the organellar genomes of higher plants.

1.4.2 The ribosomal protein genes

A string of 15 putative ribosomal protein genes forms a prominent feature of the P. falciparum circle, encompassing a region of approximately 7 kb (rpl4 to rpsl - see figure 1.3). The genes in this cluster, as in other plastid genomes, are in an arrangement resembling a fusion of the E. coli S10, spc, alpha and str operons (Wilson et al., 1996). Moreover, the fact that the malarial circle carries rps5, rpsll and rpl4 genes is reminiscent of algal rather than higher plant plastid genomes (Harris et al., 1994).

1.4.3 The tuf gene

A tuf gene, predicted to encode the ubiquitous prokaryotic and plastid elongation factor Tu (EF-Tu), is located downstream of rpsll and rpsl in the truncated str operon (see figure 1.3). Its presence on the malarial circle has been taken as indicative of an algal origin (Wilson et al., 1996) since tuf is nuclear encoded in higher plants, but is characteristically maintained on most algal plastid genomes (Baldauf et al., 1990; Baldauf and Palmer, 1990). Also, in common with other plastid EF-Tu proteins, the predicted amino acid sequence from P. falciparum contains an insertion of unknown function.

1.4.4 The inverted repeat

Prior to the isolation and identification of the Plasmodium plastid genome, Borst et al. (1984) utilised caesium chloride/ethidium bromide (CsCl/EtBr) gradients to isolate a circular DNA of supposed mitochondrial provenance from T. gondii. The molecule's size, and propensity to form a 'cruciform' structure indicative of an inverted repeat (IR), is reminiscent of the plastid genome of Plasmodium sp. (Dore et al., 1983; Williamson et al., 1985; Wilson and Williamson, 1997). The P. falciparum IR encodes a novel arrangement of duplicated SSU and LSU rRNA genes interspersed with nine transfer RNA (tRNA) genes, occupying both strands on this sector of the circle (Gardner et al., 1994b) (see figure 1.3). The 'snap-back' pattern observed following denaturation and rapid renaturation experiments using T. gondii and E. tenella DNA is consistent with a similar IR being present (Wilson et al., 1993); in 'snap-back' experiments a 'cruciform' structure (figure 1.4) forms when complementary strands of endonuclease restricted, denatured DNA reanneal after renaturation. This result suggested that an extrachromosomal DNA, homologous to that of P. falciparum, is conserved across a range of apicomplexans. The presence of a homologous SSU rRNA gene in Babesia bovis has also been reported (Gozar and Bagnara, 1993 and 1995).
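The 'snap-back' behaviour described above follows from the sequence property that defines an inverted repeat: one arm is the reverse complement of the other, so a single denatured strand can base-pair with itself. The Python sketch below illustrates that property only. The function names and the toy sequence are hypothetical, and no attempt is made to model the experiments cited.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def can_snap_back(strand, arm_length):
    """True if the first and last `arm_length` bases of a single strand
    are reverse complements, i.e. the two ends could pair with each other."""
    if arm_length <= 0 or 2 * arm_length > len(strand):
        raise ValueError("arm_length must fit twice within the strand")
    left_arm = strand[:arm_length].upper()
    right_arm = strand[-arm_length:]
    return left_arm == reverse_complement(right_arm)


if __name__ == "__main__":
    # Hypothetical toy strand: arm + unpaired spacer + inverted copy of arm.
    arm = "ATGGCTCA"
    spacer = "TTTTCCCC"
    strand = arm + spacer + reverse_complement(arm)
    print(can_snap_back(strand, len(arm)))       # the two arms can pair
    print(can_snap_back(arm + spacer + arm, 8))  # a direct repeat cannot
```

A direct repeat fails the check because its two copies read identically on the same strand and so cannot pair with one another, whereas the arms of an inverted repeat can.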

IRs encoding copies of rRNA genes are prominent in many plastid, but not mitochondrial, genomes (Palmer, 1985). The presence of such a feature on the malarial 35 kb circle was a factor supporting identification of the extrachromosomal DNA as a residual plastid genome (Wilson et al., 1991; Palmer, 1992). However, the arrangement of the LSU and SSU rRNA genes within the P. falciparum IR differs from that in other known plastids (Palmer, 1985), the LSU rRNA genes being distal rather than proximal to the small single copy (SSC) region (see figure 1.3). In addition, transcription of the LSU rRNA genes is away from this region (see figure 1.3), rather than towards it as in other plastid genomes. LSU rRNA genes are often fragmented, with short regions at the 5' and 3' ends encoding 5.8S and 4.5S rRNAs respectively (Gutell and Fox, 1988). Plastid genomes do not encode 5.8S rRNA genes, whereas 4.5S rRNA genes are common (Gutell and Fox, 1988). However, cyanobacterial and algal plastid genomes, unlike those of higher plants and some ferns, do not possess a 4.5S rRNA gene either. Instead, a sequence with a similar predicted secondary structure is fused to the 3' end of the LSU rRNA (Bowman et al., 1979; Kossel et al., 1991; Harris et al., 1994). Similarly, the LSU rRNA plastid gene of P. falciparum is unfragmented (Gardner et al., 1993). It might be argued that the contiguous nature of the malarial gene is another indication of an algal origin for the plastid DNA molecule.

5S RNA is a small rRNA species (~120 nt long) that is ubiquitous in eubacterial, archaeal, plastid and eukaryotic cytosolic ribosomes (Erdmann, 1976; Moore, 1985). The apparent lack of a gene for this rRNA species in the P. falciparum plastid genome (Wilson et al., 1996) is reminiscent of non-angiosperm mitochondria (Curgy, 1985). Yoshionari et al. (1994) have demonstrated the presence of nuclear encoded 5S rRNA in the mitochondrial ribosomes of higher animals, raising the possibility that this rRNA species can be imported from the cytosol. Whether the malarial plastid ribosomes function with or without 5S rRNA is unknown, but the lack of this gene on the malarial plastid genome can be regarded as unusual and mitochondria-like.

1.5.0 Provenance of the Apicomplexan Plastid

The P. falciparum 35 kb circle is the smallest residual plastid genome yet identified, and appears to be ubiquitous in all apicomplexans so far examined. It is approximately half the size of the two best known vestigial plastid DNAs, those of Astasia longa, a non-photosynthetic euglenoid (Gockel et al., 1994), and Epifagus virginiana (or Beechdrops), a parasitic, non-photosynthetic plant (Wolfe et al., 1992a). In all three residual genomes virtually all genes code for expression function. However, the DNAs differ in detailed gene content.

Until very recently the location of the apicomplexan plastid DNA was open to speculation. However, several results provided indirect evidence that it was situated in an organelle:

• In P. falciparum the 35 kb circle is transmitted uniparentally by the macrogamete (female) in sexual reproduction, as is the mitochondrial genome (Vaidya et al., 1993; Creasey et al., 1994);

• the plastid DNA does not co-fractionate with the mitochondrion, and was therefore presumed to reside in another intracellular compartment (Wilson et al., 1992).

The identity of this organelle remained mysterious. However, it was proposed that the so-called spherical body in Plasmodium sp., the Golgi-adjunct or Hohlzylinder in T. gondii, and dickwandiger vesikel in , all multimembranous organelles of unknown function, might carry the plastid DNA (Kilejian, 1991; Wilson et al., 1991; Siddall, 1992). Recently, high resolution in situ hybridisation identified the plastid organelles from P. falciparum and T. gondii as the multimembranous bodies described above (McFadden et al., 1996; Kohler et al., 1997). This organelle has been re-named the 'apicoplastid' (Kohler et al., 1997), a reference to its presence within the members of the Apicomplexa rather than to any cellular location.

That the plastid genome resides within a multimembranous organelle implies that Wilson et al. (1994) were correct in their postulation that the Apicomplexa obtained the 35 kb circle via secondary endosymbiosis, i.e. the membranes are derived from both the plastid and its primary eukaryotic host. It was suggested that the genome was subsequently reduced to its current condensed state upon the adoption of parasitism (Wilson et al., 1994). Several hypotheses pertaining to the provenance of the apicomplexan plastid are discussed below.

1.5.1 A dinoflagellate origin?

Structural similarities, and phylogenetic analysis of nuclear SSU rRNA genes (Johnson et al., 1988; Gajadhar et al., 1991; Barta et al., 1991; Wolters et al., 1991; Gagnon et al., 1993; Appleton and Vickerman, 1996) suggest that marine protozoans, the dinoflagellates, form the closest clade to the apicomplexans. This, coupled with the fact that many dinoflagellates are photosynthetic, has led to the proposal that the Dinoflagellata possess plastid DNAs which are likely to be placed close to those of the Apicomplexa in evolutionary analyses (Howe, 1992; Wilson et al., 1992; Palmer, 1992; Gardner et al., 1993 and 1994a). Unfortunately no dinoflagellate plastid sequence data are currently available to test this, the most evolutionarily stringent hypothesis.

It has been suggested that secondary endosymbiosis of either algal cells (Gibbs, 1978; Lefort-Tran et al., 1980) or cryptophytes (Wilcox and Wedemayer, 1984a and 1984b) accounts for the presence of plastids in photosynthetic dinoflagellates. Wilson et al. (1994) have proposed that a dinoflagellate, or related progenitor of the Apicomplexa, acquired the plastid now found in the parasitic phylum in a similar way.

1.5.2 A chromistan origin?

In contrast to the proposal that dinoflagellates obtained their plastids by endosymbiosis of cryptophytes (kingdom Chromista), Cavalier-Smith et al. (1994) used a phylogenetic analysis of nuclear SSU rRNA genes to suggest that the parvkingdom Alveolata (containing the three phyla Ciliophora, Dinoflagellata and Apicomplexa - Cavalier-Smith, 1993) evolved directly from an ancestral form of chromist. They hypothesised that the apicomplexans (and ciliates) have subsequently lost their photosynthetic capacity. This proposal is consistent with the recent observation that, like the chromistan organelle, the apicomplexan plastid is surrounded by four membranes (Kohler et al., 1997).

The cryptomonads (Cryptophyta) are considered to have obtained their plastids via secondary endosymbiosis of a red alga (reviewed by McFadden and Gilson, 1995), suggesting that dinoflagellate and chromistan plastids may be ancestrally rhodophytic.

In 1994 Williamson et al. noted that the malarial plastid ORF470 (see figure 1.3), encoding a putative peptide of unknown function, demonstrated a high level of conservation with a predicted protein from the rhodoplast genomes of Antithamnion sp., Cyanidium caldarium and Porphyra purpurea. These results led to the suggestion that the apicomplexan plastid was of red algal provenance (Williamson et al., 1994), tentatively supporting a dinoflagellate or chromistan origin for the parasite organelle.

The same conserved ORF has since been found in the plastid genome of the chromophytic diatom Odontella sinensis (Kowallik et al., 1995), in the prokaryotic genomes of Mycobacterium leprae (Pietrokovski et al., 1994), E. coli (Aiba H., Baba T., Fujita K., Hayashi K., Honjo A., Horiuchi T., Ikemota K., Inada T., Isono K., Itoh T., Kanai K., Kasai H., Kashimoto K., Kim S., Kimura S., Kitagawa M., Kitakawa M., Makino K., Masuda S., Miki T., Mizobuchi K., Mori H., Motomura K., Nakamura Y., Mashimoto H., Nishio Y., Oshima T., Saito H., Sampei G., Seki Y., Tagami H., Takemoto K., Wada C., Yamamoto Y. and Yano M., unpublished data), and the cyanobacterium Synechocystis PCC6803 (Kaneko et al., 1996), as well as in the cyanelle DNA of the glaucocystophyte, Cyanophora paradoxa (Stirewalt V.L., Michalowski C.B., Löffelhardt W., Bohnert H.J. and Bryant D.A., unpublished data).

An expressed sequence tag from a cDNA library of the green plant, Arabidopsis thaliana (Desprez T., Amselem J., Chiapette H., Caboche M. and Hoffe H., unpublished data), indicates that a similar gene is also maintained in higher photosynthetic organisms. In phylogenetic analyses, the malarial gene grouped with the rhodoplast sequences (Jefferies and Johnson, 1996), supporting the initial hypothesis of Williamson et al. (1994). However, evolutionary studies of other plastid genes do not confirm this (see below), and given the limited number of known ORF470 gene homologues this result should be viewed tentatively.

Chromistan plastids are typically located in the lumen of the rough endoplasmic reticulum (Cavalier-Smith, 1981). The complex series of events which led to this location were hypothesised to have occurred only once after endosymbiosis of a photosynthetic (Cavalier-Smith, 1982). However, phylogenetic studies based on the SSU rRNA genes of chromistan nucleomorphs (Cavalier-Smith et al., 1994) and plastids (Medlin et al., 1995), suggest that the organelle was acquired independently on several occasions. Similarly, dinoflagellates maintain a myriad of plastids and photosynthetic (see Scherf, 1993), indicating that many were obtained polyphyletically (Dodge, 1989). These observations imply that secondary endosymbiosis has occurred many times in evolution, and make an independent origin for the apicomplexan plastid appear increasingly likely.

1.5.3 A 'green' origin?

Recently, a variety of phylogenetic analyses based on the plastid encoded tuf gene consistently placed the apicomplexans within the 'green' plastid clade (Kohler et al., 1997). These results agree with similar, considerably less comprehensive studies, based on the small and large subunit rRNA, and the rpo plastid genes (Egea and Lang-Unnasch, 1995; Gardner et al., 1993 and 1994; Howe, 1992), and contrast with the suggestion that the apicomplexan plastid has a red algal origin (Williamson et al., 1994). Based on their phylogenetic data, Kohler et al. (1997) suggested that the four membranes they observed surrounding the T. gondii plastid envelope were those of a green algal symbiont and its plastid.

Although the evidence for a green algal origin for the apicomplexan vestigial plastid seems compelling, the Apicomplexa are distant from each other as well as from other organisms. In addition, bootstrap support for relationships among the plastid clades is weak, and there is no clear support for any one plastid group as the source of the apicomplexan plastid. This is discussed further in chapter 4.

1.5.4 The 'unitary' hypothesis

Whatever its origin, the wide distribution of plastid DNA molecules (Borst et al, 1984; Wilson et al, 1993 and 1996) and plastid organelles (or putative organelles) (Siddall, 1992; Dubremetz, 1995; McFadden et al, 1996; Kohler et al, 1997; Dr. R.J.M. Wilson, personal communication) in the Apicomplexa suggests that a plastid might be maintained throughout the phylum. As stated earlier, it has been postulated that the organelle was obtained by a single, ancient progenitor of the apicomplexans, the plastid DNA being drastically condensed to its current diminutive state upon the adoption of parasitism (Wilson et al, 1994). According to this 'unitary' hypothesis, all extant apicomplexans should maintain similarly organised plastid genomes. A later chapter in this thesis compares sequence data from other apicomplexans with the fully sequenced plastid genome of malaria (Wilson et al, 1996) to establish whether this hypothesis is correct. Alternatively, the plastid genome could have arrived at its current state and distribution within the phylum by multiple events:

• Differential reduction of an original element over an extended period of evolution during speciation of the Apicomplexa. This scenario predicts that while common features will be present, there may be lineage-specific differences in plastid gene content or organisation;

• reduction of different plastid DNAs in different apicomplexan progenitors, i.e. polyphyly. This hypothesis predicts that substantial differences may be present in the plastid genome of each apicomplexan lineage, although this may be countered to some extent by convergent evolution.

1.6.0 Further Indications of a 'Green' Origin for the Apicomplexan Plastid

Phylogenetic analyses based on nuclear SSU rRNA genes show no specific relationship between the alveolates and plants (see Cavalier-Smith, 1993). However, several workers have suggested that the apicomplexans possess plant-like genes besides those encoded by the plastid. In Plasmodium sp., the enolase gene (Hyde et al, 1994), the mitochondrial gene cytb (Ghelli et al, 1992), the protein gene H2A (Thatcher and Gorovsky, 1994), and the calmodulin gene (Robson et al, 1993) have all been held up as indicative of a plant connection. However, as discussed in detail by Wilson et al (1994), all these reports are highly questionable due to the low level analyses employed (enolase and cytb genes), the unstable evolutionary trees produced (histone gene), or the use of the possibly misleading maximum parsimony phylogenetic methodology (calmodulin gene).

The nuclear encoded β tubulin genes of T. gondii and the green alga Chlamydomonas demonstrated similarity at the amino acid level (Boothroyd et al, 1987), but such an unsophisticated analysis is inconclusive (Wilson et al, 1994). Similar studies showing similarity between the predicted protein sequence of a T. gondii gene, hsp30/bag1, and the conserved C-terminal region of plant small heat shock proteins (Bohne et al, 1995) might also be treated with scepticism. More recently, maximum parsimony and distance matrix phylogenetic analyses (Stokkermans et al, 1996) placed the β tubulin genes of the alveolates closer to plant and algal than to and fungal sequences. Analyses of the α tubulin and dihydrofolate reductase-thymidylate synthase (DHFR-TS) genes gave similar results (Stokkermans et al, 1996). This is in contrast to results obtained utilising SSU rRNA genes, the 'rosetta stone' of molecular phylogeny (see Cavalier-Smith, 1993). However, the high bootstrap value (93%) between the and plant clades, and the reproducibility of the results using various genes and methodologies, suggest that the findings of Stokkermans et al are valid. All phylogenetic analyses record the evolutionary histories of individual genes rather than whole organisms. Hence, the apparent genetic dichotomy in the nucleus of T. gondii may be explained by translocation of some genes from the nucleus of the proposed photosynthetic eukaryotic endosymbiont (Wilson et al, 1994; Kohler et al, 1997) to the nucleus of the apicomplexan progenitor, creating an evolutionary mosaic.
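Bootstrap values such as the 93% quoted above are obtained by resampling the columns of a sequence alignment with replacement, rebuilding the tree from each replicate, and counting how often a given clade recurs. The following is a minimal sketch of the resampling step only, assuming a toy alignment held as a dictionary of equal-length strings; the function name and the example sequences are hypothetical, and tree reconstruction is not shown.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    Columns are drawn with replacement, and the same column indices are
    applied to every sequence so that homologous positions stay together.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment, for illustration only.
toy = {"taxon_a": "ACGTAC", "taxon_b": "ACGTTC", "taxon_c": "TCGAAC"}
replicate = bootstrap_alignment(toy, random.Random(1))
print(replicate)
```

Bootstrap support for a clade is then the percentage of replicate trees in which that clade appears.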

It is of particular interest that molecules associated with photosynthesis have been observed in several apicomplexans. Hackstein et al (1995) used high performance liquid chromatography (HPLC) and fluorescence-emission spectrum analyses of cell extracts to deduce the presence of protochlorophyllide a and chlorophyll a bound to photosynthetic reaction centres in T. gondii. Further, a portion of the psbA gene, which encodes the D1 protein of photosystem II, was amplified by PCR from the coccidian Sarcocystis muris, and detected by Southern hybridisation in S. muris, T. gondii and P. falciparum (Hackstein et al 1995). Phylogenetic analysis of the S. muris psbA partial gene sequence placed it among the green algae (Hackstein et al 1995), in agreement with many of the analyses based on apicomplexan plastid genes discussed above. Given that Hackstein et al studied components exclusively linked to plastid organelles, their results appear worthy of consideration. The genes for these proteins are not encoded by the fully characterised plastid genome of P. falciparum (Wilson et al, 1996), but translocation of genes from plastids to nuclei and mitochondria has several precedents (Weeden, 1981; Stern and Lonsdale, 1982). However, the results presented by Hackstein et al (1995) are not without controversy. Firstly, one cannot rule out contamination by photosynthetic organisms, particularly given the high sensitivity of the techniques employed. Further, the PCR amplification of an apicomplexan psbA gene has not proven reproducible in T. gondii (Kohler et al, 1997) or P. falciparum (Dr. R.J.M. Wilson, personal communication). Given this, the findings of Hackstein et al should be regarded with circumspection.

1.7.0 The AT Content of the Apicomplexan Plastid DNA

The AT bias noted in the completely sequenced P. falciparum plastid genome is extreme (86.9%; Wilson et al, 1996) when compared with the mitochondrial (68%; Feagin et al, 1992) and nuclear genomes (82%; Weber, 1988). The codon usage of the plastid genome reflects this biased AT-composition, as do the corresponding tRNA anticodons (Preiser et al, 1995).
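The AT percentages quoted above are simple base-composition statistics. As an illustration, the sketch below computes AT content for an arbitrary nucleotide string; the function name and the example sequence are hypothetical, and ambiguous symbols such as N are ignored by assumption.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T (or U) among unambiguous bases."""
    seq = seq.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / total


# Hypothetical 20-base fragment, for illustration only.
fragment = "ATATTTAAGCATTTTAAATA"
print(f"AT content: {at_content(fragment):.1f}%")
```

Applied to a complete genome sequence, the same calculation would yield figures comparable to those cited for the plastid, mitochondrial and nuclear DNAs.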

Sequence analysis of an operon homologous to E. coli groE in the aphid bacterial endosymbiont Buchnera sp. showed directional base substitutions tending towards an increase in AT content (Ohtaka and Ishikawa, 1992). In view of the phylogenetically close relationship between Buchnera and E. coli this bias was ascribed to the difference in their environments, i.e. intracellular versus free-living. It appears that the genomic DNA of residing in eukaryotic cells is subjected to an AT-biased directional pressure (Sueoka, 1988), and/or an AT-favoured selection pressure (Bernardi et al, 1988) in the process of adaptation to the intracellular environment. Plastids (and mitochondria) are thought to be the result of ancient endosymbioses of prokaryotic organisms, and an AT-bias appears to be a common feature of organellar genomes in plants, animals and fungi (Aota et al, 1988). However, it is not clear what causes this pressure towards AT.

1.8.0 Is the Parasite Plastid Functional?

Whether the apicomplexan plastid genome is functional, and if so what its function is, remain unanswered questions. However, several strands of evidence from P. falciparum suggest a role for the extrachromosomal DNA:

• Reduction of a conventional plastid genome to resemble the condensed 35 kb circle would require a considerable number of deletions with a skewed distribution, i.e. many genes involved in gene expression have been retained whilst those required for photosynthesis have been selectively lost;

• maintenance of open reading frames despite extensive divergence of the genetic code;

• the plastid is transcriptionally active (Wilson et al, 1996), its genes being expressed differentially during the erythrocytic cycle (Feagin and Drew, 1995);

• the presence of plastid polysomes (Wilson et al, 1996; Dr. A. Roy, personal communication);

• the identification of a full complement of transcribed tRNAs (Preiser et al, 1995).

Additionally, putative ribosome structures have been identified in the T. gondii plastid organelle (McFadden et al, 1996).

Nearly all the genes on the malarial plastid DNA (figure 1.3) are thought to be involved with expression (Wilson et al, 1996), including those encoding three subunits of a eubacterial-like RNA polymerase, 17 ribosomal proteins, EF-Tu, duplicated SSU and LSU rRNAs, and 25 tRNAs (nine of which are duplicated). Of the remaining nine ORFs, the two largest encode a putative regulatory subunit similar to the Clp family of molecular chaperones, and a highly conserved ORF (ORF470) of unknown function found in some bacteria and algal plastid genomes. It has been suggested that this gene could be the raison d'être for maintenance of the plastid genome in the malaria parasite (Wilson, 1994; Williamson et al, 1994). However, there are no indications of its functionality and an equivalent gene has yet to be located in other vestigial plastids.

Current work in this laboratory is investigating a cyanobacterial homologue utilising molecular techniques unavailable for the apicomplexan plastid (A. Law, personal communication).

The most familiar function of plastid organelles is photosynthesis, but clearly this is highly unlikely in the parasitic apicomplexans. However, plastids are involved in other metabolic roles, such as amino acid and fatty acid biosynthesis, assimilation of nitrate and sulphate, and starch storage (Hrazdina and Jensen, 1992).

Similar questions have been raised concerning the functionality of the remnant non-photosynthetic plastid of the parasitic, flowering plant Epifagus virginiana (dePamphilis and Palmer, 1990). Evidence indicates that the early stages of porphyrin biosynthesis (leading to haem as well as chlorophyll) are restricted to plastids in both photosynthetic and non-photosynthetic plants (Smith, 1988; Shashidara et al, 1992). It has been suggested that the E. virginiana plastid genome is necessary for the biosynthesis of haem, which would still be required by an achlorophyllous plant for the mitochondria and other organelles (Howe and Smith, 1991). However, evidence in malaria indicates that this pathway is active in the mitochondrion, utilising glycine, rather than glutamate as found in plastids (Surolia and Padmanaban, 1992). An alternative (or additional) explanation for the retention of a plastid genome in E. virginiana was offered by Wallsgrove (1991), who pointed out that the biosynthetic pathways of many amino acids are partly or wholly located within plastids. Synthesis of most of the protein amino acids, and all metabolites derived from them, is dependent on intact and functional organelles. A similar situation could account for the maintenance of the apicomplexan plastid organelle. Many of the genes involved with the predicted metabolic functions of the Epifagus plastid are probably encoded by the nucleus of the organism. Likewise, such genes from the original apicomplexan plastid DNA have probably been translocated to the nucleus, as is the case with other endosymbiotic genomes (Palmer, 1985; Gray, 1989). The current malaria genome sequencing project (Dame et al, 1996) should lead to the elucidation of functional elements in the course of time.

Tomavo and Boothroyd (1995) have tentatively suggested that products of the mitochondrion and the plastid might interact in a pathway regulating differentiation of T. gondii between the tachyzoite and the bradyzoite forms (see life cycle figure 1.2). Mutants selected for resistance to the mitochondrial inhibitor, atovaquone, were observed to switch from tachyzoites to bradyzoites and also demonstrated hypersensitivity to clindamycin, an antibiotic suspected to act on the plastid (Pfefferkorn and Borotz, 1994).

1.9.0 Susceptibility of Apicomplexans to Antibiotics and Herbicides

The widespread emergence of P. falciparum strains which are resistant to anti-malarials such as chloroquine, primaquine, pyrimethamine, and mefloquine (Es, 1993), has prompted a search for alternative chemotherapeutics. In contrast, clinical toxoplasmosis can be effectively treated by a combination of antifolates, such as pyrimethamine and sulphonamides (Gregoriev, 1994). However, the persistence of latent cysts requires that immunocompromised patients (predominantly AIDS sufferers) receive chemotherapy for life. Frequently patients develop an allergic reaction to sulphonamides, precluding the administration of these drugs, and pyrimethamine alone is generally insufficient to prevent a relapse (Leport et al, 1988; Tenant-Flowers et al, 1991). Therefore, as in malaria, the search is on for alternative therapies to deal with this disease (Laughon et al, 1991).

Mitochondrial and plastid organelles are potential targets for anti-bacterial agents given their prokaryotic origin (reviewed Ebringer, 1990). Since apicomplexans harbour both organellar compartments, novel therapies may be possible. Geary and Jenson (1983) established that several antibiotics have an anti-malarial action, and tetracycline is used in combination with other drugs in regions where drug resistance has become a severe problem (World Health Organisation, 1973). It has been suggested that this antibiotic disrupts protein synthesis in the parasitic mitochondrion (Divo et al, 1985; Kiatfueng-Foo et al, 1989). However, both the plastid and mitochondria-encoded SSU rRNAs maintain potential binding sites for tetracycline (Feagin et al, 1992). The in vitro and in vivo effects of rifampicin as an anti-malarial have also been documented (Strath et al, 1993; Pukrittayakamee et al, 1994). Since this drug is known to act specifically against the DNA-dependent RNA polymerase of prokaryotes it was hypothesised that its mode of action in malaria was against the plastid, whose genome encodes subunits for such a polymerase (Gardner et al, 1991).

Both the fragmented rRNA genes of the mitochondria, and the contiguous rRNA genes of the plastid in malaria have been predicted to be sensitive to a range of antibiotics (Feagin et al, 1992). In the mitochondrial LSU and SSU rRNA genes, two sites vary from the expected sensitive sequence, possibly conferring resistance to erythromycin and streptomycin, respectively. By contrast the plastid rRNAs are predicted to be sensitive to these compounds. Erythromycin has proved effective against P. falciparum in vitro; however, the parasite was resistant to streptomycin and other aminoglycosides (Geary and Jenson, 1983). Interestingly, streptomycin resistance of the green alga Chlamydomonas reinhardtii results from specific mutations in the SSU rRNAs of the chloroplast (Harris et al, 1989), suggesting that this is the target in at least one eukaryotic . Recent studies have demonstrated the efficacy of thiostrepton against the malaria parasite in vitro (McConkey et al, 1997). It was predicted that the antibiotic acted selectively against the plastid based on the primary nucleotide sequence of the target LSU rRNA GTPase centre, and its differential effect on cytoplasmic and organellar encoded transcripts. The specificity of thiostrepton was supported by binding studies (Clough et al, 1997). However, the T. gondii plastid GTPase centre proved unable to bind the drug; this is discussed further in chapter 6.

The lincosamide antibiotic clindamycin, and the macrolides spiramycin and azithromycin, are known to be active against T. gondii. Clindamycin in combination with pyrimethamine has proved to be as effective as pyrimethamine-sulphadiazine in the treatment of toxoplasmic encephalitis in patients with AIDS (Dannemann et al, 1992; Katlama, 1991). Spiramycin is used to treat pregnant women who have acquired toxoplasmosis (McCabe and Remington, 1990), since the use of pyrimethamine-sulphadiazine is not recommended until twenty weeks post-conception. Azithromycin reduces mortality in murine models (Araujo et al, 1991). Comparison of mutants resistant to these drugs (Pfefferkorn and Borotz, 1994) suggested that they share a common target, which did not appear to be the mitochondrion since its function was unaffected. It was hypothesised that these anti-microbials might act against protein synthesis in the plastid. Molecular evidence also suggested that the target, the peptidyl transferase region of the LSU rRNA, was of the plastid rather than mitochondrial type (Beckers et al, 1995). However, it should be stressed that the mitochondrial genome of T. gondii has not yet been isolated. Beckers et al utilised the sequence of nuclear mitochondrial gene fragments (Ossorio et al, 1991) to predict the insensitivity of the peptidyl transferase region to lincosamide and macrolide antibiotics; whether this is reflected within the organelle is unknown.

The tetracycline-class of antibiotics shows in vitro and in vivo activity against T. gondii (reviewed Luft and Remington, 1992). Beckers et al (1995) hypothesised that the target was mitochondrial based on the sensitivity of a subset of proteins to tetracycline.

A more surprising finding is that all the apicomplexans thus far tested have proved to be sensitive to a therapeutic derivative of the herbicide triazine (Mehlhorn et al, 1984; Harder and Haberkorn, 1989; Hackstein et al, 1994). It was suggested that this herbicide interacts with the D1 protein of photosynthetic reaction centres in the parasites' plastid organelles (Hackstein et al, 1995). Hackstein et al hypothesised that all apicomplexans possess such targets, and new therapeutic herbicide derivatives might prove effective in the treatment of malaria, toxoplasmosis, coccidiosis, and other infections. However, as discussed above there is still some doubt concerning the existence of photosynthetic reaction centres in the Apicomplexa. Similarly, several dinitroaniline herbicides were recently found to specifically inhibit T. gondii replication (Stokkermans et al, 1996), and it was suggested that this effect was due to interaction with a plant-like tubulin encoded by the parasite's nuclear DNA.

It remains to be demonstrated whether any agents specifically inhibit the function of the plastid. However, potential apicomplexan chemotherapies may be present among the array of herbicides and antibiotics currently available. The suitability of such agents in a clinical setting remains to be tested.

1.10.0 Aims

This thesis is largely dedicated to a comparative analysis of the well studied P. falciparum extrachromosomal DNA, and those of other important apicomplexans. To this end, two sectors of the plastid genome were sequenced from the coccidians, T. gondii and E. tenella, and from the piroplasm Th. annulata. This allowed evolutionary studies to be initiated by analysing conservation of gene content and arrangement, and by utilising phylogenetic methodologies. Subsequently, sequence data were used to suggest that the apicomplexan plastid is functional, that potential targets are available for antibiotics, and that the coccidian plastid employs an unusual coding system.

Figure 1.1: Apicomplexan Merozoite

Diagram of a proliferative form apicomplexan parasite, illustrating the apical complex used by Levine (1970) to classify the phylum.

Redrawn from Levine (1987).

[Figure 1.1 diagram labels: polar ring, conoid, pellicle, subpellicular tubule, micropore, microneme, rhoptry, Golgi, nucleus, nucleolus, mitochondrion, posterior ring]

Figure 1.2: Toxoplasma Life-Cycle

Oocysts develop during the enteroepithelial cycle in cats, the definitive host, and are shed into the environment. This persistent form of the parasite is a major cause of infections in humans and other animals who come into contact with cat faeces.

Oocysts reactivate to give, normally, a brief acute infection instigated by the tachyzoite form of the parasite. Following activation of the immune system, cysts containing the bradyzoite form of the parasite establish themselves in the brain and muscle tissues, giving a chronic, sub-clinical infection. These cysts are a major cause of human infection in raw or under-cooked meat.

In immunocompromised individuals tissue cysts can reactivate and an acute, tachyzoite infection can result, leading to encephalitis. An additional clinical risk occurs on first infection of pregnant women. Tachyzoites are passed to the foetus whose immune system is too immature to tackle the parasite. Congenital abnormalities can result.

[Figure 1.2 diagram labels: parasite sexual cycle; oocysts; undercooked meat; neonatal infection; immunosuppression; acute phase, tachyzoite infection; reactivation of cysts; chronic phase, bradyzoite cysts]

Figure 1.3: Complete Map of the Malarial Plastid Genome

Diagrammatic illustration of the fully sequenced P. falciparum 35 kb circular DNA (Wilson et al, 1996).

Genes are represented by filled boxes which are proportional to gene length. Genes on the outer circle are transcribed clockwise. Genes on the inner circle are transcribed counter-clockwise. Transfer RNA genes are identified by the single letter code for the cognate amino acid, with the anticodon in parentheses. The asterisk indicates the presence of an intron.

IR - inverted repeat

SSC - small single copy region (very small or non-existent in the vestigial plastid genome)

LSC - large single copy region

SSU - small subunit

LSU - large subunit

ORF - open reading frame

[Figure 1.3: circular gene map of the P. falciparum 35 kb plastid DNA; gene labels not recoverable from the extracted text]

Figure 1.4: Cruciforms

A(I). Diagram of the P. falciparum inverted repeat (IR). Numbered sections correspond to cloned HindIII fragments.

A(II). Illustration of the folding of the IR into a 'cruciform'. Repeated restriction sites used in 'snap-back' experiments are designated (V).

B. Electron micrographs showing 'cruciform' structures in the 35 kb circles of T. gondii (I) and Plasmodium sp. (II and III). Scale bar indicates 0.5 µm.

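[Figure 1.4 diagram: numbered HindIII fragments (2, 4, 5, 6a, 6b, 7) of the inverted repeat, with the positions of the LSU and SSU rRNA genes; layout not recoverable from the extracted text]

CHAPTER 2: MATERIALS AND METHODS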

2.1.0 Parasite Culture

T. gondii tachyzoites (RH strain) were cultured on monolayers of Madin-Darby Canine Kidney (MDCK) epithelial cells (Madin and Darby, 1958) (European Culture Collection). The MDCK cells were grown to confluence using RPMI (Gibco 074-1800) medium supplemented with 23 g NaHCO3, 5.957 g HEPES, 25 mg gentamycin, 50 mg hypoxanthine, and 2 g glucose per litre, plus 10% foetal calf serum (FCS) (Sera Labs) in 250 ml vented tissue culture flasks (Falcon) placed in an incubator at 37°C with an atmosphere of 5% carbon dioxide. To provide stock cultures, one flask of confluent MDCK cells was stripped by washing with 5 ml of trypsin/versene (8.0 g NaCl, 0.2 g KCl, 1.15 g Na2HPO4, 0.2 g KH2PO4, 0.1 g EDTA, 1.25 g trypsin, 0.01 g phenol red, and distilled H2O to a litre) then incubating for approximately 15 minutes in 10 ml of the same solution, and gently scraping. 10 ml of trypsin-free medium was added in order to neutralise the trypsin/versene solution, and the cell suspension was diluted into an appropriate number of flasks (a dilution of 1:10 gave a monolayer after 5 days). For maintenance of the host cells and parasite culture the concentration of FCS was reduced to 3% in the medium and the flasks were inoculated with approximately 10^ tachyzoites, strain RH (provided by Dr. M. Pudney, Wellcome). After 5 to 7 days, as the monolayer began to disintegrate, the parasites were harvested by scraping off the cells which were then pelleted in a bench-top centrifuge (MSE Centaur 2) at 3000 rpm and washed with phosphate buffered saline (PBS). Approximately 10^ organisms were obtained per flask.

T. gondii parasites were also supplied by Dr. J. Smith (University of Leeds).

2.2.0 Parasite Purification

The T. gondii parasites were released from host cells by passage through a 22 gauge needle. Extracellular parasites were then purified by passing them through successive 5 µm and 3 µm Nucleopore polycarbonate filters (Dahl and Johnson, 1983). The purified cells were then pelleted and washed in PBS as above.

2.3.0 DNA Extraction

2.3.1 Toxoplasma

T. gondii cells were lysed by adding 10 pellet volumes of warm (37°C) lysing buffer (0.1 M Tris-HCl, 10 mM EDTA, 4% Sarkosyl, pH 8.0), mixing gently, and incubating at 60°C for a few minutes. After cooling at room temperature to approximately 37°C, 5 µl/ml of DNase-free RNase (GibcoBRL) were added and incubated at 37°C for 1 hour. Proteinase K (PK) (Sigma) (100 µg/ml) was then added and incubated overnight at 37°C. A phenol/chloroform extraction was then performed and the DNA ethanol-precipitated, washed with 70% ethanol, pelleted, and then resuspended in TE pH 8.0.

2.3.2 Other parasites

The malarial parasite P. falciparum (clone C10, Hemplemann et al, 1981) was cultured in vitro (Trager and Jenson, 1976) and the DNA extracted as previously described (Gardner et al, 1988). Isolated P. falciparum circle DNA was supplied by Dr. D.H. Williamson (NIMR). Total DNA from Th. annulata and E. tenella was supplied by Dr. R. Hall (University of York) and Dr. F. Tomley (Institute of Animal Health, Compton), respectively. Total Leishmania guyanensis B8 DNA, used as a negative control in PCR reactions, was supplied by Dr. D.C. Barker (NIMR, Molteno Laboratories, University of Cambridge).

2.4.0 Caesium Chloride/DAPI Gradients

T. gondii DNA was purified on density gradients using a modified protocol based on that of Williamson et al. (1985). For each separation 50 µg of DNA from filtered or unfiltered parasites were added to 4.3 g CsCl, 3.8 ml saline and 2 µg/ml 4',6-diamidino-2-phenylindole (DAPI). The mixture was loaded into Beckman Optiseal ultrafuge tubes and centrifuged at 20°C for 4 hours at 65000 rpm, followed by 12 hours at 50000 rpm in a vertical rotor (Beckman). Fractions were removed from the tubes by side puncture, and diluted with saline and 10 mM EDTA pH 8.0. These were then loaded into Beckman ultrafuge tubes and pelleted at 4°C for 4 hours at 45000 rpm in a swing-out rotor (Beckman). The DNA pellet was dissolved in TE pH 8.0.

2.5.0 Southern Blots

The DNAs were restriction digested at 37°C overnight using HindIII (GibcoBRL). Approximately 500 ng of restricted total T. gondii, P. falciparum, Th. annulata and E. tenella DNAs were separated by electrophoresis on 0.8% agarose-TBE gels, impregnated with 0.6 µg/ml of ethidium bromide. The DNAs were then transferred under alkaline conditions to Hybond N+ membranes (Amersham Life Science) according to the manufacturer's instructions, and fixed by UV crosslinking (auto-crosslink, Stratalinker, Stratagene). Hybridisations were carried out as recommended in the Hybond manual using [α-32P]-radiolabelled probes prepared by the random primer method (Prime-It II Random Primer Kit, Stratagene). However, low stringency hybridisations were carried out at a reduced temperature (50°C), and washed accordingly (a low temperature and high salt concentration). The malarial 35 kb DNA probes used (see figure 1.3) were: 1) cloned SSU rRNA gene fragment 6b (P.W. Moore, NIMR); 2) cloned ORF470 (Dr. A.M. Whyte, NIMR); 3) PCR amplified tuf gene fragment (Dr. K. Rangachari, NIMR); 4) PCR amplified clp gene fragment (Dr. A.M. Whyte, NIMR). The hybridised membranes were exposed to Blue X-ray film (X-ograph) using intensifying screens (Dupont) at -70°C. For rehybridisation the membranes were 'deprobed' as described in the Hybond manual.

2.6.0 Slot Blots

Undigested T. gondii DNA fractions from CsCl/DAPI gradients, U (20 ng), M (200 ng) and L (200 ng), together with undigested control DNAs from P. falciparum (200 ng) (M. Strath, NIMR) and MDCK cells (200 ng) were blotted onto Hybond N+ using a Slot Blotter (Scotlab Ltd.) according to the manufacturer's protocol. Following fixation by UV irradiation the blot was hybridised with [α-32P]-radiolabelled probes, washed (at a high or low stringency), and autoradiographed (see 2.5.0). The probes used were: 1) cloned malarial circle SSU rRNA gene fragment 6b (P.W. Moore, NIMR); 2) PCR amplified P30 gene fragment (see table 2.1), a single copy T. gondii nuclear gene (Burg et al, 1988).

2.7.0 Copy Number Determination

Approximately 500 ng of HindIII digested purified total T. gondii DNA was separated on an agarose gel and Southern blotted as in 2.5.0. As controls the blot contained known amounts of PCR products from the single-copy nuclear P30 gene (see table 2.1), and the plastid ORF470 gene (see table 2.2) at doubling dilutions. Hybridisations were carried out at a high stringency using the PCR products as probes (see 2.5.0). The intensity of signal for each probe against the genomic DNA and the controls was measured using a phosphorimager (Molecular Dynamics) and appropriate software (ImageQuant, Molecular Dynamics). The hybridisation efficiency of each probe was calculated from the control results, and the copy number was then determined from the results gained with the total DNA.
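The calculation implied by this procedure can be sketched as follows. This is an illustrative reconstruction rather than the analysis actually performed: it assumes that signal is proportional to the molar amount of target (a straight line through the origin fitted to each dilution series), and all function names and numerical values are hypothetical.

```python
def signal_per_unit(amounts, signals):
    """Least-squares slope of signal against amount, forced through the origin."""
    numerator = sum(a * s for a, s in zip(amounts, signals))
    denominator = sum(a * a for a in amounts)
    return numerator / denominator


def copies_per_nuclear_genome(plastid_ctrl, nuclear_ctrl,
                              plastid_signal, nuclear_signal):
    """Estimate plastid gene copies relative to a single-copy nuclear gene.

    Each *_ctrl argument is a pair (amounts, signals) from the dilution
    series of the corresponding control PCR product; amounts must be in
    the same molar units for both probes.
    """
    plastid_amount = plastid_signal / signal_per_unit(*plastid_ctrl)
    nuclear_amount = nuclear_signal / signal_per_unit(*nuclear_ctrl)
    return plastid_amount / nuclear_amount


# Hypothetical phosphorimager readings for doubling dilutions (arbitrary units).
orf470_controls = ([1, 2, 4, 8], [50, 100, 200, 400])
p30_controls = ([1, 2, 4, 8], [100, 200, 400, 800])
estimate = copies_per_nuclear_genome(orf470_controls, p30_controls,
                                     plastid_signal=900, nuclear_signal=300)
print(f"Estimated plastid copies per nuclear genome: {estimate:.1f}")
```

Calibrating each probe against its own control series corrects for differences in hybridisation efficiency before the two genomic signals are compared.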

2.8.0 Synthesis of Oligonucleotides

Oligonucleotides were synthesised at the NIMR using an ABI 380B DNA synthesiser and supplied deprotected at 0.1 mM in 35% ammonia. They were ethanol-precipitated, washed, and their concentration determined by UV spectrophotometry (OD 1.0 at A260 nm = 20 mg/ml).

2.9.0 Polymerase Chain Reaction

Oligonucleotide primers (tables 2.1-2.5) were designed based on conserved regions of the malarial 35 kb circle (Wilson et al, 1996) using the 'guessmer' technique (Jaye et al, 1983), or on published sequence. Polymerase chain reactions (PCRs) were carried out with 20 ng of T. gondii plastid DNA, 400 ng of total T. gondii DNA, 150 ng of total E. tenella DNA, or 300 ng of total Th. annulata DNA as templates. Approximately 50 pmol of each primer was added to reaction mixtures containing 2.5 mM MgCl2 (Promega), 1x PCR buffer (Promega) and 1 µl (5 units) Amplitaq (Cetus) in 100 µl total volume. Amplification was carried out for 35 cycles as follows: 95°C for 30 seconds, 40°C (55°C for the P30 gene, see table 2.1) for 60 seconds, 72°C for 120 seconds. As a positive control, malarial circle DNA was used as template. As negative controls, total DNA from L. guyanensis and MDCK cells was used. PCR products were analysed by electrophoresis on 0.8% agarose-TBE gels impregnated with 0.6 µg/ml of ethidium bromide. DNA was purified using Wizard PCR Preps (Promega) or GeneClean II Kits (Bio 101).
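For reference, the reaction parameters given above translate into a final primer concentration and a minimum programme length. The short sketch below performs this arithmetic; the variable and function names are hypothetical, and ramping time between temperatures is ignored by assumption.

```python
CYCLES = 35
# (step name, temperature in degrees C, hold time in seconds), as stated in the text.
STEPS = [("denature", 95, 30), ("anneal", 40, 60), ("extend", 72, 120)]


def primer_micromolar(pmol: float, volume_ul: float) -> float:
    """Final primer concentration; pmol per microlitre is numerically micromolar."""
    return pmol / volume_ul


def total_hold_minutes(steps, cycles: int) -> float:
    """Total time spent at the programmed temperatures, excluding ramping."""
    seconds_per_cycle = sum(hold for _, _, hold in steps)
    return cycles * seconds_per_cycle / 60


print(primer_micromolar(50, 100))         # 0.5 (micromolar, each primer)
print(total_hold_minutes(STEPS, CYCLES))  # 122.5 (minutes)
```

The low annealing temperature of 40°C is consistent with the use of degenerate 'guessmer' primers, which are expected to match their targets imperfectly.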

2.10.0 Cloning PCR Products

An E. tenella PCR product encompassing the IR (see table 2.3) was purified using the GeneClean II Kit and cloned utilising the TA Cloning Kit Version 2.1 (Invitrogen) according to the manufacturer's protocol. Amplified plasmid was purified using the Wizard Plus Mini Prep Kit (Promega).

2.11.0 Sequencing

Most PCR products obtained were sequenced directly on one strand using the Sanger di-deoxy chain termination method (Sanger et al, 1977). The Sequenase Version 2.0 Kit (USB) was utilised with the following modifications: the products were denatured by boiling for 5 minutes then annealed with the requisite primer on ice for 10 minutes. The labelling reaction was performed on ice for 3 minutes with the mix at a 1:10 dilution, and the termination reaction was carried out at 40°C for 3 minutes. In instances where 'ghost' bands proved problematic, deoxy-transferase (TdT) was also added and the reactions incubated at 37°C for 30 minutes (Fawcett and Bartlett, 1990). The samples were run on 6% acrylamide/bisacrylamide gels (Sequagel, National Diagnostics) with a modified TBE buffer (163.5 g Tris, 27.8 g Boric Acid, 9.31 g EDTA, to 1 litre pH 8.9) at 55 W using an S2 sequencing gel apparatus (GibcoBRL). Gels were fixed for 15 minutes in 10% v/v acetic acid and 10% v/v methanol, dried under vacuum at 80°C (Model 1583, Bio-Rad) and exposed to BioMAX-AR film (Kodak).

Reverse strands, and the TA cloned E. tenella PCR product, were sequenced directly by fluorescent dye-terminated thermo-cycle sequencing (Perkin Elmer) using an ABI PRISM 377 Automated DNA sequencer (Perkin Elmer) according to published protocols.

Initially, the sequencing primers used were the same as those for PCR. Further primers were designed from the derived sequence to enable 'walking' along both strands of the DNA template.

The sequences were assembled using the Staden-Plus programme SAP (Amersham).

2.12.0 Sequence Analysis

Sequence data were analysed using:

• The Staden-Plus programme NIP (Amersham) tRNA option (Staden, 1980);

• the BLAST, TRANSLATE, COMPOSITION and PILEUP options of the Sequence Analysis Software Package (Genetics Computer Group, Madison, Wisconsin) (Devereux et al, 1984);

• the Clustal (Higgins and Sharp, 1989), phylogenetic tree and sequence distance methods (Saitou and Nei, 1987) in the MegAlign option of the LaserGene package version 1.4 (DNASTAR inc.).

Detailed phylogenetic studies were carried out with the help of Dr. N. Goldman (Department of Genetics, University of Cambridge).

Predicted amino acid (aa) sequences from the plastid tuf genes of P. falciparum (X95276), T. gondii, E. tenella and Th. annulata, corresponding to a 132 aa fragment of EF-Tu, were aligned using the PILEUP programme. Other sequences were selected to give as broad a range of diversity as possible, given the constraints of sequence availability and computational tractability in reasonable times. The data comprised a range of bacterial, and higher plant and algal plastid tuf sequences with the following accession numbers: Anacystis nidulans X17442; Escherichia coli J01690; Mycobacterium tuberculosis X63539; Chlamydomonas reinhardtii X52257; Cryptomonas phi X52912; Odontella sinensis Z67753; Cyanophora paradoxa X52497; Porphyra purpurea U38804; Codium fragile U09427; Euglena gracilis Z11874; Astasia longa X14385; Glycine max X66062; Nicotiana sylvestris D11469. Maximum likelihood analyses of aligned nucleotide sequences were carried out, the third codon positions being excluded. Analysis of the first and second codon positions was performed using the BASEML programme of the PAML package (Yang, 1995). The basic model, designated F84, which allows for both base composition bias (by a three-parameter distribution, π) and transition/transversion bias (by a relative rate parameter, R), was utilised. In addition, rate heterogeneity across sequence positions (Yang 1994 and 1996) was described by the shape parameter (α) of a gamma distribution with mean = 1. In all cases, estimation of different mean rates of substitution at the two codon positions was allowed, and no molecular clock was assumed. This methodology will be expanded upon by Denny et al (submitted).
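The data preparation step described above (excluding third codon positions and treating first and second positions as separate rate classes) can be sketched as follows. This is a minimal illustration in Python, not the procedure actually run through BASEML. The function name codon_position_partitions is hypothetical, and the sketch assumes an in-frame sequence that starts at the first position of a codon.

```python
def codon_position_partitions(seq):
    """Return (first_positions, second_positions) of an in-frame coding sequence.

    Third codon positions are discarded, mirroring their exclusion from the
    maximum likelihood analysis. A trailing partial codon is ignored.
    """
    usable = len(seq) - len(seq) % 3
    first = seq[0:usable:3]    # codon position 1 of every complete codon
    second = seq[1:usable:3]   # codon position 2 of every complete codon
    return first, second


if __name__ == "__main__":
    # Toy sequence (three codons: ATG GCA TTT), not real tuf data.
    pos1, pos2 = codon_position_partitions("ATGGCATTT")
    print(pos1, pos2)
```

Keeping the two partitions separate is what allows a different mean substitution rate to be estimated for each codon position.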

2.13.0 RNA Extraction

RNA was extracted from purified T. gondii parasites using the RNaid Kit (Bio 101) according to the manufacturer's protocol. RNA concentration was determined by UV spectrophotometry (OD260nm 1.0 = 40 µg/ml).
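The spectrophotometric conversions used in this chapter amount to a simple multiplication, sketched below in Python. The factors are the conventional ones (an OD260 of 1.0 taken as 20 µg/ml for oligonucleotides and 40 µg/ml for RNA); the function name, the dictionary and the dilution_factor parameter are assumptions introduced for illustration only.

```python
# Conversion factors in µg/ml per unit of absorbance at 260 nm.
UG_PER_ML_PER_A260 = {
    "oligonucleotide": 20.0,
    "rna": 40.0,
}


def concentration_ug_per_ml(a260, kind, dilution_factor=1.0):
    """Estimate nucleic acid concentration of the undiluted sample in µg/ml.

    a260            -- absorbance reading of the (possibly diluted) sample
    kind            -- 'oligonucleotide' or 'rna'
    dilution_factor -- e.g. 100 if the sample was diluted 1:100 before reading
    """
    return a260 * UG_PER_ML_PER_A260[kind] * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: RNA diluted 1:100 gives A260 = 0.25.
    print(concentration_ug_per_ml(0.25, "rna", dilution_factor=100))
```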

2.14.0 Northern Blots

RNA, at a concentration of 30 µg/lane, was resolved using 1% w/v agarose-formaldehyde gels electrophoresed in 3-(N-morpholino)-propanesulphonic acid (MOPS)/formaldehyde buffer (Ausubel et al, 1995) and transferred to Hybond N+ according to the manufacturer's protocol. Transferred RNA was fixed by UV crosslinking as in 2.5.0. Blots were either stained with Methylene Blue (membrane soaked in 5% acetic acid for 5 minutes, then in 0.004% Methylene Blue in 0.5% sodium acetate pH 5.2 for 5 minutes), or hybridisations were carried out as recommended in the Hybond manual using [α-32P]-radiolabelled DNA probes prepared as in 2.5.0. The PCR amplified T. gondii plastid DNA probes (see table 2.2) used were: 1) SSU and LSU rRNA gene fragments; 2) ORF470 fragment; 3) tuf gene fragment. After washing as recommended the blots were autoradiographed as in 2.5.0.

For rehybridisation the membranes were 'deprobed' as described in the Hybond manual.

2.15.0 Reverse Transcription PCR

RT-PCR was performed using total T. gondii RNA (see 2.13.0). First strand synthesis was carried out in a 20 µl volume using approximately 50 pmol of reverse strand primers for the ORF470 and rps7 genes (see table 2.5), about 5 µg of RNA, 1 µl 10 mM dNTP mix (Promega), 4 µl 5x First Strand Buffer (GibcoBRL), 2 µl 0.1 M DTT (GibcoBRL) and 1 µl (200 units) Superscript II RNase H Reverse Transcriptase (GibcoBRL) according to the GibcoBRL protocol. The RNase inhibitor RNasin (Promega) was utilised according to the manufacturer's protocol. The forward primers (see table 2.5) were added and a PCR reaction was performed in a 100 µl volume as described in 2.9.0. As negative controls, first strand synthesis was carried out after treating RNA with DNase-free RNase (Boehringer Mannheim) as described by the manufacturer, or without reverse transcriptase. PCR amplification of T. gondii plastid DNA was used as a positive control.

2.16.0 Production of Fusion Proteins

The 5' ends of the T. gondii plastid rps7 gene and ORF470 were amplified by PCR as described in 2.9.0, using oligonucleotide primers engineered to contain suitable restriction endonuclease sites (see table 2.6). The PCR products and the vector, pGEX-3X (Pharmacia Biotech), were digested for 1 hour at 37°C using BamHI (GibcoBRL) and EcoRI (GibcoBRL) sequentially. The DNAs were isolated after electrophoresis on a 0.8% agarose-TAE gel impregnated with 0.6 µg/ml of ethidium bromide, and then purified utilising the GeneClean II Kit. The vector was treated with calf intestinal alkaline phosphatase (Promega) according to the manufacturer's instructions. Subsequently, the PCR products were ligated into pGEX-3X using Bacteriophage T4 DNA ligase (GibcoBRL) according to the manufacturer's protocol. To ensure the reading frame required to synthesise a Glutathione S-transferase (GST) fusion protein was restored, the plasmid inserts were end sequenced using the Sequenase Version 2.0 Kit (USB) according to the manufacturer's protocol (sequencing primers supplied by Dr. M. Blackman, NIMR).

Approximately 20 ng of the ligated DNA was used to transform Epicurian Coli SURE Competent Cells (Stratagene) according to the manufacturer's instructions. pGEX transformants were selected on L-agar plates supplemented with ampicillin (50 µg/ml). Clones were grown up overnight at 37°C in L-broth containing 50 µg/ml ampicillin, and induced using isopropyl-β-D-thiogalactoside (IPTG) (Pharmacia Biotech) according to the manufacturer's protocol. The cells from a selected clone were lysed by lysozyme treatment (Sigma) and mild sonication (Vibra Cell, Sonics and Materials) in ice-cold PBS, and then fusion proteins were purified using the Bulk GST Purification Module (Pharmacia Biotech) according to the manufacturer's protocol. Samples at various stages of purification were solubilised in SDS loading buffer (Biolabs) with 100 mM DTT for denaturation, boiled for 5 minutes, and then analysed by SDS-PAGE on homogeneous 10% gels (Easigel, Scotlab). Electrophoresis was carried out at 25 mA for approximately 1 hour using a Mighty Small II vertical slab gel unit (Hoefer Scientific Instruments). Broad-range prestained protein markers (6-175 kD) (NEB) were used.

To visualise the proteins, SDS-PAGE gels were stained with 0.1% w/v Coomassie Brilliant Blue R-250 (Sigma) in methanol:water:ethanoic acid (5:5:1), and destained with methanol:water:ethanoic acid (1:17:2). Purified ORF470 fusion protein was isolated. However, the rps7-derived fusion protein appeared degraded and was discarded.

2.17.0 Polyclonal Antibody Production

Purified ORF470 fusion protein (see 2.16.0) was inoculated intraperitoneally into BALB/c mice with Freund's complete adjuvant (Harlow and Lane, 1988). After boosting together with Freund's incomplete adjuvant (Harlow and Lane, 1988), tail bleeds were taken and analysed in Western blots.

2.18.0 Western Blotting

Purified ORF470 fusion protein (see 2.16.0), isolated GST protein (provided by I. Ling), and T. gondii and P. falciparum (provided by Dr. B. Clough) cells, were solubilised in SDS loading buffer (with or without DTT) and separated by SDS-PAGE (see 2.16.0). The proteins were subsequently transferred to a Hybond-C pure nitrocellulose membrane (Amersham Life Science) using the Sartoblot apparatus (Sartorius) according to the manufacturer's protocol. Blots were treated overnight with Protoblock (National Diagnostics) at 4°C, washed with PBS, and incubated for 1 hour at room temperature with primary antibodies (anti-fusion protein, anti-GST (provided by I. Ling) or preimmune sera) diluted 1:100 in Protoblock. After washing vigorously in PBS, the blots were incubated for 1 hour at room temperature with secondary antibody, horseradish peroxidase-conjugated anti-mouse IgG (provided by M. Strath), diluted 1:3000 in Protoblock. After further washing in PBS the blots were developed using the Horseradish Peroxidase Luciferase (HRPL) Kit (National Diagnostics), according to the manufacturer's instructions. The membranes were exposed to Blue X-ray film using intensifying screens for a minimum of 2 seconds.

Table 2.1: Oligonucleotide Primers for Amplification of the P30 Nuclear Gene

1. P30 sequence published by Burg et al (1988).

NUCLEAR ENCODED GENE                  PRIMER PAIR
P30' fragment used as probe           5' ATGTCGGnrCGCTGCACCAC and 5' TGCGTCGTCTCCCTTGATG

Table 2.2: Oligonucleotide Primers for Amplification of Isolated Toxoplasma Plastid DNA

Published primer sequences: 1. Gozar and Bagnara (1993); 2. Beckers et al (1995).

/: Separates degenerate nucleotides.

SEGMENT OF CIRCLE                     PCR PRIMER PAIRS
rRNA/tRNA Inverted Repeat-ORF470      5' TACGGCTACCTTGTTACGACTTC and 5' GCGGTAATACAGAAAATGCAAG'
                                      5' AAGAACATACCAATCCACC and 5' CTATTTCACCGAATATTTC
                                      5' GCGAAATTCCTTGTCGGGTAAGTTCC^ and 5' TTTT/CT/CGG/ATCCTCTCGTAC^
                                      5' GTTCGCCTATTAAAGCGATACGTGAGCTGGG and 5' ATCACATTGAG/CA/TATAATT
LSU rRNA gene fragment used as probe  5' GCGAAATTCCTTGTCGGGTAAGTTCC^ and 5' ATCCCTAGAGTAACTTTATCCGT^
SSU rRNA gene fragment used as probe  5' GCGGTAATACAGAAAATGCAAGCG and 5' AGCACGAACTGACGACAGCCATGCAC
ORF470 fragment used as probe         5' CAATTTGAAAGAACAmTA and 5' ATCACATTGAG/CA/TATAATT
tuf gene fragment used as probe       5' CATGTA/TGATCATGGA/TAAAAC and 5' ATCTTCTTTATTTAAAAAA/TAC
rps12-rps7-tuf-tRNAs-rpl11            5' GGAGGTAGAGTAAAAGATTTACCAGG and 5' GGTAGAGCAATGGATTGAAG
                                      5' CTTCAATCCATTGCTCTACC and 5' GTGAGTATAGTTTAG

Table 2.3: Oligonucleotide Primers for Amplification of Eimeria Plastid DNA

* PCR product TA cloned (Invitrogen).

/: Separates degenerate nucleotides.

SEGMENT OF CIRCLE                     PCR PRIMER PAIRS
rRNA/tRNA Inverted Repeat-ORF470      5' CGCTTGCATTTTCTGTATTACCGC' and 5' GAGTTTGATCCTGGCTCAG
                                      5' CTGAGCCAGGATCAAACTC* and 5' GGAAAACTGCTTCTAAG*
                                      5' CTTAGAAGCAGTTTTCC and 5' TTTCTCACTTATATGTTGTC
                                      5' GTTCGCCTATTAAAGCGATACGTGAGCTGGG and 5' ATCACATTGAG/CA/TATAATT
rps12-rps7-tuf-tRNA^^                 5' GGAGGTAGAGTAAAAGATTTACCAGG and 5' GGTAGAGCAATGGATTGAAG

Table 2.4: Oligonucleotide Primers for Amplification of Theileria Plastid DNA

/: Separates degenerate nucleotides.

SEGMENT OF CIRCLE                     PCR PRIMER PAIR
rps12-rps7-tuf                        5' GGAGGTAGAGTAAAAGATTTACCAGG and 5' ATCTTCTTTATTTAAAAAA/TAC

Table 2.5: Oligonucleotide Primers for RT-PCR

TRANSCRIPT    FORWARD PRIMER                 REVERSE PRIMER
ORF470        5' CAATTTGAAAGAACA/rTTA        5' ATCACATTGAG/CA/TATAATT
rps7          5' CTATTTATCAAGTACCACAAC       5' nTAATTGATAAGCTTTACC

Table 2.6: Oligonucleotide Primers for Amplification of Toxoplasma Plastid Genes for Fusion Protein Production

Nucleotides in bold represent engineered EcoRI and BamHI restriction endonuclease sites.

GENE FOR FUSION PROTEIN PRODUCTION    PRIMER PAIRS (INCLUDING RESTRICTION SITES)
rps7                                  5' GAGCTCGGGATCCGCATGATTTATTTTTATCAAAAAC and 5' GAGCTCGGAATTCGGTTGTGGTACTTGATAAATAG
ORF470                                5' GAGCTCGGGATCCGCATGAAATTATATAAATATTTA and 5' GAGCTCGGAATTCCTGAAAAl'lTTAATTCTAATCC

CHAPTER 3: ISOLATION OF THE PLASTID DNA FROM TOXOPLASMA

3.1.0 Introduction

The organic dye ethidium bromide (EtBr) intercalates more readily with linear than with circular DNA molecules. This means that DNA circles are of a relatively high buoyant density when genomic material is separated by ultra-centrifugation on caesium chloride (CsCl) gradients. Borst et al (1984) utilised EtBr/CsCl gradients to isolate a circular genome from T. gondii which has since been hypothesised to be homologous to the plastid DNA of malaria (Wilson et al, 1993). In order to test this hypothesis the putative plastid genome of T. gondii was isolated in sufficient quantities to allow molecular characterisation. EtBr/CsCl gradients only allow isolation of closed circular forms; to obtain all forms (closed, nicked and linearised circles) of the T. gondii molecule, the methodology described in 3.2.0 was employed.

3.2.0 Fractionation of Toxoplasma DNA

It was predicted that the putative T. gondii plastid DNA would be under the same evolutionary pressures as its distant malarial cousins, leading to an AT-rich base composition. This would allow the isolation of the genome from the relatively GC-rich parasite nuclear DNA (~50% AT) (Neimark and Blaker, 1967; Perrotto et al, 1971), and host MDCK cell DNA (~60% AT) by utilising the isopycnic centrifugation techniques previously used for Plasmodium sp. (Williamson et al, 1985; Gardner et al, 1988). This procedure relies purely on the buoyant density of the components to be separated, not on their shape or size, and is time independent. Williamson et al (1985) and Gardner et al (1988) employed so-called 'self-forming' gradients, in which the sample was mixed with the gradient medium (in these cases CsCl) to give a solution of uniform density. The gradient was then formed by sedimentation equilibrium during ultra-centrifugation. The genetic material was detected utilising fluorescent organic dyes such as Hoechst 33258 or 4',6-diamidino-2-phenylindole (DAPI) (Williamson and Fennell, 1975), which bind preferentially to AT-rich DNA species thereby decreasing their buoyant density upon centrifugation. In the malaria parasites, the AT-rich plastid DNA was visible as a buoyant band.

DNA was prepared from both purified and host cell-contaminated T. gondii parasites, then separated on isopycnic CsCl/DAPI gradients as described in 2.4.0. The DNA separated into three distinct bands designated U (upper), M (middle) and L (lower) (see figure 3.1). U contained the DNA species with the lowest buoyant density (i.e. the highest AT content). A similar pattern had been observed previously in an attempt to purify T. gondii DNA from contaminating murine DNA (Johnson et al, 1986). This group identified a low, 'heavy' band as parasite, a middle band as murine, and a high, 'light' band as murine satellite DNA. Joseph et al (1989) found the middle band to be enriched for a putative parasite mitochondrial DNA.

3.3.0 Identification of the Fractions

Hybridisation analysis demonstrated that fraction L contains parasite nuclear DNA (see figure 3.2B). Band M was assumed to comprise host MDCK cell material, since its intensity was significantly reduced in gradients containing purified T. gondii DNA (see figure 3.1). These results concur with the findings of Johnson et al (1986). However, fraction U was shown to harbour a homologue of the malarial plastid genome in hybridisation experiments (see figure 3.2A). This fraction did not appear contaminated with parasite nuclear DNA since no hybridisation was noted with a T. gondii nuclear probe (see figure 3.2B).

3.4.0 Copy Number

Probes based on single copy genes from the T. gondii nuclear and plastid genomes were hybridised with Southern blots of HindIII digested genomic T. gondii DNA, and controls. Utilising phosphorimaging techniques, as described in 2.7.0, the copy number of the plastid genome was estimated to be four per haploid cell (see figure 3.3). This contrasts with the published figure of eight (Kohler et al, 1997), and an estimate of one to two copies in P. falciparum (Preiser et al, 1996).

3.5.0 Discussion

A pure fraction of the T. gondii plastid DNA was isolated, providing preliminary evidence that this AT-rich genome is maintained across a range of apicomplexan parasites, and facilitating further molecular analyses. As in P. falciparum, the extrachromosomal DNA is maintained at a relatively low copy number, namely four. The discrepancy between the figure presented here and that published by Kohler et al is difficult to explain as the methods used to arrive at their estimate of eight copies were not discussed.

Figure 3.1: Caesium Chloride/DAPI Gradients of Toxoplasma DNA

Genetic material from both MDCK cell contaminated (A), and purified T. gondii parasites (B) was separated using isopycnic centrifugation as described. The pattern observed was reminiscent of that described by Johnson et al (1986): a low, 'heavy' band (L), a middle band (M), and a minor high, 'light' band (U). Bands L and, particularly, U are faint in A due to the relatively low amount of parasite material on the gradient. The reduction in intensity of band M in B suggests that it largely contains host, MDCK cell material.

[Figure 3.1 image: gradient bands labelled U, M and L]

Figure 3.2: Identification of Gradient Fractions

A. A slot blot of the T. gondii CsCl/DAPI DNA fractions (U, M and L), together with positive (P. falciparum DNA (Pf)) and negative (MDCK DNA) controls, was probed with the P. falciparum plastid SSU rRNA gene at a low stringency. This demonstrated that fraction U contains a homologue of the malarial 35 kb circle. Some cross-hybridisation was noted in the negative control, MDCK DNA, and in M and L DNA.

B. The same slot blot was probed with the T. gondii nuclear P30 gene at a high stringency. In agreement with Johnson et al (1986) this showed fraction L as containing parasite nuclear DNA. The plastid DNA (U) appears uncontaminated with nuclear material.

[Figure 3.2 image: slot blot; recoverable lane labels Pf, MDCK and U]

Figure 3.3: Determination of the Plastid Copy Number

A phosphorimager scan of approximately 500 ng of HindIII digested, total T. gondii DNA (Tg) hybridised with PCR products from the nuclear P30 and plastid ORF470 single copy genes. Doubling dilutions of the PCR products (1 ng to 125 pg) (P30 1-4 and ORF A-D) were used as controls.

1. The dilutions of control DNAs showed a linear drop in signal strength. The mean ratio of ORF:P30 signal intensity was 0.0425. Note boxed regions.

2. The ratio of ORF:P30 signal intensity was 0.168 in the total T. gondii DNA. Note boxed regions.

3. The ratio in 1 shows the relative efficiency of the probes. Therefore, taking the ratio in 2, one may calculate the copy number of the T. gondii plastid genome: 2 ÷ 1 = 0.168 ÷ 0.0425 ≈ 4 copies per haploid cell.
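The arithmetic in this legend can be checked with a short Python sketch. It reproduces only the calculation stated above (the sample ORF:P30 ratio divided by the control ratio) and the doubling dilution series of the control PCR products; the function names are hypothetical and nothing here models the phosphorimager quantification itself.

```python
def doubling_dilutions(start_pg, steps):
    """Return a doubling dilution series in picograms, starting at start_pg."""
    return [start_pg / 2 ** i for i in range(steps)]


def plastid_copy_number(sample_ratio, control_ratio):
    """Copy number per haploid genome.

    sample_ratio  -- ORF:P30 signal ratio in total T. gondii DNA
    control_ratio -- mean ORF:P30 signal ratio for the control PCR products,
                     taken as the relative efficiency of the two probes
    """
    return sample_ratio / control_ratio


if __name__ == "__main__":
    print(doubling_dilutions(1000, 4))           # 1 ng down to 125 pg
    estimate = plastid_copy_number(0.168, 0.0425)
    print(round(estimate, 2), round(estimate))   # about 3.95, i.e. roughly 4
```

Dividing by the control ratio corrects for the different signal strengths of the two probes, so that the remaining difference reflects relative gene dosage.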

[Figure 3.3 image: phosphorimager scan with lanes P30, ORF and Tg; size labels 8 kb, 2 kb and 0.5 kb]

CHAPTER 4: EVOLUTION OF THE APICOMPLEXAN PLASTID GENOME

4.1.0 Introduction

Before the beginning of this study the evolutionary history of the apicomplexan plastid DNA molecule remained an issue of pure speculation (see chapter 1). To gain some insight into this matter the organellar DNAs of the coccidians T. gondii and E. tenella, and the piroplasm Th. annulata, were partially characterised and compared with the published sequences of P. falciparum (Wilson et al, 1996) and other organisms.

4.2.0 Cross Hybridisation

Preliminary studies with Southern blots demonstrated the presence of homologues of the P. falciparum plastid genes, ORF470, tuf and clpC, in T. gondii, E. tenella and Th. annulata (see figure 4.1). Previously, rpoB, and SSU and LSU rRNA gene homologues were similarly identified in T. gondii and E. tenella (Wilson et al, 1993).

To fully elucidate the conservation of plastid gene content and gene order in these organisms, oligonucleotide primers based on the fully characterised P. falciparum genome (Wilson et al, 1996) were used to amplify two regions of the 35 kb molecule selected as potentially characteristic for the organellar DNA. The PCR products produced were subsequently sequenced and analysed (see 2.12.0).

4.3.0 The Inverted Repeat and Downstream Region

Earlier indirect studies indicated the presence of an IR homologous to that of the malarial 35 kb genome in T. gondii and E. tenella (Wilson et al, 1993). The present study has confirmed that the novel gene arrangement occupying both DNA strands of the P. falciparum IR (Gardner et al, 1994b) is conserved in these coccidians (figure 4.2A). The findings presented here also demonstrate that a homologue of the P. falciparum ORF470 lies downstream of one arm of the IR in T. gondii and E. tenella, just as in the malaria parasite (Williamson et al, 1994) (figure 4.2A). This feature is of interest because the ORF470 gene has a very limited distribution in known organellar DNAs.

4.4.0 The tuf Region

A tuf gene lies downstream of the ribosomal protein genes rps7 and rps12 in the truncated str operon of the P. falciparum plastid genome. These elements are encoded on the opposite strand from a phenylalanine tRNA (tRNA(Phe)) gene lying further downstream (Wilson et al, 1996). This arrangement was again conserved in T. gondii and E. tenella (figure 4.2B).

ORF78, immediately downstream of the tuf gene in P. falciparum (Wilson et al, 1996), was poorly conserved in T. gondii and E. tenella, but the latter version allowed alignment of these sequences with ORF57 found on the plastid DNA of the euglenoid Astasia longa (Gockel et al, 1994) (see chapter 5). A homologue of ORF129, located still further downstream of the tuf gene in malaria (Wilson et al, 1996), was found in T. gondii. However, it appeared less divergent than in malaria, allowing its tentative identification as a derivative ribosomal protein gene, rpl11 (see chapter 5).

Minor differences are also apparent in this region:

• Of the three contiguous tRNA genes in P. falciparum close to, but on the opposite strand from, tuf (glutamine (gln), glycine (gly) and tryptophan (trp)), only those for gln and trp were present in T. gondii (figure 4.2B);

• in Th. annulata an ORF lying in the location corresponding to rps7 in the other species (figure 4.2B) was not identified as such from a search of the EMBL database. Whether it represents a highly divergent form of rps7 remains to be determined (see chapter 5).

Excepting the minor differences outlined above, gene arrangement and content was conserved in the multigenic sequences obtained from P. falciparum, T. gondii, E. tenella and Th. annulata. Currently available data indicate that the gene order in these sectors of the plastid genome is unique to the apicomplexans.

4.5.0 Phylogenetic Analysis of the tuf Gene

As mentioned in 1.4.3, the tuf gene encodes EF-Tu, a major factor in prokaryotic protein synthesis. The gene is a good candidate for phylogenetic studies because homologues are present in all known organisms, it is well conserved across a wide range of species, and the biological role and crystal structure of the protein have been elucidated (Bourne et al, 1990; Kjeldgaard and Nyborg, 1992). As discussed in 1.4.3, the tuf gene is encoded on the plastid genome of most algae, but it has been duplicated and transferred to the nucleus of higher plants and some green algae on the plant lineage, the original plastid copy being lost in some cases (Baldauf and Palmer, 1990). However, these nuclear encoded genes are easily aligned with those from plastids and can be included in phylogenetic analyses (Delwiche et al, 1995).

A tuf gene has been fully or partially sequenced from the plastid DNA of the malaria parasite P. falciparum (Wilson et al, 1996), the coccidians T. gondii and E. tenella, and the piroplasm Th. annulata. These data were aligned with tuf gene sequence from several plants, algae and bacteria, and analysed phylogenetically with the help of Dr. N. Goldman (Dept. Genetics, Cambridge University) as described in 2.12.0. The analysis was performed utilising maximum likelihood methodology after discarding the third (degenerate) codon position, and allowed for differences in base frequencies, heterogeneity of substitution rates across the sequences, and transition/transversion bias. The best estimate of the phylogenetic tree (figure 4.3) showed that the Plasmodium sequence is the most divergent of the tuf genes; it also has the highest AT content for codon positions one and two: P. falciparum (69%) > Th. annulata (62%) > T. gondii (58.5%) > E. tenella (57.5%). In the tree shown in figure 4.3, the tuf genes of Astasia longa (57%) and Euglena gracilis (51.5%) have the next highest AT contents.
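The AT content quoted for codon positions one and two can be computed as sketched below. This Python example is illustrative only: the function name is hypothetical, the toy sequence is not real tuf data, and the sketch assumes an in-frame, non-empty, ungapped nucleotide sequence. The percentages in the ranking are those reported in the text above.

```python
def at_content_positions_1_and_2(seq):
    """Fraction of A or T among first and second codon positions.

    Assumes seq starts in frame and has at least one complete codon;
    a trailing partial codon is ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    # Keep indices 0 and 1 of every codon; index 2 is the third position.
    bases = [seq[i] for i in range(usable) if i % 3 != 2]
    return sum(base in "AT" for base in bases) / len(bases)


# AT content (%) at codon positions one and two, as reported in the text.
REPORTED_AT = {
    "P. falciparum": 69.0,
    "Th. annulata": 62.0,
    "T. gondii": 58.5,
    "E. tenella": 57.5,
    "A. longa": 57.0,
    "E. gracilis": 51.5,
}

if __name__ == "__main__":
    print(at_content_positions_1_and_2("ATGGCATTT"))  # toy sequence
    for name in sorted(REPORTED_AT, key=REPORTED_AT.get, reverse=True):
        print(name, REPORTED_AT[name])
```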

Bootstrap analyses used to test the reliability of this result indicated that certain sequences could not be positioned with certainty and artificially reduced the bootstrap values assigned to other areas of the tree. Therefore, to gain a better idea of the reliability of other parts of the phylogenetic tree, a bootstrap analysis was performed excluding these sequences. It is the values of this analysis that are displayed in figure 4.3 and it should be noted that they apply only to the tree shown in this figure without the terminal branches leading to Cyanophora paradoxa, Porphyra purpurea, Codium fragile and Chlamydomonas reinhardtii.
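For readers unfamiliar with the bootstrap, the resampling step underlying such an analysis can be sketched in Python as follows. This is a generic illustration of the standard non-parametric bootstrap (alignment columns sampled with replacement), not the software actually used here; the function name and the toy alignment are assumptions. In a full analysis a tree would be re-estimated from each replicate, and the support for a grouping would be the percentage of replicates in which it recurs.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment -- dict mapping sequence name to an aligned sequence;
                 all sequences must have the same length
    rng       -- a random.Random instance (passed in for reproducibility)

    Columns are drawn with replacement, and the same columns are used for
    every sequence so that the alignment stays consistent.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][c] for c in columns) for name in names
    }


if __name__ == "__main__":
    toy = {"seq1": "ACGTAC", "seq2": "ACGAAC", "seq3": "TCGTAA"}  # toy data
    replicate = bootstrap_alignment(toy, random.Random(1))
    for name, seq in replicate.items():
        print(name, seq)
```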

It can be concluded that the apicomplexan plastid tuf genes are distant from all the others analysed, their nearest neighbours being the euglenoids, E. gracilis and A. longa. This is also reflected in their AT bias.

4.6.0 Discussion

Chapter 1 discusses several hypotheses to account for the presence of a plastid in the Apicomplexa. This chapter demonstrates that both gene order and content are highly conserved in the plastid genomes of four highly evolved apicomplexans (believed to have diverged 800 million years ago - see Escalante and Ayala, 1995). To have reached its current condensed form (it is the smallest plastid genome yet identified) and novel gene arrangement, the organellar DNA must have been subjected to many deletions and rearrangements. The conserved gene order among the Apicomplexa makes it highly unlikely that the plastid was obtained either independently by various progenitors (i.e. it is probably not polyphyletic), or that it evolved differentially in different lineages. By contrast, the plastid genomes of two parasitic higher plants, Epifagus virginiana (Beechdrops) (Wolfe, 1992a) and Conopholis americana (Squawroot) (Wimpee, 1992), provide an example of differential evolution in the Orobanchaceae. The organellar DNAs of these organisms now differ considerably in gene content and organisation due to extensive deletions and rearrangements incurred since they diverged relatively recently, 5-50 million years ago (Taylor et al, 1991; Wolfe et al, 1992b). Consequently, the data presented here for the apicomplexan plastid DNAs support the 'unitary' hypothesis, i.e. that the Apicomplexa evolved from a single photosynthetic progenitor whose plastid DNA was radically condensed, and rearranged on the adoption of a parasitic lifestyle.

As discussed in chapter 1, the provenance of the apicomplexan plastid still remains open to question. Our analysis of a portion of the tuf gene from the four apicomplexans placed it at the end of a long branch, the genera being distant from each other as well as from their nearest neighbours, the green plastid genes from the euglenoids E. gracilis and A. longa. This result is similar to previous, less extensive, studies based on the small and large subunit rRNA, and the rpoB plastid genes (Egea and Lang-Unnasch, 1995; Gardner et al, 1993 and 1994; Howe, 1992). The Euglenophyta are unicellular protozoan flagellates present in both fresh and salt water environments. Their triple membraned plastids have been hypothesised to have arisen by uptake of a green alga (Gibbs, 1978; Turner et al, 1989). Phylogenetic analyses of nuclear SSU rRNA genes show that Euglenophyta and Alveolata are distantly related (Gibbs, 1993), indicating that they obtained their plastids through independent events. However, the analysis presented here lends support to the postulation, based largely on more extensive phylogenetic analyses of the tuf gene, that the apicomplexan plastid was obtained through secondary endosymbiosis of a chlorophytic alga, like that of the euglenoids (Kohler et al, 1997) (see 1.5.3). In contrast to the results presented here, some of the analyses of Kohler et al placed the apicomplexan sequences closer to the green algae Codium or Coleochaete than to the euglenoids. As recognised by Kohler et al, the association between the divergent sequences of the apicomplexan genes and the Coleochaete tuf pseudogene (Baldauf et al, 1990) may be spurious. Such rapidly evolving sequences tend to group together falsely in what is known as a 'long branch' effect. The grouping of Codium with the apicomplexans may also be misleading: as mentioned in 4.5.0, the gene sequence of Codium could not be placed with any certainty in the analysis presented here.

Bootstrap support for relationships among the plastid clades is weak both in the results presented here and in those of Kohler et al (1997), disqualifying any bold inferences concerning the provenance of the apicomplexan plastid. However, despite uncertainty over its origin, the high bootstrap values recorded between the apicomplexan tuf genes in this study strongly suggest monophyly, thereby supporting the 'unitary' hypothesis.

Figure 4.1: Preliminary Identification of Malarial Plastid Gene Homologues

Southern blots of HindIII digested total DNA (approximately 500 ng) from P. falciparum (Pf), T. gondii (Tg), E. tenella (Et), Th. annulata (Ta), and MDCK cells, probed at a low stringency with genes from the malarial plastid:

A. clpC

B. tuf

C. ORF470

Faint bands are designated by *.

The two bands noted in C with Pf are due to a HindIII site within the ORF470 gene.

[Figure 4.1 image: Southern blot panels with lanes Pf, Tg, Et, Ta and MDCK; size markers at 20, 6.4, 5.6 and 5.2 kb]

Figure 4.2: Conservation of Gene Order in Apicomplexan Plastid Genomes

A. Diagram (not to scale) showing conservation of gene order on the plastid DNAs of P. falciparum (Pf), T. gondii (Tg), and E. tenella (Et). In P. falciparum this region, spanning approximately 6.3 kb, encodes one arm of the inverted repeat and the ORF470 gene (see Wilson et al, 1996). Arginyl (arg) tRNA genes with differing anticodons are distinguished.

B. Diagram (not to scale) showing the gene arrangement around the tuf gene of P. falciparum (Wilson et al, 1996), T. gondii, E. tenella, and Th. annulata (Ta) plastid genomes. In P. falciparum, this region covers approximately 3.0 kb. With the primers used, PCR products were not obtained for the regions designated by dotted lines. ORFs of uncertain function are queried (?).

The lengths of intergenic regions are given in number of nucleotides.

The 'ragged' ends show the limits of sequence data, or PCR amplified template.

[Figure 4.2 image: gene maps for Pf, Tg and Et in panel A (ORF470, tRNA, LSU rRNA and SSU rRNA genes) and for Pf, Tg, Et and Ta in panel B (rpl11, tRNA, ORF78, tuf, rps7 and rps12 genes), with intergenic distances in nucleotides]

Figure 4.3: Phylogenetic Tree Based on Plastid Sequence Data

A tree derived from a maximum likelihood analysis of tuf gene sequences from:

• The apicomplexan plastid genomes of Plasmodium falciparum (Accession number: X95276), Theileria annulata, Eimeria tenella, and Toxoplasma gondii;

• the algal plastid genomes of the glaucophyte Cyanophora paradoxa (X52497); the rhodophyte Porphyra purpurea (U38804); the cryptophyte Cryptomonas phi (X52912); the chromophyte Odontella sinensis (Z67753); the chlorophytes Chlamydomonas reinhardtii (X52257) and Codium fragile (U09427); and the euglenoids Astasia longa (X14385) and Euglena gracilis (Z11874);

• the nuclear encoded plastid genes of the higher plants Glycine max (soybean) (X66062) and Nicotiana sylvestris (wood tobacco) (D11469);

• the cyanobacterium Anacystis nidulans (X17442);

• the prokaryotes Mycobacterium tuberculosis (X63539) and Escherichia coli (J01690).

Dotted lines show the associations of organisms excluded from the bootstrap analysis.

Scale bar indicates length corresponding to 0.1 substitutions/site.

[Figure 4.3 image: maximum likelihood tree of the taxa listed above, with a scale bar of 0.1 substitutions/site; branching order and bootstrap values are not recoverable from the extracted text.]

CHAPTER 5: PROTEIN ENCODING GENES

5.1.0 Introduction

The sequence analyses presented in chapter 4 encompassed several protein encoding genes of interest first identified on the malarial plastid genome, namely, ORF470, a tuf gene, and two ribosomal protein genes, rps7 and rps12. Homologues of the malarial ORF129 and ORF78 were also identified in T. gondii and E. tenella. These genes are further analysed here with respect to other plastid and prokaryotic versions.

5.2.0 ORF470

Homologues of the P. falciparum plastid-encoded ORF470 have been mapped and partially sequenced from T. gondii and E. tenella. The ORF sequences encode a putative organellar protein of unknown function. Clustal and cladogram analyses (figure 5.1) show that the apicomplexan polypeptides are more closely related to each other than to similar, highly conserved algal plastid and cyanobacterium sequences.

However, the high level of similarity between all plastid, cyanobacterium and E. coli forms is apparent in the alignment, particularly downstream of position 140. The divergent Mycobacterium ORF encodes a potential intein (protein intron), a sequence which is posttranslationally excised while the flanking regions are spliced together, creating an additional protein product (Pietrokovski, 1994). Inteins do not appear to be present in any of the other homologues. It was noted that T. gondii and E. tenella encode a TGA stop codon in place of a conserved tryptophan codon at different positions in the ORF. It is proposed that these 'stop' codons specify tryptophan for the reasons discussed below and in chapter 7.

Transcripts of the T. gondii gene were undetectable by Northern analysis. However, the more sensitive RT-PCR method demonstrated the presence of mRNA (see figure 5.7).

5.3.0 The tuf Gene

The tuf encoded protein, EF-Tu, is a guanine nucleotide-binding factor which when bound to GTP (the 'on' state) acts as a carrier of aminoacyl-tRNA to the ribosome. GTP hydrolysis at the ribosome leads to the release of EF-Tu:GDP (the 'off' state), facilitating the formation of a peptide bond (reviewed, Weijland et al, 1992). A tuf gene has been mapped and fully sequenced on the plastid genome of P. falciparum (Wilson et al, 1996). In the present study homologues were characterised in T. gondii, E. tenella and Th. annulata. The phylogenetic analysis in 4.5.0 has demonstrated how divergent these sequences are compared with other prokaryotic and plastid versions.

However, several highly conserved functional domains are maintained in all sequences studied, suggesting that the apicomplexan EF-Tu is active in organellar protein synthesis. In E. coli the elements involved in binding the phosphoryl, Mg2+, and guanosine residues of GTP are G19HVDHGK25, D83CPG86, N138KCD141, and S176AL178 (Kjeldgaard and Nyborg, 1992) (figure 5.2). The apicomplexan sequences demonstrate only two substitutions in these regions, V21I in Th. annulata, and C140E and C140I in P. falciparum and the coccidians. The residues defining the GDP binding pocket are also conserved (G24, N138, K139, D141, S176, L178 in E. coli). In a less conserved region close to the GTP binding domain (residues 183 to 192) the malaria sequence has a specific insertion like other plastid EF-Tu proteins. However, the coccidians, T. gondii and E. tenella, appear to have lost part of this insertion, which forms a loop of unknown function in the protein.
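Substitutions quoted in the form 'V21I' can be tabulated mechanically once a homologue has been aligned against the E. coli motif. The following Python sketch is illustrative only and formed no part of the original analysis; the function name motif_substitutions is hypothetical, and the motifs are assumed to be gap-free and of equal length.

```python
def motif_substitutions(reference, homologue, start):
    """List differences between two equal-length, gap-free motifs.

    Each difference is reported as <reference residue><position><new residue>,
    where positions follow the reference (here E. coli) numbering and
    `start` is the position of the first residue of the motif.
    """
    if len(reference) != len(homologue):
        raise ValueError("motifs must be the same length")
    return [
        f"{ref}{start + offset}{hom}"
        for offset, (ref, hom) in enumerate(zip(reference, homologue))
        if ref != hom
    ]


# E. coli element G19HVDHGK25 compared with a homologue carrying the
# V21I change reported in the text for Th. annulata.
print(motif_substitutions("GHVDHGK", "GHIDHGK", 19))  # ['V21I']
```

An empty list indicates that the motif is fully conserved in the homologue.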

Again, Northern analysis failed to show the presence of transcripts from the T. gondii tuf gene, but mRNA has been detected using RNase protection assays (P. Preiser, personal communication).

5.4.0 ORF78

Figure 5.3 illustrates the limited conservation between the predicted peptide products of ORFs located downstream of the tuf gene in the apicomplexans, and ORF57 encoded on the vestigial plastid genome of the non-photosynthetic euglenoid, A. longa. Whether these ORFs encode functional proteins is unknown. However, the identification of a euglenoid homologue adds some weight to the phylogenetic analysis presented in 4.5.0. A TGA 'stop' codon within the T. gondii sequence has been designated as encoding tryptophan (discussed below and in chapter 7).

5.5.0 The Ribosomal Protein Genes

Genes encoding the ribosomal proteins S7 (rps7) and S12 (rps12) have been mapped and sequenced from P. falciparum (Wilson et al, 1996). In the present study, homologues were characterised in T. gondii, E. tenella and Th. annulata. However, as discussed in chapter 4.3.0, the Th. annulata rps7 homologue, ORF204, remains to be definitively identified as a ribosomal protein gene. In E. coli ribosomes, S7 interacts with several nucleotide clusters in the 3' domain of the SSU rRNA. It is one of the initiating proteins in the assembly of the 30S ribosomal subunit (reviewed, Harris et al, 1994). S12 interacts with two regions of the SSU rRNA (the 530 loop and 900 stem-loop); it appears to be involved in proof-reading and the binding of EF-Tu (reviewed, Harris et al, 1994).

The putative apicomplexan S7 proteins are highly derived when compared with other prokaryotic and plastid sequences (figure 5.4). Despite failing to be identified as an rps7 gene in an EMBL search, the Th. annulata ORF204 predicted peptide product demonstrated some similarity to the apicomplexan S7 sequences: 20.3%, 19.7% and 20.4% with E. tenella, T. gondii, and P. falciparum respectively (see figure 5.4B). This suggests that ORF204 could represent a highly derived rps7 gene. As with the ORF470 homologue, it was noted that the T. gondii gene encodes a TGA stop codon in place of a conserved tryptophan codon (see below and chapter 7).

Transcripts of the T. gondii rps7 gene have been detected by RT-PCR (see figure 5.7).

S12 is, universally, the most highly conserved small subunit protein (Harris et al, 1994). The predicted sequence of the P. falciparum S12 ribosomal protein corroborates this (Wilson et al, 1996). A short C-terminal region has been sequenced in T. gondii, E. tenella and Th. annulata (see figure 5.5).

A homologue of the P. falciparum plastid ORF129 has been located in the T. gondii extrachromosomal DNA (see 4.4.0). A search of the EMBL database indicated that the T. gondii gene encodes a highly derived ribosomal protein L11, suggesting that ORF129 should be regarded as an rpl11 gene. L11 is an early-assembly protein which binds to the highly conserved GTPase centre of the LSU rRNA (Ryan et al, 1991). Figure 5.6 illustrates the divergence of the predicted apicomplexan proteins and L11 polypeptides.

5.6.0 Protein Analysis

As described in 2.16.0-2.18.0, polyclonal antibodies were produced against a GST-T. gondii ORF470 fusion protein and used in Western blot analyses. Figure 5.8 shows that the polyclonal antisera recognised purified fusion protein and, surprisingly, a protein of the same size from non-induced transformed bacteria. Purified GST was not recognised, demonstrating that the antisera were specific for the ORF470 portion of the fusion protein. However, native protein was not detected in either T. gondii or P. falciparum preparations. Preimmune sera showed no reactivity against any of the samples, and anti-GST sera recognised purified GST and fusion proteins from both induced and non-induced bacteria.

The detection of fusion protein in non-induced bacterial preparations was unexpected, and no explanation was found to account for its expression in these cells. The fact that native protein was not recognised in parasite preparations suggested that either the antisera were not specific for the polypeptide, or that ORF470 was not expressed in detectable amounts in cell cultures. However, monoclonal antibodies against the T. gondii rhoptry protein, Rop1 (provided by Dr. J.C. Boothroyd, Stanford University, U.S.A.), also failed to detect a polypeptide in the parasite preparations, indicating that the assay was at fault. Unfortunately, due to time constraints, further investigations were not possible.

5.7.0 Discussion

The divergent nature of the apicomplexan plastid alluded to in chapter 4 is reiterated in this study of putative protein sequences. Indeed, lack of conservation led to uncertainty over the assignment of the ORF78 homologues, the Th. annulata ORF204, and the malarial ORF129. However, the overall maintenance of conserved open reading frames, when augmented by conservation of gene order (see chapter 4), supports the 'unitary' hypothesis and argues for plastid functionality.

The presence of functional regions in the putative EF-Tu protein of the Apicomplexa suggests that the plastid is active in protein synthesis. EF-Tu is susceptible to several antibiotics (reviewed, Weijland et al, 1992). The apicomplexan proteins demonstrate none of the mutations recorded in the literature as conferring resistance to GE2270 (Sosio et al, 1996), pulvomycin (Boon et al, 1995), or kirromycin (Mesters et al, 1994; Abdulkarim et al, 1994). Kirromycin has proven to be effective against P. falciparum in vitro (M. Strath, personal communication), providing indirect evidence of organellar protein synthesis. Similarly, the maintenance of genes for the ribosomal proteins S7, S12, and L11 indicates that the apicomplexan plastid contains functional ribosomes. Unlike S7 and S12, L11 is encoded in the nucleus of higher plants, but on the plastid genome of the red alga Porphyra purpurea, and on the Cyanophora paradoxa cyanelle DNA (Harris et al, 1994). The organellar location of the putative rpl11 gene in the apicomplexans is, again, algal-like.

ORF470 is one of the few genes not thought to be involved in expression of the malarial plastid genome, and it has been postulated to serve a vital, as yet unknown, functional role in the organelle (Wilson et al, 1994; Williamson et al, 1994). The fact that well conserved homologues are encoded by the plastid DNA of T. gondii and E. tenella reinforces this hypothesis, but does not allude to any function.

Any functional considerations are brought into question by the presence of TGA stop codons at conserved tryptophan sites in the putative protein sequences of ORF470 in T. gondii and E. tenella, and S7 in T. gondii. The conservative nature of these apparent nonsense codons, i.e. a G to A base change at the third position of a TGG tryptophan codon, led to the consideration of suppression mechanisms (see 7.5.0). The predicted T. gondii ORF78 peptide also encodes a TGA stop codon. For the sake of simplicity all coccidian TGA codons have been regarded as encoding tryptophan in this chapter.
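The effect of reading TGA as tryptophan can be illustrated with a short translation routine. The Python sketch below is an illustration under stated assumptions and is not the software used in this study: the function name translate and its tga_as_trp flag are hypothetical, the codon table is the standard genetic code, and the example sequence is invented rather than taken from any of the genes described.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, tga_as_trp=False):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    With tga_as_trp=True, TGA is read as tryptophan (W), which is the
    convention adopted for the coccidian sequences in this chapter.
    Codons containing unrecognised characters are reported as 'X'.
    """
    table = dict(STANDARD_CODE)
    if tga_as_trp:
        table["TGA"] = "W"
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Invented example: ATG TGG TGA AAA TAA
example = "ATGTGGTGAAAATAA"
print(translate(example))                   # MW*K*
print(translate(example, tga_as_trp=True))  # MWWK*
```

Only the internal TGA is affected; the terminal TAA is still read as a stop in both cases.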

Direct evidence for apicomplexan plastid proteins remains to be gathered, but transcriptional studies have shown that the tuf, rps7 and ORF470 genes described here are all transcribed in erythrocytic forms of P. falciparum (Wilson et al, 1996) and T. gondii tachyzoites. In itself this may be taken as indirect evidence of functionality.

Figure 5.1: Analysis of ORF470 Homologues

A. Predicted protein products from the Plasmodium falciparum ORF470 (Accession number: X95275), and Toxoplasma gondii and Eimeria tenella partially sequenced homologues, aligned in a Clustal analysis with similar, fully characterised genes from:

• The algal plastids of the glaucophyte Cyanophora paradoxa (P48260), the chromophyte Odontella sinensis (P49530), and the rhodophyte Porphyra purpurea (P51240);

• the cyanobacterium Synechocystis PCC6803 (D64004);

• the non-photosynthetic prokaryotes Mycobacterium leprae (U00013), and Escherichia coli (D90812).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the alignment.

The arrows above sites 306 and 348 indicate positions of a TGA stop codon predicted to encode a conserved tryptophan residue in T. gondii and E. tenella respectively.

The arrow downstream of position 423 indicates the location of one of the primers used to amplify this region in T. gondii and E. tenella.

Sites with a majority of identical amino acids are shaded.

B. Protein sequence common to all organisms analysed (between the arrows below sites 1 and 423) was used to generate the cladogram.

The scale is an estimate of the number of nucleotide substitutions between each node.

[Figure 5.1 image: panel A, Clustal alignment of the ORF470 homologues from Eimeria, Toxoplasma, Plasmodium, Cyanophora, Synechocystis, Odontella, Porphyra, E. coli and Mycobacterium; panel B, cladogram of the same taxa. The residue-level alignment is not recoverable from the extracted text.]

Figure 5.2: Alignment of EF-Tu Proteins

Predicted EF-Tu polypeptides from Plasmodium falciparum (Accession number: X95276), Theileria annulata (partial sequence), Eimeria tenella, and Toxoplasma gondii, aligned in a PILEUP analysis with homologues from:

• The algal plastids of the glaucophyte Cyanophora paradoxa (X52497); the rhodophyte Porphyra purpurea (U38804); the cryptophyte Cryptomonas phi (X52912); the chromophyte Odontella sinensis (Z67753); the chlorophytes Chlamydomonas reinhardtii (X52257) and Codium fragile (U09427); and the euglenoids Astasia longa (X14385) and Euglena gracilis (Z11874);

• the nuclear encoded plastid genes of the higher plants Glycine max (soybean) (X66062) and Nicotiana sylvestris (wood tobacco) (D11469);

• the cyanobacterium Anacystis nidulans (X17442);

• the non-photosynthetic prokaryotes Mycobacterium tuberculosis (X63539) and Escherichia coli (J01690).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the PILEUP.

Consensus elements are underlined.

Arrow located downstream of position 134 indicates one of the primers used to amplify the tuf gene fragment from Th. annulata.

Sites with a majority of identical amino acids are shaded.

[Figure 5.2 image: PILEUP alignment of EF-Tu sequences from Plasmodium, Theileria, Eimeria, Toxoplasma, Mycobacterium, E. coli, Cyanophora, Porphyra, Anacystis, Cryptomonas, Odontella, Glycine, Nicotiana, Astasia, Euglena, Chlamydomonas and Codium. The residue-level alignment is not recoverable from the extracted text.]

Figure 5.3: Analysis of ORF78 Homologues

A. Clustal alignment of the Plasmodium falciparum ORF78 predicted polypeptide (Accession number: X95276), Eimeria tenella and Toxoplasma gondii homologues, and the plastid encoded ORF57 from the euglenoid Astasia longa (P34774).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the alignment.

Arrow above position 37 indicates a T. gondii TGA codon predicted to encode tryptophan.

Sites with a majority of identical amino acids are shaded.

B. Sequence pair distances based on alignment in A.

[Figure 5.3 image: panel A, Clustal alignment of the Astasia, Eimeria, Plasmodium and Toxoplasma sequences; panel B, percent similarity matrix. Neither panel is recoverable from the extracted text.]

Figure 5.4 [title not recovered]

A. Predicted protein sequences from the rps7 genes of Plasmodium falciparum (Accession number: X95276), Eimeria tenella and Toxoplasma gondii, and the Th. annulata ORF204, aligned in a Clustal analysis with homologues from:

• The plastids of the cryptophyte Cryptomonas phi (P19458); and the plants Beechdrops (Epifagus virginiana) (P30057), common tobacco (Nicotiana tabacum) (P06361), and liverwort (Marchantia polymorpha) (P06360);

• the non-photosynthetic prokaryote Bacillus stearothermophilus (P22744).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the alignment.

The arrow above site 147 indicates the position of a TGA stop codon predicted to encode a conserved tryptophan residue in T. gondii.

Sites with a majority of identical amino acids are shaded.

B. Sequence pair distances based on alignment in A.

[Figure 5.4 image: panel A, Clustal alignment of S7 sequences from Beechdrops, Tobacco, Liverwort, Bacillus, Cryptomonas, Eimeria, Toxoplasma, Plasmodium and Theileria; panel B, percent similarity matrix for the same nine sequences. Neither panel is fully recoverable from the extracted text.]

Figure 5.5 [title not recovered]

Predicted protein sequences from the fully characterised Plasmodium falciparum rps12 gene (Accession number: X95276), and the partially sequenced Eimeria tenella, Toxoplasma gondii, and Th. annulata genes, aligned in a PILEUP analysis with homologues from:

• The algal plastids of the cryptophyte Cryptomonas phi (P19461); the glaucophyte Cyanophora paradoxa (P17294); and the chromophyte Odontella sinensis (P4900);

• the plastid of the higher plant tobacco (Nicotiana sp.) (P06369);

• the non-photosynthetic prokaryotes Mycobacterium tuberculosis (P41196) and Escherichia coli (P02367).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the PILEUP.

Arrow located upstream of position 97 indicates one of the primers used to amplify the rps12 gene fragment from T. gondii, E. tenella and Th. annulata.

Sites with a majority of identical amino acids are shaded.

[Figure 5.5 image: PILEUP alignment of S12 sequences from Cryptomonas, Cyanophora, Odontella, Tobacco, Mycobacterium, E. coli, Eimeria, Toxoplasma, Plasmodium and Theileria. The residue-level alignment is not recoverable from the extracted text.]

Figure 5.6: Analysis of Putative L11 Proteins

A. The predicted Plasmodium falciparum ORF129 polypeptide (Accession number: X95276), and the putative L11 protein from Toxoplasma gondii, aligned in a Clustal analysis with homologues from:

• The algal plastids of the glaucophyte Cyanophora paradoxa (P48126), and the chromophyte Odontella sinensis (P49549);

• the nuclear encoded plastid gene of spinach (Spinacia oleracea) (P31164);

• the cyanobacterium Synechocystis PCC6803 (P36237);

• the non-photosynthetic prokaryotes Bacillus subtilis (Q06796) and Staphylococcus carnosus (P36254).

Predicted polypeptide sequences are written using the single letter code.

Numbers correspond to amino acid positions in the alignment.

Sites with a majority of identical amino acids are shaded.

B. Sequence pair distances based on alignment in A.

[Figure 5.6 image: panel A, Clustal alignment of L11 sequences from Bacillus, Staphylococcus, Synechocystis, Cyanophora, Odontella, Spinach, Toxoplasma and Plasmodium; panel B, percent similarity matrix for the same eight sequences. Neither panel is fully recoverable from the extracted text.]

Figure 5.7 [title not recovered]

Lane 1 - experiment; lane 2 - RNase treated (negative control); lane 3 - no reverse transcriptase (negative control); lane 4 - DNA template (positive control).

123 markers (Gibco BRL) are on the left of both gels.

A. Demonstrating ORF470 transcripts.

B. Demonstrating rps7 transcripts.


Figure 5.8: ORF470 Protein Analysis

A Western blot of purified GST and fusion protein (FP), preparations from non-induced bacteria (non-i), T. gondii (Tg(1) and Tg(2)) and P. falciparum (Pf), and T. gondii prepared in the absence of the reducing agent DTT (Tg(1)*), probed with anti-fusion protein polyclonal antibodies.

The fusion protein migrates slightly above the 32.5 kD molecular weight marker (FP and non-i); what were assumed to be degradation products run below this. The antisera are cross-reactive with several proteins (note large bands in FP and non-i, and reactivity in Pf).

[Figure 5.8 image: Western blot with molecular weight markers at 83, 62.5, 47.5, 32.5, 25 and 16.5 kD.]

CHAPTER 6: RIBOSOMAL RNA GENES

6.1.0 Introduction

Plasmodium spp. are unique in possessing structurally distinct, developmentally regulated, nucleus-encoded cytoplasmic rRNAs (see Waters et al, 1995). In addition, their two extrachromosomal DNAs encode two distinct sets of organellar rRNAs: the fragmented mitochondrial rRNAs (Feagin et al, 1992), and the plastid rRNAs encoded in the IR (Gardner et al, 1994). The plastid DNAs of both T. gondii and E. tenella encode rRNAs in an IR with an identical arrangement to that of P. falciparum (4.3.0).

However, whereas the E. tenella IR has only been mapped and partially sequenced utilising PCR technology, comprehensive data are available from T. gondii. In this chapter the conservation of core (i.e. functional) regions of the T. gondii plastid rRNA genes, and their possible susceptibility to certain antibiotics, are discussed.

6.2.0 Conservation and Transcription of the Ribosomal RNA Genes

Primers based on conserved sequences were used to amplify by PCR a large portion of the T. gondii SSU rRNA plastid gene (see figure 6.1). Sequence analysis demonstrated that while the overall level of identity is 44% and 53% with E. coli and P. falciparum respectively, the core regions (Cedergren et al, 1988) are 74% and 80% conserved. Core regions have previously been noted to be well conserved in the malarial gene homologue (Gardner et al, 1991b).

The complete T. gondii LSU rRNA plastid gene has been amplified by PCR and sequenced (figure 6.2). The overall level of conservation with respect to E. coli and P. falciparum is 36% and 45%, but in core regions (Cedergren et al, 1988) it is 73% and 72% respectively. Gardner et al (1993) have previously noted a high level of conservation (75%) in the core regions of the P. falciparum plastid LSU rRNA gene with respect to E. coli.
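The distinction between overall identity and identity within core regions can be made concrete with a small calculation over an alignment. The Python sketch below is illustrative only and was not used to produce the figures quoted above; the function name percent_identity is hypothetical, the two sequences and the list of core columns are invented, and the exclusion of gapped columns is an assumption, since the treatment of gaps is not specified here.

```python
def percent_identity(seq_a, seq_b, columns=None):
    """Percent identity between two aligned sequences of equal length.

    `columns` optionally restricts the comparison to a subset of alignment
    columns (0-based), for example those falling in core regions.
    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must be the same length")
    if columns is None:
        columns = range(len(seq_a))
    compared = identical = 0
    for i in columns:
        a, b = seq_a[i], seq_b[i]
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Invented ten-column alignment and invented 'core' columns.
query = "ACGUACGUAC"
reference = "ACGAUCGUUC"
core_columns = [0, 1, 2, 3, 5, 6, 7, 9]

print(percent_identity(query, reference))                # 70.0
print(percent_identity(query, reference, core_columns))  # 87.5
```

As in the rRNA comparisons, the core figure exceeds the overall figure when the mismatches are concentrated outside the core columns.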

After transfer to a nylon membrane, the organellar rRNAs were detected using specific radio-labelled probes. Both molecules migrate in a denaturing gel according to their predicted size, smaller than the nuclear encoded rRNAs which are visible after methylene blue staining (figure 6.3).

6.3.0 Antibiotic Susceptibility

Gardner et al (1991b) showed that the predicted secondary structure of the malarial SSU rRNA has binding sites for tetracycline and streptomycin. As discussed (1.9.0), tetracycline has been used in the treatment of malaria (World Health Organisation, 1973), but streptomycin has proved ineffective in vitro (Geary and Jensen, 1983). The T. gondii gene maintains sites for these drugs (figure 6.1), suggesting the SSU rRNA may be a target for the tetracycline class antibiotics, which have proved effective against the parasite both in vitro and in vivo (Luft and Remington, 1992). Beckers et al (1995) suggested the mitochondria as the site of action, but no genomic data are currently available to examine this hypothesis at the molecular level.

As discussed (1.9.0), Beckers et al (1995) have predicted that the peptidyl transferase region of the T. gondii LSU rRNA plastid genes serves as the functional target for the lincosamide/macrolide class of antibiotics (figure 6.2).

The thiazole family of antibiotics (e.g. thiostrepton) inhibit binding of the EF-G:GDP, and the EF-Tu:aminoacyl-tRNA:GTP complexes to the ribosome in prokaryotic systems (Modolell et al, 1971). They bind co-operatively with the ribosomal protein L11 (Thompson et al, 1979) to the GTPase centre of the LSU rRNA (Ryan et al, 1991). The GTPase centre has a secondary structure conserved in all eubacterial, archaebacterial and eukaryotic cytoplasmic, chloroplast and mitochondrial LSU rRNAs so far investigated (Leffers et al, 1987). In this region, nucleotides A1067 and A1095 (E. coli numbering) are predicted to interact directly with thiazole antibiotics (Rosendahl and Douthwaite, 1994). A1067 is conserved in eubacterial, archaebacterial and chloroplast LSU rRNAs, but a G occurs at the equivalent position in most eukaryotic cytoplasmic ribosomes. Mutagenesis of this nucleotide confers thiostrepton resistance on halobacteria in vivo (Hummel and Bock, 1987) and E. coli ribosomes in vitro (Thompson et al, 1988). In contrast, mutating the equivalent base from G to A makes normally insensitive eukaryotes susceptible to thiostrepton (Uchiumi et al, 1995). Given the high level of conservation observed across a range of bacterial and organellar GTPase centres, one would expect the apicomplexan plastid to be sensitive to thiostrepton (and other thiazole antibiotics). Indeed, based on the phylogenetically conserved model presented by Ryan et al (1991), the malarial plastid GTPase centre would appear to be susceptible, whilst the mitochondrial and nuclear LSU rRNAs would be resistant (McConkey et al, 1997) (figure 6.4). Similarly, it would be predicted that the T. gondii nuclear GTPase centre (Gagnon et al, 1996) is unable to bind thiostrepton (no data have been published concerning the mitochondrial genome) (figure 6.4). However, the equivalent region in the T. gondii plastid LSU rRNA has a U at position 1077 (E. coli numbering) which, according to E. coli mutagenesis studies (Ryan et al, 1991), would prevent thiostrepton binding to the GTPase centre and confer resistance (figure 6.4). It has been shown that thiostrepton binds to short RNA transcripts derived from the sequence of the malarial GTPase centre, and that U1077 precludes this in T. gondii (Clough et al, 1997). However, whilst P. falciparum is susceptible to the drug in vitro (McConkey et al, 1997), its efficacy against T. gondii has yet to be established.

6.4.0 Discussion

The T. gondii plastid rRNA genes are both transcribed. Together with the observation that their core regions are well conserved with respect to E. coli, this suggests that the rRNAs form part of functional ribosomes, providing further evidence that the plastid performs some role. McFadden et al (1996) have noted the presence of ribosome-like structures in the T. gondii plastid organelle.

The elucidation of drug sites within the apicomplexan plastid SSU and LSU rRNAs may lead to new therapies for this important group of parasites. The efficacy of thiostrepton with regard to the possibly resistant T. gondii parasite is currently being investigated in this laboratory (Dr. R.J.M. Wilson, personal communication). The identification of putative rplW plastid genes (5.5.0) may aid binding studies since thiostrepton and the ribosomal protein L11 interact co-operatively with the GTPase centre (Xing and Draper, 1996). However, the LSU rRNA substitution U1077 (E. coli numbering), predicted to confer thiostrepton resistance on the T. gondii plastid, also reduces the binding affinity of L11 four-fold in E. coli (Ryan et al, 1991). L11 stimulates a set of ribosomal activities associated with several protein synthesis factors that bind the large subunit. Ribosomes lacking L11 synthesise protein two-fold more slowly than normal ribosomes and are correspondingly defective in the EF-G-dependent GTPase (Stark and Cundliffe, 1979). L11-deficient ribosomes are also defective in release factor 1 (RF1)-dependent termination (Tate et al, 1984), and completely inactive in binding stringent factor, which is responsible for the synthesis of ppGpp and pppGpp in the stringent response (Parker et al, 1976; Smith et al, 1980).

It is unknown whether the putative T. gondii L11 binds normally to the plastid GTPase centre. If not, one might assume that the organelle would be at least partially disabled, bringing into question any assumptions concerning functionality.

5S rRNA genes are usually located downstream of the LSU rRNA genes of eubacteria and plastids (Delp and Kossel, 1991; Kossel, 1991). As discussed in 1.4.4 the malarial plastid genome does not encode this rRNA species. Similarly, no 5S rRNA gene has been identified on the T. gondii organellar genome; it remains to be seen if the plastid ribosomes of the apicomplexans lack a 5S rRNA.

Figure 6.1: SSU rRNA Sequence Alignment

Clustal alignment of SSU rRNA gene sequence from Escherichia coli (Accession number: J01695), Plasmodium falciparum (X95275), and Toxoplasma gondii.

The consensus is shown above the gene sequences. Residues identical to this are shaded.

The numbers correspond to nucleotide positions in the alignment.

The core regions (Cedergren et al, 1988) are denoted by heavy lines above the consensus.

Conserved nucleotides suggesting susceptibility to streptomycin (s) and tetracycline (t) are indicated above the consensus at positions 530, 907 and 927.

The arrow below the T. gondii sequence downstream of position 1540 shows the position of one of the primers used to amplify this gene.

[Figure 6.1 image: aligned SSU rRNA gene sequences of E. coli, Plasmodium and Toxoplasma beneath the consensus; the alignment itself is not recoverable from this copy.]

Figure 6.2: LSU rRNA Sequence Alignment

Clustal alignment of LSU rRNA gene sequence from Escherichia coli (Accession number: J01695), Plasmodium falciparum (X95275), and Toxoplasma gondii.

The consensus is shown above the gene sequences. Residues identical to this are shaded.

The numbers correspond to nucleotide positions in the alignment.

The core regions (Cedergren et al, 1988) are denoted by heavy lines above the consensus.

The peptidyl transferase region lies largely within the core sequences between positions 1956-2131 and 2498-2681. The conserved sites which led Beckers et al (1995) to suggest that this region is the functional target of the lincosamide/macrolide class of antibiotics are indicated above the consensus by arrow heads.

The core region between positions 1101-1148 covers much of the highly conserved GTPase centre. Three sites within this region vital for thiostrepton (th) binding are indicated above the consensus at positions 1112, 1122 and 1140. However, whereas 1112 and 1140 are conserved in the T. gondii sequence, 1122 is not.

[Figure 6.2 image: aligned LSU rRNA gene sequences of E. coli, Plasmodium and Toxoplasma beneath the consensus; the alignment itself is not recoverable from this copy.]

Figure 6.3: Northern Analysis of rRNA

Northern blots of total T. gondii RNA probed with plastid LSU (A) and SSU (B) rRNA genes.

The transcripts of approximately 2.5 kb (A) and 1.5 kb (B) correspond to the plastid LSU and SSU rRNA respectively. The nuclear versions of these species were visible as transcripts of approximately 3.3 kb and 1.7 kb on staining with methylene blue.

[Figure 6.3 image: Northern blot panels A and B, with size markers labelled 3.3, 2.5, 1.7 and 1.5 kb.]

Figure 6.4: Susceptibility of Apicomplexan GTPase Centres to Thiostrepton

The predicted secondary structure of the plastid LSU rRNA GTPase centre of Plasmodium falciparum and Toxoplasma gondii (E. coli numbering). The substitution sites affecting the binding of thiostrepton are circled. The corresponding nucleotides in the cytosolic LSU rRNA of P. falciparum and T. gondii, and in the mitochondrial LSU rRNA of P. falciparum are denoted.

[Figure 6.4 image: secondary-structure diagrams of the Plasmodium falciparum and Toxoplasma gondii plastid GTPase centres, annotated with the corresponding mitochondrial and cytosolic nucleotides.]

CHAPTER 7: CODON USAGE AND TRANSFER RNAs

7.1.0 Introduction

The P. falciparum 35 kb circular DNA encodes 25 different tRNA genes all of which are transcribed. A comparison of codon and anticodon usage indicates that these adapter molecules are sufficient to decode all the protein genes present on the circle (Preiser et al, 1995). This provides evidence for a parsimonious but complete translation system for the highly condensed plastid genome. This chapter investigates the translation mechanisms utilised by the coccidian plastids.

For clarity all the codon data derived from primary gene sequence is described in terms of the coding DNA strand, and all predicted tRNAs are illustrated as transcribed, unmodified sequence.

7.2.0 Plastid Encoded Transfer RNAs

The cluster of tRNA genes within the apicomplexan plastid IR is completely conserved in P. falciparum, T. gondii and E. tenella (see 4.3.0). Another cluster located downstream of the tuf gene in malaria is also conserved, with the exception of a tRNA®'^ gene, in T. gondii (see 4.4.0). Similarly, E. tenella has been shown to encode a tRNA^'’® gene downstream of the tuf gene (see 4.4.0). All the genes encode tRNAs that fold with minimal free energies into the typical cloverleaf structures characteristic of these adapter molecules (figures 7.1 and 7.2). tRNA molecules possess characteristically conserved nucleotides (reviewed by McClain, 1993): a U between the acceptor stem and D loop, two A residues and a GG dinucleotide in the D loop, a U in the anti-codon loop, and the tetranucleotide GUUC, an A and a C in the T loop and stem. Several of the coccidian plastid tRNAs have changes at some of these sites (see figures 7.1 and 7.2); whether any of these would affect the identity of the adapter molecules is unknown. It remains to be seen whether these genes are transcribed.

7.3.0 Amino Acid Frequencies and Codon-Usage

The AT bias in P. falciparum circle-specific codons is even greater than that reported for the malaria nuclear-encoded proteins (Preiser et al, 1995) (see 1.7.0). Similarly, the plastid genomes of T. gondii and E. tenella are also AT rich; P. falciparum (82.5%) > T. gondii (78.1%) > E. tenella (77.3%) for the ORF470, tuf and ribosomal protein gene sequences used in this study. Like the malarial 35 kb circle, the plastid DNAs of these organisms demonstrate a codon bias, illustrated in tables 7.1 and 7.2 with respect to nuclear encoded proteins. The amino acids Ile, Leu and Lys constitute approximately a third of all codons in both T. gondii and E. tenella. Thr, Asn, Phe, Ser, Gly and Tyr residues are encoded by 38% and 36.6% of codons respectively. Of the codons for these nine frequent amino acids, eight have an A or T in the first position, the exception being Gly, which is present at a significantly lower frequency than in the nuclear-encoded proteins of both organisms. Six have an A or T in the second position, the exceptions being Thr, Ser, and Gly again. Significantly, the three most common amino acids used, Ile, Leu and Lys, have A or T in positions one and two of their codons. Only 3.6% of the codons for these nine amino acids in T. gondii, and 2.7% in E. tenella, have a G or C in the third position.

The remaining eleven amino acids range in frequency from Glu (5.3% in T. gondii and 4.7% in E. tenella) to Trp (0.6% and 0.7%). Contrary to convention, the Trp residues are denoted as being encoded by two codons, the usual TGG and the opal stop TGA; this is discussed further later. Based solely on the high AT content of the plastid genome, the frequency of Gly residues in T. gondii and E. tenella (5.9% and 6.0%) is high, given that the first two positions of the codon are GG.
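The kind of tally behind these percentages can be illustrated with a minimal sketch. It assumes a coding-strand DNA sequence read in frame from its first base; the function names and the toy sequence are hypothetical and are not taken from the thesis data.

```python
from collections import Counter


def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons on the coding strand (a trailing partial codon is ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def third_position_gc(counts: Counter) -> float:
    """Return the percentage of counted codons that end in G or C."""
    total = sum(counts.values())
    gc = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return 100.0 * gc / total


# Hypothetical toy coding sequence, used only to exercise the functions.
toy_cds = "ATGAAATTAATTGGTTGA"
toy_counts = codon_counts(toy_cds)
print(f"AT content: {at_content(toy_cds):.1f}%")
print(f"Codons ending in G or C: {third_position_gc(toy_counts):.1f}%")
```

Counting whole codons rather than single bases keeps the first, second and third positions separable, which is what the comparison of AT pressure at the degenerate third position relies on.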

7.4.0 Known Anti-Codon Frequency

Genes for eleven and nine tRNAs have been identified in T. gondii and E. tenella, respectively. Of the anti-codons, four in T. gondii and three in E. tenella have a G or C at the first, degenerate nucleotide. These data suggest that in order to maintain a complete translation system, the AT pressure so notable in the codons, particularly at the degenerate third position, is resisted by the anti-codons. A similar situation has previously been noted in the P. falciparum plastid (Preiser et al, 1995). Assuming a parsimonious translation system like that of malaria, i.e. there are no other genes encoding tRNAs with different degenerate anti-codons, an analysis of known codon to anti-codon frequencies is presented in tables 7.3 and 7.4. It was noted that those anti-codons containing a U at position one are matched by their codons 56% and 51% of the time in T. gondii and E. tenella respectively, whereas this occurs only 3.1% and 1.8% of the time with G at that position. Of those tRNAs with A or C at the first position of the anti-codon, two break the 'wobble rules' (Crick, 1966): tRNA^Arg(ACG) seems to decode CGG codons in T. gondii, and CGA codons in E. tenella; and intriguingly, tRNA^Trp(CCA) is predicted to suppress the opal stop codon TGA.

7.5.0 Discussion

The tRNAs identified here could decode the codons shown in tables 7.3 and 7.4 by effective use of the wobble position (the first, degenerate nucleotide in the anti-codon), in combination with the frequency of certain codons. Analyses have indicated that complete codon families are read by single tRNAs in a number of organelles and prokaryotes (see Claesson et al, 1995). Therefore, tRNA^Thr(UGU), tRNA^Val(UAC), tRNA^Leu(UAG) and tRNA^Ala(UGC) could decode ACN, GTN, CTN and GCN codons respectively. The U in the wobble position of the anti-codon of these molecules is typical of tRNAs shown to read without discrimination in the third codon position (Steinburg, 1993). It has been argued that in these cases the wobble U can form non-conventional, stable base pairs with T or C in the third codon position (Heckman et al, 1980). The other possibility is that a 'two out of three' decoding mechanism is utilised (Lagerkvist, 1978; Bonitz et al, 1980); conventional base pairing at positions one and two, with a U at the wobble position being the least restrictive in a mismatch situation (Lagerkvist, 1981).

It is interesting to note that Preiser et al (1995) reported the presence of two tRNA^^ genes on the P. falciparum circle. The T. gondii molecule does not encode at the same position, but a homologue of the malarial with a C rather than a conservative U at position 32 could allow utilisation of a 'two out of three' mechanism as described in E. coli and Mycoplasma (Claesson et al, 1995).

With only two known exceptions, in Mycoplasma mycoides and yeast mitochondria (Guindy et al, 1989; Sibler et al, 1986), anti-codons have so far not been shown to contain A in the wobble position. However, the coccidian plastid tRNA^Arg(ACG) appears able to decode the CGN family of codons. Analysis of known T. gondii plastid and yeast mitochondrial coding sequences has shown that the molecule should decode CGG codons as well as CGT codons, despite the unusual AG wobble pairing. In both cases the utilisation of CGG codons is low. Similarly, the known coding sequence of the E. tenella plastid genome utilises two CGA codons as well as seven CGT arginine codons. Again, the presence of a single tRNA^Arg(ACG) for these codons would require an unusual AA wobble pair. It should be pointed out that in a 'two out of three' decoding mechanism (Lagerkvist, 1978) the strong C/G-G/C bonds in positions one and two may be sufficient to allow A to be in the anti-codon wobble position. Alternatively, the conversion of the wobble A into inosine (I) by post-transcriptional modification (Elliot and Trewyn, 1984) could improve its pairing with codon bases. Indeed, in the comparatively few cases in the literature where the corresponding gene nucleotide is A, the tRNA has I in the wobble position (Sprinzl et al, 1991). A tRNA^Arg(ICG) could recognise CGA codons in E. tenella, but according to the wobble hypothesis (Crick, 1966) such an adapter molecule would not be able to decode the CGG codons present in T. gondii.
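The argument about A and I at the wobble position can be checked mechanically against Crick's (1966) pairing rules. The following is a minimal sketch under those rules only; it deliberately ignores the 'two out of three' mechanism and any base modification other than inosine, and the function and table names are hypothetical.

```python
# Crick (1966) wobble rules: first anti-codon base -> third codon bases it can read.
WOBBLE = {
    "U": {"A", "G"},
    "C": {"G"},
    "A": {"U"},
    "G": {"C", "U"},
    "I": {"U", "C", "A"},
}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reads(anticodon: str, codon: str) -> bool:
    """True if the anti-codon (5'->3') can read the codon (5'->3') under Crick's rules.

    DNA-style input is accepted: T is treated as U.
    """
    anticodon = anticodon.upper().replace("T", "U")
    codon = codon.upper().replace("T", "U")
    # Codon positions one and two pair strictly with anti-codon positions three and two.
    strict = (WATSON_CRICK[codon[0]] == anticodon[2]
              and WATSON_CRICK[codon[1]] == anticodon[1])
    # Codon position three pairs with the wobble (first) anti-codon base.
    return strict and codon[2] in WOBBLE[anticodon[0]]


CGN = ("CGT", "CGC", "CGA", "CGG")
for anticodon in ("ACG", "ICG"):
    decoded = [codon for codon in CGN if reads(anticodon, codon)]
    print(anticodon, "reads", decoded)
```

Under these rules an unmodified ACG anti-codon reads only CGT, and ICG reads CGT, CGC and CGA but not CGG, which is the difficulty raised above for the CGG codons of T. gondii.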

More intriguingly, opal (TGA) stop codons in T. gondii and E. tenella appear to be decoded as tryptophan. This situation is reminiscent of non-plant mitochondria and Mycoplasma, which utilise a non-universal genetic code, whereby TGA is a tryptophan codon decoded by an unconventional tRNA^Trp(UCA) (Osawa et al, 1992). The plastid genome of P. falciparum encodes a single tRNA^Trp(CCA) gene; a corresponding gene was mapped and sequenced in T. gondii. This tRNA^Trp gene also has a conventional CCA anti-codon, suggesting that the organism does not utilise a system similar to non-plant mitochondria and Mycoplasma.
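The practical consequence of reading TGA as tryptophan rather than as a terminator can be seen by translating the same reading frame under both codes. This is a minimal sketch: the standard table is built from the conventional TCAG-ordered amino acid string, the variant differs only at TGA, and the toy sequence and names are hypothetical rather than thesis data.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order (TTT, TTC, TTA, ...).
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(codon): aa
            for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}
# Variant code in which TGA encodes tryptophan instead of terminating translation.
TGA_TRP = {**STANDARD, "TGA": "W"}


def translate(cds: str, table: dict) -> str:
    """Translate a coding-strand sequence in frame, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = table[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy reading frame containing an internal TGA codon.
toy_orf = "ATGAAATGATTATAA"
print("standard code:", translate(toy_orf, STANDARD))
print("TGA as Trp:   ", translate(toy_orf, TGA_TRP))
```

An open reading frame that is truncated under the standard code but continues to a conserved length when TGA is read as Trp is the sort of observation that motivates treating these codons as tryptophan.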

Genetic information not found in the genomic template can be transferred into mRNA after transcription via RNA editing, a process first discovered in the kinetoplasts of trypanosomes (Benne et al, 1986). This mechanism could allow the alteration of a UGA stop codon to a UGG tryptophan codon by an A to G substitution. RNA editing has been identified in angiosperm plastids, but only C to U and U to C substitutions have been observed (Yoshinaga et al, 1996). Sequence analysis of T. gondii ORF470 and rpsl transcripts amplified by RT-PCR (see figure 5.7.0) showed no evidence of editing.
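The comparison implied here, genomic sequence against the sequence of the RT-PCR product, amounts to looking for in-frame codons that differ between the two. The sketch below assumes the two sequences are already aligned, gap-free, of equal length and written as DNA; the function names and toy sequences are hypothetical.

```python
def codon_differences(genomic: str, cdna: str):
    """Yield (codon_index, genomic_codon, cdna_codon) for in-frame codons that differ."""
    genomic, cdna = genomic.upper(), cdna.upper()
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be pre-aligned and of equal length")
    for i in range(0, len(genomic) - 2, 3):
        g_codon, c_codon = genomic[i:i + 3], cdna[i:i + 3]
        if g_codon != c_codon:
            yield i // 3, g_codon, c_codon


def tga_to_tgg_edits(genomic: str, cdna: str) -> list:
    """Return indices of codons that read TGA in the gene but TGG in the transcript."""
    return [index for index, g_codon, c_codon in codon_differences(genomic, cdna)
            if g_codon == "TGA" and c_codon == "TGG"]


# Hypothetical toy sequences: an unedited transcript and an A-to-G edited one.
toy_gene = "ATGTGAAAATAA"
toy_unedited = "ATGTGAAAATAA"
toy_edited = "ATGTGGAAATAA"
print(tga_to_tgg_edits(toy_gene, toy_unedited))
print(tga_to_tgg_edits(toy_gene, toy_edited))
```

An empty result for every TGA codon in a transcript corresponds to the finding reported above, that the amplified transcripts showed no evidence of editing.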

Polyclonal antibodies generated against a T. gondii ORF470 fusion protein failed to detect a native protein in cell extracts (see 5.6.0). However, further efforts are being made at the NIMR and elsewhere to identify this and other T. gondii plastid proteins. If such studies show that TGA is not utilised as a stop codon in coccidian plastid organelles then investigation of the mechanism of suppression should prove interesting.

Several such mechanisms are discussed below.

Decoding TGA using tRNA^Trp(CCA) requires an unconventional C-A pairing at the wobble position (Crick, 1966). In E. coli, mutations or deletions in the peptidyl transferase region of the LSU rRNA (Jemiolo et al, 1995), and at position C1054 in the SSU rRNA (Murgola et al, 1988; Hanfler et al, 1990), lead to TGA stop codon specific suppression. Neither the LSU nor the SSU rRNA genes encoded by the T. gondii plastid genome demonstrated any of these mutations (positions 1138 and 1069 in figures 6.2 and 6.1 respectively).

A mutation in the D arm of the tRNA^Trp(CCA) of E. coli has been shown to lead to TGA stop codons being decoded as tryptophan codons (Hirsh, 1971; Hirsh and Gold, 1971). The tobacco chloroplast tRNA^Trp similarly suppresses a TGA stop codon in the tobacco rattle virus (Zerfass and Beier, 1992). Both these adapter molecules contain an A24-U11 pair in the D arm, non-suppressors having a G24-U11 pair. Figure 7.1 shows that the T. gondii tRNA^Trp(CCA) gene has a similar A-U pair in the D arm of the predicted structure, suggesting a mechanism for TGA stop codon suppression. However, the P. falciparum tRNA^Trp(CCA) gene demonstrates an identical predicted pairing in the absence of TGA encoded tryptophans, and TGA codons are thought to be utilised as terminators for several plastid genes (Preiser et al, 1995).

Despite the lack of information about an obvious TGA suppression mechanism, this study does not preclude tryptophan being inserted at these stop codons. The tRNA(Trp), rRNAs, release factors and ribosomal proteins could all play a role in suppression, probably via some subtle, unrecorded mechanism. To begin to answer this difficult question, one would first need to isolate the tRNA(Trp) of T. gondii (and E. tenella) and test its ability to suppress TGA stop codons in vitro.
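The consequence of reading TGA as tryptophan rather than as a terminator can be sketched with a toy translation. The sequence `toy` below is invented for illustration and is not taken from any plastid gene; `translate` and the two code tables are likewise illustrative names.

```python
# Sketch: translate one reading frame under the standard genetic code and
# under a variant code in which TGA encodes tryptophan (W).
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
TGA_AS_TRP = {**STANDARD, "TGA": "W"}


def translate(dna: str, table: dict[str, str]) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical open reading frame with an in-frame TGA codon.
toy = "ATGAATTGAAAATTTTGGTAA"

print(translate(toy, STANDARD))    # MN
print(translate(toy, TGA_AS_TRP))  # MNWKFW
```

With the standard code the in-frame TGA truncates the product after two residues; with TGA read as tryptophan the frame continues to the TAA terminator.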

In conclusion, it would appear that the T. gondii and E. tenella plastid genomes encode a similar, parsimonious translation system to that elucidated in malaria by Preiser et al (1995), where 25 different tRNAs are thought to be sufficient to decode all the AT biased codons used. Given the genetic conservation across the Apicomplexa it is assumed that such an economic system is ubiquitous. However, the limited sequence data obtained from the plastid elements of T. gondii, E. tenella and Th. annulata do not preclude the presence of tRNAs different from those identified in malaria. For example, a second tRNA(Trp) with a UCA anti-codon could decode TGA as tryptophan. Also, the possibility, or necessity, of tRNA import cannot be discarded, although no examples have been recorded in plastid organelles.
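The AT bias mentioned above can be quantified from the codon counts given in Table 7.3. The sketch below is illustrative only: the counts are transcribed from that table (reading the entries for ATG and AGA as 21 and 22, which is an assumption about the scanned values), and the table covers only the codons read by the tRNAs identified in this study, not the full codon set.

```python
# Sketch: fraction of tabulated T. gondii plastid codons that end in A or T.
# Counts transcribed from Table 7.3; ATG=21 and AGA=22 are assumed readings.

CODON_COUNTS = {
    "TTT": 59, "TTC": 4, "TAA": 3, "TAG": 0, "TGA": 2, "TGG": 4,
    "CTT": 9, "CTC": 0, "CTA": 2, "CTG": 0, "CAA": 27, "CAG": 0,
    "CGT": 3, "CGC": 0, "CGA": 0, "CGG": 1, "ATG": 21,
    "ACT": 34, "ACC": 1, "ACA": 28, "ACG": 2, "AAT": 64, "AAC": 0,
    "AGA": 22, "AGG": 0, "GTT": 14, "GTC": 0, "GTA": 26, "GTG": 0,
    "GCT": 29, "GCC": 0, "GCA": 7, "GCG": 0,
}

total = sum(CODON_COUNTS.values())
at_third = sum(n for codon, n in CODON_COUNTS.items() if codon[2] in "AT")
fraction = at_third / total

print(f"{at_third} of {total} tabulated codons end in A or T ({fraction:.1%})")
```

On these figures roughly nine in ten of the tabulated codons carry A or T at the third position, which is consistent with a small tRNA set relying on wobble pairing.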

Figure 7.1: Toxoplasma Transfer RNAs

The predicted secondary structures of T. gondii plastid tRNAs encoded in the inverted repeat (A), and in the tuf region (B). Conserved nucleotides are in bold, those underlined indicate divergent sites.

[Figure 7.1: cloverleaf diagrams not recoverable from the extracted text. Panel A (inverted repeat): tRNAs Val (anti-codon UAC), Thr (UGU), Arg (UCU), Met (CAU), Leu (UAG), Arg' (ACG), Asn (GUU) and Ala (UGC). Panel B (tuf region): tRNAs Trp (CCA), Gln (UUG) and Phe (GAA).]

Figure 7.2: Eimeria Transfer RNAs

The predicted secondary structures of E. tenella plastid tRNAs encoded in the inverted repeat. Conserved nucleotides are in bold, those underlined indicate divergent sites.

[Figure 7.2: cloverleaf diagrams not recoverable from the extracted text. tRNAs shown: Thr (anti-codon UGU), Val (UAC), Met (CAU), Arg (UCU), Arg' (ACG), Leu (UAG), Asn (GUU) and Ala (UGC).]

Table 7.1: Toxoplasma Plastid Codon Usage

Amino acid (aa) frequency per 100 aa of T. gondii nuclear (Tg nucl.) and plastid (Tg plas.) genes.

Nuclear gene data from Ellis et al (1993).

Plastid gene data compiled from ORF470, tuf and rpsl (approximately 1000 aa).

Amino acids are denoted according to the three letter code.

aa                   Tg nucl.   Tg plas.
Ile                  3.3        11.7
Leu                  8.3        11.2
Lys                  4.1        9.9
Thr                  5.4        6.8
Asn                  3.5        6.7
Phe                  4.3        6.6
Ser                  8.0        6.6
Gly                  8.2        5.9
Tyr                  1.9        5.4
Glu                  7.1        5.3
Val                  7.7        4.2
Asp                  4.8        3.9
Ala                  7.8        3.9
Gln                  4.8        2.8
Arg                  7.2        2.7
Pro                  6.2        2.2
Met                  2.3        2.2
His                  1.8        1.3
Cys                  1.9        0.9
Trp (TGG and TGA)    0.6        0.6

Table 7.2: Eimeria Plastid Codon Usage

Amino acid (aa) frequency per 100 aa of E. tenella nuclear and plastid genes.

Nuclear gene data from Ellis et al (1993).

Plastid gene data compiled from ORF470, tuf and rpsl (approximately 1000 aa).

aa                   Et nucl.   Et plas.
Ile                  2.8        11.5
Leu                  7.8        11.1
Lys                  3.1        10.4
Ser                  7.8        7.4
Asn                  2.7        6.5
Gly                  12.1       6.0
Thr                  4.6        5.9
Tyr                  1.9        5.8
Phe                  2.6        5.0
Glu                  8.4        4.7
Asp                  3.0        4.3
Ala                  11.9       4.3
Val                  7.6        4.0
Pro                  5.9        3.1
Arg                  5.2        3.0
Gln                  4.6        2.3
Met                  2.1        1.9
His                  1.2        1.5
Cys                  3.2        0.8
Trp (TGG and TGA)    1.3        0.7

Table 7.3: Toxoplasma Plastid Anti-Codon Frequency

Anti-codons matching codons with a Watson-Crick base pair at the wobble position are in bold.

Those anti-codons thought to demonstrate unusual wobble pairings with codons are queried (?).

Amino acids are denoted according to the three letter code.

CODON       ANTI-CODON   NUMBER
TTT-Phe     gaa          59
TTC-Phe     gaa          4
TAA-Stop    ***          3
TAG-Stop    ***          0
TGA-Trp     ?cca?        2
TGG-Trp     cca          4
CTT-Leu     uag          9
CTC-Leu     uag          0
CTA-Leu     uag          2
CTG-Leu     uag          0
CAA-Gln     uug          27
CAG-Gln     uug          0
CGT-Arg     acg          3
CGC-Arg     acg          0
CGA-Arg     acg          0
CGG-Arg     ?acg?        1
ATG-Met     cau          21
ACT-Thr     ugu          34
ACC-Thr     ugu          1
ACA-Thr     ugu          28
ACG-Thr     ugu          2
AAT-Asn     guu          64
AAC-Asn     guu          0
AGA-Arg     ucu          22
AGG-Arg     ucu          0
GTT-Val     uac          14
GTC-Val     uac          0
GTA-Val     uac          26
GTG-Val     uac          0
GCT-Ala     ugc          29
GCC-Ala     ugc          0
GCA-Ala     ugc          7
GCG-Ala     ugc          0

Table 7.4: Eimeria Plastid Anti-Codon Frequency

Anti-codons matching codons with a Watson-Crick base pair at the wobble position are in bold.

Those anti-codons thought to demonstrate unusual wobble pairings with codons are queried (?).

Amino acids are denoted according to the three letter code.

CODON       ANTI-CODON   NUMBER
TTT-Phe     gaa          46
TTC-Phe     gaa          2
TAA-Stop    ***          3
TAG-Stop    ***          0
CTT-Leu     uag          8
CTC-Leu     uag          0
CTA-Leu     uag          1
CTG-Leu     uag          0
CGT-Arg     acg          7
CGC-Arg     acg          0
CGA-Arg     ?acg?        2
CGG-Arg     acg          0
ATG-Met     cau          18
ACT-Thr     ugu          35
ACC-Thr     ugu          1
ACA-Thr     ugu          18
ACG-Thr     ugu          3
AAT-Asn     guu          63
AAC-Asn     guu          0
AGA-Arg     ucu          20
AGG-Arg     ucu          0
GTT-Val     uac          14
GTC-Val     uac          0
GTA-Val     uac          24
GTG-Val     uac          0
GCT-Ala     ugc          20
GCC-Ala     ugc          1
GCA-Ala     ugc          20
GCG-Ala     ugc          0

CHAPTER 8: CONCLUSION

The results presented here demonstrate the highly conserved nature of the apicomplexan plastid genome with regard to gene content, gene order and sequence in four divergent genera (the haemosporin P. falciparum; the coccidians T. gondii and E. tenella; and the piroplasm Th. annulata). As discussed in chapter 4, this provides good support for the hypothesis that the Apicomplexa evolved from a single photosynthetic progenitor whose plastid genome was drastically 'pruned' on the adoption of parasitism. To add weight to this 'unitary' hypothesis further information should be sought from other members of the phylum, especially those that could be considered less specialised, such as the gregarines which infest the invertebrate lineages. Perhaps significantly, two of the most primitive apicomplexans, the bivalve flagellated parasite Perkinsus atlanticus, and the heterotrophic flagellate (previously Spiromonas), have been reported not to contain a putative plastid organelle despite extensive electron microscopic analysis (Siddall, 1992). Both these organisms have been postulated to represent 'missing-links' between non-flagellated apicomplexans and dinoflagellates (Wolters, 1991). Further, phylogenetic analyses have shown that Perkinsus branches at the base of the radiation that is the Apicomplexa (Goggin and Barker, 1993), suggesting an early divergence. Unless the plastid organelle was overlooked, one must assume that these organisms diverged before the plastid was acquired by the apicomplexans, or that they secondarily lost the organelle. However, at this time I conclude that apicomplexan vertebrate parasites, now widely separated in evolutionary terms, maintain a vestigial plastid DNA with a characteristic gene content and organisation.

Phylogenetic analysis of the apicomplexan plastid-encoded tuf gene provides tentative support for the hypothesis that the organelle was obtained through secondary endosymbiosis of a green alga (Kohler et al, 1997). This evolutionary scenario does not necessarily preclude the suggestion that the Apicomplexa arose from an ancestral, photosynthetic dinoflagellate or chromist. Both Lepidodinium viride (Watanabe et al, 1987 and 1990) and Chlorarachnion sp. (Hibberd, 1990; Ludwig and Gibbs, 1989) maintain green endosymbionts or green plastids. Filling the void of dinoflagellate plastid sequence information, and expanding the chromistan database, may elucidate the photosynthetic progenitor of the Apicomplexa. However, both dinoflagellates (Schnepf, 1993) and chromists (Medlin, 1995) maintain myriad forms of plastids, presumably obtained through separate endosymbioses. Even if contemporary versions exist, a search for the proposed apicomplexan ancestor invokes the proverbial 'needle in a haystack' situation. Also, given that secondary endosymbiosis appears to be a recurrent event in the evolution of photosynthetic eukaryotes, an independent origin for the apicomplexan plastid cannot be dismissed.

Whatever the provenance of the organelle, the high level of genetic conservation observed amongst genera that have evolved diverse lifestyles over a considerable evolutionary time span (approximately 800 million years) suggests that the vestigial plastid organelle (McFadden et al, 1996; Kohler et al, 1997) performs a vital function within the cell. Photosynthesis can be dismissed in these obligate parasites, but, as discussed in 1.8.0, plastids are also involved in several aspects of intermediary metabolism such as fatty acid, porphyrin, and amino acid biosynthesis. Whether the apicomplexan organelle carries out any such functions is unknown. The plastid genome is largely composed of 'house-keeping' genes involved in expression. Only a scattering of unknown open reading frames, and perhaps most significantly the highly conserved ORF470 and a gene for a chaperone protein (Clp), provide any indication that the organelle performs a specific role. Many plastid genes have probably been translocated to the nucleus in the course of evolution. Elucidation of apicomplexan nuclear genes whose products have an appropriate organelle-directed leader sequence may shed light on the question of the plastid's function. At this time, only one nuclear gene (cpn60) with a potential mitochondrial targeted leader sequence has been reported in malaria (Holloway et al, 1994). A homologue of the apicomplexan plastid ORF470 is currently being investigated for function in cyanobacteria (A. Law, personal communication).

Such a system allows the utilisation of reverse genetics and complementation, technologies currently unavailable for the apicomplexan plastid. In the meantime, the identification of plastid proteins would provide firm evidence of functionality. Although transcripts were identified, I failed to detect a T. gondii plastid peptide corresponding to ORF470. Efforts to complete this work continue in this laboratory and elsewhere. Such studies may also show that the TGA stop codons I identified in the coccidian plastid genomes of T. gondii and E. tenella (see chapter 7) are suppressed. As discussed in 7.5.0, the mechanism which would allow TGA stop codons to be decoded as tryptophan remains to be elucidated. Jemiolo et al (1995) reported that mutations in the peptidyl transferase region of the E. coli LSU rRNA led to specific TGA suppression, but no substitutions corresponding to those reported were found in the T. gondii plastid sequence. The discovery of a non-conservative nucleotide in the GTPase centre of the T. gondii plastid LSU rRNA (see 6.3.0) led to the finding that this substitution is responsible for a relatively low binding affinity for the antibiotic thiostrepton (Clough et al, 1997). In addition to the mechanisms discussed in chapter 7, the possibility that such a deviation in the normally highly conserved GTPase centre could lead to specific TGA suppression was also considered. As discussed in chapter 6, this region binds the ribosomal protein L11, and mutations in the GTPase centre can affect this association (Ryan et al, 1991). E. coli mutants lacking L11 support RF1 mediated termination (at TAG and TAA) very poorly, but RF2 activity (needed for TGA and TAA termination) is increased severalfold (Tate et al, 1984). Clearly, anything affecting the binding of L11 to the GTPase centre may result in stop codon termination efficiency being altered. However, recent studies by Dr. B. Clough in this laboratory have found that the GTPase centre of the E. tenella plastid LSU rRNA does not contain a similar substitution to that observed in T. gondii. Since it is predicted that the Eimeria plastid also employs TGA as a tryptophan codon, it appears unlikely that this region plays a role in stop codon suppression.

Aside from these hypothetical considerations, the most important question concerning the apicomplexan plastid is its possible susceptibility to a range of antibiotics, and perhaps even herbicides. Possible target sites for several antibiotics within the EF-Tu protein and the rRNAs are discussed in 5.7.0 and 6.3.0 respectively, and inhibitory agents have begun to be used to study plastid function (Strath et al, 1993; Pukrittayakamee et al, 1994; Pfefferkorn and Borotz, 1994; Beckers et al, 1995; Hackstein et al, 1995; Tomavo and Boothroyd, 1995; Clough et al, 1997; McConkey et al, 1997). The binding studies performed by Clough et al (1997) provided the first direct evidence that the malarial plastid organelle is functional in protein synthesis. They demonstrated that the antibiotic thiostrepton, which has in vitro activity against P. falciparum, binds to short RNA transcripts derived from the plastid LSU rRNA GTPase centre, but not to equivalent transcripts from the nuclear or mitochondrial rRNA genes. Evidence that other antibiotics act directly against the apicomplexan plastid is largely based on molecular comparisons rather than functional studies. In this respect, it should be noted that the nuclear encoded rRNAs of P. falciparum contain several sites which potentially make them sensitive to antibiotics thought to act specifically against prokaryotic ribosomes (Waters, 1994), and it has been suggested that chloramphenicol, tetracycline and doxycycline exert a direct effect on the parasite's cytoplasmic ribosomes (Budimulja et al, 1997). Similar observations have been made about the cytoplasmic ribosomes of E. tenella mg, 1978). However, whatever their site of action, prokaryotic chemotherapies may be useful against the increasingly drug resistant malaria parasite, and several antibiotics (tetracycline, clindamycin and spiramycin) proposed to act against the plastid are currently used in the treatment of malaria and toxoplasmosis (see 1.9.0).

That a plastid appears to be maintained in a wide variety of apicomplexan parasites led some to consider a 'one for all' drug, i.e. a single chemotherapeutic agent that would inhibit organellar function in a variety of apicomplexans (see Jefferies and Johnson, 1996). However, the proposal based on sequence and binding data that the T. gondii plastid ribosome is resistant to thiostrepton (see chapter 6) indicates that not all apicomplexan plastids are susceptible to certain antibiotics, and that a search for an apicomplexan 'super-drug' could prove fruitless. Nevertheless, research into targeting the plastid with drugs is still in its infancy.

CHAPTER 9: REFERENCES

Abdulkarim F., Liljas L. and Hughes D. (1994). Mutations to Kirromycin Resistance Occur in the Interface of Domains I and III of EF-Tu.GTP. FEBS Letters 352:118-122.

Aldritt S.M., Joseph J.T. and Wirth D.F. (1989). Sequence Identification of Cytochrome b in Plasmodium gallinaceum. Mol. Cell. Biol. 9:3614-3620.

Aota S., Gojobori T., Ishibashi F., Maruyama T. and Ikemura T. (1988). Codon Usage Tabulated from the Genbank Genetic Sequence Data. Nucleic Acids Res. 16(suppl.):R315-R402.

Appleton P.L. and Vickerman K. (1996). Presence of Apicomplexan-type Micropores in a Parasitic Dinoflagellate, Hematodinium sp. Parasitol. Res. 82:279-282.

Araujo F.G., Shepard R.M. and Remington J.S. (1991). In vivo Activity of the Macrolide Antibiotics Azithromycin, Roxithromycin and Spiramycin against Toxoplasma gondii. Eur. J. Clin. Microbiol. Infect. Dis. 10:519-524.

Ausubel F.M., Brent R., Kingston R.S., Moore D.D., Seidman J.A., Smith J.A. and Struhl K. (1995). Current Protocols in Molecular Biology. John Wiley and Sons Inc., U.S.A.

Barta J.R., Jenkins M.C. and Danforth H.D. (1991). Evolutionary Relationships of Avian Eimeria Species among other Apicomplexan Protozoa: Monophyly of the Apicomplexa is Supported. Mol. Biol. Evol. 8:345-355.

Baldauf S.L., Manhart J.R. and Palmer J.D. (1990). Different Fates of the Chloroplast tufA Gene Following its Transfer to the Nucleus in Green Algae. Proc. Natl. Acad. Sci. U.S.A. 87:5317-5321.

Baldauf S.L. and Palmer J.D. (1990). Evolutionary Transfer of the Chloroplast tufA Gene to the Nucleus. Nature 344:262-265.

Beckers C.J., Roos D.S., Donald R.G., Luft B.J., Schwab J.C., Gao Y. and Joiner K.A. (1995). Inhibition of Cytoplasmic and Organellar Protein Synthesis in Toxoplasma gondii. Implications for the Target of Macrolide Antibiotics. J. Clin. Invest. 95:367-376.

Benne R., Berg J. van den, Brakenhoff J.P., Broom J.H. van and Tromp M.C. (1986). Major Transcript of the Frameshifted coxII Gene from Trypanosome Mitochondria Contains Four Nucleotides that are not Encoded in the DNA. Cell 46:819-826.

Bernardi G., Mouchiroud D., Gautier C. and Bernardi G. (1988). Compositional Patterns in Vertebrate Genomes: Conservation and Change in Evolution. J. Mol. Evol. 28:7-18.

Bohne W., Gross U., Ferguson D.J.P. and Heesemann J. (1995). Cloning and Characterisation of a Bradyzoite-Specifically Expressed Gene (hsp30/bag1) of Toxoplasma gondii, Related to Genes Encoding Small Heat-Shock Proteins of Plants. Mol. Microbiol. 16:1221-1230.

Bonitz S.G., Berlani R., Coruzzi G., Li M., Macino G., Nobrega F.G., Nobrega M.P., Thalenfeld B.E. and Tzagoloff A. (1980). Codon Recognition Rules in Yeast Mitochondria. Proc. Natl. Acad. Sci. U.S.A. 77:3167-3170.

Boon K., Krab I., Parmeggiani A., Bosch L. and Kraal B. (1995). Substitution of Arg230 and Arg233 in Escherichia coli Elongation Factor Tu Strongly Enhances its Pulvomycin Resistance. Eur. J. Biochem. 227:816-822.

Boothroyd J.C., Burg J.L., Nagel S.D., Perelman D., Kasper L.H., Ware P.L., Prince J.B., Sharma S.D. and Remington J.S. (1987). and Tubulin Genes of Toxoplasma gondii. Molecular Strategies of Parasitic Invasion (Agabian N., Goodman H. and Nogueira N. eds.). Alan R. Liss, New York:237-250.

Borst P., Overdulve J.P., Weijers P.J., Fase-Fowler F. and Berg M. van den (1984). DNA Circles with Cruciforms from Isospora (Toxoplasma) gondii. Biochim. Biophys. Acta. 781:100-111.

Bourne H.R., Sanders D.A. and McCormick F. (1990). The GTPase Superfamily: A Conserved Switch for Diverse Cell Functions. Nature 348:125-132.

Bowman V.M. and Dyer T.A. (1979). 4.5S Ribonucleic Acid, a Novel Ribosome Component in the Chloroplasts of Flowering Plants. Biochem. J. 183:605-613.

Budimulja A.S., Syafruddin, Tapchaisri P., Wilairat P. and Marzuki S. (1997). The Sensitivity of Plasmodium Protein Synthesis to Prokaryotic Ribosomal Inhibitors. Mol. Biochem. Parasitol. 84:137-141.

Burg J.L., Perelman D., Kasper L.H., Ware P.L. and Boothroyd J.C. (1988). Molecular Analysis of the Gene Encoding the Major Surface Antigen of Toxoplasma gondii. J. Immunol. 141:3584-3591.

Cavalier-Smith T. (1981). Eukaryotic Kingdoms: Seven or Nine? Biosystems 14:461-481.

Cavalier-Smith T. (1982). The Origins of Plastids. Biol. J. Linn. Soc. 17:289-306.

Cavalier-Smith T. (1993). Kingdom Protozoa and its 18 Phyla. Microbiol. Rev. 57:953-994.

Cavalier-Smith T., Allsopp M.T.E.P. and Chao E.E. (1994). Chimeric Conundra: Are Nucleomorphs and Chromists Monophyletic or Polyphyletic? Proc. Natl. Acad. Sci. U.S.A. 91:11368-11372.

Cedergren R., Gray M.W., Abel Y. and Sankoff D. (1988). The Evolutionary Relationships Among Known Lifeforms. J. Mol. Evol. 28:98-112.

Claesson C., Lustig P., Borén T., Simonsson C., Barciszewska M. and Lagerkvist U. (1995). Glycine Codon Discrimination and the Nucleotide in Position 32 of the Anticodon Loop. J. Mol. Biol. 247:191-196.

Clough B., Strath M., Preiser P., Denny P. and Wilson R.J.M. (1997). Thiostrepton Binds to Malarial Plastid rRNA. FEBS Letters 406:123-125.

Creasey A., Mendis K., Carlton J., Williamson D.H., Wilson I. and Carter R. (1994). Maternal Inheritance of Extrachromosomal DNA in Malaria Parasites. Mol. Biochem. Parasitol. 65:95-98.

Crick F.H.C. (1966). Codon-Anticodon Pairing: The Wobble Hypothesis. J. Mol. Biol. 19:548-555.

Curgy J.-J. (1985). The Mitoribosomes. Biol. Cell. 54:1-38.

Dahl R.J. and Johnson A.M. (1983). Purification of Toxoplasma gondii from Host Cells. J. Clin. Pathol. 36:602-604.

Dame J.B., Arnot D.E., Bourke P.F., Chakrabarti D., Christodoulou Z., Coppel R.L., Cowman A.F., Craig A.G., Fischer K., Foster J., Goodman N., Hinterberg K., Holder A.A., Holt D.C., Kemp D.J., Lanzer M., Lim A., Newbold C.I., Ravetch J.V., Reddy G.R., Rubio J., Schuster S.M., Su X., Thompson J.K., Vital P., Wellems T.E. and Werner E.B. (1996). Current Status of the Plasmodium falciparum Genome Project. Mol. Biochem. Parasitol. 79:1-12.

Dannemann B.J., McCutchan J.A., Israelski D., Antoniskis D., Leport C., Luft B., Nussbaum J., Clumeck N., Morlat P., Chiu J., Vildé J.-L., Orellana M., Feigal D., Bartok A., Heseltine P., Leedom J. and Remington J. (1992). Treatment of Toxoplasmic Encephalitis in Patients with AIDS (a Randomized Trial Comparing Pyrimethamine Plus Clindamycin to Pyrimethamine Plus Sulfadiazine). Ann. Intern. Med. 116:33-43.

Delp G. and Kossel H. (1991). rRNAs and rRNA Genes of Plastids. Cell Cult. Somatic Cell Genet. Plants 7A:139-167.

Delwiche C.F., Kuhsel M. and Palmer J.D. (1995). Phylogenetic Analysis of tufA Sequences Indicates a Cyanobacterial Origin of all Plastids. Mol. Phylog. Evol. 4:110-128.

dePamphilis C.W. and Palmer J.D. (1990). Loss of Photosynthetic and Chlororespiratory Genes from the Plastid Genome of a Parasitic Flowering Plant. Nature 348:337-339.

Desmonts G. and Couvreur J. (1974). Congenital Toxoplasmosis: a Prospective Study of 378 Pregnancies. N. Engl. J. Med. 290:1110-1116.

Devereux J., Haeberli P. and Smithies O. (1984). A Comprehensive Set of Sequence Analysis Programs for the VAX. Nucleic Acids Res. 12:387-395.

Divo A.A., Geary T.G. and Jensen J.B. (1985). Oxygen and Time-Dependent Effects of Antibiotics and Selected Mitochondrial Inhibitors on Plasmodium falciparum in Culture. Antimicrob. Ag. Chemo. 27:21-27.

Dobell C.C. (1958). Anthony van Leeuwenhoek and his "Little Animals". Russell and Russell, New York.

Dodge J.D. (1989). The Chromophyte Algae: Problems and Perspectives (Green J.C., Leadbeater B.S.C. and Diver W.L., eds.). Clarendon Press, Oxford:207-227.

Dore E., Frontali C., Forte T. and Fratarcangeli S. (1983). Further Studies and Electron Microscopic Characterisation of Plasmodium berghei DNA. Mol. Biochem. Paras. 8:339-352.

Dubey J.P. (1977). Toxoplasma, Hammondia, Besnoitia, Sarcocystis, and other Cyst-Forming Coccidia of Man and Animals. Parasitic Protozoa (Kreier J.P., ed.). Academic Press, New York:101-237.

Dubremetz J.F. (1995). Toxoplasma gondii: Cell Biology Update. Molecular Approaches to Parasitology (Boothroyd J.C. and Komuniecki R., eds.). Wiley-Liss Inc., New York:345-358.

Ebringer L. (1990). Interaction of Drugs with Extranuclear Genetic Elements and its Consequences. Teratogen. Carcinogen. Mutagen. 10:477-501.

Egea N. and Lang-Unnasch N. (1995). Phylogeny of the Large Extrachromosomal DNA of Organisms in the Phylum Apicomplexa. J. Euk. Microbiol. 42:679-684.

Elliot M.S. and Trewyn R.W. (1984). Inosine Biosynthesis in Transfer RNA by an Enzymatic Insertion of Hypoxanthine. J. Biol. Chem. 259:2407-2418.

Ellis J., Griffin H., Morrison D. and Johnson A.M. (1993). Analysis of Dinucleotide Frequency and Codon Usage in the Phylum Apicomplexa. Gene 126:163-170.

Erdmann V.A. (1976). Structure and Function of 5S and 5.8S RNA. Prog. Nucl. Acid Res. Mol. Biol. 18:45-90.

Es H.H. van, Skamene E. and Schurr E. (1993). Chemotherapy of Malaria: a Battle Against All Odds? Clin. Invest. Med. 16:285-293.

Escalante A.A. and Ayala F.J. (1995). Evolutionary Origin of Plasmodium falciparum and other Apicomplexa Based on rRNA Genes. Proc. Natl. Acad. Sci. U.S.A. 92:5793-5797.

Fawcett T.W. and Bartlett S.G. (1990). An Effective Method for Eliminating 'Artifact Banding' when Sequencing Double-Stranded DNA Template. Biotechniques 9:46-48.

Feagin J.E., Werner E., Gardner M.J., Williamson D.H. and Wilson R.J.M. (1992). Homologies Between the Contiguous and Fragmented rRNAs of the Two Plasmodium falciparum Extrachromosomal DNAs are Limited to Core Sequences. Nucleic Acids Res. 20:879-887.

Feagin J.E. (1994). The Extrachromosomal DNAs of Apicomplexan Parasites. Ann. Rev. Microbiol. 48:81-104.

Feagin J.E. and Drew M.E. (1995). Plasmodium falciparum: Alterations in Organelle Transcript Abundance during the Erythrocytic Cycle. Exp. Parasitol. 80:430-440.

Gagnon S., Levesque R.C., Sogin M.L. and Gajadhar A.A. (1993). Molecular Cloning, Complete Sequence of the Small Subunit Ribosomal RNA Coding Region and Phylogeny of Toxoplasma gondii. Mol. Biochem. Parasitol. 60:145-148.

Gagnon S., Bourbeau D. and Levesque R.C. (1996). Secondary Structures and Features of the 18S, 5.8S and 26S Ribosomal RNAs from the Apicomplexan Parasite Toxoplasma gondii. Gene 173:129-135.

Gajadhar A.A., Marquardt W.C., Hall R., Gunderson J., Ariztia-Carmona E.V. and Sogin M.L. (1991). Ribosomal RNA Sequences of Sarcocystis muris, Theileria annulata and Crypthecodinium cohnii Reveal Evolutionary Relationships Among Apicomplexans, Dinoflagellates, and Ciliates. Mol. Biochem. Parasitol. 45:147-154.

Gardner M.J., Bates P.A., Ling I.T., Moore D.J., McCready S., Gunasekera M.B.R., Wilson R.J.M. and Williamson D.H. (1988). Mitochondrial DNA of the Human Parasite Plasmodium falciparum. Mol. Biochem. Parasitol. 31:11-18.

Gardner M.J., Williamson D.H. and Wilson R.J.M. (1991a). A Circular DNA in Malaria Parasites Encodes an RNA Polymerase like that of Prokaryotes and Chloroplasts. Mol. Biochem. Parasitol. 44:115-124.

Gardner M.J., Feagin J.E., Moore D.J., Spencer D.F., Gray M.W., Williamson D.H. and Wilson R.J.M. (1991b). Organisation and Expression of Small Subunit Ribosomal RNA Genes Encoded by a 35-kilobase Circular DNA in Plasmodium falciparum. Mol. Biochem. Parasitol. 48:77-88.

Gardner M.J., Feagin J.E., Moore D.J., Rangachari K., Williamson D.H. and Wilson R.J.M. (1993). Sequence and Organisation of Large Subunit rRNA Genes from the Extrachromosomal 35kb Circular DNA of the Malaria Parasite Plasmodium falciparum. Nucleic Acids Res. 21:1067-1071.

Gardner M.J., Goldman N., Barnett P., Moore P.W., Rangachari K., Strath M., Whyte A., Williamson D.H. and Wilson R.J.M. (1994a). Phylogenetic Analysis of the rpoB Gene from the Plastid-like DNA of Plasmodium falciparum. Mol. Biochem. Parasitol. 66:221-231.

Gardner M.J., Preiser P., Rangachari K., Moore D., Feagin J.E., Williamson D.H. and Wilson R.J.M. (1994b). Nine Duplicated tRNA Genes on the Plastid-like DNA of the Malaria Parasite Plasmodium falciparum. Gene 140:307-308.

Geary T.G. and Jensen J.B. (1983). Effects of Antibiotics on Plasmodium falciparum in vitro. Am. J. Trop. Med. Hyg. 32:221-225.

Ghelli A., Crimi M., Orsini S., Gradoni L., Zannotti M., Lenaz G. and Degli Esposti M. (1992). Cytochrome b of Protozoan Mitochondria: Relationships between Structure and Function. Comp. Biochem. Physiol. 103B:329-338.

Gibbs S.P. (1978). The Chloroplasts of Euglena may have Evolved from Symbiotic Green Algae. Can. J. Bot. 56:2883-2889.

Gibbs S.P. (1993). Origins of Plastids (Lewin R.A. ed.). Chapman and Hall, New York:107-121.

Gockel G., Hachtel W., Baier S., Fliss C. and Henke M. (1994). Genes for Components of the Chloroplast Translational Apparatus are Conserved in the Reduced 73-kb Plastid DNA of the Nonphotosynthetic Euglenoid Flagellate Astasia longa. Curr. Genet. 26:256-262.

Goggin C.L. and Barker S.C. (1993). Phylogenetic Position of the genus Perkinsus (Protista, Apicomplexa) Based on Small Subunit Ribosomal RNA. Mol. Biochem. Parasitol. 60:65-70.

Gozar M.M.G. and Bagnara A.S. (1993). Identification of a Babesia bovis Gene with Homology to the Small Subunit Ribosomal RNA Gene from the 35-kilobase Circular DNA of Plasmodium falciparum. Int. J. Parasitol. 23:145-148.

Gozar M.M.G. and Bagnara A.S. (1995). An Organelle-like Small Subunit Ribosomal RNA Gene from Babesia bovis: Nucleotide Sequence and Secondary Structure of the Transcript and Preliminary Phylogenetic Analysis. Int. J. Parasitol. 25:929-938.

Gray M.W. (1988). Organelle Origins and Ribosomal RNA. Biochem. Cell Biol. 66:325-348.

Gray M.W. (1989). Origin and Evolution of Mitochondrial DNA. Ann. Rev. Cell. Biol. 5:25-50.

Gregorgeiv V.S. (1994). Management of Toxoplasmosis. Drugs 48:179-188.

Guindy Y.S., Samuelsson T. and Johansen T.-I. (1989). Unconventional Codon Reading by Mycoplasma mycoides tRNAs as Revealed by Partial Sequence Analysis. Bioch. J. 258:869-873.

Gutell R.R. and Fox G.E. (1988). A Compilation of Large Subunit RNA Sequences Presented in a Structural Format. Nucleic Acids Res. 16(suppl.):r175-r270.

Hackstein J.H.P., Schubert H., Rosenberg J., Berg M. van den, Brul S., Derksen J.W.M. and Matthijis H.C.P. (1994). A Novel Photosynthetic Organelle in Anaerobic Mastigotes. Endocytobiol. Cell. Res. 10:261.

Hackstein J.H.P., Mackenstedt U., Mehlhorn H., Meijerink J.P.P., Schubert H. and Leunissen J.A.M. (1995). Parasitic Apicomplexans Harbor a Chlorophyll a-D1 Complex, the Potential Target for Therapeutic Triazines. Parasitol. Res. 81:207-216.

Hanfler A., Kleuvers B. and Goringer H.U. (1990). The Involvement of Base 1054 in 16S rRNA for UGA Stop Codon Dependent Translational Termination. Nucleic Acids Res. 18:5625-5632.

Harris E.H., Boynton J.E. and Gillham N.W. (1994). Chloroplast Ribosomes and Protein Synthesis. Microbiological Reviews 58:700-754.

Ill Harris E.H., Burkhart B.D., Gillham N.W. and Boynton J.E. (1989). Antibiotic

Resistance Mutations in the Chloroplast 16S and 23S rRNA Genes of Chlamydomonas reinhardtii'. Correlation of Genetic and Physical Maps of the Chloroplast Genome.

Genetics 123:281-292.

Harder A. and Haberkorn A. (1989). Possible Mode of Action of Toltrazuril: Studies on two Eimeria Species and Mammalian and Ascaris suum Enzymes. Parasitol. Res.

76:8-12.

Harlow E. and Lane D. (1988). Antibodies: A Laboratory Manual. Cold Spring Harbor

Laboratory.

Heckman J.E., Sarnoff J., Alzner-DeWeerd B., Yin S. and RajBhandary U.L.

(1980). Novel features in the Genetic Code and Codon Reading Patterns in Neurospora crassa Mitochondria based on Sequences of Six Mitochondrial tRNAs. Proc. Natl.

Acad. Sci. U.S.A. 77:3159-3163.

Hibberd D.J. (1990). Handbook of Protoctista (Margulis L., Corliss J.O., Melkonian

M. and Chapman D.J. eds.). Jones and Bartlett, Boston:288-292.

Higgins D.G. and Sharp P.M. (1989). Fast and Sensitive Multiple Sequence

Alignments on a Microcomputer. CABIOS 5:151-153.

Hirsh D. (1971). Tryptophan Transfer RNA as the UGA Suppressor. J. Mol. Biol.

58:439-458.

Hirsh D. and Gold L. (1971). The Translation of the UGA Triplet in vitro by

Tryptophan Transfer RNA's. J. Mol. Biol. 58:459-468.

Holloway S.P., Min W. and Inselburg J.W. (1994). Isolation and Characterization of a

Chaperonin-60 Gene of the Human Malaria Parasite Plasmodium falciparum. Mol.

Biochem. Parasitol. 64:25-32.

Howe C.J. and Smith A.G. (1991). Plants without Chlorophyll. Nature 349:109.

Howe C.J. (1992). Plastid Origin of an Extrachromosomal DNA Molecule from

Plasmodium, the Causative Agent of Malaria. J. Theor. Biol. 158:199-205.

Hrazdina G. and Jensen R.R. (1992). Spatial Organization of Enzymes in Plant Metabolic Pathways.

Ann. Rev. Plant. Physiol. Plant Mol. Biol. 43:241.

Hummel H. and Bock A. (1987). Thiostrepton Resistance Mutation in the Gene for

23S Ribosomal RNA of Halobacteria. Biochimie 69:857-861.

Hyde J.E., Sims P.F.G. and Read M. (1994). Green Roots of Malaria. Parasitol.

Today 10:25.

Israelski D.M. and Remington J.S. (1993). Toxoplasmosis in Patients with .

Clin. Inf. Dis. 17(Suppl.2):S432-S435.

Jaye M., Salle H. de la, Schamber P., Balland A., Kohli V., Findeli A., Tolstoshev P. and Lecocq J.-P. (1983). Isolation of a Human Anti-Haemophilic Factor IX cDNA

Clone Using a Unique 52-Base Synthetic Oligonucleotide Probe Deduced from the

Amino Acid Sequence of Bovine Factor IX. Nucleic Acids Res. 11:2325.

Jeffries A.C. and Johnson A.M. (1996). The Growing Importance of the Plastid-like

DNAs of the Apicomplexa. Int. J. Parasitol. 26:1139-1150.

Jemiolo D.K., Pagel F.T. and Murgola E.J. (1995). UGA Suppression by a Mutant

RNA of the Large Ribosomal Subunit. Proc. Natl. Acad. Sci. U.S.A. 92:12309-12313.

Johnson A.M., Dubey J.P. and Dame J.B. (1986). Purification and Characterization of

Toxoplasma gondii Tachyzoite DNA. Aust. J. Exp. Biol. Med. Sci. 64:351-355.

Johnson A.M., Blana S., Hakendorf P. and Baverstock P.R. (1988). Phylogenetic

Relationships of the Apicomplexan Sarcocystis as Determined by Small Subunit

Ribosomal RNA Comparison. J. Parasitol. 74:847-860.

Joseph J.T., Aldritt S.M., Unnasch T., Puijalon O. and Wirth D.F. (1989).

Characterisation of a Conserved Extrachromosomal Element Isolated from the Avian

Malarial Parasite Plasmodium gallinaceum. Mol. Cell. Biol. 9:3621-3629.

Katlama C. (1990). Evaluation of the Efficacy and Safety of Clindamycin Plus

Pyrimethamine for Induction and Maintenance Therapy of Toxoplasmic Encephalitis in

AIDS. Antimicrob. Agents Chemother. 31:492-496.

Kaneko T., Sato S., Kotani H., Tanaka A., Asamizu E., Nakamura Y., Miyajima N.,

Hirosawa M., Sugiura M., Sasamoto S., Kimura T., Hosouchi T., Matsuno A.,

Muraki A., Nakazaki N., Naruo K., Okumura S., Shimpo S., Takeuchi C., Wada T.,

Watanabe A., Yamada M., Yasuda M. and Tabata S. (1996). Sequence Analysis of the

Genome of the Unicellular Cyanobacterium Synechocystis sp. PCC6803. II. Sequence

Determination of the Entire Genome and Assignment of Potential Protein-Coding

Regions. DNA Res. 3:109-136.

Kiatfuengfoo R., Suthiphongchai T., Prapunwattana P. and Yuthavong Y. (1989).

Mitochondria as the Site of Action of Tetracycline on Plasmodium falciparum. Mol.

Biochem. Parasitol. 43:109-116.

Kilejian A. (1975). Circular Mitochondrial DNA from the Avian Malarial Parasite

Plasmodium lophurae. Biochim. Biophys. Acta. 390:276-284.

Kilejian A. (1991). Spherical Bodies. Parasitol. Today 11:309.

Kjeldgaard M. and Nyborg J. (1992). Refined Structure of Elongation Factor EF-Tu from Escherichia coli. J. Mol. Biol. 223:721-742.

Kohler S., Delwiche C.F., Denny P.W., Tilney L.G., Webster P., Wilson R.J.M.,

Palmer J.D. and Roos D.S. (1997). A Plastid of Probable Green Algal Origin in

Apicomplexan Parasites. Science 275:1485-1489.

Kowallik K.V., Stoebe B., Schaffran I., Kroth-Pancic P. and Freier U. (1995). The

Chloroplast Genome of a Chlorophyll a+c-Containing Alga, Odontella sinensis. Plant

Mol. Biol. Rep. 13:336-342.

Kossel H. (1991). Structure and expression of rRNA genes. NATO ASI Ser. Ser. H.

55:1-17.

Krahenbuhl J.L. and Remington J.S. (1982). The Immunology of Toxoplasma and

Toxoplasmosis, in Immunology of Parasitic Infections (Cohen S., Warren K.S., eds.).

Blackwell Science, Oxford:356-421.

Lagerkvist U. (1978). "Two out of Three": An Alternative Method for Codon Reading.

Proc. Natl. Acad. Sci. USA. 75:1759-1762.

Lagerkvist U. (1981). Unorthodox Codon Reading and the Evolution of the Genetic

Code. Cell 23:305-306.

Laughon B.E., Allaudeen H.S., Becker J.M., Current W.L., Feinberg J., Frenkel

J.K., Hafner R., Hughs W.T., Laughlin C.A. and Meyers J.D. (1991). Summary of the Workshop on Future Directions in Discovery and Development of Therapeutic

Agents for Opportunistic Infections Associated with AIDS. J. Infect. Dis. 164:244-251.

Leffers H., Kjems J., Østergaard L., Larsen N. and Garrett R.A. (1987). Evolutionary

Relationships Amongst Archaebacteria: A Comparative Study of 23S Ribosomal RNAs of a Sulphur-Dependent Extreme Thermophile, an Extreme Halophile and a

Thermophilic Methanogen. J. Mol. Biol. 195:43-61.

Lefort-Tran M., Pouphile M., Freyssinet G. and Pineau B. (1980). Signification

Structurale et Fonctionnelle des Enveloppes Chloroplastidiques d'Euglena. Etude

Immunocytologique et en Cryofracture. J. Ultrastr. Res. 73:44-63.

Leport C., Raffi P., Katlama C., Regnier B., Saimot A.G., Marche C., Vedrenne C. and Vilde J.L. (1988). Treatment of Central Nervous System Toxoplasmosis with

Pyrimethamine-Sulfonamide Combination in 35 Patients with Acquired

Immunodeficiency Syndrome. Am. J. Med. 84:94-100.

Leuckart R. (1879). Die Parasiten des Menschen, 2nd Edition, Winter, Leipzig.

Levine N.D. (1970). Taxonomy of the Sporozoa. J. Parasitol. 56(Sec. II, Part I):208-209.

Levine N.D. (1987). Phylum II. Apicomplexa. An Illustrated Guide to the Protozoa

(Lee J.J., Hutner S.H. and Bovee E.C. eds.). Society of Protozoologists, Lawrence,

Kansas:322-357.

Levine N.D. (1988). Progress in Taxonomy of the Apicomplexan Protozoa. J.

Protozool. 35:518-520.

Li N. and Cattolico R.A. (1987). Chloroplast Genome Characterisation in the Red Alga

Griffithsia pacifica. Mol. Gen. Genet. 209:343-351.

Lindemann K. (1865). Weiteres über Gregarinen. Bull. Soc. Imp. Nat. Moscou

38:381-387.

Ludwig M. and Gibbs S.P. (1989). Evidence that the Nucleomorphs of Chlorarachnion reptans (Chlorarachniophyceae) are Vestigial Nuclei: Morphology, Division and DNA-

DAPI Fluorescence. J. Phycol. 25:385-394.

Luft B.J. and Remington S. (1988). Toxoplasmic Encephalitis. J. Inf. Dis. 157:1-6.

Luft B.J. and Remington S. (1992). Toxoplasmic Encephalitis in AIDS. Clin. Infect.

Dis. 15:211-222.

McCabe R.E. and Remington J.S. (1990). Toxoplasma gondii. Principles and Practice of Infectious Diseases (Mandell G.L., Douglas Jr. R.G. and Bennett J.E. eds.).

Churchill Livingstone, New York:2090-2103.

McClain W.H. (1993). Transfer RNA Identity. FASEB Journal 76:72-77.

McConkey G.A., Rogers M.J. and McCutchan T.F. (1997). Inhibition of Plasmodium falciparum Protein Synthesis. Targeting the Plastid-like Organelle with Thiostrepton. J.

Biol. Chem. 272:2046-2049.

McFadden G. and Gilson P. (1995). Something Borrowed, Something Green: Lateral

Transfer of Chloroplasts by Secondary Endosymbiosis. Trends Ecol. Evol. 10:12-17.

McFadden G.I., Reith M.E., Munholland J. and Lang-Unnasch N. (1996). Plastid in

Human Parasites. Nature 381:482.

Madin S.H. and Darby N.B. (1958). Established Kidney Cell Lines of Normal Adult

Bovine and Ovine Origin. P.S.E.B.M. 98:574-576.

Martin W., Sommerville C.C. and Loiseaux-de Goer S. (1992). Molecular Phylogenies of Plastid Origins and Algal Evolution. J. Mol. Evol. 35:385-403.

Medlin L.K., Cooper A., Hill C., Wrieden S. and Wellbrock U. (1995). Phylogenetic

Position of the Chromista Plastids Based on Small Subunit rRNA Coding Regions.

Curr. Genet. 28:560-565.

Mehlhorn H., Ortmann-Falkenstein E. and Haberkorn A. (1984). The Effects of Sym.

Triazones on Developmental Stages of Eimeria tenella, E. maxima, and E. acervulina. A

Light and Electron Microscopical Study. Z. Parasitenkd. 70:173-182.

Mesters J.R., Zeef L.A.H., Hilgenfeld R., de Graaf J.M., Kraal B. and Bosch L.

(1994). The Structural and Functional Basis for the Kirromycin Resistance of Mutant

EF-Tu Species in Escherichia coli. EMBO J. 13:4877-4885.

Mills J. (1986). Pneumocystis carinii and Toxoplasma gondii Infections in Patients with AIDS. Rev. Inf. Dis. 8:1001-1011.

Modolell J., Carbrer B., Parmeggiani A. and Vazquez D. (1971). Inhibition by

Siomycin and Thiostrepton of both Aminoacyl-tRNA and Factor G Binding to

Ribosomes. Proc. Natl. Acad. Sci. U.S.A. 68:1796-1800.

Moore P.B. (1995). The Structure and Function of 5S Ribosomal RNA. In Ribosomal

RNA: Structure, Evolution, Processing, and Function in Protein Biosynthesis

(Zimmermann R.A. and Dahlberg A.E., eds.). CRC Press, Boca Raton, FL:199-236.

Murgola E.J., Hijazi K.A., Goringer H.U. and Dahlberg A.E. (1988). Mutant 16S

Ribosomal RNA: A Codon Specific Translational Suppressor. Proc. Natl. Acad. Sci.

USA. 85:4162-4165.

Neimark H. and Blaker R.G. (1967). DNA Base Composition of Toxoplasma gondii

Grown in vivo. Nature 216:600.

Ohtaka C. and Ishikawa H. (1993). Accumulation of Adenine and Thymine in a groE-Homologous

Operon of an Intracellular Symbiont. J. Mol. Evol. 36:121-126.

Osawa S., Jukes T.H., Watanabe K. and Muto A. (1992). Recent Evidence for

Evolution of the Genetic Code. Microbiol. Rev. 56:229-264.

Ossorio P.N., Sibley L.D. and Boothroyd J.C. (1991). Mitochondrial-like DNA

Sequences Flanked by Direct and Inverted Repeats in the Nuclear Genome of

Toxoplasma gondii. J. Mol. Biol. 222:525-536.

Palmer J.D. (1992). Green Ancestry of Malarial Parasites? Curr. Biol. 2:318-320.

Palmer J.D. (1985). Comparative Organisation of Chloroplast Genomes. Ann. Rev.

Genet. 19:325-354.

Parker J., Watson R.J. and Friesen J.D. (1976). A Relaxed Mutant with an Altered

Ribosomal Protein L11. Mol. Gen. Genet. 144:111-114.

Perrotto J., Keister D.B. and Gelderman A.H. (1971). Incorporation of Precursors into

Toxoplasma gondii DNA. J. Protozool. 18:470-473.

Pfefferkorn E.R. and Borotz S.E. (1994). Comparison of Mutants of Toxoplasma gondii Selected for Resistance to Azithromycin, Spiramycin or Clindamycin.

Antimicrob. Agents Chemother. 38:31-37.

Pietrokovski S. (1994). Conserved Sequence Features of Inteins (Protein Introns) and their Use in Identifying New Inteins and Related Proteins. Protein Science 3:2340-2350.

Preiser P., Williamson D.H. and Wilson R.J.M. (1995). tRNA Genes Transcribed from the Plastid-like DNA of Plasmodium falciparum. Nucleic Acids Res. 23:4329-4336.

Preiser P.R., Wilson R.J.M., Moore P.W., McCready S., Hajibagheri M.A.N., Blight

K.J., Strath M. and Williamson D.H. (1996). Recombination Associated with

Replication of Malarial Mitochondrial DNA. EMBO J. 15:684-693.

Pukrittayakamee S., Viravan C., Charoenlarp P., Yeaput C., Wilson R.J.M. and White

N.J. (1994). Antimalarial Effects of Rifampin in Plasmodium vivax Malaria.

Antimicrob. Agents Chemother. 38:511-514.

Robson K.J.H., Gamble Y. and Acharya K.R. (1993). Molecular Modelling of Malaria

Calmodulin Suggests that it is not a Suitable Target for Novel Antimalarials. Philos.

Trans. R. Soc. London, Series B 340:39-53.

Rosendahl G. and Douthwaite S. (1994). The Antibiotics Micrococcin and Thiostrepton

Interact Directly with 23S rRNA Nucleotides 1067A and 1095A. Nucleic Acids Res.

22:357-363.

Ryan P.C., Lu M. and Draper D.E. (1991). Recognition of the Highly Conserved

GTPase Centre of 23S Ribosomal RNA by Ribosomal Protein L11 and the Antibiotic

Thiostrepton. J. Mol. Biol. 221:1257-1268.

Saitou N. and Nei M. (1987). The Neighbor-Joining Method: A New Method for

Constructing Phylogenetic Trees. Mol. Biol. Evol. 4:406-425.

Sanger F., Nicklen S. and Coulson A.R. (1977). DNA Sequencing with Chain

Terminating Inhibitors. Proc. Natl. Acad. Sci. U.S.A. 74:5463.

Schnepf E. (1993). Origins of Plastids (Lewin R.A. ed.). Chapman and Hall, New

York:53-76.

Shashidhara L.S., Lim S.H., Shackleton J.B., Robinson C. and Smith A.G. (1992).

Protein Targeting Across the Three Membranes of the Euglena Chloroplast Envelope. J.

Biol. Chem. 267:12885-12891.

Shirley M.W. (1992). Research on Avian Coccidia: an Update. Br. Vet. J. 148:479-499.

Sibler A-P., Dirheimer G. and Martin R.P. (1986). Codon Reading Patterns in

Saccharomyces cerevisiae Mitochondria Based on sequences of Mitochondrial tRNAs.

FEBS Letters 194:131-138.

Siddall M E. (1992). Hohlzylinders. Parasitol. Today 8:90-91.

Smith A.G. (1988). Subcellular-Localisation of 2 Porphyrin-Synthesis Enzymes in

Pisum-sativum (Pea) and Arum (Cuckoo-pint). Biochem. J. 249:423-428.

Smith I., Paress P., Cabane K. and Dubnau E. (1980). Genetics and Physiology of the rel System of Bacillus subtilis. Mol. Gen. Genet. 178:271-279.

Sosio M., Amati G., Cappellano C., Sarubbi E., Monti F. and Donadio S. (1996). An

Elongation Factor Tu (EF-Tu) Resistant to the EF-Tu Inhibitor GE2270 in the

Producing Organism Planobispora rosea. Mol. Microbiol. 22:43-51.

Sprinzl M., Dank N., Nock S. and Schon A. (1991). Compilation of tRNA Sequences and Sequences of tRNA Genes. Nucleic Acids Res. 19(suppl.):2127-2171.

Staden R. (1980). A Computer Program to Search for tRNA Genes. Nucleic Acids

Res. 8:817-825.

Stark M.J.R. and Cundliffe E. (1979). On the Biological Role of Ribosomal Protein BM-L11

of B. megatherium, Homologous with E. coli Ribosomal Protein L11. J. Mol.

Biol. 134:767-779.

Steinberg S., Misch A. and Sprinzl M. (1993). Compilation of tRNA Sequences and

Sequences of tRNA Genes. Nucleic Acids Res. 21:3011-3015.

Stern D.B. and Lonsdale D.M. (1982). Mitochondrial and Chloroplast Genomes of

Maize have a 12-kilobase DNA sequence in common. Nature 299:698-702.

Stokkermans T.J.W., Swartzman J.D., Keenan K., Morrissette N.S., Tilney L.G. and

Roos D.S. (1996). Inhibition of Toxoplasma gondii Replication by Dinitroaniline

Herbicides. Exp. Parasitol. 84:355-370.

Strath M., Scott-Finnigan T., Gardner M., Williamson D. and Wilson I. (1993).

Antimalarial Activity of Rifampicin in vitro and in Rodent Models. Trans. R. Soc.

Trop. Med. Hyg. 87:211-216.

Sueoka N. (1988). Directional Mutation Pressure and Neutral Molecular Evolution.

Proc. Natl. Acad. Sci. U.S.A. 85:2653-2657.

Surolia N. and Padmanaban G. (1992). De novo Biosynthesis of Heme Offers a New

Chemotherapeutic Target in the Human Malarial Parasite. Biochem. Biophys. Res.

Comm. 187:744-750.

Tate W.P., Dognin M.J., Noah M., Stöffler-Meilicke M. and Stöffler G. (1984). The

NH2-Terminal Domain of Escherichia coli Ribosomal Protein L11: Its Three-Dimensional

Location and its Role in the Binding of Release Factors 1 and 2. J. Biol.

Chem. 259:7317-7324.

Taylor G.W., Wolfe K.H., Morden C.W., dePamphilis C.W. and Palmer J.D. (1991).

Lack of a Functional Plastid tRNA(Cys) Gene is Associated with Loss of Photosynthesis in a Lineage of Parasitic Plants. Curr. Genet. 20:515-518.

Tenant-Flowers M., Boyle M.J., Carey D., Marriott D.J., Penny R. and Cooper D.A.

(1991). Sulfadiazine Desensitization in Patients with AIDS and Cerebral

Toxoplasmosis. AIDS 5:311-315.

Thatcher T.H. and Gorovsky M.A. (1994). Phylogenetic Analysis of the Core

Histones H2A, H2B, H3 and H4. Nucleic Acids Res. 22:174-179.

Thompson J., Cundliffe E. and Stark M. (1979). Binding of Thiostrepton to a

Complex of 23S rRNA with Ribosomal Protein L11. Eur. J. Biochem. 98:261-265.

Thompson J., Cundliffe E. and Dahlberg A.E. (1988). Site-directed Mutagenesis of

Escherichia coli 23S Ribosomal RNA at Position 1067 within the GTP Hydrolysis

Centre. J. Mol. Biol. 203:457-465.

Tomavo S. and Boothroyd J.C. (1995). Interconnection between Organellar Functions,

Development and Drug Resistance in the Protozoan Parasite, Toxoplasma gondii.

Internat. J. Parasitol. 25:1293-1299.

Trager W. and Jensen J.B. (1976). Human Malaria Parasites in Continuous Culture.

Science 193:673-675.

Turner S., Burger-Wiersma T., Giovannoni S.J., Mur R.L. and Pace N.R. (1989).

The Relationship of the Prochlorophyte Prochlorothrix hollandica to Green

Chloroplasts. Nature 337:380-382.

Uchiumi T., Wada A. and Kominami R. (1995). A Base Substitution within the

GTPase-associated Domain of Mammalian 28S Ribosomal RNA Causes High

Thiostrepton Accessibility. J. Biol. Chem. 270:29889-29893.

Vaidya A.B., Akella R. and Suplick K. (1989). Sequences Similar to Genes for Two

Mitochondrial Proteins and Portions of Ribosomal RNA in Tandemly Arrayed 6-kilobase-pair

DNA of a Malarial Parasite. Mol. Biochem. Parasitol. 35:97-107.

Vaidya A.B., Morrisey J., Plowe C.V., Klasow D.C. and Wellems T.E. (1993).

Unidirectional Dominance of Cytoplasmic Inheritance in Two Genetic Crosses

of Plasmodium falciparum. Mol. Cell. Biol. 13:7349-7357.

Vivier E. and Desportes I. (1990). Handbook of Protoctista (Margulis L., Corliss J.O.,

Melkonian M. and Chapman D.J. eds.). Jones and Bartlett, Boston:549-573.

Wallace D.C. (1994). Mitochondrial Sequence Variation in Human Evolution and

Disease. Proc. Natl. Acad. Sci. U.S.A. 91:8739-8746.

Wallsgrove R.M. (1991). Plastid genes and Parasitic Plants. Nature 350:664.

Wang C.C. (1978). The Prokaryotic Characteristics of Eimeria tenella Ribosomes.

Comp. Biochem. Physiol. 61B:571-579.

Watanabe M.M., Takeda Y., Sasa T., Inouye I., Suda S., Sawaguchi T. and Chihara

M. (1987). A Dinoflagellate with Chlorophylls a and b. Morphology, Fine Structure of the Chloroplast and Chlorophyll Composition. J. Phycol. 23:382-389.

Watanabe M.M., Suda S., Inouye I., Sawaguchi T. and Chihara M. (1990).

Lepidodinium viride gen. et sp. nov. (, Dinophyta), a Green

Dinoflagellate with a Chlorophyll a- and b-containing endosymbiont. J. Phycol.

26:741-751.

Waters A.P. (1994). The Ribosomal RNA Genes of Plasmodium. Adv. Parasitol.

34:33-79.

Waters A.P., White W. and McCutchan T.F. (1995). The Structure of the Large

Subunit rRNA Expressed in Blood Stages of Plasmodium falciparum. Mol. Biochem.

Parasitol. 72:227-237.

Weber J.L. (1988). Molecular Biology of Malaria Parasites. Exp. Parasitol. 66:143-

170.

Weeden N.F. (1981). Genetic and Biochemical Implications of the Endosymbiotic

Origin of the Chloroplast. J. Mol. Evol. 17:133-139.

Weijland A., Harmark K., Cool R.H., Anborgh P.H. and Parmeggiani A. (1992).

Elongation Factor Tu: A Molecular Switch in Protein Biosynthesis. Mol. Microbiol.

6:683-688.

Wilcox L.W. and Wedemayer G.J. (1984a). Gymnodinium acidotum Nygaard

(Pyrophyta), a Dinoflagellate with an Endosymbiotic Cryptomonad. J. Phycol. 20:236.

Wilcox L.W. and Wedemayer G.J. (1984b). Dinoflagellate with Blue Green

Chloroplasts Derived from an Endosymbiotic Eukaryote. Science 227:192-194.

Williamson D.H. and Fennell D.J. (1975). The use of a Fluorescent DNA-Binding

Agent for Detecting and Separating Yeast Mitochondrial DNA. Methods Cell. Biol.

12:335-351.

Williamson D.H., Wilson R.J.M., Bates P.H., McCready S., Ferler F. and Qiang B.

(1985). Nuclear and Mitochondrial DNA of the Primate Malarial Parasite Plasmodium knowlesi. Mol. Biochem. Parasitol. 14:199-209.

Williamson D.H., Gardner M.J., Preiser P., Moore D.J., Rangachari K. and Wilson

R.J.M. (1994). The Evolutionary Origin of the 35 kb Circular DNA of Plasmodium falciparum: New Evidence Supports a Possible Rhodophyte Ancestry. Mol. Gen.

Genet. 243:249-252.

Wilson R.J.M., Gardner M.J., Feagin J.E. and Williamson D.H. (1991). Have

Malaria Parasites Three Genomes? Parasitol. Today 7:134-136.

Wilson R.J.M., Fry M., Gardner M.J., Feagin J.E. and Williamson D.H. (1992).

Subcellular Fractionation of the Two Organellar DNAs of Malaria Parasites. Curr.

Genet. 21:405-408.

Wilson R.J.M., Gardner M.J., Rangachari K. and Williamson D.H. (1993).

Extrachromosomal DNA in the Apicomplexa. Toxoplasmosis (Smith J.E., ed.). NATO

ASI Series, Vol. H 78:51-59.

Wilson R.J.M., Williamson D.H. and Preiser P. (1994). Malaria and Other

Apicomplexans: The "Plant" Connection. Inf. Agents and Dis. 3:29-37.

Wilson R.J.M., Denny P.W., Preiser P.R., Rangachari K., Roberts K., Roy A.,

Whyte A., Strath M., Moore P.W. and Williamson D.H. (1996). Complete Gene Map of the Plastid-like DNA of the Malaria Parasite Plasmodium falciparum. J. Mol. Biol.

261:155-172.

Wilson R.J.M. and Williamson D.H. (1997). Extrachromosomal DNA in the

Apicomplexa. Microbiol. Mol. Biol. Revs. 61:1-16.

Wimpee C.F., Morgan R. and Wrobel R.L. (1992). Loss of Transfer RNA Genes from the Plastid 16S-23S Ribosomal RNA Gene Spacer in a Parasitic Plant. Curr.

Genet. 21:417-422.

Wolfe K.H., Morden C.W. and Palmer J.D. (1992a). Function and Evolution of a

Minimal Plastid Genome from a Non-Photosynthetic Parasitic Plant. Proc. Natl. Acad.

Sci. USA. 89:10648-10652.

Wolfe K.H., Morden C.W., Ems S.C. and Palmer J.D. (1992b). Rapid Evolution of the Plastid Translational Apparatus in a Non-Photosynthetic Plant: Loss or Accelerated

Sequence Evolution of tRNA and Ribosomal Protein Genes. J. Mol. Evol. 35:304-317.

Wolters J. (1991). The Troublesome Parasites: Molecular and Morphological Evidence that Apicomplexa belong to the Dinoflagellate-Ciliate Clade. Biosystems 25:75-83.

World Health Organisation (1973). Chemotherapy of Malaria and Resistance to

Antimalarials. WHO, Geneva: Tech. Rep. 529.

Xing Y. and Draper D.E. (1996). Cooperative Interactions of RNA and Thiostrepton

Antibiotic with Two Domains of Ribosomal Protein L11. Biochemistry 35:1581-1588.

Yang Z. (1994). Maximum Likelihood Phylogenetic Estimation from DNA Sequences with Variable Rates over Sites: Approximate Methods. J. Mol. Evol. 39:306-314.

Yang Z. (1995). Phylogenetic Analysis by Maximum Likelihood (PAML), Version

1.1. Institute of Molecular Evolutionary Genetics, Pennsylvania State University.

Yang Z. (1996). Among-Site Rate Variation and its Impact on Phylogenetic Analysis.

Trends Ecol. Evol. 11:367-372.

Yoshinaga K., Iinuma H., Masuzawa T. and Ueda K. (1996). Extensive RNA Editing of U to C in Addition to C to U Substitution in the rbcL Transcripts of Hornwort

Chloroplasts and the Origin of RNA Editing in Green Plants. Nucleic Acids Res.

24:1008-1014.

Yoshionari S., Koike T., Yokogawa T., Nishikawa K., Ueda T., Miura K. and

Watanabe K. (1994). Existence of Nuclear-Encoded 5S-rRNA in Bovine Mitochondria.

FEBS Letters 338:137-142.

Zerfass K. and Beier H. (1992). The Leaky UGA Termination Codon of Tobacco

Rattle Virus RNA is Suppressed by Tobacco Chloroplast and Cytoplasmic tRNAs(Trp) with CmCA Anticodon. EMBO Journal 11:4167-4173.

APPENDIX: PUBLICATIONS

Wilson R.J.M., Denny P.W., Preiser P.R., Rangachari K., Roberts K., Roy A.,

Whyte A., Strath M., Moore P.W. and Williamson D.H. (1996). Complete Gene Map of the Plastid-like DNA of the Malaria Parasite Plasmodium falciparum. J. Mol. Biol.

261:155-172.

Kohler S., Delwiche C.F., Denny P.W., Tilney L.G., Webster P., Wilson R.J.M.,

Palmer J.D. and Roos D.S. (1997). A Plastid of Probable Green Algal Origin in

Apicomplexan Parasites. Science 275:1485-1489.

Clough B., Strath M., Preiser P., Denny P. and Wilson R.J.M. (1997). Thiostrepton

Binds to Malarial Plastid rRNA. FEBS Letters 406:123-125.

FEBS 18386 FEBS Letters 406 (1997) 123-125

Thiostrepton binds to malarial plastid rRNA

Barbara Clough*, Malcolm Strath, Peter Preiser, Paul Denny, Iain (R.J.M.) Wilson

National Institute for Medical Research, Mill Hill, London NW7 1AA, UK

Received 13 February 1997

Abstract

Binding of the thiazolyl peptide antibiotic thiostrepton to the GTPase domain of 23S rRNA involves a few crucial nucleotides, notably A1067 (E. coli). Small RNA transcripts were prepared corresponding to the GTPase domain of the plastid 23S rRNA and the two forms of cytosolic 28S rRNAs found in the human malaria parasite Plasmodium falciparum, as well as the plastid form of rRNA of the AIDS-related pathogen Toxoplasma gondii. Binding affinities of the wild type and mutated RNA sequences were as predicted; the malarial plastid sequence had by far the highest affinity, whereas that from toxoplasma did not bind thiostrepton. © 1997 Federation of European Biochemical Societies.

Key words: Malaria; Toxoplasma; Thiostrepton; GTPase domain; rRNA

*Corresponding author.

0014-5793/97/$17.00 © 1997 Federation of European Biochemical Societies. All rights reserved. PII S0014-5793(97)00241-X

1. Introduction

Like plants and algae, apicomplexan parasites (malaria, toxoplasma etc.) carry two DNA-containing organelles, a mitochondrion (mt) and a plastid (pl) [1,2]. The plastid, enveloped by several membranes, is vestigial and probably of ancient endosymbiotic origin, but it is also demonstrably active and could be a new target for chemotherapeutic agents [3]. Nucleotide (nt) sequences are available for the GTPase domain of the two types of 28S rRNA specified by the nucleus of the human malaria pathogen Plasmodium falciparum (Pf) [4], as well as the pl 23S rRNA genes [6] and the 23S rRNA specified by the mt genome (only a partial sequence, see [5]). These data indicate that the high affinity binding site for the thiazolyl peptide antibiotic thiostrepton, A1067 in E. coli [7-9], is conserved in the GTPase domain encoded by the plastid DNA, but is modified to a G in both nuclear and mitochondrial genomes (Fig. 1). This substitution is predicted to reduce rRNA binding by the antibiotic [8]. We have tested thiostrepton to ascertain whether it inhibits Pf in cultures of human blood and have determined the levels of antibiotic binding to nucleus and plastid forms of Pf 28/23S rRNA.

Fig. 1. Sequence of the GTPase region of the plastid 23S rRNAs of Plasmodium falciparum (Pf) and Toxoplasma gondii (Toxo) (numbers based on E. coli), showing substitution sites (circled) affecting the binding of thiostrepton [7]. The corresponding nucleotides in Pf cytosolic 28S rRNA and Pf mitochondrial 23S rRNA are indicated.

2. Materials and methods

Blood cultures of the malaria parasite P. falciparum were maintained in vitro and the uptake of radioactive tracers in the presence and absence of thiostrepton was determined by methods described previously [10].

Short transcripts of wild type rRNA (corresponding to the GTPase domain of Pf 23S rRNApl, nt 987-1078) were transcribed in vitro from a PCR product that included a T7 promoter sequence in one of the primers. Mutated malarial rRNApl sequences (E. coli numbers A1067U and A1067G) were obtained by PCR methodology and transcribed in the same way. Transcripts of both types of Pf 28S rRNA also were prepared from PCR products, in this case using either total DNA or RNA as template, a reverse transcription reaction with random primers being carried out first in the latter case. All wild type and modified transcript sequences were verified prior to thiostrepton binding assays. A positive control transcript was used based on the 23S rRNA sequence of E. coli with a mutation (U1061A) that increases stability and binding (kindly supplied by Dr P. Ryan, Johns Hopkins University, Baltimore, MD). Thiostrepton filter-binding assays were performed as previously described [8] except that transcripts were labelled with 32P and the KCl buffer was replaced with NH4Cl [11].

3. Results and discussion

Reproducible inhibition of uptake of both [3H]hypoxanthine and [14C]isoleucine was found with P. falciparum in blood cultures (50% inhibition at 3-5 µM thiostrepton ml-1 respectively, see Fig. 2A). As shown in Fig. 2B, onset of inhibition of protein synthesis by thiostrepton was more rapid (5 h) than by tetracycline (8 h). Specificity was demonstrated by the lack of effect of viomycin (data not shown), an unrelated antibiotic that also can inhibit translocation [12].

Fig. 2. Thiostrepton inhibition of Plasmodium falciparum in blood cultures. A: Dose-response curve. B: Inhibition of incorporation of [14C]isoleucine in cultures following addition (arrow) of 10 µM thiostrepton ml-1 (+), or 100 µM tetracycline ml-1 (o) at 23 h. Untreated control (triangle). [Plots not reproduced. Panel A: x-axis, Thiostrepton (µM); curves for hypoxanthine and isoleucine. Panel B: x-axis, Incubation time (hrs).]

Evidence that the highest affinity interaction of thiostrepton is with 23S rRNApl was obtained from a thiostrepton filter-binding assay [8]. Fig. 3A shows that the mutation Pfpl (E. coli number A1067U) markedly reduced thiostrepton binding (~14% of wild type). An intermediate level of binding (~35% of wild type) was obtained with the mutation Pfpl (E. coli number A1067G). Thiostrepton binding to transcripts corresponding to the GTPase domain (nt 1334-1427) of both the sporozoite and erythrocytic forms of Pf 28S rRNA [4] were 10% of that for Pf 23S rRNApl. These data show that the nts crucial for thiostrepton binding to Pf 23S rRNA are as predicted from E. coli, and that the plastid form has the highest binding affinity. It is not yet possible to reconstruct the complete sequence for the GTPase domain of the 23S rRNAmt [5], however, the fragments of sequence available do not have the critical consensus nts required for thiostrepton binding at positions A1067 and A1095 (E. coli numbers) (see Fig. 1).

In a similar way, we tested a transcript corresponding to the GTPase domain of the 23S rRNApl of Toxoplasma gondii (Tg), a related apicomplexan that is an important opportunistic pathogen in patients with AIDS. In this case, the wild type sequence has a substitution at a different site (E. coli number A1077U) (see Fig. 1), that inhibits binding by thiostrepton in E. coli [8]. This was found also to be the case with a transcript derived from a PCR product covering the GTPase domain of Tgpl 23S rRNA (nt 926-1024) (Fig. 3B). Corrective mutation of the Tgpl transcript (E. coli number U1077A) conferred a significant increase (×5) in thiostrepton binding (Fig. 3B).

These thiostrepton binding studies constitute the first direct evidence that components of the malarial plastid organelle could be preferentially targeted by drugs. The results complement earlier studies [13,14] which inferred that toxoplasma's 23S rRNApl might be the target of the macrolide antibiotic, clindamycin, acting at a different effector site.

Acknowledgements

This work was supported by the UNDP/World Bank/WHO Special Programme for Research in Tropical Diseases (TDR). PD was supported by an MRC Studentship.

References

[1] J.E. Feagin, Annu. Rev. Microbiol. 48 (1994) 81-104.

[2] R.J.M. Wilson, et al. J. Mol. Biol. 261 (1996) 155-172.

[3] G.I. McFadden, M.E. Reith, J. Munholland, N. Lang-Unnasch, Nature 381 (1996) 482.

[4] M.J. Rogers, R.R. Gutell, S.H. Damberger, J. Li, G.A. McConkey, A.P. Waters, T.F. McCutchan, RNA 2 (1996) 134-145.

[5] J.E. Feagin, B.L. Mericle, E. Werner, M. Morris, Nucleic Acids Res. 25 (1997) 438-446.

[6] J.E. Feagin, Mol. Biochem. Parasitol. 52 (1992) 145-148.

[7] J. Thompson, E. Cundliffe, Biochimie 73 (1991) 1131-1135.

[8] P.C. Ryan, M. Lu, D.E. Draper, J. Mol. Biol. 221 (1991) 1257-1268.

[9] G. Rosendahl, S. Douthwaite, Nucleic Acids Res. 22 (1994) 357-363.

[10] M. Strath, T. Scott-Finnigan, M. Gardner, D. Williamson, I. Wilson, Trans. R. Soc. Trop. Med. Hyg. 87 (1993) 211-216.

[11] Y.-X. Wang, M. Lu, D.E. Draper, Biochemistry 32 (1993) 12279-12282.

[12] U.R. Kutay, C.M.T. Spahn, K.H. Nierhaus, Biochim. Biophys. Acta 1050 (1990) 193-196.

[13] E.R. Pfefferkorn, S.E. Borotz, Antimicrob. Agents Chemother. 38 (1994) 31-37.

[14] C.J.M. Beckers, D.S. Roos, R.G.K. Donald, B.J. Luft, J.C. Schwab, C. Yang, K.A. Joiner, J. Clin. Invest. 95 (1995) 367-376.

Clough et al. / FEBS Letters 406 (1997) 123-125

[Fig. 3 plots, panels A and B. x-axis: [Thiostrepton] (µM), 0 to 20. Y-axis labels and data points not recoverable.]

Fig. 3. Thiostrepton binding curves [8] (mean of duplicates, bar = range). A: Short 23S transcripts of P. falciparum (Pf) wild type rRNApl (o, A1067) and mutated forms (□, A1067U and a, A1067G), as well as Pf 28S rRNA sporozoite-type transcripts (a) and erythrocytic-type (■), compared with an optimized E. coli control transcript (•). All nucleotide numbers correspond to E. coli for convenience. B: T. gondii wild type rRNApl transcript (a) and mutated transcript (a) compared with control transcripts from P. falciparum rRNApl (O) and E. coli (•).

Reprinted from J. Mol. Biol. (1996) 261, 155-172

Complete Gene Map of the Plastid-like DNA of the Malaria Parasite Plasmodium falciparum

R. J. M. (Iain) Wilson*, Paul W. Denny, Peter R. Preiser, Kaveri Rangachari, Kate Roberts, Anjana Roy, Andrea Whyte, Malcolm Strath, Daphne J. Moore, Peter W. Moore and Donald H. Williamson

J. Mol. Biol. (1996) 261, 155-172


National Institute for Medical Research, Mill Hill, London NW7 1AA, UK

Malaria parasites, and other parasitic protists of the Phylum Apicomplexa, carry a plastid-like genome with greatly reduced sequence complexity. This 35 kb DNA circle resembles the plastid DNA of non-photosynthetic plants, encoding almost exclusively components involved in gene expression. The complete gene map described here includes genes for duplicated large and small subunit rRNAs, 25 species of tRNA, three subunits of a eubacterial RNA polymerase, 17 ribosomal proteins, and a translation elongation factor. In addition, it codes for an unusual member of the Clp family of chaperones, as well as an open reading frame of unknown function found in red algal plastids. Transcription is polycistronic. This plastid-like DNA molecule is conserved in several genera of apicomplexans and is conjectured to have been acquired by an early progenitor of the Phylum by secondary endosymbiosis. The function of the organelle (plastid) carrying this DNA remains obscure, but appears to be specified by genes transferred to the nucleus.

© 1996 Academic Press Limited

Keywords: evolution; malaria; non-photosynthetic plastids; plastid DNA

*Corresponding author

Introduction

Like plants, malaria parasites (Plasmodium spp.) carry two organellar DNAs; one is mitochondrial, the other is a 35 kb circular molecule resembling the remnant of an algal plastid genome (Wilson et al., 1994). To account for the plastid-like DNA (plDNA), we proposed that an early progenitor of the Phylum acquired an algal plastid by secondary endosymbiosis (Williamson et al., 1994). In keeping with this hypothesis, the plDNA occurs in several genera of apicomplexans besides Plasmodium (Wilson et al., 1993). Exploratory studies have shown that the circular DNA does not co-fractionate with the mitochondrion (Wilson et al., 1992), but resides in another intracellular compartment: a presumed vestigial plastid, which is transmitted uniparentally by the macrogamete (female) in sexual reproduction (Vaidya et al., 1993; Creasey et al., 1994).

The reduced malarial plastid genome is only half the size of the other two best known vestigial plastomes, those of Astasia longa (a non-photosynthetic euglenoid; Gockel et al., 1994) and Epifagus virginiana (a parasitic, non-photosynthetic higher plant; Wolfe et al., 1992), and it has a notably different gene content from both of these. Yet the transcriptional activity of the malarial plDNA and its comprehensive set of tRNA genes (Preiser et al., 1995) imply that it is functional. To obtain more insight into the plDNA's origin and function, we have sequenced the 35 kb circle from the human malarial parasite Plasmodium falciparum (Pf) and obtained a complete gene map.

Approximately half the sequence of the Pf 35 kb circle has been described (see references in Preiser et al., 1995). Here we describe further recently identified genes and give an overview of the complete gene map. Nearly all the genes turn out to be involved with gene expression, including those encoding three subunits of a eubacterial type of RNA polymerase, 17 ribosomal proteins, the elongation factor Tu (EF-Tu), duplicated large and small subunit (LSU and SSU) rRNAs, and 25 tRNAs, nine of which are duplicated. In addition, there is an open reading frame (ORF) encoding a putative regulatory subunit similar to the Clp family of molecular chaperones, as well as a highly conserved ORF of unknown function found in bacteria and "primitive" red algal plastids, and seven small potential ORFs of various sizes. All the genes appear to be transcribed polycistronically. Further evidence for conservation of the 35 kb circle amongst apicomplexans supports our contention that it has a common origin and conserved function.

Abbreviations used: plDNA, plastid-like DNA; Pf, Plasmodium falciparum; EF-Tu, elongation factor Tu; LSU, SSU, large and small subunits; ORF, open reading frame; IR, inverted repeat; rp, ribosomal protein; RT-PCR, reverse transcription-polymerase chain reaction; DAPI, 4',6-diamidino-2-phenylindole; Tg, Toxoplasma gondii; Et, Eimeria tenella; Ta, Theileria annulata.

0022-2836/96/320155-18 $18.00/0 © 1996 Academic Press Limited

Results

Physical features

A continuous sequence of 34,682 nt has been constructed, corresponding to the "35 kb" circular DNA of Pf. A small stretch, estimated to be only tens of nucleotides in length, remains unsequenced in the centre of the rDNA inverted repeat. This repeat has the propensity to form a large cruciform structure (Wilson et al., 1993), with arms that can reach 0.5 µm in length, and the torsional constraints on formation of the cruciform have been discussed for the corresponding circular molecule from the related apicomplexan parasite Toxoplasma gondii (Borst et al., 1984).

The high A + T (86.9%) content of the circle sequence is consistent with its buoyancy in caesium chloride gradients: it forms a "light" satellite band in density gradients of DNA from apicomplexans with genomic DNA of moderate G + C richness; for example, P. knowlesi (Williamson et al., 1985). In Pf, separation of the circular DNA from the almost equally A + T-rich genomic DNA (82%; Weber, 1988) requires a multi-step fractionation procedure (Gardner et al., 1988).

Gene content and organization

A gene map of the 35 kb circle of Pf is illustrated in Figure 1 and the genetic content is listed in Table 1. The main features are as follows.

Inverted repeat

The inverted repeat (IR) covers about one third of the circle and encodes duplicated large and small subunit rRNA genes (Gardner et al., 1991a), as well as nine duplicated tRNA genes (Gardner et al., 1994b). Sequence analysis of the rRNA genes showed that they are not closely related to those of the mitochondrion or nucleus (Feagin et al., 1992).

tRNA genes

Downstream of the IRb arm of the inverted repeat there is a cluster of ten tRNA genes, one of which, trnL, carries the only intron recognized so far on the circle (Preiser et al., 1995). Along with the tRNAs encoded within the inverted repeat and six others located in two small groups at remote sites on the circle, a total complement of 25 species has been found. These are believed to be sufficient to provide a minimal but complete set for translation of the protein-encoding genes on the circle (Preiser et al., 1995).

Ribosomal protein genes

A string of 15 putative ribosomal protein (rp) genes forms another prominent feature of the circle, occupying a sector of about 7 kb (Figure 1). The order of rp genes in this large cluster implies a fusion of the S10, spc, alpha and str operons of Escherichia coli, as found in other plastids (Figure 2). Another rp gene, rps4, present in the alpha operon of E. coli, is separately located upstream of the clustered malarial rp genes (Figure 1). In addition, rps2 lies at a distant site immediately downstream of the rpo genes in a location characteristic of plastid DNAs rather than those of cyanobacteria. It is possible that some of the small unidentified ORFs on the circle also correspond to rp genes but, if so, they are extremely divergent. Although several rp genes characteristic of bacterial operons are no longer present on the malarial circle, the retention of gene order is striking when one takes into account the many deletions and rearrangements required to reduce what was presumably a typical plastid genome of some 150 kb or more, to its present small size and selected gene content. The clustered rp genes are closely packed, open reading frames often being separated by no more than 30 nt, or even in two cases overlapping by a single nt. One exception is an overlap of 30 nt between rpl36 and ORF91, which could invalidate the latter.

In the plDNA of the primitive red alga Porphyra purpurea, the S10 operon begins with rpl3 followed by rpl4, the preceding gene of the corresponding bacterial operon (rps10) having been transposed to follow tufA at the end of the str operon (see Figure 2). In Plasmodium, the first rp gene in the S10 series is rpl4, but an open reading frame (ORF78) following the tufA gene shows no apparent similarity to rps10, which presumably has been transferred to the nucleus.

The order of rp genes on the circular DNA has been useful for identification purposes and, likewise, the juxtaposition of rp and other genes occasionally helps to signify the circle's origin. One example is the presence of a tuf gene downstream of rps7 and rps12. Unlike most algal genomes, tufA is encoded in the nucleus of higher plants (Baldauf & Palmer, 1990), so its presence on the malarial circle next to the truncated str operon strengthens our proposal that the circle is of algal origin, like the plDNA of euglenoids, rather than derived from higher plants by lateral transfer. Moreover, like algal pl genomes but unlike those of higher plants, the malarial circle encodes rps5, rps17 and rpl4. On

[Figure 1 image: circular gene map of the P. falciparum plDNA (35 kb), showing the inverted repeat arms IRa and IRb with their LSU and SSU rRNA genes, the rpo, rps, rpl, tufA and clpC genes, and the numbered ORFs.]

Figure 1. Gene map of the 35 kb circular DNA of Plasmodium falciparum. The two halves (A and B) of the inverted repeat (IR) are indicated. tRNA genes are specified by the anticodon, as well as the single letter amino acid code. ORFs specify the number of amino acid residues in the open reading frame. Genes on the outer strand are transcribed clockwise, those on the inner strand anti-clockwise.

the other hand, although the isolated rps4 gene is preceded by trn^ as in P. purpurea, Cyanidium caldarium, Marchantia polymorpha and some higher plants, it is known that different trn genes abut rps4 in other plastomes (Harris et al., 1994). In this instance then, gene order seems fluid.

The rp genes on the malarial circle show typical as well as unusual compositional features. Their nucleotide composition is extremely rich in A + T residues and the codon usage notably biased; consequently, similarity at the peptide level to other plastid rp genes often is borderline. Nonetheless, all the putative peptides have a basic charge as would be predicted for rps (Table 2). As is evident in the 17 pairs of DOT MATRIX plots given in Figure 3 (ten pairs for small subunit rps and seven pairs for large subunit rps), identification has relied for the most part on small stretches of conserved sequence indicating regions of global similarity to equivalent peptides from E. coli, a plastid, or in one case a mycoplasma source. It is interesting that most of the rps encoded on the 35 kb plDNA are important in the initial assembly of the 30 S subunit (Wittmann, 1983).

Table 1. Gene content of the 35 kb circular DNA of P. falciparum

Class                Genes
Ribosomal RNA        16 S, 23 S
Transfer RNA (a)     A(UGC) C(GCA) D(GUC) E(UUC) F(GAA) G(ACC) G(UCC) H(GUG) I(GAU) K(UUU) L(UAG) L(UAA)* M(CAU) M(CAU) N(GUU) P(UGG) Q(UUG) R(UCU) R(ACG) S(GCU) S(UGA) T(UGU) V(UAC) W(CCA) Y(GUA)
Ribosomal proteins   rps 2, 3, 4, 5, 7, 8, 11, 12, 17, 19
                     rpl 2, 4, 6, 14, 16, 23, 36
RNA polymerase       rpoB, C1, C2
Other proteins       clpC, tufA, ORF470
Unassigned ORFs      51, 78, 79, 91, 101, 105, 129

(a) Single letter amino acid code and anticodon. Asterisk represents an intron.
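As a cross-check on the tRNA row of Table 1, the amino acid assigned to each anticodon can be recomputed from the genetic code. The following is a minimal sketch, assuming the standard genetic code and strict Watson-Crick pairing between codon and anticodon (wobble pairing is ignored); the function and variable names are illustrative and are not taken from the paper.

```python
# Standard genetic code laid out with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def amino_acid_for_anticodon(anticodon: str) -> str:
    """Amino acid of the codon that pairs exactly with a 5'->3' anticodon."""
    # The codon is the reverse complement of the anticodon.
    codon = "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))
    return CODON_TABLE[codon]


# (amino acid, anticodon) pairs as listed in Table 1.
TABLE_1_TRNAS = [
    ("A", "UGC"), ("C", "GCA"), ("D", "GUC"), ("E", "UUC"), ("F", "GAA"),
    ("G", "ACC"), ("G", "UCC"), ("H", "GUG"), ("I", "GAU"), ("K", "UUU"),
    ("L", "UAG"), ("L", "UAA"), ("M", "CAU"), ("M", "CAU"), ("N", "GUU"),
    ("P", "UGG"), ("Q", "UUG"), ("R", "UCU"), ("R", "ACG"), ("S", "GCU"),
    ("S", "UGA"), ("T", "UGU"), ("V", "UAC"), ("W", "CCA"), ("Y", "GUA"),
]

# Entries whose listed amino acid disagrees with the decoded anticodon.
mismatches = [
    (aa, anticodon)
    for aa, anticodon in TABLE_1_TRNAS
    if amino_acid_for_anticodon(anticodon) != aa
]
print(len(TABLE_1_TRNAS), mismatches)  # prints: 25 []
```

Under these assumptions all 25 entries decode to the amino acid shown beside them, which matches the total of 25 tRNA species reported in the text.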

Features of interest in a selection of rp genes are as follows.

rps2

This maps downstream of the RNA polymerase genes rpoB/C1/C2, as in other plastid genomes. However, it is not followed by atp genes as in other plastids, the 3' end of rps2 marking instead a possible deletion/recombination site and the cross-over point for the directions of transcription from the two arms of the inverted repeat (see Figure 1).

rps4

This is one of the rRNA binding proteins that initiate assembly of the 30 S ribosome. Only the first 20 amino acid residues and a large central portion show any similarity to other versions of this protein (<20% identity with E. coli). The malarial gene has the extreme nucleotide composition of 94% A + T. Of the 209 codons, only six have a C or G in the third position.

rps5

This is poorly conserved, only the central region of the predicted peptide showing similarity to other versions (35% identity with E. coli for this region). However, sites in the E. coli version conferring spectinomycin and streptomycin sensitivity (amino acid residues 20 to 22, and 104 and 112, respectively) are conserved in the predicted malarial peptide.

rps12

This is the best conserved of the malarial small subunit rps, as in other species (50% identity with E. coli). The sites affecting streptomycin resistance or sensitivity in other organisms (TPKKPNSA; GGRVKDLPG; see Harris et al., 1994) are conserved in wild-type form in the malarial sequence. The amino acid composition is less biased than that of other malarial rps (Table 2).

rpl2

This commences with an ATG codon like other plant homologues except for rice and maize, which have an ACG codon edited to AUG at the transcript level (Kossel et al., 1993). The C terminus of the predicted malarial peptide contains the usual block of conserved amino acid residues (DHPHGGG), otherwise the peptide is truncated at both ends.

rpl4

This has not been found in other pl genomes except in the primitive red alga P. purpurea (Reith & Munholland, see Harris et al., 1994).

rpl14

This is relatively well conserved (24% identity with E. coli), the malarial sequence being without the frameshifts found in Epifagus, where it was designated a pseudogene (Wolfe et al., 1992).

rpl16

This is another well conserved rp gene found on all plastids (32% identity with E. coli).

rpl23

This encodes a poorly conserved peptide (<20% identity with E. coli) but the gene appears to be uninterrupted, unlike the pseudogene in Epifagus.

rpl36

This encodes a relatively highly conserved peptide (47% identity with E. coli) despite the open reading frame's marked A + T bias (85%).

Immediately downstream of the rp genes as shown lies a tuf gene encoding the elongation factor Tu (EF-Tu), which will be considered next.

The tufA gene

The predicted peptide is highly divergent, sharing only 45% amino acid identity with the

[Figure 2 image: ribosomal protein gene order in E. coli (top line) and in the plastids of Porphyra, Cyanophora, Plasmodium (including ORF91), Euglena, Marchantia, Nicotiana and Epifagus.]

Figure 2. The order of the large (L) and small (S) subunit ribosomal protein genes clustered on the Plasmodium 35 kb circle is compared with those in the S10, spc, alpha and str operons of E. coli (top line) and those of plastids from the red algae Porphyra purpurea and Cyanophora paradoxa, the photosynthetic protist Euglena gracilis, the lower plant Marchantia polymorpha, the higher plant Nicotiana tabacum and the parasitic higher plant Epifagus virginiana. Open ovoids represent pseudogenes; stippled un-numbered ovoids show genes now known to be nuclear. Other interspersed genes include secY, rpoA and tufA. Modified from Harris et al. (1994).

tuf genes of E. coli and 51% identity with Anacystis nidulans and Euglena gracilis. Nevertheless, several highly conserved functional domains are evident, including the four clusters of residues in domain I involved in GTP binding. As shown in Figure 4, these segments in E. coli carry the consensus elements, G19HVDHGK25; D83CPG86; N138KCD141; and S176AL178, involved in binding the phosphoryl, Mg2+, and guanine residues of GTP, respectively (Kjeldgaard & Nyborg, 1992). In the malarial sequence there is only one substitution, C140E. The residues defining the GDP binding pocket also are conserved (in E. coli G24, N138, K139, D141, S176, L178). Despite the tufA gene's high A + T content, it encodes one of the best conserved proteins specified by the circle. In a less well conserved region topologically close to the GTP binding domain (amino acid residues 183 to 192), the malarial sequence has a specific insertion like other plastid versions of EF-Tu. These residues form a loop barely discernible in the mitochondrial equivalent (tufM) of Saccharomyces cerevisiae and Homo sapiens (Nagata et al., 1983; Wells et al., 1995) and absent from E. coli.

The clp gene

An unusual member of the Clp family of molecular chaperones is found downstream of the tuf gene (Figure 1). It corresponds by sequence similarity to the double nt-binding, regulatory forms of Clp rather than the single nt-binding subfamilies ClpX and Y (Gottesman et al., 1993), yet only the second of the two ATP-binding domains is conserved in the predicted malarial peptide. Alignments of amino acids from double nt-binding subunits of Clp proteins showed little similarity with the malarial sequence throughout the first ATP-binding domain (Figure 5). In addition, the ATPase-like consensus sequence Y-x8-T-x13-Y (amino acid residues 460 to 482) was only semi-conserved in the predicted malarial peptide; T468 being substituted by N due to a possible transversion (C to A) in the second position of the codon. By contrast, a high level of similarity was evident in the second nt-binding domain, especially between residues 650 to 750, and in two blocks in the tail region around residues 810 and 870, respectively. It is notable that the residue corresponding to Lys620 in the second nt-binding site of Hsp104 of S. cerevisiae is conserved in the malarial sequence (amino acid 658 in Figure 5); this residue has been implicated in oligomerization of the protein into an adenine nucleotide-dependent hexameric ring (Parsell et al., 1994). Although DOTPLOT comparisons with Clps A, B and C (Figure 6) showed similarity with the malarial sequence only in the C-terminal region, visual inspection of the malarial peptide revealed two leucine-rich blocks of N-terminal sequence, around residues 130 and 210, with a spacing similar to ClpC of the tomato. We interpret this to mean the malarial peptide is a degenerate or specialized form of ClpC. Unlike the latter, the rest of the N-terminal repeat region was not conserved within the malarial peptide, which has diverged considerably from other known ClpC sequences.

The rpo genes

The rpoB gene and part of rpoC have been described as indicators of the plastid origin of the 35 kb circle (Gardner et al., 1991b; Howe, 1992). The complete sequence of rpoC now available shows that it lacks the intron typical of higher plants. Furthermore, the malarial circle gene homologous to E. coli rpoC is split into rpoC1 and rpoC2 as in other plastid and cyanobacterial genomes. But in contrast to all the other genes that we have identified on the malarial circle, the open reading frame of rpoC2 has a frameshift in a poorly conserved central section. The 3' end of the gene commences immediately after this frameshift, the open reading frame beginning with an ATG codon. A gene corresponding to rpoA was not found on the circle.

ORFs

ORF470 encodes a well-conserved protein recorded from the plastids of three red algae (Williamson et al., 1994), as well as the diatom Odontella sinensis (K. Kowallick, personal communication), and Mycobacterium leprae (Pietrokovski, 1994). There are also seven other putative unassigned open reading frames distributed round the circle (Figure 1), all of them small and some possibly spurious; for example, ORF78 is not conserved in the related apicomplexan T. gondii (Paul Denny, unpublished observation). Thus the significance of these small ORFs is questionable and further studies are required to validate them.

Table 2. Biased amino acid composition of ribosomal proteins

rp    aa    F     I    L    K    N    Y    %    pI
S2    224   5     20   15   14   16   12   82   10.1
S3    214   7     18   10   12   17   17   81   9.97
S4    209   6     20   13   15   15   17   86   10.0
S5    239   7     21   13   14   15   13   83   10.4
S7    142   7     23   14   20   11   8    83   10.6
S8    128   11    16   16   17   11   17   88   10.2
S11   132   5     17   11   17   18   12   80   10.4
S12   122   G7*   9    12   23   7    6    64   10.9
S17   74    4     20   9    19   15   14   81   10.0
S19   90    6     18   13   19   19   13   88   10.4
L2    245   G9*   12   7    16   12   12   68   10.3
L4    179   7     19   11   16   17   10   80   10.5
L6    167   8     23   10   13   18   13   85   9.97
L14   114   6     13   7    19   15   11   71   10.3
L16   129   S9*   16   12   20   6    5    68   10.7
L23   75    12    21   8    9    20   16   86   10.2
L36   36    6*    17   6    31   6    0    66   11.1

The most frequent amino acids indicated by single letter amino acid code, F, I, L, K, N and Y, are given as % composition of the rps. rp, ribosomal proteins of small (S) or large (L) subunit; aa, number of predicted amino acids in each protein; pI, calculated isoelectric point.
* Atypical compositions (see the text). G, glycine; S, serine.
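The composition figures of Table 2 are simple residue percentages, and the "%" column appears to be the sum of the six residue columns in each row. A minimal sketch of that calculation follows; the peptide used is a made-up 20-residue string chosen only for illustration (it is not a malarial sequence), and the helper name is an assumption.

```python
from collections import Counter

# The six residues singled out in Table 2.
FREQUENT_RESIDUES = "FILKNY"


def composition_percent(peptide: str, residues: str = FREQUENT_RESIDUES) -> dict:
    """Percentage of each chosen residue in a peptide, rounded to integers."""
    counts = Counter(peptide.upper())
    length = len(peptide)
    return {aa: round(100 * counts[aa] / length) for aa in residues}


# Hypothetical toy peptide, for illustration only.
toy_peptide = "MKNIYKLFNKIYLNKKIFNY"
toy_composition = composition_percent(toy_peptide)
print(toy_composition)
print(sum(toy_composition.values()))  # combined share of the six residues

# Row S2 of Table 2: the six percentages add up to the value in the % column.
s2_row = [5, 20, 15, 14, 16, 12]
print(sum(s2_row))  # prints: 82
```

For the toy peptide the six residues account for 95% of the sequence; the same summation applied to row S2 of Table 2 reproduces the listed value of 82.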

[Figure 3 image: paired dot matrix plots for the small subunit and large subunit ribosomal proteins; individual plot labels not recoverable.]

Figure 3. DOT matrix comparisons of the predicted amino acid sequences of the malarial ribosomal proteins (rp) with those of E. coli, or plastid versions from C. paradoxa (Cyapa), Cryptomonas phi (Cryph), and E. gracilis (Euglena), or with Mycoplasma capricolum (Mycca). For accession numbers see Harris et al. (1994). DOTPLOTS for small subunit (rps), or large subunit (rpl), are in pairs, each gene being specified to the LH side of the plots. In each plot, the malarial sequence (Pf) is on the vertical axis. The numbers shown include gaps inserted for alignment purposes (Needleman & Wunsch, 1970). The window size was 20 amino acid residues with a stringency of 25% (5/20 residues). The similarity diagonal runs from top left to bottom right in each plot.
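The window and stringency parameters quoted in the caption can be illustrated with a short sketch of a sliding-window dot matrix. This is not the DOTPLOT program used for the figure: it assumes that a "match" means an identical residue at the same offset within the two windows, and the function name and toy sequences are hypothetical.

```python
def dot_matrix(seq_vertical, seq_horizontal, window=20, min_matches=5):
    """Return (i, j) window start positions that pass the stringency test.

    A dot is recorded when at least `min_matches` of the `window` aligned
    positions carry identical residues (5 of 20 corresponds to 25%).
    """
    dots = []
    for i in range(len(seq_vertical) - window + 1):
        for j in range(len(seq_horizontal) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(
                    seq_vertical[i:i + window],
                    seq_horizontal[j:j + window],
                )
            )
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Toy check with a short window: a sequence of distinct residues compared
# with itself gives dots only on the main diagonal.
toy = "ACDEFGHIK"
diagonal = dot_matrix(toy, toy, window=4, min_matches=4)
print(diagonal)
```

With real protein pairs the defaults (window of 20, at least 5 matches) correspond to the 25% stringency stated above; similar regions then show up as runs of dots along a diagonal.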

Conservation of the circle

To assess the level of conservation of the 35 kb DNA in clades of apicomplexans widely separated by evolution (Escalante & Ayala, 1995), we used the complete nucleotide sequence from Pf for mapping and sequencing comparisons. Figure 7 summarizes our findings for selected genes from two representative coccidia, namely T. gondii (Tg) and Eimeria tenella (Et), as well as the piroplasm Theileria annulata (Ta). Figure 7a shows that, just as in Pf, a trnT gene and a version of Pf ORF470 lie downstream of one of the LSU rRNA genes in both Tg and Et. Moreover (Figure 7b), a tufA gene flanked by rps7 and rps12 upstream, and trnF downstream (on the opposite strand), form a continuous sequence in Tg and Et as well as Pf. A PILEUP comparison of the predicted amino acid sequences from a block of 273 nt from the tuf genes of Pf, Tg, Et and Ta is shown in Figure 7c. Pf, with the highest A + T content of the tuf sequences (76%; Ta = 71%, Et = 68%, Tg = 68%) and concomitant codon bias, emerged as the most divergent of the four apicomplexans. However, comparison of the predicted Pf peptide with the atypical EF-Tupl sequence from the Charophycean alga Coleochaete orbicularis (probably no longer functional; Baldauf et al., 1990) showed that Pf has only six conservative substitutions in 22 residues that differ in the C. orbicularis EF-Tupl peptide from nearly all of 27 other EF-Tu sequences. This suggests that the functional domains of the Pf peptide have been maintained under selective pressure. A full phylogenetic analysis of the apicomplexan tuf genes in relation to those of other species will be given elsewhere (C. Delwiche et al., Indiana, unpublished results).

Whilst the order of contiguous genes was conserved in each multigenic apicomplexan sequence just described, the intergenic regions differed in length between Pf, Tg and Et. Analysis of the largest intergenic sequences, i.e. trnT-ORF470, rps7-tufA and tufA-trnF, showed greatest similarity between the various parasite genera at the ends nearest the genes, falling off towards the centres (Figure 7d). From these limited data, the apicomplexan plDNA is evidently conserved at

[Figure 4 image: alignment of EF-Tu sequences eftu_pf, eftu_anani, eftu_cryph, eftu_cyapa, eftu_euglena, eftu_ecoli and eftu_yeastmi, positions 1 to 450; residues not recoverable.]

Figure 4. Multiple alignment of predicted EF-Tu peptides: P. falciparum (Pf), A. nidulans (anani), Cry. phi (cryph), Cya. paradoxa (cyapa), E. gracilis (euglena), E. coli (ecoli), and the mitochondrial version from S. cerevisiae (yeastmi); EMBL accession number K00428. For other accession numbers, see Delwiche et al. (1995). The yeast sequence lacks its 5' leader whilst the initial methionine has been added to the E. coli version. The residues involved in GTP binding are underlined.

[Figure 5 image: pairwise alignment of pea.clpC and Pfclp, positions 1 to 950; residues not recoverable.]

Figure 5. Alignment of ClpC from Pisum sativum (pea.clpC; EMBL accession number L09547) with the Clp-like sequence of P. falciparum (Pfclp), using the Higgins-Sharp algorithm (MacDNASIS software, Hitachi). The leader sequence (1 to 95) and the "spacer" region (545 to 590) of the plant sequence do not have counterparts in the malarial sequence. The leucine-rich blocks near the N terminus (- -) and the conserved nt-binding residues in ATP-binding domain 2 (- -) are indicated.

[Figure 6 image: dot matrix plots a to d; axis labels not recoverable.]

Figure 6. Dot matrix comparisons. The predicted malarial Clp peptide was compared with: a, ClpA (E. coli); b, ClpB (E. coli); c, ClpC (P. sativum). For comparative purposes, d shows the similarity between ClpB and ClpC. Stringency = 33% (window size of 20 and match of 6). The diagonal indicating similarity runs from the N terminus (top left) to C terminus (bottom right). The central spacer region where there is no homology, as well as the conserved nt-binding domains and two small trailer sections near the C terminus, are evident.

the levels of gene content, gene order, and intergenic sequence, supporting our contention that they have evolved from a single source, i.e. a progenitor of the Phylum.

Transcription

Primer extensions, Northern blots, RT-PCR and RNase protection methods have been used to determine whether the genes on the circle are transcribed. Primer extension had previously given convincing data for transcription of the tRNAs (Preiser et al., 1995), and was used to map the 5' ends of both small and large rRNAs (Gardner et al., 1991a, 1993). Northern blots had detected transcripts of the expected sizes for tRNAs (Preiser et al., 1995) and rRNAs (Gardner et al., 1991a, 1993). This approach further showed that rpoB exists in the form of a large transcript, significantly bigger than the coding region of the gene (Gardner et al., 1991b; Feagin & Drew, 1995). Attempts to use these approaches for other genes on the circle led to different observations. Primer extensions to map the start site of transcription for some of the rp, rpoB/C, clpC and tufA genes, gave either no signal at all, or a smear upon which some bands were superimposed (data not shown). These bands did not correspond to specific start sites as they could not be shifted by using primers closer to or further away from the stops. Similarly, Northern blots using probes specific for these same genes normally gave a smear, some of the material detected being greater than 9 kb in size (Figure 8a and data not shown). The smearing was not due to degradation of the RNA, as probes for rRNA genes (Figure 8a) or nuclear genes gave clean bands. Owing to this heterogeneity of RNA molecules, RNase protection of representative transcripts from circle genes was performed using probes to internal regions of rpoB, rpoC1, rpoC2, tufA, rpl2, clpC and ORF470. In each case, protected fragments of the expected size were found (Figure 8c and data not shown). By contrast, probes that either covered two genes (rpoC1/rpoC2, rpl2/rpl23) or went beyond the 5' end of a gene (tufA) led to uninterpretable results (data not shown), multiple bands of different sizes being produced. Despite these difficulties, RT-PCR did allow us to detect large transcripts covering certain pairs of adjacent genes. Primer extensions starting with gene rpl2 or rps3, followed by PCR reactions using different primers, allowed us to link transcripts of rpl2 and rpl23 as well as rps3 and rps19, showing that they are polycistronic (Figure 8b). The numbers and sizes of primary transcripts remain to be determined but the complete gene map (Figure 1) suggests that a minimum of four transcripts might be expected, all commencing within the IR.

Polysomes

Crude microsome preparations from erythrocytic stage parasites were fractionated on sucrose density gradients before and after treatment with 0.2 mM puromycin and 0.3 M KCl. RNA was prepared from the fractions and Northern blots were hybridized with a probe specific for a variable region of the SSU rRNA of the 35 kb circle. It was

[Figure 7. Panels: a, tRNA(Thr), ORF470 and LSU rRNA region; b, tufA, rps7 and rps12 region; c, EF-Tu alignment (eftu_pf, eftu_toxo, eftu_eimeria, eftu_theiler); d, intergenic alignments (trnaT_orf, rps7_tuf, tuf_trnaF) for Pf, Tg and Et.]

Figure 7. Summary of sequence data showing conservation of gene order, nt sequence, and predicted amino acid sequence on the plDNAs of P. falciparum (Pf), T. gondii (Tg), E. tenella (Et) and Theileria annulata. a, Gene order downstream of the IR-A arm of the inverted repeat (see Figure 1). Lengths of the intergenic regions are given in nucleotides. b, Gene order at the end and downstream of the large rp gene cluster (see Figure 1). In each case, the tRNA gene is encoded on the opposite strand from the other three genes. c, Comparison of predicted amino acid sequences for a portion of EF-Tu in four genera of apicomplexans. d, Nucleotide sequence comparisons of entire intergenic regions on the circular DNAs of P. falciparum (Pf), T. gondii (Tg), and E. tenella (Et). 1, tRNA(Thr) to ORF470; 2, rps7 to tufA; and 3, tufA to tRNA(Phe).

[Figure 8. Panels: b, RT-PCR gels for rpl23-rpl2 and rps19-rps3 (lanes M, 1, 2, 3); c, RNase protection gels for clp, ORF470, rpl2 and tufA (lanes 1, 2, 3, Y, P, M), with marker sizes in nt.]

Figure 8. Evidence for polycistronic transcription of protein-encoding genes on the 35 kb circle. a, Northern blots hybridized with a specific PCR-generated probe for tufA produced smears and multiple bands, whereas probes for LSU and SSU rRNAs revealed more discrete populations of molecules. b, PCR products (asterisks) were amplified following reverse transcription of RNA spanning pairs of adjacent genes. Left: primers spanning rpl23 and rpl2 (expected product size, 248 bp); and right: primers spanning rps19 and rps3 (expected product size, 710 bp). Before reverse transcription, all RNA samples were treated with DNase; RNase was added to lane 1, and no reverse transcriptase was added to lane 3. M, a 123 bp ladder. The faint band of expected product size in lanes 1 and 3 of the rpl23/rpl2 experiment may indicate incomplete DNase digestion. c, RNase protection assays. Transcribed radiolabelled antisense RNA strands of defined size (asterisks) were protected by mRNAs corresponding to (from right to left) clpC (135 nt), ORF470 (330 nt), rpl2 (239 nt) and tufA (231 nt); radiolabelled sense strands were not protected. RNase digestion was carried out at decreasing concentrations (lanes 1 to 3, see Materials and Methods for details) at 30°C. The whole probes are shown before (P) and after digestion (Y) in the presence of yeast RNA. Marker (M) sizes are given in nt.

Figure 9. Localization of 35 kb extrachromosomal DNA in red cells infected with P. falciparum. a, Extrachromosomal DNA (arrowheads) and nuclear DNA stained with DAPI. b, Fluorescence signal (arrowheads) from the same cells after in situ hybridization with a single-stranded DNA probe specifying a variable region of the SSU rRNA encoded by the 35 kb circle. Bar represents 10 µm.

found that puromycin shifted the hybridization signal from the lower third of the gradients to the top, as expected for the dissociation of polysomes into ribosome subunits. These preliminary results are consistent with the presence of a distinct subset of "plastid" polysomes and, by inference, active protein synthesis.

Plastid organelle

Genetic crossing experiments and DNA analysis of gametes in malaria (Vaidya et al., 1993; Creasey et al., 1994) strongly support the presence of a cytoplasmic 35 kb DNA-containing organellar compartment: DNA probes to parts of the rpoB gene hybridized with DNA from purified female (macro) gametes, which contain a full complement of cytoplasmic organelles, but not with purified male (micro) gametes, which in malaria are composed virtually of only a nucleus and an axoneme. In situ hybridization of red cells infected with ring and trophozoite stages of P. falciparum, using a digoxigenin-labelled DNA probe for a variable region of the SSU rRNA encoded by the 35 kb circle, revealed a discrete fluorescence signal readily distinguished from the DAPI-stained nucleus and often corresponding to weak DAPI-staining of an extrachromosomal DNA (Figure 9).

Discussion

Our proposal, that malaria and other apicomplexans have a vestigial plastid genome, probably derived from an ancient photosynthetic progenitor (Wilson et al., 1994), is consistent with Cavalier-Smith's independent suggestion, based on phylogenetic analysis of 18 S rRNA genes, that apicomplexans evolved from an ancestral form of chromist, but lost their photosynthetic capacity in the process (Cavalier-Smith et al., 1994). Electron micrographs of the putative remnant plastid organelle (the so-called "spherical body" in P. falciparum, the "Golgi-adjunct" in T. gondii, and the "hohlzylinder" in haemogregarines), suggest it has four surrounding membranes (Siddall, 1992; Dubremetz, 1995; and our own unpublished results). This is significant, as plastids with quadruple membranes constitute one of the central characters of chromists, some of whose plastids are of red algal origin (McFadden & Gilson, 1995). On the other hand, phylogenetic analysis of a portion of the pl LSU rDNA from three apicomplexan genera has suggested that it is more closely related to that of euglenoid plastids (these have triple membranes, like dinoflagellates) rather than rhodophytes (Egea & Lang-Unnasch, 1995).

In some respects then, apicomplexans resemble non-photosynthetic plants that retain a vestigial plastid genome (Palmer, 1992). This hypothesis raises the twin questions: is the genome functional, and if so what does it do? Neither question can be answered definitively at present, but three features of the circle's sequence point to a functional role: firstly, reduction of a conventional plastid genome to resemble the malarial circle would entail numerous deletions with a skewed distribution, dozens of genes being retained for gene expression whilst those directly involved in photosynthesis were lost; secondly, open reading frames have been maintained despite extensive sequence divergence; thirdly, the genetic content of the circle has been conserved across disparate genera of apicomplexans. More direct support for function is provided by the transcriptional activity of the organelle, and if our preliminary evidence for the presence of a subset of plastid polysomes is confirmed, there can be little reason to doubt that the presumptive organelle has some cellular role.

Hopes of finding a clue to the nature of its function by determining the complete nucleotide sequence of the circle have not proved fully justified. If there is a key gene it would appear to be one of the few ORFs, possibly ORF470, as the others seem too small. The relatively highly conserved nature of the ORF470-predicted polypeptide is suggestive, and implies that its interaction with other molecules has evolved under different selective pressures from, for instance, those of the malarial plastid ribosomal proteins, which often are poorly conserved. It has been noted that the algal equivalent of ORF470 is markedly up-regulated when the red alga P. purpurea is shifted from dark to light conditions (Mike Reith, Halifax, Nova Scotia, personal communication). However, despite its interest, ORF470 could yet turn out to be involved only in a plastid housekeeping role, and knowing its activity might not bring us nearer to understanding what the organelle actually does. We get little help in this connection from the degenerate plDNA of Epifagus, which has no counterpart of ORF470, and whose genetic make-up has little in common with the 35 kb circle other than the fact that both genomes are replete with genes involved in gene expression.

According to our evolutionary hypothesis, the malarial pl genome has been retained over an extended period, possibly as long as 800 million years (Escalante & Ayala, 1995). Its conservation might be explained if the putative plastid organelle contributes to an integral step in eukaryotic intermediary metabolism that is normally carried out by plastids (Howe & Smith, 1991; Wallsgrove, 1991). One such function might be the de novo biosynthesis of haem precursors (Kannangara et al., 1988). However, in malaria there is evidence that this pathway is active in the mitochondrion, utilizing glycine, rather than the plastid pathway, utilizing glutamate (Surolia & Padmanaban, 1992).

As with other organelles, the products of nucleus-encoded genes are key players in the maintenance and function of the putative plastid and it has been claimed that the apicomplexans Sarcocystis muris and T. gondii carry a core photosynthetic gene, psbA, encoding protein D1 of the photosystem II complex, as well as small amounts of protochlorophyllide a (Hackstein et al., 1995). It was further suggested that the sensitivity of apicomplexans to toltrazuril depends on its interaction with the D1 protein in a photoreaction centre in the parasite's organelles. However, evidence for the D1 protein depends solely on a PCR-derived partial nt sequence, which does not include the N terminus. We were unable to obtain a corresponding PCR product using malarial DNA as template (unpublished data), hence these provocative suggestions require confirmation.

That products of the mitochondrion and putative plastid might interact in a pathway regulating differentiation between tachyzoites and bradyzoites of T. gondii has been suggested (Tomavo & Boothroyd, 1995). Mutants of T. gondii selected for resistance to the mitochondrial inhibitor atovaquone were observed to switch from tachyzoites to bradyzoites and also had increased sensitivity to clindamycin, an antibiotic believed to act on the putative plastid (Pfefferkorn & Borotz, 1994). Preliminary experiments with antibiotics in our laboratory (M. Strath, unpublished) have indicated that several inhibitors of organellar protein synthesis act relatively rapidly (hours) on Pf in cultures, some of these compounds having binding specificities corresponding to the 35 kb circle and not the mitochondrial DNA. However, the modes of action of these antibiotics in the malarial cell are not known.

Apart from the use of inhibitory agents, further insight into the function of the presumptive plastid-like organelle might possibly come from experimental ablation or "knock-out" studies, or by the discovery of nucleus-encoded genes whose products carry appropriate leader sequences for binding to the organellar membrane. (It is germane that plastids of secondary endosymbiotic origin are surrounded by multiple membranes, hence they may have evolved their own specialized import mechanisms about which nothing is known (Cavalier-Smith et al., 1994).) At present, the reverse genetics and complementation technologies (including appropriate vectors, mutants etc.) for targeting modified genes to DNA-containing organelles in apicomplexans have not been developed, and to date only one nuclear gene (cpn60) with a potential mitochondrial targeting leader sequence has been reported (Holloway et al., 1994).

We conclude that the 35 kb plDNA of apicomplexans has been maintained over a long evolutionary period for an essential function. Clues to the role played by the remnant plastid organelle in biosynthesis or intermediary metabolism have not been discovered, but may yet come to light as the nuclear genome is more fully sequenced.

Materials and Methods

Cloning, amplification and sequencing of 35 kb circular DNA

The 35 kb circle was purified from total DNA of erythrocytic cultures of P. falciparum (Pf, C10 strain) by density gradient centrifugation in caesium chloride (Gardner et al., 1988). Clones were derived from a HindIII library (Gardner et al., 1988), a Sau3AI library (Gardner et al., 1991b), or a shotgun library of purified, sonicated circle that had been end-repaired and ligated into the Bluescript vector pBS KS(II)+ (Gardner et al., 1994a). Clones were sorted by reference to the HindIII restriction sites of the circle. Oligonucleotide primers designed from sequenced clones were used to fill gaps by PCR. Contigs were constructed using the Staden-Plus software, commencing from both ends of the inverted repeat (IR). The entire circle was sequenced, apart from a short region in the centre of the IR estimated from sized restriction fragments to be only tens of nucleotides long. Clones and PCR products were sequenced on both strands using Sequenase version 2.0 (USB) and terminal dideoxy-nucleotidyl transferase as described (Fawcett & Bartlett, 1990). The EMBL accession numbers are X95275 (IR-A half) and X95276 (IR-B half).

The T. gondii 35 kb homologue was isolated from total tachyzoite DNA (RH strain, harvested from MDCK cells) by a density gradient procedure similar to that referred to above, and shown to hybridize with probes from the Pf 35 kb circle. Various PCR products were generated using degenerate primers based on conserved regions of the Pf circle, and were purified prior to sequencing. Total DNA from Eimeria tenella (H-strain) sporozoites and Theileria annulata (Hissar strain) piroplasms was obtained from Dr F. Tomley, Inst. Animal Health, Compton, and Dr R. Hall, York University, respectively. DNA corresponding to the 35 kb circle was obtained for sequencing as follows: PCR reactions were carried out with 20 ng T. gondii circle DNA, 160 ng total E. tenella DNA, or 1 µg total Th. annulata DNA as templates. About 500 ng of the following primers were added to reaction mixtures containing 2.5 mM MgCl2, 1x PCR buffer (Promega), 0.2 mM dNTPs (Promega) and 1 µl Amplitaq (Cetus)/100 µl: For LSU rRNA-tRNA(Thr)-ORF470: 5'-ATCACATTGAG/CA/TATAATT; 5'-GTTCGCCTATTAAAGCGATACGTGAGCTGGG. For rps7-tufA-tRNA(Phe): 5'-GGAGGTAGAGTAAAAGATTTACCAGG; 5'-GGTAGAGCAATGGATTGAAG. For internal block of Th. annulata tufA: 5'-CATGTA/TGATCATGGA/TAAAAC; 5'-ATCTTCTTTATTTAAAAAA/TAC. Amplification was carried out for 35 cycles as follows: 95°C for 30 seconds, 40°C for 60 seconds, 72°C for 120 seconds. As a positive control, purified 35 kb circle from Pf was used as template. Total DNAs from MDCK cells and the trypanosomatid Leishmania guyanensis (strain B8, courtesy of Dr D. Barker, Cambridge University) were used as negative controls.

Sequence analysis

Staden-Plus software (Amersham) was used to identify tRNA genes; standard conserved bases were specified and large score numbers used to "filter" the candidate sequences. Open reading frames were identified and analysed using algorithms such as BLASTP, PILEUP and DOTPLOT on the GCG package, version 7.1 (Devereux et al., 1984).

Transcription analysis

Total RNA was extracted from erythrocytic forms of malaria parasites by the acid/phenol method, RNaid Plus kit (Bio 101). Northern blots and primer extensions were performed as described (Feagin et al., 1992; Preiser et al., 1995). RNase protection was carried out with the RPA II kit supplied by Ambion (AMS Biotechnology): DNA templates to generate the appropriate antisense RNAs were obtained either by using pre-existing clones (rpl2) or by cloning fragments generated by PCR into the TA vector, Invitrogen (ORF470, clp, tuf). Part of the rpl2 sequence cloned in pBS (Stratagene) was digested with DraI and then transcribed with T3 RNA polymerase, giving a product of 325 nt, of which 239 nt was complementary to rpl2. TA vectors containing either an internal fragment of ORF470, or sequence close to the 5' ends of clp or tuf, were restricted with DraI and transcribed in vitro with SP6 RNA polymerase; the transcript sizes were 410, 215 and 311 nt, respectively. All transcripts contained 80 nt of the plasmid leader sequence as well as the complementary malarial gene sequence. The RNA transcribed in vitro was treated with DNase to remove the template and then purified on 6% denaturing polyacrylamide gels. The 32P-labelled antisense RNA, or a sense strand control, were hybridized with 10 µg total Pf RNA and digested with RNase (a mixture of RNase A and RNase T1, used at three decreasing concentrations) at temperatures varying from 24°C to 37°C, according to the manufacturer's recommendation. Protected fragments were separated on a 6% polyacrylamide/8 M urea gel prior to autoradiography.

Transcripts were detected by RT-PCR using RNA samples first treated with DNase. Reverse transcription (RT) reactions were carried out with 5 µg total RNA for one hour with 25 µg ml^-1 antisense primer and 200 units of Moloney murine leukaemia virus RTase (Life Sciences) in 20 µl of a buffer consisting of 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 0.1% Triton X-100 and 6 mM MgCl2. All samples, including controls treated with RNase, were then amplified by PCR in 100 µl of the same buffer containing 500 ng of both sense and antisense primers, as well as 1 unit of Taq DNA polymerase and an experimentally determined concentration of MgCl2. Products were separated in 2% agarose gels and stained with ethidium bromide.
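As a worked check on the RNase protection probes described under Transcription analysis, the expected protected fragment is the probe length minus the plasmid-derived leader, which has no complementary mRNA and is therefore digested. The sketch below applies that subtraction to the SP6 transcript sizes quoted in the text. The dictionary layout and the function name `protected_length` are assumptions made for illustration.

```python
# Full-length SP6 probe sizes in nt, as quoted in Materials and Methods.
PROBE_LENGTH_NT = {"ORF470": 410, "clp": 215, "tuf": 311}

# Plasmid leader carried by each transcript, as stated in the text.
LEADER_NT = 80


def protected_length(probe_nt, leader_nt=LEADER_NT):
    """Length of probe expected to survive RNase digestion.

    Only the part of the probe complementary to the mRNA is protected,
    so the vector-derived leader is subtracted from the probe length.
    """
    if leader_nt >= probe_nt:
        raise ValueError("leader cannot be as long as the whole probe")
    return probe_nt - leader_nt


if __name__ == "__main__":
    for gene, probe_nt in PROBE_LENGTH_NT.items():
        print(f"{gene}: {probe_nt} nt probe, "
              f"{protected_length(probe_nt)} nt expected protected fragment")
```

The results (330, 135 and 231 nt) agree with the protected fragment sizes given for ORF470, clpC and tufA in the legend to Figure 8. The T3-derived rpl2 probe is quoted separately as 325 nt with 239 nt complementary to rpl2, which implies 86 nt of non-complementary sequence for that probe.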

Polysomes

Ribosomes were isolated essentially as described (Sherman et al., 1975). Erythrocytes infected with Pf were lysed on ice in an NP-40 (Nonidet-40) buffer (0.14%) containing 380 mM sucrose. After repeated centrifugation at 10,000 g the supernatant was centrifuged at 105,000 g for one hour in an SW40 rotor (Beckman). The pellet was resuspended in 25 mM KCl, 5 mM MgCl2 and 50 mM Tris-HCl (pH 7.7), and homogenized by hand (x50 strokes) with a glass Dounce homogenizer (Wheaton, USA). Following centrifugation at 10,000 g for ten minutes at 4°C, the crude pellet was discarded and the supernatant centrifuged at 105,000 g for two hours. The resulting pellet resuspended in the same buffer was the source of polysomes. Polysome preparations, with or without incubation at 37°C for one hour with either 0.2 mM puromycin in 0.2 mM GTP or 0.3 M KCl (Cox & Hirst, 1976; Gale et al., 1981), were placed on sucrose density gradients (20% to 50% (w/v) sucrose with a 70% cushion in 10 mM Tris-HCl (pH 7.7), 10 mM MgCl2, 100 mM KCl) and centrifuged at 105,000 g for 16 hours. Fractions from the gradients were extracted in phenol/chloroform/isoamyl alcohol (25:24:1, v/v) to obtain total RNA, which was then separated on 2% agarose/formaldehyde gels and blotted on nylon membranes (Gene Screen) for hybridization with a single-stranded DNA probe specific for the SSU rRNA encoded by the 35 kb circle. Preparations of eukaryotic polysomes (L cells) were fractionated as controls to monitor the shift of polysomes into ribosomal subunits following treatment with puromycin and KCl.

In situ hybridization

Digoxigenin-labelled (DIG) DNA probes corresponding to both sense and antisense single strands of a variable region of SSU rRNA (region V4; see Gardner et al., 1991a) were prepared by PCR using 35 kb DNA as template. The dNTP reaction mix contained 20 µM DIG-11-dUTP (Boehringer), and forward and reverse primers in a ratio of 50:1 in an appropriate Taq polymerase buffer (Perkin-Elmer). Following amplification for 25 cycles at 94°C for 45 seconds, 50°C for 60 seconds and 72°C for 120 seconds, the double- and single-stranded reaction products were assessed by agarose gel electrophoresis. After boiling for five minutes, the reaction mix was hybridized to thin smears of methanol-fixed, parasitized erythrocytes heated in a boiling water bath. The reaction mix was spread on the cells with a heated coverslip and hybridization allowed to take place for two hours at 70°C on a heated slide tray (MJ Research, Inc). The slides were washed with 500 mM NaCl/50 mM EDTA for five minutes, twice with 2 x SSC (see Maniatis et al., 1982), and once with phosphate buffered saline (PBS). Hybridized probe was revealed by incubation overnight at 37°C with a fluoresceinated antibody to DIG (Boehringer) diluted 10^-1 in PBS-bovine serum albumin (1 mg ml^-1). Total DNA was stained with 4',6-diamidino-2-phenylindole (DAPI) at 5 ng ml^-1 for five minutes, the slides being washed in PBS prior to mounting in VectorShield (Vector Ltd.). The cells were observed under UV light using a Zeiss Photomicroscope III with IIIRS-epifluorescence illuminator, LP418 or BP520-560 barrier filters, and x 63 objective.

Acknowledgements

Drs Douglas Barker (Cambridge), Roger Hall (York), Mary Pudney (Glaxo/Wellcome), Judith Smith (Leeds) and Fiona Tomley (Compton) kindly supplied DNA samples for analysis. P.D. was the recipient of an MRC Studentship Award. This work was funded in part by the UNDP/WORLD BANK/WHO Special Programme for Research in Tropical Diseases (TDR).

References

Baldauf, S. L. & Palmer, J. D. (1990). Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature, 344, 262-265.
Baldauf, S. L., Manhart, J. R. & Palmer, J. D. (1990). Different fates of the chloroplast tufA gene following transfer to the nucleus in green algae. Proc. Natl Acad. Sci. USA, 87, 5317-5321.
Borst, P., Overdulve, J. P., Weijers, P. J., Fase-Fowler, F. & van Den Berg, M. (1984). DNA circles with cruciforms from Isospora (Toxoplasma) gondii. Biochim. Biophys. Acta, 781, 100-111.
Cavalier-Smith, T., Allsopp, M. T. E. P. & Chao, E. E. (1994). Chimeric conundra: Are nucleomorphs and chromists monophyletic or polyphyletic? Proc. Natl Acad. Sci. USA, 91, 11368-11372.
Creasey, A., Mendis, K., Carlton, J., Williamson, D. H., Wilson, I. & Carter, R. (1994). Maternal inheritance of extrachromosomal DNA in malaria parasites. Mol. Biochem. Parasitol. 65, 95-98.
Cox, R. A. & Hirst, W. (1976). A study of the influence of magnesium ions on the conformation of ribosomal ribonucleic acid and on the stability of the larger subribosomal particle of rabbit reticulocytes. Biochem. J. 160, 505-519.
Delwiche, C. F., Kuhsel, M. & Palmer, J. D. (1995). Phylogenetic analysis of tufA sequences indicates a cyanobacterial origin of all plastids. Mol. Phylog. Evol. 4, 110-128.
Devereux, J., Haeberli, P. & Smithies, O. (1984). A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res. 12, 387-395.
Dubremetz, J. F. (1995). Toxoplasma gondii: Cell biology update. In Molecular Approaches to Parasitology (Boothroyd, J. C. & Komuniecki, R., eds), pp. 345-358, Wiley-Liss, Inc., New York.
Egea, N. & Lang-Unnasch, N. (1995). Phylogeny of the large extrachromosomal DNA of organisms in the phylum Apicomplexa. J. Euk. Microbiol. 42, 679-684.
Escalante, A. A. & Ayala, F. J. (1995). Evolutionary origin of Plasmodium and other Apicomplexa based on rRNA genes. Proc. Natl Acad. Sci. USA, 92, 5793-5797.
Fawcett, T. W. & Bartlett, S. G. (1990). An effective method for eliminating "artifact banding" when sequencing double-stranded DNA template. Biotechniques, 9, 46-48.
Feagin, J. E. & Drew, M. E. (1995). Plasmodium falciparum: Alterations in organelle transcript abundance during the erythrocytic cycle. Exp. Parasitol. 80, 430-440.
Feagin, J. E., Werner, E., Gardner, M. J., Williamson, D. H. & Wilson, R. J. M. (1992). Homologies between the contiguous and fragmented rRNAs of the two Plasmodium falciparum extrachromosomal DNAs are limited to core sequences. Nucl. Acids Res. 20, 879-887.

Gale, E. P., Cundcliffe, E., Reynolds, P. P., Richmond, M. H. & Waring, M. J. (1981). Antibiotic inhibitors of ribosome function. In The Molecular Basis of Antibiotic Action (Gale, E. P., Cundcliffe, E., Reynolds, P. P., Richmond, M. H. & Waring, M. J., eds), pp. 402-547, John Wiley & Sons Ltd., London.
Gardner, M. J., Bates, P. A., Ling, I. T., Moore, D. J., McCready, S., Gunasekera, M. B. R., Wilson, R. J. M. & Williamson, D. H. (1988). Mitochondrial DNA of the human malarial parasite Plasmodium falciparum. Mol. Biochem. Parasitol. 31, 11-18.
Gardner, M. J., Feagin, J. P., Moore, D. J., Spencer, D. P., Gray, M. W., Williamson, D. H. & Wilson, R. J. M. (1991a). Organisation and expression of small subunit ribosomal RNA genes encoded by a 35-kilobase circular DNA in Plasmodium falciparum. Mol. Biochem. Parasitol. 48, 77-88.
Gardner, M. J., Williamson, D. H. & Wilson, R. J. M. (1991b). A circular DNA in malaria parasites encodes an RNA polymerase like that of prokaryotes and chloroplasts. Mol. Biochem. Parasitol. 44, 115-123.
Gardner, M. J., Feagin, J. P., Moore, D. J., Rangachari, K., Williamson, D. H. & Wilson, R. J. M. (1993). Sequence and organization of large subunit rRNA genes from the extrachromosomal 35 kb circular DNA of the malaria parasite Plasmodium falciparum. Nucl. Acids Res. 21, 1067-1071.
Gardner, M. J., Goldman, N., Barnett, P., Moore, P. W., Rangachari, K., Strath, M., Whyte, A., Williamson, D. H. & Wilson, R. J. M. (1994a). Phylogenetic analysis of the rpoB gene from the plastid-like DNA of Plasmodium falciparum. Mol. Biochem. Parasitol. 66, 221-231.
Gardner, M. J., Preiser, P., Rangachari, K., Moore, D., Feagin, J. P., Williamson, D. H. & Wilson, R. J. M. (1994b). Nine duplicated tRNA genes on the plastid-like DNA of the malaria parasite Plasmodium falciparum. Gene, 140, 307-308.
Gockel, G., Hachtel, W., Baier, S., Fliss, C. & Henke, M. (1994). Genes for components of the chloroplast translational apparatus are conserved in the reduced 73-kb plastid DNA of the nonphotosynthetic euglenoid flagellate Astasia longa. Curr. Genet. 26, 256-262.
Gottesman, S., Clark, W. P., de Crecy-Lagard, V. & Maurizi, M. R. (1993). ClpX, an alternative subunit for the ATP-dependent Clp protease of Escherichia coli. J. Biol. Chem. 268, 22618-22626.
Hackstein, J. H. P., Mackenstedt, U., Mehlhorn, H., Meijerink, J. P. P., Schubert, H. & Leunissen, J. A. M. (1995). Parasitic apicomplexans harbor a chlorophyll a-D1 complex, the potential target for therapeutic triazines. Parasitol. Res. 81, 207-216.
Harris, P. H., Boynton, J. P. & Gillham, N. W. (1994). Chloroplast ribosomes and protein synthesis. Microbiol. Rev. 58, 700-754.
Holloway, S. P., Min, W. & Inselburg, J. W. (1994). Isolation and characterization of a chaperonin-60 gene of the human malaria parasite Plasmodium falciparum. Mol. Biochem. Parasitol. 64, 25-32.
Howe, C. J. (1992). Plastid origin of an extrachromosomal DNA molecule from Plasmodium, the causative agent of malaria. J. Theor. Biol. 158, 199-205.
Howe, C. J. & Smith, A. G. (1991). Plants without chlorophyll. Nature, 349, 109.
Kannangara, C. G., Gough, S. P., Bruyant, P., Hoober, J. K., Kahn, A. & Wettstein, D. V. (1988). tRNA(Glu) as a cofactor in δ-aminolevulinate biosynthesis: Steps that regulate chlorophyll synthesis. Trends Biochem. Sci. 13, 139-143.
Kjeldgaard, M. & Nyborg, J. (1992). Refined structure of elongation factor EF-Tu from Escherichia coli. J. Mol. Biol. 223, 721-742.
Kossel, H., Hoch, B., Igloi, G. L., Maier, R. M. & Ruf, S. (1993). Editing creates the initiator codon of the rpl2 transcript from maize chloroplasts. In The Translational Apparatus. Structure, Function, Regulation, Evolution (Nierhaus, K. H., Franceschi, F., Subramanion, A. R., Erdmann, V. A. & Wittman-Liebold, B., eds), pp. 609-616, Plenum Press, New York.
Maniatis, T., Fritsch, P. F. & Sambrook, J. (1982). Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
McFadden, G. & Gilson, P. (1995). Something borrowed, something green: lateral transfer of chloroplasts by secondary endosymbiosis. Trends Ecol. Evol. 10, 12-17.
Nagata, S., Tsunetsugu-Yokota, Y., Naito, A. & Kaziro, Y. (1983). Molecular cloning and sequence determination of the nuclear gene coding for mitochondrial elongation factor Tu of Saccharomyces cerevisiae. Proc. Natl Acad. Sci. USA, 80, 6192-6196.
Needleman, S. B. & Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.
Palmer, J. D. (1992). Green ancestry of malaria parasites. Curr. Biol. 2, 318-320.
Parsell, D. A., Kowal, A. S. & Lindquist, S. (1994). Saccharomyces cerevisiae Hsp104 protein purification and characterization of ATP-induced structural changes. J. Biol. Chem. 269, 4480-4487.
Pfefferkorn, P. R. & Borotz, S. P. (1994). Comparison of mutants of Toxoplasma gondii selected for resistance to azithromycin, spiramycin or clindamycin. Antimicrob. Agents Chemother. 38, 31-37.
Pietrokovski, S. (1994). Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins. Protein Sci. 3, 2340-2350.
Preiser, P., Williamson, D. H. & Wilson, R. J. M. (1995). tRNA genes transcribed from the plastid-like DNA of Plasmodium falciparum. Nucl. Acids Res. 23, 4329-4336.
Sherman, I. W., Cox, R. A., Higginson, B., McLaren, D. J. & Williamson, J. (1975). The ribosomes of the simian malaria, Plasmodium knowlesi: I. Isolation and characterization. J. Protozool. 22, 568-572.
Siddall, M. P. (1992). Hohlzylinders. Parasitol. Today, 8, 90-91.
Surolia, N. & Padmanaban, G. (1992). De novo biosynthesis of heme offers a new chemotherapeutic target in the human malarial parasite. Biochem. Biophys. Res. Comm. 187, 744-750.
Tomavo, S. & Boothroyd, J. C. (1995). Interconnection between organellar functions, development and drug resistance in the protozoan parasite Toxoplasma gondii. Int. J. Parasitol. 25, 1293-1299.
Vaidya, A. B., Morrisey, J., Plowe, C. V., Kaslow, D. C. & Wellems, T. P. (1993). Unidirectional dominance of cytoplasmic inheritance in two genetic crosses of Plasmodium falciparum. Mol. Cell. Biol. 13, 7349-7357.
Wallsgrove, R. M. (1991). Plastid genes and parasitic plants. Nature, 350, 664.
Wells, J., Henkler, P., Leversha, M. & Koshy, R. (1995). A mitochondrial elongation factor-like protein is over-expressed in tumours and differentially expressed in normal tissues. FEBS Letters, 358, 119-125.
Weber, J. L. (1988). Molecular biology of malaria parasites. Exp. Parasitol. 66, 143-170.
Williamson, D. H., Wilson, R. J. M., Bates, P. A., McCready, S., Perler, F. & Qiang, B. (1985). Nuclear and mitochondrial DNA of the primate malarial parasite, P. knowlesi. Mol. Biochem. Parasitol. 14, 199-209.
Williamson, D. H., Gardner, M. J., Preiser, P., Moore, D. J., Rangachari, K. & Wilson, R. J. M. (1994). The evolutionary origin of the malaria parasite's 35 kb circular DNA; new evidence supports a possible rhodophyte ancestry. Mol. Gen. Genet. 243, 249-252.
Wilson, I., Gardner, M., Rangachari, K. & Williamson, D., Eds. (1993). Extrachromosomal DNA in the Apicomplexa. In Toxoplasmosis, vol. 78, NATO ASI series H. (Smith, J. E., ed.), pp. 51-60, Springer-Verlag, Heidelberg.
Wilson, R. J. M., Fry, M., Gardner, M. J., Feagin, J. E. & Williamson, D. H. (1992). Subcellular fractionation of the two organellar DNAs of malaria parasites. Curr. Genet. 21, 405-408.
Wilson, R. J. M., Williamson, D. H. & Preiser, P. (1994). Malaria and other Apicomplexans: The "plant" connection. Infect. Agents Dis. 3, 29-37.
Wittmann, H. G. (1983). Architecture of prokaryotic ribosomes. Annu. Rev. Biochem. 52, 36-65.
Wolfe, K. H., Morden, C. W. & Palmer, J. D. (1992). Function and evolution of a minimal plastid genome from a non-photosynthetic parasitic plant. Proc. Natl Acad. Sci. USA, 89, 10648-10652.

Edited by J. Karn

(Received 8 March 1996; received in revised form 4 June 1996; accepted 4 June 1996)

CAT activity was measured with a diffusion assay as previously reported (30).
17. M. O. Westendorp et al., Nature 375, 497 (1995).
18. E. Serfling, A. Avots, M. Neumann, Biochim. Biophys. Acta 1263, 181 (1995).
19. A. S. Wechsler, M. C. Gordon, U. Dendorfer, K. P. LeClair, J. Immunol. 153, 2515 (1994).
20. P. Ghosh, T. H. Tan, N. R. Rice, A. Sica, H. A. Young, Proc. Natl. Acad. Sci. U.S.A. 90, 1696 (1993).
21. G. Nabel and D. Baltimore, Nature 326, 711 (1987).
22. M. Ott and E. Verdin, unpublished observation.
23. PBLs were infected with HIV infectious stocks generated after transfection of molecular clone HIV 89.6 or a mutant containing a stop codon at amino acid 72 of Tat (5 × 10[...] pg of p24 antigen per 10[...] cells). Culture supernatants from cells stimulated with antibodies directed against either CD3 or CD3+CD28 were harvested at different times after infection and analyzed for RT activity.
24. C. Graziosi et al., Science 265, 248 (1994).
25. C. Graziosi et al., Proc. Natl. Acad. Sci. U.S.A. 93, 4386 (1996).
26. A. L. Kinter, G. Poli, L. Fox, E. Hardy, A. S. Fauci, J. Immunol. 154, 2448 (1995).
27. J. D. Fraser, B. A. Irving, G. R. Crabtree, A. Weiss, Science 251, 313 (1991).
28. O. K. Haffar et al., Proc. Natl. Acad. Sci. U.S.A. 90, 11094 (1993).
29. S. K. Arya, C. Guo, S. F. Josephs, F. Wong-Staal, Science 229, 69 (1985).
30. C. Van Lint, S. Emiliani, M. Ott, E. Verdin, EMBO J. 15, 1112 (1996).
31. F. Pages et al., Nature 369, 327 (1994).
32. M. Adams et al., Proc. Natl. Acad. Sci. U.S.A. 91, 3862 (1994).
33. Anti-CD3 (454,3.21) was provided by N. Chiorazzi (North Shore University Hospital, Manhasset, NY) and precoated on tissue culture wells with 3 μg of antibody per milliliter of buffer [35 mM bicarbonate and 15 mM carbonate (pH 9.6)] at 4°C overnight. Anti-CD28 (clone 28.2) was provided by D. Olive (INSERM, Marseilles, France) (37).
34. To express the different forms of Tat in Escherichia coli, Bam HI-Bam HI fragments corresponding to either Tat72 or Tat101 cDNAs (72) were cloned into the unique Bam HI site of pTrcHisB behind the Trc promoter (Invitrogen). E. coli (TOP10 strain, Invitrogen) containing the recombinant were induced for 16 hours with isopropyl-β-D-thiogalactopyranoside (1 mM) at 37°C. Pelleted cells were resuspended in buffer A [6 M guanidine hydrochloride, 0.1 M NaH2PO4, and 0.01 M tris (pH 8.0)] at 5 ml per gram wet weight and stirred for 1 hour at room temperature. Lysate was centrifuged at 10,000 rpm for 15 min at 4°C and the supernatant was collected. Four milliliters of a 50% slurry of Ni-nitrilotriacetic acid resin (Invitrogen) equilibrated in buffer A was added to the cell suspension and incubated at room temperature for 1 hour. This mixture was loaded on a column and the flowthrough was collected. The resin was washed sequentially with 10 ml of buffer A, 15 ml of buffer B [8 M urea, 0.1 M NaH2PO4, and 0.01 M tris (pH 8.0)], and 15 ml of buffer C (same composition as buffer B, but at a pH of 6.3). The recombinant protein was eluted with 15 ml of buffer C containing 250 mM imidazole. This eluate was loaded directly on a high-performance liquid chromatography C4 column run in a gradient of acetonitrile (0 to 100%) in 0.1% trifluoroacetic acid. Fractions containing Tat were lyophilized in small aliquots and stored at -70°C under anaerobic conditions to prevent Tat oxidation. Tat was resuspended in degassed phosphate-buffered saline plus 0.1 mM dithiothreitol immediately before use. The biological activity of Tat purified according to this protocol was tested by incubating it with U1 cells, which contain a virus exhibiting a Tat-defective phenotype (32). Tat treatment of these cells caused a 50-fold induction of viral expression as detected by p24 secretion into culture supernatant. Endotoxin contamination of Tat prepared with this protocol was below the detection limit of the assay (<59 EU/ml; Limulus Amebocyte Lysate Assay, Biowhittaker, Walkersville, MD).
35. D. A. Mann and A. D. Frankel, EMBO J. 10, 1733 (1991).
36. pIL-2luc was constructed by cloning a 390-base pair Kpn I-Hind III fragment containing the human IL-2 promoter (nt -340 to nt +50) (a gift from R. Gaynor) into the corresponding sites of pGL2-BASIC (Promega, Madison, WI). This plasmid was used as a substrate for the mutagenesis of the CD28RE by means of the Transformer site-directed mutagenesis method (Clontech, Palo Alto, CA). The oligonucleotide (5'-GGGTTTAAAGAAGCCTGAAAGAGTGATGA-3') was used to introduce four mutations within the CD28RE (mutations are highlighted in bold), abolishing binding of the CD28RC to the element (27). The oligonucleotide (5'-GTTATGATGTGTGACGTCGTGGAGGGATGG-3') changes a Bam HI into an Aat II restriction site (highlighted in bold) and was used for selection during mutagenesis. The region corresponding to the IL-2 promoter was fully resequenced to confirm the mutation, and the resequenced fragment was subcloned into pGL2-BASIC as described above.
37. C. Van Lint, J. Ghysdael, P. Paras Jr., A. Burny, E. Verdin, J. Virol. 68, 2632 (1994).
38. L. Osborn, S. Kunkel, G. Nabel, Proc. Natl. Acad. Sci. U.S.A. 86, 2336 (1989).
39. A point mutation (G→T) was introduced into a molecular clone of HIV 89.6, p89.6 (5), to change codon 72 of the Tat ORF from GAG into TAG. The mutation was introduced with the use of the Transformer kit (Clontech) with the following two oligonucleotides: 5'-GTGTATGAAAGTAGTAAGTAGTAG-3' (Tat mutation) and 5'-GTGGGAGGTGATATGTAAGAAAGG-3' (selection primer). The presence of the mutation was verified by sequencing, and a fully resequenced Sal I-Stu I fragment containing the mutation was subcloned back into p89.6. Supernatants from CEM × 174 cells transfected with this DNA were harvested and their RT titer was measured. To confirm that virus stocks had not reverted to wild type, virus stocks were centrifuged and purified RNA was used in RT-PCR to amplify a fragment containing the mutated Tat gene. This PCR fragment was cloned with the TA cloning kit (Invitrogen). 10 individual clones were resequenced, and all contained the original mutation in the Tat gene.
40. This paper is dedicated to the memory of Kenneth Warren. We thank R. Golmann, D. Olive, N. Chiorazzi, R. Gaynor, G. Metz, and U. Siebenlist for reagents used in these studies; L. Bonetta and K. Manogue for their helpful comments on the manuscript; D. Olive and Y. Golette for helpful discussions; K.-T. Jeang for pointing out to us the Tat mutation resulting in a truncated Tat protein; and N. Yarlett for testing cell lines for Mycoplasma. C.V.L. is Chargé de Recherches of the Fond National de la Recherche Scientifique, Belgium.

13 September 1996; accepted 15 January 1997

SCIENCE • VOL. 275 • 7 MARCH 1997 • http://www.sciencemag.org • Reports

A Plastid of Probable Green Algal Origin in Apicomplexan Parasites

Sabine Köhler,*† Charles F. Delwiche,*‡ Paul W. Denny, Lewis G. Tilney, Paul Webster, R. J. M. Wilson, Jeffrey D. Palmer, David S. Roos§

Protozoan parasites of the phylum Apicomplexa contain three genetic elements: the nuclear and mitochondrial genomes characteristic of virtually all eukaryotic cells and a 35-kilobase circular extrachromosomal DNA. In situ hybridization techniques were used to localize the 35-kilobase DNA of Toxoplasma gondii to a discrete organelle surrounded by four membranes. Phylogenetic analysis of the tufA gene encoded by the 35-kilobase genomes of coccidians T. gondii and Eimeria tenella and the malaria parasite Plasmodium falciparum grouped this organellar genome with cyanobacteria and plastids, showing consistent clustering with green algal plastids. Taken together, these observations indicate that the Apicomplexa acquired a plastid by secondary endosymbiosis, probably from a green alga.

S. Köhler, L. G. Tilney, D. S. Roos, Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA.
C. F. Delwiche and J. D. Palmer, Department of Biology, Indiana University, Bloomington, IN 47405, USA.
P. W. Denny and R. J. M. Wilson, National Institute for Medical Research, Mill Hill, London NW7 1AA, UK.
P. Webster, Department of Cell Biology, Yale University School of Medicine, New Haven, CT 06520, USA.
*These authors contributed equally to this work.
†Present address: School of Pharmacy, University of California, San Francisco, CA 94143, USA.
‡Present address: Department of Plant Biology, University of Maryland, College Park, MD 20742, USA.
§To whom correspondence should be addressed. E-mail: [email protected] or

Apicomplexan parasites contain two maternally inherited extrachromosomal DNA elements (1). The mitochondrial genome is a multicopy element of ~6 to 7 kb encoding three proteins of the respiratory chain and extensively fragmented ribosomal RNAs (2). In addition, these parasites contain a 35-kb circular DNA molecule with no significant similarity to known mitochondrial genomes. The 35-kb element is similar to chloroplast genomes, containing an inverted repeat of ribosomal RNA genes and genes typically found in chloroplasts but not mitochondria (rpoB/C, tufA, and clpC) (3). The 35-kb DNA is also predicted to encode a complete set of tRNAs, numerous ribosomal proteins, and several unidentified open reading frames (3).

We used in situ hybridization to determine whether the 35-kb DNA is found within the parasite nucleus, mitochondrion, or, alternatively, whether this molecule localizes to a previously unidentified DNA-containing organelle. We chose T. gondii for this project (rather than Plasmodium, in which the 35-kb element has been better characterized) for two reasons. First, there are approximately eight copies of the 35-kb circle per haploid genome in T. gondii tachyzoites, as opposed to approximately one copy in Plasmodium. Second, Toxoplasma offers much better ultrastructural resolution, because of its regular organization of intracellular organelles and well-defined apical region. To localize the 35-kb DNA, we hybridized extracellular tachyzoites with digoxigenin-labeled DNA probes that covered 10.5 kb of the 35-kb genomic sequence but excluded the ribosomal genes, to avoid cross-hybridization with the mitochondrial genome (4). We also targeted RNA transcripts derived from the 35-kb genome, using digoxigenin-labeled antisense RNA generated from putative rps4 sequences (5). The DNA:DNA or RNA:RNA hybrids were visualized by fluorescence in situ hybridization (FISH), and nuclear DNA was counterstained with the fluorescent dye YOYO-1.

Examination by laser-scanning confocal microscopy revealed that the 35-kb DNA of T. gondii is localized to a specific region in the cell, adjacent to (but distinct from) the apical end of the parasite nucleus (Fig. 1A). Transcripts of rps4 were also concentrated in this region (Fig. 1B), suggesting that diffusion of 35-kb DNA-related transcripts is restricted by a physical (possibly membranous) barrier.

Extranuclear DNA was not detected by YOYO-1 (or propidium iodide), presumably because of the low DNA concentrations typically found in non-nuclear organelles and the membrane-impermeable nature of these dyes. However, the extranuclear signal obtained by FISH resembled the pattern observed after staining with sensitive membrane-permeable DNA dyes such as Hoechst 33258 or 4',6'-diamidino-2-phenylindole (DAPI) (Fig. 1, D and E). To compare the subcellular distribution of extranuclear DNA with the 35-kb DNA-derived FISH signal (Fig. 1, F through H), we used a monoclonal antibody to DNA because neither Hoechst nor DAPI stains are excited by the Kr-Ar laser that was available for confocal imaging and because in situ signals were difficult to detect on a conventional fluorescence microscope.

To examine the subcellular location of the 35-kb DNA more precisely, we hybridized frozen ultrathin sections with digoxigenin-labeled DNA probes (Fig. 2, A and B). Staining with antidigoxigenin followed by a secondary antibody and gold-conjugated protein A localized the 35-kb element to a membranous region adjacent to the nucleus but distinct from either the mitochondrion or the Golgi apparatus (large gold particles). Antibody to DNA also stained this area (small gold particles). In control experiments, probes prepared from plasmid vector DNA showed no hybridization, although the antibody to DNA still detected the membranous region just apical to the nucleus. The morphology of the membranous structure labeled by 35-kb DNA probes is difficult to resolve under the harsh conditions used for in situ hybridization, but conditions suitable for labeling with antibody against DNA alone revealed an organelle associated with multiple membranes (Fig. 2C). Thin sections through Epon-embedded parasites (which provide superior membrane preservation but do not permit antibody or in situ labeling) show that this organelle is invariably enclosed by four bilayer membranes (Fig. 2, D and E).

Previous phylogenetic studies on the 35-kb genome suggested a plastid ancestry, but confidence in this assessment has been low because of the limited number of taxa and phylogenetic methods used (6). Genes identified on the 35-kb element include tufA, encoding the protein synthesis elongation factor Tu, a gene previously found useful for constructing molecular phylogenies (7). Phylogenetic analysis of tufA sequences from T. gondii, P. falciparum, and E. tenella (8) places the apicomplexan 35-kb element solidly within the plastids (Fig. 3). This placement is robust when either amino acid alignments or nucleotide alignments of first and second codon positions are analyzed under a variety of phylogenetic methods, including maximum likelihood, parsimony, and distance methods (using either Kimura three-parameter or LogDet transformation) (9). The association of apicomplexan tufA genes with those of plastids does not appear to be caused by either the AT-rich or the divergent nature of the sequences (10). The similarity of apicomplexan and plastid tufA genes is also supported by the presence of two insertions characteristic of plastids and cyanobacteria, although the length of these insertions is variable among the Apicomplexa.

All three phylogenetic methods used support monophyly of all plastids, including the apicomplexan 35-kb element. Resampling methods that test the internal consistency of phylogenetic patterns within the data gave bootstrap values (11) of 75, 39, and 88% for monophyly of plastids (for maximum likelihood, LogDet-neighbor-joining, and parsimony analyses, respectively) and 81, 69, and 94% for monophyly of plastids and cyanobacteria (Fig. 3). Parsimony analysis of nucleotide data scored only for transversion events also provides strong support for the clade composed of cyanobacteria and plastids (94%), and moderate support (78%) for plastid monophyly. The apicomplexan plastids were consistently placed among the green algae by all analytical methods used. Although support for green algal affinity was weak (bootstrap values of 41, 21, and 63%), these values are comparable to the level of support for green plastid monophyly when apicomplexans are excluded, yet the green plastids are known to be monophyletic on many other grounds (7). Trees constrained to place the Apicomplexa with nongreen plastids were consistently worse than those placing them with the green plastids, although the difference in likelihood was not significant by the Kishino-Hasegawa test (9).

Many investigators have assumed that the apicomplexan 35-kb genome is related to dinoflagellate plastids, on the basis of structural similarities between the Apicomplexa and dinoflagellates, and phylogenetic analyses of nuclear genes (12). Unfortunately, few dinoflagellate plastid genes have been examined, but there is considerable diversity of plastid form among dinoflagellates, and their plastids may have arisen from multiple distinct endosymbioses (13). Thus, it seems likely that the last common ancestor of all dinoflagellates was not photosynthetic and that the Apicomplexa and dinoflagellates acquired their plastids independently.

A structure consisting of multiple membranes has previously been described as the "Golgi adjunct" in Toxoplasma, and similar structures (variously termed the lamellärer Körper, vacuoles plurimembranaires, spherical body, or Hohlzylinder) have been observed in other apicomplexan parasites (14). The cytological derivation of this structure has been unclear, but the demonstration that this organelle is associated with a plastid genome in Toxoplasma, combined with the monophyly of Toxoplasma, Plasmodium, and Eimeria tufAs in all analyses, argues for a single endosymbiotic organelle common to all apicomplexans. The apicomplexan plastid (abbreviated "apicoplast") is an authentic plastid in all respects, albeit one that is probably incapable of photosynthesis.

Previous investigators have debated the number of membranes surrounding the apicoplast, suggesting that the appearance of multiple membranes may result from proximity to the endoplasmic reticulum or Golgi apparatus (14, 15). Although this organelle is closely associated with the Golgi, the fixation and staining conditions used for Fig. 2, D and E, commonly show four membranes. It is difficult to visualize distinct membranes all the way around the organelle (and serial sections necessarily lose definition at the top and bottom of the stack), but all of our micrographs are consistent with the four-membrane hypothesis, and many sections are clearly incompatible with ≤3 or ≥5 membranes. The presence of four membranes enclosing the apicoplast suggests that it originated as a secondary endosymbiont (derived by ingestion of a eukaryote that itself harbored a plastid), analogous to the plastids of chlorarachniophytes and cryptomonads (16). This hypothesis is bolstered by the phylogenetic grouping of apicoplasts with green algal plastids, which presents a clear conflict with nuclear gene phylogenies (12, 17) and therefore provides prima facie support for a secondary endosymbiotic origin. The putative green algal origin of apicomplexan plastids should be testable through further phylogenetic analyses of plastid sequences and analysis of apicomplexan nuclear genes of potential green algal origin, such as phosphoglucose isomerase and enolase (18).

The function of the apicoplast remains unknown, but the parasite faithfully replicates this organelle, which divides by binary fission and is introduced into developing daughter parasites very early during replication (Fig. 2E). The apicoplast genome is certainly transcribed: Several transcripts have been identified by Northern (RNA) blot analysis (3, 19), rps4 transcripts localize to the same region as the 35-kb DNA (Fig. 1), and ribosomal RNA derived from the 35-kb circle has been localized to this organelle (15). Like other endosymbiotic genomes (20), the 35-kb element is presumed to be the remnant of a much larger precursor, most of whose original functions have been lost or transferred to the nuclear genome. Photosynthesis is the most familiar function of plastids, and evidence for a chlorophyll binding protein in Apicomplexa has been reported (21), although we have not been able to confirm these results in Toxoplasma. Plastids also play many other key metabolic roles (including biosynthesis of amino acids and fatty acids, assimilation of nitrate and sulfate, and starch storage) (22) and have been maintained in many nonphotosynthetic taxa over millions of years (23). The apicoplast has been suggested as a target for macrolide antibiotics in Toxoplasma (24) and may also be the target for rifampicin in Plasmodium (25). Further studies are likely to elucidate important aspects of plastid function and evolutionary history, in addition to identifying other parasite-specific targets for chemotherapy.

REFERENCES AND NOTES

1. R. J. M. Wilson, D. H. Williamson, P. Preiser, Infect. Agents Dis. 3, 29 (1994); J. E. Feagin, Annu. Rev. Microbiol. 48, 81 (1994).
2. A. B. Vaidya, R. Akella, K. Suplick, Mol. Biochem. Parasitol. 35, 3614 (1989); J. T. Joseph, S. M. Aldritt, T. Unnasch, O. Puijalon, D. F. Wirth, Mol. Cell Biol. 9, 3621 (1989).
3. R. J. M. Wilson et al., J. Mol. Biol. 261, 155 (1996).
4. J. E. Feagin, E. Werner, M. J. Gardner, D. H. Williamson, R. J. M. Wilson, Nucleic Acids Res. 20, 879 (1992).
5. The complete sequence of the T. gondii 35-kb DNA has been deposited in GenBank, with accession number U87145. The map of this genome is virtually identical to that of Plasmodium (3).
6. C. J. Howe, J. Theor. Biol. 158, 199 (1992); M. J. Gardner et al., Mol. Biochem. Parasitol. 66, 221 (1994).
7. B. Cousineau, C. Cerpa, J. Lefebvre, J. Cedergren, Gene 120, 33 (1992); W. Ludwig et al., Antonie van Leeuwenhoek 64, 285 (1993); C. F. Delwiche, M. Kuhsel, J. D. Palmer, Mol. Phylogenet. Evol. 4, 110 (1995); S. L. Baldauf, J. D. Palmer, W. F. Doolittle, Proc. Natl. Acad. Sci. U.S.A. 93, 7749 (1996).
8. Predicted protein sequences for Apicomplexan tufA genes are available from GenBank; accession numbers: T. gondii, X88775; P. falciparum, X8763G; and E. tenella, X89446.
9. D. M. Hillis, C. Moritz, B. K. Mable, Eds., Molecular Systematics (Sinauer, Sunderland, MA, 1996).
10. The apicomplexan tufA genes are 70 to 80% A-T (~95% at third-codon positions), versus 48% (43%) for bacterial tufA genes and 66% (84%) for plastid tufAs. Biases in base composition can influence phylogenetic analysis [P. J. Lockhart, M. A. Steel, M. D. Hendy, D. Penny, Mol. Biol. Evol. 11, 605 (1994)], but bias-tolerant methods such as transversion and LogDet analyses consistently place the 35-kb element well within the plastids. Although apicomplexan tufAs are highly divergent (Plasmodium in particular) and therefore may be vulnerable to "long branch" effects [J. Felsenstein, Syst. Zool. 27, 401 (1978); D. M. Hillis, J. P. Huelsenbeck, C. W. Cunningham, Science 264, 671 (1994); J. H. Kim, Syst. Biol. 45, 363 (1996)], the Apicomplexa remain within the plastids even in the presence of the rapidly evolving mitochondrial and Mycoplasma genes. Only the association with Coleochaete in parsimony analyses is a probable example of long-branch effects.
11. J. Felsenstein, Annu. Rev. Genet. 22, 521 (1988).
12. N. D. Levine, J. Protozool. 35, 518 (1988); A. A. Gajadhar et al., Mol. Biochem. Parasitol. 45, 147 (1991); T. Cavalier-Smith, Microbiol. Rev. 57, 953 (1993); Y. Van de Peer, S. A. Rensing, U.-G. Maier, R. De Wachter, Proc. Natl. Acad. Sci. U.S.A. 93, 7732 (1996).
13. J. D. Dodge, in The Chromophyte Algae: Problems and Perspectives, J. C. Green, B. S. C. Leadbeater, W. L. Diver, Eds. (Clarendon, Oxford, 1989), pp. 207-227.
14. M. E. Siddall, Parasitol. Today 8, 90 (1992).
15. G. I. McFadden, M. E. Reith, J. Mulholland, N. Lang-Unnasch, Nature 381, 482 (1996).
16. G. I. McFadden and P. R. Gilson, Trends Ecol. Evol. 12, 12 (1995).
17. S. L. Baldauf and J. D. Palmer, Proc. Natl. Acad. Sci. U.S.A. 90, 11558 (1993).
18. D. Van Der Straeten, R. A. Rodrigues-Pousada, H. M. Goodman, M. Van Montagu, Plant Cell 3, 719 (1991); M. Read, K. E. Hicks, P. F. G. Sims, J. E. Hyde, Eur. J. Biochem. 220, 513 (1994); L. A. J. Katz, J. Mol. Evol. 43, 453 (1996).
19. J. E. Feagin and M. E. Drew, Exp. Parasitol. 80, 430 (1995).
20. J. D. Palmer, Annu. Rev. Genet. 19, 325 (1985); M. W. Gray, Annu. Rev. Cell Biol. 5, 25 (1989).
21. J. H. P. Hackstein et al., Parasitol. Res. 81, 207 (1995).
22. G. Hrazdina and R. R. Jensen, Annu. Rev. Plant Physiol. Plant Mol. Biol. 43, 241 (1992).
23. C. W. de Pamphilis and J. D. Palmer, Nature 348, 337 (1990); K. H. Wolfe, C. W. Morden, J. D. Palmer, Proc. Natl. Acad. Sci. U.S.A. 89, 10648 (1992).
24. C. J. M. Beckers et al., J. Clin. Invest. 95, 367 [...]; M. M. Fichera, M. K. Bhopale, D. S. Roos, Antimicrob. Agents Chemother. 39, 1530 (1995).
25. S. Pukrittayamee, C. Viravam, P. Charoenia[...] Yeamput, R. J. M. Wilson, Antimicrob. Agents Chemother. 38, 511 (1994).
26. We prepared nick-translated DNA probes covering 10.5 kb of the T. gondii 35-kb circle by incubating [...] to 2 μg of template DNA for 2 hours at 14°C in a 50-μl reaction mixture containing 50 mM tris (pH 7.8), 0.1 mM digoxigenin-11-deoxyuridine triphosphate (dUTP) (Boehringer-Mannheim), [...] mM dATP, 0.4 mM dCTP, 0.4 mM dGTP, [...] MgCl2, 10 mM dithiothreitol, 2.5 μg of nuclease-free bovine serum albumin, 10 U of DNA Polymerase [...] and DNase I at concentrations titrated to produce labeled fragments with an average length of [...] bp. RH-strain T. gondii tachyzoites were cultured in vitro in primary human fibroblasts [D. S. Roos, [...] K. Donald, N. S. Morrissette, A. L. C. Moulton, Methods Cell Biol. 45, 27 (1994)], resuspended in phosphate-buffered saline (PBS) at ~5 × 10[...] parasites per ml, attached to silane-coated glass slides, and fixed for 10 min at 25°C in a solution of 4% formaldehyde, 65% methanol, and 25% glacial acetic acid, followed by two 5-min fixations in methanol and glacial acetic acid (3:1). Slides were rinsed twice for 5 min in [...] ethanol, rehydrated, and permeabilized for 10 min at 25°C in Proteinase K at concentrations from [...] to 1.0 μg/ml (optimal concentrations varied from batch to batch) in 10 mM tris (pH 8.0) containing [...] EDTA. Specimens were then fixed for 5 min on ice in PBS-buffered 4% formaldehyde, rinsed twice in PBS, and incubated for 5 min in 2× standard saline citrate (SSC). Intracellular RNA was removed by digestion for 1 to 2 hours in a 200-μg/ml solution of DNase-free RNase A (in 2× SSC), followed by dehydration through an ethanol series. Parasite DNA was denatured for 5 min at 70°C in 70% formamide (in 2× SSC), chilled in ice-cold 70% ethanol, and dehydrated. Hybridization was carried out for 12 hours at 37°C in a 15-μl volume [10 ng of heat-denatured probe, 1 μg of yeast tRNA, and 1 to 2 μg of [...] denatured calf thymus DNA per microliter of [...] formamide, 10 mM tris (pH 7.4), 300 mM Na[...] mM EDTA (pH 8), 10% dextran sulfate, and 1× Denhardt's solution]. After hybridization, the slides were rinsed in 4× SSC and twice washed for 10 min at 25°C in 4× SSC, twice for 3 min at 37°C in [...] formamide (in 2× SSC), twice for 5 min at 37°C in [...] SSC, once for 2 min at 25°C in 2× SSC, and [...] for 5 min at 25°C in 4× SSC. We visualized the hybrids by incubating the specimens for 40 min at 25°C in 4× SSC containing rhodamine-conjugated polyclonal sheep anti-digoxigenin (Boehringer) [...] 0.5% nuclease-free blocking reagent. Control hybridizations with labeled pGEM-3 vector [...] showed no signal. Nuclear DNA was stained for [...] min at 25°C with 2.5 nM YOYO-1 (Molecular Probes) in 1× SSC. In Fig. 1, F through H, extranuclear DNA was stained with a monoclonal antibody raised against double-stranded DNA (Boehringer), followed by a secondary fluorescein isothiocyanate (FITC)-conjugated rabbit-antimouse antibody (Pierce). Both YOYO and FITC were visualized with a fluorescein filter set. Specimens were mounted in Aqua-Poly/Mount (Polysciences) and analyzed with a Leitz scanning confocal microscope equipped with a Kr-Ar laser, FITC and tetramethylrhodamine isothiocyanate filter sets, and a transmitted-light detector.
27. Antisense- and sense-RNA probes were prepared by standard procedures, with the use of the T7 and Sp6 promoters in pGEM-3, flanking a 0.7-kb [...] fragment from the T. gondii 35-kb genome predicted to encode rps4 (5). Freshly harvested parasites were fixed for 5 min at 25°C in 4% PBS-buffered formaldehyde, washed in PBS, attached to silane-coated slides, fixed for 5 min on ice in 4% formaldehyde, briefly washed in PBS, and treated with Proteinase K. After another 5 min of fixation on ice in 4% formaldehyde, the slides were washed in PBS, dehydrated, and incubated with 15 μl of hybridization solution (26), at probe concentrations of 1 to 2 ng/μl. Nuclear DNA was counterstained with YOYO-1.

Fig. 1. The 35-kb episomal genome and 35-kb derived RNA transcripts localize to a specific region adjacent to the nucleus in T. gondii tachyzoites. (A) Pseudocolor image of T. gondii tachyzoites hybridized with digoxigenin-labeled 35-kb genome-specific DNA (26). The DNA:DNA hybrids were visualized with rhodamine-conjugated anti-digoxigenin (red), and nuclear DNA was counterstained with YOYO-1 (green). Signals derived from the two fluorophores were collected independently by laser scanning confocal microscopy and merged with phase-contrast images simultaneously collected from the transmitted-light flow-through from the confocal microscope. (B and C) Localization of 35-kb DNA-encoded rps4 transcripts (27). Tachyzoites were hybridized with digoxigenin-labeled (B) antisense or (C) sense RNA generated in vitro from a cloned DNA fragment spanning the putative rps4 gene and visualized with rhodamine as above (red); nuclei were counterstained with YOYO-1 (green). (D and E) Extrachromosomal DNA in T. gondii tachyzoites. Fixed parasites were stained for 20 min at 25°C with ~2 μg/ml of Hoechst 33258 in 1× SSC and examined by conventional epifluorescence microscopy with a Zeiss Axiovert 35 equipped with an ultraviolet filter set. A distinct extranuclear signal is seen in extracellular tachyzoites (D). Intracellular tachyzoites (E) orient in "rosettes," with their apical ends pointed outward (28), permitting localization of the extranuclear DNA to the apical juxtanuclear region. (F through H) Co-localization of extranuclear DNA and 35-kb genome-specific sequences. Nuclei were labeled with YOYO-1 and extranuclear DNA with an antibody directed against DNA, followed by a fluorescein-conjugated secondary antibody (green). (Nuclear DNA was not labeled by the antibody to DNA except under extraction conditions that destroyed parasite morphology, presumably because binding is blocked by -associated proteins.) The extranuclear DNA co-localizes with in situ hybridization probes derived from the 35-kb element (red). (F and G) Green and green + red images of the same field (containing two parasites); (H) green and red fluorescence signals from a different parasite, merged with the corresponding phase-contrast image. Scale bars, 5 μm.

Fig. 2. Ultrastructural localization of 35-kb genome-specific DNA to a unique organelle enclosed by four membranes in T. gondii tachyzoites. (A) Longitudinal ultrathin cryosection of T. gondii tachyzoites hybridized with digoxigenin-labeled probes derived from the 35-kb DNA (29). Digoxigenin was visualized with antibodies and protein A coupled to 10-nm gold particles. Samples were further incubated with monoclonal antibody directed against DNA (which does not stain intact chromatin in the parasite nucleus; see Fig. 1 legend), followed by a secondary antibody with protein A coupled to 5-nm gold (30). (B) Higher magnification of the region in (A) showing gold labeling. The 35-kb DNA probes hybridize specifically with a membranous region (*) just apical to the nucleus (Nu) but are distinct from the mitochondrion (m) and Golgi apparatus (g). (C) Immunogold labeling of extranuclear DNA (10-nm gold particles) in a T. gondii tachyzoite not subjected to in situ hybridization conditions. Membranes appear white in this negatively stained image. (D and E) Ultrathin sections through the (*) of an Epon-embedded parasite (31). The organelle is surrounded by four membranes (stained black by uranyl acetate). The parasite in (E) is beginning to divide, as indicated by division of the Golgi and development of the two daughter "buds." The apicoplast is flattened adjacent to the apical end of the nucleus and is divided between the two daughters early during endodyogeny.

[Figure 3 panels: Maximum Likelihood; LogDet / Neighbor-Joining; Parsimony. Clade labels: Apicomplexa; Green Algae, Plants and Euglenophytes; Red Algae and Chromophytes; Cyanobacteria.]

Fig. 3. Molecular phylogenetic analyses of tufA genes from three apicomplexan 35-kb genomes and representative eubacteria, plastids, and mitochondria (32). Maximum likelihood finds the phylogeny that is statistically most likely to have given rise to the observed sequences. Neighbor joining is a cluster method, in this case using "LogDet" distances (-ln determinant). Parsimony finds the tree that requires the fewest inferred mutations to represent the data (9). Branch lengths are proportional to the number of inferred substitutions (or LogDet value); bootstrap values >40% are given above the corresponding branch (11). The column at the far right indicates the number of membranes surrounding the plastid for taxa in the parsimony tree. All three phylogenetic methods consistently group the apicomplexan 35-kb encoded tufA genes with green algal plastids.

1488 SCIENCE • VOL. 275 • 7 MARCH 1997 • http://www.sciencemag.org

28. N. S. Morrissette, V. Bedian, P. Webster, D. S. Roos, Exp. Parasitol. 79, 445 (1994).
29. Extracellular tachyzoites were fixed for 1 hour on ice in 4% PBS-buffered formaldehyde and then for 12 hours at 4°C in 8% PBS-buffered formaldehyde. The cell suspension was embedded in 10% gelatin, incubated for 2 hours at 4°C in PBS containing 2.3 M sucrose, and frozen in liquid nitrogen. Ultrathin sections of the frozen samples were freshly prepared before each hybridization experiment. Cryosections were transferred to grids and digested for 40 min at 37°C in 2× SSC containing 200 µg/ml of DNase-free RNase A. Cellular DNAs were denatured for 5 min at 70°C in 70% formamide (in 2× SSC), chilled on ice, transferred to 50% formamide (in 2× SSC), and incubated briefly at 25°C. Sections were hybridized for 12 hours at 37°C in a humidified chamber in 5 µl of hybridization mix containing 10 to 20 ng/µl of DNA probe (26), washed three times for 5 min at 25°C in 4× SSC, twice for 3 min at 37°C in 50% formamide (in 2× SSC), twice for 5 min at 25°C in 2× SSC, and kept in 4× SSC at 25°C before staining. Hybridized probe was detected with polyclonal sheep anti-digoxigenin, followed by a secondary rabbit antibody directed against sheep immunoglobulin G (Pierce), and Protein A conjugated to 10-nm particles of gold. Immunogold-labeled sections were blocked for 20 min at 25°C in 4× SSC containing 0.5% blocking reagent and were incubated with a monoclonal antibody against DNA, followed by a rabbit anti-mouse secondary antibody and protein A conjugated to 5-nm gold particles. To improve the contrast of membranous structures, we counterstained hybridized cryosections on ice for 10 min in 0.3% aqueous uranyl acetate plus 2% methylcellulose. Grids were air-dried on loops and examined with a Phillips EM400 microscope.
30. The antibody directed against DNA used in Fig. 2A probably recognizes both endogenous DNA and the digoxigenin-labeled probe. Similarly, the 5-nm gold-protein A conjugate used to visualize this antibody (by means of a secondary rabbit antibody) is potentially able to recognize any anti-digoxigenin that remained unblocked. Comparable staining with antibody against DNA was observed even in the absence of a DNA probe, however (Fig. 2C), or when control plasmid was used as a probe. Cryosections labeled with antibody to DNA before the application of anti-digoxigenin also showed co-localization of large and small gold particles. The apparent clustering of label in Fig. 2, A and B, may be an artifact of in situ hybridization conditions, because antibody directed against DNA labels the organelle uniformly (Fig. 2C).
31. Infected cultures were fixed for 45 min in freshly prepared 50 mM phosphate buffer (pH 6.3) containing 1% glutaraldehyde and 1% OsO4, rinsed in distilled water, stained in 0.5% uranyl acetate overnight, dehydrated, and embedded in Epon. Ultrathin sections were picked up on uncoated grids, stained with uranyl acetate and lead citrate, and examined with a Phillips 200 electron microscope.
32. A total of 65 sequences, including nearly all available bacterial sequences and representative plastid sequences, were aligned using PILEUP [Genetics Computer Group, Madison, WI (1991)], with manual refinement on the basis of secondary structural information. Maximum likelihood analysis was performed with fastDNAml v1.0.6 [G. J. Olsen, H. Matsuda, R. Hagstrom, R. Overbeek, CABIOS 10, 41 (1994)], compiled as parallel code running on an Intel Paragon 64-node partition. Three random addition sequences and global swapping were used, but it cannot be guaranteed that the tree found is the highest likelihood tree possible. Bootstrap data sets and consensus trees were generated using PHYLIP tools SEQBOOT and CONSENSE [J. Felsenstein, University of Washington, Seattle, WA (1993)]. Bootstrap replicates were analyzed with fastDNAml using a single random addition sequence and local branch swapping only. LogDet, parsimony, and constraint analyses were performed with PAUP*4.0d48 [D. L. Swofford, Smithsonian Institution, Washington, DC (1996)] using nucleotide data from the first and second codon positions, and bootstrapping was carried out using 100 replicates with random addition sequences (where appropriate). LogDet distances are not directly comparable to standard distances but yield additive distances under any Markov model when sites are evolving independently and at the same rate (9).
33. This work was supported by NIH grants AI-31808 (D.S.R.) and GM-52857 (L.G.T.), NSF grant DEB-93-18594 (J.D.P.), and the University of Pennsylvania Program in Computational Biology. P.W.D. was supported by a Medical Research Council studentship, and R.J.M.W. by the United Nations Development Program-World Bank-World Health Organization Special Programme for Research in Tropical Diseases. D.S.R. is a Burroughs Wellcome New Investigator in Molecular Parasitology and a Presidential Young Investigator of the NSF, with support from Merck Research Laboratories and the MacArthur Foundation. We wish to thank R. G. K. Donald for molecular clones derived from the T. gondii 35-kb element, L. Chicoine for assistance with cryosectioning, J. F. Dubremetz for suggesting that the apicomplexan plastid described herein be designated the apicoplast, and P. Kuhlman, F. Lutzoni, K. Pryer, and D. Williamson for helpful discussions.

16 September 1996; accepted 27 December 1996

TECHNICAL COMMENTS

Evidence for a Family of Archaeal ATPases

The analysis by Carol J. Bult et al. of the Methanococcus jannaschii genome included families of paralogous proteins that did not seem to have counterparts in the current sequence databases (1). The largest of such families consists of 13 chromosomal and three plasmid-encoded proteins, which were found to be highly similar to one another [figure 6 in (1)], but did not show statistically significant similarity to any proteins, thus escaping functional prediction. Our inspection of the alignment, however, indicates that two of the conserved sequence blocks correspond to well-characterized functional motifs: namely, the phosphate-binding P-loop and the Mg2+-binding site that are conserved in a vast variety of ATPases and GTPases (Fig. 1 and 2-4). Even though most commonly used methods for database search such as BLASTP (5) showed only marginally significant similarity to several ATPases, a new version of the BLASTP program that constructs local alignments with gaps (6) indicated a probability of matching by chance between 10^[...] and 10^[...] for some of the proteins in the new archaeal family and bacterial DnaA proteins; the conservation was particularly notable in the two ATPase motifs (Fig. 1). Thus, even though these 16 proteins comprise a novel family that is so far represented only in archaea, they appear to belong to a known broad class of proteins, and we predict that they possess ATPase activity.

Screening of the nonredundant protein sequence database at the National Center for Biotechnology Information (National Institutes of Health, Bethesda, MD), with a bipartite pattern representing the specific forms of the two ATPase motifs conserved in the M. jannaschii family, namely, hhhhGx[...]GK[TS]x[...]hhhhD[DE] (h indicates a bulky hydrophobic residue), selected 271 proteins, all of which are either known to possess ATPase activity or are highly similar to ATPases. In addition to DnaA, this list includes a number of members of the so-called AAA ATPase family (7); the similarity between these proteins and DnaA has been noted before (4). Many of the AAA family proteins possess chaperone-like activity and, in particular, are involved in ATP-dependent proteolysis; examples include bacterial proteins ClpA, ClpB, ClpX, FtsH, and HslU; proteasome components; and yeast HSP78 (7). Members of the novel archaeal protein family could also perform chaperone-like functions. This is particularly plausible, because M. jannaschii does not encode several molecular chaperones that are ubiquitous and highly conserved in bacteria and eukaryotes, namely, members of the HSP70, HSP90, and HSP40 families. It remains to be seen how typical is this situation in [...].

Finally, the family of putative ATPases contains a third strikingly conserved motif with two invariant histidines and one invariant cysteine (Fig. 1). Even though this motif did not show statistically significant similarity to any proteins in the database, this may be a specific metal-binding site, and some resemblance of the divalent cation-binding motif in bacterial Fur proteins that are metal-dependent transcription regulators (8) could be detected (Fig. 1). Two observations seem relevant: (i) One of the chaperone ATPases, FtsH, contains a metal-binding motif conserved in its bacterial and eukaryotic homologs and is a Zn-dependent protease (9). (ii) Methanococcus jannaschii encodes at least two other putative ATPases, namely, the predicted proteins MJ0578 and MJ0579 that also contain a metal-binding domain, in these cases a ferredoxin-like domain (10).

Thus, analysis of conserved motifs and application of additional methods for sequence database search yields specific functional predictions for archaeal proteins that initially appeared to comprise a unique family. There is little doubt that further exploration of the M. jannaschii genome sequence will bring more interesting findings.

Eugene V. Koonin, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
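
The bipartite pattern screen described in the comment above (a P-loop motif followed, after a spacer, by a hydrophobic stretch ending in D[DE]) can be illustrated with a regular expression. What follows is only a minimal sketch under stated assumptions and not the search procedure actually used: the repeat counts in the printed pattern are illegible in this copy, so `p_loop_gap`, `min_spacer`, and `max_spacer` are hypothetical parameters, and the residue set taken to mean "bulky hydrophobic" is likewise an assumption. The function names and toy sequences are invented for illustration.

```python
import re

# Assumption: residues counted as "bulky hydrophobic" (h in the pattern).
# The comment does not list them explicitly.
HYDROPHOBIC = "ILVMFYW"


def bipartite_pattern(p_loop_gap=4, min_spacer=20, max_spacer=120):
    """Build a regex of the form hhhhG x{n} GK[TS] x{a,b} hhhhD[DE].

    All three numeric arguments are hypothetical defaults; the repeat
    counts of the published pattern cannot be read from the scan.
    """
    h = f"[{HYDROPHOBIC}]"
    p_loop = f"{h}{{4}}G.{{{p_loop_gap}}}GK[TS]"   # phosphate-binding P-loop
    mg_site = f"{h}{{4}}D[DE]"                     # Mg2+-binding site
    spacer = f".{{{min_spacer},{max_spacer}}}?"    # variable-length linker
    return re.compile(p_loop + spacer + mg_site)


def screen(sequences, pattern):
    """Return the names of sequences that contain the bipartite pattern."""
    return [name for name, seq in sequences.items() if pattern.search(seq)]


# Toy sequences, invented for illustration only.
toy = {
    "hit": "MK" + "LLVI" + "G" + "AAAA" + "GKT" + "A" * 25 + "VVIL" + "DE" + "KK",
    "miss": "MK" + "LLVI" + "G" + "AAAA" + "GRT" + "A" * 25 + "VVIL" + "DE" + "KK",
}
print(screen(toy, bipartite_pattern()))
```

A fixed-length gap inside the P-loop and a bounded, variable-length spacer between the two motifs is one common way to encode such a pattern; widening or narrowing the spacer bounds trades sensitivity against the number of chance matches.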