Copyright © 1995 by the Genetics Society of America

Complete Sequence and Gene Organization of the Mitochondrial Genome of the Land Snail Albinaria coerulea

Evi Hatzoglou, George C. Rodakis and Rena Lecanidou

Department of Biochemistry, Cell and Molecular Biology, and Genetics, University of Athens, Panepistimiopolis, Athens 157 01, Greece

Manuscript received January 31, 1995
Accepted for publication May 15, 1995

ABSTRACT

The complete sequence (14,130 bp) of the mitochondrial DNA (mtDNA) of the land snail Albinaria coerulea was determined. It contains 13 protein, two rRNA and 22 tRNA genes. Twenty-four of these genes are encoded by one and 13 genes by the other strand. The gene arrangement shares almost no similarities with that of two other molluscs for which the complete gene content and arrangement are known, the bivalve Mytilus edulis and the chiton Katharina tunicata; the protein and rRNA gene order is similar to that of another terrestrial gastropod, Cepaea nemoralis. Unusual features include the following: (1) the absence of lengthy noncoding regions (there are only 141 intergenic nucleotides interspersed at different gene borders, the longest intergenic sequence being 42 nucleotides), (2) the presence of several overlapping genes (mostly tRNAs), (3) the presence of tRNA-like structures and other stem and loop structures within genes. An RNA editing system acting on tRNAs must necessarily be invoked for posttranscriptional extension of the overlapping tRNAs. Due to these features, and also because of the small size of its genes (e.g., it contains the smallest rRNA genes among the known coelomates), it is one of the most compact mitochondrial genomes known to date.

Corresponding author: Rena Lecanidou, Department of Biochemistry, Cell and Molecular Biology, and Genetics, University of Athens, Panepistimiopolis, Athens 157 01, Greece. E-mail: [email protected]

Genetics 140: 1353-1366 (August, 1995)

In the last few years, there has been an accelerated accumulation of sequence data on animal mitochondrial genomes. Most of the complete mitochondrial DNA (mtDNA) sequences that have been published concern deuterostomes [vertebrates and echinoderms: human (ANDERSON et al. 1981), mouse (BIBB et al. 1981), rat (GADALETA et al. 1989), cow (ANDERSON et al. 1982), fin whale (ARNASON et al. 1991), blue whale (ARNASON and GULLBERG 1993), harbor seal, Phoca vitulina (ARNASON and JOHNSSON 1992), American opossum, Didelphis virginiana (JANKE et al. 1994), chicken (DESJARDINS and MORAIS 1990), Xenopus (ROE et al. 1985), carp (CHANG et al. 1994), sea urchins Paracentrotus lividus (CANTATORE et al. 1989) and Strongylocentrotus purpuratus (JACOBS et al. 1988), sea star, Asterina pectinifera (ASAKAWA et al. 1991)], while the available sequences for protostome coelomates are limited to four arthropods [Drosophila yakuba (CLARY and WOLSTENHOLME 1985a), bee (CROZIER and CROZIER 1993), mosquito, Anopheles quadrimaculatus (MITCHELL et al. 1993), Artemia (PEREZ et al. 1994, EMBL data bank accession number X69067)] and one mollusc [the chiton Katharina tunicata (BOORE and BROWN 1994b)]. Within the phylum of molluscs, considerable information is also available for Mytilus edulis, whose partial sequence but complete mtDNA gene content and organization is known (HOFFMANN et al. 1992), as well as Cepaea nemoralis, whose partial gene order (excluding tRNA genes) has been published (TERRETT et al. 1994).

The mtDNA gene content of coelomate metazoans is constant: it consists of 13 protein genes, two genes for the small and large ribosomal RNA subunits, and 22 tRNA genes, some of which are transcribed from one and some from the other mtDNA strand. An exception is observed in molluscs. Mytilus is missing a protein gene (ATPase8), while it contains an extra tRNA gene and has all of its genes transcribed from one strand (HOFFMANN et al. 1992); Katharina, on the other hand, contains the standard set of 37 mitochondrial genes, as well as two extra tRNA genes, that may or may not be functional (BOORE and BROWN 1994b).

Besides coding regions, there are also noncoding sequences in animal mtDNA and, more specifically, a major noncoding segment, which in deuterostomes (D-loop) contains a combination of sequence elements that are related with control of both replication and transcription (JACOBS et al. 1988; CLAYTON 1991, 1992; WOLSTENHOLME 1992; SHADEL and CLAYTON 1993). The length of this region is extremely variable [121 nucleotides (nt), sea urchin (JACOBS et al. 1988); over 20 kb, pine weevil (BOYCE et al. 1989)]. The length variation is usually due to the presence of short (e.g., 10 bp) (GHIVIZZANI et al. 1993) or longer [e.g., 260 bp (BROUGHTON and DOWLING 1994) or 1.2 kb (GJETVAJ et al. 1992)] repeated sequences. Size variation in noncoding sequences inevitably results in total length variation, which can be observed even at the level of individuals (reviewed by MORITZ et al. 1987). Extremes in size variation have been documented in molluscs: seven species of scallops have sizes ranging from 16.2 to 42 kb (SNYDER et al. 1987; LAROCHE et al. 1990; GJETVAJ et al. 1992).

The mtDNA gene organization is considered to be relatively constant within each metazoan phylum, where the observed variations mainly concern tRNA gene rearrangements (WOLSTENHOLME 1992) and nucleotide substitutions (CROZIER et al. 1989; PALUMBI and BENZIE 1991; AVISE et al. 1992; MARTIN et al. 1992). However, molluscs are the largest exception to this rule, since large variations in gene organization (e.g., between Mytilus and Katharina) (BOORE and BROWN 1994b) and sequence [Albinaria turrita vs. Mytilus (LECANIDOU et al. 1994) and Katharina vs. Mytilus (BOORE and BROWN 1994b)] have been found. We have recently pointed out the first indications for this great diversity of molluscan mtDNA, as Albinaria sequences are more similar to the corresponding sequences of Drosophila than of Mytilus, and have suggested that this may be attributed to either a polyphyletic origin or to a high and differentiated evolutionary rate of molluscan mtDNA (DOURIS et al. 1995; LECANIDOU et al. 1994). BOORE and BROWN (1994b), on the basis of the observation that structural features of Katharina resemble more those of Drosophila than of Mytilus, have proposed a fast mtDNA molecular clock for Mytilus.

In this paper, we present the complete mtDNA sequence of the land snail A. coerulea. It is the smallest mtDNA among coelomate metazoans (14,130 bp), does not contain any noncoding sequence longer than 42 nt, and its gene organization seems to have almost nothing in common with that of other known metazoans, with the exception of the major genes of another terrestrial gastropod, C. nemoralis.

FIGURE 1.-Gene map of the A. coerulea mtDNA molecule. The outer circle represents the sense strand for genes transcribed clockwise and the inner circle for genes that are transcribed counterclockwise. tRNA genes are depicted by the one-letter code. Positive numbers at gene boundaries indicate noncoding nucleotides; negative numbers indicate overlapping nucleotides of adjacent genes. Abbreviations used: ATPase6-8, ATP synthase subunits 6 and 8; COI-III, cytochrome c oxidase subunits I, II, and III; Cytb, cytochrome b apoenzyme; ND1-6 and ND4L, NADH dehydrogenase subunits 1-6 and 4L; s-rRNA and l-rRNA, small and large subunits of ribosomal RNA.

MATERIALS AND METHODS

The mtDNA of A. coerulea was cloned in three consecutive HindIII segments: 6.55 kb (13A3, bases 1561-8123), 5.4 kb (13A12, bases 8124-13278), and 2.4 kb (13F6, bases 13279-1560) (Figure 2) (DOURIS et al. 1995). The following evidence indicates that the entire mitochondrial genome is represented in these three HindIII clones: (1) sequencing of an EcoRI clone (15A6, bases 282-2950) (Figure 2) showed that it overlaps clones 13F6 and 13A3, which therefore must be adjacent; (2) clones 13A12 and 13F6 must also be adjacent, because a sequenced clone from a closely related Albinaria species (A28 from A. turrita) (LECANIDOU et al. 1994) overlaps them, and (3) the HindIII site between clones 13A3 and 13A12 (base 8124) (Figure 2) lies in a conserved region of the ND4 gene (bases 8113 to 8137 correspond to the amino acid sequence FLVKLPIY, which is identical in the chiton K. tunicata) (BOORE and BROWN 1994b).

Restriction fragments of the three HindIII clones, as well as of the overlapping EcoRI clone, were subcloned into plasmid vectors pUC8, 9, 18, 19, or pBluescript II KS (Stratagene). Further subcloning was performed after deletions of clones and subclones using the enzyme Exonuclease III (HENIKOFF 1987). Clones and derived subclones were amplified after transformation with CaCl2/RbCl, using as hosts Escherichia coli JM83. DNA was extracted by mini preparations with alkaline lysis (SAMBROOK et al. 1989).

Both mtDNA strands were completely sequenced. Sequence determination was performed from the ends of clones and subclones by the dideoxynucleotide chain termination method (SANGER et al. 1977) using Sequenase V 2.0 (US Biochemical) and universal M13/pUC primers. Samples were resolved in 4 or 5% polyacrylamide, 7.5 M urea gels, and autoradiography was performed using Kodak X-Omat film. Sequences were identified by comparison with data from the EMBL data bank or from cited publications. The tRNA genes were identified by their potential to form the characteristic for mt-tRNA stem and loop structures. Sequence analysis was performed using computer programs developed by PUSTELL and KAFATOS (1984, 1986). Other computer programs used are cited in the text.

Northern hybridization was used to identify the location of the tRNALys gene. RNA was extracted from the feet (muscle) of 12 snails, treated with Proteinase K, extracted with phenol/chloroform and precipitated with ethanol according to SAMBROOK et al. (1989). Samples were resolved by electrophoresis in a 20 × 20 × 0.15 cm 4% polyacrylamide, 6 M urea gel and electroblotted in 0.5X TBE to a Gene Screen Plus (DuPont, NEF-976) hybridization transfer membrane (SRIVASTAVA et al. 1993). The membrane was baked at 80° for 2 hr. mtDNA fragments used as probes were labelled by nick translation (SAMBROOK et al. 1989). Hybridization was performed at 65° in 0.4 M NaCl, 1% sodium dodecyl sulfate, 50 mM Tris-HCl pH 7.5. Autoradiography was performed using an intensifying screen (DuPont Cronex Lightning-Plus) and Kodak X-Omat film. Exposure was for 1-4 days.

The sequence of A. coerulea mtDNA has been deposited in the EMBL sequence data bank under accession number X83390.

RESULTS AND DISCUSSION

General features: A. coerulea mtDNA has the typical features of metazoan mtDNA but several of its characteristics are novel. Its length (14,130 bp) is the smallest among known coelomates with the exception of that belonging to yet another land gastropod, C. nemoralis, that is reported to have approximately the same size as Albinaria (TERRETT et al. 1994). The pseudocoelomate nematodes A. suum and C. elegans (OKIMOTO et al. 1992) have equally small mtDNA genomes. The small size of Albinaria mtDNA is the consequence of its compact gene organization, the small size of its genes, as well as the absence of a lengthy noncoding region (Figure 1). It contains all 37 genes typical of metazoan mtDNA: 13 protein genes (ATPase6, ATPase8, COI, COII, COIII, Cytb, ND1, ND2, ND3, ND4, ND4L, ND5, ND6), two ribosomal RNA genes (l-rRNA, s-rRNA) and 22 tRNA genes. The sequence shown in Figure 2 constitutes the sense strand for 24 genes (major strand, coding for more genes), while 13 genes are transcribed in the opposite direction (minor strand, coding for fewer genes). The base composition of the major strand is T, 37.9%; C, 13.8%; A, 32.8%; G, 15.5%, and the G + T content is 53.4%. However, if we take into account the nucleotide frequencies at fourfold synonymous sites (considered by definition as neutral positions), we find that in these positions the major strand has almost the same G + T percentage as the minor strand (50.3 and 50.2%, respectively).

The A + T content (70.7%) of Albinaria is higher than that of deuterostomes [in echinoderms the A + T content ranges from 58.9%, S. purpuratus (JACOBS et al. 1988) to 61.3%, A. pectinifera (ASAKAWA et al. 1991); in vertebrates from 55.6%, human (ANDERSON et al. 1981) to 63.2%, mouse (BIBB et al. 1981)], as well as of other known molluscs [69.0%, K. tunicata (BOORE and BROWN 1994b); 62%, sequenced portions of M. edulis (HOFFMANN et al. 1992)], but lower than that of insects [78.6%, D. yakuba (CLARY and WOLSTENHOLME 1985a); 84.9%, Apis mellifera (CROZIER and CROZIER 1993); 77.4%, A. quadrimaculatus (MITCHELL et al. 1993)] and of pseudocoelomate nematodes (76.2%, C. elegans; 72.0%, A. suum) (OKIMOTO et al. 1992). The A + T content at fourfold synonymous sites (77.5% overall, 77.9% major strand, 76.3% minor strand) is indicative of a clear bias toward A + T.

As far as the dinucleotide composition of the mtDNA is concerned, Albinaria exhibits the same deficiencies as all known metazoan mtDNAs. The double-stranded dinucleotides CC·GG and GC·GC show a high ratio of observed/expected frequency, ρ* = 1.44 and ρ* = 1.30, respectively; ρ* is the symmetrized dinucleotide odds ratio calculated according to BURGE et al. (1992). In contrast, the double-stranded dinucleotide CG·CG is the only under-represented (ρ* = 0.70). At present, there is no widely accepted explanation for the CpG suppression in animal mtDNAs (CARDON et al. 1994).

Gene arrangement: Close inspection of the Albinaria gene arrangement map (Figure 1) reveals certain interesting features. The Albinaria gene organization is novel, bearing almost no similarities to any published complete gene arrangements, including those of two other molluscs, Mytilus and Katharina (HOFFMANN et al. 1992; BOORE and BROWN 1994a,b). However, the partial gene organization of another terrestrial gastropod, C. nemoralis (TERRETT et al. 1994), reveals many similarities in the protein and rRNA gene order with Albinaria: only one gene rearrangement is required to interconvert the Albinaria and Cepaea protein and rRNA gene order, namely a transposition of the ND4 or COIII gene. As the Cepaea complete sequence and gene organization has not been published, a comparison of tRNA genes of these two land gastropods cannot be made at present, although it can be inferred from looking at Figure 6 of TERRETT et al. (1994) that there are at least two positions in the genome map that differ: a region containing either tRNA genes or noncoding sequences present between genes coding for ND6 and ND5 in Cepaea is absent from Albinaria and no tRNA is present between s-rRNA and ATPase6 in Cepaea, whereas these two genes in Albinaria are separated by two tRNA genes.

No gene boundaries are shared between Albinaria and Mytilus and only very few with Katharina and the arthropods. More specifically, tRNAVal, l-rRNA and tRNALeu(CUN) are directly adjacent in Albinaria, Katharina and the arthropods, but in Albinaria these genes are transcribed from the major strand; furthermore, Albinaria and Katharina also share the gene boundaries between tRNAMet and s-rRNA, which are transcribed from the minor strand in both molluscs. The only other shared gene boundaries between Albinaria and other organisms are those between genes COIII and tRNAThr, which is also observed in the nematodes (OKIMOTO et al. 1992), and between the s-rRNA and tRNAGlu, which is held in common with echinoderms (see SMITH et al. 1993).

It has been suggested from gene order comparisons that rearrangements involving tRNAs occur more frequently than rearrangements involving other genes (WOLSTENHOLME 1992). Thus, if the differences in relative locations of tRNA are ignored, we can discern very limited similarities in protein gene borders between Albinaria and Mytilus involving the Cytb and COII genes, and the ND4, COIII and ND2 genes, although COIII is transcribed from the minor strand in Albinaria (while all Mytilus genes are transcribed from one strand). Perhaps more interesting is the proximity of ND2 and COI genes in many organisms including the molluscs Albinaria, Katharina and Cepaea (but not Mytilus), the arthro-

[ m5-> A 1 CCGTTTTCCT~TTATTAGGTGTTCTATGT~TATTATAffiTGTAATTTACATAGTATTAAATATA~TTCCAGTTATCTTTTMTATTTMTTTA 101 TTTTCAACC-GUjTTAACTTTAACTT~TTTAATTTGTGAT-T~GTTTTTffiT~TGGTATT~MTTT~~GTGTTTTTC 201 TTTTTGCTMTGAATATATATCTGAAGATCATTATAT-TCCGTTTTGGTTGAATTTTMT-TTTGT~CATCTAT~TTCT~TTTT~ffi 301 TTCAATTTTTACTTTGCTTCTAGGTTGAGATGCTCG~TT~TTCATTT~TTTAATT~T~~~TAATTATMT~TCTTCCTCA~TTT 401 CTGACGCTAATMCTAATCGATTAGGAGATGTATTAATTATT~T-TTT~TMTT~~TT-GTTT~CGTC~TTCCCCCCTAT~T 501 TRGTTTGGCTCTCATCGATTTTATTTACAATTGCAAGTTT 601 ACCAGTAAGGGCATTAGTCCACTCTAGGACA~~TT~GGCT~TCTATTTAATMTTCGTTGTTTTAT~T~TffiT~CC~T-TATAT 701 AGTTTAATGGGACTTGTTGGATCTATTACTTGTTTACTGGGRGGT~GTCGCATTATTTGRGTATGATTT~T~TT~CTTATCTACATTM 801 GTCAGTTGGGAGTTATAATATAT~TT~TCTTMTCTACCTT~CTAGCATTACTTCATTTATAT~TGCTATGTTT~ATACTATTTTT 901 AGGGGCAGGACTTATTCTAATATGGGACT~GATTT~GTTTACTAGGTAGATTACTATATTCTT~CTATTGTTATCTCACTGCTTMT 1001 ATTRGTATATTGTGCTTMT~GATTTCCATTTGTT~TCATTTTACT~CATTTMTC~~TATT~TAT-TTGTAATTTCTTTA 1101 CATCCATAATATTTATATTAGGRACTCTCTCCTTACTG~TATATTCTATTCG~TMT-TTTTTATGTTGRG-TMTMTMTAAGTCCACCTTCTTA 1201 CTGTMTATGRGTTGACACAAAGCARGATATCCATATTCCCTTTRGCGGCTCTC~TGTATT~~TTMTAT~TAT~T~T~TCATATATA 1301 ACGTTTAGATGGTCTTAATTTTAT~GRGTTCTTTTTTT~TTTTTTTT-TTGTTAT-TTT~TMCTTTT~C 1401 CRACTTTAATATTTTAGGCCCTACATCATATAATCTATTA~TT-CTAAATCTTTATTMTTTATAT~~TTGATCTTTC 1501 GATTAGTGA~CCAACTGAGTTATARGCARTTTTAATATATT~~TTCTTGGCGAGTTAT~~TATTTAATTGATTM-TTATATGTTRGTTACA t nul-> ms-> I 1601 TGATTCTTGTTGGTTTGGTTAATAATTAT~TCCTTAT-T~TATTT~GTT~TATTAAATTTATGTATTTT~TATCRGT~CTTTTTA 1701 TACTTTATTGGRACGTAAAGTACTAAGCAGCATACAAATTC~GACCTAA~TT~TATAT~TATTATT~C~TT~~T~CTTA 1801 AAATTATTTTTARAAGRGTTTTTTATTC~TTAATffiTAATTCCTTTATGTTTATAATTTT~CTCTATT~CCTM~-TTGATATTAT~ 1001 CTGTTTTTCCTAGTATATGAATATT~TTTTCAT~CTAT~A~AATATTATTTGTCG~TT~C~TTTGTTTATGTMTTATTTTTGCCGG 2001 TTGATCATCTAACTCTAAATATTCTTTTTTATTCTTTTTTAGGGGG~T~~CTGCT-~ATTTCTTAT~TT~~TATTATTATTATTATTTTTTGCT 2101 GTTCTTATATATCGGACATATTCTTGATAT~~T~TCTC~TTffiTATTATTATTTTTATTATTATATTTATTTffiTTT~TCATGTC 2201 
TAGCCGAGA-TCGTGCATTTGACTTTG~~TCCGAGTTRGTAT~ffiGTTTMTATT~TATTATGGTffiTATATTC~TTGCT 2301 CTTTTTAGCTGRGTATAGTTCGATTCTATTTATATGTATAATATCAACCGTTT~TTTTTGTAT~GGATATAATCTTTATTAT~TT~AATTTTG 2401 CTAATTGCGATAGCATTTTTATTTGCTCGCGGGGTTTTATCCTC~TC~TAT~TTT~TTAT-CTTATGCT~TTTTTAC~TTT~GC m1-> 1 I m4L-> 2501 TATGCTGCATTTGCTAT-TACTTCTTTGAATTGTGT~CAT~~CTATTTAT~~TTATTATTATTATGTATffiTMTGTTTGT~CGT 2601 TTTTTACTCTTT-TTAAT~T~TTCT~TTTMTATTAATT~T~~CGRGTGT~CTGTTTCCCTTMTTATATG~ 2701 TGGATCCAGCATAGTATTTATTATTGTGTTTTGCTGCT~CRGCATT~CT~TTTACTTGTTTGCTTTATT~GTTMTTCTAGG W4L-> It cy--> 2801 TGTGAAATATTAGCAATPAATAAARTTTTGTTTGCARAAA 2901 AATATAGGATCTATTCTTGGTATMTTTTA~TTT-TTGCTMCTG~TTCTACTGT~T-TTATTCGTCT~TT~TA~TTT~T 3001 CTATTATTCATATTATTCGTGATGT~CT~GATGATTTCTTCGTTTACTTCAT~MT~GCCT~CTTTTTTTCTTGTTTATGTAT~CCATAT 3101 CGGACGTGGACTATACTATCGTATATTATCCATCCACGTGTAT~TGGT~GTGT~TTTTTTTffiTT-T~AC~TTTTT~G 3201 TATGTACTTCCATGGGGTCCATTTT~TGCT~TGTAATTA-~TATTAT~CGT~C~ATTTT~CC~T~TT~TGRG 3301 TTTGGGGAGGGTTCTCTGTTGGGCAT~TACCCT~TCGTTTTTT~~T-TTTTTTATT~CCTTT~TATT~GTT~~A~TCATAT 3401 TATTTTTCTTCATGATAAGGGGTCATCTAACCCCATTAGGTAATTTATTTCACTTAA~CATTT~CCATATTT~~TT~T~GTT 3501 GGATTTTTAATAGTATTTGGTGTTCTGTTAATMTCACATTTTT~~CCTTCTTTACT~T~TCCC~TTA~TT~T~~CCCTAT~T~ 3601 CTCCTACTCACATTCMCCGGAATGATATTTTTTTGTTCGCTTAT~TATTCTTCGAT~ATTCCCT~TT~GffiffiT~T~TTATTMTATC 3701 TATTTTAATTCTATATTTTTTACCTTTATCTAGATATGG~TATTCCTGT~TAT~TATTATTTATCAffiTTTTATTTTGRATTTTRGT~TT 3801 ACTTTTATTATTCTGACATGACTAGGTGCTTGCGAGATT~CATATCTCTCATT~GGGTC~CTTACATTATTATATTTTTTMTGTTTTTAT

3901 TATTAGGGATATCGAATAATTTAAATTTTAATCTTATT-TT-=A~TTT~TTAATTTATATT-TAATTT~TT-T-T-TAA~~cy"> I tRwI "> tRwI *s-> 4001 M~TGTTTARACTTATMTTATATTTTMTATATTATTTAT~~MTTAATAGC~TATMT~

~ m me-> t mII+ 4101 TAAAGCGTGGCTTT~GGGCTTRGRAAGAAAATTTTTCTTTMTT~TC~GTA~T~GA-TTMTTTMT~TCCTGCGTCTCCGRTC 4201 CAAATAGAAATARTATTATTTCATGAC~TGCAAT~~TTCTTATC~TTTTT~CTGGT-TT~T~TT-TTATGTTTTMT~T 4301 TATCAACTCGAACTATACATGAAGCTCAGTTATTA~~CTGT~~TTTTAC~TTTCTTCTGGTGT~CT~CTCTTC~GACTGCffiTT 4401 ATTATATTTACTCGATGAGRGCAGGGTAGAGAGGGTATTATTATTTT-~TATTGGC~T~T~TACT~TAT~TGC~T~T-CATTT~ 4501 MCTTCGATTCTTATATAATTCCGGRRGAGGATTTT~C~GTGATTATCGACTACTT~TT~TMCC~CTATGGTACCTTATffiTTT~TA 4601 TTMTGTTATTACTACRAGGTGA~TAATTCAT~T~CTTTAC~TAT~GffiTT-T~T~GTTC~GATTAAAT~TATAGG 4701 TTTTW\TGCAAACTTGCCAGGAATTTTATTACGGA~T~TCT~TTTGTGGGGCTAACCATTCTTTTAT~~ATT~~TT~CCATT~TGTT mI1-> 1 tRwI Tyr-> 4801 AAAGATTTTATTARAATATGTAA~~TTAAACTTTTATT-CTT~TTTTGGCCCT~T~GATGGT~TT~-T~AGTT~TTTA 4901 TTTCC~TAT~TATATGTAAGTTAARCAAATGTC~GA~TRGTCATAATTA~TTATA~TTGT~TTTATAT~TATAT~TTGTA~Trp-> 1 -m my-> - m mi.-> 5001 GTTACCT~TARAAAGGCTCTA~GT~T~TATTMGTT-TT-CTAT~T-CCCTTMTATT~TT~TGAG

5101 ~TTCG~TTTCTATAARAATATTTTT~~TTTAA~~T~TTAA~GTGCTTT-C~~~<-moln <-" wml [ <-).TPaMB 5201 CA~-TCTATGCACTTAATTCTGCCRCATAGTARATGCG 5301 AAGCRAGTAAGGGA~CTATTGGTGTGRGTATAT~TGATTAATAACTMTA~TAAGTCCATAATAT~TGTAACG~TATMTT~CCCATT <-ATPase0 I <-mlun <-ATRaMI 5401 TATTGGTCTTAATTGRGGUTGACTAAGTAACCTAATGGTTTCTTTTTMTT~TT~TATT~TCT~TT~TGTTCATTTATG 1501 TATAAACTTAAAAGT-T~TATAGAAGTAARAATATAGGCTTGARTGA 5601 AAGTATTTCTTAGGTTCGAGCTTAATA-TGCCATTAATG~-TMTAT~CC~CG~TATATT~~~TC~GTT-T~~ 5701 TCGRATTATAATACTMTAGATTCAATT-TT~GGTAGTAAT~~T~C~T~~TG~CTMTCTCTTTTT~~T 5601 TTMTATAGCCTGT~CCT~TAGTAAT~CCAGGGTTATATTTA~CAT~CCTAGTCGTMTTCCGTAGGT~T~TATAC 1901 CT-GGTTATTAAAAATMTTAATACTATTATMTTGTTAAT~CTTTTRGTT~TTATATGTC~TTGTTT~TT~T~~AAATATTT 6001 AATMTATTATTTTGATCTATTeTTCACGTCTTATTTCTTATTT-TTAT~TGAT~TATCGGTGTTMT~~CCAT~TTATTCCATCT <-ATPase6 1 <" kg 6101 AAGGATGAAAATAAGTCTACTAT-~AATATTGGAAAA <-fRxI Qlu , t <-a- 6201 CTACAGAGTTATATCTAACTCATTTTTTATTAT~T~GTATTT~TTTTATACTAT~TAACTTA~A~~TffiTGTAAGTCCATATATTATTAffi 6301 GGCRGCTTCCACTACCCCTACTTTGTTACGRCTTATCTCT 6401 ATCCTTATATATTACTTTTAAGTCCATCCAT~TTAT-TTTTTCATTTCGT~TCC~T-TTAATTAATGTMCTCAC~TMTTCTT~~T~C 6501 TGCACCTTGATTTGTCTTAATATTT~TTMTATTCATGAACCCCTTTGT~TTCAT~~TATA-TTTTTT~TAAGTCCATATM 6601 TAATGCTTGGTGGTTATCTATATGGTARGTTCCCCPAT 6701 AAATTACTTAGTTGGGGTCGTCTCTAATCCTAGTTTTAT~AT~TTTGTTAT~CAT~TTTT~TTAATATTCATMTMTA 6801 TTTCACCATTTATTATATTACTTT~TTATMGTMTTTAATAAACTTGTAAGCTAA~TATTA~~~TTT~CCGC~~~TT~~CCTMT~C 6901 CTAAATTAATAAACTGTCTAAATTATAATTT~TATATMTTT-TCTT~~G~T~ACACTT~TCCATATTT~C~TATTTATTG

FIGURE 2.-The complete nucleotide sequence of the circular A. coerulea mtDNA. Only one strand is shown, which corresponds to the sense strand for the majority of genes. Abbreviations used as in Figure 1; numbering starts at the first nucleotide after the ND6 termination codon. The putative first and last nucleotide for all genes are indicated; putative initiation and termination codons of translation, as well as the anticodon sequences of the tRNA genes, are in bold face characters. Arrows indicate the direction of transcription. The translated amino acid sequences of protein genes are available from the authors upon request.

pods (CLARY and WOLSTENHOLME 1985a; CROZIER and CROZIER 1993; PEREZ et al. 1994) (sequence of Artemia mtDNA submitted to EMBL with accession number X69067) and the vertebrates (see WOLSTENHOLME 1992). In all the above, these two protein genes are either directly adjacent (Katharina) or are separated by different tRNA genes. Interestingly, in sea urchins ND2 and COI are separated by the l-rRNA gene (JACOBS et al. 1988; CANTATORE et al. 1989; DE GIORGI et al. 1991; SMITH et al. 1993).
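The coordinates given in MATERIALS AND METHODS for the three HindIII clones can be checked for consistency with a circular genome of 14,130 bp. The following is a minimal sketch, not part of the original analysis; the function names `segment_length` and `tiles_circle` are hypothetical, and coordinates are assumed to be 1-based and inclusive, with clone 13F6 wrapping past the numbering origin.

```python
# Sketch: verify that the three HindIII clones tile the circular genome.
# Assumptions (not from the paper): 1-based inclusive coordinates; helper
# names are illustrative only.

GENOME_LENGTH = 14_130  # bp, as reported for A. coerulea mtDNA

# Clone name -> (first base, last base), taken from the text.
clones = {
    "13A3": (1561, 8123),
    "13A12": (8124, 13278),
    "13F6": (13279, 1560),  # wraps around the numbering origin
}

def segment_length(start, end, genome_length=GENOME_LENGTH):
    """Length of an inclusive segment on a circular molecule."""
    if end >= start:
        return end - start + 1
    # Segment crosses the origin: tail of the circle plus the head.
    return (genome_length - start + 1) + end

def tiles_circle(segments, genome_length=GENOME_LENGTH):
    """True if consecutive segments abut exactly and sum to the genome length."""
    ordered = sorted(segments, key=lambda s: s[0])
    # Compare each segment's end with the start of the next one, cyclically.
    for (_, end), (next_start, _) in zip(ordered, ordered[1:] + ordered[:1]):
        if next_start != end % genome_length + 1:
            return False
    total = sum(segment_length(s, e, genome_length) for s, e in ordered)
    return total == genome_length

lengths = {name: segment_length(*span) for name, span in clones.items()}
covers_genome = tiles_circle(clones.values())
print(lengths, covers_genome)
```

Under these assumptions the three segments abut without gaps or overlaps, which agrees with the statement that the entire genome is represented in the three clones.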

B7001 CTTACGATATCCAT-TT-GTG~~TTAGTGAGGAA<" 1 <-tam mt [ <-m3 7101 CAGTCRAGGA~PCCATAATATATTTCATATMT-TAAGATCARTT~CACTATTATAAAT-TMT~TMTMTT~~TGTTATAAATC 7201 TTACTATRW\TAATACAGGTC~T~TTCCRC 7301 ATTTCTTAAAGGGTCAAAGCCACATTCAAATGGGGTTTTTAATTCGTAGGATG~TT~TT~TTMT~T~TAATTATT~T <-ND3 I <-tam .rr (Val - 7401 ARGCATAAT~~TAATGATGTCGGTAGGACGTAT-~AATGGTTT~TTAT~TTTGGAT~CTATT-TT~T~T~AT m aer (AM)-> [ N"> 7501 TCAT~TTATAARTT~T~AA~TTAATTT-TATATTATT~~GTTTCTT~~TATTTMTTTTATT~TGC~TA~TGT 7601 TTATTTCCTTTTTGGGARGTT~CATTGACATTTATCATTCTAAGATCARTATCATTAACATATTTAAATATTMTTTTA~TT~TT~ 7701 AGTTGATTTGATTAACTCCTGGGACAGCATTAATTTTTCTTA~CTTATAGTTRCCTT~AGTATTAATT-CRTATAATATT~TTATAA 7801 GTACATTGGGTGTTTARGRRGTTTAAATTTGGTATTGATAAT~~~~TTTTGTGTCTGTGATTTCTT~TTTTATGTAATATTT~TATCTTTGATC 7901 CCTACTTTATTATTAATTTTACTATGAGGCTATTAAGATCARC~~GGAT~CAGGTTTTTATTTGATA~ATATACCGT~~T~TTRCCTTTAT 8001 TGTTATTATTATTATATTTGTACTATACGGTAGGATCTCT 8101 GCTAATAATAGCATTTTTAGTARAGCTTCCAAT~ATACCTGTCATTTATGACTACC~CTCACGTC~CCATT~TTCCATAGTATTG 8201 GCCGGTGTGTTACTTAAACTGGGGGGATACGGACTGTATATATTMTTAATTTTATTATT~TAGT~TTGGTAATT~TAATCATT~C 8301 TTTCATTGTGRGGTGCTGTTATTGCTTCGATTATTTGTATTCARCAAGTTGATATT~TCTGGT-TATTCTTCAGTTGCT-T-TTAGT 8401 ATCAGCTGGGATTTTAATGATAT-TTGAA~TAT~TGTG~TAATAAGATCARTAATTGCTCRTGGCTAT~TCRTCTGCTTTATTCGTATT~T 8501 AACTTATCATATTTAARAATTARGRGCCGAAGTTTARTATTCAT~~TATT~TATCTTTC~CAATAGCTTTTTATTGATTTCTATTTAGTT BIOI GTATAAATATAGCRGCCCCACCAACACTTARCTTTATTGGG 8701 CATTATTATATTTATTAW\GCGGGGTATRGATTATATATATATATAACCGTTAATCATG~TT~GTCTTTATATCACCCC~TT~TTAAA ND4-> I 7 BBOl AATGTAGACTATCATGTTTTAACCGCACATTTGTTAC~TTTATTTTATTAATTCC~TTATTTTCTGT~~T~C~ACTRCT-TA <-mThr [ <-COIII as01 CATCATTATCGTAGATTTAU.AARTCTATGCTTTAA~TT-TA~TAA~AATTTAT-TGATCCTCATCART~TCCTAATATAT~T so01 CATACTACATCTACGRARTGTTAAGATCARTATCATGCTGCTGCT~CC~~ACATGATG~TAGTTCT-TGATAATAATATGTACGT~AAATTCRCGA 9101 AAAGAAATAGGGTACCGGTAATCCAT~CGGTTG~AT~TGT~TTCCAT~~CRT~~TT~T~TGTTTC 9201 ATTATRCTCACCATATTWATT~T~-T~T~TAAACCC~TTGRGTAATATTTTCCTTCCGTCAATG~ 9301 
TGATGRGCTCARGTAATACTTACACerGATAATAAT-~TATTTAAT~T-T~TT~T-TMTTCCAATT~TC 9401 ATACAGAGCCGATTTCARTTGAAGGTGCTAACCTTCTATGXWLATATGCC~T~CT~TMTAAAT-CTAT 9501 ACCTAGTTTTARGCCTTTCACCACGTAAGTTGTAT~T~CTTGATAGGT~TCTC-CARTGTCRCGTCRTCATATAT~TAAT~TT 9601 AACAACATACCGTATAAAAGTARGTAAATAGAGCCAAARCCC <-COIII I e701 ARGC~ACATACTTCGACT-TRATGRRRAOGTTTTTTTGCATG

~ m Il"> [ m2-> SBOI TACGGGTATCRTT~GTTGGGAGTT-TACCGTTGCTT~CCTT-TCTGTTCTTCT~CRTAATTATTTT~TCTAAGATCARTT 9901 TTARGTATAACATCARGTAATTGAATTATTATTTGAATTG 10001 CAGGTGAAGGAATTATARTRTATA~TTTTAATTCARTCAGTTT~CAGTGATA~ATTAAAT~TTATATATTTTTGT~TCRT~GTCATCATA 10101 TATTTATTTATTTATTTTTATCRCTATACTAATACTGRRAT 10201 AATATTGGTATTGTTGGGTTACTTCTRARAATTGTCCCRR 10301 TARGAGTARCATTAAGATCARTATTATTTGGRGCATTAATTGGTATAAATTTAA~TAC-T~TT~~T~erCARCTATT~~TAATffiTTffiTT 10401 AGGAATAAGATGTATCRGAGGTAGATTATTTARATATTTTATT~TATGGGTTTTCTTTAGTMTCTT~CGTGTTTTTAT~~-~~TA 10501 AGCATTAGGTTGAGCTTATT-TTAAGGGGTCTTCCGCCATTTATGCTATTTATTG~TTMTGTGTT~TATAATAAT~~AATTTAT 10601 GGTTTATTGTATTAGTATTTGCGATTTTARGTGCTGTTATTAGTCTTGTATACTATCT-TTTAGTGTAATATTTTTTATAAATAT~TAA~A mz-> 1 tam Lym-> 10701 TTTGARGCATTAT-TAGCTATATTTTTACTTGTTAAT [ COT-> 10001 AATT~AATTTUV\TTRCGAATTAATAAGATCARTT~~GA~GATTTTACTCAAC-T~T~TATT~T~TTGTATATGTTATTT~~~TATCTGAT 10901 GTGGAATAGTAGGAACAGGTATCTCTATTAATTCGATT~CT~TACTTCTG~TT~T~~CATTTTTATMTGTMTTGTT~~ 11001 ACATGCATTTGTARTAATTTTTTTTATAGTAATACCTATTATAATT~TTTTGG-TTGAATAGTA~TTT~GATT~~C~GATAT- 11101 TTTCCTCGTATGTTTTTGATTGerRCCAC~CTTTTATTTTATTAATTTGTTCT-TGGT~T~~~TGGA 11201 CTGTATATCCGCCTCTAAGATCARGTTTAGCTCAT~~TGCTTCTGTTGACTTAGCTATTTTTT~CTTCATTT~TATCATTAAGATCART~T~ 11301 TGCAATTAATTTTATTACAACRATTTTTAATRTRCGTAGACCCGGTATAACCATGGAACGTGT~~TTATTTGTGTffiTCGATTCTAGTTACTGTTTTT 11401 TTATTATTACTATCATTACCTTCTTGCTGGGGCAATT~ATACTATTAACTGACC-CTTTAATACT~TTTTTTGATC~~~~T~G 11501 ATCCCATTCTATATTAAGATCARCAACRCTTATTTTGATTTTTTGGTC 11601 TAGTGCTATAAARCAACCTTTTGGGACATTAGGTATAATTTAT~CATAATTTCTATTGGTATTTT~TTTATTGTTT~~CATATATTTA~ 11701 GTTGGGAT~TGTAGATACRCGCGCTTATTTTACAGCCG~ACTATAATTATTGTAAGATCARTTCCAACTffiGATT-TTTTT~TGACT~TAAerATTT 11801 ATGGCTCARARGTTCAGTATACTGCARGCATATACTGAGTTC 11901 TTCATTAOATATTATACTTCATGATACTTACTATGTTGTGGCACATTTTCAerATGTTTTATCTATAG~TCTTT~ATCTTT~~TTT~T 12001 TTTTGATTTCCAGTTATAACCGGATTCTTCACGAGCTTTTTCCCCC 12101 AACATTTTTTAGGGTTAGCTGGAATRCCTCGACGGTATTCC 12201 TGTGTTTGCTGTATTATTGTTTGTATTAATTGTCTGAGAG COI-> I tam Val-> 
12301 TCTGATGGATATTTTCCCCCTTTCAT~TTTATCAAT~TACATTACTAT~~TTATTATAATC-TAATACT~T~~T I"> I"> 12401 GAGTTAGTTCCTARTATTT~TCT-TATT~TA~~~T~T-~T~T~TACCTTTT-TAAT~~~TGATRC 12501 TAAATTTATTCTRRGATTTTATATTTCCCGAAAATAAAAOCTC 12601 AATGAAAAGTTRGTAGTGARACRCTTTCACTTTTATTGATTTT 12701 AAGCGAAGAATTTTATATAACAGTARRAATTTCTAATTAATAATATTARCTGGTAGTTTTATT~TAT~TATATTTATATATTAAATTCCTT-TT 12801 CTATTTATATATTATAAGATCARCCWCAGTATACTAATAATGTATCAATTAGTAAAT~TAT~TAAATATTAGT~TC~-TMTT~TT~GA 12901 CTGTTTATCAARRACAT~CAAARGTAGATATTTTTGGTGTTAT~GCCCRGT~TTTT~~CC~TACCTTGACTGTG~TAGCRT 13001 AATAATTTGRCTTTTAAATGCer~T~CGTAT~CTTGTCTCATT~TATTACTTTAAATTTAATffiTT~T~TACTC 13101 ATAATTTTAATAAT-GCCTT~TTTTTTTTA 13201 TATATTAATGAAAGTAATACGATTAAAT~TTATAGA 13301 CTAGGTACTATTARGGCCGTTTTAATATAATATGTTCTGTTCGAACTTTGTTACCT~TGATCTGRGTT~CGGCGT~~TCTTT 1"> 1"> 1 tam L.u (aN)-> 13401 CTATCTTTTATCTGGC~~ATAATAGTACGAARGGRCCTTATAGGCRCTTATTCAT~TATA~TTATTTTGGCRGAGTAT~TAT~C~~TCATA m P"> 13501 ATATAGTAGTTACCTACTAAATAAT~TAAGGGCRGTTRCTGGT~GTT-T~TATT~T~TRCTGTCTACCTTTA~TTTATRCTTTAA m [ -I-> 13601 TTAAGAAGRTGTGGTT~cATTARGT~TATATTTTTcc 13701 GGTATTTATGATGCTGAARGGRRTCAATCCRATWU;CCTTTTATTAGererT~TACTTT~GTTATGT~TGTTCTATGATT~T~TTTAT~G 13801 TCTTGATATGC~ATATTTTATTTATTGTCTATATTGGTG 13901 ATATATATAAATCATTATTGTATGCGTGAWiAGCRGTAAT 14001 TACTAGAGTAAATATTCCTTTAATTTTTCTCTC NDI->1 14101 ATTCTCATAGTTGAARGTAGCCARGTTTTAA FIGURE2. - Continued

As can be seen in Figures 1 and 2, all Albinaria genes are either directly adjacent or have very few nucleotides separating them. The total number of nucleotides between genes is 141, the longest intergenic sequence of 42 nucleotides separating two genes transcribed in opposite directions (COIII and tRNAIle). Five genes end on abbreviated stop codons. Furthermore, there are many cases of overlapping genes coded by the same strand. Three types of such overlap may be identified, involving protein-protein, tRNA-protein, or tRNA-tRNA junctions. Thus, there is one case of overlapping protein genes (ND5-ND1, 7-nt overlap), one case of overlap between a tRNA and a protein gene (tRNALys-COI, 7-nt overlap) and six cases of overlapping tRNA genes (Asp-Cys, 5 nt; Gly-His, 4 nt; Leu(CUN)-Pro, 4 nt; Leu(UUR)-Gln, 2 nt; Pro-Ala, 2 nt; Tyr-Trp, 5 nt). Finally, an extensive overlap of 12 nt is observed between two tRNA genes transcribed in opposite directions [tRNASer(UCN) and tRNASer(AGN)].

Another unusual feature in Albinaria mtDNA is the presence of four consecutive protein-protein gene junctions (ND6-ND5, ND5-ND1, ND1-ND4L and ND4L-Cytb) that are not separated by an intervening tRNA. It has been suggested that the secondary structure of a tRNA gene between pairs of protein genes is needed to act as a signal for the precise cleavage of the polycistronic transcript (OJALA et al. 1980, 1981). In accordance with this hypothesis, almost all reported cases of protein-protein gene borders with no intervening tRNA have the potential to form hairpin structures (BIBB et al. 1981; CLARY and WOLSTENHOLME 1985a; OKIMOTO et al. 1992; BOORE and BROWN 1994b). In Albinaria it is also possible to draw stem and loop structures near or at the protein-protein gene boundaries (Figure 3). In the case of ND6-ND5 and ND1-ND4L borders, the stem and loop structures are positioned more or less similarly with those reported for the nematodes and Katharina with respect to gene termini; the complete termination codons are positioned within the loops. In the case of ND5-ND1 and ND4L-Cytb borders, however, a tRNA-like structure precedes a stable stem and loop structure that is located near the 3' end of ND5 and ND4L genes, respectively (in ND4L, there are two hairpins separated by 29 nt). It should be noted that ND5 and ND1 are overlapping genes and that ND4L ends with an incomplete termination codon. These tRNA-like structures resemble normal tRNA with AAG [alternative Leu(CUN)] and TTT (Lys) anticodons respectively. The existence of superfluous tRNA-like genes has been also reported for two other molluscs, Mytilus and Katharina. What is unusual with Albinaria, however, is that, in contrast to other molluscs, the tRNA-like structures are located totally within protein coding genes. Interestingly, an extension to the 5' end of the ND5 gene in sea urchins can be folded into a secondary structure resembling a tRNA gene and is thought to be a remnant of a tRNALeu(CUN) gene (CANTATORE et al. 1987, 1989; DE GIORGI et al. 1991). The significance of the Albinaria tRNA-like structures is not clear at present. However, we have found that they do not hybridize with low molecular weight RNA in Northern blots, while the two corresponding standard tRNA do (data not shown).

Control regions: An important question emerging from the gene organization of Albinaria mtDNA concerns the location of regions containing the signals for replication and transcription. In all metazoans, where such sequences have been identified, they are found in noncoding regions (WOLSTENHOLME 1992). What is unique in Albinaria mtDNA, however, is that it contains virtually no unassigned sequences of significant length (the largest unassigned sequence is 42 nt and the second and third largest are 21 and 16 nt, respectively). Interestingly, the decanucleotide AATATATATT, located between tRNALeu(UUR) and ATPase8 within the Albinaria unassigned sequence, is reminiscent of sequences present in sea urchins (TTATATATAA) and chicken and duck (ATATATAT). In Albinaria, this palindrome is even longer (tetradecanucleotide, TAAATATATATTTA), if we allow for a 2-nt overlap with the tRNALeu(UUR) gene. Such palindromes have been implicated to function as bidirectional promoters or as recognition signals for enzymes involved in transcription or processing (JACOBS et al. 1988, 1989; CANTATORE et al. 1989; L'ABBE et al. 1991; RAMIREZ et al. 1993).

The longest Albinaria noncoding sequence (42 nt) contains yet another decanucleotide perfect palindrome (ATAAATTTAT), which is twice as long (TGCTGATAAATTTATAAGCA) if we overlook a 5-nt overlap with the adjacent tRNAIle gene.

Synthesis of the second (L) strand in a variety of metazoans (including several mammals, Xenopus, Drosophila and the nematodes) is supposedly initiated within a run of Ts situated in the loops of potential hairpins that can be formed from intergenic sequence (CHANG et al. 1985; ROE et al. 1985; CLARY and WOLSTENHOLME 1987; OKIMOTO et al. 1992). No such structures can be found in the short Albinaria noncoding sequences. However, a sequence with the potential of forming a stable stem and loop structure is found within the ND5 gene (Figure 3); this secondary structure has a T-rich loop and moreover its 5' end is identical to that of the OL loop in Drosophila (boxed sequence ATATAA in Figure 3) (CLARY and WOLSTENHOLME 1987). The significance of this structure is presently unknown.

Protein genes: Identification of protein genes was accomplished by comparison at the nucleotide/protein level (see Table 1 for genetic code) with already known mtDNA genes of a close relative to A. coerulea (A. turrita) (LECANIDOU et al. 1994) as well as with sequences of genes available in the EMBL data bank. Identification of ND4L was based on hydropathy profiles, predicted using the computer program SOAP of the PCGENE package (BAIROCH 1988) (data not shown), that were found to be very similar to corresponding profiles of Katharina, Mytilus and Drosophila.

Nine of the 13 protein genes of Albinaria mtDNA start with the orthodox translation initiation codon ATG, two with ATA (Cytb, ND3), one with ATT (ND5) and one with TTG (COI). ATT is not used as initiation codon in Katharina and Mytilus but is quite frequent in Drosophila (CLARY and WOLSTENHOLME 1985a) and the nematodes (OKIMOTO et al. 1990, 1992). TTG is not used for initiation in any of the known coelomates but is common in pseudocoelomate nematodes (OKIMOTO et al. 1990, 1992). Eight Albinaria protein genes terminate with the stop codons TAA or TAG (Figure 2). The other five are inferred to have incomplete translation

FIGURE 3.-Potential secondary structures near or at the junctions of ND6-ND5, ND5-ND1, ND1-ND4L, and ND4L-Cytb genes. The start and stop codons are indicated in each case. Within the ND5 and ND4L genes, tRNA-like structures that are followed by a stem and loop structure may be drawn. Moreover, ~300 nt before the end of ND5 another stem and loop structure may be formed, immediately after the boxed sequence ATATAA; a similar structure has been implicated as the origin of light strand replication in Drosophila (see text).
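The palindromic motifs discussed under Control regions can be checked mechanically: a DNA sequence is a perfect palindrome in this sense when it equals its own reverse complement. The Python sketch below is an illustration only and is not part of the original analysis. The function names and labels are hypothetical, and the sequences TAAATATATATTTA and ATAAATTTAT are assumed readings of the motifs quoted in the text.

```python
# Watson-Crick complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_perfect_palindrome(seq: str) -> bool:
    """True if the sequence reads identically 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


# Motifs quoted in the text; the labels are for illustration only.
MOTIFS = {
    "Albinaria decanucleotide": "AATATATATT",
    "Albinaria tetradecanucleotide (assumed reading)": "TAAATATATATTTA",
    "Albinaria 42-nt region decanucleotide (assumed reading)": "ATAAATTTAT",
    "sea urchin motif": "TTATATATAA",
    "chicken and duck motif": "ATATATAT",
}

if __name__ == "__main__":
    for label, seq in MOTIFS.items():
        print(f"{label}: {seq} palindrome={is_perfect_palindrome(seq)}")
```

A self-complementary sequence of this kind can in principle pair with itself, which is why such motifs are natural candidates for hairpin formation or for recognition by proteins that bind both strands symmetrically.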

termination codons (either T for ATPase6, ND3 and ND4L or TA for COI and Cytb).

In general, proteins coded by Albinaria mtDNA are quite short (Table 2). Five of them (COI, COII, COIII, Cytb, ND5) are the shortest among known metazoans (compare with Table II of WOLSTENHOLME 1992). Five others (ATPase6, ND1, ND2, ND4, ND6) are the shortest among coelomate metazoans, since these are shorter only in the nematodes (the corresponding lengths in C. elegans are 199, 291, 282, 409, and 145 amino acids). The only Albinaria proteins that are not the shortest among known molluscs are ATPase8 and ND4L; both are among the least conserved mitochondrial proteins and exhibit a high degree of length variation (WOLSTENHOLME 1992). It is worth noting that ATPase8 differs in length by >5% between two Albinaria species (A. coerulea, 55 amino acids; A. turrita, 52 amino acids).

Among compared molluscs of Table 2, there is extensive variation not only in the size of their mitochondrial

TABLE 1
Codon usage of A. coerulea mtDNA encoded proteins

Amino acid  Codon    N     %    Amino acid  Codon    N     %    Amino acid  Codon    N     %    Amino acid  Codon    N     %
Phe         TTT    240   6.7    Ser         TCT     77   2.1    Tyr         TAT    148   4.1    Cys         TGT     34   0.9
            TTC     25   0.7                TCC     15   0.4                TAC     38   1.1                TGC     10   0.3
Leu         TTA    340   9.4                TCA     68   1.9    Term        TAA      6   0.2    Trp         TGA     68   1.9
            TTG     63   1.8                TCG     14   0.4                TAG      2   0.1                TGG     20   0.6
Leu         CTT     83   2.3    Pro         CCT     49   1.4    His         CAT     61   1.7    Arg         CGT     17   0.5
            CTC     14   0.4                CCC     22   0.6                CAC     13   0.4                CGC      4   0.1
            CTA     79   2.2                CCA     50   1.4    Gln         CAA     41   1.1                CGA     22   0.6
            CTG     24   0.7                CCG     11   0.3                CAG     10   0.3                CGG      6   0.2
Ile         ATT    266   7.4    Thr         ACT     77   2.1    Asn         AAT    106   2.9    Ser         AGT     49   1.4
            ATC     36   1.0                ACC     23   0.6                AAC     32   0.9                AGC     28   0.8
Met         ATA    213   5.9                ACA     73   2.0    Lys         AAA     73   2.0                AGA     64   1.8
            ATG     50   1.4                ACG     12   0.3                AAG     13   0.4                AGG     30   0.8
Val         GTT     90   2.5    Ala         GCT     91   2.5    Asp         GAT     47   1.3    Gly         GGT     73   2.0
            GTC     20   0.6                GCC     24   0.7                GAC     15   0.4                GGC     24   0.7
            GTA     97   2.7                GCA     66   1.8    Glu         GAA     48   1.3                GGA     79   2.2
            GTG     22   0.6                GCG     12   0.3                GAG     25   0.7                GGG     43   1.2

N, total number of particular codon in all proteins. The total number of codons is 3595; the incomplete termination codons were excluded.
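The codon-bias figures quoted in the text can be recomputed from the N column of Table 1. The Python sketch below is illustrative and not part of the original analysis. The counts are transcribed from Table 1, the variable and function names are hypothetical, and counting AGN (Ser) among the fourfold families is an assumption. The results should come out close to the ~81%, 77.5% and 38.4% quoted in the text, with small differences attributable to rounding.

```python
# Codon counts (column N) transcribed from Table 1.
CODON_COUNTS = {
    "TTT": 240, "TTC": 25, "TTA": 340, "TTG": 63,
    "CTT": 83, "CTC": 14, "CTA": 79, "CTG": 24,
    "ATT": 266, "ATC": 36, "ATA": 213, "ATG": 50,
    "GTT": 90, "GTC": 20, "GTA": 97, "GTG": 22,
    "TCT": 77, "TCC": 15, "TCA": 68, "TCG": 14,
    "CCT": 49, "CCC": 22, "CCA": 50, "CCG": 11,
    "ACT": 77, "ACC": 23, "ACA": 73, "ACG": 12,
    "GCT": 91, "GCC": 24, "GCA": 66, "GCG": 12,
    "TAT": 148, "TAC": 38, "TAA": 6, "TAG": 2,
    "CAT": 61, "CAC": 13, "CAA": 41, "CAG": 10,
    "AAT": 106, "AAC": 32, "AAA": 73, "AAG": 13,
    "GAT": 47, "GAC": 15, "GAA": 48, "GAG": 25,
    "TGT": 34, "TGC": 10, "TGA": 68, "TGG": 20,
    "CGT": 17, "CGC": 4, "CGA": 22, "CGG": 6,
    "AGT": 49, "AGC": 28, "AGA": 64, "AGG": 30,
    "GGT": 73, "GGC": 24, "GGA": 79, "GGG": 43,
}

# Complete stop codons in this genetic code.
STOP_CODONS = {"TAA", "TAG"}

# Two-base prefixes assumed to define fourfold synonymous families
# (AGN is included because AGA and AGG code for serine here).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG", "AG"}


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


total = sum(CODON_COUNTS.values())

# Codons whose third position is A or T.
at_ending = sum(n for c, n in CODON_COUNTS.items() if c[2] in "AT")

# Same measure restricted to the assumed fourfold families.
fourfold_total = sum(n for c, n in CODON_COUNTS.items() if c[:2] in FOURFOLD_PREFIXES)
fourfold_at = sum(
    n for c, n in CODON_COUNTS.items()
    if c[:2] in FOURFOLD_PREFIXES and c[2] in "AT"
)

# Amino acid codons built exclusively from A and/or T (stop codons excluded).
at_only = sum(
    n for c, n in CODON_COUNTS.items()
    if set(c) <= {"A", "T"} and c not in STOP_CODONS
)

if __name__ == "__main__":
    print(f"total codons: {total}")
    print(f"ending in A or T: {percent(at_ending, total):.1f}%")
    print(f"fourfold families ending in A or T: {percent(fourfold_at, fourfold_total):.1f}%")
    print(f"A/T-only amino acid codons: {percent(at_only, total):.1f}%")
```

The total is taken over all 64 entries of Table 1, including the eight complete stop codons. Dividing by amino acid codons only would shift the percentages slightly.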

TABLE 2 Comparison of the A. coerulea mitochondrial protein coding genes with those of A. turrita, C. nemoralis, K. tunicata, M. edulis and D. yakuba

Columns: protein; protein length (amino acids) in A.c., A.t., C.n., K.t., M.e., D.y.; percentage of identity of A. coerulea vs. A.t., vs. C.n., vs. K.t., vs. M.e., vs. D.y.

ATPase6 214 - -224 238 230 34.0- 30.2 - 34.0
ATPase8 55 52 54 53 A 53 50.9 19.6 28.1 A 26.8
COI 509 P - 513 P 98.9* 512 - 72.6 (49.0) - (36.0) 71.5 (46.5)
COII 224224 - 229 P 92.4228 (94.7) - 55.0 (54.1) - (37.4) 53.9 (54.9)
COIII 259 - - 259 264 262 - - 62.7 42.9 59.5
Cytb 367 - - 379 P 378 - - 53.6 (50.4) - (46.0) 53.6 (52.3)
ND 1 299 ND1 - - 316 P 324 - - 46.1 (47.3) - (40.2) 47.4 (49.8)
ND 2 307 ND2 - - 338 P 341 - - 24.9 (25.2) - (31.6) 30.9 (32.4)
ND3 117 - - 120 116 117 - - 39.2 35.5 39.7
ND 4 437 ND4 - - 442 P 446 - - 35.2 (29.9) - (33.1) 43.0 (39.5)
ND4L 99 -100 79 22.596 93 26.2 -22.7 24.2
ND5 545 - - 571 P 573 - - 35.0 (37.0) - (33.9) 34.2 (35.7)
ND6 155 P 180 166 15823.9 17419.9 75.3* 17.4 27.8

Percentage of amino acid identity was calculated by dividing the number of identical amino acid positions by the common length of the compared sequences. Numbers in parentheses show the percentage of identity among the partially sequenced portions of M. edulis protein coding genes and the corresponding regions of A. coerulea, A. turrita, K. tunicata and D. yakuba genes. Asterisk denotes the percentage identity values for the partially sequenced A. turrita COI and ND6 genes. A, absent; P, partial. Alignment of compared amino acid sequences is not shown.
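The identity measure described in the footnote to Table 2 (identical positions divided by the common length of the compared sequences) can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes the two sequences are already aligned with "-" as the gap symbol, and it takes "common length" to mean the number of aligned positions at which both sequences carry a residue. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over positions present in both sequences.

    Both inputs must be rows of the same alignment (equal length).
    Positions where either sequence has a gap are ignored (an assumption
    about what "common length" means).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    shared = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not shared:
        raise ValueError("no positions in common")
    identical = sum(1 for x, y in shared if x == y)
    return 100.0 * identical / len(shared)


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    seq_a = "MKLFT-LLA"
    seq_b = "MRLFTSL-A"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Other conventions (for example dividing by the length of the shorter or of the longer sequence) give different values for the same alignment, so the choice of denominator should be stated whenever identities are compared across studies.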

[Figure 4: sequence alignment panels for s-rRNA and l-rRNA]

FIGURE 4.-Alignment of the ends of the two ribosomal RNA genes. A.c., A. coerulea; A.t., A. turrita; K.t., K. tunicata; M.e., M. edulis; D.y., D. yakuba; X.l., X. laevis. Numbers in parentheses denote regions where alignment is ambiguous; numbers in brackets denote the total intervening length of unaligned sequences between the ends of the molecules. Double underlining emphasizes inverse repeats forming a stem and loop structure; in the proposed secondary structure of Drosophila s-rRNA this is the last hairpin (CLARY and WOLSTENHOLME 1985b). Single underlining indicates an additional hypothetical stem and loop structure in Albinaria.

proteins but also in the percentage of identity they exhibit. As expected, the greatest identity is observed between the two Albinaria species. When more distantly related species are compared, there are some discordances with traditional taxonomic relationships, a conclusion we have previously reached from analysis of A. turrita mtDNA sequences (LECANIDOU et al. 1994) and which is also reported by BOORE and BROWN (1994b).

Codon usage and codon bias: The genetic code of Albinaria (Table 1) has been presented previously (LECANIDOU et al. 1994) and is the same in all known molluscs (HOFFMANN et al. 1992; BOORE and BROWN 1994a,b; TERRETT et al. 1994) as well as in Drosophila (CLARY and WOLSTENHOLME 1985a). It differs from the universal code in that ATA codes for methionine, TGA for tryptophan, and AGA and AGG for serine.

As can be seen in Table 1, codons ending in A or T are much more frequent (~81%) than those ending in C or G. In fourfold synonymous codon families 77.5% of the codons end in A or T. Since any nucleotide substitution in these sites does not lead to an amino acid replacement, the AT bias in the third codon position should not be attributed to pressure of natural selection at the protein level.

Seven out of 62 amino acid codons, which consist exclusively of A and/or T, represent 38.4% of the total number of codons. Six of these are the most frequently used codons and correspond to the amino acids Leu, Ile, Phe, Met, Tyr, and Asn, which constitute 16.8, 8.4, 7.4, 7.3, 5.2 and 3.8% of the total number of amino acids, respectively. With the exception of Leu, this order of amino acid frequencies is different from that actually observed in Albinaria mitochondrial proteins, where Ser, and not Ile, is the second most frequent amino acid (9.7%), while Val (6.4%), Gly (6.1%), Ala (5.4%) and Thr (5.2%) are more (or equally, Thr) frequent than Tyr and Asn. Thus, there is no direct correlation between the most commonly used codons and the most frequent amino acids. If the predominance of codons composed solely of A and/or T is attributed to AT bias (CROZIER and CROZIER 1993; BOORE and BROWN 1994b), then the relatively high percentage of certain amino acids that are not coded by strictly A and/or T codons (such as Ser, Thr and the aliphatic Val, Gly, Ala with similar physicochemical properties) must be attributed to selective pressure.

Ribosomal RNA genes: Identification of the Albinaria s-rRNA and l-rRNA genes was accomplished by comparison with other known mitochondrial ribosomal RNA genes. The l-rRNA gene had been previously identified in two cloned segments of A. turrita mtDNA (LECANIDOU et al. 1994). Using the complete A. coerulea mtDNA sequence as a guide, it is inferred that these two segments are actually consecutive and that the 5' end of the A. turrita l-rRNA gene is at residue 631 of segment I (the sequence preceding the 5' end contains the tRNAVal gene), while the 3' end is at nucleotide 923 of segment II [due to revision of the A. turrita tRNALeu(CUN)] (see discussion on tRNAs).

The ends of the two Albinaria rRNAs cannot be precisely determined because of uncertainties in the alignments with known rRNAs. Figure 4 shows a comparison of the 5' and 3' regions, which is based on conserved sequence elements showing a significant degree of similarity. Since the size of the compared regions is smaller in Albinaria, we assume at present that each Albinaria rRNA gene occupies all of the available space between the tRNAMet and tRNAGlu genes (s-rRNA) and the tRNAVal and tRNALeu genes (l-rRNA).

At the 5' end of the Albinaria large rRNA gene, the first conserved region (18 nt) is located 52 nt downstream from the 3' end of the tRNAVal gene, this being the smallest length among all compared molecules (Figure 4). A similar observation can be made for the 3' end, where a conserved sequence of 13 nt is located only 24-25 nt from the junction with tRNALeu.

The presence of the conserved heptamer sequence TGGCAGA, defined as the rRNA transcription termination signal in animal mitochondria (VALVERDE et al. 1994), in positions 8-14 of the adjacent tRNALeu gene should be noted.

Comparison of the 5' ends of the small rRNA genes shows that a sequence of 152 nt in Albinaria corresponds to a sequence of 236 nt in Katharina or 192 nt in Drosophila. In contrast, the Albinaria 3' end appears to be longer than the equivalent regions of Katharina and Drosophila. A sequence of 18 nt, which is almost identical among compared species, is immediately followed by an inverted repeat (double underlining in Figure 4) corresponding to the final stem and loop structure of Drosophila s-rRNA (CLARY and WOLSTENHOLME 1985b). In Albinaria this region is followed by a sequence of 27 nt that could potentially form an extra stem and loop structure, as it contains a 5-nt inverted repeat (single underlining in Figure 4). In any case, the exact points of the rRNA gene ends must be determined by more direct methods (VAN ETTEN et al. 1980; CLARY and WOLSTENHOLME 1985b).

Even if we assume that the Albinaria ribosomal RNA genes occupy all of the available space between adjacent tRNA genes, they still represent the shortest rRNA genes among coelomate metazoans (s-rRNA/l-rRNA: Albinaria, 759/1035 bp; Katharina, 826/1275 bp; Mytilus, 945/1244 bp; Drosophila, 789/1326 bp) but are larger than those of pseudocoelomate nematodes (compare with Table VI of WOLSTENHOLME 1992; OKIMOTO et al. 1992).

Transfer RNA genes: Identification of the standard set of 22 Albinaria tRNA genes was based on their predicted cloverleaf structures, which define unambiguous anticodons (Figure 5). Some of the putative secondary structures of ten previously reported A. turrita tRNA gene sequences (LECANIDOU et al. 1994) have been redrawn to conform with the A. coerulea tRNA secondary structures (see discussion below and Figure 5). In addition to these tRNA genes, two more sequences that can be folded into tRNA-like structures were detected (see gene arrangement) within protein coding genes (Figure 3).

FIGURE 5.-Cloverleaf representation of 22 putative A. coerulea tRNA genes. In 11 of them a direct comparison is made with the corresponding A. turrita putative tRNAs. ○, identical nucleotides; small letters, differences; - - -, gaps in A. turrita. Arrow-shaped boxes encompassing a variable number of nucleotides denote overlapping regions that are supposedly modified posttranscriptionally by RNA editing using the opposite T nucleotides of the 5' end as an internal guide.

All standard Albinaria tRNAs have the same anticodons as those reported for Mytilus and Katharina; these are preceded by T and followed by either A or G (purine). Five cases of mismatched base pairs at exactly equivalent positions are evident in the anticodon stem (at the top of five-membered stems; in Ala, Gly, Phe, Pro, Trp). Albinaria anticodon stems consist of 5 and sometimes 6 bp [Ser(AGN), Tyr, and probably Ser(UCN)]; in the case of Ser(AGN), as many as 9 bp are possible. Although not shown, the Katharina and Mytilus tRNASer(AGN) also have the potential of forming 9-bp anticodon stems, all three molluscan sequences being very conserved in this region. Anticodon stems with an increased potential of base pairing have also been drawn for Ser(AGN) and Ser(UCN) of C. elegans (WOLSTENHOLME 1992). The actual presence of a six-membered anticodon stem has been demonstrated by direct RNA sequencing of a mammalian tRNASer(UCN) (YOKOGAWA et al. 1991; JANKE et al. 1994). Interestingly, the Albinaria tRNATyr gene conforms very well with the secondary structure of mammalian tRNASer(UCN), in that, in addition to the six-membered anticodon stem, it contains only one nucleotide between the 7-bp acceptor stem and the 4-bp D stem. Finally, the Albinaria tRNASer(UCN) gene may equally well be drawn to conform with the secondary structure of its mammalian counterpart.

Amino-acyl stems consist of 7 bp. As can be noted in Figure 5, mismatching is observed in several amino-acyl stems [Asp, Gly, His, Leu(CUN), Leu(UUR), Lys, Pro, Ser(UCN), Val]. What is interesting is that in most cases of mismatches, we are dealing with overlapping genes: Asp with Cys, Gly with His, Leu(CUN) with Pro, Leu(UUR) with Gln, Lys with COI, Pro with Ala. It is also worth noticing that in all such cases the 3' ends of the amino-acyl stems are almost invariably composed of T residues. Recently, it was demonstrated that in Acanthamoeba castellanii mitochondrial tRNA genes, certain bases of the 5' end (confined to the first 3 bp of the acceptor arm, where correct base pairing is presumably essential for biological activity) are modified posttranscriptionally by a process of RNA editing (LONERGAN and GRAY 1993); the specificity of editing, rather than being provided by guide RNAs, could be provided by the 3' end of the acceptor stem itself. In Albinaria we believe that a similar mechanism of RNA editing is operating, but that in this case the 5' end of the acceptor stem probably acts as an internal guide for editing of the 3' end. This editing might thus resemble a primitive polyadenylation mechanism. Final verification must await direct tRNA sequence determination.

We thank professors R. A. D. CAMERON, M. J. SMITH and an anonymous reviewer for helpful comments on the manuscript. We thank VASSILIS DOURIS and LARA KRAVARITI for their excellent technical assistance. This work was supported by the University of Athens and by the General Secretariat of Research and Technology of the Greek Ministry of Industry, Energy and Technology.

LITERATURE CITED

ANDERSON, S., A. T. BANKIER, B. G. BARRELL, M. H. L. DE BRUIJN, A. R. COULSON et al., 1981 Sequence and organization of the human mitochondrial genome. Nature 290: 457-465.
ANDERSON, S., M. H. L. DE BRUIJN, A. R. COULSON, I. C. EPERON, F. SANGER et al., 1982 Complete sequence of bovine mitochondrial DNA: conserved features of the mammalian mitochondrial genome. J. Mol. Biol. 156: 683-717.
ARNASON, U., and E. JOHNSSON, 1992 The complete mitochondrial DNA sequence of the harbor seal, Phoca vitulina. J. Mol. Evol. 34: 493-505.
ARNASON, U., and A. GULLBERG, 1993 Comparison between the complete mtDNA sequences of the blue and the fin whale, two species that can hybridize in nature. J. Mol. Evol. 37: 312-322.
ARNASON, U., A. GULLBERG and B. WIDEGREN, 1991 The complete

nucleotide sequence of the mitochondrial DNA of the fin whale, Balaenoptera physalus. J. Mol. Evol. 33: 556-568.
ASAKAWA, S., Y. KUMAZAWA, T. ARAKI, H. HIMENO, K. MIURA et al., 1991 Strand-specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes. J. Mol. Evol. 32: 511-520.
AVISE, J. C., B. W. BOWEN, T. R. LAMB, A. B. MEYLAN and E. BERMINGHAM, 1992 Mitochondrial DNA evolution at a turtle's pace: evidence for low genetic variability and reduced microevolutionary rate in the Testudines. Mol. Biol. Evol. 9: 457-473.
BAIROCH, A., 1988 PC/Gene version 5.15. University of Geneva, Switzerland.
BIBB, M. J., R. A. VAN ETTEN, C. T. WRIGHT, M. W. WALBERG and D. A. CLAYTON, 1981 Sequence and gene organization of mouse mitochondrial DNA. Cell 26: 167-180.
BOORE, J. L., and W. M. BROWN, 1994a Mitochondrial genomes and the phylogeny of mollusks. Nautilus (Suppl.) 2: 61-78.
BOORE, J. L., and W. M. BROWN, 1994b Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata. Genetics 138: 423-443.
BOYCE, T. H., M. E. ZWICK and C. F. AQUADRO, 1989 Mitochondrial DNA in the pine weevils: size, structure and heteroplasmy. Genetics 123: 825-836.
BROUGHTON, R. E., and T. E. DOWLING, 1994 Length variation in mitochondrial DNA of the minnow Cyprinella spiloptera. Genetics 138: 179-190.
BURGE, C., A. M. CAMPBELL and S. KARLIN, 1992 Over- and under-representation of short oligonucleotides in DNA sequences. Proc. Natl. Acad. Sci. USA 89: 1358-1362.
CANTATORE, P., M. N. GADALETA, M. ROBERTI, C. SACCONE and A. C. WILSON, 1987 Duplication and remoulding of tRNA genes during the evolutionary rearrangement of mitochondrial genomes. Nature 329: 853-855.
CANTATORE, P., M. ROBERTI, G. RAINALDI, M. N. GADALETA and C. SACCONE, 1989 The complete nucleotide sequence, gene organization, and genetic code of the mitochondrial genome of Paracentrotus lividus. J. Biol. Chem. 264: 10965-10975.
CARDON, L. R., C. BURGE, D. A. CLAYTON and S. KARLIN, 1994 Pervasive CpG suppression in animal mitochondrial genomes. Proc. Natl. Acad. Sci. USA 91: 3799-3803.
CHANG, D. D., T. W. WONG, J. E. HIXON and D. A. CLAYTON, 1985 Regulatory sequences for mammalian mitochondrial transcription and replication, pp. 135-144 in Achievements and Perspectives of Mitochondrial Research, Vol. II. Biogenesis, edited by E. QUAGLIARIELLO, E. C. SLATER, F. PALMIERI, C. SACCONE and A. M. KROON. Elsevier, Amsterdam.
CHANG, Y.-S., F.-L. HUANG and T.-B. LO, 1994 The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38: 138-155.
CLARY, D. O., and D. R. WOLSTENHOLME, 1985a The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22: 252-271.
CLARY, D. O., and D. R. WOLSTENHOLME, 1985b The ribosomal RNA genes of Drosophila mitochondrial DNA. Nucleic Acids Res. 13: 4029-4044.
CLARY, D. O., and D. R. WOLSTENHOLME, 1987 Drosophila mitochondrial DNA: conserved sequences in the A+T-rich region and supporting evidence for a secondary structure model of the small ribosomal RNA. J. Mol. Evol. 25: 116-125.
CLAYTON, D. A., 1991 Replication and transcription of vertebrate mitochondrial DNA. Annu. Rev. Cell Biol. 7: 453-478.
CLAYTON, D. A., 1992 Transcription and replication of animal mitochondrial DNAs. Int. Rev. Cyt. 141: 217-232.
CROZIER, R. H., and Y. C. CROZIER, 1993 The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics 133: 97-117.
CROZIER, R. H., Y. C. CROZIER and A. G. MACKINLAY, 1989 The COI and COII region of honeybee mitochondrial DNA: evidence for variation in insect mitochondrial rates. Mol. Biol. Evol. 6: 399-411.
DE GIORGI, C., C. LANAVE, M. D. MUSCI and C. SACCONE, 1991 Mitochondrial DNA in the sea urchin Arbacia lixula: evolutionary inferences from nucleotide sequence analysis. Mol. Biol. Evol. 8: 515-529.
DESJARDINS, P., and R. MORAIS, 1990 Sequence and gene organization of the chicken mitochondrial genome: a novel gene order in higher vertebrates. J. Mol. Biol. 212: 599-634.
DOURIS, V., G. C. RODAKIS, S. GIOKAS, M. MYLONAS and R. LECANIDOU, 1995 Mitochondrial DNA and morphological differentiation of Albinaria populations (Gastropoda: Clausiliidae). J. Moll. Studies 61: 65-78.
GADALETA, G., G. PEPE, G. DE CANDIA, C. QUAGLIARIELLO, E. SBISA et al., 1989 The complete nucleotide sequence of the Rattus norvegicus mitochondrial genome: cryptic signals revealed by comparative analysis between vertebrates. J. Mol. Evol. 28: 497-516.
GHIVIZZANI, S. C., S. L. D. MACKAY, C. S. MADSEN, P. J. LAIPIS and W. W. HAUSWIRTH, 1993 Transcribed heteroplasmic repeated sequences in the porcine mitochondrial DNA D-loop region. J. Mol. Evol. 37: 36-47.
GJETVAJ, B., D. I. COOK and E. ZOUROS, 1992 Repeated sequences and large scale size variation of mitochondrial DNA: a common feature among scallops (Bivalvia: Pectinidae). Mol. Biol. Evol. 9: 106-124.
HENIKOFF, S., 1987 Exonuclease III generated deletions for DNA sequence analysis. Promega Notes No. 8.
HOFFMANN, R. J., J. L. BOORE and W. M. BROWN, 1992 A novel mitochondrial genome organization for the blue mussel Mytilus edulis. Genetics 131: 397-412.
JACOBS, H., D. ELLIOT, V. MATH and A. FARQUHARSON, 1988 Nucleotide sequence and gene organization of sea urchin mtDNA. J. Mol. Biol. 201: 185-217.
JACOBS, H. T., E. R. HERBERT and J. RANKINE, 1989 Sea urchin egg mitochondrial DNA contains a short displacement loop (D-loop) in the replication origin region. Nucleic Acids Res. 17: 8949-8965.
JANKE, A., G. FELDMAIER-FUCHS, W. K. THOMAS, A. VON HAESELER and S. PÄÄBO, 1994 The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137: 243-256.
L'ABBÉ, D., J. F. DUHAIME, B. F. LANG and R. MORAIS, 1991 The transcription of DNA in chicken mitochondria initiates from one major bidirectional promoter. J. Biol. Chem. 266: 10844-10850.
LAROCHE, J., M. SNYDER, D. I. COOK, K. FULLER and E. ZOUROS, 1990 Molecular characterization of a repeat element causing large size variation in the mitochondrial DNA of the sea scallop Placopecten magellanicus. Mol. Biol. Evol. 7: 45-64.
LECANIDOU, R., V. DOURIS and G. C. RODAKIS, 1994 Novel features of metazoan mtDNA revealed from sequence analysis of three mitochondrial DNA segments of the land snail Albinaria turrita (Gastropoda: Clausiliidae). J. Mol. Evol. 38: 369-382.
LONERGAN, K. M., and M. W. GRAY, 1993 Editing of transfer RNAs in Acanthamoeba castellanii mitochondria. Science 259: 812-816.
MARTIN, A. P., G. J. P. NAYLOR and S. R. PALUMBI, 1992 Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357: 153-155.
MITCHELL, S. E., A. F. COCKBURN and J. A. SEAWRIGHT, 1993 The mitochondrial genome of Anopheles quadrimaculatus species A: complete nucleotide sequence and gene organization. Genome 36: 1058-1073.
MORITZ, C., T. E. DOWLING and W. M. BROWN, 1987 Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18: 269-292.
OJALA, D., C. MERKEL, R. GELFAND and G. ATTARDI, 1980 The tRNA genes punctuate the reading of genetic information in human mitochondrial DNA. Cell 22: 393-403.
OJALA, D., J. MONTOYA and G. ATTARDI, 1981 tRNA punctuation model of RNA processing in human mitochondria. Nature 290: 470-474.
OKIMOTO, R., J. L. MACFARLANE and D. R. WOLSTENHOLME, 1990 Evidence for the frequent use of TTG as the translation initiation codon of mitochondrial protein genes in the nematodes, Ascaris suum and Caenorhabditis elegans. Nucleic Acids Res. 18: 6113-6118.
OKIMOTO, R., J. L. MACFARLANE, D. O. CLARY and D. R. WOLSTENHOLME, 1992 The mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum. Genetics 130: 471-498.
PALUMBI, S. R., and J. BENZIE, 1991 Large mitochondrial DNA differences between morphologically similar Penaeid shrimp. Mol. Mar. Biol. Biotech. 1: 27-34.
PEARSON, W. R., and D. J. LIPMAN, 1988 Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85: 2444-2448.
PEREZ, M. L., J. R. VALVERDE, B. BATUECAS, F. AMAT, R. MARCO et al., 1994 Speciation in the Artemia genus: mitochondrial DNA analysis of bisexual and parthenogenetic shrimps. J. Mol. Evol. 38: 156-168.
PUSTELL, J., and F. C. KAFATOS, 1984 A convenient and adaptable package of computer programs for DNA and protein sequence management, analysis and homology determination. Nucleic Acids Res. 12: 643-655.
PUSTELL, J., and F. C. KAFATOS, 1986 A convenient and adaptable microcomputer environment for DNA and protein manipulation and analysis. Nucleic Acids Res. 14: 479-488.
RAMIREZ, V., P. SAVOIE and R. MORAIS, 1993 Molecular characterization and evolution of a duck mitochondrial genome. J. Mol. Evol. 37: 296-310.
ROE, B. A., D.-P. MA, R. K. WILSON and J. F.-H. WONG, 1985 The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260: 9759-9774.
SAMBROOK, J., E. F. FRITSCH and T. MANIATIS, 1989 Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
SANGER, F., S. NICKLEN and A. R. COULSON, 1977 DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.
SHADEL, G. S., and D. A. CLAYTON, 1993 Mitochondrial transcription initiation. J. Biol. Chem. 268: 16083-16086.
SMITH, M. J., A. ARNDT, S. GORSKI and E. FAJBER, 1993 The phylogeny of echinoderm classes based on mitochondrial gene arrangements. J. Mol. Evol. 36: 545-554.
SNYDER, M., A. R. FRASER, J. LAROCHE, K. E. GARTNER-KEPKAY and E. ZOUROS, 1987 Atypical mitochondrial DNA from the deep-sea scallop Placopecten magellanicus. Proc. Natl. Acad. Sci. USA 84: 7595-7599.
SRIVASTAVA, K. K., E. E. CABLE and H. L. BONKOVSKY, 1993 Improved resolution and sensitivity of Northern blots using polyacrylamide-urea gels. Biotechniques 15: 227-231.
TERRETT, J., S. MILES and R. H. THOMAS, 1994 The mitochondrial genome of Cepaea nemoralis: gene order, base composition, and heteroplasmy. Nautilus (Suppl.) 2: 79-84.
VALVERDE, J. R., R. MARCO and R. GARESSE, 1994 A conserved heptamer motif for ribosomal RNA transcription termination in animal mitochondria. Proc. Natl. Acad. Sci. USA 91: 5368-5371.
VAN ETTEN, R. A., M. W. WALBERG and D. A. CLAYTON, 1980 Precise localization and nucleotide sequence of the two mouse mitochondrial rRNA genes and three immediately adjacent novel tRNA genes. Cell 22: 157-170.
WOLSTENHOLME, D. R., 1992 Animal mitochondrial DNA: structure and evolution. Int. Rev. Cyt. 141: 173-216.
YOKOGAWA, T., Y. WATANABE, Y. KUMAZAWA, T. UEDA, I. HIRAO et al., 1991 A novel cloverleaf structure found in mammalian mitochondrial tRNASer(UCN). Nucleic Acids Res. 19: 6101-6105.

Communicating editor: W.-H. LI