<<

Copyright 0 1992 by the Genetics Society of America

A Novel Mitochondrial Genome Organization for theBlue Mussel, Mytilus edulis

Richard J. Hoffmann,*" JeffreyL. Boore+ and Wesley M. Brownt *Department of Zoology and Genetics, Iowa State University, Ames, Iowa 5001 1-3223, and ?Department of Biology, University of Michigan, Ann Arbor, Michigan 48109-1048 Manuscript received December1 1, 199 1 Accepted for publication February 7, 1992

ABSTRACT The sequence of 13.9 kilobases (kb) of the 17.1-kb mitochondrial genome of Mytilus edulis has been determined, and the arrangement of all genes has been deduced. Mytilus mitochondrial DNA (mtDNA) contains 37 genes, all of which are transcribed from the same DNA strand. The gene content of Mytilus is typically metazoan in that it includes genesfor large and small ribosomal RNAs, for a complete set of transfer RNAs and for 12 proteins. The protein genes encode the cytochrome b apoenzyme, cytochrome c oxidase (GO) subunits 1-111, NADH dehydrogenase (ND) subunits 1-6 and 4L,and ATP synthetase (ATPase) subunit 6. No gene for ATPase subunit 8 could be found. The reading frames for the ND1, COI, and COIII genes contain long extensions relative to those genes in other metazoaq mtDNAs. There are 23 tRNA genes, one more than previously found in any metazoan mtDNA.The additional tRNA appears to specify methionine, making Mytilus mtDNA unique in having two tRNAMetgenes. Five lengthy unassigned intergenic sequences are present, four of which vary in length from 79 to 119 nucleotides and the largest of which is 1.2 kb. The base compositions ofthese are unremarkable and do not differ significantly from that of the remainder of the mtDNA. The arrangement- of genes - in Mytilus mtDNA is remarkably unlike that found in any other known metazoan mtDNA.

HE mitochondrialgenome of metazoans is a 1985). For example, among mtDNAs there T closed circular DNA; only among hydrazoans appears to be one basic arrangement (ANDERSONet al. (Hydra spp.) is this DNA linear (WARRIORand GALL 1981, 1982; BIBB et al. 198 1;GADALETA et al. 1989; 1985).Although metazoan mitochondrial DNA ROE et al. 1985; JOHANSEN,GUDDAL and JOHANSEN (mtDNA) ranges insize from ca. 14-40 kb (SEDEROFF 1990) with only minor variations (PAABo et al. 1991; 1984; MORITZ,DOWLING, and BROWN1987; SNYDER DESJARDINSand MORAIS 1990, 1991; L. SZURA and et al. 1987), its gene content is highly conserved. The W. BROWN, unpublisheddata); thiswithin-phylum sameset of genes is usuallypresent: 12 or 13 for stability also appears to hold true among echinoderm proteins, one each for the small and large subunit (JACOBS et al. 1988; CANTATOREet al. 1989; DE ribosomal RNAs (s-rRNA and I-rRNA), and 22 for GIORGIet al. 1991; HIMENOet al. 1987; SMITHet al. transfer RNAs (tRNAs;a sufficient number to trans- 1989, 1990;M. SMITH,personal communication) and latethe simplified mitochondrialgenetic code). In arthropodmtDNAs (CLARY and WOLSTENHOLME addition, atleast one extensive noncoding sequenceis 1985; BATUECASet al. 1988; HSUCHEN,KOTIN and present which, in vertebrates (CLAYTON1982, 1984) DUBIN1984; DUBIN, HSUCHENand TILLOTSON1986; and in insects (CLARY and WOLSTENHOLME1985), is MCCRACKEN,UHLENBUSCH and GELLISSEN1987; known to contain elements controlling the initiation UHLENBUSCH,MCCRACKEN and GELLISSEN1987; D. of replication andtranscription. Sizevariation in STANTON,L. SZURAand W. BROWN,unpublished mtDNA is usually due to differences in this noncoding data),even though the chordate, echinoderm and sequence (BROWN1985; HARRISON1989), but is oc- arthropodarrangements differ substantially from casionally due to duplications of other portions of the each other. genome (MORITZand BROWN1986, 1987; ZEVERING One exception to theconservation of gene arrange- et al. 1991). ment within a phylum occurs in nematodes. Ascaris Althoughthe nucleotide sequence of mtDNA suum and Caenorhabditiselegans have gene arrange- changes at a rate as much as ten times higher than ments that areidentical except for the location of an that of nuclear DNA (BROWN,GEORGE and WILSON A T-rich noncoding region (WOLSTENHOLMEet al. 1979),the order inwhich the genes are arranged + 1987; OKIMOTOand WOLSTENHOLME1990; OKI- appearsto change at a muchslower rate (BROWN MOTO, MACFARLANEand WOLSTENHOLME1990), but ' To whom correspondence should be addressed. that are radically different from that of a third ne-

Genetics 131: 397-412 (June, 1992) 398 R. J. Hoffmann, J. L. Boore and W. M. Brown matode, Meloidogyne javanica (OKIMOTOet al. 1991). States Biochemical Gorp.), used according to supplier's in- Whether this difference is due to achange in the structions. [55S]dATP(Amersham) was used to label the products, which were separated on 5-6% polyacrylamide rearrangement rate or to a very ancient split in the gels containing 8 M urea and castwith wedge spacers. lineages leading to these taxa is unknown, although Initially, sequences from each end of each subcloned frag- such asplit in nematodes has been postulated (POINAR ment were obtained using primers complementary to the 1983). Neither of the nematode gene arrangements T7 and T3 priming sites in pBluescript 11; sequences inter- shows an affinity with that of any other metazoan nal to those were obtained using synthetic 17 nucleotide (nt) primers designed on the basis of the sequences obtained. mtDNA. At present, nematodes are the only phylum Some automated sequencing, primarily to confirm se- known in which representatives of different subgroups quences manually determined from only one strand, was fail to share abasic mitochondrial gene arrangement, performed by the Iowa State University Nucleic Acid Facil- and they are further distinguished from other meta- ity on an Applied Biosystems model 373A DNA sequencing zoans by their lack of amitochondrial gene for system, followingmanufacturer's protocols. The subcloning and sequencing strategies are illustrated in Figure 1. ATPase8 (OKIMOTOet al. 199 1). Computer analysis was performed with the University of The slow butperceptible rate of rearrangement Wisconsin Genetics Computer Group package of programs and the very large number of arrangements possible (DEVEREUX, HAEBERLISMITHIES and 1984). Tentative iden- suggest that mitochondrial gene order may provide tifications of protein coding genes were made using FastA useful information about the phylogeny of distantly searches of the mtDNA nucleotide sequences in GenBank and confirmed using BestFit comparisons of in- related metazoan groups. Although present data pro- ferred amino acid sequences. Because both its nucleotide vide encouragement for this approach, and methods and amino acid sequences are known to be poorlyconserved, for analyzing the evolutionary relationships among ND6 gene identification was further confirmed by hydrop- gene rearrangements are being developed (SANKOFF, athy profile comparisons (KYTE and DOOLITTLE1982) with CEDERCREN andABEL 1990), only a few groups have the ND6 proteins of Drosophila and the sea urchin Paracen- trotus. Final amino acid alignments were made using Gap been investigated anddata from many more are with program defaults (gap weight = 3.0, gap length weight needed for even apreliminary assessment. To this = 0.1). The s-rRNA gene was identified by its sequence end, we have determinedthe mitochondrial gene similarity with knowns-rRNA genes, and ends were assigned arrangement for a bivalve mollusk, the blue mussel by adjacency to inferredtRNA genes. The I-rRNA gene has Mytilus edulis, the first such reported for any mollusk. no directly adjacent tRNA gene to define its 3' end and, therefore, its approximate 3' end was designated as the Mytilus mtDNA is especially interesting in that it is average position of the end of its sequence similarity with commonly inherited biparentally (HOEH,BLAKLEY several known metazoan mitochondrial I-rRNA genes. Se- and BROWN 1991),providing a possible opportunity quences encoding tRNA genes were identified generically for recombination. Its genearrangement is unlike that by their ability to fold into typical metazoan mitochondrial reported for any other metazoan mtDNA. Like nem- tRNA structures,and specifically by their anticodon se- quences. atodes, Mytilus apparently lacks mitochondriala ATPase8 gene. Further, Mytilus mtDNA contains a RESULTS ANDDISCUSSION unique tRNA gene which appears to specify methio- nine and, thus, has two tRNA genes for this amino The total length of this Mytilus mtDNA is 17.1 kb, acid. as determined by the sequence data combined with the restrictionfragment size estimates forthe few unsequenced portions. The cleavage map generated MATERIALS AND METHODS for this cloned mtDNA appears to be identical to that A X EMBL3 vector containing an entire Mytilus mtDNA for theM. edulis mtDNA that was designated A in the was a gift of D. 0. F. SKIBINSKI.The mtDNA had been report by HOEH, BLAKLEYand BROWN(1991). We inserted into X EMBL3 DNA using a unique BamHI site, as didnot map BclI sites in this study, but all other described by SKIBINSKIand EDWARDS(1987). Although the cleavage sites reported in that study appear identical mtDNA was isolated from Mytilus galloprovincialis-like mus- to those we found. Comparison of nt from the sels, it appears to be M. edulis mtDNA, perhaps acquired by 588 5' hybridization between the twospecies (see below). Frag- and 3' ends of the cloned ND4 gene with ND4 se- ments produced by digesting DNA from the X EMBL3 clone quence data obtained by S. MCCAFFERTY(personal with combinations of BamHI, XbaI, EcoRI, XhoI and NheI communication) from two mtDNAs representing os- endonucleases were subcloned into pBluescript I1 KS(-) tensibly pure populations of M. edulis and M. gallo- plasmids (Stratagene), producing five nonoverlapping sub- clones that jointly comprise the entire mtDNA, as well as provincialis revealed differences at 32 positions; the additional overlapping subclones. Further subclones of the cloned DNA sequence was identical with M. edulis at largest pBluescript I1 subclone were made using unique 20, with M. galloprovincialis at five, and differed from NheI, NcoI, BstXI and PstI restriction sites found within it. bothat seven. Although the original X clone was Detailed cleavage maps of each of the five original nonov- produced using DNA from M. galloprovincialis-like erlapping subclones were constructed using combination digests of restriction endonucleases. mussels (SKIBINSKIand EDWARDS1987), the ND4 Double stranded DNA sequencing reactions employed sequence and the cleavage map similarities suggest modified T7 DNApolymerase (Sequenase 2.0, United that the cloned mtDNAis derived fromM. edulis, and Mytilus mtDNA Organization 399 FIGURE1.-Map of the genescon- Mytilus edulis tained in the M. edulis mitochondrial ge- nome, together with subcloningand se- I-rRNA COlI COIIl ND3 ATPsscds-rRNA ND6 quencing strategies. The cleavage map shows only those sites employed in sub- cloning. The original genome was cloned into X EMBLS using a unique BamHl site, which appears at both ends of the linear- ized map illustrated here. Ranges encom- c ND4L passed by the pBluescript I1 subclones ap- pear immediately above thescale line, with the five nonoverlapping subclones shown as a single line marked by the cleavage sites used to make them. Arrows show the locations and directions of individual se- quencing runs. Locations of tRNA genes are marked by their single-letter amino acid codes. Alternative tRNAgenes for methionine, leucine and serine are noted by subscripts designating the codons they recognize. Numbers indicate the quantity 11111111~11~~~~~~1of nucleotidescontained within five lengthy unassigned intergenic regions. 0 12 3 4 5 6 7 8 9 1011121314151617 The direction of transcription of all genes Kilobase Pairs is left to right.

it is so identified in this study. This observation is not HOLME 1990; OKIMOTOet al. 1991). Both rRNA genes especially surprising, since hybridization between the are also present. There are 23 regions that can be two is frequent (SKIBINSKI,AHMED and BEARD- folded into structuresthat arecharacteristic for tRNA MORE 1978) and could easily result in interspecific genes, one more than is present in other metazoan exchange of mtDNA. A caution in this assignment to mtDNAs. The additional tRNA gene employs an an- M. edulis is that on-going analysis of 1-rRNA and ticodon (TAT) for methionine that is unique among ATPase6 sequence has so far failed to distinguish M. animalmtDNAs, giving Mytilus two mitochondrial edulis from M. galloprovinicialis (R. KOEHN,personal tRNAMet genes rather than the single one found in communication). other metazoans. (This unique tRNA gene is herein Although 3.2 kb of the mtDNA remain unsequ- designated tRNAy&, see below.) All of the genes are enced,the informationfrom the 13.9-kb sequence transcribed from the same DNA strand. obtained is sufficient to identify unambiguously all Unassigned DNA: A 1.2-kb region with no readily genes, to locate all gene boundaries and to sequence assignable function lies between the I-rRNA and the fully on both DNA strands 21 of 23 tRNA genes, all tRNATyr genes. This region contains several poly(A) noncoding regions,plus unusual protein gene termini. tracts near thetRNATyr gene and oneG + C-rich tract The few nucleotides that were notdetermined by (27 of 35 nt;see APPENDIX I). These sequence features sequencing both strands of the two remaining tRNA and thesize of the region suggest it as a candidate for genes are at their 3’ ends; thus, even if there are a role in initiating replication and/or transcription, errors at these positions, the correct identification of but experimental data are needed before a functional the aminoacid specificity of the two tRNAs is uncom- assignment can be made. If the region does contain promised. Aside from the five noncoding regions dis- the origin of replication, it is much less A + T-rich cussed in a following section, Mytilus mitochondrial (6 1%) than the controlregion in Drosophila, which is genes either abut directly or are separated by short >90% A + T (CLARY andWOLSTENHOLME 1985). (1 - 15 nt)sequences, as in other metazoan mtDNAs. This 1.2-kb region plus the adjacent 3’ end of the Gene content: M. edulis mtDNA (Figure 1, APPEN- I-rRNA gene contain six ORFs that begin with the DIX I) contains 12 open reading frames (ORFs) that metazoanmitochondrial initiation codons, ATG, correspond to genes for cytochrome b (Cyt b), cyto- ATA or ATT, endwith a termination codon and are chrome c oxidase subunits 1-111 (COI-COIII), NADH >40codons long (Segment #1, APPENDIX I). One dehydrogenasesubunits 1-6 and4L (ND1-6 and ORF, of 11 1 codons, begins 13 ntbeyond the 3’ end ND4L), and ATPase6. The ATPase8 gene appearsto of the I-rRNA gene, on the same mtDNA strand as be missing (see below), as is also the case in nematodes the known protein coding genes. The remaining five (WOLSTENHOLMEet al. 1987; OKIMOTOand WOLSTEN- are on the opposite DNA strand. They range in size HOLME 1990; OKIMOTO, MACFARLANEand WOLSTEN- from 47 to 170 codons and overlap the 11 1 codon 400 R. J. Hoffmann, J. L. Booreand W. M. Brown

TABLE 1 and, thus, could be potentially important for process- Stabilities (kcal/mol)of predicted RNA secondary structures ing or transcription. The sequences between the Cyt (ZUKER and STIEGER1981) for five unassigned intergenic b and COIIgenes and between the tRNA& and COI sequences and a5’ extension to the NDl reading frame fromM. genes fold into structures that are significantly more edulis mitochondrial DNA stable than random. The predicted structure at the 5’ end of the tRNA& gene, however, is not signifi- Energy cantly different in stability than random. Finally, the True Randomized 79-nt sequence betweenthe endof the COIII reading Region sequence sequences sequence Region frame and the beginningof the tRNA%$ gene could 1.2-kb unassigned sequence -277.3* -253.5 f 4.5 be included in a stable structure including a 3’ exten- Cyt b to COII -19.1* -16.4 f 1.3 sion of the COIII gene (discussed below). Assessing ND3 to tRNA:& -9.8 -11.0 1.2 f biological significance of these structures, if any, re- tRNA:h to CO1 -20.6* -16.6 f 1.9 COIIl to tRNA!$,, -48.8* -39.1 f 2.1 quires further investigation. In the aggregate, these ND1 5’ extension -30.8 -32.7 f 2.0 four smaller unassigned regions represent more non- coding nucleotides outside of the control region than Stabilities of the true sequences are compared to the means f 9.5%confidence intervals of 10 random sequences of nucleotides has been found previously. Finally, because of uncer- with the same nucleotide composition to aid in assessing signifi- tainties about theexact lengthsof some of the protein cance. The 5‘ extension to the COI and the 3’ extension to the <:011I reading frames are included with their flanking unassigned genes (discussed below), there may be additional non- sequence. coding regions in Mytilus mtDNA. * Significantly more stable than random, in that they fall outside A missing ATPase8 gene: Release 17 of the Swiss- of the 95% confidenceintervals for 10 randomized sequences with the sane nucleotide composition as the native sequence. Prot database includes 13 entries for ATPase8 genes of metazoan mtDNAs: eightfrom vertebrates, two ORF, each other, and/or the 3’ end of the 1-rRNA from echinoids and three from insects. There is very gene extensively. None of the ORFs or their inferred little sequence conservation among these, even at the protein products show any similarity to sequences in amino acid level, which, when combined with the the databases, and none appearsto correspond to the small size of the gene (159-204 nt), makes its identi- ATPase8 gene (see below). Because of this, and be- fication exceedingly difficult. A few features aid in its cause of the extensive overlaps with each other and recognition, however. In 12 of the13 entries, the 1-rRNA gene, we doubt that any of these ORFs ATPase8 is inferred to begin with Met-Pro-Gln (in represent functional genes. However, without direct some, the initial codon used for Met is a nonstandard experimentation no strong conclusion is possible. one), and Leu follows Gln in all but the three insect Optimal foldingof the 1.2-kb regionby the method proteins, which have Met instead. All 13 areidentical of ZUKERand STIEGER(1 98 1) yields a stable secondary in having Trp as the sixth amino acid following Pro. structure(Table 1).Comparison of its free energy Additionally, all but the five mammals have the se- value with the mean energy value for 10 randomized quence Trp-X-Trp ator within one amino acid of the sequences with the same nucleotide composition and Cterminus. Examination of hydropathy profiles size suggest thatthe true sequence is significantly (KYTE and DOOLITTLE1982) indicates that all of the more stable than a random collection of nucleotides proteins are hydrophobicover theamino terminal having the same nucleotide composition (Table1). half of their length and hydrophilic over the C ter- The predicted secondary structure of the 1.2-kb re- minal half. Finally, for all metazoan mtDNAs in which gion contains several longpotential stem and loop an ATPase8 gene has been identified, it immediately structures. There is evidence that one such structure precedes (and sometimes overlaps in another reading mediates initiation of lagging strand synthesis during frame) the gene for ATPase6. In human cells, both vertebrate mtDNA replication (WONGand CLAYTON gene products are translatedfrom a single mRNA 1985), and it has been speculated that others might (OJALA,MONTOYA and ATTARDI198 1). serve as recognition signals for replication, transcrip- Translation of all portions of the sequence from tion,and RNA processing (BROWN1985; FORAN, Mytilus mtDNA inall possible reading frames and HIXSON and BROWN 1988;JACOBS et al. 1988). examination of all inferred polypeptides, including There are four shorter noncoding regions in addi- those with nonstandard initiation codons, yielded one tion to the 1.2-kb region just discussed. One, of 115 with an initial Met-Pro-Gln-Leu sequence. This read- nt, is between Cyt b and COII; another,of 79 nt, lies ing frame is 34 codons long, lacks any Trp codons, between the genes for COT11 and tRNAy&; and two, has a hydropathy profile that is different from those of 88 and 119 nt, flank the tRNA& gene at its 5’ of all other ATPase8proteins, and lies entirely within and 3’ ends, respectively (Figure 1, APPENDIX I). Like the ND2 gene but on the opposite DNA strand. The the 1.2-kb region, each contains sequence elements initial sequence Met-Pro-Pro-Gln-(X)5-Trp occurred capable of secondary structure formation (Table 1) once, in an ORF of only 21 codons that also lacks the Mytilus mtDNA Organization 40 1 characteristic hydropathy profile of ATPase8. Allow- HOLME 1990; OKIMOTO,MACFARLANE and WOLSTEN- ing for the possibility of a nonstandard initiation co- HOLME 1990). Comparing Mytilus with these, only the don by searching forPro-Gln-(X)5-Trp failed to reveal ND5 and ND6 genes have the same relative orienta- additional candidates. Two open reading frameswith tion, but there are three intervening tRNA genes in the sequence Trp-X-Trp within one codon of a ter- nematodes and none in Mytilus. Although these two mination codonwere found, but neither had the other genes are also adjacent in and echinoderms, characteristic features of ATPase8. None of the six their relative polarities are opposite, whereas they are ORFs in the 1.2-kb unassigned region discussed above the same in Mytilus. has the characteristics of ATPase8. Finally, there is Partial gene orders are known for a few other taxa no reading frame with any of these characteristics in with spiral cleavage. Although all of its tRNA genes the vicinity of the ATPase6 gene. In summary, we have not been mapped, a third nematode mtDNA, have been unable to find evidence of an ATPase8 from Meloidogyne javanica (OKIMOTOet al. 1991), gene in the sequence obtained from Mytilus mtDNA shares little of its order with the other two nematodes and, barring the unlikely possibility that it is entirely or with Mytilus. The ND2 and COIII genes are adja- within an unsequenced portion of one of the other cent in both Mytilus and Meloidogyne, but their rel- protein genes, we conclude that, like nematodes, My- ative polarities are different.Although its full mtDNA tilus lacks a mitochondrial ATPase8 gene. organization is unknown, in the flatworm Fasciola Although not resolvable with present data, at least hepatica the genes for ND3 and COI areseparated by twointeresting and testable evolutionaryquestions those for tRNAp& and tRNATyr (GAREYand WOL- are worth noting in this regard. One concerns the fate STENHOLME 1989); these two protein genes are in the of the missing ATPase8 gene. Has it been transferred same order and orientation inMytilus, but with a to the nucleus, or has its function been subsumed by tRNAgEN gene and two noncoding regions between adifferent nuclear-encoded protein that is trans- them. Finally, no two mitochondrial protein or rRNA portedinto the mitochondrion, or is thatfunction genes share the same boundaries in Mytilus and the somehow dispensable for some organisms, such as sea anemone Metridium senile (D. WOLSTENHOLME, Mytilus and nematodes? The other concerns the personal communication). shared absence of this gene among the bivalve and If tRNA genes are included in the comparisons, a nematode species examined. Does this reflect two few additionalcommon organizational features are independent losses from two separate lineages or one revealed among the mtDNAs of Mytilus and other loss from a taxon that was ancestral to both bivalves metazoans. The COII and tRNALysgenes are directly and nematodes? Both questions can be addressed by adjacent and have the same relative polarity in Myti- further investigation. Ius,in Drosophila and several other arthropods, in Comparisons of gene order: As Figure 2 shows, echinoderms, and in chordates. Also, the 5’ end of Mytilus mtDNA is organized in a radically different the ND1 gene is preceded by the gene for tRNA&R fashion from that of Drosophila yakuba, the only pro- in Mytilus, in chordates and in echinoderms; regard- tostomefor which acomplete mtDNA sequence is ing this, we note that the gene for tRNA& is up- available (CLARYand WOLSTENHOLME1985). Few stream of the ND1 gene in several arthropod mt- genes share the same boundaries in both; if tRNA DNAs, which suggests the possibility that the genes are ignored, only the s- and 1-rRNA genes are tRNAENand tRNAE’,”, genes might be homologous found in the same relative order and transcriptional in some lineages and simply have “exchanged” anti- polarity. However, in Drosophila (and in other arthro- codons reciprocally. The tRNATh‘ gene is upstream pods and chordates as well) the two rRNA genes are of the ND4L gene in both Mytilus and Drosophila, separated by one tRNA gene, for valine, whereas in albeit with reversed polarity in Drosophila. The 5’ Mytilus they are separated by seven tRNA genes, none end of the s-rRNA gene is bounded by the tRNAPhe of which specifies valine. With regard to mitochon- gene in Mytilus, in chordates, and in echinoderms. In drial gene order, there is substantially less similarity Mytilus and echinoderms, the tRNATrp gene is fol- between Mytilus andother representativeprotos- lowed by thatfor tRNAA1”, but with their relative tomes than between those of protostomes and deuter- polarities switched; in chordates,these two tRNA ostomes, in markedcontradiction to expectations genes are also adjacent with opposite polarities, but in based on an assumption of monophyly among protos- the reverse order. The tRNA?& and tRNAHis genes tomes. are adjacent and have the same polarity in Mytilus, The only other metazoans that have a spiral pattern chordatesand echinoderms,but the order is 5’- of cleavage during development and forwhich a com- tRNASer-tRNAHis-3’Mytilusin and 5’-tRNAHis- plete mitochondrial gene order is published are two tRNASer-3’ in the others. The tRNAG1” genefollows nematodes, Ascaris mum and Caenorhabditis elegans the tRNA”‘ gene inMytilus, Drosophila and chor- ( WOLSTENHOLMEet al. 1987; OKIMOTOand WOLSTEN- dates, but the genes are on opposite strands in the 402 R. J. Hoffmann,J. L. Booreand W. M. Brown Mytilus edulis

ATPase6ND4L ND6srRNA I rRNA

FIGURE2.-Comparison of mitochon- drial gene organization between M. edulis and D. yakuba. The circular genomes were linearized and aligned at the Cyt b gene. Lines connect identical large genes, and el- lipses with arrows highlight reversals in po- larity. Positions of tRNA genes, indicated by cross hatching, have been ignored in this comparison. Arrows under gene labels show direction of transcription. The bars spanning the s-rRNA and the I-rRNA genes show the only two large genes that share the same orientation and transcriptional polarity be- tween the genomes. Stippled areas in the Mytilus genome show regions of unassigned I function. The D. yakuba gene order is from s rRNA ND4L \ H1 kb ATPase6I ck ?,, CLARYand WOLSTENHOLME(1985). // ND6 ATPase8 cot1 Drosophila yakuba latter two. Finally, the genes for tRNAMet and ND2 stop codon, but this same A must also participate in are directly adjacent in Mytilus, chordates, and Dro- the presumptive ATG initiation codonfor ND6. sophila (but not locusts); however, the tRNAMet gene While it is possible that ND5 andND6 do not overlap in Mytilus is the one with the anticodon TAT, which and that their transcripts are processed by cleavage has not been found in other animal mtDNAs. between these adjacent As, several alternatives that Initiationand termination of translation: Al- allow ND5 and ND6 tooverlap and share this A also though uncertainty about the exact positions of the exist, such as the initiation of translation from alter- 5’ ends of the COI and ND1 genes makes a reasonable nate start codons on a common, digenic mRNA, al- assignment of their initiation codons difficult,it is less ternative (semidestructive) RNA processing, or initi- so for the remaining proteingenes. Initiation of trans- ation of ND6 translation from a codonfurther down- lation at ATG codons seems clearly indicated for the stream. The present data do not allow us to choose COII,COIII,Cytb,ATPase6,ND2-ND4,ND4Land among these alternatives. (Depending uponthe status ND6 genes. However,ND5 appears to initiate at of a 3’ extension of the COIII gene, discussed below, ATA, because (1) this codon begins 11 ntimmediately it may terminate with a single T.) 3’ of the termination codon for ND4L, (2)this results Protein genes:Those portions of the protein genes in a predicted protein of the same length as in Dro- that were sequenced were translated and their amino sophila, and (3) the first in-frame ATG is 98 codons acid sequences compared with those of other . downstream. Initiation of COI and NDl, discussed in Most are similar in sequence and size to their homo- detail below, could also begin at ATG, although other logs in other species. However, some Mytilus proteins in-frame codons are also available. appear to be longer or shorter than their Drosophila The mitochondrial terminationcodon TAA is counterparts, and uncertainties about initiation sites found at the 3’ ends of the reading frames for COI, of some genesconfound inferences about their COIII, Cyt b, ND3, ND4, ND4L and ND6, and the lengths. alternative termination codonTAG occurs at the ends Based upon alignment of conserved amino acids of the genes for COII,ATPase6 and ND2. Two genes, near its ends, the COI gene of Mytilus is potentially ND1 and ND5, may end with an abbreviated termi- longer at both its 5’ and 3’ ends than it is in Drosoph- nation codon, TA, which is presumably completed by ila (APPENDIX I). If the amino terminus of COI is the polyadenylation of their mRNAs. For ND1 this must same in both species, translation in Mytilus must start be so, because the nucleotide following the 3‘-TA is at a TGG codon, which normally specifies tryptophan. T, which is the first nucleotide of the tRNA””’ gene. The nearest in-frame methionine codon(ATA) occurs The case for ND5 is less clear; the nucleotide follow- nine codons upstream fromthis TGG; we note in this ing the 3”TA is A, which would complete the ND5 regard that the inferred start of COI in nematodes is Mytilus mtDNA Organization 403 also 9-1 0 amino acids upstream from that in Dro- times extensively overlaps the 3’ end of the ATPase6 sophila (OKIMOTO,MACFARLANE and WOLSTENHOLME gene (BIBBet al. 1981). If an ATPase8 genewas once 1990). However, the ORF for COIin Mytilus actually present at this location, it is possible that part of it begins with an ATT codon that is 22 codons upstream became attached to the COI gene after the ATPase8 of the TGG codon just mentioned. ATT, which or- gene became dispensable. dinarily specifies isoleucine, is an inferred initiation The Mytilus COIII gene may also be longer at its codon in Drosophila mtDNA (CLARYand WOLSTEN- 3’ end than that of Drosophila. The first termination HOLME 1985) and also in mouse mtDNA, where any codon (TAA) in this gene is 48 codons beyond the C ATN codon apparently can initiate translation (BIBB terminus of Drosophila (APPENDIX I). In Mytilus, a et al. 1981). In Mytilus, three methionine codons (both TGA (Trp) codon follows the last alignable residue ATA and ATG) are justdownstream from this ATT, (Gly) between Mytilus and Drosophila, the T of which and each could serve as a startsite. As a final compli- could serve as an abbreviated termination signal. In cation,there is evidence that GTT is an initiation this regard, two things are worth noting: One is that codonfor one of theprotein genes in nematodes this region and the unassigned 79 nt following it in (OKIMOTO,MACFARLANE and WOLSTENHOLME1990), Mytilus have the potential to fold into stable stem- and in Mytilus there is an in-frame GTT three codons loop structures (Table 1) and thusmay be eliminated upstream of the TGG codon that corresponds to the during RNA processing. The other is that the COIII position of the start codon for COIin Drosophila. gene in Lasaea, another bivalve, lacks this 3’ extension Given the above uncertainties, we are unwilling to (D. O’FOIGHILand M. SMITH,personal communica- assign this sequence to the Mytilus COI gene. How- tion), as do all of the other COIII proteins in the ever, we are also unable to suggest any alternative databases. That the 3‘ ends of other COIII proteins assignment for it.The sequence is adjacent to the 1 19- are highly conserved in length makes it seem unlikely nt unassigned sequence that follows thetRNAg& that the 3’ extension in Mytilus is actually translated. gene, and it may simply be a part of that sequence In contrast to the ambiguities just described, the that, purely by chance, is open and in-frame with the NDl gene of Mytilus is almost certainly truncated at following COI gene. It is possible to form a stable its 3’ end. Althoughoverlap with the downstream RNA secondary structure that includes both the 119- tRNAVa’gene cannot be ruled out absolutely, the nt unassigned sequence and all of the 5‘ extension of dinucleotide TA, which could serve as an abbreviated COI (Table 1). We note that it includes all but the stop codon for NDl, immediately precedes the first last nucleotidebefore the firstmethionine codon nucleotide of the tRNA””’ gene. Moreover, even if (ATA) upstream of the startof the Drosophila protein overlap occurred, an in-frame termination (TAA)co- in the COI reading frame, coincidentwith the length don is only four codons downstream fromthe start of of COI in nematodes. This structure or another like the tRNAVa’ gene.This 3’ truncation of ND1 results it might serve as an RNA processing signal and thus in an inferred protein that is 20 amino acids shorter result in the elimination of this region before transla- than its Drosophila counterpart. tion. The converse may be true at the 5’ end of ND1, The COI gene also appears to be longer at its 3’ where a substantial addition, as compared with Dro- end in Mytilus than in Drosophila (APPENDIX I). The sophila, is possible. If the two NDl proteinshad reading frame remains open to a termination (TAA) identical lengths at their N-terminal ends, translation codon that is located 17 codons beyond the 3’ end in in Mytilus would have to initiate with GTG (APPENDIX Drosophila. There are 9 nt between this TAA and I). GTG is occasionally used as an initiation codon in the initiation codon of the ATPase6 gene. It may be bacterial systems (STORMO,SCHNEIDER and GOLD worthnoting that the second codon following the 1982), and a similar use in animal mtDNA has been asparagine at the end of the COI alignment between suggested (CLARYand WOLSTENHOLME1985; JACOBS the Mytilus and Drosophila is TAT, part of which et al. 1988; GADALETAet al. 1989;JOHANSEN, GUDDAL could serve as an abbreviated terminationsignal, but and JOHANSEN 1990; DESJARDINSand MORAIS 1990, there is no obvious reason to favor such an inference 1991). However, if GTG can serve as a start signal in over one in which COI is 16 amino acids longer at its Mytilus, theNDl reading frame is open back to C terminus in Mytilus than in Drosophila. Moreover, another GTG codon, which is the first codon down- no candidate stem-loop processing signals are pre- stream of the tRNAf6, gene. In this context, GADA- dicted to be formedin this region. LETA et al. (1 989)have observed that GTG appears to The apparent 3‘ extension of COI may be related serve as an initiation codon only when it is directly to the absence of anATPase8 gene inMytilus preceded by the last nucleotide of the preceding gene. mtDNA. As noted, the start of ATPase6 follows the In Mytilus, this GTG is located at a position that is 54 termination (TAA) codonof COI by 9 nt. In all other codons upstream of the inferred 5‘ end of the Dro- metazoans, theATPase8 gene precedes and some- sophila ND1 gene. 404 R.J. Hoffmann, J. L. Boore and W. M. Brown However, RNA sequences in this 54-codon region ual alignments ended were averaged to estimate the have the potential to form a stable secondarystructure approximate 3' end of the gene. By this procedure, (Table 1) that includes all of the nucleotides before the average position of the endof similarity was found the first GTG mentioned, and this could result in its to lie at position 3465 of Segment #1 (APPENDIXI). complete or partial elimination from the final proc- From this, we estimate the size of the Mytilus 1-rRNA essed transcript. However, the predicted structure is gene to be approximately 1245 nt,which, if accurate, not significantly more stable than random. There are makes it the smallest on record (Drosophila's is 1326 one ATG, two ATTs, two GTTs and another GTG nt). However, the largest (1.2 kb) unassigned region in this region, any of which could serve as start sites is adjacent to the 1-rRNA gene in Mytilus, and it is if a portion of the region were eliminated.Also, there possible that 100-400 nt of this region are part of the are three TTGs among the 54 codons, which might mt-l-rRNA gene. If so, however, sequence similarity function as start codons in Mytilus as well as in nem- between Mytilus and other species must be weak. atodes (OKIMOTO,MACFARLANE and WOLSTENHOLME TransferRNA genes: The 23inferred Mytilus 1990). Althoughwe are reluctant toassign this region tRNA genes can be folded into the typical cloverleaf to the ND1 gene, we have been unable to assign an secondary structures shown in Figure 3. Structurally, alternative function to this DNA sequence. these genes are similar to most other metazoan mt- The remaining differences in lengths of gene prod- tRNAs. Mismatches are present in a few stems, G-T ucts between Drosophila and Mytilus are minor. The pairs (= G-U in RNA) are common in stems, and all most notable extension among these is at the 5' end genes lack a 3' terminal CCA. The aminoacyl-accept- of the Cyt 6 gene, which is somewhat longer than ing stem is usually of seven nt pairs, although there is other Cyt b genes. Thereare no alternativestart the potential for 8 to 10 pairs in a few genes (those codons in the Cyt b reading frame, however. In addi- for Arg, Leu(CUN), Leu(UUR), Lys, and Ser(AGN); tion to truncations or extensions at gene termini, there extra nucleotides not shown in Figure 3), and a few are a few cases of internal sequence deletions/addi- also have aminoacyl stems with mismatched termini tions. The longest of these, equivalent to 21 codons, (those forCys, Lys, Ser(UCN) andVal; Figure 3). The is near the middle of the ND6 gene. dihydrouridine (DHU) arm consists of a stem of 3 or In conclusion, we have found evidence of relative 4 paired and a loop of three to 9 unpaired nt. The size changes in several Mytilus genes, among which T#C arm has a stem of 3-5 paired and a loop of 3-7 those involving internal deletions or additions are less unpaired nt. The variable arm is of either 4 or 5 nt. ambiguous than those suggesting truncations or ad- The anticodon stem is of 5 nt pairs, with a few genes ditions to the ends of the genes. As a caveat, it must having one of these pairs mismatched. The anticodon be remembered that all such assignments depend on loop is of 7 nt, with the possibility for pairing across the validity of inferences drawn by comparative se- the loop in some cases. Anticodons are preceded by quence analysis and that any final conclusions about T, except in the gene for tRNAy&, which has a C at gene lengths must await independent confirmation by this position. Anticodons are followed by either A or direct analysis of the gene productsthemselves. G with approximately equal likelihood. Ribosomal RNA genes: The region between the There are two leucine tRNA genes (tRNA&N and genes for tRNAPhe and tRNAG'y shows extensive se- tRNA&,), as in other animal mtDNAs. These are quence similarity to other metazoan mitochondrial s- directly adjacent in Mytilus, in a clusterof tRNA genes rRNA genes. The Mytilus s-rRNAgene is 945 nt, that follows the COII gene. There arealso two serine similar in size to mammalian mtDNAs and consider- tRNAgenes, tRNA?& and tRNA%. As in other ably larger than in Xenopus (8 19 nt;ROE et al. 1985), metazoan mtDNAs, tRNAF&Nhas an unpaired loop echinoderms (ca. 880 nt; JACOBS et al. 1988; CANTA- that replaces the DHU arm.In Mytilus, this loop is of TORE et al. 1989; DE GIORCIet al. 199 l), orDrosophila 13 nt, the longest so far reported. There is potential (ca. 790 nt; CLARY andWOLSTENHOLME 1985). for some nucleotide pairing across the loop and also The boundaries of the 1-rRNA gene are less certain, between its 3' end and the variable arm. Aside from especially at the 3' end where there is no directly the novel tRNAMet gene, the anticodons are thesame adjacent tRNA or protein gene. The computer pro- as in Drosophila mtDNA, except that the anticodon gram Gap was used to compare the final 300 nt at the for tRNALys is TTT in Mytilus, as it is in chordate 3' ends of six different 1-rRNA genes (those of Dro- and nematode mtDNAs. sophila, Xenopus, Paracentrotus, mouse, human, and A second tRNA gene specifying methionine? In cow) against the region in Mytilus mtDNA bounded all previously characterized metazoan mtDNAs there by positions 2222 and 3800 of Segment #1 in APPEN- is a single tRNA"' genethat presumably can be DIX I. This region includes the entire 1-rRNA gene, charged with either N-formyl-methionine and used plus approximately 300 nt of the downstream unas- for translation initiation, or with methionine and used signed sequence. The nt positions where the individ- at internal codons. In mouse, its anticodon (CAT) is Mytilus mtDNA Organization 405

A R G ININ E ASPARAGINEALANINE ARGININE ASPARTIC ACID CYSTEINE

U 1 1 1 A-1 Gl Gl 1-u 11 G-C 1-A G-C 11 6-C a-r GT C-G A-1 C-G u-T C-G C-G A-1 1-u 1-A ,-I A-1 1-1 1-u G-C 1-u A- A-1 A-1 Gl U 1-A CU I\- T 1 G-C 1 G-c.~ 1 1 TlCG G 1 UCUC A 1 CGTl A 1 GGUA C 1 CAGU 1 A a Ill1 A A lI/G uu u Ill A A /Ill A 11 A IIII u 111c AUGC 1 CCGG 1GGG A C 111G GCAG U 1 AlTGCCll 1 G UllG GlCl G 1 Ill1 TCG Ill G 1 1 Ill1 TiA 1111 111 Ill1 AU U UUUG 1 GGCG A 1 uuuc 1 TAUC 1 T TAAC 1 U UI A 1 UG A G U U TG AA 1c A-T U 1-u u 1- uu 1-U A UT A-1 A 1-1 1-A 1-u u-1 1-1 1-1 1-u G-C G-C G-C Gl G-C A-1 1-1 A-1 G-C A-1 Gl 1-1 1-1 11 1c cu 1G CU TU 1G TU 1A TA T GC TCG Gll GTC GCU

GLUTAMIC ACID GLUTAMINE GLYCINE HISTIDINE ISOLEUCINE

A 1 A GT 1-A G-C 1-1 A-1 C-G A-l C-G u-1 6-C 1-A 1-u u- 1 1-1 T -I\ 1-A I-1 T-u A- 1 G-C A-1 u- 1 u-T A-1 C-G u-1 A-l TG i-G G-C G 6-C GA 1-1 U u-l G 1 1Cll u U TlllC A T GUllU C 111lA U A IIII c 1 U 1/11 cu 4 lllll~ GG G IIII/T G UGUU111G G G 1AlG GUAAG U U AUlG ClUAT U GGGT UAUAl G Ill1 cu Ill 4 11 Ill c A 1 Ill c A G UAUC 1 1 GlUC G 1 11Ul 1 G TCCU 1 U UA G GI U UA U G AC 1-1 G 1-4 u 1-A G 1 GG rl- 1 A- 1 1-u 1-u I-1 G-C 1 -u G-C G-C G-C G-C G-C C-G uc C-G GG 1-1 Cl cu TG TU 1A 1u 1A TG 1G 11C TTG TCC GlG GAl

LEUCINE ICUN) LEUCINE (UUR) LYSINE METHIONINE IAUAI METHIONINE IAUG)

1 1 1 1 1 -a C-G 11 Gl G-C G -c A-1 A-1 6-C Gl U -1 C-G A-1 1-A 1-A U -1 u-T G-C u-1 u-1 U -1 u-1 C-G A-1 u-1 T-" d-T 1-1 G -c GT ,I ,. . C -G 1 TU c--G 1-1 CG G-C 1 G- c uu 1 11CAT C 1 11Cl 1 A GAACG 1 1 CAAl A CG Gl 1 AG G Ill I/ AG G 1/11 T A 1111 A A* A 1111 AU G II I1 G G UCG AUG 11 C C ACG AUGU A G 11CG CllGl A 1 TClGGllU A 1 1ClG GCCA c 1 Ill T T A Ill 1 1c A 1/11 U A1 G 111 11 /I c CG 1 TGC A 1 1GC U A AAGC A A GGUC G 1 GGAl G UU G 1 AI A G Gl U AU 1A A A c 11 T CT 1-A G G-C A 1-A G 1 -4 A-1 T-U A-T A-1 1 -u A-T C-G A-1 G-C G -c 6-C A-l Gl G-C A -1 1-1 A-1 4-1 G-C 1 U 1c CA CA cc 1 G TG 1A CG 1G UG TAA 111 TU1 CAT

PHENYLALANINE PROLINE SERINE IAGNI SERINE WCNl THREONINE

A U c 6-C C-G A-1 UG G-C u-1 T-u u-1 TG C -G TG u-T 6-C 1-A 1-0. u-1 G-C G-C 1-A 1-u c -G u-T I-, G-C T-u G 1 11 1- I." A-1 G 1 U 1-1 1 G 1 UA G T GA 1-1 c 1 CCA 1G 1 1 1GUG A 1UAl1 TTACC G A 1GG U A TlG U AU Ill I1 GI\ 111 1 Ill11 1 A 1 11 G 1 4 Ill 1 A 11CG GGluc 1 U 1TlG ACT1 1 U UUlGG 1 C Clll ACT 1 G 11CG AAC A T Ill1 T G Ill Cl r G 11.1 C AA c Ill1 UT 1 UUGC 4 c UUUl U C 6AGA G 1 UUGC A U G UG U AU c Gl G AU 1 GI 1-A G T-A A A-1 G C -G 1-u GT C-G 1-4 GT 1G G-C C-G 6-C 1G Gl u-1 C-G 1-u C-G 1A 1A CG cu 1A 1G 1G 1u GUU TGG TGU 1GT

TRYPTOPHAN TYROSINE VALINE

G A 1 u-l C-G Ti u-1 u-1 G-C G-C 1 -A 1-A A-1 1-1 u-1 G- c G -c G-C 1 A- 1 U U -1 1 T TCCUT A T 11CCG U 1 GlClG G /IIll AU A IIII CG AUGGC 1 UTITA CUGUl I c G c Ill 1 A Gl A CGTU1 G G A A U G ,.L, .. T G TU A -1 1 1-A 1- A 1 -4 I, I, G- c 1 -u 1-u A- 1 c -G G-C G- C U -1 TU C U 1 C TU T U 1 U 1 C A GlU TAC GlU 1CA FIGURE3.--Sequences of 23 tRNA genes from M. edulis folded into cloverleaf forms expected for their gene products. Amino acid identities are given above each sequence. Watson-Crick pairing is indicated by solid lines; G-T pairs are shown with dots. 406 R. J. Hoffmann,J. L. Boore and W. M. Brown inferred to recognize any initial ATN codon (BIBBet clear how or whether its function would differ from al. 1981). In Mytilus mtDNA, a tRNAMet genewith a that of tRNA!&. Both ATA and ATG codons are CAT anticodon is present and located in a cluster of present at internal positions in Mytilus protein genes, four tRNA genes, between tRNALys and tRNAgN and presumably each would be most efficiently rec- (Figure 1, APPENDIX I). ognized by its cognate tRNA. Similarly, a differential Unlike other metazoan mtDNAs, that ofMytilus function of the two tRNAs in initiation seems unlikely, appears to encode a second tRNAMet gene. Located since both ATGand ATA appear to be used as immediately 5’ of the ND2 gene, it has the anticodon initiation codons in Mytilus. TAT (Figures 1 and 3), which presumably recognizes Genetic codeand codon usage: The mitochondrial ATA codons most efficiently. ATA is a normal Met genetic codes of Mytilus and Drosophila appear to be codon only in mitochondria (PEEBLES,GEGENHEIMER identical, with TGA specifying tryptophan, ATA me- and ABELSON1983); it specifies isoleucine in the cy- thionine, and AGA and AGG serine. With the possible toplasm. However, it is unlikely that ATA specifies exception of the NDl gene, for which an initiation isoleucine inMytilus mtDNA, since both ATA and codon cannot presently be assigned with certainty, it ATG codons are employed at conserved methionine appears that ATT is not used as an initiation codon positions of mitochondrial proteinsin both Drosophila inMytilus. Although in vertebratemtDNAs, AGA (CLARY andWOLSTENHOLME 1985) andMytilus. and AGG are unused and may be termination codons The sequence and predicted structure of this novel (e.g., see BIBBet al. 1981), both are used in Mytilus tRNA are unexceptional; a C precedes the anticodon, and Drosophila mtDNAs, and the positional similarity but this is also the case in various mt-tRNA genes of of this usage argues that AGA and AGG are serine other metazoans. Although the conclusion that this codons in both cases. gene is transcribed and that its product is functional A summary of usage for the 2824 codonssequenced must await experimental verification, there is no se- (Table 2) indicates no pronounced codon bias in My- quence-based reason to doubt that this is true. We tilus. No unused codons occur in any codon family, thereforerefer to this gene as to distinguish and no third position bias is evident. C, G, A and T it from the second tRNAMet gene in Mytilus, which occur in thethird position in 13.3, 23.6, 28.0 and we call tRNA!&. Sequence comparison suggests that 35.1 % of the codons, respectively, and these amounts the two Mytilus tRNAM“ genes arose by duplication hardly differ from those(14.0, 23.8,28.0 and 34.2%) and subsequent divergence.At their 5’ ends thegenes in the 13.9-kb portion of Mytilus mtDNA sequenced. are identical at 15 of the first 16 positions and, if the Excluding termination codons, thoseused 10 or fewer genes are aligned by regions (excluding the T$C loop, times are ACG (Thr), TCC(Ser), CGC (Arg) andCCC which cannot be aligned),only 23 positions are differ- (Pro). ent between them. Gene duplicationhas been thought to be involved in high rearrangement rates for mt- CONCLUSIONS tRNA genes (e.g., JACOBS et al. 1988; PAABo et al. The size, sequence, cleavage map data and, as in- 199 1). ferred from them, the structure and genetic organi- A tRNA!& gene is not only unusual among animal zation of MytilusmtDNA furnishyet another example mtDNAs, but among all organisms. The only other of boththe unity and diversity thatappear to be tRNA with this anticodon (UAU) listed in SPRINZLet characteristic of metazoan mtDNAs. With respect to al.’s (1 99 1) compilation of all known tRNAs occursin unity: (1) The gene complement is identical to that of yeast, where it specifies isoleucine. It is possible that a other metazoans, exceptfor the presence of two few directly sequenced tRNAshave UAU anticodons, tRNAMetgenes (even the absence of ATPase8 has a but in all cases examined the nucleotide in the anti- precedent in nematodes). (2) The genome organiza- codon wobble position is modified and of uncertain tion is very compact; no introns were found and, with derivation. For example, it is either U or C in Halo- the exception of a 1.2-kb noncoding region and four bacterium volcanii (GUPTA1984). Other tRNAs with other unassigned regions of 79-1 19 nt, intergenic NAU anticodons include those from phageT4 (GUTH- sequences are minimal or absent. (3) The tRNAgenes RIE and MCCLAIN1979), Escherichia coli (HARADA and exhibitthe relaxed sequence andstructural con- NISHIMURA 1974),spinach chloroplasts (FRANCISand straints that have become a hallmark of metazoan mt- DUDOCK1982; KASHDAN and DUDOCK1982), and tRNAs. potato mitochondria (WEBERet al. 1991). However, With respect to diversity, the genetic organization the sequences of the tRNA genes in the latter two of Mytilus mtDNA furnishes another example of the cases show that CAT is the anticodon and, thus, rule nonconservation of mitochondrial gene order among out U as the original nucleotide at the modified wob- metazoans. Based on the assumption that coelomate ble position in those cases. metazoans are monophyletic, we anticipated that the Even if tRNA!fitAis expressed in Mytilus, it is un- Mytilus gene order would differ, but still be a recog- MythrntDNA Organization 407

TABLE 2 Codon usage in sequenced portionsof 12 protein genes fromM. edulis mitochondrial DNA

Amino Amino Amino Amino acid Codon N acid Codon N acid Codon N acid Codon N

Phe TTT 168 Ser 58 TCT TYr TAT 81 CYS 43 TGT TTC 52 TTC 9 TCC 40 TAC 25 TGC Leu 135 TTA 18 TCA 7 TAA Term TGA 35 TTG 85 TTG 20 TCG TAG 3 Trp TGG 41

Leu52 CTT Pro CCT 43 His 29 CAT '4%25 CGT CTC 13 CTC ccc 10 20 CAC CGC 10 CTA 62 CTA CCA 21 Gln CAA 33 CGA 20 CTG 38 CTG CCG 20 CAG 14 CGG 12

Ile ATT 46 122 ACT Thr Asn 59 AAT Ser48 AGT ATC 37 ATC ACC 12 30 AAC AGC 21 Met 102 ATA ACA 26 LYS AAA 54 AGA 69 ATG 65 ATG ACG 8 AAG 34 AGG 58

Val 79 GTT Ala 55 GCT Asp 33 GAT GlY 49 GGT GTC 27 GTC GCC 18 GAC 22 GGC 29 GTA 93 GTA GCA 42 Glu 36 GAA GGA 39 GTG 96 GCG 19 GAG 51 GGG 103

nizable variant on the relatively conserved arrange- apparently as a result of occasional biparental inher- ments seen among other coelomates (e.g., chordates, itance of mtDNA (FISHERand SKIBINSKI1990; HOEH, echinoderms and arthropods). Further,based on pres- BLAKLEYand BROWN1991). Heteroplasmy may pro- ent phylogenetic hypotheses, it was also reasonable to vide an opportunity forrecombination among differ- expect that traces of a shared ancestry might be pre- ent mtDNAs that is usually absent. HOEH, BLAKLEY served most strongly between mollusk and arthropod and BROWN(1991) have presented cleavage maps mtDNAs. Neither of these expectations is met. The from two Mytilus mtDNAs that were found in heter- degree towhich the arrangement in Mytilus is unique oplasmy. The map for one of the mtDNAs is identical was not only unexpected,but, given thecurrently to that for the genome characterized in the present accepted placement of mollusks with other coelomate study, and it is, therefore, most probably also identical metazoans, is much more surprising than the highly in geneorder. However, themap for the other divergent arrangements that have been foundin pseu- mtDNA differs extensively from this map, and that docoelomate (e.g., nematode)and acoelomate (e.g., mtDNA is also longer by about 1.6 kb. Further, while flatworm and sea anemone) metazoans. Understand- the two mtDNAs share many cleavage sites mapped ing the evolutionary derivation of the Mytilus gene in the section of the genome containing ND3, COI, arrangement, forexample by the methodof SANKOFF, ATPase6,ND4L, ND5 and the two rRNA genes CEDERGREN andABEL (1990), will require informa- (Figure l), they share essentially no sites in the re- tion about other metazoan groups. mainder of the genome, raising the possibility that Previous mechanistic speculations about thereasons multiple gene arrangements might be found among for conservation of mitochondrial gene organization individual mussels. have included a suggestion that arrangementof genes, Little is known about themtDNA of other bivalves. particularly rRNA and tRNA genes, is important for S. MCCAFFERTY(personal communication) has enzy- regulation of gene expression (e.g., OJALA,MONTOYA matically amplified and sequenceda region of the and ATTARDI198 1 ; MONTOYA,GAINES and ATTARDI mtDNA from four different species of Mytilus that 1983). The plethora of unique and radically different spans the ND4 and COIII genes, and he found that arrangements, among which that of Mytilus is but the these two genes have the same relative orderand latest example, argue strongly against such a conclu- polarity in all four species. The ND4 and COIIIgenes sion and, instead, suggest that metazoan mitochon- are in a region that appears,by cleavage map compar- drial gene order is a relatively neutral characteristic isons (HOEH, BLAKLEYand BROWN1991), to be with regard to selection. greatly dissimilar among individuals, thus suggesting Although the reasons for the scrambled geneorder thatthe dissimilarity may not necessarily reflect a in Mytilus mtDNA are unknown, it may be germane dissimilar gene arrangement. D. O'FOIGHILand M. that mtDNA heteroplasmy is widespread in this , SMITH(personal communication) have data from an- 408 R. J. Hoffmann, J. L. Boore and W. M. Brown other bivalve (genus Lasaea) indicating that, as in MARGO,1988 Genome organization ofArtemia mitochondrial Mytilus, the ND2 gene is downstream from the COIII DNA. Nucleic Acids Res. 16: 6515-6529. BIBB, M. J., R. A. VAN ETTEN, C. T. WRIGHT,M. W. WALBERG gene, but with one unidentified tRNA gene interven- and D. A. CLAYTON,1981 Sequence and gene organization ing. In Mytilus these genes are separated by a 79 nt of mouse mitochondrial DNA. Cell 26: 167-1 80. noncoding region and thetRNAyfitA gene. Their data BROWN, W. M., 1985 The mitochondrial genome of animals, pp. also suggest that the ND4 gene is not upstream of the 95-130 in MolecularEvolutionary Genetics, edited by R. J. COIII gene as it is in Mytilus. In theJapanese scallop, MACINTYRE. Plenum Press, New York. BROWN, W. M.,M. GEORGE, JR., and A. C.WILSON, 1979 Rapid Patinopecten yessoensis, the COIII gene precedes and evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. is proximal to the ATPase6 gene, but it is not yet USA 76 1967-1971. known whether an ATPase8 gene is located between CANTATORE, P.,M. ROBERTI, G. RAINALDI,M. N. GADALETAand them (E. BOULDINGand A. BECKENBACH,personal C. SACCONE, 1989 The complete nucleotide sequence, gene communication). COIII and ATPase6 thus have dif- order, and genetic code of the mitochondrial genome ofPar- acentrotus liuidus. J. Biol. Chem. 264: 10965-10975. ferent relative positions in Patinopecten than in My- CLARY,D. O., and D. R. WOLSTENHOLME,1985 The mitochon- tilus. If confirmed, these results would indicate that drial DNA molecule ofDrosophila yakuba:nucleotide sequence, bivalves are nonidentical in their mtDNA organization gene organization, and genetic code. J. Mol. Evol. 22: 252- and that the great dissimilarity of Mytilus mitochon- 271. drial gene order from those of other metazoans may CLAYTON,D. A,, 1982 Replicationof mammalian mitochondrial DNA. Cell 28: 693-705. stemfrom an accelerated rate of intragenomic re- CLAYTON,D. A.,1984 Transcription of the mammalian mito- arrangement as well as from phylogenetic distance. chondrial genome. Annu. Rev. Biochem. 53: 573-594. These results emphasize theneed for a much DE GIORGI, C., C.LANAVE, M. D. MUSCI andC. SACCONE, broader and more detailed survey of mtDNA gene 1991Mitochondrial DNA in the sea urchin Arbacialixula: order. Without data for other protostome mtDNAs, evolutionaryinferences from nucleotide sequence analysis. Mol. Biol. Evol. 8: 5 15-529. including data for other phyla as well as for other DESJARDINS,P., and R. MORAIS,1990 Sequence and gene orga- molluscan classes, it will be impossible to determine nization of the chicken mitochondrial genome: a novel gene whether Mytilus represents an aberration or whether order in higher vertebrates. J. Mol. Biol. 212: 599-634. some protostome groups will simply turn out to be DESJARDINS,P., and R. MORAIS,1991 Nucleotide sequence and much less conservative in mtDNA gene order than evolution of coding and noncoding regions of a quail mito- chondrial genome. J. Mol. Evol. 32: 153-161. deuterostome groups and arthropods appear to be. It DEVEREUX, P.J., HAEBERLI and0. SMITHIES,1984 A comprehen- is important in this regard that representatives of two sive set of sequence analysis programs for the VAX. Nucleic other molluscan classes (a chiton, Katharina tunicata, Acids Res. 12: 387-395. and a gastropod, Plicopurpura sp.) have gene arrange- DUBIN, D. T., C.-C.HSUCHEN andL. E. TILLOTSON, mentsthat aredifferent from that ofMytilus 1986 Mosquito mitochondrial tRNAs for valine, glycine and u. glutamate: RNA and gene sequences and vicinal genome or- BOORE,T. COLLINSand W. BROWN, unpublished ganization. Curr. Genet. 10701-707. data), demonstrating that gene order is not a con- FISHER,C., and D. 0. F. SKIBINSKI,1990 Sex-biasedmitochon- served feature amongall molluscan classes, at least as drial DNA heteroplasmy in the marine mussel Mytilus. Proc. they are presently constituted. R. SOC. Lond. B242: 149-1 56. FORAN, D. R.,J. E. HIXSONand W. M. BROWN,1988 Comparisons of ape and human sequences that regulate mitochondrial DNA This work was supported by grants from the National Science transcription and D-loop DNA synthesis. Nucleic Acids Res. 1;oundation (BSR-8904633and BSR-8916466, to R.J.H.; BSR- 16: 5841-5861. 8613902 and BSR-8517830, to W.M.B.), and the National Insti- FRANCIS,M. A., and B. DUDOCK, 1982 Nucleotide sequence of a tutes of Health (GM30144, to W.M.B.). We thank LYNNE SZURA spinachchloroplast isoleucine tRNA. J. Biol. Chem. 257: and CAROL MANTHEYfor expert assistance and DAVIDSKIBINSKI 11 195-1 1198. for providing cloned Mytilus mtDNA. The DNA sequences pre- GADALETA, G., G. PEPE,DE CANDIA,G. C. QUAGLIARIELLO,E. SBISA sented in this report have been deposited in GenBank (accession and C. SACCONE,1989 The complete nucleotide sequence of numbers M83756, M83757, M83758, M83759, M83760, M83761 the Rattus norvegzcus mitochondrialgenome: cryptic signals and M83762,corresponding to Segments 1-7, respectively, of revealed by comparative analysis between vertebrates. J. Mol. APPENDIX I). EvoI. 28: 497-5 16. CAREY,J. R., and D. R. WOLSTENHOLME,1989 Platyhelminth LITERATURECITED mitochondrial DNA: evidence for early evolutionary origin of a tRNA"'AGN that contains a dihydrouridine arm replacement ANDERSON,S., A. T. BANKIER,B. G. BARRELL,M. L. H. DE BRUIJN, loop, and of serine-specifying AGA and AGG codons. J. Mol. A. R. COULSON,J. DROUIN,I. C. EPERON, D.P. NIERLICH,B. EvoI. 28: 374-387. A. ROE, F. SANGER, P. H. SCHREIER,A. J. H. SMITH,R. STADEN GUPTA,R., 1984 Halobacteriumvolcanii tRNAs: identification of and 1. G. YOUNG, 1981Sequence and organization of the 41 tRNAs covering all amino acids, and the sequences of 33 human mitochondrial genome. Nature 290: 457-465. class I tRNAs. J.Biol. Chem. 259: 9461-9471. ANDERSON,S., M. H. L. DE BRUIJN, A. R.COULSON, 1. C. EPERON, GUTHRIE, C., and W.H. MCCLAIN,1979 Rare transfer ribonucleic F. SANGERand I. G. YOUNG, 1982 Complete sequence of acid essential for phage growth. Nucleotide sequence compar- bovine mitochondrial DNA: conserved features of the mam- ison of normal and mutant T4 isoleucine-accepting transfer malian mitochondrial genome. J. Mol. Biol. 156 683-717. ribonucleic acid. Biochemistry 18: 3786-3795. BATUECAS,B., R. GARRESSE,M. CALLEJA,J. R. VALVERDE andR. HARADA,F., and S. NISHIMURA, 1974 Purification and character- Mytilus mtDNA Organization 409

izationof AUA specific isoleucinetransfer ribonucleic acid POINAR,G. O., JR., 1983 The Natural History of Nematodes. Pren- from Escherichia coli B. Biochemistry 13: 300-307. tice-Hall, Englewood Cliffs, N.J. HARRISON, R.G., 1989 Animal mitochondrial DNA as a genetic ROE, B. A., D.-P. MA, R. K. WILsoNandJ. F.-H. WONG,1985 The marker in population and evolutionary biology. Trends Ecol. complete nucleotide sequence of theXenopus laeuis mitochon- Evol. 4: 6-1 1. drial genome. J. Biol. Chem. 260 9759-9774. HIMENO,H., H. MASAKI,T. KAWAI,T. OHTA,I. KUMAGAI,K. SANKOFF, D., R. CEDERGRENand Y. ABEL, 1990 Genomic diver- MIURAand K. WATANABE,1987 Unusual genetic codes and gence through gene rearrangement. Methods Enzymol. 183: a novel gene structure for tRNA% in starfish mitochondrial 426-438. DNA. Gene 56: 219-230. SEDEROFF,R. R., 1984 Structuralvariation in mitochondrial HOEH,W.R., K. BLAKLEYH. and W. M. BROWN, DNA. Adv. Genet. 22: 1-108. 1991 Heteroplasmy suggests limited biparental inheritance of SKIBINSKI,D. 0. F., M. AHMAD andA.J. BEARDMORE, Mytilus mitochondrial DNA. Science 251: 1488-1490. 1978 Geneticevidence for naturallyoccurring hybrids be- tween Mytilus edulis and Mytilus galloprovincialis. Evolution 32: HSUCHEN,C.-C., R. M. KOTINand D. T. DUBIN,1984 Sequences 354-364. of thecoding and flanking regions of the large ribosomal SKIBINSKI,D. 0. F., and C. A. EDWARDS,1987 Mitochondrial subunit RNA gene of mosquito mitochondria. Nucleic Acids DNAvariation in marine mussels (Mytilus), pp. 209-226 in Res. 12: 7771-7785. Proceedings of the World Symposium on Selection, Hybridization, JACOBS, H. T., D. J. ELLIOTT,V. B. MATH and A. FARQUARSON, and GeneticEngineering in Aquaculture, Vol. I. Heeneman, 1988 Nucleotide sequence and gene organization of sea ur- Berlin. chin mitochondrial DNA. J. Mol. Biol. 202: 185-217. SMITH, M. J., D. K. BANFIELD,K. DOTEVAL,S. GORSKIand D. J. .JOHANSEN, J., P. H. GUDDALand T. JOHANSEN, 1990 Organization KOWBEL,1989 Gene arrangement insea star mitochondrial of the mitochondrial genome of Atlantic cod, Gadus morhua. DNA demonstrates a major inversion event during echinoderm Nucleic Acids Res. 18: 41 1-419. evolution. Gene 76 181-185. KASHDAN,M. A., and B. DUDOCK,1982 Thegene for spinach SMITH,M. J., D. K. BANFIELD,K. DOTEVAL,S. GORSKIand D. J. chloroplast isoleucine tRNA has a methionine anticodon. J. KOWBEL,1990 Nucleotidesequence of nine protein-coding Biol. Chem. 257: 11191-1 1194. genes and22 tRNAs in the mitochondrial DNAof the sea star KYTE,J., and R. F. DOOLITTLE,1982 A simple method for dis- Pisaster ochraceus. J. Mol. Evol. 31: 195-204. playing the hydropathic character of a protein. J. Mol. Biol. SNYDER, M., A. R. FRASER,J. LAROCHE,K. E. GARTNER-KEPKAY 157: 333-348. and E. ZOUROS,1987 Atypical mitochondrial DNA from the MCCRACKEN,A,, I. UHLENBUSCH G.and GELLISSEN, deep-sea scallopPlacopecten magellanicus. Proc. Natl. Acad.Sci. 1987 Structureof the cloned Locustamigratoria mitochon- USA 84: 7595-7599. drial genome: restriction mapping and sequence of its ND-I SPRINZL,M., N. DANK,S. NOCKand A. SCHON,1991 Compilation (URF-I) gene. Curr. Genet. 11: 625-630. of tRNA sequences and sequences of tRNA genes. Nucleic MONTOYA, J., G.L. GAINES and G.ATTARDI, 1983 The pattern Acids Res. 192127-2171. of transcription of the human mitochondrial rRNA genes re- STORMO, G. D., T. D. SCHNEIDERand L. M. GOLD, veals two overlapping transcription units. Cell 34: 151-159. 1982 Characterization of translation initiation sites in E. coli. MORITZ, C.,and W. M. BROWN,1986 Tandem duplication of D- Nucleic Acids Res. 10: 2971-2996. loopand ribosomal RNA sequences in lizard mitochondrial UHLENBUSCH,I., A. MCCRACKENand G. GELLISSEN,1987 The DNA. Science 233: 1425-1427. genefor the large (16s) ribosomalRNA from the Locusta MORITZ,C., andW. M. BROWN,1987 Tandemduplications in migratoria mitochondrial genome. Curr. Genet. 11: 621-638. animal mitochondrial DNAs: variation in incidence and gene WARRIOR,R., and J. GALL, 1985 The mitochondrialDNA of Hydra attenuata and Hydra littoralis consists of two linear mol- content among lizards. Proc. Natl. Acad. Sci. USA 84: 7183- ecules. Arch. Sci. 38: 439-445. 7 187. WEBER,F., A. DIETRICH, J.-H. WEILand L. MAR~CHAL-DROUARD, MORITZ, C., T. E. DOWLINGand W. M. BROWN,1987 Evolution 1990 Apotato mitochondrial isoleucine tRNA is coded for of animal mitochondrial DNA: relevance for population biol- by a mitochondrial gene possessing a methionine codon. Nu- ogy and systematics. Annu. Rev. Ecol. Syst. 18: 269-292. cleic Acids Res. 18: 5027-5030. OJALA, D., J. MONTOYAandG. ATTARDI,1981’ tRNA punctuation WOLSTENHOLME,D. R., J. L. MACFARLANE,R. OKIMOTO, D. 0. modelof RNA processing in human mitochondria. Nature CLARYand J. A. WAHLEITHNER,1987 BizarretRNAs in- 290 470-474. ferredfrom DNA sequences of mitochondrial genomes of OKIMOTO, R., J. L. MACFARLANEand D.R. WOLSTENHOLME, nematode worms. Proc. Natl. Acad. Sci. USA 84 1324-1328. 1990 Evidence for the frequentuse of TTG as the translation WONG,T. W., and D. A.CLAYTON, 1985 Invitro replication of initiation codon of mitochondrial protein genes in the nema- human mitochondrial DNA: accurate initiation at the origin of todes, Ascaris suum and Caenorhabditis elegans. Nucleic Acids light-strand synthesis. Cell 42: 951-958. Res. 18:6113-6118. ZEVERING,C. E., C. MORITZ, A.HEIDEMAN and R. A. STURM, OKIMOTO, R., and D. R. WOLSTENHOLME,1990 A set of tRNAs, 1991 Parallelorigins of duplications andthe formation of that lack eitherthe TGC arm or thedihydrouridine arm: pseudogenes in mitochondrial DNA from parthenogenetic liz- towards a minimal tRNA adaptor. EMBO 9 J. 3405-341 1. ards (Heteronotiabinoei; ). J. Mol. Evol. 33: 431- OKIMOTO, R., H. M. CHAMBERLIN, L.J. MACFARLANEand D. R. 441. WOLSTENHOLME,1991 Repeatedsequence sets in mitochon- ZUKER,M., and P. STIEGER, 1981 Optimalcomputer folding of drial DNA molecules of root knot nematodes (Meloidogyne): largeRNA sequences using thermodynamics and auxiliary nucleotide sequences, genome location and potential for host information. Nucleic Acids Res.9: 133-148. race identification. Nucleic Acids Res. 19: 1619-1626. Communicating editor: A. G. CLARK I’AABo, S., W. K. THOMAS,K. M. WHITFIELD,Y. KUMAZAWA and A. C. WILSON,1991 Rearrangements of mitochondrial trans- fer RNA genes in marsupials. J. Mol. Evol. 33: 426-430. APPENDIX I PEERLES,C. L., P. GEGENHEIMER and J. ABELSON,1983 Precise excision of intervening sequences from precursor tRNAs by a The nucleotide sequences of seven contiguous seg- membrane-associated yeast exonuclease. Cell 32: 525-536. ments of the Mytilus edulis mitochondrial genome are 410 R. J. Hoffmann, J. L. Booreand W. M. Brown

0000 - OODO00 00 ozooooo $g%$g 00 0=00000 ORRSS8 rN 0~'""COI 1 0gggg9N0" g;;gg I .:> .C .e .e> .e !i 2o '5" 'E> .?> '3' '3' 'tu '5 '2- 'EY '$ '3' '2" z E: E- &- f: 'E'e '-$i ', B '.I>e= 'E, 'EW6- '1E: 'I"j- E- E> E> $ E> 3" E, :> ;0 $3> i E> 1: :- c' e-g- E j E, I- ga 1: 3- t- P> EL E E- 5- .*-f. .e2.ga E: .S> .E" .3 .$. .uo .6- .& 5- " .&$> .5-5' .3' 5: 2 .E' t .g; .E.< oc .E- .Fu .g: ti, .g: .E: .Eu .g; a & y t- e EL 2 e-?'. c 'Y- 'E- t : E- 3- 4' 4' g; 5- 2: 8 E- x" P :-5- e :> 8: 5- 3' E: 2:' E' E' $:,j- g E: L 2- E 8 & E> e' 6; $a i: k' - i! ,; .3- = .eI- 8- .El, - Eo .E' z p 5 ,EL ,Iy .Y .E- .le4 .*I.I; .j- .lj $> 3' '$> t' $, : : I-> .y 'E-<> $= E 3 5 5' 6, 'g; 2' t> c e +a '5- E" '5' '$' e' a < f- 1::a p. Eu go C> E. 5- #- g> 9 5; 1: e' 3 oc E 0 f- g: 8: 3: g e' #< 3. 5' E: E: 1- g: ,& .E'l; .eZ . c e' 6 t- !i E- .g; .e> .g> .tu .I- ! ,j: ,$- 6- .a.e ,e' .j ,E> 5' g= 'E, 'E: g E> $: 5' E. Ea [E a .I. E: E- E. 8- & 1: 4- cE $- !! e: e> $8 g '5- '5; 8- E- E' 3- $a @> g: t- 'iztd I> 3: .I: iij 4 0 f; p - x .eY 1 pa .g> . 3: .I..t' .z; .E- .3 .gy .g: .Bx .E- E 6 4 .E: .i:.p .E, .E> - .:> .E> B 5. .%e .& . ; 6' E, " : f- 4 p: $= 8" e- :I f 'z E, 6, 6' - E' +- $ '$1 x le <> c I. e> I - 5> : 7"E' 8, e- 6, e< 5' e- c e> Q E" - I>5: 1' E' 1 p- E' s E- 5- 20 8 a' a' cc6, :: 6, '6 .E- :> 1; t .Eu $0 g: .yo .E> .t: : si t E> in 6; .E> .E> .$- Eo 1: 2; .fa .J: e :-I> Eu 'p .E: 'E> 6' c .i:.$I .5. g8 '5: ,9 5 .i: $- 'tu 3 .& .io .!: .I, '5' & g E> u e, CY e :> 5' 1- - E: I' 3 2; gu 3- 8, f e' g g 3. E> &I f 5- e> t> '1:2' .&t- .[: " .e- .E .e .I-.io .& .I: .g .g I c E' .%- .E' .f: .E- .e;.E- f .5'$' 3 .:a .Eu .z: .s: C> .I- i c u .kE .I; t' e> c t> tBu 2 e=e.> 4e> e, B' : P 8. 5 !. c; e> io 5- 6' -52 p is-2- 5'6- EE> ou 5. EL uct'? 5: z= E- E> ? e- c z- E' .$e . 'E>E' .e 8- E. E' ':> .p E- . < .e - 5, .E' '4 2: E . m .e, .j=2- .E: .E: 2 g i$:.f- ,I: .e" .e .t'5' 2. i:: 1 ,# ,$'z .: 5: '+ j: .j8 g z> !i8 3- t' .I: I8 E' E: , E' 5 $:E- E: E" 5 s*i- gu I8 g 'E c $'*-ei: 3;c fa & i<,e> 8" 23 s= g' EL e> [; g ;g 5; - I .;> = 3 .<> .e- .tu . .E> .E- E !E- .I- .E- .5' 3 $ i $1 .p .I:.I- - J. .E: 23 .k 1:. .E' z o' o' 2 e, 5: E- E; 5- f E- p 3:e> E- $4 5L.i .fE [i 1; E: a- %: E- E> $ g: Eb e< '1; Ye N y ) $;& 3- 1: E ! ;. g go 1- E 3: g: 5: .e>i: .+a. .e- f +> ,&# '5. +- 5: g .? .$ .E' .E- .g g g; .E; .E&E .E' 'E .la .: .e> .pa .$*C' .$c 8" yo 8 8( e> 6' u t 5, E c g< zw ex z ti; g p .E-c- .E-3> .E1 t' E' pa !a e: tt-0s- e e> c cy. 5- 5: 'i' H &> g- E" 0 i c f$, E. t- E' t- & p . !: E- :> g E> B :> g: p g. $:# g 3 $> 3:: E 2 2. $:2. 8 s30ke $12- 5- 5 E e yo 1: g $3 2- 6, 5- ?> 2- 1c 4 :* % SfffZf -PiE.zgf I -P K """ " 556 ZgEf65 I PS88OX E N0P r mz

00000000 0 0 0 0 0 0 00000000000 0 000 g 800000 S 2 g ,o 80 ;~;~~~5E:4 E E t E a p;p;pL%w:zgz zfgfM$$ff:;g f g

- - - I - P K f 5 5 f' i sig?f$pg$ f f f E i E szz$;wfgfEg"_ $fgigg$$$$$$ 5 f Mytilus mtDNA Organization 41 1

000000 oOo 0oooo,8 888 ~8888~gp,ooo88 -",oh- A:""" OR2 548888O4=c2 ----- NNN ,E" 'S 'EO '2' '$+ 'g'f; .ec .<>c .x? ,I, .2z .E~.z2 .I- .:, .e 1 : .ea .e 'p+ '31 '8: '2- 'E' 'I: '8; ,> c C. C, r c "0 $!; $= :' y $'E; ?- & 0- EYcY 1' *';> 3,2" g:c $5: iI !$ tiz i; :I-' .?Z .& .g .:< & .g; .@ .g $< .?" .E' .$- 3- x y (n .g- .E- E: 'E,5: id ' .t- i- .3z .E' .E; .i: ;:uI .om q .[! 'E' c p: !I e 'E: := E' :-Eu i- E- 2: : I, 8- :$'E e: E' I<%> 3: 2. 8, 6, e. b> :u c 0 uu & g e- u-c urn c 3- f- Ed 1: jx 0- r (nu p %- UI .1: = .k.j: .E- E& 8 .F .$ .E: .$= .& .E. .3; .E" .*> .$Z .E" .E, ,E: uc .s: 1, .E: .E- :> .I-YI .I: e= e, < e. E, E- 'E, 5 :A :< 2 Z> :" (n ;-Ia '5; 5- x 2: 8- :a j: f: 1. g; i- 8- 1; E' 2, 5: 3- F" F: f- 3; I> 3; !x ;' 5: e* E e> c 3r E- .& .E'pc .eg- .E, .$I .I- .& .i;.E- 3- .E" 2- .8< .E, 2-5: .2= .E* .j, .;> .:, ix.g: u 3: .$= .E' E 8- F: $1 0 fu k; 2, ./: e- 'F' i: 5- Z'e- E' b> Ea2' g-6' 8- E" 2; :I I: E: g2 "rrrYYso E-e- 8; 5- E- Z' E- 2- :-c b, 3- e> i= c 5- ,& ,E' E' e= :"5< c .+" .!+ c . .:- .Y .:- .? c .g$1 .;' g- .B'f- ,E" .$"c ,!;,%> .u-:" .e .E, g - -,2 :> (n I g- -'a= ,I* 's .;> .yo ei ,Iw e' $9 2, y 5- "I :A E, E" 5- 'E: j 5' '3: j- 3' i; 3- E- 3: f := E k< g & +" 5- 5. i: $'I, 5' - E, 3 E- 5: 5: u E" E* E, E'5' 3- E> ,j=3; .E. <> .e .:- .I 5'. - .& 3 .!; .?* .5,. + .:- .E. .E> .: .s, .g, .$; .j: .E: .s< $: '+> .I: z .e* 2- E, :- -I E 1 :, :" 5. L i: E' & g i; E, g; E= e" f- '& -m j* e:> Jr f* y: g e yo & E' :-c 3"3- 6, $: E" E; .- E- 8" 2;c e-8' E:< g c rY c u E-J c> - E- 5, r r' g. $4 . " 3- .5- .E- .:= .& 6 3- .so .EL .$= .& .b.@ ,u, -2 .e .I! .E: E, .p: b> (n 5' ;, EY gg 3: Io E' c := :c 3, .fn .$ 'E- .ega 0 rY i; ,id r- 'go 2: 'ju !z> g 8: E, 5: I" 2, j:S> I' oc u E- 3. 5, :' x Fz 5, 5- E- 5' 9: ss E, 5' EL u+ I::tu < "< Y y t- I .;> .$> .E>.Ey %Y ,E; .E- .E' .$I.$- .E. .EY .!j- .?> .g3 .& .5> .E .E' .SI .F' .:- .& .[: .E& .:- .@ .- c *e c e- c 2' e- y> u & b> 5: gr :-u 80 ?> 5' E 5 5: 3- E- $> E: !"5- $0 $ 2' g 5- 2' Q :> r> E,;I & g 6, F" ?I ._'J $0 CY I: g 2- eo c 2 r= e> 2 E: E, 3: - rLL - ,C' .iw .: !$a .& .:- .f; .E" .:, 2 .& .E .E' .c> .e2 .: .%'.$, .I:.$a .E. .p.ku .j=.I: .S.- .& c*. P

+* E' Eo E. $= := $0 L- E, -= u +- 3- 3- 5: s3; i; 1' y:: 8: 5: :ee' 2:c BY --0 3' 2- :- eo E' E' I- e> .C e- c- PI E a- .I,. q= .: " .e: .$: .& .g.EY 2' .-8: .+Z' .e-E" .:, .: 3: 'SI$.= .: 34 ,!- E : .E> 2. ,j;,[-E>,;; ,E; ,g: 5* u e+ u.8: fu $> L i p" E- Ed 8- 1- 5- 3. 5: &-u< 1-e> E* +a5' gc 3: E- 2- g !I E" r- ;-"< CY r, r. 2, E> .j3E go 6 ; -5' p t8' 2 2' E" I- 2, u E' go rY UL 1: YI SI- r r c 0 +I I 8" ?x E, 2 8" r 8- :, 2' Fd 3, 3- E- 8- a: 2' Zo ; 5, ev) 666 "r"" E I -Ei;6;g65; zGz66666~ooOOoO 0~Y)Y)C~DOI0" 3"""" _" zz-::2zEN: NNNI % 0 Y a M .-a E I8

00D00 a000 Og00 %0 0000 0 1 ;z;;zgZf J$E E I zoOoNO1 xgEgliEEl: sz;sf .$ .I" .?, .6, .e.- .C .r .8" .co .b .E= .go .E, .E < r .r .ua .U .r .U Y c .u .CY .yr '3 ."1c ,p L '{x e 2 Y- F> Ex u cc5 2 5: b; E 'E c g>CY tu E; g fz:p v) CY 'i 5:u 5-+' E> E, $, g f: ur$> 2: iI E- t E$=. g e E 0 0-c 1 b> e 5- 1. E' :Ai: E: EL r c 8.. e 1 $ ix 8, 3; 5>+ . y) .go .:a .!: .E .& .g- .$- .: .E'rr .bA .E, .;- .+ .:= .f- .g< .a: .: $ %; .5' .gr .I, ."5< ,gx t i .E, ; "U !i> S e> 8 l 3 8, :Y g: E' 5. 6 8' e> Eo E g, g 2 .5 .j= 3, $1 2 c> p- U 6 p"I' p E'5- s I' r : E, E:g, $ 5 5: f, E" E> 8' E> 2' g* g- 2 i :-5, : 0 g E, 3- .CE .e .I- .?,r> .Eu 'fE .E= .:, : .#, .Y> .E>f- .G"f- ,S5' .E>$' 9 1- '8, y .; 3,5: .E: .I' .I; .E; E- 8; .5' 2 [ 1;g; $ E j- .I E PI e' e' w : e- 8 :- u '8" .fx !j r! 3 2 1' !; t - io c :e E= c 3. 3< 5- E& 1: 5- g 8: g< iE5- 1 is: g; 1; E c $a .Cy tu r> - ,:> ,:> ,EL E, c 5. :' ,r, +a ,E- ,E' ,5 .E' U ,E; .I: .pr j: 1 !f 30 'E ': i {if B 1: !,I: 1: I:r> IEsE '$0 'g '$ Ur 'E-2d z E- f 3- '[: tu 2, 3; i; 1; E: 2: 9 p 2, g; g:ru E, 'E ii E' k: i n E" b> E- Y 3 3 p 8: E- e "j. :? e E E 1: '5 8 E ,1- 1.5: 5- F I .$Z .*- .i".I. .$* .:: .s; .E- .Ed .;; .:> .F> .io .a. 23 .$; .f- .Zz -f' 4 c<> ce' e 8' u E c- :aE~~dsa 'td

4- 8, +I E' 3 .Z>e' .$ .$: E { i .$ w

CY .E: $> u I'" ce" = i "0' .E E= E 5 9: 3- "+ $X 1 L- g, 2; 1: :- 5' jx E' :, 5' 5' 2 8. 3 ZY E' um 6- $1 3" .b' ,? ,j ,:',e_ E" E- 1: .E: .I: g - .jX.E> .p- 3 3 .1 .E> i! .e* .EL c 0- c .r8, E- 5, $x 1 1 .?, .pz .5> .Er '3' s= 3' y g j; g; Lr EY .I> 6' EL e' c u EL k- c E- g jE 3' I:: E- & 3: 2 ZA CY E: re r 3- L' 5, r r E: E 1:p 8:: :5 2s= :$ -: '1:Z' 'i .j; .:> .$x L .$ .e: .!: .!: .E: .E< 15- 2: .p.& .i; .E: .$- .E" .& .? .g" .:: f- .:- .E" .5- ,p! 5% I 1 x 1: c e- c 5 1 E. u ;I E, <> 'L I_ Z := FL !& I : - " c -3' 2- s== E" jr 5; '5' E' t: ; uc E, E,Ed $, p 1 .$; .iY.%- .e E,e> 20e u - s $ idEO $,% X $:85' E- 1' + 3. 5: ;- .l"c r> .EY2- ir E .E- .[.1 +' .: .:- .io .+a .E' .I- .$I .$: .f- .E .3: .3= .I' .E- .& .$: -1 E .&t-.P:.I 1 .tI .1: c ?; 2- e- EL Ed .Iu E : LE> - :a E' EL e- 5: E; 30 $5 g g go 5: E: 1: I- :- :X 5 k E I:c f;c - x ;=u 1: $1 {L $ .f:t- 3-0- 5" !: .[ 11: .k; 5- f: .$ .j: .j: 1 *>I. '!" :> G:.p{ +,zEa 'E- '8f: " 5.e- 2: e- jun :-<> ""e' Ex s; p s 'k- 1- r, L3 1: jji. J C. < e EI s; :E .I 8- ba 6 E:.[: % e= E; 5- 1" t ,?: g: 5-3; 1: 4 P .Z" .+" .E,E, .-- .*'E- &,I=.E' 2. .u8: q .j-E, .8: ,B-E> j:<> ,I 4 i= 1: g 1 $ .E::. ;.E; .Ew i$ .[ .I: .g: I" 1: .e, f .p $2 f. E- 2: 8- ?> 5; 3; E* :> 8:r E f- 9- &'1 g 23 g, u 2. 8- E: 5 !?I f: E' 7: E, $ '::'F- i 1;:';- I E E. I st- z: i- 5, 5: E- 2: I::4- $,E- 5' E' 5, 3 9 'e,g: ;> E: g: $ $8" $:I- 1 $ $ !&;: :-& !: ; $ $* ;. -:z: zj~gg~g""""~~~z $$$$$g r -I z ggggEsg$ --" I I PlZS I 412 R. J. Hoffmann, J. L. Booreand W. M. Brown shown in Figure 4. Together these sequences fix the curlybrackets. Unsequenced regions between seg- locations of all of the genes and unassigned sequences ments all fall within proteingenes, so thatthe 5’ in the genome. Gene products are identified above portionsof incompletely sequencedgenes are fol- the sequences. Translations of proteingenes are lowed in onesegment by their 3’ portions in the shown with single letter amino acid codes centered following segment. The BamHI site used to make the below the second nucleotide of their corresponding original X EMBL3 clone begins at nucleotide 1243 in codons; they includethe extensions of reading frames Segment #1, within the s-rRNA gene. N = unidenti- for COI, COIII and ND1, which are marked with fied nucleotides. x = unidentified amino acids.