
Copyright © 1996 by the Genetics Society of America

The Complete Nucleotide Sequence of the Mitochondrial Genome of the Lungfish (Protopterus dolloi) Supports Its Phylogenetic Position as a Close Relative of Land Vertebrates

Rafael Zardoya and Axel Meyer

Department of Ecology and Evolution and Program in Genetics, State University of New York, Stony Brook, New York 11794-5245

Manuscript received June 10, 1995
Accepted for publication January 13, 1996

ABSTRACT

The complete DNA sequence (16,646 bp) of the mitochondrial genome of the African lungfish, Protopterus dolloi, was determined. The evolutionary position of lungfish as possibly the closest living relative of land vertebrates made its mitochondrial DNA sequence particularly interesting. Its mitochondrial gene order conforms to the vertebrate consensus gene order. Several sequence motifs and secondary structures likely involved in the regulation of the initiation of replication and transcription of the mitochondrial genome are conserved in the lungfish and are more similar to those of land vertebrates than to those of ray-finned fish. A novel feature is that the putative origin of L-strand replication partially overlaps the adjacent tRNA-Cys gene. Phylogenetic analyses of genes coding for tRNAs and proteins confirm the intermediate phylogenetic position of lungfish between ray-finned fish and tetrapods. The complete nucleotide sequence of the African lungfish mitochondrial genome was used to estimate which mitochondrial genes are most appropriate to elucidate deep branch phylogenies. Only a combined set of either protein or tRNA mitochondrial genes (but not each gene alone) is able to confidently recover the expected phylogeny among vertebrates that have diverged up to but not over ~400 mya.

Corresponding author: Rafael Zardoya, Department of Ecology and Evolution, State University of New York, Stony Brook, NY 11794-5245. E-mail: [email protected]

The transition from life in water to life on land, ~360 mya (BENTON 1990), was one of the most consequential events in the history of vertebrates. It was accompanied by a variety of refined morphological and physiological modifications, e.g., reductions and rearrangements of the bones and modifications of swimming fins into load-bearing limbs (e.g., PANCHEN and SMITHSON 1987). Two groups of lobe-finned fish (Table 1), the lungfish and the coelacanth (Latimeria chalumnae), have both been implicated as the closest living relative of tetrapods (reviewed in MEYER 1995). Lungfish were discovered >150 years ago (BISCHOFF 1840) and for several reasons, e.g., they are obligate air breathers, were initially believed to be amphibians, not fish. In the Lower Devonian (~400 mya) lungfish were a species-rich group that inhabited both marine and freshwater environments (e.g., reviewed in CLOUTIER 1991). However, only a very small number of "relict" species survive today. These are the Australian lungfish, Neoceratodus forsteri, the South American lungfish, Lepidosiren paradoxa, and four species in the genus Protopterus from Africa. These living fossils are of interest to evolutionary biology since their morphology, physiology, and biochemistry might be representative of that of the common ancestor of all land vertebrates. Therefore, lungfish have been widely studied by paleontologists, comparative morphologists and, recently, developmental biologists.

The other extant group of lobe-finned fish, the coelacanths, were believed to have gone extinct ~80 mya, but in 1938 the only surviving species of this lineage of fishes was discovered off the coast of East Africa. Since its sensational rediscovery, the coelacanth is often depicted in textbooks as the "missing link" between fish and all land vertebrates, i.e., amphibians, reptiles, birds and mammals (ROMER 1966). However, many morphological, paleontological (e.g., reviewed in PATTERSON 1980; ROSEN et al. 1981), and most molecular (MEYER and WILSON 1990; MEYER and DOLVEN 1992; HEDGES et al. 1993; reviewed in MEYER 1995; but see YOKOBORI et al. 1994) data suggest that lungfish and not the coelacanth are more closely related to tetrapods. Hence, the nucleotide sequence of the lungfish mitochondrial genome is of interest, both in terms of the evolution of the mitochondrial gene order in vertebrates and in terms of the phylogeny of land vertebrates.

Until now, the complete mitochondrial DNA sequences of 19 vertebrate species have been reported. Thirteen of them are from mammals, four from fishes, but only one of an amphibian and one of a bird have been determined. Remarkably, the structure and organization of vertebrate mitochondrial DNA is quite conserved and only minor rearrangements have been described for the chicken (DESJARDINS and MORAIS

Genetics 142: 1249-1263 (April, 1996)

1990) and the opossum (JANKE et al. 1994). Changes in gene order seem to be associated with the potential capability of tRNAs to translocate, although sometimes other genes are involved in transpositions as well. It is not yet clear when the establishment of the vertebrate consensus gene order occurred during their evolution. The lamprey, one of the earliest vertebrates, has a peculiar gene order (LEE and KOCHER 1995). Although it is similar to that of other vertebrates, it has enough differences (i.e., two major noncoding regions rather than one, the missing OL in the WANCY region, and translocations of cytochrome b, tRNA-Pro, tRNA-Thr, and tRNA-Glu) to be considered a unique and possibly derived condition relative to the vertebrate consensus mitochondrial gene order. Despite the slow rate of evolution of gene order, nucleotide sequence evolution of mitochondrial DNA is rapid. The dynamic evolution of mitochondrial DNA sequences occurs through the accumulation of point mutations and makes them particularly valuable for estimating phylogenetic relationships among closely related species (BROWN et al. 1979). However, not all mitochondrial genes evolve at the same rate (e.g., reviewed in MEYER 1993), and some genes are more appropriate than others for inferring phylogenetic relationships among distantly related species.

We determined the complete nucleotide sequence and the gene order of the African lungfish mitochondrial genome. The aims of this study were to reconstruct mitochondrial genome evolution, estimate which mitochondrial genes (tRNA, rRNA, or protein coding) are most appropriate to elucidate deep branch phylogenies, and clarify lungfish relationships to tetrapods and to ray-finned fish (Actinopterygii, Table 1). We are presently sequencing the coelacanth mitochondrial genome, with the long-term objective of establishing whether the lungfish or the coelacanth is the closest living relative of land vertebrates.

TABLE 1

Systematic position of lungfish

Class: Osteichthyes (bony fish)
  Subclass: Actinopterygii (ray-finned fish)
    Chondrostei (sturgeon, Acipenser; bichir, Polypterus)
    Neopterygii (gar, Lepisosteus; bowfin, Amia; modern ray-finned fish, Teleostei)
  Subclass: Sarcopterygii (lobe-finned fish)
    Actinistia (coelacanth, Latimeria)
    Dipnoi (lungfish)
    "Osteolepiformes" (a)
    Tetrapoda (land vertebrates)
      Lissamphibia (modern amphibians)
      Amniota (reptiles, birds, mammals)

Modified from CARROLL (1988) and AHLBERG (1991). Position of lungfish in bold. (a) Extinct.

FIGURE 1.-Restriction map and gene organization of the Protopterus dolloi mitochondrial genome. All protein coding genes are encoded by the H-strand with the exception of ND6, which is coded by the L-strand. Each tRNA gene is identified by the single letter amino acid code and depicted according to the coding strand. Only the EcoRI and HindIII restriction sites used for cloning are shown.

MATERIALS AND METHODS

Mitochondrial DNA was purified from fresh tissue of a single individual of the African lungfish (Protopterus dolloi) as previously described (ZARDOYA et al. 1995a). After homogenization, intact nuclei and cellular debris were removed by a low-speed centrifugation (1000 × g). Mitochondria were pelleted by spinning at 10,000 × g for 20 min and subjected to a standard alkaline lysis procedure followed by a phenol/chloroform extraction. The isolated mtDNA was cleaved with EcoRI and HindIII restriction enzymes (see Figure 1 for positions of restriction sites). Three EcoRI fragments of 7.4, 3.1 and 0.7 kb and five HindIII fragments of 3.3, 2.9, 2.9, 2.2 and 1.1 kb were cloned into pUC18, covering the entire lungfish mtDNA molecule. In addition, the 7.4-kb EcoRI fragment was subcloned with Sau3A to facilitate sequencing.

Plasmid DNA was extracted from each clone using a Magic miniprep (Promega). After ethanol precipitation, cloned DNA was used as template for Taq Dye Deoxy Terminator cycle-sequencing reactions (Applied Biosystems Inc.) following manufacturer's instructions. Sequencing was performed with an automated DNA sequencer (Applied Biosystems 373A Stretch). Sequences were obtained using both M13 universal sequencing primers and 40 specific oligonucleotide primers. The sequences obtained from each clone were ~350 bp in length and each sequence overlapped the next contig by ~100 bp. In no case were differences in sequence observed between the overlapping regions. The location and sequence of these primers will be provided by the authors upon request.

Sequence data were analyzed by use of the GCG program package (DEVEREUX et al. 1984), and alignments and phylogenetic analyses were performed using CLUSTAL V (HIGGINS and SHARP 1989), PAUP Version 3.1.1 (SWOFFORD 1993), PHYLIP Version 3.5 (FELSENSTEIN 1989), and MOLPHY Version 2.2 (ADACHI and HASEGAWA 1992).

RESULTS AND DISCUSSION

Genome organization: The complete sequence of the L-strand of the lungfish mtDNA is shown in Figure

2. The total length of the mitochondrial molecule is 16,646 bp. The overall base composition of the L-strand is A: 29%; T: 29%; C: 26%; and G: 16%. As in other vertebrates, two rRNAs, 22 tRNAs and 13 proteins are encoded by the lungfish mitochondrial genome; the relative position and orientation of all genes and the control region are identical to the vertebrate consensus mitochondrial gene order (Figures 1 and 2, Table 2). Protein-encoding genes were identified by comparison with rainbow trout mtDNA (ZARDOYA et al. 1995a) and by the presence of initiation and stop codons. Sequences encoding tRNA genes were recognized by their capability to fold into putative cloverleaf structures, the presence of specific anticodons and by comparison with the rainbow trout homologues.

Noncoding sequences: The control region in the lungfish mitochondrial genome is 1184 bp long and is located between the tRNA-Pro and tRNA-Phe genes (Figure 2). In other vertebrates, this region usually includes the origin of H-strand replication and the sites of initiation of both H- and L-strand transcription. Analysis of the control region sequence permitted the identification of two conserved sequence blocks (CSB-II and -III) in the right domain, by comparison to the motifs reported by WALBERG and CLAYTON (1981) (Figure 2). A total of three termination associated sequences (TASs) were postulated in the left domain based on the consensus sequence proposed by DODA et al. (1981) (Figure 2). A putative CSB-I can be tentatively identified at position 16,247, but as in frog (ROE et al. 1985), this motif is reduced to only five nucleotides (GACAT) and shares limited sequence similarity with the human and mouse consensus sequence (WALBERG and CLAYTON 1981). The lungfish mitochondrial control region is also characterized by the presence of three 25-bp repeats at the 5' end. These repeats are separated by TASs and their sequences are nearly identical to those of mammalian control regions (Table 3), but are less conserved or not found at all in frog, chicken, and ray-finned fish. These sequences contain the conserved motif 5'-TACAT-3' and its complement 5'-ATGTA-3', which are proposed to maintain secondary structures in other control regions (SACCONE et al. 1991). Moreover, these pentanucleotide motifs seem to be associated with the presence of repeats in the left domain of control regions, as seen in bat (WILKINSON and CHAPMAN 1991), shrew (STEWART and BAKER 1994), and sheep (ZARDOYA et al. 1995b) repeats. We identified a 40-bp sequence in the central domain (starting at position 16,013), close to the B block as defined by SOUTHERN et al. (1988), that is characterized by a 75-80% similarity with the corresponding sequences of sheep (ZARDOYA et al. 1995b), cow (ANDERSON et al. 1982), two species of seals (ARNASON and JOHNSSON 1992; ARNASON et al. 1993), rhinoceros (JAMA et al. 1993), pig (MACKAY et al. 1986), dolphin (SOUTHERN et al. 1988) and several species of whales (ARNASON et al. 1991; ARNASON and GULLBERG 1993; DILLON and WRIGHT 1993), but is not found in other vertebrates.

The putative origin of L-strand replication (OL) is located in a cluster of five tRNA genes (WANCY region) (Figures 1 and 2) and is 45 nucleotides long. This region has the potential to fold into a stem-loop secondary structure with a stem formed by 15 paired and two unpaired nucleotides and a loop of 13 nucleotides. Half of the OL stem is part of the tRNA-Cys gene (Figure 3). Since this tRNA is encoded by the L-strand, it seems likely that the same sequence is involved in both replicative and transcriptional events. This condition is not found in fish or amphibians (SEUTIN et al. 1994), suggesting that it might be a special feature of lungfish. The lungfish OL loop contains a C-T rich sequence. This suggests that L-strand synthesis is probably initiated in a polypyrimidine tract, as in other fish (e.g., JOHANSEN et al. 1990; ZARDOYA et al. 1995a), and is not restricted to a stretch of thymines as had been previously suggested for mammals (WONG and CLAYTON 1985).

Ribosomal RNA genes: The 12S and 16S rRNA genes in lungfish mitochondria are 937 and 1591 nucleotides long, respectively. Our sequence shows only minor differences to that previously reported for an unidentified species of Protopterus (HEDGES et al. 1993). More extensive divergence was observed relative to other species of the genus (P. annectens and P. aethiopicus) for portions of the 12S rRNA gene that had previously been sequenced (MEYER and DOLVEN 1992). The primary sequence of both rRNA genes is alignable to that of other vertebrates (HEDGES et al. 1993) and the secondary structure appears to be conserved.

Transfer RNA genes: As in other vertebrates, the lungfish mitochondrial genome contains 22 tRNA genes interspersed between ribosomal RNA and protein coding regions. All the lungfish tRNA gene sequences can be folded into a cloverleaf secondary structure provided the formation of GU wobble and other unusual pairings is allowed. These tRNAs range in size from 67 to 75 nucleotides, show high variability especially in their DHU and TψC loops, and are more constrained in their anticodon and acceptor stems. As in other animals, tRNA-Ser (AGY) has a reduced DHU arm (WOLSTENHOLME 1992). On the other hand, tRNA-Ser (UCN) and tRNA-Lys form a normal cloverleaf structure, as in other fish, chicken and frog, strengthening the idea that the unusual structures inferred for these tRNAs in mammals can be considered synapomorphies that define this clade (KUMAZAWA and NISHIDA 1993). The proposed tRNA-Cys cloverleaf structure (Figure 3) indicates that this tRNA has a longer acceptor stem (8 bp instead of the usual 5 bp) and a shorter DHU stem (3 bp instead of the usual 4 bp) compared with any other vertebrate tRNA-Cys. Additional cloverleaf structures can be inferred, yielding atypical DHU and TψC stems. If the tRNA-Cys gene acts as stem of the OL, then constraints

[Figure 2 sequence panel: nucleotides 1-3900, covering tRNA-Phe, 12S rRNA, tRNA-Val, 16S rRNA, tRNA-Leu (UUR), NADH 1, tRNA-Ile, tRNA-Gln and tRNA-Met.]

FIGURE 2.-Complete nucleotide sequence of the L-strand of the lungfish mitochondrial DNA. Position 1 corresponds to the first nucleotide of tRNA-Phe. Direction of transcription for each gene is denoted by arrows. The deduced amino acid sequence for each gene product is shown above the nucleotide sequence (one-letter amino acid abbreviation is placed above the first nucleotide of each codon). Termination codons are indicated by an asterisk. tRNA genes are underlined and the corresponding anticodons are overlined. In the control region, CSB (Conserved Sequence Block) and TAS (Termination Associated Sequence) are underlined.
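As a rough illustration of how figures such as the overall base composition of the L-strand could be tallied from a sequence like the one in Figure 2, the following minimal sketch counts bases overall and at each codon position. The function names and the toy sequence are assumptions made for illustration; they are not the authors' software or data.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a DNA string."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def codon_position_composition(cds):
    """Return base percentages at codon positions 1, 2 and 3.

    Trailing bases that do not complete a codon (for example an
    incomplete "T" stop codon) are ignored.
    """
    usable = len(cds) - len(cds) % 3
    return {pos + 1: base_composition(cds[pos:usable:3]) for pos in range(3)}


# Toy coding sequence (hypothetical, not taken from Figure 2).
toy = "ATGACCCTATTA"
print(base_composition(toy))
print(codon_position_composition(toy)[3])
```

Applied to the real L-strand, the first function would be expected to give values close to those quoted in the text, and the second could be used to look at third-position or second-position biases in a single protein-coding gene.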

[Figure 2 sequence panel: nucleotides 3901-7200, covering NADH 2, tRNA-Trp, tRNA-Ala, tRNA-Asn, tRNA-Cys, tRNA-Tyr, COI, tRNA-Ser (UCN), tRNA-Asp and the start of COII.]

FIGURE 2.-Continued

will be added to the 5' end of this gene, leading to an unusual secondary structure of its product. As previously demonstrated (STEINBERG and CEDERGREN 1994), noncanonical structures such as that of tRNA-Cys can be maintained by structural compensation within the tRNA molecule. An unusual cloverleaf for tRNA-Cys has also been proposed in the reptile Sphenodon punctatus (SEUTIN et al. 1994).
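The dimensions given in the text for the putative OL (a stem of 15 paired and two unpaired nucleotides plus a 13-nucleotide loop, 45 nucleotides in total) can be checked with simple arithmetic, and the pairing rule used when folding tRNAs (Watson-Crick pairs plus G-U wobble) is easy to encode. The sketch below is illustrative only: the helper names and the short hairpin arms are hypothetical and are not the lungfish sequence.

```python
# Watson-Crick pairs plus the G-U wobble pair (written with T, as in DNA).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}


def stem_loop_length(paired, unpaired, loop):
    """Total nucleotides in a hairpin: each base pair contributes two bases."""
    return 2 * paired + unpaired + loop


def count_stem_pairs(five_prime_arm, three_prime_arm):
    """Count positions where the 5' arm can pair with the 3' arm.

    The 3' arm is read in reverse so that the outermost bases are compared
    first. Both arms must have the same length (bulges are not modeled).
    """
    if len(five_prime_arm) != len(three_prime_arm):
        raise ValueError("arms must be the same length")
    return sum(
        (a, b) in PAIRS
        for a, b in zip(five_prime_arm.upper(), reversed(three_prime_arm.upper()))
    )


# Dimensions reported in the text for the putative OL.
print(stem_loop_length(paired=15, unpaired=2, loop=13))

# Hypothetical 5-bp stem containing one T-G wobble pair.
print(count_stem_pairs("GCTAG", "CTGGC"))
```

The first call reproduces the 45-nucleotide total stated for the OL; the second shows that all five positions of the made-up stem pair once wobble pairing is allowed.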

[Figure 2 sequence panel: nucleotides 7201-10200, covering the remainder of COII, tRNA-Lys, ATPase 8, ATPase 6, COIII, tRNA-Gly, NADH 3, tRNA-Arg, NADH 4L and the start of NADH 4.]

FIGURE 2.-Continued

[Figure 2 sequence panel: nucleotides 10201-13600, covering the remainder of NADH 4, tRNA-His, tRNA-Ser (AGY), tRNA-Leu (CUN) and NADH 5.]

FIGURE 2.-Continued

Protein-encoding genes: The lungfish mitochondrial genome contains 13 open reading frames (Figures 1 and 2) and, as in other vertebrate mtDNAs except lamprey (LEE and KOCHER 1995), there are two cases of reading-frame overlap between genes encoded by the same strand (ATPases 8 and 6 overlap by 10 nucleotides; ND4L and ND4 share seven nucleotides). All initiation codons in lungfish mtDNA protein encoding genes are ATG except that of the COI gene, which is GTG (Table 2). This initiation codon usage is also shared by the four other fish mitochondrial genomes that have been completely sequenced (TZENG et al. 1992; CHANG et al. 1994; LEE and KOCHER 1995; ZARDOYA et al. 1995a) and by the chicken mitochondrial DNA (DESJARDINS and MORAIS 1990). Interestingly, most ORFs have "T" incomplete stop codons (ND1, COII, ATPase 6, COIII, ND3, ND4 and cyt b), two end with TAA (ATPase 8, ND4L), three use TAG as stop codon (ND2, COI and ND6) and

one ends with AGG (ND5) (Table 2). So far, no ray-finned fish has been found that uses AGR as stop codon, whereas in frog (ROE et al. 1985) ND5 ends with AGA. The codon usage of the lungfish is similar to that of cod (JOHANSEN et al. 1990), loach (TZENG et al. 1992), carp (CHANG et al. 1994), lamprey (LEE and KOCHER 1995), rainbow trout (ZARDOYA et al. 1995a) and frog (ROE et al. 1985) (data not shown). As in other vertebrates (for review, see MEYER 1993), there is an evident bias against guanine at the third codon position, whereas there is an even distribution of the other three bases (Table 4). This anti-G bias is not as pronounced as in mammals and the lamprey and is similar to that of other fish and frog. In lungfish protein-coding genes, as in other mitochondrial genomes (Table 4), pyrimidines (%C + T = 68.0 ± 0.3) are overrepresented compared with purines in second codon positions. This pyrimidine bias in the second position directly reflects a hydrophobic bias in amino acid composition of mitochondrial proteins (NAYLOR et al. 1995), since most of the amino acid residues coded by NYN codons are hydrophobic. The hydrophobic bias of mitochondrial proteins is due to their function as membrane-bound proteins involved in the electron transport chain (e.g., ATTARDI et al. 1986).

[Figure 2 sequence panel: nucleotides 13601-16646, covering the end of NADH 5, NADH 6, tRNA-Glu, Cyt b, tRNA-Thr, tRNA-Pro and the control region with repeats 1-3, TAS-1 to TAS-3, CSB-II and CSB-III.]

FIGURE 2.-Continued

TABLE 2

Localization of features in the mitochondrial genome of the African lungfish

                                                  Codon
Feature             From     To   Size (bp)   Start  Stop
tRNA-Phe               1     67      67
12S rRNA              68   1000     933
tRNA-Val            1001   1072      72
16S rRNA            1073   2663    1591
tRNA-Leu (UUR)      2664   2738      75
NADH 1              2739   3705     966       ATG    T--
tRNA-Ile            3706   3777      72
tRNA-Gln            3845   3775      71 (L)
tRNA-Met            3845   3913      69
NADH 2              3914   4942    1028       ATG    TAG
tRNA-Trp            4945   5014      69
tRNA-Ala            5083   5015      69 (L)
tRNA-Asn            5156   5084      73 (L)
tRNA-Cys            5248   5182      67 (L)
tRNA-Tyr            5317   5249      69 (L)
COI                 5319   6866    1548       GTG    TAG
tRNA-Ser (UCN)      6941   6871      71 (L)
tRNA-Asp            6944   7012      69
COII                7015   7705     691       ATG    T--
tRNA-Lys            7706   7774      69
ATPase 8            7776   7943     168       ATG    TAA
ATPase 6            7934   8615     682       ATG    T--
COIII               8616   9399     784       ATG    T--
tRNA-Gly            9400   9469      70
NADH 3              9471   9816     346       ATG    T--
tRNA-Arg            9817   9885      69
NADH 4L             9886  10182     297       ATG    TAA
NADH 4             10176  11559    1384       ATG    T--
tRNA-His           11560  11628      69
tRNA-Ser (AGY)     11629  11697      69
tRNA-Leu (CUN)     11698  11766      70
NADH 5             11767  13602    1836       ATG    AGG
NADH 6             14103  13591     513 (L)   ATG    TAG
tRNA-Glu           14172  14104      69 (L)
Cyt b              14175  15318    1144       ATG    T--
tRNA-Thr           15319  15390      72
tRNA-Pro           15462  15393      68 (L)
Control region     15463  16646    1184

Gene nomenclature according to ATTARDI et al. (1986). L, light-strand sense.

Phylogenetic analyses of lungfish relationships: To correctly place lungfish among vertebrates, especially their relationship to ray-finned fish and tetrapods, the complete nucleotide sequences of the human (ANDERSON et al. 1981), blue whale (ARNASON and GULLBERG 1993), opossum (JANKE et al. 1994), chicken (DESJARDINS and MORAIS 1990), frog (ROE et al. 1985), carp (CHANG et al. 1994), loach (TZENG et al. 1992), trout (ZARDOYA et al. 1995a) and lamprey (LEE and KOCHER 1995) mitochondrial genomes were compared with that reported here. Protein-encoding genes were aligned and gaps were introduced according to the deduced amino acid sequences. Variation among the 13 protein coding genes was mainly found in the carboxyl-end of the polypeptides and in a few cases in the amino-end. However, the central core of the mitochondrial proteins was found to be highly conserved. Therefore, ambiguous alignments at the 5'- and 3'-ends of protein coding genes were excluded from the phylogenetic analyses. Similarly, tRNA genes were aligned taking their secondary structures into account. In this case, DHU and TψC arms were omitted due to ambiguity in alignments; hence our reanalysis of KUMAZAWA and NISHIDA's data (1993) is not directly comparable with theirs. In all analyses, gaps in alignment were treated as missing data.

Tree reconstruction: Three different types of data sets were used to reconstruct phylogenetic trees. (1) A

TABLE 3 Conserved motif in the 5' end of the control region

~~ Sequence Accession No. Species CTATGT-AT-ATCGTACATTAA L428 13 Protopterus dolloi (lungfish) Mammals ...... T ...... 501415 Homo sapiens ...... A ...... U12368 Aepyceros melumpus (impala) A ...... A...... J01394 Bos taums (cow) A ...... AA ...... L29055 07is aries (sheep) ...... C ...... G...... D23665 Equus caballus (horse) ...... C.G .... G...... U03575 Chisfamiliaris (dog) ...... G...... S68248 Miroungu leonina (elephant seal) ...... G...... X63726 Phoca uitulzna (seal) ...... G...... L273 I0 Taxidea taxus (skunk) .-...... T .....A... X72204 Buluenoptera musculus (blue whale) ...... c. LO6553 Sorex cinereus (shrew) T ...... X14848 Rattus noruegicus (rat) T ...... U21162 Mus musculus (mouse) ...... G...... X75874 Ursus arctos (bear) ...... A ...... G G. 229573 virginianaDidelphis (oppossum) Reptiles ...... G...A.. U19540 Sternotherus minm (musk turtle) ...... G .....C. L28795 Gruptemys pulchruturtle) (map T ..A..C...... -. U22261 Curettu caretta (loggerhead turtle) Amphibians T ...... ATAA ...... M57480 Rana castehuna (frog) ...... A..-.AG...A.. M10217 Xenopus lumis (clawed frog) Fish ...... T.A ...CC ...... X54348 Acipensm trunsmontanus (sturgeon) ...... TA.CAC ....C. LO7753 Cyprinellu spiloptera (minnow) A ...... T....A...... M97985 Salmo tmttu (trout) A ...... TA.CCC ...... U06060 jloridaeJordunellu (flagfish) G ...... TATCC ...... U06583 Xiphophorus uariatus (swordtail) A ...... ATCAC ...... X1 7660 Gadus mmhuu (cod) A ...... AATCAC ...... U12069 uirensPollachius () The repeats found in the 5' end of the control region of the mitochondrial genome of the lungfish have a sequence that it is also found in mammals but has less similarity to those of reptiles, amphibians, and ray- finned fish. In some species, this motif is also associated with repeats. 
set withall protein coding genes combined was sub- This data set was also analyzed separately with MP, NJ, jected to neighbor joining (NJ) (distance matrices were and ML by excluding third codon positions entirely and calculated based on Kimura distances), maximum parsi- excluding transitions in third codonpositions. (2) A set mony (MP), and maximum likelihood (ML) analyses. comprising all tRNA genes was subjected to MP, NJ,

C CG c A- U G C C G C A A- U U-G FIGURE3.-Proposed stem-loop and cloverleaf secondary structures for the G-C FF ?Cy G Lstrand origin of replication and the uUGUGGGCGAA II I tRNA''F, respectively. Both structures C -G AUCAC G AG G-C partially theshare same sequence, G-C C-G which is underlined in both configura- C -G C-G tions. G -C G-C CYS G-C G-C tRNA A-T O, A-U A-T UA G-C T-A UA A-T L GCA 3'-GTGG TTA-5' Lungfish Mitochondrial Lungfish DNA 1259

TABLE 4 99 7Human Overall base composition of the 13 protein-codmg genes of fish, an amphibian and mammals Blue Whale

Codon position A G c T Opossum Lungfish 1 27.6 23.6 25.4 23.4 Chicken 2 18.4 13.3 27.2 41.1 3 34.7 8.4 28.5 28.4 Frog Frog 1 29.9 21.0 23.3 25.8 2 20.5 11.6 27.2 40.7 3 41.222.3 6.5 30.0 Iloo Lungfish Trout 1 25.4 26.4 26.8 21.4 2 18.2 13.8 27.7 40.3 3 33.4 8.9 33.9 23.8 Carp 1 27.1 25.9 26.4 20.6 2 18.5 14.0 28.2 39.3 3 44.231.3 5.9 18.6 Rainbow Trout Loach 1 27.2 26.4 25.6 20.8 FIGURE4.-Majority rule bootstrap (FELSENSTEIN1985) 2 18.5 13.7 27.7 40.1 consensus tree of vertebrates based on 100 replications. Two 3 35.8 9.6 34.6 20.0 data sets were subjected to MP (hootstrap values above Lamprey 1 30.4 22.6 22.9 24.1 branches) andNJ (bootstrap values below branches) analyses. 2 19.0 12.9 26.5 41.6 The first data set includes all mitochondrial protein coding 3 41.3 3.8 21.5 33.4 genes combined (bootstrapvalues upper of each pair of num- Mammals" 1 32.1 20.722.8 24.4 bers). The second data set comprises a combination of all 2 19.5 12.2 26.2 42.1 mitochondrial tRNA genes (hootstrap values lower of each 3 5.0 42.4 31.2 21.4 pair of numbers). Values are percentages. 1995), it is likely that multiple substitutions might have "JANKE et al. (1994). accumulated along the sequence, hindering the recov- ery of correct phylogenetic relationships among verte- and ML phylogenetic analyses. In the tRNA data set brate species and impeding tree reconstruction. When all position, irrespective of secondary structure, were lamprey was used as outgroup and fish were excluded weighted equally. (3) Each protein coding gene was from our tRNA data set, we were able to recover the analyzed separately with MP, NJ, and ML. Analyses with well-established topology for relationships among verte- all phylogenetic methods were also performed exclud- brates (data not shown). ing third codon positions in each gene and in MP also Phylogenies based on combined tRNA and protein third codon position transitions were excluded in sepa- coding gene data sets: According to paleontological ev- rate analyses. 
Confidence levels for all neighbor joining idence, the separation of ray-finned fish from the lin- and maximum parsimony analyses from all data sets eage leading to lobe-finned fish occurred -410 mya were assessedby bootstrap analyses basedon 100 replica- (CARROLL1988). When trout, carp andloach were used tions (FELSENSTEIN 1985).In all MP analyses with PAUP as outgroups, and the combined sets of protein coding (version 3.1.1), the heuristic search option was used. or tRNA genes were analyzed, all three phylogenetic Performance of lamprey as outgroup: Initially, all methods yielded identical, congruent, andstrongly sup- data sets were rooted using the lamprey mitochondrial ported topologies with the expected branching order DNA sequence (LEE and KOCHER1995) as outgroup. (Figure 4). In these trees, the lungfish is unequivocally Surprisingly, trees with odd topologies and low boot- placed as the sister group of tetrapods. strap values were obtained regardless of the phyloge- Identical topologies were alsoobtained with all phylo- netic method. This seemedespecially surprising for the geneticmethods (MP, NJ, ML), when transitions in case of tRNAs, which have been used to infer phyloge- third codon positions of the combined protein coding netic relationships among vertebrates even using sea genes were excluded from the analyses. However,when urchin as outgroup(KLJMAZAWA and NISHIDA 1993), third codon positions were excluded completely from which diverged >600 mya (SIMMSet nl. 1993). However, the analysis of thisdata set, NJ and ML methods yielded the analyses of KUMAZAWAand NISHIDA(1993) did not the expected topology whereas MP failed to recover it. include any fish and when fish tRNA sequences were The robustness of these results was confirmed by the added to this data set unorthodox groupings resulted. high bootstrap values obtained in NJ and MP trees (Fig- Since lampreys diverged from the main vertebrate lin- ure 4). 
Interestingly, the bootstrapvalues yielded by the eage -550 mya (CARROLL1988; LEE and KOCHER protein coding gene are higher than those obtained 1260 R. Zardoya and A. Meyer

TABLE 5 Confidence in maximum parsimony estimates

All third No Ts in No third

Shortest Expected A Percentage Shortest Expected A Percentage Shortest Expected A Percentage

ND 1 1509 1515 6 0.4 t 6.6 1115 1125 10 0.8 549 553 4 1.6 2 3.1 ND2 2144 2150 6 0.3 5 10.4 1749 1752 3 0.2 1031 1034 3 0.4 5 7.4 COI 1937 1947 10 0.5 t 11.4 1132 1135 3 0.3 385 388 3 0.8 t 5.0 CO11 985 999 141.4 t 9.1 625 639 14 2.2 315 325 10 3.1 +- 5.1 ATPase 8 357 365 8 2.2 5 5.2 307 305 2 0.6 206 21 1 5 2.3 2 4.1 ATPase 6 1 I85 1204 1.619 +- 8.1 874 887 13 1.5 458 468 10 2.1 t 4.5 COIII 1019 1031 1.212 2 8.4 65 1 655 4 0.6 267 268 1 0.4 5 2.2 ND3 624 643 19 3.0 t 7.9 464 48 1 17 3.5 246 259 13 4.6 t 1.8 ND4L 570 578 8 1.4 -C 6.0 449 462 13 2.8 264 273 9 3.3 t 4.4 ND4 2515 2515 - - 1867 1868 1 <0.1 1058 1061 3 0.3 +- 6.2 ND5 3372 3381 9 0.2 t 12.9 2510 2520 10 0.4 1495 1495 - - ND6 1103 1116 131.2 -C 10.2 849 858 9 1.o 533 536 3 0.6 t 4.3 Cyt 0 1654 1665 11 0.6 t 10.0 1135 1136 1 0. I 542 546 4 0.7 2 6.0 The differences in length of the expected tree (Figure 4) and the shortest trees yielded by each gene (A) are indicated (in number of steps) with their standard deviation estimated by TEMPLETON’S(1983) formula. Differences are also shown as percentage. from the tRNA data set, contradicting theidea that only tRNA, protein coding, or rRNA genes, see HEDGESet tRNA but not protein coding genes areable to estimate al. 1993) are assayed. phylogenetic relationships among taxa that originated Phylogeneticperformance of eachmitochondrial >300 mya (KUMAZAWAand NISHIDA 1993). A recent protein coding gene: MP, NJ, and ML analyses such study on the origin of tetrapods (HEDGESet al. 1993), as those performed with the combined data sets were using complete mitochondrial rRNAs as data set and a also carried out for each protein coding gene sepa- ray-finned fish as outgroup seemedto support the posi- rately. This was done to elucidate which of the mito- tion of lungfish as sistergroup to land vertebrates. 
Our chondrial protein coding genes are ableto recover the results show that ray-finned fish are areliable outgroup expected topology (Figure 4) and to determine which to assess relationships among tetrapods, leading to ap- genes are appropriate for inferring deep branch phy- propriate topologies when large data sets (combined logenies.
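
As a rough illustration of the distance and bootstrap steps described under Tree reconstruction, the sketch below computes a Kimura two-parameter distance and resamples alignment columns with replacement (FELSENSTEIN 1985). It is a minimal sketch under stated assumptions, not the procedure run for this study: the text says only "Kimura distances", so the two-parameter model is an assumption; the function names and the toy alignment are hypothetical; and a simple nearest-neighbour tally stands in for full NJ or MP tree building.

```python
import math
import random

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Columns with a gap or any non-ACGT character are skipped, which
    mirrors treating gaps as missing data.
    """
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    a_term = 1.0 - 2.0 * p - q
    b_term = 1.0 - 2.0 * q
    if a_term <= 0.0 or b_term <= 0.0:
        return math.inf           # saturated: the correction is undefined
    return -0.5 * math.log(a_term) - 0.25 * math.log(b_term)


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def nearest_neighbour_support(alignment, focal, candidate,
                              replicates=100, seed=0):
    """Fraction of pseudoreplicates in which `candidate` is the taxon
    closest to `focal`. A crude stand-in for clade support (assumption:
    no tree is built here)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_replicate(alignment, rng)
        others = [name for name in rep if name != focal]
        closest = min(others,
                      key=lambda name: k2p_distance(rep[focal], rep[name]))
        hits += closest == candidate
    return hits / replicates


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = {
        "lungfish": "ACGTACGTACGTACGTACGT",
        "frog":     "ACGTACGTACGTACGTACGT",
        "trout":    "ACGTACGTACTTCCGAACGA",
    }
    print(nearest_neighbour_support(toy, "lungfish", "frog"))
```

In a real analysis each pseudoreplicate would be passed to the tree-building method and the resulting trees summarized in a majority-rule consensus, as in Figure 4; the tally above only shows where the resampling enters.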

TABLE 6
Statistical confidence of maximum likelihood trees

               All positions                   No third positions
Gene           log l        Δl ± SE            log l        Δl ± SE
ND1           -7213.9      -9.9 ± 8.9         -3333.1      -7.8 ± 8.7
ND2              -             -                 -             -
COI          -10120.5       8.8 ± 16.4        -3450.6      -5.4 ± 13.4
COII          -4788.1     -11.1 ± 9.1         -2105.8     -14.2 ± 10.2
ATPase 8      -1442.9      -2.1 ± 2.9          -886.4       3.7 ± 3.6
ATPase 6      -5440.3     -20.8 ± 10.8        -2633.0     -15.7 ± 8.5
COIII         -5252.8      -8.6 ± 18.4        -2047.0      -3.3 ± 4.9
ND3           -2659.8      -8.7 ± 5.0         -1350.9     -16.0 ± 8.6
ND4L          -2509.8     -14.8 ± 6.6         -1324.5      -9.4 ± 5.5
ND4              -             -                 -             -
ND5          -15010.4     -14.7 ± 16.7           -             -
ND6           -4098.1      -1.6 ± 4.0         -2410.7      -4.3 ± 9.0
Cyt b         -8166.0      -6.2 ± 14.8        -3620.4      -7.4 ± 13.1

The differences in log-likelihood (Δl) between the best tree obtained for each gene and the expected tree (Figure 4) are shown with their standard error (SE) estimated by KISHINO and HASEGAWA'S formula (1989). The log-likelihood (log l) of the best tree for each gene is also indicated. Genes in bold are those for which the SE is larger than Δl, and therefore the best tree is not significantly more likely than the expected tree (Figure 4). The ML trees based on ND2 and ND4 genes are the expected ones. When third positions are not included in the ML analyses, ND5 yields the expected tree as the best ML tree, and the best ML trees for ND1 and COII are not better than the expected topology (Figure 4).
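
The comparison summarized in Table 6 rests on the difference in log-likelihood between two trees and on its standard error. The sketch below shows one common way to obtain both from per-site log-likelihoods, using the variance estimator usually attributed to KISHINO and HASEGAWA (1989). It is an illustration under assumptions rather than the program used here: the per-site inputs and the function names are hypothetical, and `not_significant` simply encodes the criterion stated in the table note (SE larger than the absolute difference).

```python
import math


def kh_difference(site_lnl_best, site_lnl_expected):
    """Return (delta, se) for two trees scored on the same sites.

    delta = lnL(expected) - lnL(best), so it is <= 0 whenever "best"
    really is the maximum likelihood tree. se is the standard error of
    delta, estimated from the spread of the per-site differences.
    """
    if len(site_lnl_best) != len(site_lnl_expected):
        raise ValueError("both trees must be scored on the same sites")
    n = len(site_lnl_best)
    if n < 2:
        raise ValueError("need at least two sites")
    diffs = [e - b for b, e in zip(site_lnl_best, site_lnl_expected)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference: n/(n-1) * sum of squared deviations.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    return delta, math.sqrt(variance)


def not_significant(delta, se):
    """Criterion from the Table 6 note: SE larger than |delta|."""
    return se > abs(delta)
```

Applied to values read from Table 6 (all positions), `not_significant(-1.6, 4.0)` for ND6 is true, whereas `not_significant(-20.8, 10.8)` for ATPase 6 is false. A stricter threshold, such as 1.96 SE, could be substituted in `not_significant` if a conventional 5% level is wanted; that choice is not specified in the text.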

TABLE 7
Phylogenetic relationships among vertebrates

Trees
 1. (trout,(loach,carp),(lungfish,(frog,(chicken,(marsupial,(whale,human))))));
 2. (trout,(loach,carp),((lungfish,frog),(chicken,(marsupial,(whale,human)))));
 3. (trout,(loach,carp),((chicken,frog),(lungfish,(marsupial,(whale,human)))));
 4. (trout,(loach,carp),(frog,((lungfish,chicken),(marsupial,(whale,human)))));
 5. (trout,(loach,carp),((lungfish,chicken),(frog,(marsupial,(whale,human)))));
 6. (trout,loach,(carp,(frog,(lungfish,(chicken,(marsupial,(whale,human)))))));
 7. (trout,(loach,carp),(chicken,(frog,(lungfish,(marsupial,(whale,human))))));
 8. (trout,carp,(loach,(chicken,(frog,(lungfish,(whale,(marsupial,human)))))));
 9. (trout,((loach,carp),(lungfish,(frog,chicken))),(human,(marsupial,whale)));
10. (trout,chicken,(carp,(loach,(lungfish,(frog,(marsupial,(whale,human)))))));

         Protein genes                                                tRNA genes
Tree     All                 No Ts in 3rd          No 3rd                All
 1          0.0(a) ± -          0.0(b) ± -            0.0(c) ± -            0.0(d) ± -
 2        -59.9 ± 27.8       -109.8 ± 96.3         -51.4 ± 19.7          -18.8 ± 8.4
 3       -176.1 ± 41.7       -172.1 ± 33.3        -127.3 ± 96.5          -41.2 ± 14.7
 4       -185.0 ± 40.8       -211.9 ± 53.3        -124.5 ± 35.9          -49.8 ± 15.2
 5       -196.7 ± 40.8       -225.8 ± [illegible] -146.2 ± 35.4          -45.9 ± 14.7
 6       -199.2 ± 44.8       -308.2 ± 60.5        -149.8 ± 36.6          -73.5 ± 15.8
 7       -206.0 ± 44.2       -231.3 ± 56.7        -159.7 ± [illegible]   [illegible] ± 14.4
 8       -570.5 ± 62.2       -708.4 ± 83.0        -441.7 ± 54.3         -120.1 ± 19.6
 9       -734.3 ± 64.9       -825.5 ± 79.3        -696.7 ± 57.1          -74.6 ± 19.9
10       -738.1 ± 70.7       -863.3 ± 86.9        -692.0 ± 60.7         -154.2 ± 22.6

Differences in log-likelihood (Δl_i) between tree i and the maximum likelihood tree and their standard error calculated by KISHINO and HASEGAWA'S formula (1989) are shown. The alternative trees analyzed (2-10) are those given by the maximum likelihood method as best trees by some of the protein genes alone (2, ND1 all positions; 3, ND5 all positions; 4, ATPase 6 no 3rd positions; 5, ATPase 6 all positions; 6, ND1 no 3rd positions; 7, ND3 no 3rd positions; 8, ND6 no 3rd positions; 9, ATPase 8 all positions; 10, COIII all positions).

(a) Log l = -89655.1. (b) Log l = -74925.4. (c) Log l = -43071.0. (d) Log l = -7995.1.
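
The topologies in Table 7 are written in parenthetical (Newick-style) notation. As a reading aid, the sketch below parses such strings and reports the sister group of a chosen taxon. It is a minimal illustration: the function names are hypothetical, and only the label-and-parentheses subset of the notation used in the table is handled (no branch lengths or support values).

```python
def parse_newick(text):
    """Parse a parenthetical tree (labels only) into nested tuples."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def parse_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1                                  # consume "("
            children = [parse_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1                              # consume ","
                children.append(parse_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1                                  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        if start == pos:
            raise ValueError("empty taxon label")
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("trailing characters after tree")
    return tree


def leaves(node):
    """Set of taxon labels below a node."""
    if isinstance(node, str):
        return frozenset([node])
    result = frozenset()
    for child in node:
        result |= leaves(child)
    return result


def sister_group(node, taxon):
    """Leaf set of the sister group of `taxon`, or None if `taxon` is
    absent or sits in a node with more than two branches."""
    if isinstance(node, str):
        return None
    if taxon in node and len(node) == 2:
        other = node[0] if node[1] == taxon else node[1]
        return leaves(other)
    for child in node:
        found = sister_group(child, taxon)
        if found is not None:
            return found
    return None


# Trees 1 and 2 as listed in Table 7.
TREE_1 = "(trout,(loach,carp),(lungfish,(frog,(chicken,(marsupial,(whale,human))))));"
TREE_2 = "(trout,(loach,carp),((lungfish,frog),(chicken,(marsupial,(whale,human)))));"
```

With these definitions, `sister_group(parse_newick(TREE_1), "lungfish")` returns the five tetrapod taxa (frog, chicken, marsupial, whale, human), while the same call on `TREE_2` returns only frog, which is the difference between the expected topology and the first alternative.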

Interestingly, with the exception of ND4 (with MP, NJ, and ML) and ND2 (with NJ and ML, but not MP), none of the mitochondrial protein coding genes recovered by itself the correct branching order (Figure 4). This was the case even when transitions in the third codon position were excluded from the analysis or no third codon positions were considered at all. Furthermore, bootstrap values of the resulting trees were very low (data not shown).

In the parsimony analyses, only a few more steps are needed to recover the expected topology (Figure 4) from each gene, suggesting that the shortest tree obtained in each case is poorly supported and not statistically significantly different from the expected tree (Table 5). This finding from the MP analysis was confirmed with ML when standard errors of the difference in log-likelihood between the ML tree given by each gene and that of the correct tree were calculated by the formula of KISHINO and HASEGAWA (1989). This allowed us to evaluate whether the best tree was a statistically significantly different estimate from the true tree (Table 6). All genes except ATPase 6, ND3 and ND4L (among the fastest evolving mitochondrial protein coding genes; see LYNCH and JARRELL 1993) exhibited log-likelihood ratios for the expected tree that were not significantly lower than those of the best trees obtained in each case. This suggests that the expected tree cannot be statistically ruled out for most individual genes, with the exception of ATPase 6, ND3 and ND4L.

The same analysis (KISHINO and HASEGAWA 1989) was performed to evaluate the statistical support of the best tree (Figure 4) recovered from combined data sets (Table 7). In this case, since the best tree recovered was also the expected tree, we used the best topologies supported by individual protein genes (Table 6) as alternative trees. All of the alternative trees could be rejected since the difference in log-likelihood was significant in all cases (Table 7). Presumably, the phylogenetic signal that every gene carries in its sequence is, when combined, additive and strong enough to compensate for homoplasy contained in individual genes (the homoplasy of individual genes would be expected to be random).

The failure of most single mitochondrial protein genes to resolve relationships among the major groups of vertebrates, together with the successful behavior of all protein or tRNA genes combined, suggests that the limit for the utility of mitochondrial sequences might have been reached at ~400 million years. Our results might suggest that the level of homoplasy introduced by the lamprey mitochondrial DNA sequences is too high to be counteracted by the compensating effect of combining all mitochondrial protein-coding genes. However, it is unclear whether the lamprey mitochondrial genome is particularly homoplasious and that individual genes are therefore performing relatively poorly at this level of divergence.

It is not clear whether this result only applies to this particular study or whether it is general. Several reasons, such as differences in base composition, taxon sampling, differences in rates of evolution, pronounced differences in branch lengths, and also short internodes, could account for this finding. The lack of resolution, especially for ancient nodes, is probably also due to extensive homoplasy in the data and the fact that relevant nodes, i.e., the lungfish and amphibian lineages, originated within a narrow window in time of probably 20-30 million years, ~360 mya (reviewed in MEYER 1995). These reasons might be sufficient to constrain the phylogenetic resolving power of the phylogenetic methods, especially of maximum parsimony, and to hinder the recovery of the expected tree when each gene is analyzed individually. Mitochondrial genomes contain other information such as gene order (e.g., BOORE and BROWN 1994; BOORE et al. 1995) that might permit phylogenetic inferences among lineages that diverged before the Devonian split of [...] and tetrapods.

We thank MOHAMMED ASIANI and GAIL MANDELL for assistance in the synthesis of oligonucleotide primers and MICHAEL LYNCH, GUILLERMO ORTI, and two referees for helpful comments and discussion. This work was partly supported by grants from the National Science Foundation to A.M. R.Z. is a postdoctoral fellow sponsored by a grant of the Ministerio de Educacion y Ciencia of Spain. The complete mtDNA sequence of the African lungfish has been deposited at the EMBL/GenBank data libraries under the accession number L42813.

LITERATURE CITED

ADACHI, J., and M. HASEGAWA, 1992 MOLPHY: Programs for Molecular Phylogenetics I-PROTML: Maximum Likelihood Inference of Protein Phylogeny (Computer Science Monographs, Vol. 27). Institute of Statistical Mathematics, Tokyo.
ANDERSON, S., A. T. BANKIER, B. G. BARRELL, M. H. DE BRUIJN, A. R. COULSON et al., 1981 Sequence and organization of the human mitochondrial genome. Nature 290: 457-464.
ANDERSON, S., M. H. DE BRUIJN, A. R. COULSON, I. C. EPERON, F. SANGER et al., 1982 Complete sequence of bovine mitochondrial DNA. Conserved features of the mammalian genome. J. Mol. Biol. 156: 683-717.
ARNASON, U., and A. GULLBERG, 1993 Comparison between the complete mitochondrial sequences of the blue and the fin whale, two species that can hybridize in nature. J. Mol. Evol. 37: 312-322.
ARNASON, U., A. GULLBERG and B. WIDEGREN, 1991 The complete nucleotide sequence of the mitochondrial DNA of the fin whale, Balaenoptera physalus. J. Mol. Evol. 33: 556-568.
ARNASON, U., and E. JOHNSSON, 1992 The complete mitochondrial DNA sequence of the Harbor seal, Phoca vitulina. J. Mol. Evol. 34: 493-505.
ARNASON, U., A. GULLBERG, E. JOHNSSON and C. LEDJE, 1993 The nucleotide sequence of the mitochondrial DNA molecule of the Grey seal, Halichoerus grypus, and a comparison with mitochondrial sequences of other true seals. J. Mol. Evol. 37: 323-330.
ATTARDI, G., A. CHOMYN, R. F. DOOLITTLE, P. MARIOTTINI and C. I. RAGAN, 1986 Seven unidentified reading frames of human mitochondrial DNA encode subunits of the respiratory chain
DESJARDINS, P., and R. MORAIS, 1990 Sequence and gene organization of the chicken mitochondrial genome. J. Mol. Biol. 212: 599-634.
DEVEREUX, J., P. HAEBERLI and O. SMITHIES, 1984 A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395.
DILLON, M. C., and J. M. WRIGHT, 1993 Nucleotide sequence of the D-loop region of the sperm whale (Physeter macrocephalus) mitochondrial genome. Mol. Biol. Evol. 10: 296-305.
DODA, J. N., C. T. WRIGHT and D. A. CLAYTON, 1981 Elongation of displacement-loop strands in human and mouse mitochondrial DNA is arrested near specific template sequences. Proc. Natl. Acad. Sci. USA 78: 6116-6120.
FELSENSTEIN, J., 1985 Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783-791.
FELSENSTEIN, J., 1989 PHYLIP-phylogeny inference package (version 3.4.). Cladistics 5: 164-166.
HEDGES, S. B., C. A. HASS and L. R. MAXSON, 1993 Relations of fish and tetrapods. Nature 363: 501-502.
HIGGINS, D. G., and P. M. SHARP, 1989 Fast and sensitive multiple sequence alignments on a microcomputer. Comput. Appl. Biosci. 5: 151-153.
JAMA, M., Y. P. ZHANG, R. A. AMAN and O. A. RYDER, 1993 Sequence of the mitochondrial control region, tRNA (Thr), tRNA (Pro) and tRNA (Phe) genes from the black rhinoceros, Diceros bicornis. Nucleic Acids Res. 21: 4302-4392.
JANKE, A., G. FELDMAIER-FUCHS, W. K. THOMAS, A. VON HAESELER and S. PÄÄBO, 1994 The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137: 243-256.
JOHANSEN, S., P. H. GUDDAL and T. JOHANSEN, 1990 Organization of the mitochondrial genome of Atlantic cod, Gadus morhua. Nucleic Acids Res. 18: 411-419.
KISHINO, H., and M. HASEGAWA, 1989 Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29: 170-179.
KUMAZAWA, Y., and M. NISHIDA, 1993 Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics. J. Mol. Evol. 37: 380-398.
LEE, W. J., and T. D. KOCHER, 1995 Complete sequence of a sea lamprey (Petromyzon marinus) mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139: 873-887.
LYNCH, M., and P. E. JARRELL, 1993 A method for calibrating molecular clocks and its application to animal mitochondrial DNA. Genetics 135: 1197-1208.
MACKAY, S. L. D., P. D. OLIVO, P. J. LAIPIS and W. W. HAUSWIRTH, 1986 Template-directed arrest of mammalian mitochondrial DNA synthesis. Mol. Cell Biol. 6: 1261-1267.
MEYER, A., 1993 Evolution of mitochondrial DNA in fishes, pp. 1-38 in Biochemistry and Molecular Biology of Fishes, edited by P. W. H

SACCONE, C., G. PESOLE and E. SBISA, 1991 The main regulatory region of mammalian mitochondrial DNA: structure-function model and evolutionary pattern. J. Mol. Evol. 33: 83-91.
SEUTIN, G., B. FRANZ LANG, D. P. MINDELL and R. MORAIS, 1994 Evolution of the WANCY region in amniote mitochondrial DNA. Mol. Biol. Evol. 11: 329-340.
SIMMS, M. J., A. S. GALE, P. GILLILAND, E. P. F. ROSE and G. D. SEVASTOPULO, 1993 Echinodermata, in The Fossil Record 2, edited by M. J. BENTON. Chapman and Hall, London.
SOUTHERN, S. O., P. J. SOUTHERN and A. E. DIZON, 1988 Molecular characterization of a cloned dolphin mitochondrial genome. J. Mol. Evol. 28: 32-42.
STEINBERG, S., and R. CEDERGREN, 1994 Structural compensation in atypical mitochondrial tRNAs. Struct. Biol. 1: 507-510.
STEWART, D. T., and A. J. BAKER, 1994 Patterns of sequence variation in the mitochondrial D-loop region of shrews. Mol. Biol. Evol. 11: 9-21.
SWOFFORD, D. L., 1993 PAUP: Phylogenetic Analysis Using Parsimony. Illinois Natural History Survey, Champaign, IL.
TEMPLETON, A. R., 1983 Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37: 221-244.
TZENG, C. S., C. F. HUI, S. C. SHEN and P. C. HUANG, 1992 The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates. Nucleic Acids Res. 20: 4853-4858.
WALBERG, M. W., and D. A. CLAYTON, 1981 Sequence and properties of the human KB cell and mouse L cell D-loop regions of mitochondrial DNA. Nucleic Acids Res. 9: 5411-5421.
WILKINSON, G. S., and A. M. CHAPMAN, 1991 Length and sequence variation in evening bat D-loop mtDNA. Genetics 128: 607-617.
WOLSTENHOLME, D. R., 1992 Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141: 173-216.
WONG, T. W., and D. A. CLAYTON, 1985 In vitro replication of human mitochondrial DNA: accurate initiation at the origin of light-strand synthesis. Cell 42: 951-958.
YOKOBORI, A. I., M. HASEGAWA, T. UEDA, N. OKADA, K. NISHIKAWA et al., 1994 Relationship among coelacanths, lungfishes, and tetrapods: a phylogenetic analysis based on mitochondrial cytochrome oxidase I gene sequences. J. Mol. Evol. 38: 602-609.
ZARDOYA, R., A. GARRIDO-PERTIERRA and J. M. BAUTISTA, 1995a The complete nucleotide sequence of the mitochondrial DNA genome of the rainbow trout, Oncorhynchus mykiss. J. Mol. Evol. 41: 942-951.
ZARDOYA, R., M. VILLALTA, M. J. LOPEZ-PEREZ, A. GARRIDO-PERTIERRA, J. MONTOYA et al., 1995b Nucleotide sequence of the sheep mitochondrial DNA D-loop region and its flanking tRNA genes. Curr. Genet. 28: 94-96.

Communicating editor: M. LYNCH